Artificial Intelligence and Cybersecurity

The Crossroads of Artificial Intelligence, Machine Learning, and Deep Learning

by Chrissa Constantine


This article is a part of the FREE PREVIEW of our issue "Machine Learning, Deep Learning, and Cybersecurity". Register on our website to get the access to more of our free content on a regular basis.


What Are Artificial Intelligence, Deep Learning, and Machine Learning?

Think of artificial intelligence (AI), deep learning (DL), and machine learning (ML) as the layers of an onion. The outer layer (Figure 1) is AI; as you move inward through the layers, you encounter ML, and then DL, which is a subset of machine learning.

The term artificial intelligence is frequently used as a marketing product term by many cybersecurity companies without consensus about what it means. Part of the issue in defining AI is that it relates back to human intelligence, which is hard to describe. Different researchers in the field of intelligence focus on various aspects of intelligence in their definitions (Sternberg, 2018).

Generally speaking, AI refers to a “broad field of science encompassing not only computer science but also psychology, philosophy, linguistics and other areas” (Bakhshi, 2017). However, some definitions of AI relate to computerized systems or machines as exhibiting behavior or performing tasks requiring intelligence like that of a human. In these definitions, intelligence refers to an “ability to plan, reason and learn, sense and build some kind of perception of knowledge and communicate in natural language” (Bakhshi, 2017).

Currently, AI can only do the task it was designed to perform, and specific algorithms are developed to solve problems. At this point, AI does not understand what it was trained to do, but the future of AI is to design systems that can learn and then solve any problem.

AI is a collection of technologies used across a spectrum of industries, such as agriculture, healthcare (for medical diagnosis systems), transportation and logistics, finance, and cybersecurity. One powerful technique used in cybersecurity technology is machine learning.

Historical Background

In 1956, John McCarthy coined the term artificial intelligence. AI uses algorithms or machines to perform tasks that intersect characteristics of human intelligence, such as planning, understanding natural languages, recognizing objects, learning, and problem-solving (McClelland, 2017). In 1959, Arthur Samuel described machine learning as "the ability to learn without being explicitly programmed." Prior descriptions of AI systems include "machine-driven decision engines that can achieve near-human-level intelligence" (Chio, 2018). However, there is contention about the definition of artificial intelligence because there are varying definitions of what constitutes human intelligence.

There are different schools of thought about how AI should be built. Some researchers and scientists believed it would make the most sense to build systems that respond to rules and logic and that make their inner workings transparent. Others maintained that biology should be the inspiration for the machine, and that the machine itself would create the program. In today's systems, the evolution of AI has been for machines to program themselves.

The rise of big data and the dependence on computerization across many verticals have inspired the development of very powerful machine learning techniques. Machine learning (ML) refers to mathematical techniques that enable information mining, pattern discovery, and drawing inferences from data (Chio, 2018). Think of machine learning as a part of AI, while noting that AI does not always utilize machine learning methods. ML can discern patterns from examining raw data and can then use models to make predictions.

Machine learning refers to an algorithm that can create abstractions (models) by training on a dataset; it is a method of training an algorithm to accomplish a task. Training involves providing large datasets to the algorithm so the algorithm can adjust and improve. Machine learning modifies itself when exposed to more data. The learning part of machine learning refers to ML algorithms optimizing along a dimension, such as trying to minimize error or to increase the likelihood of their predictions being true (Chio, 2018).
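That optimization loop can be illustrated with a minimal, self-contained sketch: fit a single-parameter model by repeatedly adjusting it to reduce squared error. The data points and learning rate below are invented for illustration.

```python
# Invented (input, target) pairs roughly following y = 2x
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

w = 0.0    # model parameter, adjusted during training
lr = 0.01  # learning rate: the size of each adjustment

for _ in range(1000):
    # Gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # step in the direction that reduces the error

print(round(w, 2))  # settles near 2.0, the slope underlying the data
```

Each pass makes a small adjustment to `w`; "learning" here is nothing more than this repeated error-reducing update.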

The functionality of the human brain is what inspired deep learning. DL uses algorithms and artificial neural networks to create models, and uses multiple layers of networks that improve with training or iteration (Storkey, 2017). Artificial neural networks are algorithms that mimic the biological structure of the brain and have discrete layers and connections. These layers are what give deep learning its name, and DL algorithms require vast amounts of data to obtain results.

Deep artificial neural networks can solve issues in image or sound recognition and in detecting fraud in the finance industry. Some current applications include recommendation systems and activity recognition. Google started to use deep learning for the Google Brain project in 2011, but has since branched out and extended its use of DL to over 1,000 projects. Microsoft uses DL for commercial speech recognition projects such as Xbox, search rankings, photo search, and translation systems. Facebook uses DL neural networks to translate 2 billion user posts per day in over 40 languages (McClelland, 2017). Deep learning can be used to detect and prevent insider threats and is used in anomaly detection because DL can identify patterns in data that has little consistency between sources.

Deep neural networks have thousands of simulated neurons arranged into various interconnected layers. Inputs are fed through each successive layer to generate the final outputs. These layers within the deep learning network enable it to recognize objects at multiple levels of abstraction. To capture and explain what is happening in DL, Google researchers modified a deep learning-based image recognition algorithm, Deep Dream, to generate or alter images. These images show how different DL is from human perception. Refer to Figure 2 for an example of a Deep Dream image made from a file uploaded by the author.

On a daily basis, machine learning is used to enhance smartphones, smartwatches, home devices, and even online searches. If you perform a search on Google and it comes back with "Did you mean…?", that is a result of machine learning algorithms in Google search. Machine learning techniques are used to determine what activity a user is performing based upon the GPS, gyroscope, and accelerometer sensors in the user's phone. Applications based upon ML algorithms can tell how far you walked, how many calories you burned, and where you went, or give you directions and track your movements. Other examples of ML algorithms in use include image processing, which applies ML techniques in facial recognition or biometric recognition software. Machine learning can be used to retrieve data from imaging or medical applications and has many applications across a spectrum of industries. Moreover, it is more and more prevalent in our personal lives through mobile phones, gaming consoles, and other computing or internet-enabled devices.

Is AI Ready?

Rapid advances in big data, data analytics, and machine learning are used to convert millions of scattered data points into databases for use in various cybersecurity arenas, such as threat intelligence analysis. AI continues to evolve and has a wide variety of applications. As AI develops capabilities to handle large, complex and unstructured data, it may be able to outperform people in areas such as threat intelligence.

The investment poured into the field of data science, and specifically AI, means that AI is featured in the news regularly. However, the key lies in being able to discover how AI can help corporations align with their strategic objectives. Any company using new technology must blend experience, knowledge, and insight into the integration of the tool into business practices. Some companies are struggling to implement organizational and process changes to integrate machine intelligence analysis into core business processes. A lot of what happens behind ML is a black box; unless an executive has an advanced math degree, it can be challenging to understand how to adopt the new technology. The real struggle is to understand which machine intelligence capabilities to incorporate into the business.

Currently, AI is expensive and difficult to implement fully in businesses, and, at this time, AI is not ready to fully meet the demands of cybersecurity. The science-fiction-style concept of AI, the ability for a machine to mimic intelligent human behavior, does not exist at this time. However, machine learning can still be leveraged to support cybersecurity initiatives.

The technology stack using machine learning is growing. Large tech companies rely on machine intelligence and have products that depend upon AI or machine learning. Some of these companies have launched open-source libraries and research. For example, OpenAI offers the public access to research and environments. Google's TensorFlow (which uses machine learning) and Google Cloud AutoML were introduced to make AI accessible to businesses; Cloud AutoML uses machine learning and neural architecture technologies. Also, Microsoft, in partnership with Amazon Web Services, offers Gluon, an open-source deep-learning library for developers. Other solutions, like IBM's Watson, use X-Force to learn security language and analyze information, and call the result AI-enabled. If the general premise is that AI must have a large dataset to provide a satisfactory answer, then IBM's Watson, which ingests tens of thousands of documented software vulnerabilities, security research papers, and data from blogs, could be considered an expert system – a system narrowly focused on a particular problem. In traditional AI, expert systems were often used to support medical diagnosis (Martin, 2016).

As technology evolves, attack techniques also evolve, becoming more sophisticated at penetrating systems and evading traditional signature-based approaches to cybersecurity. However, ML can offer a solution to these threats due to its ability to adapt and learn in new and unknown circumstances.

Cyber Threats – Offensive and Defensive Measures

In 2016, there were notable advancements in AI, and we also saw an increase in ransomware, malware attack vectors, and other forms of attack from cybercriminals. Many organizations are turning to machine learning to provide a better deterrent against attack and to support cybersecurity analysis. The goals in utilizing AI systems are to scale security operations, improve responsiveness to attacks or breaches, assist security personnel in decision making, and minimize exposure to emergent threats. While there are some defensive applications of machine learning analysis and automation, there is also a rising number of offensive uses of the same technology.

Defenders and cybersecurity personnel attempt to use ML to detect attacks, but nothing prevents adversaries from also using this technology to their advantage. Attackers can use machine learning to evade spam filters or to learn more about a target to craft a perfect social engineering email or scam.

A survey by Cylance at Black Hat USA 2017 showed the majority of information security professionals (62 percent) believe that AI is going to be weaponized by hackers (Elazari, 2017). In a DEFCON 2017 lecture, a data scientist from Endgame demonstrated and publicly released a malware manipulation environment for OpenAI Gym, an open-source toolkit for learning algorithms (Elazari, 2017).

The systems that machine learning algorithms rely on may themselves be vulnerable to hacking. Machine learning algorithms can be susceptible to attack due to a lack of security design (Chio, 2018). In other cases, a hacker may determine what data was used as the training dataset and manipulate the input data to the algorithm. In day-to-day examples, search engine ML algorithms have been manipulated to boost rankings. Senders of spam try to trick the spam-filtering algorithm by using misspellings or by adding unrelated words or sentences to make their messages seem like legitimate email.

While these examples happen daily, there can be more dangerous consequences. The credit card and financial industries are using machine learning to identify fraud. If an attacker knows the pattern of a shopper, then fraudulent purchases may occur that deviate only slightly from normal behavior, which would be undetected by a fraud-based anomaly detection system. Therefore, businesses seeking to leverage machine learning enabled technology need to threat model and perform risk assessments when creating machine learning systems for cybersecurity purposes. Other vulnerabilities in these systems come from flawed designs, algorithmic limitations or a combination of both (Chio, 2018).

ML in cybersecurity falls into two categories – anomaly detection and pattern recognition. Both tasks are related but look at the issue from different perspectives. For pattern recognition, datasets are examined to discover various characteristics hidden within the data. These characteristics can then be used to teach an algorithm to recognize other data with similar features. Anomaly detection establishes a baseline of normalcy describing a dataset; an anomaly occurs when there is a deviation from that norm.
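The anomaly-detection side can be sketched in a few lines: learn a baseline (mean and standard deviation) from historical observations, then flag new values that stray too far from it. The login counts and threshold below are invented for illustration.

```python
import statistics

# Invented baseline, e.g. daily login counts observed over a week
baseline = [22, 25, 24, 23, 26, 25, 24, 23]
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(value, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    return abs(value - mean) / stdev > threshold

print(is_anomalous(24))  # within the baseline -> False
print(is_anomalous(95))  # far outside the baseline -> True
```

Real systems replace the fixed threshold with richer models, but the principle – deviation from a learned norm – is the same.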

Spam filtering, malware detection, and botnet detection have used machine learning algorithms to aid cybersecurity analysts. Access control is another area where ML can be used to detect and defend against breaches or information theft. ML, in this case, can include unsupervised learning and anomaly detection. These systems can infer access patterns by users or roles and can engage in various actions when an unexpected pattern is detected (Chio, 2018).

There are multiple attack vectors against machine learning, such as attacks that alter the learning process (attacking the training dataset), attacks on integrity or availability, and targeted attacks. Problems where machine learning algorithms have made a difference, and ones where machine learning has tried but failed to yield usable results, are both instructive use cases for cybersecurity. The following sections describe areas where machine learning has made improvements in cybersecurity. There are many appealing aspects to using AI for cybersecurity, including minimizing human bias, assessing risk, and developing predictive capabilities to automate operational tasks.

Taxonomy of Cybersecurity Machine Learning

Machine learning is a scientific discipline, a form of artificial intelligence, and a sub-field of computer science (Dykstra, 2015). Machine learning algorithms "learn" because they do not need to be explicitly re-programmed when exposed to new datasets. Algorithms are more accurate when they have large datasets to process than when data is limited. An algorithm is only as good at prediction as the quality of the datasets used for training.

ML systems improve with experience and can learn from previous observations to make inferences about future behavior and predictions about how to apply behaviors to new situations. Mathematics, statistics and the algorithms used to discover patterns, anomalies, and correlations within datasets vary in complexity and are the foundation of machine learning algorithms.

Supervised vs. Unsupervised

Two methods are used to train an algorithm: supervised and unsupervised. The data or inputs accepted by supervised and unsupervised learning are the differentiators between the techniques. In supervised learning, the data provided to the algorithm is labeled and structured. Supervised data is historical data, and the trained model must predict the labels of future data.

Supervised learning algorithms are provided with labeled training data and tasked with learning what differentiates the labels. By learning what makes a category unique, the algorithm can be trained to apply the correct label to new, unlabeled data (Kanal, 2017). The criterion for choosing training data in supervised learning is that it be a representative training set. For example, if the ML algorithm must be trained to identify photos of fruit correctly, but the provided training set contains photos of animals, the algorithm cannot correctly identify and label the data.

Email spam detection algorithms can be used to understand one application of machine learning. ML enhances spam filtering because it can compare verified spam with a verified legitimate email to determine what is present in one or the other. This process of automatically inferring a label is called classification (Kanal, 2017).
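Classification can be illustrated with a toy word-count classifier. The tiny labeled corpus and scoring rule below are invented and far simpler than a real spam filter, but they show the supervised pattern: learn from labeled examples, then label new text.

```python
# Invented labeled training mail: (text, label) pairs
training = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch meeting tomorrow", "ham"),
]

# Training: count how often each word appears under each label
counts = {"spam": {}, "ham": {}}
for text, label in training:
    for word in text.split():
        counts[label][word] = counts[label].get(word, 0) + 1

def classify(text):
    """Label new text by which class its words were seen in more often."""
    spam_score = sum(counts["spam"].get(w, 0) for w in text.split())
    ham_score = sum(counts["ham"].get(w, 0) for w in text.split())
    return "spam" if spam_score > ham_score else "ham"

print(classify("free money"))        # words seen only in spam mail
print(classify("agenda for lunch"))  # words seen only in legitimate mail
```

Production filters use probabilistic models (e.g. naive Bayes) over far larger corpora, but the train-then-label structure is the same.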

Unsupervised methods use unlabeled datasets and apply the resulting abstractions to new data (Chio, 2018). Historical data does not have a label, and there may be cases where it is unclear what label is being predicted, such as in instances of malware or botnets.

Forms of supervised learning include classification and regression. Forms of unsupervised learning include clustering (Chio, 2018). Machine learning analysis of large datasets incorporates clustering, dimensionality reduction (a technique of reducing and simplifying inputs) and association rule learning (rule-based method of ML to discover relationships between variables in large databases) (Marty, 2018).

Unsupervised learning refers to algorithms provided with unlabeled training data. If there is a lot of unlabeled data to examine, it may be challenging for people to determine what label to assign. However, it may be easier to have machines separate data into groups because they can identify patterns in large datasets better than humans. Data separation assumes relevant data is present. An example where it would be better to have a machine label data would be the case of network flow data. For this type of dataset, the data features would have to be assigned, such as IP address, network port, packet contents, timestamp, or other relevant data. Useful features are a prerequisite for applying machine learning techniques (Kanal, 2017). Too many non-informative features can lead to algorithm degradation, and too much noise hides relevant information.
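To make the network-flow example concrete, here is a small sketch of assigning features to a flow record; the field names, values, and chosen features are hypothetical, not from any particular flow format.

```python
# Hypothetical network-flow record
flow = {
    "src_ip": "10.0.0.5",
    "dst_port": 443,
    "bytes": 15230,
    "duration_s": 2.4,
}

def extract_features(flow):
    """Map a raw flow record to numeric features an algorithm can consume."""
    return {
        "dst_port": flow["dst_port"],
        # Rate features are often more informative than raw totals
        "bytes_per_second": flow["bytes"] / max(flow["duration_s"], 0.001),
        # Binary feature: was a well-known service port contacted?
        "is_well_known_port": int(flow["dst_port"] < 1024),
    }

print(extract_features(flow))
```

Choosing which features to compute – and which to drop as noise – is exactly the feature-quality problem the paragraph above describes.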

Classification vs. Regression

Supervised learning uses classification or regression methods (Chio, 2018). Classification is considered learning where a training set of correctly identified observations is available. Classification determines to which of a set of categories a new observation belongs, based on a training set of data containing observations whose category membership is known (Wikipedia, 2018).

Regression or prediction is used to learn the relationship between features of data based on existing knowledge about the dataset. In cybersecurity, regression is used for traffic analysis, user behavior analytics or fraud detection.
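A minimal regression sketch: fit an ordinary-least-squares line relating hour of day to request volume, then predict an unseen hour. The traffic numbers are invented for illustration.

```python
# Invented hourly request counts, roughly linear growth
hours = [1, 2, 3, 4, 5]
requests = [120, 150, 170, 205, 230]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(requests) / n

# Ordinary least squares: slope = cov(x, y) / var(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, requests)) \
        / sum((x - mean_x) ** 2 for x in hours)
intercept = mean_y - slope * mean_x

predicted = slope * 6 + intercept  # expected volume at hour 6
print(round(predicted, 1))
```

In traffic analysis or fraud detection, a large gap between the predicted and observed value is what triggers further investigation.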

Classification can also be used to detect malicious network activity. Behavior can be used to identify types of activities such as scanning or spoofing. Classification can be applied on a web application firewall to detect various attack types, such as those in the OWASP Top 10. There are many applications within cybersecurity for this type of analysis using regression or classification.

Forecasting is another conventional technique that uses historical data to predict future behavior and is a process of making predictions by analyzing trends in data. An example would be using the Holt-Winters algorithm to perform network anomaly detection.
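Full Holt-Winters modeling also handles trend and seasonality; the sketch below uses only its simplest ingredient, single exponential smoothing, to show the forecast-and-compare idea. The traffic series, smoothing factor, and threshold are invented.

```python
# Invented traffic series; the last point is a deliberate spike
series = [100, 102, 98, 101, 99, 103, 100, 240]
alpha = 0.3  # smoothing factor: weight given to the newest observation

forecast = series[0]
anomalies = []
for value in series:
    if abs(value - forecast) > 50:  # crude fixed threshold
        anomalies.append(value)
    # Blend the new observation into the running forecast
    forecast = alpha * value + (1 - alpha) * forecast

print(anomalies)  # only the spike lands far from the smoothed forecast
```

The Holt-Winters algorithm extends this update with trend and seasonal terms, which is why it suits periodic network traffic.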

Clustering

In unsupervised learning, data grouped into categories based upon a measure of similarity or distance is called clustering, and either a person or the machine learning algorithm is trying to find structure in unlabeled data. An example of clustering is finding malware families using executables and no other metadata (Dykstra, 2015).
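A stripped-down k-means sketch on one-dimensional data shows how unlabeled points settle into groups around centers; the values (say, executable file sizes in KB) and the choice of k = 2 are invented for illustration.

```python
# Invented unlabeled data: two size regimes with no labels attached
data = [12, 14, 13, 300, 310, 305, 15, 295]
centers = [data[0], data[3]]  # naive initialization of k = 2 centers

for _ in range(10):  # alternate assignment and center update
    clusters = [[], []]
    for point in data:
        # Assign each point to its nearest center
        nearest = min(range(2), key=lambda i: abs(point - centers[i]))
        clusters[nearest].append(point)
    # Move each center to the mean of its assigned points
    centers = [sum(c) / len(c) for c in clusters if c]

print(sorted(clusters[0]), sorted(clusters[1]))
```

No label ever appears: the structure (two families of sizes) emerges from the similarity measure alone, which is the sense in which malware families can be found from executables without metadata.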

Feature Engineering

There is a branch of machine learning called feature engineering, which is used to extract maximum information from features to maximize the ability to categorize or predict unknown data. For brevity, these techniques are not covered in this article.

Requirements

Machine learning tools typically require the following:

1. Data collection. Most ML techniques collect data ahead of time and create a model with stored data.

2. Data cleansing. Raw data is often unusable for ML. Missing data, inconsistent data, and mixed numeric and non-numeric data can create issues. This step requires combining multiple data sources into a single usable source.

3. Feature engineering. Once data is ready for use, the maximum information must be extracted from the data using features. Feature engineering occurs before the creation of the machine learning algorithm (Kanal, 2017).

4. Model building and validation. This step builds the model and tests it to ensure it works on unlabeled data. Statistical techniques are used to validate the model. Models are the predictions used by the ML system. Bayesian analysis methods are used to train the system to create a better model. This phase may be run again and again to fine-tune the system; during it, the system makes small adjustments over and over to get the model right.

5. Deployment. Machine learning deployment usually requires tuning and refinements, which is especially true in cases of network traffic, where historical observations do not typically match future activity.

6. Monitoring. After deployment, ML models must be monitored and run through previous steps to ensure accuracy.
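The steps above can be sketched end to end on a toy dataset: collect, cleanse, extract a feature, build a simple baseline model, then monitor new observations against it. Every log line, field, and threshold here is invented for illustration.

```python
import statistics

# 1. Data collection: invented raw log records (one is unusable)
raw_logs = ["200 OK 120ms", "200 OK 95ms", None, "500 ERR 2400ms", "200 OK 110ms"]

# 2. Data cleansing: drop records that cannot be parsed
records = [r for r in raw_logs if r]

# 3. Feature engineering: pull the response time out of each record
latencies = [int(r.split()[-1].rstrip("ms")) for r in records]

# 4. Model building and validation: a statistical baseline of latency
mean = statistics.mean(latencies)
stdev = statistics.stdev(latencies)

# 5-6. Deployment and monitoring: score new observations against the model
def monitor(latency_ms):
    """Alert when a new latency deviates sharply from the learned baseline."""
    return "alert" if abs(latency_ms - mean) > 2 * stdev else "ok"

print(monitor(130), monitor(5000))
```

A real pipeline would loop back from monitoring to retraining as traffic shifts, which is the iteration the numbered steps describe.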

The type of problem presented by cybersecurity needs to be analyzed to determine which machine learning algorithm can solve the issue. Depending on whether the company wants to address an issue with malware or spam, some types of machine learning algorithms are better suited than others. The steps above outline a basic model of how to incorporate or use machine learning. For example, if the issue is malware, the dataset needs to be collected from various security data sources, such as a SIEM, varied log files, network traffic, email content, or user behavior. Training data is gathered, then cleansed, normalized, and readied for use. The machine learning model is then selected, the system tuned, and the data put to operational use, such as creating visualizations and notifications and monitoring and managing devices. Meanwhile, the machine learning algorithm is tuned and refined over time, and the steps above are iterated over to ensure the accuracy of the model.



May 23, 2019

© HAKIN9 MEDIA SP. Z O.O. SP. K. 2013