Adversarial Machine Learning: A New Cybersecurity Threat
March 15, 2022 | 5 minute read
Through the development and implementation of machine learning algorithms, software engineers have pushed artificial intelligence and technology to new heights. However, as with any other method or technique used within a business or industry, the same processes that allow software engineers to create machine learning algorithms also enable hackers and cybercriminals to exploit those algorithms for nefarious purposes. To this point, adversarial machine learning is a technique used to fool machine learning algorithms with deceptive data, and it has become a new form of cyberattack in recent years. Adversarial attacks are subtle manipulations that force machine learning systems to fail in unexpected ways.
To provide a real-world example of such attacks, experimental research published in 2019 by security researchers at Tencent's Keen Security Lab showed that simply placing a few small stickers on the road surface could cause a Tesla vehicle's driver-assistance system to make abnormal and unexpected mistakes, such as steering toward the wrong lane, that would otherwise not have occurred. Because such vehicles are trained on datasets of objects and information one would expect to encounter while driving, the stickers represented information outside the scope of the vehicle's training. While Tesla can continue to expand the datasets used to train its self-driving systems, machine learning algorithms used in other contexts within society remain at risk of similar attacks.
How does adversarial machine learning work?
Just as there are many different machine learning algorithms that can be used to build technological solutions, there are a number of approaches that cybercriminals can employ when looking to launch an adversarial attack on a particular AI system. Irrespective of the specific method or technique used, adversarial machine learning attacks generally work by attempting to fool machine learning algorithms into making incorrect or detrimental decisions. Because the entire premise of artificial intelligence is the creation of machines and systems that can function without human intervention, such attacks present an enormous challenge: corrupting a machine learning system can be viewed as the technological equivalent of poisoning a public water supply. Once the system has been poisoned, everything that flows from it is effectively contaminated.
To provide an example of the methods that can be used to carry out an adversarial machine learning attack, the Fast Gradient Sign Method (FGSM) can be used to fool image classification systems built on machine learning algorithms. Because image recognition and classification systems work by identifying specific features within images, such as patterns in a photo's pixels, FGSM slightly alters those pixels in order to push the algorithm into a category that does not align with the training data used to create it. For instance, an image classification system could be fooled into labeling a subtly perturbed photo of a dog as a cat, even though the perturbed image would still look like an ordinary dog to the human eye.
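To make the idea concrete, below is a minimal sketch of an FGSM attack written in PyTorch. The model, the epsilon value, and the input shapes are illustrative assumptions rather than details from this article; the essence is that each pixel is nudged a small step in the direction of the loss gradient's sign.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=0.03):
    """Craft adversarial examples with the Fast Gradient Sign Method.

    images: a batch of inputs scaled to [0, 1]; labels: their true classes.
    epsilon controls how far each pixel may be nudged (illustrative value).
    """
    images = images.clone().detach().requires_grad_(True)

    # Compute the loss of the model's prediction against the true labels.
    loss = F.cross_entropy(model(images), labels)
    loss.backward()

    # Step every pixel in the direction that increases the loss,
    # then clamp back into the valid pixel range.
    perturbed = images + epsilon * images.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```

In practice, epsilon is kept small enough that the perturbation is invisible to a person, yet large enough to push the input across the classifier's decision boundary.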
What can be done to defend algorithms against adversarial attacks?
While reducing or mitigating the effects of an adversarial machine learning attack can be extremely difficult once the attack has occurred, there are preventative measures that software developers can take to avoid such attacks altogether. One such approach is adversarial training, which, as the name suggests, focuses on generating adversarial examples and using them when training machine learning algorithms. Just as a school teacher runs practice fire drills at the beginning of the school year to prepare students for a real emergency, adversarial examples can be introduced during training to ensure that the finished model can withstand such attacks, as in the sketch below.
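The sketch below shows one hedged way to approximate adversarial training: adversarial examples are generated on the fly with the hypothetical fgsm_attack helper from the earlier snippet and mixed into every training step. The equal loss weighting and the epsilon value are illustrative choices, not prescriptions from the article.

```python
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, images, labels, epsilon=0.03):
    """One training step on a mix of clean and FGSM-perturbed examples."""
    model.train()

    # Generate adversarial versions of the current batch using the
    # fgsm_attack helper sketched earlier.
    adv_images = fgsm_attack(model, images, labels, epsilon)

    optimizer.zero_grad()
    # Train on both the clean and the adversarial batch so the model
    # learns to classify perturbed inputs correctly as well.
    loss = (F.cross_entropy(model(images), labels)
            + F.cross_entropy(model(adv_images), labels))
    loss.backward()
    optimizer.step()
    return loss.item()
```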
Defensive distillation is another method that can be used to thwart adversarial machine learning attacks. To continue the example of a school teacher practicing a fire drill, while such an approach is undoubtedly effective, the teacher must physically conduct the drills and watch the children to ensure that they follow all directions and procedures. In the context of software development, adversarial training is a similarly brute-force tactic, as software engineers must introduce as many adversarial training examples as possible to protect their respective algorithms. Defensive distillation adds flexibility to these preventative approaches, because the technique hinges on training a machine learning model to output smoothed probability distributions rather than hard, overconfident decisions, which makes its behavior harder for an attacker to exploit.
When using defensive distillation to safeguard a machine learning model, a software engineer first trains an initial model on a labeled dataset, as in ordinary supervised learning, and records the class probabilities that this model assigns to each training example. Those softened probabilities, rather than the original hard labels, are then used to train a second, distilled model. Because the new model is built from two layers of predictions, its decision boundaries are smoother, and the small, carefully crafted perturbations that adversarial attacks rely on become far less effective against it, as the sketch below illustrates.
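The following sketch outlines the distillation step under the same hedged assumptions: a teacher model has already been trained on hard labels, and a student model is then trained to match the teacher's temperature-softened probabilities. The function names and the temperature value are illustrative.

```python
import torch
import torch.nn.functional as F

def soften(teacher, images, temperature=20.0):
    """Class probabilities from an already-trained teacher, softened by temperature."""
    with torch.no_grad():
        return F.softmax(teacher(images) / temperature, dim=1)

def distillation_step(student, optimizer, images, teacher_probs, temperature=20.0):
    """Train the distilled student to match the teacher's softened probabilities."""
    optimizer.zero_grad()
    log_probs = F.log_softmax(student(images) / temperature, dim=1)

    # Cross-entropy against soft labels: the student learns the teacher's
    # full probability distribution instead of a single hard class.
    loss = -(teacher_probs * log_probs).sum(dim=1).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Higher temperatures spread probability mass across classes, which is what smooths the student's decision surface and blunts gradient-based attacks such as FGSM.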
Despite the immense complexity and nuance involved in developing artificial intelligence systems and machine learning algorithms, these technological advancements are still subject to cyberattacks. Just as software developers continue to devise new techniques and methods for creating innovative products and services, cybercriminals are simultaneously working to launch attacks against those products and services. While adversarial machine learning attacks may currently be discussed only in the deepest of tech circles, they are sure to increase in frequency as machine learning algorithms and artificial intelligence become more common in mainstream society.