Monthly Magazine "April 2021"

Published on 20-May-2021 16:20:00



What is data poisoning?

Data poisoning or model poisoning attacks involve polluting a machine learning model's training data. Data poisoning is considered an integrity attack because tampering with the training data impacts the model's ability to output correct predictions. Other types of attacks can be similarly classified based on their impact: Availability, where the attackers disguise their inputs to trick the model in order to evade correct classification; Replication, where attackers can reverse-engineer the model in order to replicate it and analyze it locally to prepare attacks or exploit it for their own financial gain; and Confidentiality, where the attackers can infer potentially confidential information about the training data by feeding inputs to the model.

Types of Data Poisoning

Marcus Comiter produced a useful report in late 2019 for the Belfer Center on the vulnerabilities of AI systems that breaks down attacks into two broad categories: Input Attacks, where the data fed into an AI system is manipulated to affect the outputs in a way that suits the attacker; and Poisoning Attacks, which occur earlier in the process, when the AI system is being developed, and typically involve corrupting the data used in the training stage.

Data poisoning attacks can take several forms. They may involve corrupting a valid or clean dataset by, for example, mislabelling images or files so that the AI algorithm produces incorrect answers. Images of a particular person could be labelled "John" when they are actually "Mary". Perhaps more seriously, images of specific military vehicles could be wrongly categorised to favour a hostile foreign power. Figure 1 illustrates this, supporting the computer science trope, "Garbage In – Garbage Out".
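The "garbage in, garbage out" effect is easy to demonstrate. The short sketch below assumes scikit-learn and uses its built-in digits dataset as a stand-in for the mislabelled images described above: a hypothetical 30% of the training labels are reassigned at random, and the resulting test accuracy is compared with a clean baseline.

# A minimal sketch of a label-flipping poisoning attack, assuming scikit-learn.
# A fraction of the training labels is corrupted before fitting, and the
# poisoned model's test accuracy is compared with a clean baseline.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

def train_and_score(labels):
    # Fit a simple classifier and report accuracy on the clean test set.
    model = LogisticRegression(max_iter=2000)
    model.fit(X_train, labels)
    return model.score(X_test, y_test)

# Poison 30% of the training set by reassigning labels at random,
# the digit equivalent of labelling "John" as "Mary".
rng = np.random.default_rng(0)
poisoned = y_train.copy()
flip = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[flip] = rng.integers(0, 10, size=len(flip))

print(f"clean labels:    {train_and_score(y_train):.3f}")
print(f"poisoned labels: {train_and_score(poisoned):.3f}")

Even crude, untargeted label flipping of this kind tends to reduce accuracy; a real attacker would typically mislabel selectively to cause specific misclassifications, such as the "John"/"Mary" swap above.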

Difference between an Evasion Attack and a Poisoning Attack

The difference between an attack that is meant to evade a model's prediction or classification and a poisoning attack is persistence: with poisoning, the attacker's goal is to get their inputs accepted as training data. The length of the attack also differs because it depends on the model's training cycle; it might take weeks for the attacker to achieve the poisoning goal. Data poisoning can be achieved either in a black-box scenario against classifiers that rely on user feedback to update their learning, or in a white-box scenario where the attacker gains access to the model and its private training data, possibly somewhere in the supply chain if the training data is collected from multiple sources.
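The black-box, feedback-driven case can be sketched as follows, assuming scikit-learn and a purely synthetic dataset: a classifier is retrained at regular intervals on samples contributed by users, and an attacker persistently adds a hypothetical 60 mislabelled samples per cycle alongside the legitimate feedback, so the accuracy erodes gradually over successive training cycles.

# A rough sketch of black-box poisoning through a feedback loop, assuming
# scikit-learn and synthetic data. The attacker never sees the model; they
# simply keep contributing mislabelled samples each cycle, and the model is
# periodically retrained on everything collected so far.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_pool, y_pool = X[:2000], y[:2000]   # samples contributed over time
X_test, y_test = X[2000:], y[2000:]   # held-out evaluation set

rng = np.random.default_rng(1)
train_X = list(X_pool[:500])          # initial, clean training set
train_y = list(y_pool[:500])

for cycle in range(1, 11):
    # Legitimate users contribute correctly labelled samples each cycle...
    legit = rng.choice(1500, size=100, replace=False) + 500
    train_X.extend(X_pool[legit])
    train_y.extend(y_pool[legit])
    # ...while the attacker quietly submits samples with flipped labels.
    attack = rng.choice(1500, size=60, replace=False) + 500
    train_X.extend(X_pool[attack])
    train_y.extend(1 - y_pool[attack])
    # Periodic retraining on the accumulated (and partly poisoned) data.
    model = LogisticRegression(max_iter=1000)
    model.fit(np.array(train_X), np.array(train_y))
    print(f"cycle {cycle:2d}: test accuracy = {model.score(X_test, y_test):.3f}")

Because the attacker never sees the model or its parameters, their only lever is the volume and labelling of the data they contribute, which is why the effect typically takes several retraining cycles to show.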

Data Poisoning Example

For example, the Open Images and the Amazon Products datasets contain approximately 9 million and 233 million samples, respectively, scraped from a wide range of potentially insecure, and in many cases unknown, sources. At this scale, it is often infeasible to properly vet content. Furthermore, many practitioners create datasets by harvesting system inputs (e.g., emails received, files uploaded) or scraping user-created content (e.g., profiles, text messages, advertisements) without any mechanisms to bar malicious actors from contributing data. The dependence of industrial AI systems on datasets that are not manually inspected has led to fear that corrupted training data could produce faulty models. In fact, a recent survey of 28 industry organizations found that these companies are significantly more afraid of data poisoning than other threats from adversarial machine learning.

In a cybersecurity context, the target could be a system that uses machine learning to detect network anomalies that could indicate suspicious activity. If an attacker understands that such a model is in place, they can attempt to slowly introduce data points that decrease the accuracy of that model, so that eventually the things they want to do are no longer flagged as anomalous. This is also known as model skewing. A real-world example of this is attacks against the spam filters used by email providers. In a 2018 blog post on machine learning attacks, Elie Bursztein, who leads the anti-abuse research team at Google, said: "In practice, we regularly see some of the most advanced spammer groups trying to throw the Gmail filter off track by reporting massive amounts of spam emails as not spam [...] Between the end of Nov 2017 and early 2018, there were at least four malicious large-scale attempts to skew our classifier."
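The spam-filter skewing Bursztein describes can be imitated at toy scale. The sketch below, again assuming scikit-learn, trains a simple Naive Bayes filter on a handful of invented messages, then retrains it after an attacker has mass-reported copies of their own spam as "not spam"; the messages and the 20-copy flood are made up purely for illustration.

# A toy sketch of spam-filter skewing, assuming scikit-learn. A Naive Bayes
# filter is retrained on user feedback after an attacker mass-reports copies
# of their own spam as "not spam". All messages here are invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

ham = ["meeting moved to friday", "lunch tomorrow?", "quarterly report attached",
       "can you review this document", "dinner at seven tonight"]
spam = ["win a free prize now", "claim your free money", "cheap pills online now",
        "free prize waiting claim now", "limited offer free money"]

def build_filter(texts, labels):
    # A bag-of-words Naive Bayes classifier, retrained from scratch each time.
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(texts, labels)
    return model

# Filter trained on honestly labelled mail.
clean = build_filter(ham + spam, ["ham"] * 5 + ["spam"] * 5)

# The attacker floods the feedback channel, reporting copies of their own
# spam as "not spam", and the filter is retrained on the skewed feedback.
flood = ["win a free prize now claim your money"] * 20
skewed = build_filter(ham + spam + flood,
                      ["ham"] * 5 + ["spam"] * 5 + ["ham"] * 20)

test = ["claim your free prize now"]
print("clean filter :", clean.predict(test))   # expected: ['spam']
print("skewed filter:", skewed.predict(test))  # likely:   ['ham']

On the honestly labelled training set the test message is caught as spam; once the attacker's mislabelled reports dominate the "not spam" class, similar messages are likely to slip through.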
