NIST Releases Guidance on AI Cyberattacks

Summary
On January 4, 2024, NIST released a publication titled Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations (NIST.AI.100-2), which identifies threats to AI and machine learning systems along with potential mitigation strategies. The guidance aims to help developers and users understand attacks on AI, which are grouped into four categories:

• Evasion attacks, which attempt to alter an input to change how the system responds to it, such as causing an autonomous vehicle to misinterpret visual cues (a minimal illustrative sketch follows this summary).

• Poisoning attacks, which introduce corrupted or untrustworthy data during training, for example by creating many copies of incorrect information so that the model comes to rely on it.

• Privacy attacks, which attempt to learn sensitive information about an AI model or its training data in order to misuse the model, for example by reverse engineering prompts to reveal model weaknesses.

• Abuse attacks, which compromise a generative AI tool and overcome its safeguards to force it to carry out malicious acts, such as promoting hate speech or enabling cyberattacks.

The document includes mitigation guidance for the various types of attack, but it also notes the limitations of current mitigation techniques and the need for ongoing efforts to identify risks and potential defensive strategies.
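
To make the first category concrete, the following is a minimal sketch of an evasion-style attack against a toy classifier, in the spirit of the gradient-based perturbation techniques the NIST taxonomy describes. The toy model, synthetic data, and perturbation budget below are assumptions chosen for illustration only; they are not drawn from the NIST publication.

# Illustrative sketch of an evasion attack: perturb an input in the
# direction that increases the model's loss so the prediction flips.
# Toy logistic-regression model and synthetic data are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data: two Gaussian clusters.
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Train logistic regression with plain gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)          # gradient of cross-entropy loss
    grad_b = np.mean(p - y)
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

def predict(v):
    return int(1.0 / (1.0 + np.exp(-(v @ w + b))) > 0.5)

# Evasion step: move one input along the sign of the loss gradient
# with respect to that input (fast-gradient-sign style perturbation).
x = X[0]                                      # a class-0 example
p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
grad_x = (p - y[0]) * w                       # d(loss)/dx for logistic regression
epsilon = 1.5                                 # perturbation budget, exaggerated for the toy example
x_adv = x + epsilon * np.sign(grad_x)

print("original prediction:", predict(x), "adversarial prediction:", predict(x_adv))

Running the sketch shows the perturbed input being classified differently from the original, which is the essence of an evasion attack: the attacker changes the input, not the model.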