The Department of Defense is working to address vulnerabilities in AI systems that attackers can exploit with visual tricks or manipulated signals. Its research program, Guaranteeing AI Robustness Against Deception (GARD), has been investigating these “adversarial attacks” since 2022.
Researchers have shown that seemingly innocuous patterns can trick AI into misidentifying objects, with potentially dire consequences on the battlefield. An AI could, for example, mistake a bus full of passengers for a tank if the bus is tagged with the right “visual noise.”
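To make the mechanism concrete, below is a minimal sketch of one common way such “visual noise” is crafted, the fast gradient sign method. It assumes a trained PyTorch image classifier `model` and a correctly classified input `x` with label `y`; those names and the `eps` budget are illustrative placeholders, not details from the GARD program.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=0.03):
    """Return a copy of x nudged by a small, nearly invisible perturbation
    chosen to increase the classifier's loss (pixel values assumed in [0, 1])."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Move every pixel a tiny step (eps) in the direction that most increases the loss.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# A perturbation this small is usually imperceptible to a person,
# yet model(x_adv) can differ sharply from model(x).
```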
These concerns come amid public anxiety over the Pentagon's development of autonomous weapons. In response, the Department of Defense recently updated its AI development rules to emphasize “responsible conduct” and to require approval for all deployed systems.
The modestly funded GARD program has made progress in developing defenses against such attacks. It is also providing some of its tools to the Department of Defense's newly created Chief Digital and Artificial Intelligence Office (CDAO).
But some advocacy groups remain concerned. They worry that even if no one is intentionally manipulating the signals, AI-powered weapons could misread a situation and attack when there is no reason to. They argue that such weapons could lead to unintended escalation, especially in tense areas.
As the Department of Defense works to modernize its arsenal with autonomous weapons, it emphasizes the urgency of addressing these vulnerabilities and of developing the technology responsibly.
According to a statement from the Defense Advanced Research Projects Agency, GARD researchers from Two Six Technologies, IBM, MITRE, the University of Chicago, and Google Research have produced the following virtual testbed, toolbox, benchmark dataset, and training materials, which are now available to the broader research community:
- The Armory virtual platform, available on GitHub, serves as a “testbed” for researchers who require repeatable, scalable, and robust evaluation of adversarial defenses.
- The Adversarial Robustness Toolbox (ART) provides tools for developers and researchers to defend and evaluate machine-learning models and applications against numerous adversarial threats (a brief usage sketch follows this list).
- The Adversarial Patches Rearranged In COnText (APRICOT) dataset enables reproducible studies of the real-world effectiveness of physical adversarial patch attacks against object detection systems.
- The Google Research Self-Study repository contains “test dummies” that represent common ideas and approaches to building defenses.
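As a rough illustration of the workflow ART supports, here is a hedged sketch of a robustness evaluation: wrap a trained classifier in an ART estimator, generate adversarial test inputs under a bounded perturbation, and compare clean versus adversarial accuracy. The model `net`, the NumPy test arrays `x_test` and `y_test`, the input shape, and the perturbation budget are placeholder assumptions, not details from the DARPA release.

```python
import numpy as np
import torch
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import ProjectedGradientDescent

# Wrap a trained PyTorch model (placeholder `net`) in an ART estimator.
classifier = PyTorchClassifier(
    model=net,
    loss=torch.nn.CrossEntropyLoss(),
    input_shape=(3, 224, 224),
    nb_classes=10,
    clip_values=(0.0, 1.0),
)

# Craft adversarial versions of the test set with a small, bounded perturbation.
attack = ProjectedGradientDescent(
    estimator=classifier, eps=8 / 255, eps_step=2 / 255, max_iter=10
)
x_adv = attack.generate(x=x_test)  # x_test: NumPy array of images in [0, 1]

# Compare clean vs. adversarial accuracy (y_test: NumPy array of integer labels).
clean_acc = np.mean(np.argmax(classifier.predict(x_test), axis=1) == y_test)
adv_acc = np.mean(np.argmax(classifier.predict(x_adv), axis=1) == y_test)
print(f"clean accuracy: {clean_acc:.3f}, adversarial accuracy: {adv_acc:.3f}")
```

The Armory testbed described above is designed to run this kind of evaluation as repeatable, configuration-driven scenarios rather than ad hoc scripts.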