TWC: Small: Automatic Techniques for Evaluating and Hardening Machine Learning Classifiers in the Presence of Adversaries
New security exploits emerge far faster than human analysts can examine them, driving growing interest in automated machine learning tools for computer security. Classifiers based on machine learning algorithms have shown promising results for many security tasks, including malware classification and network intrusion detection, but classic machine learning algorithms are not designed to operate in the presence of adversaries. Intelligent and adaptive adversaries may actively manipulate the information they present in an attempt to evade a trained classifier, leading to a competition between the designers of learning systems and the attackers who wish to evade them. This project is developing automated techniques for predicting how well classifiers will resist adversarial evasion, along with general methods to automatically harden machine-learning classifiers against evasion attacks.
At the junction between machine learning and computer security, this project involves two main tasks: (1) developing a framework that can automatically assess the robustness of a classifier by using evolutionary techniques to simulate an adversary’s efforts to evade that classifier; and (2) developing generic machine learning architectures that employ randomized models and co-evolution to automatically harden classifiers against adversaries. Our system aims to allow a classifier designer to understand how the classification performance of a model degrades under evasion attacks, enabling better-informed and more secure design choices. The framework is designed to be general and scalable, and draws on recent advances in machine learning and computer security.
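To make task (1) concrete, the following is a minimal sketch of an evolutionary evasion search against a toy classifier. It is an illustration only, not the project's framework: the linear scorer, its weights and threshold, the `REQUIRED` feature set (a stand-in for the constraint that an evasive sample must keep its malicious behavior), and all function names are assumptions made for this example.

```python
import random

# Hypothetical toy classifier: a fixed linear scorer over binary features.
# Weights and threshold are illustrative values, not taken from the project.
WEIGHTS = [2.0, 1.5, 1.0, 0.5, -1.0, -1.5]
THRESHOLD = 1.0

# Assumption: features in REQUIRED must stay set for the sample to remain
# malicious, so the simulated adversary is not allowed to change them.
REQUIRED = {0}


def score(x):
    """Classifier score; higher means more likely malicious."""
    return sum(w * v for w, v in zip(WEIGHTS, x))


def is_flagged(x):
    """True when the classifier labels the sample as malicious."""
    return score(x) > THRESHOLD


def mutate(x, rng):
    """Flip one randomly chosen feature that the adversary may change."""
    free = [i for i in range(len(x)) if i not in REQUIRED]
    child = list(x)
    i = rng.choice(free)
    child[i] = 1 - child[i]
    return child


def evolve_evasion(seed, generations=200, pop_size=20, rng=None):
    """Search for a variant of `seed` that the classifier no longer flags.

    Returns (variant, generations_used), or (None, generations) on failure.
    """
    rng = rng or random.Random(0)
    population = [list(seed)]
    for gen in range(1, generations + 1):
        # Each child is a one-feature mutation of a random current member.
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        # Truncation selection: keep the variants with the lowest scores.
        population = sorted(population + children, key=score)[:pop_size]
        if not is_flagged(population[0]):
            return population[0], gen
    return None, generations


if __name__ == "__main__":
    variant, used = evolve_evasion([1, 1, 1, 1, 0, 0])
    print("evasive variant:", variant, "found after", used, "generation(s)")
```

In a sketch like this, the number of generations the search needs before it finds an evasive variant gives a rough measure of how much effort the classifier demands from an adversary. A hardening loop in the spirit of task (2) could add the evasive variants to the training data and retrain, although that step is not shown here.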