Train, but Verify: Towards Practical AI Robustness
November 2020 • Presentation
This presentation describes efforts to train AI systems to enforce at least two security policies and verify security by testing against realistic threat models.
Software Engineering Institute
The “Train, but Verify” project addresses a gap in the state of the art for secure training of ML systems. It has two objectives:
- Train secure AI systems by training ML models to enforce at least two security policies.
- Verify the security of AI systems by testing against declarative, realistic threat models.
We consider security policies from the Beieler taxonomy: ensure that an ML system does not learn the wrong thing during training (e.g., data poisoning), do the wrong thing during operation (e.g., adversarial examples), or reveal the wrong thing during operation (e.g., model inversion or membership inference).
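As an illustration of the "do the wrong thing during operation" policy, the following minimal sketch tests a toy linear classifier against a declaratively specified threat model. It is not taken from the presentation: the `ThreatModel` class, the helper names, and the choice of a one-step gradient-sign attack (in the style of the fast gradient sign method) under an L-infinity budget are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatModel:
    """Hypothetical declarative threat model: the attacker may change
    each input feature by at most `epsilon` (an L-infinity budget)."""
    epsilon: float


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic model with weights w, bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def gradient_sign_attack(w, b, x, y, threat):
    """One-step attack: move each feature by epsilon in the direction
    that increases the logistic loss for the true label y (0 or 1)."""
    p = predict_proba(w, b, x)
    # Gradient of the logistic loss with respect to the input x.
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + threat.epsilon * sign(g) for xi, g in zip(x, grad)]


def is_robust(w, b, x, y, threat):
    """True if the prediction on the attacked input still matches y."""
    x_adv = gradient_sign_attack(w, b, x, y, threat)
    return (predict_proba(w, b, x_adv) >= 0.5) == bool(y)
```

With illustrative values `w = [2.0, -1.0]`, `b = 0.0`, `x = [0.5, 0.2]`, and label `1`, the check passes for `ThreatModel(epsilon=0.1)` and fails for `ThreatModel(epsilon=0.5)`. Keeping the attacker's budget in a separate object is one way to keep the threat model declarative, so the same test can be rerun under different assumptions about the attacker.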