How to automate static-analysis configuration through machine learning
Leading static analysis tools achieve high precision in finding security bugs once they have been adapted to a codebase. On a new codebase, however, they do not work “out-of-the-box” and produce many false warnings because of incorrect configuration, for example wrongly specified source and sink APIs. Adapting them requires considerable effort from security experts.
This talk focuses on classification approaches for learning security APIs. We will explain how code information can be used to automatically detect those APIs and what level of automation can be achieved. We will also demonstrate an active learning tool that enables developers and security experts to adapt their tools with low effort.
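As a rough illustration of what classifying security APIs from code information can look like, the sketch below trains a small Naive Bayes classifier on tokens extracted from method signatures. It is not the approach presented in the talk: the choice of Naive Bayes, the labels `source`, `sink`, and `neither`, the helper names, and the nine hand-labeled training signatures are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(signature):
    """Split a method signature into lowercase word tokens (camelCase aware)."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", signature)
    return [p.lower() for p in parts]


class SignatureClassifier:
    """Multinomial Naive Bayes over signature tokens, with add-one smoothing."""

    def fit(self, signatures, labels):
        self.token_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for signature, label in zip(signatures, labels):
            toks = tokenize(signature)
            self.token_counts[label].update(toks)
            self.vocab.update(toks)
        return self

    def log_scores(self, signature):
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # class prior
            for tok in tokenize(signature):
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, signature):
        scores = self.log_scores(signature)
        return max(scores, key=scores.get)


# Hypothetical, hand-labeled training set (an assumption for this sketch).
TRAINING = [
    ("HttpServletRequest.getParameter(String)", "source"),
    ("HttpServletRequest.getHeader(String)", "source"),
    ("BufferedReader.readLine()", "source"),
    ("Statement.executeQuery(String)", "sink"),
    ("Statement.executeUpdate(String)", "sink"),
    ("Runtime.exec(String)", "sink"),
    ("String.length()", "neither"),
    ("Math.max(int,int)", "neither"),
    ("List.size()", "neither"),
]

model = SignatureClassifier().fit(*zip(*TRAINING))

for candidate in ("HttpServletRequest.getQueryString()", "Statement.execute(String)"):
    print(candidate, "->", model.predict(candidate))
```

A realistic setup would presumably use far more labeled methods and richer features than name tokens, such as parameter types or data flow inside the method body; the sketch only shows the shape of the classification problem.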
This is a beginner-level talk. The audience will be introduced to the machine-learning foundations required to understand the main message of the talk. Prior knowledge of classification problems and a general understanding of taint-style vulnerabilities, such as SQL injection, are helpful but not required.
The audience will become familiar with state-of-the-art machine-learning approaches used to adapt security tools to developers’ needs. They will learn how classification techniques are used to detect security-relevant APIs in code, which is important for reducing the false warnings reported by static analysis tools. Moreover, they will gain insight into an active learning approach that enables developers to train their security tools to produce better reports.
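The active learning idea can be sketched as a loop in which the model repeatedly asks an expert to label the candidate it is least sure about. The example below uses uncertainty sampling with a small smoothed token model; it is a minimal sketch, not the tool demonstrated in the talk. The function names, the binary "security-relevant" label, the seed labels, the candidate pool, and the simulated expert answers are all assumptions.

```python
import math
import re
from collections import Counter


def tokens(signature):
    """Split a method signature into lowercase word tokens (camelCase aware)."""
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", signature)]


def train(labeled):
    """Count tokens per class; `labeled` maps signature -> is_security_relevant."""
    counts = {True: Counter(), False: Counter()}
    for signature, is_relevant in labeled.items():
        counts[is_relevant].update(tokens(signature))
    return counts


def prob_relevant(counts, signature):
    """Naive Bayes-style estimate of P(security-relevant), equal class priors."""
    vocab = len(set(counts[True]) | set(counts[False])) or 1
    pos_total = sum(counts[True].values()) + vocab
    neg_total = sum(counts[False].values()) + vocab
    log_odds = 0.0
    for tok in tokens(signature):
        p_pos = (counts[True][tok] + 1) / pos_total
        p_neg = (counts[False][tok] + 1) / neg_total
        log_odds += math.log(p_pos / p_neg)
    return 1 / (1 + math.exp(-log_odds))


def active_learning(seed, pool, ask_expert, rounds):
    """Run `rounds` label queries and return the final model and the queries made."""
    labeled, pool, queried = dict(seed), list(pool), []
    for _ in range(min(rounds, len(pool))):
        counts = train(labeled)
        # Uncertainty sampling: pick the candidate whose estimate is closest to 0.5.
        query = min(pool, key=lambda s: abs(prob_relevant(counts, s) - 0.5))
        labeled[query] = ask_expert(query)  # the expert supplies the true label
        pool.remove(query)
        queried.append(query)
    return train(labeled), queried


# Hypothetical seed labels, candidate pool, and simulated expert (assumptions).
seed = {"Statement.executeQuery(String)": True, "String.length()": False}
pool = [
    "Runtime.exec(String)",
    "List.size()",
    "HttpServletRequest.getParameter(String)",
    "Statement.executeUpdate(String)",
]
expert_answers = {
    "Runtime.exec(String)": True,
    "List.size()": False,
    "HttpServletRequest.getParameter(String)": True,
    "Statement.executeUpdate(String)": True,
}

final_counts, queried = active_learning(seed, pool, expert_answers.__getitem__, rounds=3)
print("Asked the expert about:", queried)
```

Querying only the most uncertain candidates is what keeps the labeling effort low: the expert answers a few targeted questions instead of labeling every API in the codebase.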