Microsoft has released an open access automation framework called PyRIT (short for Python Risk Identification Tool) to proactively identify risks in generative artificial intelligence (AI) systems.
“The goal is to allow researchers to have a baseline of how well their model and entire inference pipeline is doing against different harm categories and to be able to compare that baseline to future iterations of their model,” Microsoft said.
The company said PyRIT could be used to assess the robustness of large language model (LLM) endpoints against different harm categories such as fabrication (e.g., hallucination), misuse (e.g., bias), and prohibited content (e.g., harassment).
It can also be used to identify security harms ranging from malware generation to jailbreaking, as well as privacy harms like identity theft.