Microsoft says its automated ai red teaming tool finds malicious content in a matter of hours – Breaking News & Latest Updates 2026
Skip to main content

From ChatGPT to Gemini: how AI is rewriting the internet

See all Stories

E
Microsoft says its automated AI red teaming tool finds malicious content “in a matter of hours.”

PyRIT, or Python Risk Identification Toolkit, can point human evaluators to “hot spot” categories in AI that might generate harmful prompt results.

Microsoft used PyRIT while redteaming (the process of intentionally trying to get AI systems to go against safety protocols) its Copilot services to write thousands of malicious prompts and score the response based on potential harm in categories that security teams can now focus on.

Follow topics and authors from this story to see more like this in your personalized homepage feed and to receive email updates.
Comments
Loading comments
Getting the conversation ready...