DeepTeam: Open-source LLM red teaming framework

Organizations are pushing large language models into products faster than security teams can test them, which makes any new red teaming method worth paying attention to. DeepTeam is an open-source framework built to probe these systems before they reach users, and it takes a direct approach to exposing weaknesses.


The tool runs on a local machine and uses language models both to simulate attacks and to evaluate the responses. It applies techniques drawn from recent research on jailbreaking and prompt injection, giving teams a way to uncover issues such as bias or exposure of personal data. Once DeepTeam finds a problem, it offers guardrails that can be added to production systems to block similar issues.
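Assuming the quickstart pattern from the project's repository (a red_team entry point, a Bias vulnerability, and a PromptInjection attack), a minimal run looks roughly like the sketch below; the async model_callback is a hypothetical placeholder for the application under test, and the exact API should be confirmed against the current docs.

```python
from deepteam import red_team
from deepteam.vulnerabilities import Bias
from deepteam.attacks.single_turn import PromptInjection

# Hypothetical stand-in for the system under test: replace with a call
# into your own chatbot, RAG pipeline, or agent.
async def model_callback(input: str) -> str:
    return f"Sorry, I can't help with: {input}"

# Simulate prompt-injection attacks probing for biased responses, then
# have an evaluator model score what the callback returns.
risk_assessment = red_team(
    model_callback=model_callback,
    vulnerabilities=[Bias(types=["race", "gender"])],
    attacks=[PromptInjection()],
)
```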

DeepTeam supports a range of model setups. It can test retrieval-augmented generation (RAG) pipelines, chatbots, agents, and base models. The goal is to show where a system might behave in unsafe ways long before it is deployed.
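Because the only integration point is the callback, the same harness covers each of these setups: whatever the callback wraps is what gets probed. A sketch with hypothetical retrieve and generate stand-ins for a RAG pipeline:

```python
from typing import List

def retrieve(query: str) -> List[str]:
    # Placeholder retriever: swap in your vector store lookup.
    return ["DeepTeam is an open-source LLM red teaming framework."]

def generate(query: str, context: List[str]) -> str:
    # Placeholder generator: swap in your LLM call.
    return f"Using {len(context)} document(s), I can't answer: {query}"

async def rag_callback(input: str) -> str:
    # DeepTeam invokes this callback for every simulated attack, so the
    # whole pipeline (retrieval plus generation) is probed end to end.
    return generate(input, retrieve(input))
```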

The project includes more than 80 vulnerability types, which are used to scan an application for different risks. Users who need to test for a specific issue can define their own vulnerability type; DeepTeam registers these additions automatically and keeps a record of each custom entry. Built-in and custom tests can be used together without extra configuration, and if a user does not provide a prompt for a new vulnerability, the framework supplies a template.
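A sketch of how that might look, assuming a CustomVulnerability class whose constructor takes a name, a list of types, and an optional custom_prompt; the parameter names are assumptions drawn from the project's documentation and should be verified against the current API.

```python
from deepteam import red_team
from deepteam.vulnerabilities import Bias, CustomVulnerability

# Assumed constructor parameters: name, types, and an optional
# custom_prompt. With no custom_prompt supplied, DeepTeam falls back to
# its own attack-generation template for the new vulnerability.
data_exposure = CustomVulnerability(
    name="Internal Data Exposure",
    types=["customer_records", "internal_documents"],
)

# Hypothetical stand-in for the application under test.
async def model_callback(input: str) -> str:
    return f"Sorry, I can't help with: {input}"

# Built-in and custom vulnerabilities run side by side with no extra setup.
red_team(
    model_callback=model_callback,
    vulnerabilities=[Bias(types=["race"]), data_exposure],
)
```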

DeepTeam is available for free on GitHub.
