Betterleaks: Open-source secrets scanner

Secrets scanning has become standard practice across engineering organizations, and Gitleaks has been one of the most widely used tools in that space. The author of that project has now released a new tool called Betterleaks, which is designed to scan git repositories, directories, and standard input for leaked credentials, API keys, tokens, and passwords.

Zach Rice, who wrote the original Gitleaks code approximately eight years ago and now serves as Head of Secrets Scanning at Aikido Security, is the project lead for Betterleaks.

Scan times compared against Gitleaks on real-world repos (Source: Betterleaks GitHub page)

Why a new project

Rice explained in a post that he no longer has full control over the Gitleaks repository and name, which prompted him to start a new project. Betterleaks is built to function as a drop-in replacement for Gitleaks, meaning existing CLI flags and configuration files carry over without modification.

The most significant technical change in Betterleaks is its approach to filtering candidate secrets. Gitleaks, like many scanners, relies on Shannon entropy to identify strings that are likely secrets. Betterleaks introduces a different technique called Token Efficiency, based on byte pair encoding (BPE) tokenization.

The approach measures how well a BPE tokenizer compresses a given string. Natural language compresses well: the tokenizer covers it with a few long tokens, so token efficiency is high. Secrets and random strings compress poorly: they break into many short tokens, so token efficiency is low. Betterleaks uses this as a signal to filter out false positives. Against the CredData dataset, Token Efficiency achieved 98.6 percent recall compared to 70.4 percent for entropy, according to Rice.
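To make the contrast concrete, here is a minimal Python sketch of the two signals. It is not Betterleaks code: the toy training corpus, the merge count, and the definition of token efficiency as characters per token are all assumptions chosen for illustration, and a production tokenizer would be trained on far more text.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Bits per character, from the string's own character frequencies."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def train_bpe(corpus, num_merges):
    """Learn BPE merges by repeatedly merging the most frequent adjacent pair."""
    words = [list(w) for w in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in words:
            pairs.update(zip(w, w[1:]))
        if not pairs:  # every word is already a single token
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        words = [apply_merge(w, best) for w in words]
    return merges


def token_efficiency(s, merges):
    """Characters per token after applying the learned merges in order.

    Assumed definition for this sketch; Betterleaks may compute it differently.
    """
    if not s:
        return 0.0
    symbols = list(s)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return len(s) / len(symbols)


# Toy training corpus (an assumption; real tokenizers train on far more text).
MERGES = train_bpe(["password", "secret", "token", "example", "config"] * 3, 40)

for candidate in ["example_config_token", "x7Kq9ZpL2vRt8Wm3"]:
    print(candidate,
          round(shannon_entropy(candidate), 2),
          round(token_efficiency(candidate, MERGES), 2))
```

In this sketch the dictionary-like string collapses into a handful of long tokens, while the random-looking string stays at one character per token. Any cutoff between the two is a tuning decision that the sketch does not attempt to model.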

Validation logic in Betterleaks is written using the Common Expression Language (CEL), giving rule authors programmatic control over what counts as a confirmed secret.

The tool also handles doubly and triply encoded secrets by default and parallelizes git scanning to reduce scan times. It is written in pure Go without CGO, which removes the dependency on Hyperscan and lets it run in environments that lack native libraries. It can scan archives, including nested archives, and writes results as JSON, CSV, JUnit, SARIF, or custom template formats.
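The idea behind catching multiply encoded secrets can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Betterleaks implementation: it handles only base64, and the `demo_` token pattern, the `find_secrets` helper, and the depth limit of three are hypothetical.

```python
import base64
import re

# Hypothetical detection rule, for illustration only.
TOKEN_RE = re.compile(r"demo_[A-Za-z0-9]{20}")
# Runs of base64 characters long enough to be worth decoding.
B64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def find_secrets(text, max_depth=3):
    """Match the rule on `text`, then on base64-decoded layers up to max_depth."""
    found = set(TOKEN_RE.findall(text))
    if max_depth == 0:
        return found
    for candidate in B64_RE.findall(text):
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("utf-8")
        except ValueError:
            # Not valid base64, or not UTF-8 text once decoded: skip it.
            continue
        found |= find_secrets(decoded, max_depth - 1)
    return found


# Wrap a fake token in three layers of base64 and scan the result.
token = "demo_" + "A1b2C3d4E5" * 2
wrapped = token
for _ in range(3):
    wrapped = base64.b64encode(wrapped.encode()).decode()

print(find_secrets("value: " + wrapped))
```

The depth limit bounds the extra work per candidate: each additional layer means another decode and another pass of the rules over the decoded text.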

Planned v2 features

The project roadmap includes several capabilities not in the current v1 release. Rice described plans for LLM-assisted classification, where anonymized candidate secrets are passed to a local or remote language model for additional context. Auto-revocation support is planned for providers that expose credential revocation APIs. The team also intends to add permissions mapping, which would show what access a detected secret actually carries.

AI agent usage

The tool is designed with flag-based output control so that AI coding agents can run it as a subprocess and consume its output without excess token overhead. Rice noted that agents operating in tools like Claude Code or Cursor reach for CLI utilities with controllable output, and Betterleaks is built to meet that pattern.

Betterleaks is available for free on GitHub.
