Does Anthropic deserve the trust of the cybersecurity community?

The cybersecurity industry runs on trust: the belief that when a vendor says it will behave a certain way, it will; that critical CVEs are in fact critical; and that when companies say they're GDPR compliant, they really are. But earning trust is not a one-and-done thing.

Anthropic understood this better than any other AI company. As OpenAI moved fast and broke things, Anthropic published a Responsible Scaling Policy (RSP), a framework for addressing catastrophic risks. Although it's a voluntary pledge, combined with its leadership's earnest ambition to lead an industry-wide "race to the top" on safety, it quickly made Anthropic the poster child for trustworthy AI.

In January 2026, the Anthropic-OpenAI rivalry spilled out into the public and things very quickly got weird.

In the lead up to the Super Bowl, Anthropic spent millions on ads mocking OpenAI for launching ads in ChatGPT. With headlines reading “Deception,” “Betrayal,” “Treachery,” and “Violation,” Anthropic brazenly claimed the moral high ground. Claude, the ads declared, would never use your most intimate conversations to serve you ads.

On February 20th, Anthropic launched Claude Code Security, spooking the cybersecurity industry and cybersecurity stocks.

Then, on February 24th, it quietly published RSP 3.0. It framed the rewrite as a maturation of the policy, and in some ways, it is. But buried in the copy is a structural shift that security professionals should not ignore.

The previous RSP committed to keeping Anthropic's absolute risk levels below acceptable thresholds regardless of what competitors did. Version 3.0 explicitly abandons that framing. Now Anthropic's formerly stringent safety commitments are relative: if its competitors aren't pausing, it won't either, since all pausing would do is hand the wheel to the least responsible driver. Understandable, but move the goalposts, and your brand moves with them. And rumors about a rift with the Pentagon were already swirling when RSP 3.0 dropped.

On February 26th, Anthropic CEO Dario Amodei published a statement holding firm that Anthropic would not contract with the Pentagon unless it agreed not to use its models for mass surveillance of Americans or to build fully autonomous weapons.

On February 27th, the Pentagon fired Anthropic and labeled them a supply-chain risk, a designation normally reserved for foreign adversaries.

On February 28th, OpenAI swooped in to replace Anthropic, initially claiming it would hold the military to Anthropic’s red lines. Soon after (and true to form), it admitted that wasn’t exactly the case.

On March 9th, Anthropic sued the Pentagon, and more than 30 employees from OpenAI and Google DeepMind filed an amicus brief in support. The public rallied around Anthropic as well.

That’s a lot, I know. Let’s take a step back and connect the dots:

In five weeks, Anthropic very publicly staked its entire brand identity on being trustworthy. After launching its first security tools, it softened the specific, verifiable commitments that made it trustworthy. It then found its spine and held firm against the Pentagon, overshadowing the fact that it had already moved the goalposts on its internal safety framework. Then it got fired in a spectacularly disrespectful fashion.

This is often how trust erodes, not in one dramatic betrayal but in a chain of individually defensible decisions. Security professionals know this pattern intimately, because it’s how attackers operate. They stealthily hop from one seemingly innocuous exposure to the next, exposures that when chained together, create exploitable attack paths.

To be clear, I'm not accusing Anthropic of being devious or evil. I agree 100% with the basis of its suit: that contractual restrictions on AI use are a critical safeguard in the absence of public law. And I hope it wins.

But prolonged litigation has a way of sucking the soul out of litigants. It's sure to get ugly, and many beloved vendors have fallen from grace before, from Kodak to Enron to BlackBerry to Boeing. The stakes here are different in degree, but not in kind.

AI models are now embedded in code review, vulnerability management, and security architecture at a pace governance frameworks haven’t caught up to. Practitioners are extending trust to AI vendors at levels with no precedent in enterprise software history.

Anthropic was supposed to be our canary in the AI coal mine. The entire value of a canary is that it doesn’t negotiate. It doesn’t check what the other canaries are doing before it reacts, it simply passes out. That’s how you know it’s telling the truth about the air, whether the coal company likes it or not.

RSP 3.0 changed that. Now the canary needs to check the competitive landscape before entering the mine. That’s not a safety system, it’s a press release.

So, I’ll end where I began: does Anthropic deserve the trust of the cybersecurity community?

Right now, I'd say yes. But look how fast things changed in five weeks. I'd pause before allowing a vendor steeped in ambiguity to be so deeply entrenched in my company's source code and security stack.

But it’s not my call, it’s yours.
