A Chinese AI Lab Claims It Matches Anthropic's Restricted Cyber Model

A Chinese AI lab is needling US policy at its most sensitive point. Z.ai — the Beijing company formerly known as Zhipu AI, one of China's top model developers — says its latest system rivals Anthropic's Mythos at hunting software vulnerabilities, as The Verge reported. That is precisely the skill that prompted Washington to restrict Mythos in the first place.

Why this is pointed

A bit of recent history. Earlier this month the US government ordered Anthropic to cut non-US users off from Mythos and its sibling Fable 5, citing national security — specifically the models' unusually strong ability to find security flaws in code, a dual-use power that helps defenders patch holes but could also help attackers exploit them. Boursel has covered the ban and the scramble by Asian labs to fill the gap.

Z.ai's answer is GLM-5.2, released the day after the curbs and — crucially — open-source, under a permissive license that lets anyone download, modify and run it on their own machines. Where Mythos is locked to vetted US users on Anthropic's servers, GLM-5.2 is, by design, ungated.

What the benchmark actually shows

The headline claim deserves a hard look. Z.ai's case leans on testing by Semgrep, a code-security firm, which pitted GLM-5.2 against Anthropic models on one specific task: spotting a common web flaw called an insecure direct object reference. On that test GLM-5.2 scored around 39%, ahead of an earlier Claude model's 28–32%.

But Semgrep itself piled on caveats, and they matter. The result covers a single vulnerability type, on one dataset, in one run — not a broad, head-to-head bake-off against the actual restricted Mythos model. Performance also swings with how a model is prompted, and Semgrep's own purpose-built tooling scored far higher (53–61%) than any bare model, a reminder that the raw model is only part of the story. In short: an interesting data point, not proof of parity. Independent, like-for-like benchmarks against Mythos don't yet exist publicly, and security researchers have openly doubted that GLM-5.2 truly matches it across the board.
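For readers unfamiliar with the flaw at the center of that test, here is a minimal sketch of an insecure direct object reference. It is illustrative only and is not drawn from Semgrep's dataset or from any model's output; the `Invoice` type, the `INVOICES` store and both handler functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    id: int
    owner: str
    total: float


# Hypothetical in-memory store standing in for a real database.
INVOICES = {
    1: Invoice(1, "alice", 120.0),
    2: Invoice(2, "bob", 75.5),
}


def get_invoice_vulnerable(current_user: str, invoice_id: int) -> Invoice:
    # IDOR: the handler trusts the client-supplied ID and never checks
    # that the record belongs to the logged-in user.
    return INVOICES[invoice_id]


def get_invoice_checked(current_user: str, invoice_id: int) -> Invoice:
    # Fix: look the record up, then enforce ownership before returning it.
    invoice = INVOICES[invoice_id]
    if invoice.owner != current_user:
        raise PermissionError("invoice does not belong to this user")
    return invoice
```

In the vulnerable version, a user logged in as "alice" can read bob's invoice simply by asking for ID 2. Nothing in the code looks broken in isolation, since the bug is a missing check rather than a malformed one, which is part of why this class of flaw makes a demanding test for automated tools.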

The real significance

Whether or not GLM-5.2 equals Mythos, the episode exposes the limits of export controls on AI. Restricting one company's model accomplishes little if a capable, open-source substitute is a free download away, and China's labs are racing to provide exactly that. The same logic applies to the ban itself: cutting allies off from US models pushes them to seek alternatives and gives Chinese developers a marketing opening they are happy to exploit.

There's a darker edge, too. Open models with strong vulnerability-finding skills are hard to keep on a leash: reports quickly emerged of users circulating "jailbreaks" to strip GLM-5.2's safety guardrails. That is the core tension of this whole contest — the same AI that could automate cyber-defense could automate cyber-offense, and an open release puts the capability in everyone's hands at once.

The bottom line

Treat the "matches Mythos" claim with skepticism: it's a narrow benchmark dressed up as a milestone. But treat the underlying message seriously. The US is trying to keep frontier AI, especially the kind that can break software, under tight national control. China's open-source push is a direct bet that Washington can't, and that the cutting edge of AI is about to become a lot harder for any one government to fence in. For the security industry, that is the development to watch, whatever the benchmark scores say.