Anthropic is redeploying Claude Fable 5, its most capable generally available model. On June 30, it announced that the US export controls covering Claude Fable 5 and Claude Mythos 5 had been lifted. Fable 5 returned to users globally on Wednesday, July 1. Mythos 5 access has been restored to a set of US organizations.
The models were pulled on June 12. A US government directive barred foreign nationals from using them. Anthropic could not verify nationality in real time, so it suspended both models for everyone.
This article explains what triggered the block. It covers the new safeguard and the proposed jailbreak framework. It also shows how Fable 5 compares to rivals like GLM-5.2.
Quick facts
- Model: Claude Fable 5 (a Mythos-class model made safe for general use)
- Event: Redeployed July 1, 2026, after export controls were lifted
- Reason for pause: US export controls, triggered by an Amazon report on a safeguard bypass
- Fix: A new safety classifier that blocks the reported technique
- Pricing: $10 per million input tokens, $50 per million output tokens
- Where: Claude Platform, Claude.ai, Claude Code, Claude Cowork
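As a rough illustration of the pricing in the list above, the per-request cost can be estimated from token counts. This is a minimal sketch: the function and constant names are hypothetical, and it assumes flat per-token pricing with no discounts or other billing rules.

```python
# Prices taken from the quick facts (USD per million tokens).
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Example: 200,000 input tokens and 20,000 output tokens.
print(f"${estimate_cost_usd(200_000, 20_000):.2f}")  # prints $3.00
```

Output tokens cost five times as much as input tokens, so long generations tend to dominate the bill.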
What happened: the timeline
Anthropic launched Fable 5 and Mythos 5 on June 9. Both share the same underlying model. Fable 5 ships with strong safeguards for general use. Mythos 5 has some safeguards lifted for defensive cybersecurity partners.
On June 12, the US government applied export controls. The order took effect immediately. Anthropic suspended access rather than risk non-compliance.
The trigger was a report from Amazon researchers, who found a method of bypassing Fable 5’s safeguards. Their prompt led the model to identify several software vulnerabilities. In one case, it produced code showing how to exploit one of them.
By June 26, the government had approved restoring Mythos 5 for some US organizations. On June 30, the controls were fully lifted.
Why Anthropic says the finding was not unique
Anthropic tested whether the finding was unique to Fable 5. It was not.
Less capable models, including Claude Opus 4.8, GPT-5.5, and Kimi K2.7, identified the same vulnerabilities.
Every tested model reproduced the single exploit demonstration, including Haiku 4.5, Sonnet 4.6, Opus 4.6, Opus 4.7, Opus 4.8, GPT-5.4, GPT-5.5, and Kimi K2.7.
Anthropic says the technique exposed no unique Mythos-level cyber capabilities. It calls the case a borderline one for Fable 5’s safeguards. The behavior in question involved only routine defensive cybersecurity work.
How the new classifier works
Anthropic still moved to close the gap. It trained an improved safety classifier for the reported behavior.
The classifier blocks the specific technique in over 99% of cases. Blocked requests are not refused outright. They are routed to Claude Opus 4.8 instead. Users are notified when this fallback happens.
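The fallback behavior described above can be sketched as a small routing function. This is an illustration only, not Anthropic's implementation: the model identifier strings, the `is_flagged` classifier callback, the `call_model` callback, and the notice text are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional

PRIMARY_MODEL = "claude-fable-5"    # hypothetical identifier
FALLBACK_MODEL = "claude-opus-4.8"  # hypothetical identifier


@dataclass
class RoutedResponse:
    model: str             # model that actually answered
    text: str              # model output
    notice: Optional[str]  # user-facing message, set only on fallback


def route_request(
    prompt: str,
    is_flagged: Callable[[str], bool],
    call_model: Callable[[str, str], str],
) -> RoutedResponse:
    """Send flagged prompts to the fallback model instead of refusing them."""
    if is_flagged(prompt):
        text = call_model(FALLBACK_MODEL, prompt)
        notice = f"This request was answered by {FALLBACK_MODEL}."
        return RoutedResponse(FALLBACK_MODEL, text, notice)
    return RoutedResponse(PRIMARY_MODEL, call_model(PRIMARY_MODEL, prompt), None)


# Stub classifier and model call, for demonstration only.
demo_flag = lambda p: "exploit" in p.lower()
demo_call = lambda model, p: f"[{model}] response"

print(route_request("Write an exploit for this bug", demo_flag, demo_call).model)
print(route_request("Fix my failing unit test", demo_flag, demo_call).model)
```

The key design point is that a flagged request still gets an answer, and the `notice` field carries the notification the user sees.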
Researchers from the Department of Commerce’s CAISI tested both old and new safeguards. They agree the safeguards are extraordinarily strong. The tradeoff is more false positives during routine coding and debugging.
This reflects Anthropic’s ‘defense in depth’ design. Classifiers are smaller AI systems that detect harmful cyber tasks. A deliberate ‘safety margin’ also blocks some benign requests. Fable 5 uses a much larger safety margin than prior models.
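One common way to picture a safety margin is as a lowered blocking cutoff on a classifier's risk score. The sketch below assumes that model; the threshold, the margin values, and the benign scores are invented for illustration and are not Anthropic's figures. It shows why a larger margin produces more false positives.

```python
from typing import Sequence


def is_blocked(score: float, threshold: float, margin: float) -> bool:
    """Block when the risk score reaches the cutoff lowered by the margin."""
    return score >= threshold - margin


def false_positive_rate(
    benign_scores: Sequence[float], threshold: float, margin: float
) -> float:
    """Fraction of benign requests that would be blocked."""
    blocked = sum(is_blocked(s, threshold, margin) for s in benign_scores)
    return blocked / len(benign_scores)


# Invented risk scores for five benign coding requests (illustration only).
benign = [0.05, 0.10, 0.30, 0.55, 0.70]
THRESHOLD = 0.8  # hypothetical score at which a request is truly harmful

print(false_positive_rate(benign, THRESHOLD, margin=0.0))  # 0.0
print(false_positive_rate(benign, THRESHOLD, margin=0.3))  # 0.4
```

With no margin, none of the benign requests are blocked. With a margin of 0.3, the two highest-scoring benign requests are caught as well, which is the tradeoff the article describes.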
The proposed jailbreak severity framework
The episode exposed a gap. The industry has no shared standard for scoring a ‘jailbreak,’ a technique that bypasses a model’s safeguards.
Anthropic is drafting one with Amazon, Microsoft, Google, and other Glasswing partners. The draft scores a jailbreak on four criteria:
- Capability gain — how far beyond existing tools it takes the user.
- Breadth of capability gain — how many distinct offensive tasks it unlocks.
- Ease of weaponization — how much human effort an attack still needs.
- Discoverability — how easily someone can obtain the technique.
For the most severe class, Anthropic will deploy preliminary mitigations immediately. It is also standing up 24/7 monitoring of jailbreak submission channels.
Interactive scorer
Jailbreak Severity Scorer (interactive widget): scores an AI jailbreak on the four criteria Anthropic proposed with Amazon, Microsoft, and Google, and shows a composite severity and a suggested response tier.
Illustrative tool. Anthropic’s framework is a published work in progress and does not define numeric thresholds. The score here is an equal-weighted average built only to demonstrate the four criteria; it is not an official Anthropic score. Criteria and response language are adapted from Anthropic’s June 30, 2026 post, “Redeploying Claude Fable 5.”
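The equal-weighted average described in the note above can be sketched in a few lines. The 0 to 10 scale, the class name, and the field names are assumptions made for this example. Because the framework does not define numeric thresholds, the sketch computes only the composite and does not map it to a response tier.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class JailbreakScores:
    """Four criteria, each rated on an assumed 0-10 scale (hypothetical)."""
    capability_gain: float
    breadth_of_capability_gain: float
    ease_of_weaponization: float
    discoverability: float

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0 <= value <= 10:
                raise ValueError(f"{f.name} must be between 0 and 10, got {value}")


def composite_severity(scores: JailbreakScores) -> float:
    """Equal-weighted average of the four criteria; not an official score."""
    values = [getattr(scores, f.name) for f in fields(scores)]
    return sum(values) / len(values)


example = JailbreakScores(
    capability_gain=2,
    breadth_of_capability_gain=4,
    ease_of_weaponization=6,
    discoverability=8,
)
print(composite_severity(example))  # 5.0
```

Equal weighting keeps the example simple. A real framework might weight capability gain more heavily, but the source does not say so.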

