
Anthropic pivots on silent Claude Fable restrictions

Anthropic has apologized for implementing stealth guardrails on its new Claude Fable 5 model, a practice that drew criticism for undermining researchers and developers. The company is now reversing course, pledging to replace silent answer degradation with explicit notifications whenever a query triggers safety protocols.

June 12, 2026

The controversy centered on how Fable, the first release from Anthropic’s high-risk Mythos class, handled distillation, a technique in which a smaller model is trained on the outputs of a larger one. Anthropic previously admitted it would surreptitiously alter and degrade responses if it detected distillation attempts, leaving users unaware that their results had been compromised. Critics argued this lack of transparency hampered both research efforts and fair competition.

Under the new policy, queries deemed high-risk will no longer be silently degraded. Instead, the system will route these requests to Claude Opus 4.8, the company’s previous flagship model. Anthropic confirmed that users will receive a clear notification every time this redirection occurs, mirroring the existing handling of sensitive topics like cybersecurity, chemistry, and biology.
