Researcher Claims Anthropic’s New Fable 5 Guardrails Have Already Been Bypassed

What happened?

An AI researcher using the name “Pliny the Liberator” claims to have found ways around guardrails in Anthropic’s newly launched Fable 5. The claim highlights the continuing tension between AI safety controls and researchers who test their limits.

Why it matters

For readers in crypto and technology markets, the episode is another reminder that AI infrastructure is still being tested in public. As AI tools become more common across trading, research, developer workflows, and content systems, the reliability of model safeguards remains a practical business concern.

Pliny the Liberator says he has already bypassed guardrails in the newly launched model. According to Cointelegraph, the researcher described the effort as “cleverly finding the holes in the fence that the thought police missed.”

The claim matters because guardrails are a central part of how AI companies present their systems as safer and more controlled. If those limits can be bypassed soon after launch, it raises fresh questions for companies relying on AI models to manage risk, compliance, and user trust.

Anthropic has positioned its AI systems around safety-focused design, which makes any reported jailbreak especially notable. The Cointelegraph report does not provide technical details of the bypass, independent verification, or a response from Anthropic.

The report rests solely on the researcher’s claim and does not establish whether the bypass is repeatable or broadly exploitable. Without further confirmation, it should be treated as an early claim in the wider debate over AI guardrails rather than a settled security finding.

Source: Cointelegraph