Decrypt | Jul 02, 11:14 PM | 2 min

Researchers Say New Jailbreak Method Can Make Chatbots Repeat Hidden Attack Prompts

AI researchers say they used a new jailbreak technique that caused chatbots to treat attacker-written text as part of their own reasoning, bypassing safety guardrails. The finding points to a deeper security weakness in how some models process prompts.

What happened?

Researchers say they have found a jailbreak technique that makes AI chatbots treat attacker-written text as if it were part of their own internal reasoning, allowing the systems to bypass safety guardrails. In tests described by the researchers, the method led models to share disallowed information, including cocaine recipes.

The technique appears to exploit the model’s tendency to absorb attacker-written text into its own chain of thought rather than recognizing it as external manipulation. That sets it apart from simpler jailbreaks that rely on obvious prompt tricks.

The researchers framed the issue as more than a one-off workaround, arguing that it reveals a structural problem in model behavior. Their results suggest that better defenses may require changes to how AI systems handle context and instruction boundaries, not just stricter filters on outputs.

Why it matters

The development highlights a deeper security flaw in how AI systems interpret prompts and separate trusted instructions from injected text. For companies building or deploying AI tools, that kind of vulnerability raises concerns about content moderation, misuse, and the reliability of safety controls.

For readers following the broader technology and crypto ecosystem, the finding is another reminder that AI safety remains an active and unresolved issue. As more platforms integrate AI into consumer products, trading tools, and moderation systems, weaknesses like this can create operational and trust risks.
