US Sanctions More Than 130 ISIS-Linked Crypto Wallets on Tron
The U.S. government sanctioned more than 130 Tron wallets connected to a Central Asian ISIS affiliate. Tether froze funds associated with the wallets.
ReadAI researchers say they used a new jailbreak technique that caused chatbots to treat attacker-written text as part of their own reasoning, bypassing safety guardrails. The finding points to a deeper security weakness in how some models process prompts.
AI researchers say they used a new jailbreak technique that caused chatbots to treat attacker-written text as part of their own reasoning, bypassing safety guardrails. The finding points to a deeper security weakness in how some models process prompts.
For readers following the broader technology and crypto ecosystem, the finding is another reminder that AI safety remains an active and unresolved issue. As more platforms integrate AI into consumer products, trading tools, and moderation systems, weaknesses like this can create operational and trust risks.
Researchers say they have found a jailbreak technique that can make AI chatbots treat attacker-written text as if it were part of their own internal reasoning, allowing the systems to bypass safety guardrails. In tests described by the researchers, the method led models to share disallowed information, including cocaine recipes.
The development matters because it highlights a deeper security flaw in how AI systems interpret prompts and separate trusted instructions from injected text. For companies building or deploying AI tools, that kind of vulnerability raises concerns about content moderation, misuse, and the reliability of safety controls.
The technique appears to exploit the model’s tendency to absorb attacker-written text into its own chain of thought rather than recognizing it as external manipulation. That makes it different from simpler jailbreaks that rely on obvious prompt tricks.
For readers following the broader technology and crypto ecosystem, the finding is another reminder that AI safety remains an active and unresolved issue. As more platforms integrate AI into consumer products, trading tools, and moderation systems, weaknesses like this can create operational and trust risks.
The researchers framed the issue as more than a one-off workaround, arguing that it reveals a structural problem in model behavior. Their results suggest that better defenses may require changes to how AI systems handle context and instruction boundaries, not just stricter filters on outputs.