Decrypt, Jun 13, 09:27 PM, 1 min read

Google's DiffusionGemma Reaches 1,000 Tokens Per Second

Google's DiffusionGemma is a free, open AI model that reportedly generates text at up to 1,000 tokens per second by avoiding word-by-word output. The tradeoff is practical: the model still does not run on most people's machines.

Why it matters

Speed is one of the main limits on how AI tools feel in real use. Faster text generation can make AI systems more responsive for companies, developers, and users building products around automated writing, coding, research, and support workflows.

What happened?

Google has introduced DiffusionGemma, a free, open AI model that can generate text at up to 1,000 tokens per second. The model reaches that speed by taking a different approach from typical text generators, avoiding the usual word-by-word generation process.

DiffusionGemma's core distinction is its generation method. Instead of producing text sequentially, it uses a diffusion-based approach, which is presented as the reason it can hit the 1,000-token-per-second mark.
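To put the headline figure in perspective, the arithmetic can be sketched in a few lines of Python. Only the 1,000-tokens-per-second rate comes from the report, and it is described as an upper bound; the response lengths and the 50-tokens-per-second comparison rate are illustrative assumptions, not measurements of any particular model.

```python
# Back-of-the-envelope latency estimate: seconds = tokens / tokens_per_second.

REPORTED_PEAK_TPS = 1_000  # "up to" figure quoted in the article
ASSUMED_BASELINE_TPS = 50  # hypothetical comparison rate, NOT from the article


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to emit num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    # Response lengths are illustrative assumptions.
    for num_tokens in (100, 500, 2_000):
        fast = generation_seconds(num_tokens, REPORTED_PEAK_TPS)
        slow = generation_seconds(num_tokens, ASSUMED_BASELINE_TPS)
        print(f"{num_tokens:>5} tokens: {fast:.2f} s at reported peak, "
              f"{slow:.2f} s at assumed baseline")
```

At the reported peak, a 500-token reply would take about half a second. Sustained rates in real deployments may be lower, since the figure is stated as a maximum.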

That performance does not yet make the model accessible to everyone in practice. According to the report, DiffusionGemma still does not run on most people's machines, which limits how broadly users can test or deploy it today.

For now, DiffusionGemma stands out as a high-speed, free, open AI release from Google, but its impact will depend on whether the hardware barrier becomes less restrictive over time.

