TECH & INNOVATION
Google's DiffusionGemma Generates 1,000+ Tokens a Second on a Local GPU
# Google# DiffusionGemma# diffusion models# local AI
Key Points
- The experimental model applies image-style diffusion to text generation
- It produces 256 tokens in parallel, up to 4x faster than autoregressive models
- Quality trails standard models; strengths are local inline editing and code completion
Analysis
Google's experimental DiffusionGemma ports image-diffusion techniques to text, generating 256 tokens in parallel for up to 4x speed — past 1,000 tokens a second on a local GPU. It trades some quality for speed, aiming at local inline editing and code completion.