2026/06/08/xiaomi-says-mimo-v2-5-pro-ultraspeed-reaches-1
Xiaomi says MiMo-V2.5-Pro-UltraSpeed reaches 1,000+ tokens per second for a 1T-parameter AI model
EDITOR BRIEF
Xiaomi released MiMo-V2.5-Pro-UltraSpeed with TileRT, claiming it is the first 1T-parameter model to exceed 1,000 tokens per second in decode speed, reaching up to about 1,200 tokens per second. The API is available as a limited trial from June 9 to June 23, 2026, priced at 3x the MiMo-V2.5-Pro rate, with Xiaomi promising roughly 10x faster generation.
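To make the price-versus-speed trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures in the brief (1,000 tokens per second, 3x price, roughly 10x speedup). The baseline decode rate is an assumption derived by dividing the claimed rate by 10, not a number Xiaomi published, and the function names are hypothetical. The sketch also ignores prefill time and network latency, so it describes decode time only.

```python
# Back-of-the-envelope comparison based on the figures quoted in the brief.
# Assumption: the baseline rate is implied by "roughly 10x faster" and is
# not a published number. Prefill and network latency are ignored.

ULTRASPEED_TPS = 1_000.0   # claimed decode floor, tokens per second
CLAIMED_SPEEDUP = 10.0     # "roughly 10x faster generation"
PRICE_MULTIPLIER = 3.0     # "3x the MiMo-V2.5-Pro price"

# Implied baseline decode rate for MiMo-V2.5-Pro (assumption, see above).
BASELINE_TPS = ULTRASPEED_TPS / CLAIMED_SPEEDUP


def decode_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `output_tokens` at a steady decode rate."""
    return output_tokens / tokens_per_second


def compare(output_tokens: int) -> dict:
    """Summarize decode time for both tiers and the relative price."""
    fast = decode_seconds(output_tokens, ULTRASPEED_TPS)
    slow = decode_seconds(output_tokens, BASELINE_TPS)
    return {
        "ultraspeed_seconds": fast,
        "baseline_seconds": slow,
        "seconds_saved": slow - fast,
        "relative_cost": PRICE_MULTIPLIER,
    }


if __name__ == "__main__":
    # Example: a long agentic response of 20,000 output tokens.
    print(compare(20_000))
```

Under these assumptions, a 20,000-token response would take about 20 seconds instead of about 200, at three times the per-token price. Whether that is worthwhile depends on how much a workflow is bottlenecked by waiting on generation.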
CONTEXT
The launch highlights how inference speed is becoming a key differentiator as large models move from batch-style outputs toward real-time collaboration and agentic workflows. Limited access and enterprise prioritization suggest high-speed inference remains resource-constrained, making infrastructure efficiency a competitive battleground.
ARTICLE
MiMo-V2.5-Pro-UltraSpeed: 1T model with 1,000 tokens per second


