Thursday, January 15, 2026

OpenAI Collaborates with Cerebras for Accelerated AI Inference

OpenAI and Cerebras have forged a multi-year partnership to deploy 750 MW of Cerebras wafer-scale AI compute, beginning in 2026 and continuing through 2028. The collaboration is aimed squarely at faster inference: by combining vast compute, memory, and bandwidth on a single chip, Cerebras' systems achieve inference speeds up to 15 times faster than traditional GPU-based setups, dramatically reducing response times. OpenAI's Head of Compute Infrastructure, Sachin Katti, highlights the importance of dedicated low-latency hardware for real-time AI interactions, while Cerebras CEO Andrew Feldman likens the partnership to the broadband revolution for the internet. On Cerebras hardware, models such as GPT-OSS-120B and Llama 3.1-70B reach roughly 3,000 and 2,100 tokens per second, respectively, pointing to markedly faster and more responsive AI applications.
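To put those throughput figures in perspective, here is a rough back-of-the-envelope sketch of what they mean for response latency. The tokens-per-second numbers are the ones cited above; the response length and the GPU baseline are illustrative assumptions, not figures from the announcement.

```python
# Back-of-the-envelope sketch: what decode throughput means for response latency.
# Throughput figures come from the article; the response length and the GPU
# baseline are illustrative assumptions only.

def response_time_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream a reply of num_tokens at a given decode throughput."""
    return num_tokens / tokens_per_second

RESPONSE_TOKENS = 500  # assumed length of a typical chat reply

scenarios = {
    "GPT-OSS-120B on Cerebras": 3_000,   # tokens/s, per the article
    "Llama 3.1-70B on Cerebras": 2_100,  # tokens/s, per the article
    "Hypothetical GPU baseline": 200,    # assumed ~15x slower, for comparison
}

for name, tps in scenarios.items():
    t = response_time_seconds(RESPONSE_TOKENS, tps)
    print(f"{name}: {RESPONSE_TOKENS} tokens in ~{t:.2f} s")
```

Under these assumptions, a several-hundred-token reply streams in a fraction of a second on the Cerebras figures versus a couple of seconds on the slower baseline, which is the kind of real-time interaction Katti refers to.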
