Tuesday, March 3, 2026

OpenAI Codex-Spark Delivers Lightning-Fast Coding Performance with Cerebras Hardware

OpenAI has introduced GPT-5.3-Codex-Spark, an AI model running on Cerebras wafer-scale chips rather than traditional Nvidia GPUs. The shift boosts throughput and cuts latency, enabling a real-time coding experience for ChatGPT Pro users. The model generates around 1,000 tokens per second, roughly 15 times faster than previous versions, allowing for swift coding assistance and rapid iteration. Although optimized for low-latency workflows, Codex-Spark retains its predecessor's capacity for long-running tasks and has posted strong results on software engineering benchmarks.

Key improvements include an 80% reduction in client/server roundtrip overhead and a 50% improvement in time-to-first-token. While the model runs on Cerebras accelerators, OpenAI clarifies that GPUs remain part of its pipeline. User feedback is mixed on the trade-off between speed and accuracy, with some questioning the true extent of the speed gains. Codex-Spark supports a 128k context window, and OpenAI plans larger-context models informed by developer feedback.
