OpenAI Launches GPT-5.3-Codex-Spark on Cerebras Chips, Advancing Its GPU Diversification
OpenAI released GPT-5.3-Codex-Spark on February 13, 2026, its first code-completion model to run on Cerebras Systems' chips, a step in its diversification away from NVIDIA hardware. Codex-Spark is a streamlined version of Codex that prioritizes low-latency interaction over the full model's capability. Key metrics: response speed up 15x to more than 1,000 tokens per second; a 128,000-token context window; client-server round-trip overhead down 80%; cost per token down 30%; and time to first token down 50%. The model is available to ChatGPT Pro researchers via the Codex CLI and the VS Code extension.

The speed gains come with trade-offs: Codex-Spark underperforms the full GPT-5.3-Codex on SWE-Bench Pro and Terminal-Bench 2.0. OpenAI plans to extend API access to select enterprise partners and later to broader audiences, guided by workload feedback.

The launch follows a collaboration with Cerebras valued at more than $100 billion in January 2026 and ongoing partnerships with AMD and Broadcom, while NVIDIA's $1 trillion deal with OpenAI has reportedly slowed. Analysts read Codex-Spark as a strategic pivot to reduce single-vendor risk and to boost developer productivity in a fiercely competitive coding-assistant market.
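To put the reported throughput figures in perspective, a quick back-of-the-envelope calculation shows what the 15x speedup implies. This is an illustrative sketch, not OpenAI's own data: it assumes the 15x figure applies to decode throughput and treats "over 1,000 tokens per second" as exactly 1,000 for the estimate.

```python
# Illustrative arithmetic from the reported figures.
# Assumption: the 15x speedup applies to decode throughput,
# and "over 1,000 tokens/s" is rounded down to 1,000.
spark_tps = 1_000                     # reported Codex-Spark throughput, tokens/sec
speedup = 15                          # reported speedup factor
baseline_tps = spark_tps / speedup    # implied prior throughput, ~66.7 tokens/sec

# What that means for a hypothetical 500-token completion:
completion_tokens = 500
spark_seconds = completion_tokens / spark_tps        # 0.5 s
baseline_seconds = completion_tokens / baseline_tps  # 7.5 s

print(f"implied baseline throughput: {baseline_tps:.1f} tokens/sec")
print(f"500-token completion: {baseline_seconds:.1f}s -> {spark_seconds:.1f}s")
```

At these rates, a generation that previously took several seconds lands in well under a second, which is the difference between a completion that interrupts typing and one that keeps pace with it.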