ET 16:36

OpenAI Seeks Inference Chip Alternatives to Nvidia, Citing Speed Concerns

OpenAI has grown dissatisfied with Nvidia’s AI chips for inference tasks and has pursued alternatives since 2025, eight sources say, signaling a strategic shift that could challenge Nvidia’s dominance in the AI hardware market. The issue centers on latency in handling specific workloads like coding assistance, where faster response times are critical. OpenAI is exploring chips with dense on-die SRAM—such as those from Cerebras and Groq—to accelerate inference, which demands more memory bandwidth than model training. While Nvidia remains OpenAI’s primary inference provider, the company aims to source about 10% of future inference capacity from alternatives. Talks with Groq stalled after Nvidia signed a $20 billion non-exclusive licensing deal and hired key Groq engineers. Meanwhile, OpenAI struck a commercial agreement with Cerebras. Nvidia maintains that its chips offer superior performance and cost efficiency at scale.

Editor: Tan Wei Jie