ET 17:21

OpenAI Seeks AI Inference Chip Alternatives to Nvidia, Engages AMD, Cerebras, Groq

OpenAI has quietly evaluated AI inference chips from AMD, Cerebras, and Groq since last year, seeking alternatives to Nvidia amid dissatisfaction with response latency in certain applications, particularly software development and system integration, according to multiple sources cited by Reuters on February 2, 2026. The shift reflects OpenAI’s growing focus on inference—the process of generating real-time responses from trained AI models—as a critical performance bottleneck. While Nvidia dominates AI training hardware, its GPUs rely on external memory, which can increase latency compared with architectures that embed SRAM on the chip, such as those from Cerebras. OpenAI aims to shift roughly 10% of its inference workload to non-Nvidia hardware, though the company confirms that Nvidia still powers the vast majority of its inference clusters. Meanwhile, Nvidia continues investment talks with OpenAI, although the proposed $100 billion deal remains delayed. Nvidia CEO Jensen Huang denies any tensions, asserting that Nvidia’s chips offer superior cost and performance for large-scale inference.

Editor: Tan Wei Jie