Technology

OpenAI unveils Jalapeño chip for AI inference servers

The Broadcom-built custom processor is designed to run ChatGPT-style requests and is expected to be deployed by the end of 2026.

James Whitfield

By James Whitfield · Staff Writer

3 min read

OpenAI unveils Jalapeño chip for AI inference servers
Photo: The Verge

OpenAI announced its first custom AI processor, a chip called Jalapeño that it developed with Broadcom for use in AI servers. The company says the chip is built for inference, the work of running model requests for services such as ChatGPT and Codex.

The announcement puts OpenAI among major technology companies designing their own AI chips as demand for data-center computing keeps rising. The Verge reported that the effort is also meant to reduce OpenAI’s reliance on Nvidia GPUs, which have been in short supply.

A chip for running models

OpenAI describes Jalapeño as an “intelligence processor” for current and future large language models. According to the company, the chip is an ASIC, or application-specific integrated circuit, meaning it was designed for a particular computing task rather than general use.

In this case, the task is AI inference. The Verge explained that inference is the stage where a model handles a user request, such as producing a ChatGPT answer or running an agent like Codex. Training is different: it involves feeding large volumes of data into a model so it can learn patterns used in later responses.

OpenAI said Jalapeño is the first part of what it calls a multi-generation compute platform. The company expects to put the chip into use by the end of 2026.

Broadcom partnership

The chip was built with Broadcom, following OpenAI’s disclosure nine months ago that the two companies would work together on custom AI silicon, according to The Verge. OpenAI did not give final performance figures in its announcement.

Broadcom CEO Hock Tan told Reuters that Jalapeño matches the performance of Nvidia’s Blackwell chips and Google’s Tensor processing units. OpenAI said it is still measuring final results, but early testing indicates Jalapeño will deliver much better performance per watt than current leading systems.

Performance per watt is a key measure for AI infrastructure because model serving requires large amounts of electricity across data centers. OpenAI’s claim remains preliminary, based on the company’s own statement that testing is not complete.

Part of a broader chip push

OpenAI is not alone in trying to design more of its AI hardware stack. The Verge reported that Microsoft, Meta and Amazon have also introduced custom AI chips for server workloads, including training and inference.

Those efforts are part of a wider race to control the computing hardware behind AI services. The Verge reported that these custom chips still trail Nvidia’s products in overall performance, even as companies seek alternatives for their own data centers.

For OpenAI, Jalapeño marks a move from relying on outside accelerators toward hardware tailored to its own models and products. The company says the processor is intended to support both today’s large language models and future generations.

This story draws on original reporting from The Verge.