
OpenAI unveils first custom AI inference chip, Jalapeño, with Broadcom — and its development was sped-up with OpenAI's own models | VentureBeat


Credit: VentureBeat made with OpenAI ChatGPT-Images-2.0


OpenAI and Broadcom this morning unveiled their first custom AI accelerator chip, named "Jalapeño," positioning it as a purpose-built processor for large language model (LLM) inference rather than a more general-purpose GPU of the kind offered by Nvidia or AMD.

According to its creators, Jalapeño is designed to support the workloads behind ChatGPT, Codex, the API and future agentic products. Notably, though, Broadcom's news release positions it as a product that could be available to external AI firms as well: "built from the ground up for current and future LLMs across the industry." [Emphasis mine.]

Jalapeño's engineering timeline set a blistering pace for the semiconductor industry, moving from early schematics to fabrication readiness within nine months, when new processor development cycles are typically measured in years. Indeed, the OpenAI and Broadcom partnership itself was only publicly announced in October 2025.

The companies attributed this speed to a deep software-hardware co-development process that actively used OpenAI’s own models to accelerate parts of the chip design.

After receiving an early physical model on Wednesday, OpenAI outlined plans to begin rolling out these processors across active data centers by the end of this year. OpenAI says it has already begun testing at least one of its prior-generation models, GPT‑5.3‑Codex‑Spark, on the chips under a production workload, though in a test environment.

The release marks a major strategic expansion for the ChatGPT creator as it attempts to build the full computational stack required to make advanced AI faster, more reliable, and more accessible.

There remain, of course, many outstanding questions — including how the new Jalapeño chip performs compared to direct competitors, its costs, and its manufacturing viability.

To understand why OpenAI is moving into chip design, it helps to look at the architecture. Jalapeño is an Application-Specific Integrated Circuit, or ASIC.

Unlike a GPU, which can handle many types of workloads, an ASIC is tuned for narrower uses, as industry experts note. That narrower focus can make it cheaper and more efficient for specific AI tasks, though less adaptable than Nvidia-style GPUs.

In Jalapeño’s case, OpenAI is starting from a clean design focused on modern LLM serving, instead of adapting a broader accelerator to fit its needs. The company says the architecture is shaped by its experience running large-scale AI products and is meant to reduce unnecessary data movement while better matching compute, memory and networking resources.

Broadcom is contributing core silicon implementation and networking technology, including Tomahawk networking silicon, while Celestica is helping with board, rack and system integration. The goal is to move the chip closer to its practical performance ceiling in real workloads, not just improve theoretical benchmarks.

However, OpenAI's pivot into proprietary hardware is not just a quest for technical supremacy: it may also make its core unit economics far more sustainable.

Audited financial documents posted recently by AI critic and AI public relations specialist Ed Zitron revealed that while OpenAI generated an impressive $13.07 billion in revenue throughout 2025, its total operational expenses for the year ballooned to $34 billion, resulting in an operating loss of nearly $20.92 billion.

The primary culprit behind this cash hemorrhage was raw compute, though more of it is likely attributable to training than to inference.

In 2025 alone, research and development costs (driven largely by the infrastructure required to train and serve massive language models) accounted for $19.18 billion, or approximately 56 percent of the company's entire spending footprint. Furthermore, OpenAI reportedly paid Microsoft over $10.59 billion just for R&D and compute infrastructure last year.
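The reported 2025 figures can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable and function names are hypothetical, and it assumes the article's rounded values ($13.07 billion revenue, $34 billion total expenses, $19.18 billion R&D). Because the expense total is rounded, the computed loss can differ by about $0.01 billion from the "nearly $20.92 billion" quoted above.

```python
# Figures in billions of US dollars, taken from the article's 2025 numbers.
# The names below are illustrative, not from any financial filing.
revenue = 13.07
total_expenses = 34.0   # the article gives a rounded "$34 billion"
rd_costs = 19.18


def operating_loss(revenue_bn: float, expenses_bn: float) -> float:
    """Operating loss is total operating expenses minus revenue."""
    return expenses_bn - revenue_bn


def share_of_spending(part_bn: float, total_bn: float) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part_bn / total_bn


loss = operating_loss(revenue, total_expenses)
rd_share = share_of_spending(rd_costs, total_expenses)

print(f"Operating loss: ${loss:.2f}B")
print(f"R&D share of total spending: {rd_share:.1f}%")
```

Under these assumptions, R&D comes out at roughly 56 percent of total spending, in line with the share cited above.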

Still, as OpenAI lays the groundwork for a heavily anticipated public offering in 2026, the Jalapeño inference chip may offer some reassurance to private investors and public markets that OpenAI has a plan for digging itself out of the financial hole and moving toward profitability. If it can drive down the cost of AI inference, then maybe it can recoup some of the money lost on costly training runs.

"By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access," said Greg Brockman, OpenAI's president and co-founder, in a statement included in Broadcom's release.

What Does This Mean for Nvidia and All of OpenAI's Other Chip Providers?

The introduction of Jalapeño immediately raises questions about OpenAI's strategic positioning within the fiercely competitive semiconductor and GPU market.

Since kicking off the generative AI boom in late 2022, OpenAI has remained one of the largest customers of GPU market leader Nvidia's premium products, but it has also taken billions in investment dollars from the firm (engendering accusations of "circular dealing") and expanded to work with rival chipmakers to feed its appetite for compute.

Nvidia: In February 2026, Nvidia finalized a $30 billion direct investment into OpenAI as part of a massive $110 billion funding round. This deal secured an agreement to deploy 10 gigawatts of computing systems, including 3 gigawatts of dedicated inference capacity and 2 gigawatts of training capacity, utilizing Nvidia's next-generation Vera Rubin platform.


Amazon Web Services (AWS): As part of the same February 2026 funding round, Amazon invested $50 billion into OpenAI. This deal included a commitment for OpenAI to consume approximately two gigawatts of AWS's proprietary Trainium computing capacity over the next eight years.


Advanced Micro Devices (AMD): OpenAI signed agreements with Nvidia's chief hardware rival, AMD, to use AMD Instinct™ MI450 Series GPUs.


Cerebras: The company also struck a pact with Cerebras, an AI chipmaker that executed its initial public offering in May 2026.


This sprawling web of vendor agreements highlights the sheer scale of OpenAI's infrastructural ambitions. The ultimate goal of the OpenAI and Broadcom partnership involves deploying gigawatt-scale data centers with Microsoft and other partners beginning in 2026; that is, data centers whose compute requires energy on the order of what cities consume.

For Broadcom, the partnership acts as a massive reputational catalyst. The company has been among the biggest beneficiaries of the generative AI boom, helping hyperscalers and frontier labs engineer custom silicon.

Broadcom shares reflect this momentum, demonstrating an 18% year-over-year increase in the first part of 2026 and a nearly 7X boost since the end of 2022, according to CNBC.

Ultimately, Jalapeño confirms that OpenAI believes it is ready to move beyond software and code into the realm of real-world, custom hardware.

By controlling the physics of its inference pipeline, while simultaneously leveraging the capital and hardware of Nvidia, Amazon, AMD, and Cerebras, OpenAI is attempting to rapidly rewrite its future unit economics.


