Sail Research raises $80 million to cut AI agent computing costs

The startup, founded by former Apple engineers, is building inference infrastructure for AI agents that run for hours, not for chat-style exchanges.

By Maya Lindqvist · Senior Technology Correspondent


Sail Research has come out of stealth with $80 million in seed and Series A funding at a $450 million valuation, Fortune reported. The startup is targeting a costly shift in artificial intelligence: companies are using agents that work for hours, which can consume far more computing capacity than chatbot-style exchanges.

Kleiner Perkins led Sail’s Series A round, with Sequoia, Redpoint, Theory Ventures, Vine Ventures and CRV also participating, Fortune reported.

The company was founded by Neil Movva, a former Apple engineer, and co-founder and CTO Menon, who also worked at Apple, Fortune reported. The two met as freshmen at Stanford, where they took the same classes and had the same academic counselor.

A bet on longer-running AI work

Sail is building software that controls how AI models run on existing chips, Fortune reported. Movva’s argument is that much AI infrastructure was tuned for fast exchanges, while enterprise users are increasingly deploying agents that examine codebases, screen job applicants or conduct research with limited human direction.

Fortune cited outside data showing the cost pressure behind that shift. The publication reported that enterprise AI bills have tripled even as per-token prices have fallen, and that agentic workflows can use 50 to 500 times as many tokens as basic chat. Goldman Sachs has forecast a 24-fold increase in token consumption by 2030, Fortune reported.

Movva told Fortune that Sail is built around efficiency rather than instant response times. He said it is hard to build an inference engine that both maximizes throughput and minimizes latency, and that Sail focuses on throughput while others prioritize latency.

That choice limits where Sail can be used, according to Fortune. The system is not aimed at live chatbots or voice assistants, but at background agents that can run for several hours. Movva told Fortune that customers often see costs three to 10 times lower than with comparable options.

From Apple chips to inference software

Movva has worked across several parts of the AI technology stack, Fortune reported. He watched Nvidia shift from gaming processors toward AI chips in 2016 and 2017, then joined Apple to work on computer-vision chips used in iPhones.

Movva later worked at Together AI, an open-source model inference provider, Fortune reported. His experience there helped shape Sail’s thesis that long-running AI agents need infrastructure designed around different priorities than interactive applications.

Kleiner Perkins partner Aditya Naganath told Fortune he believed the next major AI wave would involve software that acts autonomously across many tasks for long periods. After meeting Movva, Naganath said the need for a dedicated inference platform for those workloads seemed clear to both of them.

Sail launched its inference service in March and now processes trillions of tokens each week, Fortune reported. Detail.dev, an early customer, uses Sail to run code-review agents that spend three to four hours or longer examining full codebases for bugs.

Fortune reported that Sail faces competition from Together AI, which is also backed by Kleiner Perkins, as well as from frontier AI labs such as Anthropic, OpenAI and Google that are developing their own inference systems. Naganath told Fortune he sees Sail and Together as focused on different markets: chat-based work for Together, and long-running agent workloads for Sail.

This story draws on original reporting from Fortune.