Salesforce researcher says business AI needs reliability over AGI
Silvio Savarese argues companies should judge AI agents by consistency and business usefulness, as models still stumble on basic common-sense problems.
By Sofia Marchetti · World Affairs Correspondent
AI agents are advancing fast, but Salesforce’s chief AI scientist says their weak grasp of common sense remains a problem for companies that need dependable software. Silvio Savarese told Fortune that businesses should focus less on artificial general intelligence and more on whether AI agents can perform work accurately and consistently.
Savarese, chief scientist at Salesforce AI Research, discussed what former OpenAI researcher and Tesla AI executive Andrej Karpathy has called “jagged intelligence,” according to Fortune. The term describes a gap in current AI systems: models can do difficult tasks, yet still fail at problems that appear straightforward to people.
Fortune cited a familiar river-crossing riddle used in Salesforce research to show the issue. In the version Savarese described, a man must move a fox, a chicken and a sack of corn across a river, and his boat can carry him plus all three items. The minimum solution is one trip, because the stated boat capacity allows everything to travel together.
Salesforce researchers found that a ChatGPT model released last year did not give that answer, according to Savarese’s account to Fortune. Instead, the model offered a longer, traditional solution involving repeated crossings, even though the wording of the riddle made those extra steps unnecessary.
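The riddle's logic can be checked mechanically. The sketch below is an illustration only, not the method used in the Salesforce research: it assumes the classic rules (the fox cannot be left alone with the chicken, nor the chicken with the corn) and uses a breadth-first search to count the fewest one-way crossings for a given boat capacity. The names `min_crossings`, `is_safe` and `UNSAFE` are hypothetical.

```python
from collections import deque
from itertools import combinations

ITEMS = ("fox", "chicken", "corn")
# Assumed classic rules: these pairs cannot be left together without the man.
UNSAFE = [{"fox", "chicken"}, {"chicken", "corn"}]


def is_safe(bank):
    """A bank without the man is safe if no unsafe pair is left on it."""
    return not any(pair <= bank for pair in UNSAFE)


def min_crossings(capacity):
    """Fewest one-way crossings, found by breadth-first search.

    capacity is how many items the boat holds in addition to the man.
    A state is (items on the starting bank, man is on the starting bank).
    """
    everything = frozenset(ITEMS)
    start = (everything, True)
    goal = (frozenset(), False)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (left, man_on_left), trips = queue.popleft()
        if (left, man_on_left) == goal:
            return trips
        here = left if man_on_left else everything - left
        for k in range(capacity + 1):
            for cargo in combinations(sorted(here), k):
                cargo = frozenset(cargo)
                new_left = left - cargo if man_on_left else left | cargo
                # The bank the man has just left is the unattended one.
                unattended = new_left if man_on_left else everything - new_left
                if not is_safe(unattended):
                    continue
                state = (new_left, not man_on_left)
                if state not in seen:
                    seen.add(state)
                    queue.append((state, trips + 1))
    return None  # no safe sequence of crossings exists


print(min_crossings(3))  # 1: the boat holds everything, as in the Fortune example
print(min_crossings(1))  # 7: the traditional one-item-at-a-time version
```

The only difference between the two calls is the stated capacity, which is the detail the model reportedly overlooked.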
Savarese told Fortune that AI agents need four core abilities: memory, reasoning, interaction with the real world and communication through channels such as text or voice. He said large language models are becoming more capable across many tasks and research uses, but still struggle with reasoning and common sense.
A business-focused benchmark
For companies, Savarese argued, the question should not center on whether an AI system qualifies as artificial general intelligence. He told Fortune that AGI is difficult to define because new tasks can keep shifting the target.
Savarese proposed a different measure: Enterprise General Intelligence, or EGI. Fortune reported that Salesforce defines EGI as AI built for business use: capable, reliable over repeated use and able to work with existing systems, including in complex settings.
Under Savarese’s framework, Fortune reported, AI agents should be evaluated on two axes: whether they can solve complex business problems and whether they can do so consistently. He said the goal for enterprise AI is not proving math theorems or answering science questions, but addressing critical business tasks.
A sales assistant agent, for example, would need to remember earlier actions, factor in prior conversations and outcomes, and produce accurate responses that users can trust, Savarese told Fortune. He said reaching both capability and consistency would amount to achieving EGI.
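One way to picture the two axes is to run an agent on the same tasks several times and score capability and consistency separately. The sketch below is a hypothetical illustration, not Salesforce's benchmark: `score_agent`, the exact-match check and both definitions (solved at least once versus solved on every run) are assumptions made for the example.

```python
from typing import Callable, Dict


def score_agent(agent: Callable[[str], str],
                tasks: Dict[str, str],
                runs: int = 5) -> Dict[str, float]:
    """Run each task several times and summarize two assumed axes.

    capability:  share of tasks answered correctly at least once.
    consistency: share of tasks answered correctly on every run.
    """
    if not tasks:
        raise ValueError("tasks must not be empty")
    solved_once = 0
    solved_always = 0
    for prompt, expected in tasks.items():
        results = [agent(prompt) == expected for _ in range(runs)]
        solved_once += any(results)
        solved_always += all(results)
    total = len(tasks)
    return {"capability": solved_once / total,
            "consistency": solved_always / total}


def make_flaky_agent() -> Callable[[str], str]:
    """Toy stand-in for an agent: reliable on one task, flaky on the other."""
    calls = {"n": 0}

    def agent(prompt: str) -> str:
        calls["n"] += 1
        if prompt == "2+2":
            return "4"
        # Correct only on even-numbered calls.
        return "Paris" if calls["n"] % 2 == 0 else "unsure"

    return agent


tasks = {"2+2": "4", "capital of France": "Paris"}
print(score_agent(make_flaky_agent(), tasks, runs=4))
# {'capability': 1.0, 'consistency': 0.5}
```

In this toy run the agent can solve both tasks but solves only one of them every time, which is the kind of gap between capability and consistency that Savarese described.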
Controls while agents improve
Savarese also cautioned that AI agents remain unfinished technology, according to Fortune. Salesforce customers using the company’s Agentforce platform can access trust and security filters designed to prevent agents from taking certain actions or performing certain tasks.
Salesforce is also researching how to give AI models stronger common-sense abilities at the model level, Savarese told Fortune. The effort reflects a practical concern for businesses: an agent that chooses a needlessly complicated plan can create errors, delays or user distrust, even when it appears powerful on harder benchmarks.
The comments came as companies push AI agents into more workplace uses. Fortune reported in the same newsletter that Anthropic introduced Claude Opus 4 and Claude Sonnet 4, OpenAI announced Stargate UAE, JPMorgan agreed to provide more than $7 billion in financing for companies building an OpenAI data center campus in Abilene, Texas, and Meta introduced a Llama Startup Program for early-stage U.S. startups.
This story draws on original reporting from Fortune.