Carnegie Mellon Research Insights • The Register

Gartner estimates that more than 40% of agentic AI projects will be canceled by 2027 because of high costs, unclear business value, or inadequate risk controls. The rest may continue even though AI agents complete multi-step tasks only about 30-35% of the time. Many AI vendors also misrepresent their offerings, a practice known as “agent washing”: of the thousands of vendors claiming agentic products, only about 130 offer genuinely agentic capabilities. Researchers at Carnegie Mellon University developed a benchmark, TheAgentCompany, which found that even the best-performing AI agents completed only around 30% of tasks, with failures in both communication and task execution. Similarly, Salesforce’s benchmark of CRM tasks showed success rates falling from 58% in single-turn interactions to 35% in multi-turn interactions, underscoring the limits of current models. Gartner nonetheless predicts that by 2028 AI agents could autonomously handle 15% of daily work decisions, which suggests room for useful applications to grow despite current shortcomings.

