Some new numbers just dropped that should terrify every executive who’s betting their company’s future on AI agents. Carnegie Mellon researchers put these systems through real workplace tasks, and the results are brutal. OpenAI’s flagship GPT-4o? Failed 91% of the time. Amazon’s Nova? A catastrophic 98% failure rate. Even Google’s best-performing agent failed 7 out of 10 basic office tasks. VCs poured $131 billion into AI this year alone, yet the dirty secret is that these systems can’t even handle tasks your intern could complete. Are we witnessing the most expensive tech failure in history, or is there something deeper going on here?
The numbers don’t lie, folks. While Silicon Valley has been screaming about AI agents replacing all of us, Carnegie Mellon just published one of the most comprehensive studies yet on how these systems actually perform on real workplace tasks. The results should be a wake-up call for every business leader who’s been drinking the AI Kool-Aid.