AI agents are getting better. But while OpenAI's o3 model leads the field, issues remain, and the best humans still outperform generative AI research tools.
OpenAI’s recent o3 model beat Anthropic’s Claude, Google’s Gemini, and Hangzhou-based DeepSeek in a test of AI agents for web research. But there’s still a considerable gap between human capabilities and the best AI agents.
The highest score achieved was 0.51 on a scale where an estimated “perfect” agent would reach about 0.8, which means that even the best AI agents available now are relatively easily outperformed by humans.
Still, AI agents are rapidly improving. Comparing that result with the year-old GPT-4 Turbo’s score of 0.27, researchers say that “about 45% of the gap between smart generalist researchers and frontier agents” was closed within a year of development.
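The 45% figure can be roughly reproduced from the three scores quoted above. The sketch below assumes the gap is measured from the older model's score (0.27) up to the estimated ceiling (about 0.8); the researchers' exact method is not given here, and the function name `gap_closed` is purely illustrative.

```python
def gap_closed(old_score: float, new_score: float, ceiling: float) -> float:
    """Fraction of the distance from old_score to ceiling covered by new_score.

    Assumption: the "gap" is ceiling minus old_score, which may not match
    the researchers' own definition.
    """
    if ceiling <= old_score:
        raise ValueError("ceiling must be greater than old_score")
    return (new_score - old_score) / (ceiling - old_score)


# Scores as quoted in the article: 0.27 a year earlier, 0.51 now, ~0.8 ceiling.
fraction = gap_closed(old_score=0.27, new_score=0.51, ceiling=0.8)
print(f"{fraction:.0%}")  # prints "45%"
```

Under that assumption, the improvement of 0.24 points out of a remaining 0.53 works out to just over 45%, consistent with the quoted claim.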