AI dangerously close to solving test that only the brightest minds on Earth could: 'Human expertise still matters'

📆 3/30/2026 10:15 AM

📰 nypost

Tech experts claim they’re a year away from beating “Humanity’s Last Exam” (HLE) — a supposedly unsolvable test for bots as well as most people.

Artificial intelligence is already outperforming humans at various intelligence-based activities ranging from chess to pattern recognition.

Now, experts claim they’re a year away from beating “Humanity’s Last Exam,” a supposedly unsolvable test that only our best and brightest can pass.

“Model builders have really done a great job at improving these reasoning models,” said Calvin Zhang, the research lead at Scale, the AI firm behind HLE.

Developed to see how close AI is to the “frontiers of human expertise,” this intelligence benchmark comprises 2,500 questions spanning over 100 highly specialized fields, ranging from mythology to rocket science. Over 1,000 authorities from across the sciences, humanities and arts contributed to the HLE, which was designed to require PhD-level comprehension to ace, just beyond the expertise of AI.

Zhang said the ultimate goal was to create a “closed-ended academic benchmark, set to the frontier of expert humans, that only a handful of people on Earth can really solve.”

Nonetheless, AI’s performance on the HLE has improved rapidly within a short period of time. While ChatGPT answered fewer than 3% of questions correctly during its first attempt in 2024, its rival Google Gemini got 18.8% of the questions right within months.

Zhang believes that AI could approach full marks within a year; anyone scoring close to 100% is defined as a “universal expert.”
Kate Olszewska, a product manager at Google DeepMind, agrees: “If we truly cared about this as the only thing in life, I think we could get to it pretty quickly.”

This light-speed progress is impressive given the pains Scale took to make the HLE AI-proof. The test-makers reportedly offered a $500,000 prize to experts who could contribute questions that could not be easily answered via web search, eventually drawing over 70,000 responses. Any questions that could be answered by existing models were discarded until the exam was whittled down to 2,500 of the most AI-ironclad queries.

For instance, test-takers might be asked to translate ancient Palmyrene inscriptions or to identify microanatomical structures in birds. To further ensure the test was AI-ironclad, the team kept most of the answers hidden so that later models couldn’t memorize them.

“Humanity’s Last Exam stands as one of the clearest assessments of the gap between AI and human intelligence,” declared Dr. Tung Nguyen, a computer science and engineering professor at Texas A&M who contributed 73 of the questions.

He argued that while some of the aforementioned models performed well, the poor scores of the rest illustrate that the chasms between AI and human intelligence remain “wide.”

“When AI systems start performing extremely well on human benchmarks, it’s tempting to think they’re approaching human‑level understanding,” Nguyen said. “But HLE reminds us that intelligence isn’t just about pattern recognition; it’s about depth, context and specialized expertise.”

The techspert said that the ultimate goal wasn’t to stump AI, but rather to illustrate the systems’ strengths and weaknesses.
In turn, this would help us build “safer, more reliable technologies” while also demonstrating “why human expertise still matters,” an important goal in a world where AI seems to be replacing us in every sector from fast food to …

That being said, AI has displayed a surprisingly humanlike aptitude for problem solving, demonstrating that its processing powers aren’t relegated to rote memory.

From this, researchers deduced that the machine learners “develop human-like conceptual representations of objects.” “Further analysis showed strong alignment between model embeddings and neural activity patterns” in the region of the brain associated with memory and scene recognition.
