Questions over tests of AI math abilities suggest we may never know how capable intelligent machines computers can become.
Sign up for our email newsletter for the latest science newsBack in 2019, a group of computer scientists performed a now-famous experiment with far-reaching consequences for artificial intelligence research. At the time, machine vision algorithms were becoming capable of recognizing a wide range of objects with some recording spectacular results in the standard tests used to assess their abilities.
All that changed when a team from the University of California, Berkeley, created a new dataset of carefully labelled images that they knew the algorithms could not have seen. They then asked the algorithms to identify the objects in the images and found they weren’t as good as everyone had claimed. GSM8K is a “dataset of 8.5K high quality linguistically diverse grade school math word problems created by human problem writers.” It consists of some 7500 questions for training an AI system and 1000 questions to test the system.
Following the lead by the Berkeley researchers, the Scale AI team decided to test this idea by developing their own mathematics test of 1250 questions. They call this GSM1k and have carefully ensured that it closely resembles the GSM8K test but has never been published.
United States Latest News, United States Headlines
Similar News:You can also read news stories similar to this one that we have collected from other news sources.
Using Autodiff to Estimate Posterior Moments, Marginals and Samples: Experimental Datasets and ModelImportance weighting allows us to reweight samples drawn from a proposal in order to compute expectations of a different distribution.
Read more »
Performance Analysis of Diverse Hateful Meme Detection DatasetsDelve into the evaluation and analysis of a probing-based approach for detecting hateful memes.
Read more »
Preparing Complex Datasets for Amazon's Recommender System StudyLearn about data engineering strategies and efficient computation techniques for large-scale data processing.
Read more »
Maserati Recalls Ghibli And Quattroporte For Leaky Fuel SensorsThe fuel line sensors in 58 Maserati Ghiblis and Quattroportes may leak, leading to engine issues and potential fires
Read more »
Sealing homes’ leaky HVAC systems is a sneaky good climate solutionThe problem, however, is that energy efficiency pays back over time, but it comes with high upfront costs.
Read more »
Woodford Reserve tried to undermine unionization effort at its Kentucky distillery, judge rulesA federal judge has ruled that Woodford Reserve undermined unionization efforts at its Kentucky distillery by awarding pay raises, relaxing its vacation policy and handing out bottles of whiskey to workers before a vote on whether to unionize.
Read more »