ChatGPT's New AI Model Solves Complex Problems

📆 9/13/2024 4:49 AM

AI, Chatgpt, Openai

📰ForbesTech

OpenAI unveiled its new o1 AI model, designed to solve complex problems in science, coding, and math. Unlike previous models, o1 can reason through tasks, recognize mistakes, and learn from them, making it a formidable opponent even for Britain's toughest TV quiz, Only Connect.

Only Connect is widely regarded as Britain’s toughest TV quiz, but it’s no match for ChatGPT’s new problem-solving AI model. OpenAI yesterday unveiled a preview of its new o1 AI model, which the company claims is designed to “reason through complex tasks and solve harder problems than previous models in science, coding, and math”.

Whereas the company’s previous AI models have often stumbled over quite basic questions, such as how many times the letter R appears in the word Strawberry, the new model is designed to respond more like a person would. “Through training, they learn to refine their thinking process, try different strategies, and recognize their mistakes,” OpenAI claims in a blog post announcing the [...] Only Connect. For those not familiar with the show, it’s no ordinary quiz. Contestants have to find complex connections between words or phrases, complete sequences, or solve phrases where all the vowels have been removed from the answers. The quiz is unapologetically pitched at intellectuals and is something of a cult hit in Britain, where it’s broadcast on the BBC.

The first round of the quiz is Connections. Contestants are given four clues and must find the connection between them. In the quiz, the clues are revealed one by one, and contestants score more points for getting the answer right with fewer clues. For this test, I gave ChatGPT all four clues at once.

After nine seconds of thinking, it correctly worked out that all of these words have no positive opposite in the English language. It was even faster at working out that the connection between manufacturing gunpowder, Roman mouthwash, thickening wool and marking territory was that they were all processes that historically involved the use of urine.

Got it? They all involve Tom, Dick and Harry.

I asked ChatGPT o1-preview five Connections questions and it scored a perfect [...]

Sequences is similar to Connections, in that all the words are linked by a common theme.
However, in this round, contestants must work out what the fourth item in the sequence will be, without knowing what the theme is.

ChatGPT took 12 seconds to think before correctly identifying they were the historical periods used for the first three series of the British TV comedy, [...]

It took 11 seconds to think before correctly working out these were iconic moments in successive Olympic opening ceremonies, and that it should have been 2,008 Fou drummers for the 2008 Olympics in Beijing, not 2,088. It added the fourth answer as Queen Elizabeth II parachuting with James Bond for London 2012. Pretty amazing, given the question was partly incorrect!

The AI worked out the sequence was number-related. The theme is how numbers are spoken in French. ChatGPT seemed to figure this out, but then suggested the next word in the sequence should be “hive”, which is an English homophone for “five”, not, say, “sank”, which would have been a correct homophone for “cinq”. In this round, then, it scored [...]

The next round is arguably the hardest for the AI to solve. The contestants are given 16 words on a 4x4 grid and asked to separate the words into four groups of four. Each group has a common theme. To make life even harder, red herrings are thrown into the mix: some words could be part of two groups. Here, for example, are the 16 words from one of the Connecting Walls I asked ChatGPT to solve: 1. Priest 2. Lawford 3. Knight 4. Pope 5. Kremlin 6. Sinatra 7. Martin 8. Bishop 9. Child 10. Deacon 11. Grand Slam 12. Hopman 13. Sister 14. Davis 15. Canon 16. Neighbour

These puzzles really caused the AI to ponder, and while it was processing, it partly revealed its “thought process”. On the above, for example, it said “mapping job titles” as it began to work through potential links. After 88 seconds it delivered the correct answer. Group 1 is Rat Pack members.
That group could also have included Bishop, for Joey Bishop, but ChatGPT realized that should be in group 2, clergy members. Group 3 was tennis competitions, and the words in group 4 could all be completed with “hood”. Canon and Sister could also have been in the clergy, of course.

That’s a pretty stiff challenge, but the AI pulled it off, as it did for another Connecting Wall I threw at it. It came close to making it three out of three, but got two groups wrong, falling for a red herring and failing to notice a series of words that could be followed with the word “stop”. Even so, it would still score some points in the quiz for getting two groups and three connections correct. Overall, it would have scored [...]

The final round of the quiz is Missing Vowels, where contestants are given a theme and must identify the clues, which have had the vowels stripped from them. To make it harder, the words are inconsistently spaced. So, for example, if the answer was: [...]

I thought this would be the easiest round for the AI to crack, but on its first attempt it got all four answers wrong. And then I realized... I had accidentally reverted to the older GPT-4o model and not the o1-preview. When given the same set of words on the new model, it scored a perfect four out of four.

I’d love to have tested ChatGPT on more Missing Vowels rounds, but by this time I’d burned through all my preview credits, meaning I’ll have to wait until next week to use the model again. So, in a brief test, it scored [...]

The ability of the AI to solve even fairly complex word problems is genuinely staggering. Equally impressive is the way the AI shows its thinking as it’s working through problems, ruling out some theories and going back to others, until it finds the correct answer. It’s not flawless, but it’s a huge step forward in sophistication from the previous model. And plenty smart enough to be a winning contestant on Britain’s toughest TV quiz.
