ChatGPT-4 outperforms GPT-3.5 and Google Bard in neurosurgery oral board exam

📆 4/19/2023 2:09 AM

📰 NewsMedical



By Neha Mathur, Apr 19 2023
Reviewed by Lily Ramsey, LLM

In a recent study posted to the medRxiv preprint server, researchers in the United States assessed the performance of three general large language models (LLMs), ChatGPT (GPT-3.5), GPT-4, and Google Bard, on higher-order questions representative of the American Board of Neurological Surgery oral board examination. They also examined how the models' performance and accuracy differed as question characteristics varied.

Background

All three LLMs assessed in this study have shown the capability to pass medical board exams with multiple-choice questions. However, no previous studies have tested or compared the performance of multiple LLMs on predominantly higher-order questions from a high-stakes medical subspecialty domain such as neurosurgery.

About the study

In the present study, researchers assessed the performance of GPT-3.5, GPT-4, and Google Bard on a 149-question module imitating the neurosurgery oral board exam.

Study findings

On a 149-question bank of mainly higher-order diagnostic and management multiple-choice questions designed for neurosurgery oral board exams, GPT-4 attained a score of 82.6% and outperformed ChatGPT's score of 62.4%. Additionally, GPT-4 performed markedly better than ChatGPT in the Spine subspecialty.
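As a rough sanity check on the scores reported above, the percentages can be converted back into approximate counts of correct answers out of 149. This is a minimal sketch: the counts it prints are inferred by rounding and are not figures stated in the article, and the helper name `implied_correct` is hypothetical.

```python
# Back-of-the-envelope check of the reported scores (illustrative only).
# The question total and percentages come from the article; the whole-number
# counts are inferred by rounding and are NOT reported figures.
TOTAL_QUESTIONS = 149

reported_scores = {"GPT-4": 82.6, "ChatGPT": 62.4}  # percent correct


def implied_correct(percent: float, total: int) -> int:
    """Nearest whole number of correct answers consistent with a percentage."""
    return round(percent / 100 * total)


for model, percent in reported_scores.items():
    correct = implied_correct(percent, TOTAL_QUESTIONS)
    recomputed = 100 * correct / TOTAL_QUESTIONS
    print(f"{model}: about {correct}/{TOTAL_QUESTIONS} correct ({recomputed:.1f}%)")

# Difference between the two reported scores, in percentage points.
gap_points = reported_scores["GPT-4"] - reported_scores["ChatGPT"]
print(f"Gap: {gap_points:.1f} percentage points")
```

Under these assumptions, the reported percentages correspond to roughly 123 and 93 correct answers, a gap of about 20 percentage points.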

Conclusions

There is an urgent need to build trust in LLM systems, so rigorous validation of their performance on increasingly higher-order and open-ended scenarios should continue. Such validation would help ensure the safe and effective integration of these LLMs into clinical decision-making processes.
