During its annual health event, “The Check Up,” Google announced Med-PaLM 2, its medical large language model, along with new health initiatives and partnerships.
The Med-PaLM 2 team reported that the model scored 85% on medical exam questions (USMLE-style MedQA), a level comparable to an “expert” doctor. This is an 18% improvement over Med-PaLM’s earlier performance and surpasses similar AI models such as GPT-4.
In addition to MedQA, the team tested the model on other benchmarks, including MedMCQA and MMLU clinical topics. Evaluators, a panel of clinicians and non-clinicians from diverse backgrounds and countries, rated the models’ answers against 14 criteria, including scientific factuality, precision, alignment with medical consensus, reasoning, bias, and likelihood of harm.
Google has acknowledged the significant disparities in healthcare services and pledged to collaborate with researchers and healthcare professionals to narrow these gaps and improve healthcare outcomes.
In December 2022, Google Research and DeepMind released the first version of the model, Med-PaLM. It was evaluated using a new open-source medical question-answering benchmark called MultiMedQA.
Remarkably, the system achieved a passing score of over 60% on multiple-choice questions similar to those used in U.S. medical licensing exams, the first time an AI system had accomplished this feat.
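The benchmark scores cited here come down to simple accuracy over keyed multiple-choice items: the fraction of questions where the model selects the official answer, compared against a passing threshold. A minimal sketch of that calculation, using placeholder answer letters rather than real MedQA data:

```python
# Sketch of multiple-choice benchmark scoring (e.g. MedQA-style items).
# The choices and answer key below are illustrative placeholders,
# not actual benchmark content.

def benchmark_accuracy(predictions, answer_key):
    """Fraction of questions where the model picked the keyed option."""
    if len(predictions) != len(answer_key):
        raise ValueError("predictions and answer key must be the same length")
    correct = sum(p == a for p, a in zip(predictions, answer_key))
    return correct / len(answer_key)

# Hypothetical model outputs vs. the official answer letters.
model_choices = ["B", "C", "A", "D", "B", "C", "A", "A", "D", "B"]
answer_key    = ["B", "C", "A", "D", "B", "A", "A", "C", "D", "B"]

score = benchmark_accuracy(model_choices, answer_key)
passing = score >= 0.60  # approximate USMLE-style passing threshold
print(f"accuracy: {score:.0%}")  # 8 of 10 correct -> 80%
```

Reported scores like Med-PaLM’s 67%+ or Med-PaLM 2’s 85% are this kind of accuracy computed over the full question set.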
To create the model, the researchers started from PaLM, a large language model with 540 billion parameters, and its instruction-tuned variant, Flan-PaLM, and evaluated both on MultiMedQA.
In a related development, Google launched the PaLM API just before OpenAI released GPT-4. The API lets businesses and developers build applications on Google’s state-of-the-art (SOTA) large language models, the same family of models used in Search, YouTube, and Gmail. This marks the first time Google has provided access to its underlying models.