AI Chatbots Miss More Than Half of Medical Diagnoses, Study Finds

News Room
3 Min Read

Although chatbots and large language models can answer a slew of everyday questions, they shouldn't be the first place you turn for medical advice, a new study published in the journal Nature Medicine suggests.

During the study, 1,298 participants in the UK were asked to use a large language model, such as ChatGPT or Meta's Llama 3, for medical advice. When used in this way, the LLMs correctly identified medical conditions in less than 34.5% of cases.

How LLMs performed in the study


The study acknowledged that LLMs now achieve scores on medical knowledge benchmarks comparable to passing the US Medical Licensing Exam, and that clinical documents from LLMs “are rated as equivalent to or better than those written by doctors.”

However, participants could not reproduce those results when they queried the LLMs themselves. The study found this was largely because users often didn't provide enough information: in 16 of 30 sampled interactions, the initial messages contained only partial information.

"In two cases, LLMs provided initially correct responses but added new and incorrect responses after the users added additional details," the study said, suggesting that continuing the conversation with a chatbot did not improve the odds of receiving a correct medical diagnosis.

After the initial diagnosis, the LLMs recommended the correct follow-up steps just 44.2% of the time.


Meta’s Llama 3 was one of the large language models used in the study.

SOPA Images/Getty Images

How often are people using chatbots for medical advice?

According to a survey by OpenAI, the maker of ChatGPT, 3 in 5 US adults report using AI for health. "They are using AI to get information when they first feel unwell, consulting it to prepare for their visits with their clinicians, and using it to better comprehend patient instructions and recommendations," OpenAI stated.


And although there's a small disclaimer on ChatGPT's website that reads, "ChatGPT can make mistakes. Check important info," many people take the chatbot's word as fact.

The study serves as a reminder that ChatGPT and similar chatbots should not be relied upon for medical guidance, particularly in serious situations.
