WHO’s new AI-powered chatbot is giving wrong medical answers
The World Health Organization is wading into the world of AI to provide basic health information through a human-like avatar. But while the bot responds sympathetically to users’ facial expressions, it doesn’t always know what it’s talking about.
SARAH, short for Smart AI Resource Assistant for Health, is a virtual health worker that’s available to talk 24/7 in eight different languages to explain topics like mental health, tobacco use and healthy eating. It’s part of the WHO’s campaign to find technology that can both educate people and fill staffing gaps with the world facing a health-care worker shortage.
WHO warns on its website that this early prototype, introduced on 2 April, provides responses that “may not always be accurate.” Some of SARAH’s AI training is years behind the latest data. And the bot occasionally provides bizarre answers, known as hallucinations in AI models, that can spread misinformation about public health.
SARAH doesn’t have a diagnostic feature like WebMD or Google. In fact, the bot is programmed not to talk about anything outside of the WHO’s purview, including questions on specific drugs. So SARAH sends people to a WHO website or says that users should “consult with your health-care provider.”
“It lacks depth,” said Ramin Javan, a radiologist and researcher at George Washington University. “But I think it’s because they just don’t want to overstep their boundaries, and this is just the first step.”
WHO says SARAH is meant to work in partnership with researchers and governments to provide accurate public health information. The agency is asking them for advice on how to improve the bot and use it in emergency health situations.
But it emphasizes its AI assistant is still a work in progress.
“These technologies are not at the point where they are substitutes for interacting with a professional or getting medical advice from an actual trained clinician or health provider,” Alain Labrique, the director of digital health and innovation at WHO, said.
SARAH was trained on OpenAI’s GPT-3.5, which used data through September 2021, so the bot doesn’t have up-to-date information on medical advisories or news events. When asked whether the US Food & Drug Administration has approved the Alzheimer’s drug Lecanemab, for example, SARAH said the drug is still in clinical trials, when in fact it was approved for early disease treatment in January 2023. Even the WHO’s own data can trip SARAH up. When asked whether hepatitis deaths are increasing, it could not immediately provide details from a recent WHO report until prompted a second time to check the agency’s website for updated statistics. The agency said it’s checking on whether there’s a lag in updates.
And sometimes the AI bot draws a blank. None of this is unusual in the early days of AI development. In a study last year looking at how ChatGPT responded to 284 medical questions, researchers at Vanderbilt University Medical Center found that while it provided correct answers most of the time, there were multiple instances where the chatbot was “surprisingly wrong.”