The Dallas Morning News
Will artificial intelligence replace your physician?
ChatGPT passed exams but fell short when interacting with patients; still, it has a place in medicine
To practice medicine in the United States, one must pass three medical licensing exams, known as the USMLEs. Medical students spend countless hours preparing for them. For the first exam, Step 1, a typical medical student studies 500 to 600 hours over six to eight weeks, averaging 10 hours a day.
Recently, the generative pretrained transformer ChatGPT, a generative artificial intelligence developed at the OpenAI research laboratory, passed or nearly met the passing threshold for all three medical exams without any specialized training, human input or reinforcement. In the study, “Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models,” published in PLOS Digital Health, the authors found ChatGPT demonstrated a high level of consistency and insight in its explanations for choosing the correct answer.
In another study titled “ChatGPT and antimicrobial advice: The end of the consulting infection doctor?” published in England’s premier medical journal, The Lancet, the authors asked ChatGPT eight hypothetical questions about infectious disease scenarios. They found ChatGPT recommended treatments that were appropriate for the diagnosis, prescribed antibiotics only if medically necessary, and recognized the implications of clinical response by providing management options, disclaimers and sources of advice.
In certain scenarios, however, ChatGPT ignored its own advice and missed patient safety clues, recommending inappropriate management plans. More worrisome, ChatGPT entered “failure modes” during which dangerous advice was repeatedly given despite prompting and redirecting.
While ChatGPT can pass licensing exams, it has limitations when interacting directly with patients.
Regardless, these limitations will not prevent ChatGPT from being further used by both doctors and patients. What started out as “Doctor Google” has been transformed into something more influential. From a single prompt, ChatGPT can examine millions of data points and distill the few hundred words most relevant to the question, integrating information from various sources into a concrete statement. In addition, AI can mitigate many human limitations in decision-making: mainly, decisions driven by emotions such as fear and greed, and by various cognitive biases.
However, as AI models continue to improve, there will be a predisposition toward relying on an automated system, while reducing human cognitive and emotional elements. That is, without human input, AI-generated answers will produce overconfidence in results, a phenomenon known as “automation bias.”
As AI continues to advance human knowledge, there needs to be a continued human presence. The authors of the study in PLOS Digital Health discovered some of their best results when ChatGPT was used to augment the medical education process by helping humans learn.
Specifically, ChatGPT taught best when concepts were non-obvious and outside the learners’ sphere of awareness, which is a significant result when determining how to use AI in medical education and medical practice.
In a recent Wall Street Journal opinion piece, Henry Kissinger, Eric Schmidt and Daniel Huttenlocher discuss the philosophical and practical challenges that AI presents, ultimately positing that “AI, when coupled with human reason, stands to be a more powerful means of discovery than human reason alone.”
This is an oft-repeated sentiment that carries weight as we navigate how to apply AI in the modern world. However, when it comes to medicine and patient care, AI is poor at deciphering how humans differ in their worldviews, emotions and needs, concepts that require a human touch.
Ultimately, ChatGPT has shown that AI can augment human reasoning in medical education and medical practice, but the results of early studies have shown the reverse is also true: human reasoning and emotion, when coupled with AI, can be more powerful than AI alone.