USA TODAY US Edition

ChatGPT is poised to change medical care

AI technology can empower patients, but glitches make some wary

- Karen Weintraub

It’s almost hard to remember a time before people could turn to “Dr. Google” for medical advice. Some of the information was wrong. Much of it was terrifying. But it helped empower patients who could, for the first time, research their own symptoms and learn more about their conditions.

Now, ChatGPT and similar language processing tools promise to upend medical care again, providing patients with more data than a simple online search and explaining conditions and treatments in language nonexperts can understand.

For clinicians, these chatbots might provide a brainstorming tool, guard against mistakes and relieve some of the burden of filling out paperwork, which could alleviate burnout and allow more face time with patients.

But – and it’s a big “but” – the information these digital assistants provide might be more inaccurate and misleading than basic internet searches.

“I see no potential for it in medicine,” said Emily Bender, a linguistics professor at the University of Washington. By their very design, these large language technologies are inappropriate sources of medical information, she said.

Others argue that large language models could supplement, though not replace, primary care.

“A human in the loop is still very much needed,” said Katie Link, a machine learning engineer at Hugging Face, a company that develops collaborative machine learning tools.

Link, who specializes in health care and biomedicine, thinks chatbots will be useful in medicine someday, but the technology isn’t ready yet.

And whether this technology should be available to patients, as well as doctors and researchers, and how much it should be regulated remain open questions.

Regardless of the debate, there’s little doubt such technologies are coming – and fast. ChatGPT launched its research preview on a Monday in December. By that Wednesday, it reportedly already had 1 million users. In February, both Microsoft and Google announced plans to include AI programs similar to ChatGPT in their search engines.

“The idea that we would tell patients they shouldn’t use these tools seems implausible. They’re going to use these tools,” said Dr. Ateev Mehrotra, a professor of health care policy at Harvard Medical School and a hospitalist at Beth Israel Deaconess Medical Center in Boston.

“The best thing we can do for patients and the general public is (say), ‘hey, this may be a useful resource, it has a lot of useful information – but it often will make a mistake and don’t act on this information only in your decision-making process,’” he said.

How ChatGPT works

ChatGPT – the GPT stands for Generative Pre-trained Transformer – is an artificial intelligence platform from San Francisco-based startup OpenAI. The free online tool, trained on millions of pages of data from across the internet, generates responses to questions in a conversational tone.

Other chatbots take similar approaches, with updates coming all the time.

These text synthesis machines might be relatively safe to use for novice writers looking to get past initial writer’s block, but they aren’t appropriate for medical information, Bender said.

“It isn’t a machine that knows things,” she said. “All it knows is the information about the distribution of words.”

Given a series of words, the models predict which words are likely to come next.

So, if someone asks “what’s the best treatment for diabetes?” the technology might respond with the name of the diabetes drug “metformin” – not because it’s necessarily the best but because it’s a word that often appears alongside “diabetes treatment.”
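
Bender’s point about word distributions can be made concrete with a toy model. The sketch below is a hypothetical illustration, not how ChatGPT is actually built (real systems use neural networks trained on vastly more text and longer contexts): it simply counts which word follows which in a tiny made-up corpus, then “predicts” the most frequent successor.

```python
from collections import Counter, defaultdict

# A tiny, made-up corpus standing in for millions of pages of training text.
corpus = (
    "the best diabetes treatment is metformin . "
    "a common diabetes treatment is metformin . "
    "one diabetes treatment is diet and exercise ."
).split()

# For each word, count which words follow it and how often (a bigram model).
successors = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    successors[current][following] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return successors[word].most_common(1)[0][0]

print(predict_next("is"))  # → metformin
```

The toy model outputs “metformin” only because that word follows “is” more often than “diet” does in its training text – a frequency count, not a medical judgment.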

Such a calculation is not the same as a reasoned response, Bender said, and her concern is that people will take this “output as if it were information and make decisions based on that.”

Bender also worries about the racism and other biases that may be embedded in the data these programs are based on. “Language models are very sensitive to this kind of pattern and very good at reproducing them,” she said.

The way the models work also means they can’t reveal their scientific sources – because they don’t have any.

Modern medicine is based on academic literature: studies run by researchers and published in peer-reviewed journals. Some chatbots are being trained on that body of literature. But others, like ChatGPT and public search engines, rely on large swaths of the internet, potentially including flagrantly wrong information and medical scams.

With today’s search engines, users can decide whether to read or consider information based on its source: a random blog or the prestigious New England Journal of Medicine, for instance.

But with chatbot search engines, where there is no identifiable source, readers won’t have any clues about whether the advice is legitimate.

“Understanding where is the underlying information coming from is going to be really useful,” Mehrotra said. “If you do have that, you’re going to feel more confident.”

Potential for doctors and patients

Mehrotra recently conducted an informal study that boosted his faith in these large language models.

He and his colleagues tested ChatGPT on a number of hypothetical vignettes – the type he’s likely to ask first-year medical residents. It provided the correct diagnosis and appropriate triage recommendations about as well as doctors did and far better than the online symptom checkers that the team tested in previous research.

“If you gave me those answers, I’d give you a good grade in terms of your knowledge and how thoughtful you were,” Mehrotra said.

But it also changed its answers somewhat depending on how the researchers worded the question, said co-author Ruth Hailu. It might list potential diagnoses in a different order, or the tone of the response might change, she said.

Mehrotra, who recently saw a patient with a confusing spectrum of symptoms, said he could envision asking ChatGPT or a similar tool for possible diagnoses.

“Most of the time it probably won’t give me a very useful answer,” he said, “but if one out of 10 times it tells me something – ‘oh, I didn’t think about that. That’s a really intriguing idea!’ Then maybe it can make me a better doctor.”

It also has the potential to help patients. Hailu, a researcher who plans to go to medical school, said she found ChatGPT’s answers clear and useful, even to someone without a medical degree.

“I think it’s helpful if you might be confused about something your doctor said or want more information,” she said.

ChatGPT might offer a less intimidating alternative to asking the “dumb” questions of a medical practitioner, Mehrotra said.

Dr. Robert Pearl, former CEO of Kaiser Permanente, a 10,000-physician health care organization, is excited about the potential for both doctors and patients.

“I am certain that five to 10 years from now, every physician will be using this technology,” he said. If doctors use chatbots to empower their patients, “we can improve the health of this nation.”

Learning from experience

The models chatbots are based on will continue to improve over time as they incorporate human feedback and “learn,” Pearl said.

Just as he wouldn’t trust a newly minted intern on their first day in the hospital to take care of him, he said, programs like ChatGPT aren’t yet ready to deliver medical advice. But as the algorithm processes information again and again, it will continue to improve.

Plus the sheer volume of medical knowledge is better suited to technology than the human brain, said Pearl, noting that medical knowledge doubles every 72 days. “Whatever you know now is only half of what is known two to three months from now.”

But keeping a chatbot on top of that changing information will be staggeringly expensive and energy intensive.

The training of GPT-3, which formed some of the basis for ChatGPT, consumed 1,287 megawatt hours of energy and led to emissions of more than 550 tons of carbon dioxide equivalent, roughly as much as three roundtrip flights between New York and San Francisco. According to EpochAI, a team of AI researchers, the cost of training an artificial intelligence model on increasingly large datasets will climb to about $500 million by 2030.

OpenAI has announced a paid version of ChatGPT. For $20 a month, subscribers will get access to the program even during peak use times, faster responses, and priority access to new features and improvements.

The current version of ChatGPT relies on data only through September 2021. Imagine if the COVID-19 pandemic had started before the cutoff date and how quickly the information would be out of date, said Dr. Isaac Kohane, chair of the department of biomedical informatics at Harvard Medical School and an expert in rare pediatric diseases at Boston Children’s Hospital.

Kohane believes the best doctors will always have an edge over chatbots because they will stay on top of the latest findings and draw on experience.

But maybe it will bring up weaker practitioners. “We have no idea how bad the bottom 50% of medicine is,” he said.

Dr. John Halamka, president of Mayo Clinic Platform, which offers products and data for artificial intelligen­ce programs, said he sees potential for chatbots to help providers with tasks such as drafting letters to insurance companies.

The technology won’t replace doctors, he said, but “doctors who use AI will probably replace doctors who don’t use AI.”

What ChatGPT means for research

As it currently stands, ChatGPT is not a good source of scientific information. Just ask pharmaceutical executive Wenda Gao, who used it recently to search for information about a gene involved in the immune system.

Gao asked for references to studies about the gene and ChatGPT offered three “very plausible” citations. But when Gao went to check those research papers for more details, he couldn’t find them.

He went back to ChatGPT. After suggesting Gao had made a mistake, the program admitted the papers didn’t exist.

Stunned, Gao repeated the exercise and got the same fake results, along with two completely different summaries of a fictional paper’s findings.

“It looks so real,” he said, adding that ChatGPT’s results “should be fact-based, not fabricated by the program.”

ChatGPT itself told Gao it would learn from these mistakes.

Microsoft, for instance, is developing a system for researchers called BioGPT that will focus on clinical research, not consumer health care.


Guardrails for medical chatbots

Halamka sees tremendous promise for chatbots and other AI technologies in health care but said they need “guardrails and guidelines” for use.

“I wouldn’t release it without that oversight,” he said.

Halamka is part of the Coalition for Health AI, a collaboration of 150 experts from academic institutions like his, government agencies and technology companies, formed to craft guidelines for using artificial intelligence algorithms in health care. “Enumerating the potholes in the road,” as he put it.

U.S. Rep. Ted Lieu, a Democrat from California, filed legislation in late January (drafted using ChatGPT) “to ensure that the development and deployment of AI is done in a way that is safe, ethical and respects the rights and privacy of all Americans, and that the benefits of AI are widely distributed and the risks are minimized.”

Halamka said his recommendation would be to require medical chatbots to disclose sources they used for training. “Credible data sources curated by humans” should be the standard, he said.

Then, he wants to see ongoing monitoring of the performance of AI, perhaps via a nationwide registry, making public the good things that came from programs like ChatGPT as well as the bad.

Health and patient safety coverage at USA TODAY is made possible in part by a grant from the Masimo Foundation for Ethics, Innovation and Competition in Healthcare. The Masimo Foundation does not provide editorial input.
