Mass General Brigham Study Finds ChatGPT 4 Excels at Picking the Right Imaging Tests

Jun 22, 2023

Researchers detail the first study to date finding that ChatGPT can support the clinical decision-making process, including when picking the correct radiological imaging tests for breast cancer screening or breast pain

A new study by investigators from Mass General Brigham has found that artificial intelligence (AI) language models like ChatGPT can accurately identify appropriate imaging services for two important clinical presentations: breast cancer screening and breast pain. The results suggest that large language models could help primary care doctors and referring providers evaluate patients and order imaging tests for these presentations. The study is published in the Journal of the American College of Radiology.

"In this scenario, ChatGPT's abilities were impressive," said corresponding author Marc D. Succi, MD, associate chair of Innovation and Commercialization at Mass General Brigham Radiology and executive director of the MESH Incubator. "I see it acting like a bridge between the referring healthcare professional and the expert radiologist — stepping in as a trained consultant to recommend the right imaging test at the point of care, without delay. This could reduce administrative time on both referring and consulting physicians in making these evidence-backed decisions, optimize workflow, reduce burnout, and reduce patient confusion and wait times."

ChatGPT is a large language model (LLM) trained on data from the internet to answer questions in a human-like way. Since ChatGPT was introduced in November 2022, researchers worldwide have been exploring how these AI tools can be used in medical scenarios. Published as a preprint on February 7, 2023, this study is the first of its kind to test ChatGPT's clinical decision-making abilities, and the first to test GPT 4 as opposed to older iterations.

When a primary care doctor orders specialized testing, say for a patient who complains of breast pain, they may not know the best imaging test to choose. It might be an MRI, an ultrasound, a mammogram, or another imaging test. Radiologists generally follow the American College of Radiology's Appropriateness Criteria to make these decisions. These evidence-backed guidelines are well known to specialists but less familiar to non-specialists, who may need to pick the best imaging test during a patient's visit. This can cause confusion on the patient's side and can lead to patients getting tests they don't need or getting the wrong tests.

The researchers asked OpenAI's ChatGPT 3.5 and 4 to recommend imaging tests for 21 made-up patient scenarios involving either breast cancer screening or a report of breast pain, then compared the answers with the Appropriateness Criteria.

They posed each scenario in two ways: as an open-ended question and as a multiple-choice question with a list of imaging options. They tested ChatGPT 3.5 as well as ChatGPT 4, a newer, more advanced version. ChatGPT 4 outperformed 3.5, especially when given the available imaging options. For example, when asked about breast cancer screenings and given multiple-choice imaging options, ChatGPT 3.5 answered an average of 88.9% of prompts correctly, and ChatGPT 4 got about 98.4% right.
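
As a rough illustration of how a percent-correct figure like those above can be computed, the sketch below scores a model's multiple-choice picks against a guideline answer key. The scenario IDs, option labels, and the function name percent_correct are hypothetical and not taken from the study, whose own scoring method may have differed.

```python
def percent_correct(model_answers, guideline_answers):
    """Return the percentage of scenarios where the model's pick matches the guideline."""
    if not guideline_answers:
        raise ValueError("guideline_answers must not be empty")
    # Count scenarios where the model chose the guideline-recommended option.
    # A scenario the model did not answer counts as incorrect.
    matches = sum(
        1
        for scenario_id, expected in guideline_answers.items()
        if model_answers.get(scenario_id) == expected
    )
    return 100.0 * matches / len(guideline_answers)


# Hypothetical answer key and model responses (illustrative only, not study data).
guideline = {"scenario_1": "option_a", "scenario_2": "option_b", "scenario_3": "option_c"}
model = {"scenario_1": "option_a", "scenario_2": "option_d", "scenario_3": "option_c"}

score = percent_correct(model, guideline)
print(f"{score:.1f}% correct")
```

Averaging this score over repeated runs of the same prompts would give an average percentage of the kind reported above.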

"This study doesn't compare ChatGPT to existing radiologists because the existing gold standard is actually a set of guidelines from the American College of Radiology, which is the comparison we performed," Succi said. "This is purely an additive study, so we are not arguing that the AI is better than your doctor at choosing an imaging test but can be an excellent adjunct to optimize a doctor's time on non-interpretive tasks."

Integrating AI into medical decision-making could happen at the point of care. When a primary care doctor enters data into an electronic health record, the program could alert them to the best imaging options, suggesting the right test to order and giving the patient an idea of what to expect when they go for the test.

Researchers added that a more advanced medical AI could be created using datasets from hospitals and other research institutions to make it more specific to health-focused applications.

"We may be able to finetune ChatGPT with different patient and therapeutic data and knowledge sets to tailor it to specific patient populations," Succi said. "At Mass General Brigham, we have specialized centers of excellence where we care for patients with some of the most complex and rare diseases. We can leverage our experience and lessons learned from caring for these patient cases to train a model to provide support for rare and complex diagnoses and then make that model available to centers around the world, especially centers that may treat these conditions less frequently."

But before any AI could be involved in medical decision-making, it would need to be extensively tested for bias and privacy concerns and approved for use in medical settings. New regulations around medical AI could also play a big role in what makes it into patient care interactions.

Disclosures: The authors declare that there are no competing interests.

Funding: The project described was supported in part by award Number T32GM144273 from the National Institute of General Medical Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health.

Paper cited: Arya Rao, et al. “Evaluating GPT as an Adjunct for Radiologic Decision-Making: GPT4 vs GPT3.5 in a Breast Imaging Pilot” Journal of the American College of Radiology

Media contact

Tim Sullivan

Senior Program Manager, External Communications

tsullivan11@mgb.org

About Mass General Brigham

Mass General Brigham is an integrated academic health care system, uniting great minds to solve the hardest problems in medicine for our communities and the world. Mass General Brigham connects a full continuum of care across a system of academic medical centers, community and specialty hospitals, a health insurance plan, physician networks, community health centers, home care, and long-term care services. Mass General Brigham is a nonprofit organization committed to patient care, research, teaching, and service to the community. In addition, Mass General Brigham is one of the nation’s leading biomedical research organizations with several Harvard Medical School teaching hospitals. For more information, please visit massgeneralbrigham.org.
