Where we live and work, our age, and the conditions in which we grew up can influence our health and lead to disparities, but these factors can be difficult for clinicians and researchers to capture and address.
A new study led by researchers at Mass General Brigham demonstrates that large language models (LLMs), a type of generative artificial intelligence (AI), can be trained to automatically extract information about social determinants of health (SDoH) from clinician notes, which could support efforts to identify patients who may benefit from resource support.
Results published in npj Digital Medicine show that the fine-tuned models identified 93.8 percent of patients with adverse SDoH, whereas official diagnostic codes included this information in only 2 percent of cases. The specialized models were also less prone to bias than general-purpose models such as GPT-4.
“Our goal is to identify patients who could benefit from resource and social work support and to draw attention to the under-documented impact of social factors on health outcomes,” said corresponding author Danielle Bitterman, MD, a faculty member in the Artificial Intelligence in Medicine (AIM) Program at Mass General Brigham and a physician in the Department of Radiation Oncology at Brigham and Women’s Hospital.
“Algorithms that can pass major medical exams have received a lot of attention, but that’s not what doctors need in the clinic to better care for patients every day. Algorithms that can detect what doctors may miss in the ever-increasing volume of medical records will be more clinically relevant and therefore more powerful for improving health.”
Health disparities are largely linked to SDoH, including employment, housing, and other non-medical circumstances that impact medical care. For example, a cancer patient’s distance from a major medical center or the support they receive from a partner can significantly influence outcomes. Although clinicians can summarize relevant SDoH in their visit notes, this vital information is rarely systematically organized in the electronic health record (EHR).
To create LLMs that could extract information about SDoH, the researchers manually reviewed 800 clinician notes from 770 cancer patients who received radiation therapy in the radiation oncology department at Brigham and Women’s Hospital. They labeled phrases referring to one or more of six predetermined SDoH: employment status, housing, transportation, parental status (whether the patient has a child under 18), relationships, and the presence or absence of social support.
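As a rough illustration, an annotated training example under this kind of scheme would pair a sentence from a note with the SDoH categories it mentions. The sketch below is hypothetical: the field names and the sample sentence are invented, and only the six category names come from the article.

```python
# Hypothetical sketch of one annotated example; the six SDoH categories
# follow the article, but the record structure and sentence are invented.
SDOH_CATEGORIES = [
    "employment_status",
    "housing",
    "transportation",
    "parental_status",
    "relationship_status",
    "social_support",
]

example_annotation = {
    "sentence": "Patient lives alone and has no transportation to appointments.",
    "labels": ["transportation", "social_support"],
}
```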
Using this “annotated” dataset, the researchers trained existing LLMs to identify references to SDoH in clinicians’ notes. They then tested the models on 400 clinical notes from patients treated with immunotherapy at Dana-Farber Cancer Institute and patients admitted to intensive care units at Beth Israel Deaconess Medical Center.
The researchers found that the fine-tuned LLMs, particularly the Flan-T5 models, could consistently identify rare references to SDoH in clinicians’ notes. The “learnability” of these models was limited by the paucity of SDoH documentation in the training set: only 3 percent of sentences in the clinicians’ notes contained any mention of SDoH.
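Because Flan-T5 is a text-to-text model, extraction of this kind can be framed as generating a short label string from a sentence. The sketch below shows what inference with such a fine-tuned model could look like using the Hugging Face transformers library; the checkpoint name, prompt wording, and label format are assumptions for illustration, not details from the study.

```python
# Minimal inference sketch: ask a (hypothetically) fine-tuned Flan-T5 model
# to name any SDoH categories mentioned in one sentence from a clinician note.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"  # stand-in; the study's fine-tuned checkpoint is assumed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

sentence = "Patient lives alone and has no transportation to appointments."
prompt = (
    "List any social determinants of health mentioned in this sentence "
    "(employment, housing, transportation, parental status, relationship, "
    "social support), or answer 'none': " + sentence
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```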
To address this issue, the researchers used ChatGPT, another LLM, to produce 900 additional synthetic examples of SDoH sentences that could serve as a supplementary training dataset.
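A data-augmentation step along these lines could be scripted against the OpenAI API; the model name and prompt wording below are illustrative assumptions rather than the researchers’ actual setup.

```python
# Illustrative sketch of generating synthetic SDoH sentences for augmentation.
# Requires OPENAI_API_KEY in the environment; the prompt is an assumption.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": (
                "Write 5 short, de-identified sentences that could appear in a "
                "clinician's note and that mention housing instability."
            ),
        }
    ],
)
print(response.choices[0].message.content)
```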
One of the major criticisms of generative AI models in healthcare is that they can perpetuate bias and widen health disparities. The researchers found that their fine-tuned LLMs were less likely than OpenAI’s GPT-4, a general-purpose LLM, to change their SDoH determinations based on an individual’s race/ethnicity or gender.
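One common way to probe for this kind of bias is to swap demographic descriptors in otherwise identical sentences and check whether the model’s determination changes. The snippet below is a simplified sketch of that idea, not the study’s evaluation code; the model name, template, and prompt are assumptions.

```python
# Sketch of a demographic-swap bias probe: the SDoH determination should not
# change when only race/ethnicity or gender wording changes.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"  # stand-in for a fine-tuned SDoH model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

template = "The patient is a {descriptor} who recently lost their job."
descriptors = ["white man", "Black man", "white woman", "Black woman"]

for descriptor in descriptors:
    prompt = (
        "Does this sentence mention an adverse social determinant of health? "
        "Answer yes or no: " + template.format(descriptor=descriptor)
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=8)
    print(descriptor, "->", tokenizer.decode(outputs[0], skip_special_tokens=True))
```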
The researchers note that it is difficult to understand how biases form and how they can be dismantled, in both humans and computational models, and that tracing the origins of algorithmic bias remains ongoing work.
“If we don’t monitor algorithmic bias when we develop and implement large language models, we could make existing health disparities worse than they are now,” Bitterman said. “This study demonstrated that fine-tuning LLMs can be a strategy to reduce algorithmic bias, but further research is needed in this area.”
More information:
Large language models to identify social determinants of health in electronic health records, npj Digital Medicine (2024). DOI: 10.1038/s41746-023-00970-0
Provided by Mass General Brigham
Citation: Generative artificial intelligence models effectively highlight social determinants of health in doctors’ notes (January 11, 2024)