The first documented case of pancreatic cancer dates back to the 18th century. Since then, researchers have undertaken a long and difficult odyssey to understand this elusive and deadly disease. To date, there is no better cancer treatment than early intervention. Unfortunately, because the pancreas is nestled deep in the abdomen, pancreatic cancer is particularly difficult to detect early.
Scientists at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), alongside Limor Appelbaum, a researcher in the Department of Radiation Oncology at Beth Israel Deaconess Medical Center (BIDMC), were eager to better identify potential high-risk patients. They set out to develop two machine learning models for the early detection of pancreatic ductal adenocarcinoma (PDAC), the most common form of the cancer.
To access a large and diverse database, the team partnered with a federated network company, using electronic health record data from various institutions across the United States. This large pool of data helped ensure the reliability and generalizability of the models, making them applicable to a wide range of populations, geographies and demographic groups.
Both models, the PRISM neural network and the logistic regression model (a statistical technique for estimating probability), outperformed current methods. The team’s comparison showed that while standard screening criteria identify about 10% of PDAC cases using a five-fold higher relative risk threshold, PRISM can detect 35% of PDAC cases at that same threshold.
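To make the comparison above concrete, here is a minimal sketch of how "share of cases detected at a relative risk threshold" could be computed from a list of risk scores. It assumes relative risk means the incidence among flagged patients divided by the incidence in the whole cohort; the study may define it differently, and the function name and parameters are hypothetical, not taken from the paper.

```python
def sensitivity_at_relative_risk(scores, labels, min_relative_risk=5.0):
    """Return (cutoff, sensitivity) for the largest flagged group whose
    incidence is at least `min_relative_risk` times the overall incidence.

    `scores` are model risk scores, `labels` are 1 for a case and 0 otherwise.
    Returns None if there are no cases or no group reaches the threshold.
    Assumption: relative risk = flagged-group incidence / cohort incidence.
    """
    total = len(labels)
    total_cases = sum(labels)
    if total == 0 or total_cases == 0:
        return None
    base_rate = total_cases / total

    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best = None
    flagged = 0
    flagged_cases = 0
    for i, (score, label) in enumerate(ranked):
        flagged += 1
        flagged_cases += label
        # Evaluate only at the end of a run of tied scores,
        # so that "score >= cutoff" describes the flagged group exactly.
        if i + 1 < total and ranked[i + 1][0] == score:
            continue
        relative_risk = (flagged_cases / flagged) / base_rate
        if relative_risk >= min_relative_risk:
            # Fraction of all cases that fall inside the flagged group.
            best = (score, flagged_cases / total_cases)
    return best
```

Calling the function with the same `min_relative_risk` for two different scoring methods gives two sensitivities that can be compared directly, which is the kind of comparison behind the 10% versus 35% figures.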
The use of AI to detect cancer risk is not new: algorithms already analyze mammograms and CT scans for lung cancer, and assist in the analysis of Pap tests and HPV tests, to name only a few applications.
“PRISM models are notable for their development and validation on a large database of more than 5 million patients, surpassing the scale of most previous research in the field,” says Kai Jia, an MIT Ph.D. student in electrical engineering and computer science (EECS), an MIT CSAIL affiliate, and first author of an article in eBioMedicine describing the new work.
“The model uses routine clinical and laboratory data to make its predictions, and the diversity of the U.S. population is a significant advance over other PDAC models, which are typically confined to specific geographic regions, such as a few health centers in the United States. Additionally, the use of a unique regularization technique in the training process improved the generalizability and interpretability of the models.”
“This report presents a powerful approach to using big data and artificial intelligence algorithms to refine our approach to identifying cancer risk profiles,” says David Avigan, a Harvard Medical School professor and the cancer center director and head of the department of hematology and hematologic malignancies at BIDMC, who did not participate in the study. “This approach could lead to new strategies for identifying patients at high risk of malignancy who may benefit from targeted screening with potential for early intervention.”
Prismatic Perspectives
The journey to developing PRISM began more than six years ago, fueled by first-hand experience with the limitations of current diagnostic practices. “Approximately 80 to 85 percent of pancreatic cancer patients are diagnosed at advanced stages, where cure is no longer an option,” says lead author Appelbaum, who is also an instructor at Harvard Medical School and a radiation oncologist. “This clinical frustration sparked the idea of exploring the wealth of data available in electronic health records (EHRs).”
The CSAIL group’s close collaboration with Appelbaum provided insight into the combined medical and machine learning aspects of the problem, ultimately leading to a much more accurate and transparent model. “The hypothesis was that these records contained hidden clues, subtle signs and symptoms that could serve as early warning signals of pancreatic cancer,” she adds. “This guided our use of federated EHR networks in the development of these models, for a scalable approach to deploying risk prediction tools in healthcare.”
The PrismNN and PrismLR models analyze EHR data, including patient demographics, diagnoses, medications, and laboratory results, to assess PDAC risk. PrismNN uses artificial neural networks to detect complex patterns in data features such as age, medical history and laboratory results, resulting in a risk score for the likelihood of PDAC. PrismLR uses logistic regression for simpler analysis, generating a PDAC probability score based on these characteristics. Together, the models provide an in-depth evaluation of different approaches to predict PDAC risk from the same EHR data.
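The difference between the two approaches described above can be illustrated with a minimal sketch. It is not the published PrismNN or PrismLR code: the function names, the single hidden layer and the ReLU activation are assumptions chosen for brevity, and any weights passed in are hypothetical.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_risk(features, weights, bias):
    """Logistic regression: one weighted sum of the features, then a sigmoid.

    Each weight states directly how a feature pushes the risk up or down,
    which is why this kind of model is comparatively easy to interpret.
    """
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def neural_risk(features, hidden_weights, hidden_biases, out_weights, out_bias):
    """Tiny neural network (assumed shape): one ReLU hidden layer, sigmoid output.

    The hidden units can combine features, so the network can represent
    interactions that a single weighted sum cannot.
    """
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    z = sum(w * h for w, h in zip(out_weights, hidden)) + out_bias
    return sigmoid(z)
```

Both functions take the same feature vector (for example, scaled age, a diabetes flag and a visit count, all hypothetical encodings here) and return a score between 0 and 1, so the two model families can be evaluated side by side on identical EHR-derived inputs.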
According to the team, a key to gaining doctors’ trust is a better understanding of how the models work, known in the field as interpretability. The scientists pointed out that although logistic regression models are inherently easier to interpret, recent advances have made deep neural networks somewhat more transparent.
This helped the team refine the thousands of potentially predictive features derived from a single patient’s EHR down to approximately 85 critical indicators. These indicators, which include patient age, diabetes diagnosis and increased frequency of doctor visits, are automatically discovered by the model but correspond to doctors’ understanding of risk factors associated with pancreatic cancer.
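The article does not spell out how the feature set was narrowed, so the sketch below uses a stand-in technique, L1-regularized logistic regression fitted by proximal gradient descent, only to show how a training penalty can drive the weights of uninformative features to exactly zero. The function names, the penalty strength and the learning rate are all hypothetical, and this is not the team's regularization method.

```python
import math


def soft_threshold(value, amount):
    """Shrink `value` toward zero by `amount`; small values become exactly 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_l1_logistic(rows, labels, l1_strength=0.1, learning_rate=0.5, steps=500):
    """Fit logistic regression with an L1 penalty (stand-in technique).

    Uses proximal gradient descent: a gradient step on the average log-loss,
    followed by soft-thresholding of every weight. The bias is not penalized.
    """
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = 1.0 / (1.0 + math.exp(-z)) - y  # predicted risk minus label
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [
            soft_threshold(w - learning_rate * g / n, learning_rate * l1_strength)
            for w, g in zip(weights, grad_w)
        ]
        bias -= learning_rate * grad_b / n
    return weights, bias


def selected_features(names, weights):
    """Keep only the features whose weight survived the penalty."""
    return [name for name, w in zip(names, weights) if w != 0.0]
```

On a toy dataset where only one column tracks the label, `selected_features` returns just that column; applied to thousands of EHR-derived columns, the same idea yields a short list that clinicians can review against known risk factors.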
The path to follow
Despite the promise of the PRISM models, as with all research, some parts are still a work in progress. The models are currently trained and validated on U.S. data alone, so they will require testing and adaptation for global use. The way forward, the team notes, is to expand the models’ applicability to international datasets and incorporate additional biomarkers for more refined risk assessment.
“A further goal for us is to facilitate the implementation of the models in routine healthcare settings. The vision is to run these models seamlessly in the background of healthcare systems, automatically analyzing patient data and alerting doctors to high-risk cases without adding to their workload,” Jia explains.
“A machine learning model integrated into the EHR system could allow doctors to receive early alerts for high-risk patients, which could allow intervention well before symptoms manifest. We look forward to deploying our techniques in the real world to help all individuals enjoy longer, healthier lives.”
Jia wrote the paper alongside Appelbaum and Martin Rinard, an MIT EECS professor and CSAIL principal investigator, who are both lead authors of the paper.
More information:
Kai Jia et al, A pancreatic cancer risk prediction model (Prism) developed and validated on large-scale US clinical data, eBioMedicine (2023). DOI: 10.1016/j.ebiom.2023.104888
Provided by the Massachusetts Institute of Technology
This story is republished courtesy of MIT News (web.mit.edu/newsoffice/), a popular site that covers news about MIT research, innovation and education.
Citation: New hope for early intervention against pancreatic cancer via AI-based risk prediction (January 18, 2024) retrieved January 18, 2024 from