A common truism among statisticians is that "the data don't lie." However, recent findings by Italian researchers may make anyone who works with data think twice before making that assumption.
Giuseppe Giannaccare, an eye surgeon at the University of Cagliari in Italy, reports that ChatGPT has conjured up a wealth of convincing false data to support one eye surgical procedure over another.
“GPT-4 created a fake dataset of hundreds of patients in minutes,” Giannaccare said. “It was a surprising, but frightening, experience.”
There have been countless stories about the great achievements and potential of ChatGPT since the model was unveiled to the world a year ago. But alongside the positives have come stories of ChatGPT producing erroneous, inaccurate, or downright false information.
This month, the Cambridge Dictionary named "hallucinate," the tendency of large language models to spontaneously produce false information, its word of the year.
For students who cite fabricated sources in their papers, the consequence might be a failing grade. For two lawyers who unwittingly relied on ChatGPT last spring to produce case law that turned out to be fabricated, the penalty was a $5,000 fine and court sanctions.
But with evidence that false data can infiltrate medical studies and influence medical procedures, the threat and its consequences are much more serious.
"It was one thing that generative AI could be used to generate text that would not be detectable using anti-plagiarism software, but the ability to create fake but realistic datasets is another level of concern," says Elisabeth Bik, a research integrity consultant in San Francisco. "This will make it very easy for any researcher or group of researchers to create false measurements on non-existent patients, false responses to questionnaires, or generate a large dataset on animal experiments."
Giannaccare and his team asked GPT-4, paired with Advanced Data Analysis, its Python-based data analysis tool, to generate clinical trial data for two treatment approaches for a common eye disorder, keratoconus.
The model was fed a large volume of "very complex" prompts detailing eye conditions, subject statistics, and a set of rules for producing the results. The team then asked it to produce "significantly better visual and topographical results" for one procedure compared to the other.
The result was a compelling argument for the preferred procedure, built on entirely fabricated data. Genuine clinical tests had previously shown no significant difference between the two approaches.
"It seems like it's quite easy to create data sets that are at least superficially plausible," said Jack Wilkinson, a biostatistician at the University of Manchester, UK. He said the GPT-4 results "to the untrained eye certainly look like a real data set."
"The aim of this research was to shed light on the dark side of AI, by demonstrating how easy it is to create and manipulate data to deliberately obtain biased results and generate false medical evidence," said Giannaccare. "A Pandora's box is open and we do not yet know how the scientific community will respond to potential abuses and threats related to AI."
The article, "Large Language Model Advanced Data Analysis Abuse to Create a Fake Data Set in Medical Research," published in the journal JAMA Ophthalmology, acknowledges that closer scrutiny of the data could reveal telltale signs of fabrication. One such sign was an unnaturally high number of fabricated subject ages ending in 7 or 8.
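Checks of this kind can be automated. As a rough illustration (not the authors' actual method), the sketch below uses a chi-square statistic to test whether the final digits of a list of ages are roughly uniform, as they tend to be in genuine samples; the function name and example data are hypothetical.

```python
from collections import Counter

def last_digit_chi2(ages):
    """Chi-square statistic for uniformity of final digits (9 degrees of freedom).

    Values far above ~16.9 (the 5% critical value for df = 9) suggest
    the final digits are not uniform, a possible sign of fabrication.
    """
    counts = Counter(age % 10 for age in ages)
    expected = len(ages) / 10  # uniform expectation per digit
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))

# Hypothetical dataset: 100 ages with final digits 7 and 8 over-represented,
# mimicking the pattern reported in the paper.
suspicious = [27, 38, 47, 58, 67] * 20
print(last_digit_chi2(suspicious))  # prints 420.0, far above ~16.9
```

A real screening pipeline would pair tests like this with other forensic checks (duplicate records, implausible correlations), but even this simple digit test flags the pattern described in the article.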
Giannaccare said that just as AI-generated results can contaminate evidence-based studies, AI can also contribute to the development of better fraud-detection approaches.
“Appropriate use of AI can be very beneficial to scientific research,” he said, adding that it “will make a substantial difference to the future of academic integrity.”
More information:
Andrea Taloni et al, Abuse of advanced data analysis on a large language model to create a fake dataset in medical research, JAMA Ophthalmology (2023). DOI: 10.1001/jamaophthalmol.2023.5162
© 2023 Science X Network
Citation: ChatGPT creates convincing fake medical report (2023, November 28) retrieved November 29, 2023 from
This document is subject to copyright. Except for fair use for private study or research purposes, no part may be reproduced without written permission. The content is provided for information only.