Mass-produced fake research undermines real science
Always check: who did the research and who funded it
By Queensland University of Technology
Researchers have built a machine learning tool
that spots signs of mass-produced science, and its first major test suggests
the scale could be startling. The system flagged more than 250,000 cancer
research papers that may be tied to so-called “paper mills,” groups that churn
out manuscripts for sale.
The study was led by QUT researcher
Professor Adrian Barnett from the School of Public Health and Social Work and
Australian Centre for Health Services and Innovation (AusHSI), working with an
international team. Reported in The BMJ, the project reviewed 2.6 million
cancer studies published from 1999 to 2024 and looked for repeated writing
habits linked to papers that were later withdrawn.
Instead of searching for obvious red flags like duplicated
figures or impossible data, the tool focuses on language itself. The
researchers found more than 250,000 papers whose writing patterns resembled
those seen in articles already retracted for suspected fabrication, suggesting
that template-driven writing can leave behind recognizable traces.
How Paper Mills Operate
“Paper mills are companies that sell fake or low-quality
scientific studies. They are producing ‘research’ on an industrial scale, and
our findings suggest the problem in cancer research is far larger than most
people realized,” Professor Barnett said.
Paper mills can offer everything from a paid author slot to
an entire completed manuscript. To produce work quickly, they may recycle
blocks of text, rely on unnatural phrasing, or invent supporting data and
images, creating papers that can look plausible at a glance while still being
unreliable.
“Most likely, they’re relying on boilerplate templates which
can be detected by large language models that analyze patterns in texts,”
Professor Barnett said.
To detect those patterns, Barnett’s team trained a language
model called BERT to recognize subtle textual “fingerprints” that show up again
and again in known paper mill products.
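The study's code and model are not reproduced here, but the core intuition is that recycled templates leave repeated phrasing behind. As a rough, hedged illustration of that idea, the sketch below scores a passage against a known boilerplate template using word-trigram overlap (Jaccard similarity). This is a deliberately simple stand-in, not the BERT classifier the team actually trained, and all text snippets and thresholds are invented for illustration.

```python
# Illustrative stand-in for detecting "boilerplate fingerprints" in text.
# The real study fine-tuned BERT; this sketch uses word-trigram Jaccard
# overlap against a known template, capturing the same intuition:
# recycled phrasing leaves repeated n-grams behind.

def trigrams(text):
    """Return the set of lowercase word trigrams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

def overlap_score(candidate, template):
    """Jaccard similarity between the trigram sets of two texts."""
    a, b = trigrams(candidate), trigrams(template)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Invented example of template phrasing of the kind seen in retracted papers.
TEMPLATE = ("the expression of gene x was significantly upregulated in "
            "tumor tissues compared with adjacent normal tissues")

suspect = ("the expression of gene y was significantly upregulated in "
           "tumor tissues compared with adjacent normal tissues")
original = ("we observed heterogeneous staining patterns across the "
            "patient cohort with no consistent directional change")

print(overlap_score(suspect, TEMPLATE))   # high overlap: near-identical phrasing
print(overlap_score(original, TEMPLATE))  # no shared trigrams
```

A production system like the one described would instead learn such patterns statistically across millions of documents, rather than matching against hand-picked templates.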
When the model was evaluated using verified examples, it
correctly identified suspicious papers 91 percent of the time, pointing to a
potential new way to help publishers and researchers decide what deserves
closer scrutiny.
“We’ve essentially built a scientific spam filter,”
Professor Barnett said.
“Just like your email system can spot unwanted messages, our
tool flags papers that match the writing style and structure we see in
retracted, fraudulent work.”
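The article reports only a single headline number (91 percent correct identification of suspicious papers), without spelling out the full metric mix. As a reminder of how screening performance like this is typically summarized, the sketch below computes accuracy, precision, and recall from an invented confusion matrix; none of these counts come from the BMJ study.

```python
# Standard classification metrics for a screening tool, computed from an
# invented confusion matrix (counts are for illustration only, not from
# the BMJ study).

tp = 91   # suspicious papers correctly flagged
fn = 9    # suspicious papers missed
fp = 5    # legitimate papers wrongly flagged
tn = 95   # legitimate papers correctly passed

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of flagged papers, how many were truly suspicious
recall = tp / (tp + fn)      # of suspicious papers, how many were caught

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

For a spam-filter-style screen, precision matters as much as recall: every false positive sends a legitimate manuscript to needless human review.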
Key Trends and Areas of Concern
Key findings from the large-scale analysis include:
- Flagged papers have increased dramatically over two decades, rising from around 1 percent in the early 2000s to a peak of over 16 percent in 2022.
- The issue affects thousands of journals across major publishers, including high-impact titles.
- The problem is most concentrated in fields such as molecular cancer biology and early-stage laboratory research.
- Some cancer types, including gastric, liver, bone, and lung cancer, show especially high rates of suspicious papers.
Three scientific journals are already piloting the tool as
part of their editorial screening. It will allow editors to identify
potentially fabricated manuscripts before they are sent for peer review.
The team plans to expand the tool to other fields of
research and improve the model as more confirmed cases of paper-mill activity
become available. They stress that flagged papers are not confirmed cases of
research fraud and should be reviewed by human specialists.
“Cancer research influences clinical trials, drug
development, and patient care,” Professor Barnett said.
“If fabricated studies make their way into the evidence
base, they can mislead real scientists and ultimately slow progress for
patients. That’s why it’s vital we get ahead of this problem.”
Reference: “Machine learning based screening of potential
paper mill publications in cancer research: methodological and cross sectional
study” by Baptiste Scancar, Jennifer A Byrne, David Causeur and Adrian G
Barnett, 30 January 2026, BMJ.
DOI: 10.1136/bmj-2025-087581
