ABSTRACT
Background Integrity of academic publishing is increasingly undermined by fake science publications massively produced by commercial “editing services” (so-called “paper mills”). They use AI-supported, automated production techniques at scale and sell fake publications to students, scientists, and physicians under pressure to advance their careers. Because the scale of fake publications in biomedicine is unknown, we developed a simple method to red-flag them and estimate their number.
Methods To identify indicators able to red-flag fake publications (RFPs), we sent questionnaires to authors. Based on author responses, three indicators were identified: “author’s private email”, “international co-author” and “hospital affiliation”. These were used to analyze 15,120 PubMed®-listed publications regarding date, journal, impact factor, and country of author and validated in a sample of 400 known fakes and 400 matched presumed non-fakes using classification (tallying) rules to red-flag potential fakes. For a subsample of 80 papers we used an additional indicator related to the percentage of RFP citations.
Results The classification rules using two (three) indicators had sensitivities of 86% (90%) and false alarm rates of 44% (37%). From 2010 to 2020 the RFP rate increased from 16% to 28%. Given the 1.3 million biomedical Scimago-listed publications in 2020, we estimate the scope of >300,000 RFPs annually. Countries with the highest RFP proportion are Russia, Turkey, China, Egypt, and India (39%-48%), with China, in absolute terms, as the largest contributor of all RFPs (55%).
Conclusions Potential fake publications can be red-flagged using simple-to-use, validated classification rules to earmark them for subsequent scrutiny. RFP rates are increasing, suggesting higher actual fake rates than previously reported. The scale and proliferation of fake publications in biomedicine can damage trust in science, endanger public health, and impact economic spending and security. Easy-to-apply fake detection methods, as proposed here, or more complex automated methods can help prevent further damage to the permanent scientific record and enable the retraction of fake publications at scale.
INTRODUCTION
Trust in the integrity of academic publishing is a foundation of science, and lack of it damages its reputation (Behl, 2021; Byrne, 2019; Seifert, 2021; Else & Van Norden, 2021; Else, 2022; Byrne et al., 2022). Well-known cases of scientific misconduct by individual researchers include ghost and “honorary” authorships (Flanagin et al., 1998; Wislar et al., 2011; Frederickson and Herzog, 2021), cherry-picking, abstract spin, plagiarism of images (Bik et al., 2016), and outright data fabrication (Bik, 2020; Byrne and Christopher, 2020; Park et al., 2022). While individual fraud has been recognized for centuries, the recent emergence of commercial production of fake publications is a new and unprecedented development (Flanagin et al., 1998; Wislar et al., 2011; Mavrogenis et al., 2018; Byrne, 2019; Byrne and Christopher, 2020; Else and Van Norden, 2021; Sabel and Seifert, 2021; Sabel, 2022; Chawla Singh, 2022; Candal-Pedreira et al., 2022). The major sources of fake publications are 1,000+ “academic support” agencies – so-called “paper mills” – located mainly in China, India, Russia, UK, and USA (Abalkina, 2021; Else, 2021; Pérez-Neri et al., 2022). Paper mills advertise writing and editing services via the internet and charge hefty fees to produce and publish fake articles in journals listed in the Science Citation Index (SCI) (Christopher, 2021; Else, 2022). Their services include manuscript production based on fabricated data, figures, tables, and text semi-automatically generated using artificial intelligence (AI). Manuscripts are subsequently edited by an army of scientifically trained professionals and ghostwriters.
Although their quality is relatively low (Cabanac and Labbé, 2021), fake publications nevertheless often pass peer review in established journals with low to medium impact factors (IF 1-6) (Seifert, 2021). Some governments, funding bodies, and academic publishers are on the alert (Cyranoski, 2018; Mallapaty, 2020; Else, 2022; Candal-Pedreira et al., 2022), yet many scientists, journal editors, and learned societies appear to be surprisingly unaware that such publications exist at all.
Paper mill customers – students and scientists – are pressured to publish in SCI publications by their academic or government institutions or university-affiliated hospitals (Pérez-Neri et al., 2022). For example, the Beijing municipal health authorities require a fixed number of first-authored SCI articles for physicians to qualify for promotion (Else and Van Norden, 2021). Academic policies that count publications and value impact factors as surrogates for scientific excellence can force graduate students to fulfill SCI publication requirements and pressure scientists and physicians to meet publication quotas to attain salary increases, promotion, and/or scientific reputation. Paper mills are ready to help and offer their services to accomplish these goals.
We are aware of several instances where paper mills tried to promote their business by inviting journal editors to collaborate, as shown by this unsolicited email in 2022 from a paper mill to one of us who is editor of a biomedical journal (see Tab. 1 for interview excerpts):
We are a well-known academic support institution from Guangzhou, China, which has been established for 8 years. … For reducing the publication time, we expect to cooperate with you in the future. Cooperation mode: we cite the content of your journal in our articles, thus increasing … your impact factor in 2022. You shall help us shorten the publication time. Payment: If an article is successfully published, we will pay for it at the price: IF*1,000 USD/article. For example, with IF=2.36, total payment=2.36*1,000 USD=2,360 USD. And this price is negotiable.
This attempted corruption motivated us to quantitatively analyze the global scope of fake publishing. Because the problem is still perceived to be small (an estimated 1 of 10,000 publications, Tab. 2), publishers and learned societies are just beginning to adjust editorial, peer-review, and publishing procedures. Yet the actual scale of fake publishing remains unknown, despite the fact that the number of reports on paper mills is increasing.
To be able to estimate the scope of fake publishing, a method is needed to identify potential (red-flag) fake publications (RFPs). We therefore looked for potential indicators of fakes that are easy to use by reviewers, editors, and publishers and tested their feasibility for randomly selected neuroscience and medical journals. We then developed classification (tallying) rules for screening for potential fakes and determined their sensitivity and false alarm rates. Here, we report the proportion of RFPs as an upper estimate of the true number of fake publications.
METHODS
Exploration
To search for potential fake indicators, in Study 1, one of us (editor of a neurology journal) sent a questionnaire to the corresponding authors of a sample of suspicious published articles and, for control, to those of a sample of unsuspicious articles (see Table 3). Based on the differential willingness to respond, we identified potential indicators for fake publications. These indicators should satisfy the following patterns (see below):
Authors of fake publications are reluctant to provide critical information as revealed by their response – or non-response – to the questionnaire by the editor,
the number of fake publications increases steadily over time, and
journals with a low to medium impact factor are most affected.
In Studies 1 to 6 we identified two easy-to-detect indicators: a publication was labelled as an RFP if an author used a private email and had no international partner. We then tested these indicators iteratively in a series of nine bibliographic studies covering 15,120 publications listed in PubMed® and Web of Science™ in the fields of neuroscience (including neurology) and (non-neuroscience) general medicine (Fig. 1).
Study 2 applied potential indicators to five neuroscience journals; Study 3 expanded this sample to estimate RFP growth from 2010 to 2020, including five journals in the field of general medicine. To estimate the 2020 incidence of RFPs, we increased the sample size in Studies 4 and 5. Finally, Study 6 checked the RFP rate in three open access journals.
Validation
Based on the potential indicators identified in Studies 1-6, in Study 7 we applied tallying rules to the same three indicators to a sample of publications (n=400) which have been proven fake because of fake gene sequences…., text or image plagiarism, or retractions (for retractions it was not possible to differentiate whether they were voluntary or forced):
n=100 retractions (http://retractiondatabase.org/RetractionSearch.aspx?)
n=100 Tadpole paper mill items (https://docs.google.com/spreadsheets/d/1KXqTAyl4j-jVorFPMD2XRpr76LcIKJ0CVyIvRj0exYQ/edit#gid=0)
n=100 fake gene sequences (https://dbrech.irit.fr/pls/apex/f?p=9999:28)
n=100 retractions from Journal of Cellular Biochemistry (Behl, 2021).
These were matched with 400 papers presumed non-fake sampled from the same journals by selecting a fake publication’s nearest-neighbor article (unless it was also a known fake). A limitation of this “matched sample” method is that we cannot be certain whether the publications presumed non-fake may actually be fake as well. To test whether the efficacy of RFP detection can be increased, we explored in Study 8 citations of fake papers as a potential indicator by counting the number of RFPs in each reference list of 80 publications (40 proven fake and 40 presumed non-fake) (i.e., 2,594 additional publications were analyzed). For our subsequent sensitivity/false-alarm rate analysis, we defined a 10% RFP citation rate threshold (“RFP citations in reference list >10%”) and used this as an additional (third) indicator.
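The citation-based indicator described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy reference list are hypothetical and not part of the study, and each cited publication is assumed to have already been classified as red-flagged or not.

```python
def rfp_citation_rate(reference_flags):
    """Return the fraction of cited publications that are red-flagged.

    `reference_flags` is a list of booleans, one per entry in the
    reference list (True = the cited publication is itself an RFP).
    """
    if not reference_flags:
        return 0.0
    return sum(reference_flags) / len(reference_flags)


def citation_indicator(reference_flags, threshold=0.10):
    """True if more than `threshold` of the reference list is red-flagged.

    The 10% default mirrors the ">10%" threshold named in the text.
    """
    return rfp_citation_rate(reference_flags) > threshold


# Hypothetical reference list: 30 references, 4 of them red-flagged.
example_refs = [True] * 4 + [False] * 26
print(citation_indicator(example_refs))
```

With 4 of 30 references red-flagged, the rate is about 13%, so the indicator is present for this toy example.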
Estimating RFP incidence
To estimate the rate of RFPs during the period of 2010-2020, data from Study 3 was analyzed bi-annually for medicine and neuroscience separately. Finally, in Study 9, RFPs were counted in neuroscience and medical journals to establish their within-country and across-countries (global) 2020 incidence.
The quantitative analysis was complemented by a qualitative one, where we analyzed >1,000 websites retrieved from Baidu and Google that advertise various editing services (search terms: “SCI-publication or –service,” “essay writing service,” “journal writing service,” “SCI ghost writing”). We also interviewed a manager of a paper mill by email and in a subsequent (recorded) Zoom meeting (Tab. 1).
RESULTS
EXPLORATION OF FAKE INDICATORS
We searched for indicators that can be determined easily, quickly, and reliably by an editor on the basis of a submitted manuscript or publication alone. In an exploration phase, our search was guided by three hypotheses:
Hypothesis 1: Authors of fake publications will be unwilling to answer quality check surveys and provide original data
In Study 1, n=215 neurology articles were manually inspected by an experienced editor; 20.5% (n=44) were deemed suspicious. A questionnaire was sent to all authors and, for control, to 48 authors of non-suspicious papers. It contained questions that authors of fake papers might be reluctant to answer (e.g., “Are you willing to provide original data?” [only 1 author of 44 suspicious articles did] and “Did you engage a professional agency to help write your paper?” [none did]; see Tab. 3). Despite repeated reminders with a warning that failure to reply – or replying inadequately – could trigger retraction, the response rate among suspected authors was only 45.4% (20/44) compared with 95.8% (46/48) for the control group. This survey provided the first indicators of red-flagged fake publications (RFP).
Hypothesis 2: Because paper mills are on the rise (Else and Van Norden, 2021), indicators uncovered in Study 1, if valid, should also increase each year
Study 2 analyzed the frequency of these indicators in five randomly chosen neuroscience journals, expanded in Study 3 to a larger sample of articles from those five neuroscience journals and an additional five medical journals bi-annually (2010-2020). The results show a rapid growth of RFPs over time in neuroscience (13.4% to 33.7%) and a somewhat smaller and more recent increase in medicine (19.4% to 24%) (Fig. 2). A cause of the greater rise of neuroscience RFPs may be that fake experiments (biochemistry, in vitro and in vivo animal studies) in basic science are easier to generate because they do not require clinical trial ethics approval by regulatory authorities.
Hypothesis 3: Because it is easier for paper mills to market their papers to journals with lower impact factors (IF; range 1-6), our indicators, if valid, should also occur more frequently in such journals
Study 4 tested our indicators in an even larger sample of randomly selected journals included in the Neuroscience Peer Review Consortium. It red-flagged 366/3,500 (10.5%) potential fakes. In Study 5, RFPs were counted in 10 randomly chosen general (non-neuroscience) medical journals, where the 2020 rate was 23.8% in PubMed-listed journals with lower IF. Study 6 revealed that the RFP rate is even greater in open access (OA) journals (112/300 =40.3%) because it is often easier to publish there.
VALIDATION OF FAKE INDICATORS
To validate the fake indicators obtained in exploration studies 1-6, we computed sensitivity/false alarm rates by comparing a sample of known fakes (n=400) with a sample of presumed non-fakes (n=400). Then we combined the two best indicators (“author’s private email” and “hospital affiliation”) to form a classification (tallying) rule: “If both indicators are present, classify as a potential fake, otherwise not” (the “AND” rule) (Katsikopoulos et al., 2020). Its detection sensitivity was 0.86 and the false-alarm rate 0.44. An “OR” classification rule (“If any of the indicators are present, classify as fake, otherwise non-fake”) had a higher sensitivity (0.972) but also a high false-alarm rate (0.655). To explore possible improvements to our method for future studies, we added a third indicator (“RFP citations in reference list >10%”) to the “AND” rule and tested it with n=80 publications in Study 8. This increased the sensitivity to 0.90 and reduced the false-alarm rate to 0.37.
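To make the “AND” and “OR” tallying rules concrete, the following Python sketch applies both rules to a small labelled sample and computes sensitivity and false-alarm rate. It is a minimal illustration, not the authors' implementation: the `Paper` class, the function names, and the eight toy records are hypothetical, and the two indicators are assumed to be available as simple yes/no values.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    private_email: bool         # indicator 1: author's private email
    hospital_affiliation: bool  # indicator 2: hospital affiliation
    is_known_fake: bool         # label from the validation sample


def and_rule(paper):
    """Red-flag only if both indicators are present."""
    return paper.private_email and paper.hospital_affiliation


def or_rule(paper):
    """Red-flag if any indicator is present."""
    return paper.private_email or paper.hospital_affiliation


def sensitivity_and_false_alarm(papers, rule):
    """Return (sensitivity, false-alarm rate) of `rule` on labelled papers."""
    fakes = [p for p in papers if p.is_known_fake]
    non_fakes = [p for p in papers if not p.is_known_fake]
    sensitivity = sum(rule(p) for p in fakes) / len(fakes)
    false_alarm = sum(rule(p) for p in non_fakes) / len(non_fakes)
    return sensitivity, false_alarm


# Hypothetical toy sample: 4 known fakes and 4 presumed non-fakes.
sample = [
    Paper(True, True, True),
    Paper(True, True, True),
    Paper(True, False, True),
    Paper(False, False, True),
    Paper(True, True, False),
    Paper(False, True, False),
    Paper(False, False, False),
    Paper(False, False, False),
]

print(sensitivity_and_false_alarm(sample, and_rule))
print(sensitivity_and_false_alarm(sample, or_rule))
```

On this toy sample the “OR” rule catches more fakes than the “AND” rule but also raises more false alarms, which is the same trade-off reported for the validation sample.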
Note that the tallying rule identifies likely fakes, but it cannot determine with certainty whether a given publication is actually (legally) a fake. Nevertheless, it is a reliable tool to red-flag scientific reports for further analysis and is a rational basis to estimate the upper value of fake publishing in biomedicine. Detecting fake papers in disciplines outside of biomedicine may of course require other indicators.
ESTIMATING THE INCIDENCE OF POTENTIAL FAKE PUBLICATIONS
The estimated incidence of RFPs in the Study 9 sample was 589/4,001, of which 328 (55.8%) were from China (Fig. 3; Tab. 4). The within-country percentages of RFPs vary considerably. Leading countries are Russia (48.3%), Turkey (47.5%), China (43.9%), Egypt (40.0%), and India (38.8%), with China – in absolute terms – as the largest contributor globally (55.8%). Note the large differences in fake rate between countries, even between neighboring countries with similar histories such as Russia (48.3%) and Ukraine (3.1%), probably due to different government policies (Fig. 3).
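The difference between a country's within-country RFP rate and its share of all RFPs can be illustrated with a short Python sketch. The function name and the two-country toy data are hypothetical and chosen only to show that a country with a lower within-country rate can still contribute the larger share of all RFPs if it publishes more.

```python
from collections import Counter


def country_rfp_stats(records):
    """Compute per-country RFP statistics.

    `records` is an iterable of (country, is_rfp) pairs, one per publication.
    Returns a dict mapping country to its within-country RFP rate and its
    share of all red-flagged publications in the sample.
    """
    records = list(records)
    totals = Counter(country for country, _ in records)
    flagged = Counter(country for country, is_rfp in records if is_rfp)
    all_flagged = sum(flagged.values())
    stats = {}
    for country, n_publications in totals.items():
        share = flagged[country] / all_flagged if all_flagged else 0.0
        stats[country] = {
            "within_country_rate": flagged[country] / n_publications,
            "share_of_all_rfps": share,
        }
    return stats


# Hypothetical toy data: country A has 10 publications (4 red-flagged),
# country B has 40 publications (6 red-flagged).
toy_records = (
    [("A", True)] * 4 + [("A", False)] * 6
    + [("B", True)] * 6 + [("B", False)] * 34
)
for country, values in country_rfp_stats(toy_records).items():
    print(country, values)
```

In the toy data, country A has the higher within-country rate, while country B accounts for the larger share of all RFPs.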
Given the 2020 global publication output of 1.33 million publications (Scimago) and an average of 28.8% RFPs in both fields of Study 3 (but 23.8% in Study 5), the 2020 RFP-incidence is approximately 383,000.
Assuming an average $10,000 price tag for a fake publication, the estimated annual revenue of paper mills is up to $3-4 billion. This revenue does not include non-Scimago journals (classified as “predatory journals”), which are probably more polluted by fakes, nor the open access and publication fees charged by academic publishers (approximately $1 billion, or an estimated 25% of paper mill revenue). Although the incidence of actual fake publications is expected to be smaller than that of RFPs if the number of false alarms exceeds that of missed actual fakes, the overall estimate of the annual number of potential fake publications is considerable (>300,000), and it is on the rise.
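The arithmetic behind these estimates can be reproduced with a short Python sketch. The constants are taken from the text (1.33 million publications, RFP rates of 28.8% and 23.8%, and an assumed average price of $10,000). The function names are hypothetical, and applying the Study 5 rate to the global output as a lower figure is an assumption made here for illustration.

```python
TOTAL_PUBLICATIONS_2020 = 1_330_000  # Scimago-listed output cited in the text
RFP_RATE_STUDY_3 = 0.288             # average RFP rate in Study 3
RFP_RATE_STUDY_5 = 0.238             # RFP rate in Study 5
ASSUMED_PRICE_USD = 10_000           # assumed average price per fake publication


def estimated_rfps(total_publications, rfp_rate):
    """Estimated number of red-flagged publications, rounded to an integer."""
    return round(total_publications * rfp_rate)


def estimated_revenue_usd(n_publications, price_usd):
    """Revenue if every red-flagged publication were sold at `price_usd`."""
    return n_publications * price_usd


for label, rate in [("Study 3", RFP_RATE_STUDY_3), ("Study 5", RFP_RATE_STUDY_5)]:
    n_rfps = estimated_rfps(TOTAL_PUBLICATIONS_2020, rate)
    revenue = estimated_revenue_usd(n_rfps, ASSUMED_PRICE_USD)
    print(f"{label}: {n_rfps:,} RFPs, about ${revenue / 1e9:.1f} billion")
```

The two rates give roughly 383,000 and 317,000 RFPs, which correspond to about $3.8 billion and $3.2 billion at the assumed price, in line with the $3-4 billion range stated above.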
QUALITATIVE ANALYSIS OF PAPER MILL STRATEGIES
A search on Baidu and Google uncovered that more than 1,000 paper mills openly advertise their services to “help prepare” academic term papers, dissertations, and articles intended for SCI publications. Most paper mills are located in China, India, UK, and USA, and some are multinational. These typically appear to use sophisticated, state-of-the-art AI-supported text generation, data and statistical manipulation and fabrication technologies, image and text pirating, and gift or purchased authorships. Paper mills fully prepare – and some guarantee – publication in an SCI journal and charge hefty fees ($1,000-$25,000; in Russia: $5,000) (Chawla, 2022) depending on the specific services ordered (topic, impact factor of target journal, with/without faking data by fake “experimentation”). An unsolicited meeting with a paper mill provided a rare and authentic inside view of their business practices (Tab. 1).
Paper mills employ science graduates, academicians, and (sometimes naïve) scientific consultants for editorial help who work in countries with high English aptitude (UK, USA, India). They also offer “rewards” (bribes) to editors for publishing their fabrications (Tab. 1). We know of at least 12 such cases (two reported by editors, 10 acknowledged by an academic publisher who asked not to be identified). Here, editors were offered payment for each publication and were lured by a “citation booster” whereby paper mills offered to cite the “friendly” journal in their other fake articles. Although we do not know how many editors have received or accepted such bribes, it is an unprecedented and disturbing fraud-for-profit corruption of scholarly publishing.
DISCUSSION
The dramatic rise of fake science publishing is driven by an unscrupulously corrupt – and increasingly successful – paper mill industry responsible for an estimated 380,000 RFPs annually as of 2020. This indicates that the number of actual fakes is likely higher than currently known (Tab. 2). A 2020 estimate of 28.8% of RFPs in the present study indicates that the 2011 estimate of 0.1% fake publications in China (Hu and Wu, 2013) and the 1% listed in Table 1 are too low, and that the actual rate is closer to the 21-32% “honorary” and ghost authorship cases in biomedicine (Flanagin et al., 1998; Wislar et al., 2011) and the 5%-10% reported in a pharmacology (Seifert, 2021) and a cancer journal (Heck et al., 2021). Fake publishing feeds a billion-dollar global industry, orders of magnitude larger than the $4.5 million monetary value estimated in 2011 (Hu and Wu, 2013).
Fake science publishing is known to originate mainly from China (Hu and Wu, 2013; Lei and Zhang, 2018; Mallapaty, 2020; Schneider, 2021), India (Elango, 2021), and Russia (Abalkina, 2021), and, as we showed, it has evolved into a rapidly growing industry of fake science publishing. Our analysis confirms the existence, continuous growth, and notable scope of fake publishing, with most red-flagged publications coming from China (55.8% in 2020).
The rapid rise of the fake science industry is driven by SCI publication pressure on scientists, who are tempted to use paper mills that offer ghostwriting services at $1,000 to $25,000 per publication, a profitable business model (Hu and Wu, 2013). Academic publishers acknowledge that the problem exists and are beginning to explore detection tools (Else, 2022; see also COPE & STM Committee on Publication Ethics, 2022). Chinese authorities, although aware of the situation (Cyranoski, 2018), have not yet resolved the problem. If quantity of scientific output is the index for becoming the world leader in science, then paper mills contribute to reaching this goal. In fact, China has almost caught up with the US in publication output (see: www.nature.com/nature-index/country-territory-research-output).
Paper mills feed on the rising administrative practice to evaluate researchers mainly by the “publish-or-perish” criteria of counting papers and journal impact factors as a surrogate for evaluating actual research quality and content (Van Dalen and Henkens, 2012; Candal-Pedreira et al., 2022).
Fake academic publishing is a major driver of global science publishing growth and a growing problem for medical practice. For example, Byrne showed that 712 problematic papers were cited >17,000 times and estimated that about one quarter of them may misinform future development of human therapies (Park et al., 2022). Preclinical studies at biotech company Amgen, for example, could replicate the results of only 6/53 “landmark” articles, and at Bayer, only 14/67 were replicable in oncology, women’s health, and cardiovascular medicine. The “replication crisis” slows down the development of life-saving therapies with an estimated financial loss of $28 billion annually by the pharmaceutical industry (Gigerenzer, 2018). Another example of how scientific fraud can affect medical practice is the report by Avenell et al. (2019). After assessing the citations of 12 retracted clinical trial reports in 68 systematic reviews, meta-analyses, guidelines, and clinical trials, they concluded that 13 out of the 68 reviews would likely have to change their conclusions if the retracted publications were removed.
It is important to keep in mind that our indicators provide a red flag, not legal proof, that a given manuscript or publication might be fake. However, it is the authors’ burden of proof to demonstrate that their science can be trusted. Whether this type of scientific misconduct is a conspiracy to commit injurious falsehood or a crime is for others to decide.
Fake science publishing is possibly the biggest science scam of all time, wasting financial resources, slowing down medical progress, and possibly endangering lives. The damage already done is unknown, and a realistic impact assessment of fake science is not yet available. The emergence of ChatGPT and more sophisticated large language models might amplify the production of fake papers at lower cost. It will nevertheless be difficult for paper mills to invalidate the indicators identified in this study, although deception via fake institutional email addresses also occurs.
Halting this development requires an immediate response. But what can be done? First, our simple detection tallying method can be used by reviewers and editors to red-flag potential fakes with or without additional indicators. Second, the academic community should consider revising its common practice to judge scientists’ productivity mostly (or solely) on surrogate quantitative criteria (publication numbers, H-factors, citation metrics, etc.) and instead evaluate the quality and relevance of their research (Van Dalen and Henkens, 2012). The European Research Council (ERC) has already taken a first step by asking researchers to refrain from listing impact factors in their applications, consistent with the San Francisco Declaration on Research Assessment (DORA). Third, we need an advanced system to check scientific integrity, independent of academic publishers. Finally, learned societies, funding agencies, and governmental bodies should consider sanctioning fake-polluted journals and their publishers.
Until science publishing fraud is largely eradicated, the collateral damage of fake science poses the risk that scientific analyses, experiments, and clinical trials will more likely fail, public health information will be less accurate or (intentionally) misleading, and presumably effective and safe therapies may not deliver what was promised. It also runs the risk that the public loses its trust in the honesty of science itself. Simple detection of fake publications, as proposed here, or more complex automated methods can help prevent further damage to the permanent scientific record and enable the retraction of fake publications at scale. We propose a “call to action” to restore the integrity of our global knowledge base in biomedicine, science, and technology.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.