Key Summary: Most published scientific research findings are likely to be false. This is attributable to factors including study design, misuse of statistical significance, small sample sizes, competition among researchers, and bias. The paper analyzes mathematically why this occurs and explores how the reliability of research findings can be improved.
1. The Problem: Why Are So Many Research Findings False?
An increasing number of scientists and members of the public are concerned that "most modern published research findings may be false." In fact, even studies published in prestigious journals are frequently refuted or found to lack credibility over time. The author argues that this phenomenon should not be surprising.
"It can be proven that most claimed research findings are false."
This article specifically explores what factors create this phenomenon and what implications it has for interpreting research findings and advancing science.
2. Statistical Models and the Reliability of Research Findings
The Probability That a Research Finding Is True: PPV
The paper employs the probabilistic concept of Positive Predictive Value (PPV). PPV represents the probability that a research finding (e.g., "Drug A is effective for Disease B") is actually true.
PPV depends on the following factors:
- Pre-study odds (R): The odds that the hypothesis is true before the study even begins, i.e., the ratio of true relationships to non-true relationships among those tested in the field. The corresponding pre-study probability is R/(R + 1).
- Statistical power (1 − β): The probability that the study will detect a real effect when one exists.
- Significance level (α): The probability of incorrectly concluding an effect exists when it does not (typically 0.05, or 5%).
The smaller the sample size, the smaller the effect size, and the lower the pre-study odds, the lower the PPV becomes. Once PPV falls below 0.5, the probability that a research finding is false exceeds the probability that it is true.
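In the paper's notation, these quantities combine into a single expression for PPV; a finding is more likely true than false only when (1 − β)R > α:

```latex
% PPV in terms of pre-study odds R, power 1 - beta, and significance level alpha
\mathrm{PPV} = \frac{(1-\beta)\,R}{R - \beta R + \alpha}
```

For example, a well-powered study (1 − β = 0.8, α = 0.05) with even pre-study odds (R = 1) gives PPV = 0.8/0.85 ≈ 0.94, but the same study applied to a hypothesis with R = 0.1 gives only 0.08/0.13 ≈ 0.62.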
3. The Impact of Bias
Bias in Study Design and Reporting
Bias refers to all factors that distort results, whether intentionally or unconsciously, during the conduct of research or interpretation of findings. For example:
- Selective reporting: Emphasizing only desired results while omitting or distorting unfavorable findings.
- Arbitrary changes in analytical methods: Data mining, post-hoc analyses, and the like.
- Conflicts of interest: Financial or reputational stakes held by researchers.
As bias increases, the probability of obtaining truthful research findings (PPV) decreases sharply.
"As bias levels increase, the probability that a research finding is true decreases considerably."
There is also the opposite case of reverse bias, in which true relationships are distorted so as to appear non-significant. The paper treats this as relatively rare, in part because improvements in measurement technology have reduced the errors that once produced it.
4. Verification and Competition Among Multiple Research Teams
In modern life sciences and many other fields, multiple research teams repeatedly test the same hypothesis. However, this process paradoxically worsens the problem. The more teams that test the same hypothesis, the higher the likelihood that one team will produce a statistically significant result by chance alone. In other words, the more competitive and "hot" a research field is, the higher the probability that a study producing false but statistically significant results will emerge.
"The more teams that flock to a 'hot' research area, the lower the likelihood that the research findings are true."
5. Real-World Examples and Six Corollaries
The paper illustrates these principles using a real genomic research example. If only 10 of 100,000 genetic variants tested actually affect disease, the pre-study odds are R = 10/100,000 = 0.0001, and the probability that a claimed association is true (the PPV) is extremely low once statistical power and bias are taken into account.
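A quick calculation makes the example concrete. A minimal sketch follows, with power 0.60 and α = 0.05 taken from the paper's worked example and everything else illustrative:

```python
# PPV for the genomics example: 10 true associations among 100,000 tested variants.

def ppv(R, power, alpha):
    """Positive predictive value given pre-study odds R, power, and alpha."""
    return power * R / (power * R + alpha)

R = 10 / 100_000  # pre-study odds: 10 true relationships per 100,000 tested
print(ppv(R, power=0.60, alpha=0.05))  # ~0.0012: roughly 1 in 800 claimed hits is real
```

Adding any bias at all drives this figure even lower.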
The paper then presents six corollaries, each a monotone trend in the formulas above (a numerical check follows the list):
- The smaller the study, the less likely the research findings are to be true.
- The smaller the effect size, the less likely the research findings are to be true.
- The greater the number of hypotheses tested in a field, and the less they are pre-selected (i.e., the more exploratory the research), the less likely the findings are to be true.
- The greater the flexibility in study designs, definitions, outcomes, and analyses (i.e., the more freely they can be changed after the fact), the less likely the findings are to be true.
- The greater the influence of stakeholders (financial or non-financial), the higher the likelihood of false findings.
- The more research teams competing in a field, the lower the probability of true findings being produced.
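A minimal sketch of the promised check, varying one parameter at a time. The parameter grids are purely illustrative; power stands in for both sample size and effect size, and u for flexibility and stakeholder interests:

```python
# Corollaries as monotone trends: PPV falls as power, pre-study odds R,
# bias u, or the number of competing teams n worsens. Grids are illustrative.

def ppv_biased(R, power, alpha=0.05, u=0.0):
    """Bias-adjusted PPV from the formula above."""
    beta = 1 - power
    sig_true = (1 - beta) * R + u * beta * R  # true relationships reaching significance
    sig_null = alpha + u * (1 - alpha)        # null relationships reaching significance
    return sig_true / (sig_true + sig_null)

for power in (0.9, 0.5, 0.2):  # corollaries 1-2: smaller studies, smaller effects
    print(f"power={power}: PPV={ppv_biased(R=0.5, power=power):.2f}")  # 0.90, 0.83, 0.67
for R in (1.0, 0.1, 0.01):     # corollary 3: many hypotheses, little pre-selection
    print(f"R={R}: PPV={ppv_biased(R=R, power=0.8):.2f}")              # 0.94, 0.62, 0.14
for u in (0.05, 0.2, 0.5):     # corollaries 4-5: flexibility and interests act as bias
    print(f"u={u}: PPV={ppv_biased(R=0.5, power=0.8, u=u):.2f}")       # 0.81, 0.64, 0.46
for n in (1, 5, 20):           # corollary 6: competing teams, via the n-team formula
    R, beta, alpha = 0.5, 0.2, 0.05
    ppv_n = R * (1 - beta**n) / (R + 1 - (1 - alpha)**n - R * beta**n)
    print(f"n={n}: PPV={ppv_n:.2f}")                                   # 0.89, 0.69, 0.44
```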
6. Are Most Research Findings Really False, and Why?
When the model is combined with realistic research conditions, it is difficult in most study designs and fields for the probability that a finding is true to exceed 50%. Unless the study is a large randomized clinical trial or a high-quality meta-analysis, most observational and exploratory findings are more likely to be "false" (non-reproducible) than true.
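This tendency is easy to reproduce. Reusing the ppv_biased function from the corollaries sketch above, with four stylized and entirely illustrative study types (the parameters are assumptions in the spirit of the paper's examples, not figures from specific studies):

```python
# Bias-adjusted PPV for four stylized study types; all parameters are assumed.
# ppv_biased is the function defined in the previous sketch.

print(ppv_biased(R=1.0,   power=0.80, u=0.10))  # adequately powered RCT, little bias: ~0.85
print(ppv_biased(R=2.0,   power=0.95, u=0.30))  # confirmatory meta-analysis:          ~0.85
print(ppv_biased(R=0.2,   power=0.20, u=0.20))  # underpowered exploratory study:      ~0.23
print(ppv_biased(R=0.001, power=0.60, u=0.80))  # hypothesis-free discovery screen:    ~0.001
```

Only the first two designs clear the 50% mark; typical exploratory research falls far below it.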
7. Statistical Significance Ultimately Reflects Bias
Especially in fields like biomedicine, where the pre-study probability that a real effect exists is very low, reported effect sizes may represent not actual effects but rather an "accurate measurement" of the field's prevailing bias. In other words, an unusually strong and significant result may be a signal of greater bias rather than a greater discovery.
"Effects that are too large and too significant may not be signals of great discoveries, but rather signals of serious bias. We must critically re-examine what went wrong with the data, analysis, and results."
8. What Should We Do?
There is no "gold standard" that can guarantee 100% truth, but we can attempt practical solutions:
- Strengthen study design and statistical power: Prioritize large, systematic studies (large-scale randomized clinical trials, standardized meta-analyses, etc.).
- Cultural transformation to prevent bias: Pre-registration of study designs, integrated analyses, and a culture centered on verification.
- Focus on meaningful questions: Concentrate research effort on fields with relatively high pre-study probability (topics where real effects are likely to exist).
- Do not blindly trust the statistical significance of a single study: Also consider the totality of evidence, past experience, and the degree of bias in adjacent fields.
- Reconsider the meaning of 'discovery': Statistical significance must be interpreted cautiously, accounting for the roles of bias, multiple testing, and chance.
Conclusion
Through formulas and real-world examples, the paper persuasively demonstrates that many published research findings have a high probability of being false. For researchers facing their own "research findings," the lesson is clear: keep asking the critical question "Is this result really true?", and never lose humility and caution in study design, analysis, and interpretation alike.