
Computational Ecology and Software, 2022, 12(3): 80-122

Article

p-value based statistical significance tests: Concepts, misuses, critiques, solutions and beyond

WenJun Zhang
School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China

Received 28 April 2022; Accepted 12 May 2022; Published online 19 May 2022; Published 1 September 2022
IAEES

Abstract
The p-value is at the heart of statistical significance testing, an issue central to the role of statistical inference in advancing scientific discovery. Over the past few decades, p-value based statistical significance tests have been widely used in most statistics-related research papers and textbooks and in all statistical software around the world. Numerous scientists in various disciplines regard the p-value as the gold standard for statistical significance. In recent years, however, p-value based statistical significance tests have come under unprecedented criticism, mainly on the grounds that the paradigm of significance testing is wrong, the p-value is too sensitive, the p-value is a dichotomous and subjective index, and statistical significance depends on sample size. Scientific hypotheses can only be falsified, not confirmed. p-value based statistical significance tests are one source of false conclusions and of the research reproducibility crisis. For this reason, many statisticians advocate abandoning p-value based statistical significance tests and replacing them with effect sizes, Bayesian methods, meta-analysis, and similar approaches. Scientific inference that combines statistical testing with multiple types of evidence is the basis for producing reliable conclusions. Reliable scientific inference requires appropriate experimental design, sampling design, and sample size; it also requires full control of the research process. For complex and time-varying problems, network or systematic methods should be used instead of reductionist methods to obtain and analyze data. To change the scientific research paradigm, a paradigm of multiple repeated experiments and multi-sample testing should be adopted, with multiple parties verifying each other's work to improve the authenticity and reproducibility of results.
In addition to writing, publishing and adopting new statistical monographs and textbooks, the most urgent task is to revise existing statistical software and distribute new versions based on the new statistics. Until the new statistics are in widespread use, what we can do is improve data quality, adopt stricter p-value thresholds for statistical significance tests, use more reasonable analysis methods or testing standards, and combine statistical analysis with mechanism analysis.
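
The critique that statistical significance is related to sample size can be illustrated with a minimal sketch. It assumes a two-sample z-test with known unit variance and equal group sizes, which is a simplification chosen here for illustration and not a method taken from the article; the function name `two_sided_p_from_effect` and the effect size of 0.1 are likewise hypothetical. The standardized effect size is held fixed while only the sample size changes.

```python
import math


def two_sided_p_from_effect(d: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test.

    Assumptions (illustrative only): both groups have known unit variance
    and the same size, so the observed standardized mean difference d
    gives the test statistic z = d * sqrt(n / 2).
    """
    z = d * math.sqrt(n_per_group / 2.0)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    effect_size = 0.1  # hypothetical small standardized difference
    for n in (20, 200, 2000, 20000):
        p = two_sided_p_from_effect(effect_size, n)
        verdict = "significant" if p < 0.05 else "not significant"
        print(f"n per group = {n:>6}: p = {p:.4g} ({verdict} at 0.05)")
```

Under these assumptions, the same small effect moves from "not significant" to "significant" purely because the groups grow larger, which is why the abstract recommends reporting effect sizes alongside, or instead of, p-values.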

Keywords p-values; statistical significance tests; reproducibility; Bayesian methods; effect size; Bootstrap.



International Academy of Ecology and Environmental Sciences. E-mail: office@iaees.org
Copyright © 2009-2024 International Academy of Ecology and Environmental Sciences. All rights reserved.