Verifying that a statistically significant result is scientifically meaningful is not only good scientific practice but also a natural way to control the Type I error rate. Here we introduce a novel extension of the p-value, a second-generation p-value ($p_\delta$), that formally accounts for scientific relevance and leverages this natural Type I error control. The approach relies on a pre-specified interval null hypothesis that represents the collection of effect sizes that are scientifically uninteresting or practically null. The second-generation p-value is the proportion of data-supported hypotheses that are also null hypotheses. As such, second-generation p-values indicate when the data are compatible with null hypotheses ($p_\delta = 1$), or with alternative hypotheses ($p_\delta = 0$), or when the data are inconclusive ($0 < p_\delta < 1$). Moreover, second-generation p-values provide a proper scientific adjustment for multiple comparisons and reduce false discovery rates. This is an advance for data-rich environments, where traditional p-value adjustments are needlessly punitive. Second-generation p-values promote transparency, rigor and reproducibility of scientific results by specifying a priori which candidate hypotheses are practically meaningful and by providing a more reliable statistical summary of when the data are compatible with alternative or null hypotheses.
In PLoS One,
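As a rough illustration of the second-generation p-value described in the abstract above, the following Python sketch computes the fraction of an interval estimate that overlaps an interval null. The function name, the argument names, and the choice to represent the "data-supported hypotheses" by a single interval estimate (for example a confidence interval) are assumptions made here for illustration. The correction factor for very wide intervals follows the published definition as we understand it and should be checked against the paper before use.

```python
def second_generation_p_value(est_lo, est_hi, null_lo, null_hi):
    """Sketch of a second-generation p-value (hypothetical helper).

    (est_lo, est_hi)   : interval estimate, i.e. the data-supported hypotheses
    (null_lo, null_hi) : pre-specified interval null (practically null effects)
    """
    est_len = est_hi - est_lo
    null_len = null_hi - null_lo
    if est_len <= 0 or null_len <= 0:
        # Degenerate (point) intervals are outside the scope of this sketch.
        raise ValueError("both intervals must have positive length")

    # Length of the intersection of the two intervals (zero if disjoint).
    overlap = max(0.0, min(est_hi, null_hi) - max(est_lo, null_lo))

    # Proportion of data-supported hypotheses that are also null hypotheses.
    proportion = overlap / est_len

    # Assumed small-sample correction: when the interval estimate is more
    # than twice as wide as the null interval, the value is capped at 1/2
    # so that very imprecise data read as inconclusive rather than near 0.
    correction = max(est_len / (2.0 * null_len), 1.0)
    return proportion * correction


if __name__ == "__main__":
    null = (-1.0, 1.0)  # hypothetical indifference zone around zero
    print(second_generation_p_value(2.0, 4.0, *null))     # no overlap
    print(second_generation_p_value(-0.5, 0.5, *null))    # inside the null
    print(second_generation_p_value(0.0, 2.0, *null))     # partial overlap
    print(second_generation_p_value(-10.0, 10.0, *null))  # very wide interval
```

With these made-up inputs, an interval estimate entirely outside the null zone gives 0, one entirely inside gives 1, and the partially overlapping and very wide intervals both give 0.5, matching the three readings (alternative, null, inconclusive) listed in the abstract.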
Location bias occurs when a reader detects a false lesion in a subject with disease and the falsely detected lesion is considered a true positive. In this study, we examine the effect of location bias in two large multi-reader multi-case (MRMC) ROC studies, comparing three ROC scoring methods. We compare one method that uses only the maximum confidence score and does not take location bias into account (maxROC) with two methods that do take location bias into account: the region-of-interest ROC (ROI–ROC) and the free-response ROC (FROC). In both studies, when two modalities’ ROC areas are compared without adjusting for location bias, the effect size depends on the difference in the frequency of location bias between the two modalities. When the difference in frequency is small, the effect size is similar whether or not location bias is corrected. However, when the frequency of location bias differs between modalities, failure to correct for it favors the modality with the higher false-positive rate. Location bias should be corrected when the next step in the clinical management of the patient depends on the specific location of the detected lesion and/or when the frequency of the bias differs between the two modalities.
In Statistics in Biopharmaceutical Research,
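To make the location-bias abstract above concrete, here is a minimal Python sketch contrasting a case-level maximum-score analysis with a region-level analysis on a toy data set. All data, function names, and the region layout are hypothetical. The ROC area is estimated with a plain Mann-Whitney statistic, and the region-level version ignores the within-case correlation that a real ROI–ROC analysis would account for. FROC is not shown.

```python
def auc(pos_scores, neg_scores):
    """Mann-Whitney estimate of the ROC area; ties count as one half."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical reader data: one list per case, one (lesion_present,
# confidence) pair per region of that case.
cases = [
    [(True, 20), (False, 80)],   # diseased: lesion rated low, wrong region rated high
    [(True, 90), (False, 10)],   # diseased: lesion correctly rated high
    [(False, 30), (False, 40)],  # non-diseased
    [(False, 10), (False, 60)],  # non-diseased
]


def max_roc_auc(cases):
    """Case-level scoring: each case gets its maximum confidence score,
    regardless of whether that score sits on the true lesion."""
    pos = [max(s for _, s in c) for c in cases if any(t for t, _ in c)]
    neg = [max(s for _, s in c) for c in cases if not any(t for t, _ in c)]
    return auc(pos, neg)


def roi_roc_auc(cases):
    """Region-level scoring: each region is judged against its own truth,
    so a high score on the wrong region counts as a false positive."""
    pos = [s for c in cases for t, s in c if t]
    neg = [s for c in cases for t, s in c if not t]
    return auc(pos, neg)


if __name__ == "__main__":
    print("maxROC area: ", max_roc_auc(cases))
    print("ROI-ROC area:", roi_roc_auc(cases))
```

In this toy example the first diseased case is credited as a confident true positive under the maximum-score rule even though the reader marked the wrong region, so the case-level area comes out higher than the region-level one. That is the kind of inflation the abstract describes.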