Power analysis is essential when interpreting statistical test results [6]. In the paired t test between N’ and N, the mean and the SD of the paired differences were 0.174 and 7.326, respectively, for 100 samples. Based on these parameters and an alpha error probability of 0.05, the effect size (d) and the power (1 − β error probability) were calculated as 0.0238 and 0.056, respectively, using G*Power software (Ver. 3.1.9.2) [7, 8].
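As an illustrative check of the values above, the effect size for paired data can be taken as the mean of the differences divided by their SD, and the power can be approximated from the normal distribution. The sketch below is not the G*Power calculation, which uses the noncentral t distribution; the function names are hypothetical, and the normal approximation is an assumption that should land close to, but not exactly on, the reported power.

```python
from math import sqrt
from statistics import NormalDist


def paired_effect_size(mean_diff, sd_diff):
    """Cohen's d for paired data: mean of differences over SD of differences."""
    return mean_diff / sd_diff


def approx_power_two_sided(d, n, alpha=0.05):
    """Normal approximation to the power of a two-sided paired t test.

    Assumption: the t statistic is treated as normal with mean d * sqrt(n).
    An exact calculation would use the noncentral t distribution.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = abs(d) * sqrt(n)            # approximate noncentrality
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Values reported in the text: mean 0.174, SD 7.326, 100 samples.
d = paired_effect_size(0.174, 7.326)
power = approx_power_two_sided(d, n=100)
print(f"d = {d:.4f}, approximate power = {power:.3f}")
```

With an effect size this small, the power stays only marginally above the alpha level of 0.05, so the test has almost no ability to detect the observed difference.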
According to Cohen [6], the sample size n (the number of cases) required for the paired t test is estimated a priori as n = 90 when the effect size, the significance level, and the power are 0.3, 0.05, and 0.8, respectively. The actual number of samples in this study was 100, which satisfies this a priori requirement, but the actual effect size, d = 0.0238, was much smaller than the typical effect size of d = 0.3.
Based on the actual effect size, another a priori analysis estimated that 13,859 cases would be needed to show a statistical difference between N’ and N with a power of 0.80. This sample size is not practical because the average number of examinations per year at our university hospital was approximately 100.
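The dependence of the required sample size on the effect size can be sketched with the familiar normal-approximation formula n ≈ ((z(1 − α/2) + z(power)) / d)². This is an assumption made for illustration, not the exact noncentral t calculation performed by G*Power, so it is expected to give values slightly below the reported 90 and 13,859 cases. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def approx_required_n(d, alpha=0.05, power=0.80):
    """Approximate number of pairs for a two-sided paired t test.

    Assumption: normal approximation; an exact t-based calculation
    returns a slightly larger n.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # quantile for the significance level
    z_power = z.inv_cdf(power)          # quantile for the target power
    return ceil(((z_alpha + z_power) / d) ** 2)


n_typical = approx_required_n(0.3)     # typical effect size from the text
n_actual = approx_required_n(0.0238)   # actual effect size in this study
print(n_typical, n_actual)
```

Because the required n scales with 1/d², reducing the effect size from 0.3 to 0.0238 (a factor of about 12.6) multiplies the required number of cases by roughly 160.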
Figure 4 shows the relationship between statistical power and effect size. The changes in statistical power for the actual effect size of 0.0238 and for three typical effect sizes of 0.1, 0.3, and 0.5 are shown as bold and dashed lines. The statistical power of this study was low (0.056) because of the small effect size of 0.0238. This result suggests that the statistical test may have been prone to type II error. The effect size depended on the SD of the differences in counts between the two measurements. Because fluctuation in the radioactivity measurements appears in the SD, not only differences in scintillation materials but also differences in counting methods between the devices might have affected the effect size.
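A relationship of this kind can be tabulated with a short sketch. The sample sizes below are arbitrary illustrative values and are not taken from the figure; the power function uses a normal approximation rather than the exact noncentral t distribution, and all names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n, alpha=0.05):
    """Normal approximation to two-sided paired t test power (assumption)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


effect_sizes = (0.0238, 0.1, 0.3, 0.5)          # actual and typical values
sample_sizes = (10, 30, 100, 300, 1000, 10000)  # illustrative only

# One power curve per effect size, evaluated at each sample size.
power_table = {
    d: [approx_power(d, n) for n in sample_sizes] for d in effect_sizes
}

print("n".rjust(7) + "".join(f"  d={d:<6}" for d in effect_sizes))
for i, n in enumerate(sample_sizes):
    row = "".join(f"  {power_table[d][i]:8.3f}" for d in effect_sizes)
    print(f"{n:7d}{row}")
```

In such a table, the curves for the typical effect sizes rise toward 1 within a few hundred cases, whereas the curve for d = 0.0238 remains low until the number of cases reaches the order of 10,000.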
We consider power analysis an important method for clarifying a difference between measurements, but not for establishing equivalence between the measurement results of the two counting devices. Our conclusion of equivalence therefore depends on (1) the results of the correlation, (2) the systematic errors obtained from the Bland-Altman plot, and (3) the maximum difference between N’ and N.