Very early this year, a research group from the insurance giant Kaiser Permanente published a paper (Zerbo et al. 2017) concluding that there was no evidence of harm in administering prenatal influenza vaccines. The study authors asserted that there was no relationship between receiving the flu shot during pregnancy and a later autism spectrum disorder (ASD) diagnosis in the child. However, that proclamation was not consistent with the study’s results. Specifically, women who received the vaccine during their first trimester of pregnancy showed a 20% greater risk of having the child later develop ASD. This was based on a sample of 13,477 women who received the maternal flu shot in the first trimester, among whose children there were 260 ASD cases, versus 151,698 “control” women who received no flu shot during pregnancy, among whose children there were 2,338 ASD cases. This result was statistically significant with a p value of 0.01, which in this case means that, if there were no “true” association, a result at least this strong would be expected to arise by “chance” only about 1% of the time.
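
The raw counts quoted above can be turned into crude (unadjusted) risks with a few lines of Python. This is only a back-of-the-envelope sketch: the variable and function names are made up for illustration, and the crude ratio is not the same thing as the estimate reported in the study, which presumably reflects adjustment for other factors.

```python
# Counts as quoted in the text above.
vaccinated_total = 13_477   # women with a first-trimester flu shot
vaccinated_asd = 260        # ASD cases among their children
control_total = 151_698     # women with no flu shot during pregnancy
control_asd = 2_338         # ASD cases among their children


def crude_risk(cases: int, total: int) -> float:
    """Return the plain proportion of ASD cases in a group (no adjustment)."""
    return cases / total


risk_vaccinated = crude_risk(vaccinated_asd, vaccinated_total)
risk_control = crude_risk(control_asd, control_total)

# Ratio of the two crude risks; values above 1 mean a higher crude
# risk in the first-trimester flu shot group.
crude_risk_ratio = risk_vaccinated / risk_control

print(f"Risk, first-trimester flu shot: {risk_vaccinated:.2%}")
print(f"Risk, no flu shot:              {risk_control:.2%}")
print(f"Crude risk ratio:               {crude_risk_ratio:.2f}")
```

The crude risks come out at roughly 1.9% versus 1.5%. The crude ratio will not exactly match the 20% figure quoted above, since it ignores any covariates the study authors accounted for.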

In statistics, the conventional “cut-off” to determine statistical significance is actually a less stringent p value of 0.05: a result is deemed significant when a finding at least that strong would arise by chance less than 5% of the time if no true association existed. Thus, the relationship between the first trimester flu shot and ASD should have been deemed statistically significant, with p=0.01, and accordingly a policy change should have been made to suspend use of that vaccine, at least in the first trimester of pregnancy.

However, the study authors instead reached into their statistical “bag of tricks” and trotted out what is termed the “Bonferroni” adjustment. This adjustment is applied in statistics only in very specific circumstances, when multiple, unrelated statistical evaluations are made using a single data sample. Put simply, the p value is adjusted by multiplying its original value by the number of “independent” evaluations completed on that single data set (Bland and Altman 1995 BMJ 310:170). In the case of Zerbo et al. 2017, there were 8 evaluations completed (4 evaluations regarding the flu shot and 4 evaluations regarding women who actually contracted the flu during pregnancy) and thus the original p value of 0.01 was adjusted to 0.08, above the “cut-off” value used for deeming “statistical significance.” The Zerbo et al. authors rounded the result up to p=0.1, moving the result further away from the “magic” 0.05 cut-off level and causing the significant result to disappear.
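
The arithmetic of the adjustment described above is simple enough to sketch in Python. The function name `bonferroni_adjust` and the constant `ALPHA` are illustrative choices, not anything taken from the paper; the sketch only reproduces the multiplication by the number of evaluations, capped at 1.

```python
ALPHA = 0.05  # conventional significance cut-off


def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Multiply a p value by the number of comparisons, capping at 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return min(1.0, p_value * n_comparisons)


original_p = 0.01   # first-trimester flu shot result, as quoted in the text
n_evaluations = 8   # 4 flu shot evaluations + 4 flu infection evaluations

adjusted_p = bonferroni_adjust(original_p, n_evaluations)

# Equivalent view: keep the p value and shrink the cut-off instead.
adjusted_alpha = ALPHA / n_evaluations

print(f"Original p = {original_p}, significant: {original_p < ALPHA}")
print(f"Adjusted p = {adjusted_p:.2f}, significant: {adjusted_p < ALPHA}")
print(f"Per-test cut-off after adjustment: {adjusted_alpha}")
```

Either way of looking at it gives the same verdict: 0.01 is below 0.05 before the adjustment, while the adjusted value of 0.08 is above it (equivalently, 0.01 is above the shrunken cut-off of 0.00625).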

There’s a huge problem here, however, which I pointed out in my letter to the editor of the journal (Hooker 2017 JAMA Pediatrics 171:600) published in their June 2, 2017 edition. The Bonferroni adjustment, among other corrections for multiple, independent comparisons, should not be applied to statistics when there is any interdependence among the different evaluations completed within the data sample. In this case, 4 of the evaluations completed dealt specifically with the timing of the maternal flu shot (first, second and third trimesters, as well as overall risk at any point in pregnancy) and subsequent ASD incidence. So, not only were these four evaluations all focused on an ASD outcome, but they all dealt with different phases of pregnancy, which were then summed to develop an “overall” risk at any phase of pregnancy. By definition, these evaluations were anything but statistically independent.