Original Article: Effect of genome-wide simultaneous hypotheses tests on the discovery rate

2011
An increasing number of genome-wide association studies are being performed on hundreds of thousands of single nucleotide polymorphisms (SNPs). Many such studies include a second stage in which a selected number of SNPs are genotyped in new individuals in order to validate the genome-wide findings. Unfortunately, a large proportion of these studies have been unable to validate them. In this study we aim to better understand how to distinguish the truly associated features from the false positives in genome-wide scans. To this end, we use empirical data to look at three aspects that may play a key role in determining which features are called associated with the phenotype. First, we examine the usual assumption of a uniform distribution of null p-values and assess whether or not it affects which features are called significant and how many features are called significant. Second, we compare the global behavior of the p-value distribution genome-wide with its local behavior in regions such as chromosomes. Third, we look at the effect of minor allele frequency on the p-value distribution. We show empirically that the uniform distribution is not a generally valid assumption, and we find that, as a consequence, strikingly different conclusions can be drawn regarding which associations are called significant and the number of significant findings. We propose that, in order to better assign significance to potential associations, one needs to estimate the true distribution of null and non-null p-values.
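As a rough illustration of the first point, the sketch below simulates two sets of null p-values, measures how far each departs from the uniform distribution, and counts how many features a standard multiple-testing procedure would call significant. It is not the authors' procedure: the Kolmogorov-Smirnov distance and the Benjamini-Hochberg step-up rule are swapped in here as familiar stand-ins, and the simulated p-values, the skewing exponent of 1.5, and the function names are all assumptions made for the example.

```python
import random


def ks_statistic_uniform(pvalues):
    """Kolmogorov-Smirnov distance between the empirical CDF and Uniform(0, 1)."""
    xs = sorted(pvalues)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # The uniform CDF is F(x) = x; compare it with the empirical CDF
        # just after and just before the i-th ordered p-value.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d


def benjamini_hochberg(pvalues, alpha=0.05):
    """Number of rejections made by the Benjamini-Hochberg step-up rule."""
    xs = sorted(pvalues)
    m = len(xs)
    k = 0
    for i, p in enumerate(xs, start=1):
        # Keep the largest rank i whose p-value falls under alpha * i / m.
        if p <= alpha * i / m:
            k = i
    return k


# Simulated data only (an assumption for illustration, not empirical results).
rng = random.Random(0)
m = 10000
uniform_null = [rng.random() for _ in range(m)]
# Hypothetical non-uniform null: raising U(0, 1) draws to the power 1.5
# shifts mass toward small p-values even though no feature is associated.
skewed_null = [rng.random() ** 1.5 for _ in range(m)]

for name, pvals in [("uniform null", uniform_null), ("skewed null", skewed_null)]:
    print(name,
          "KS distance:", round(ks_statistic_uniform(pvals), 4),
          "BH rejections:", benjamini_hochberg(pvals))
```

Any rejection in either simulated set is a false positive by construction, so comparing the two printed counts gives a feel for how sensitive the number of "discoveries" can be to the shape of the null distribution.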