Prioritizing Crohn’s disease genes by integrating association signals with gene expression implicates monocyte subsets
2019
Genome-wide association studies have identified ~170 loci associated with Crohn’s disease (CD), but defining which genes drive these association signals remains a major challenge. The primary aim of this study was to define which CD locus genes are most likely to be disease related. We developed a gene prioritization regression model (GPRM) by integrating complementary mRNA expression datasets, including bulk RNA-Seq from the terminal ileum of 302 newly diagnosed, untreated CD patients and controls, and from stimulated monocytes. Transcriptome-wide association and co-expression network analyses were performed on the ileal RNA-Seq datasets, identifying 40 genome-wide significant genes. Co-expression network analysis identified a single gene module, which was substantially enriched for CD locus genes and most highly expressed in monocytes. By including expression-based and epigenetic information, we narrowed the likely CD genes from an average of 7.8 total genes per locus to 2.5 prioritized genes per locus. We validated our model structure using cross-validation and our prioritization results by protein-association network analyses, which demonstrated significantly higher CD gene interactions for prioritized compared with non-prioritized genes. Although individual datasets cannot convey all of the information relevant to a disease, combining data from multiple relevant expression-based datasets improves prediction of disease genes and helps to further understanding of disease pathogenesis.