Date of Award
9-6-2024
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Bioinformatics & Computational Biology
First Advisor
Heather Wheeler
Abstract
Most genetic variants associated with complex human traits lie in non-coding regions, so the mechanism by which they affect a trait can be unclear. Genetic regulation of transcription and translation are key mechanisms through which genetic variants impact traits. Quantitative trait locus (QTL) mapping studies leverage data produced by advanced sequencing and assay technology to identify variants associated with the abundance of a molecular trait such as RNA expression (eQTL) or protein levels (pQTL). While proximal (cis-acting) genetic variants, like those in promoter or enhancer regions of genes, tend to have the largest effect sizes on RNA transcript and protein levels, distal (trans-acting) genetic variation still contributes considerably to regulating transcription and translation. However, trans-QTL can be difficult to discover due to the high multiple testing burden, their comparatively low effect sizes, and their tendency to have tissue- or cell type-specific effects. Methods that prioritize testing cis-eQTL for trans-acting associations have proven effective because they reduce the multiple testing burden and because many trans-eQTL colocalize with cis-eQTL. For example, a transcriptome-wide association study (TWAS) that used observed gene expression as the trait found more trans-acting genes than a comparable eQTL study. We hypothesized that performing a TWAS using protein levels as the trait would be effective at identifying trans-pQTL because it prioritizes cis-eQTL for testing trans-acting genes. While gene expression and protein levels have previously been shown to have a low correlation, we hypothesized that genetically regulated expression (GReX), the component of expression predicted from genotype, would have a higher correlation with observed protein levels because it excludes variation in expression due to environmental factors.
We used genotype and plasma protein measurements from individuals participating in the INTERVAL study for our TWAS and replicated the results with genotype and plasma protein measurements from individuals in the TOPMed MESA cohort. We used transcriptome prediction models from 49 tissues trained with GTEx Project genotype and RNA-Seq data. Furthermore, we used RNA-Seq data from the TOPMed MESA cohort to compare the correlation of observed expression levels with protein levels to the correlation of predicted expression levels with protein levels. We discovered many replicable cis- and trans-acting gene-protein relationships and found that, for significant associations with protein levels, predicted expression had a higher correlation and a higher true positive rate than observed expression. These results indicate that predicted gene expression may better uncover the genetic mechanisms underlying complex traits than observed expression.
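As a rough illustration of the comparison described above, the following minimal sketch predicts expression from genotype as a weighted sum of variant dosages and correlates it with protein levels. It is not the thesis pipeline: the function names `predict_expression` and `pearson_r`, the variant weights, and all toy values are hypothetical and chosen only to show the shape of the calculation.

```python
import math


def predict_expression(dosages, weights):
    """Predicted expression per individual: weighted sum of variant dosages.

    dosages: one row per individual, one allele count (0-2) per variant.
    weights: one hypothetical prediction-model weight per variant.
    """
    return [sum(w * d for w, d in zip(weights, person)) for person in dosages]


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy, made-up data for six individuals and two variants (illustration only).
toy_dosages = [[0, 1], [1, 1], [2, 0], [1, 2], [0, 0], [2, 2]]
toy_weights = [0.5, -0.2]                          # hypothetical model weights
toy_observed = [0.1, 0.6, 0.7, 0.4, -0.3, 0.9]     # hypothetical observed expression
toy_protein = [0.0, 0.4, 1.1, 0.2, 0.1, 0.7]       # hypothetical protein levels

toy_predicted = predict_expression(toy_dosages, toy_weights)

# Compare how each expression measure tracks protein levels.
r_predicted = pearson_r(toy_predicted, toy_protein)
r_observed = pearson_r(toy_observed, toy_protein)
print(f"predicted vs protein: r = {r_predicted:.3f}")
print(f"observed vs protein:  r = {r_observed:.3f}")
```

With real data, the weights would come from trained transcriptome prediction models, and the same correlation would be computed per gene-protein pair in each tissue; the toy numbers here say nothing about which correlation is larger in practice.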
Recommended Citation
Wittich, Henry, "Transcriptome-Wide Association Study of the Plasma Proteome Reveals Cis and Trans Regulatory Mechanisms Underlying Complex Traits" (2024). Master's Theses. 4555.
https://ecommons.luc.edu/luc_theses/4555