Manifold Learning via Diffusion with High-dimensional Genomic Data for Common Disease Prediction and Genetic Interaction Detection

Principal Investigator
Samuel Feng
Department
Mathematics
Focus Area
Healthcare

Genomic data are crucial to the vision of predictive and personalized healthcare. Despite the massive amounts of genomic data already collected, their noise and high dimensionality (millions of variables) pose serious challenges to prediction beyond classical Mendelian diseases. For the automated, data-driven prediction of common complex diseases (e.g., obesity and diabetes), there remains a severe lack of tools that make adequate use of genetic data. Similarly, researchers lack principled methods for fusing genomics with other modalities of biomedical data (e.g., medical imaging and blood chemistry) and for discovering interactions between multiple gene sites. Our research will incorporate manifold learning via diffusion maps into deep learning architectures, resulting in a suite of tools that can overcome these challenges with existing genomic data. Consequently, our approach will enable the efficient discovery of high-order gene interactions and better predictions for common diseases.