Predicting functional consequences of regulatory noncoding variants with large functional and experimental data sets
6: Electrical Engineering and Computer Science
June 15, 2020
Contact via email to email@example.com with attached resume/CV and short statements of interest and time commitment.
[This work is done virtually] Understanding human genetic variation offers great potential for uncovering underlying disease mechanisms by identifying new disease-associated variants, and for discovering potential drug targets and therapeutics. In this project, we leverage functional genomics to understand the functional consequences of noncoding variants by predicting the effects of unobserved (or not yet characterized) variants, applying various ML methods to >10k functional data combined with experimental data such as massively parallel reporter assays (MPRAs) and CRISPR screens. There is great room for creativity and method development using different ML/DL methods, feature engineering, optimization, and learning from noisy labels. From a biological point of view, this project provides an excellent opportunity to work with diverse functional and experimental data sets, including ENCODE and Roadmap data, mutagenesis data, etc. Relevant references are Kircher et al. Nature Genetics 2014; Smedley et al. AJHG 2016; Kircher et al. Nature Communications 2019; Hanna et al. bioRxiv 2020.

The work is to be done virtually, with a virtual meeting once a week, and you will have the chance to collaborate with multiple groups at MIT, Harvard, and NYGC. Previous UROP students have co-authored the paper coming out of the project and learned a range of skills across interdisciplinary fields, from bioinformatics and statistics to genetics. PI-sponsored funding is available.

Please send your application to firstname.lastname@example.org with (1) your CV, (2) your time commitment (quantify where possible, e.g. X hours/week), including your availability after the summer, and (3) your interests, expectations, and relevant current skill sets.

The prerequisites below are loosely stated, and there are no strict requirements; the most important thing is your enthusiasm for the project and your willingness to learn practical skills with a reasonable (and consistent) time commitment.
- CS background; proficiency in Python
- knowledge of regression and classification models (ML methods)
- knowledge of statistics (linear/ordinal regression)
- (optional) experience with GWAS summary statistics and functional data
- (optional) knowledge of experimental variant screening (MPRAs), scRNA-seq