Mutations in DNA that accumulate during the lifespan of each individual result in mosaic bodies, in which each cell has unique variants in the genome. That phenomenon is called somatic mosaicism. Despite the prevalence of somatic mosaicism, its study has been limited by the lack of means to detect such variants at the level of single cells. Recent advances in single-cell genomics, however, make such research possible. Our group develops computational methods for precisely detecting somatic mosaic variants by harnessing new experimental approaches, including clonal expansion and whole genome amplification. By applying those methods to human samples, we aim to answer questions about the origin, spread, and consequences of mosaic mutations, which involves determining mutation rates, differences in the number and pattern of mutations between tissues and ages, and the relevance of mutations to diseases and aging. Additionally, we are developing scalable approaches for tracing cell lineages using mutations as lineage markers.
Single-cell sequencing is the ultimate way to study somatic mosaicism in healthy tissues and in cancer. However, due to the scarcity of DNA in a single cell, an amplification process is required. Amplification can be achieved via clonal expansion, in which a single cell is cultured to produce a colony, or via in vitro whole genome amplification (WGA), in which DNA is amplified by using polymerases. We are currently formulating strategies for quality control of WGA and for distinguishing signal from noise that may be introduced during cell culture or DNA amplification, as well as developing approaches to estimate the contributions of signal and noise when they cannot be distinguished unambiguously.
During the past decade, high-throughput next-generation technologies coupled with computational algorithms have enabled us to better understand the biology of cancer as well as the molecular underpinnings of its development and progression. Numerous functionally significant point mutations as well as structural alterations have been identified in several types and subtypes of cancers that illustrate the diverse landscape of the cancer genome. In our laboratory, we focus on the discovery and analysis of somatic point mutations and structural alterations, including deletions, duplications, and copy number changes, in colon cancer and glioma. We are especially interested in understanding the relationship between patterns of genetic alterations and modes of evolution of cancer, as well as molecular differences between cancer-free and cancer-adjacent polyps.
Copy number variation (CNV) in the genome is a complex phenomenon that remains incompletely understood. Frequent in cancers, somatic copy number alterations (CNAs) have been related to cancer susceptibility, cancer progression and invasiveness, individual response to treatment, and patients’ quality of life after treatment. The detection of CNVs and CNAs is important for addressing a wide spectrum of clinical and scientific questions. Research in our laboratory is focused on the discovery and analysis of CNVs and CNAs along with their relevance to diseases. We have developed and continually improved a method, CNVnator/CNVpytor, for CNV discovery and genotyping from a read-depth analysis of personal genome or cancer sequencing that currently ranks among the best, most widely used methods for CNV analysis.
Simultaneous advances in genomics (i.e., in variant discovery), epigenomics, and functional genomics (i.e., emergence of ChIP-seq, ATAC-seq, Hi-C, and RNA-seq techniques) provide opportunities to study both the origins and consequences of genomic variants. We are interested in understanding the epigenomic properties that predispose to the mutational processes generating single nucleotide variation (SNV) and structural variation (SV). Conversely, germline and somatic variants affect genome function. However, because many of those variants occur in non-coding regions of the genome, their effects remain poorly understood. In response, our laboratory is actively working to elucidate such effects with a particular focus on variants contributing to neurodevelopmental disorders such as autism spectrum disorders and Tourette syndrome.
Accurate discovery of somatic mutations in a cell is a challenge that lies partly in the immaturity of dedicated analytical approaches. Approaches comparing a cell's genome to a control bulk sample miss common mutations, while approaches to find such mutations from bulk suffer from low sensitivity. We developed a tool, All2, which enables accurate filtering of mutations in a cell without the need for data from bulk(s). It is based on pairwise comparisons of all cells to each other, in which every call for a base pair substitution or indel is classified as either a germline variant, a mosaic mutation, or a false positive. Because All2 takes dropped-out regions into account, it is applicable to whole genome and exome analysis of cloned and amplified cells. By applying the approach to a variety of available data, we showed that it reduces false positives, enables sensitive discovery of high frequency mutations, and is indispensable for conducting high resolution cell lineage tracing.
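The all-versus-all idea can be illustrated with a toy sketch. This is not the All2 implementation or its scoring scheme: the function name `classify_calls`, the two thresholds, and the input layout (a dictionary keyed by case/control cell pairs) are assumptions made only for illustration. The sketch labels a variant as germline-like when it is called in most cells, as mosaic when it is called consistently against the cells that lack it, and as a false positive otherwise.

```python
from collections import defaultdict


def classify_calls(pairwise_calls, cells, germline_fraction=0.75, min_consistency=0.8):
    """Toy classifier for variants called in all-vs-all cell comparisons.

    pairwise_calls maps (case, control) -> set of variants called in `case`
    when `control` serves as the reference sample. Thresholds are hypothetical.
    """
    n_cells = len(cells)

    # Cells in which each variant was called at least once as the case sample.
    carriers = defaultdict(set)
    for (case, _control), variants in pairwise_calls.items():
        for variant in variants:
            carriers[variant].add(case)

    labels = {}
    for variant, carrier_cells in carriers.items():
        # Seen in nearly every cell: most likely germline, missed in a few
        # cells (for example because of allelic dropout).
        if len(carrier_cells) / n_cells >= germline_fraction:
            labels[variant] = "germline"
            continue

        # A real mosaic variant should be called in every comparison of a
        # carrier cell against a non-carrier cell.
        non_carriers = [c for c in cells if c not in carrier_cells]
        expected = len(carrier_cells) * len(non_carriers)
        supported = sum(
            variant in pairwise_calls.get((case, control), set())
            for case in carrier_cells
            for control in non_carriers
        )
        consistency = supported / expected if expected else 0.0
        labels[variant] = "mosaic" if consistency >= min_consistency else "false_positive"
    return labels


cells = ["c1", "c2", "c3", "c4"]
pairwise_calls = {
    ("c1", "c2"): {"fp1"},        # called in a single comparison only
    ("c1", "c3"): {"m1"},
    ("c1", "c4"): {"m1", "g1"},
    ("c2", "c3"): {"m1"},
    ("c2", "c4"): {"m1", "g1"},   # g1 appears only against c4 (dropout there)
    ("c3", "c4"): {"g1"},
}
labels = classify_calls(pairwise_calls, cells)
```

With only four cells, a mosaic variant shared by three of them would be labeled germline by this toy rule. A real analysis needs more cells and a more careful treatment of dropout than this sketch provides.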
Detecting copy number variations (CNVs) and copy number alterations (CNAs) based on whole-genome sequencing data is important for personalized genomics and treatment. CNVnator is one of the most popular tools for CNV/CNA discovery and analysis based on read depth. Herein, we present an extension of CNVnator developed in Python: CNVpytor. CNVpytor inherits the reimplemented core engine of its predecessor and extends visualization, modularization, performance, and functionality. Additionally, CNVpytor uses B-allele frequency likelihood information from single-nucleotide polymorphisms and small indels as additional evidence for CNVs/CNAs and as primary information for copy number neutral losses of heterozygosity. CNVpytor is significantly faster than CNVnator, particularly for parsing alignment files (2-20 times faster), and its intermediate files are 20-50 times smaller. CNV calls can be filtered using several criteria, annotated, and merged over multiple samples. The modular architecture allows it to be used in shared and cloud environments such as Google Colab and Jupyter notebooks. Data can be exported into JBrowse, while a lightweight plugin version of CNVpytor for JBrowse enables nearly instant and GUI-assisted analysis of CNVs by any user. The CNVpytor release and source code are available on GitHub at https://github.com/abyzovlab/CNVpytor under the MIT license.
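As a rough illustration of what read-depth CNV calling means, the sketch below converts per-bin read depth into an estimated copy number and merges adjacent abnormal bins into segments. It does not reproduce the segmentation or statistics used by CNVnator or CNVpytor, and it does not call their APIs. The function name `call_cnv_segments`, the 1,000 bp bin size, and the fixed tolerance of 0.5 copies are assumptions chosen for the example.

```python
from statistics import median


def call_cnv_segments(bin_depths, bin_size=1000, ploidy=2, tolerance=0.5):
    """Toy read-depth CNV caller (assumes a non-empty list with non-zero median).

    bin_depths holds the mean read depth of consecutive fixed-size bins.
    Returns (start, end, state) tuples in base-pair coordinates.
    """
    # Use the median depth as the estimate of the normal (diploid) level.
    baseline = median(bin_depths)

    # Assign each bin a state from its estimated copy number.
    states = []
    for depth in bin_depths:
        copy_number = ploidy * depth / baseline
        if copy_number < ploidy - tolerance:
            states.append("deletion")
        elif copy_number > ploidy + tolerance:
            states.append("duplication")
        else:
            states.append("neutral")

    # Merge runs of adjacent bins that share the same non-neutral state.
    segments = []
    start = 0
    for i in range(1, len(states) + 1):
        if i == len(states) or states[i] != states[start]:
            if states[start] != "neutral":
                segments.append((start * bin_size, i * bin_size, states[start]))
            start = i
    return segments


depths = [30, 31, 29, 15, 14, 16, 30, 30, 45, 46, 30, 29]
segments = call_cnv_segments(depths)
```

For the example depths, bins 3-5 sit at about half the median depth and bins 8-9 at about one and a half times the median, so they come out as one deletion and one duplication segment.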
Various barcoding and labelling strategies have been developed for cell-lineage tracing ...
Post-zygotic mutations incurred during DNA replication, DNA repair, and other cellular processes lead to somatic mosaicism. Somatic mosaicism is an established cause of various diseases, including cancers. However, detecting mosaic variants in DNA from non-cancerous somatic tissues poses significant challenges, particularly if the variants are present in only a small fraction of cells. Here, the Brain Somatic Mosaicism Network conducts a coordinated, multi-institutional study to examine the ability of existing methods to detect simulated somatic single-nucleotide variants (SNVs) in DNA mixing experiments, generate multiple replicates of whole-genome sequencing data from the dorsolateral prefrontal cortex, other brain regions, dura mater, and dural fibroblasts of a single neurotypical individual, devise strategies to discover somatic SNVs, and apply various approaches to validate somatic SNVs. These efforts lead to the identification of 43 bona fide somatic SNVs that range in variant allele fractions from 0.005 to 0.28. Guided by these results, we devise best practices for calling mosaic SNVs from 250× whole-genome sequencing data in the accessible portion of the human genome that achieve 90% specificity and sensitivity. Finally, we demonstrate that analysis of multiple bulk DNA samples from a single individual allows the reconstruction of early developmental cell lineage trees. This study provides a unified set of best practices to detect somatic SNVs in non-cancerous tissues. The data and methods are freely available to the scientific community and should serve as a guide to assess the contributions of somatic SNVs to neuropsychiatric diseases.
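A back-of-the-envelope calculation shows why variants at the low end of that allele-fraction range are hard to detect even at 250× coverage. The sketch below assumes a simple binomial sampling model that ignores sequencing errors, mapping bias, and replicate structure. The helper names and the threshold of three supporting reads are hypothetical and are not taken from the study.

```python
from math import comb


def expected_alt_reads(depth, vaf):
    """Expected number of reads carrying the variant allele."""
    return depth * vaf


def prob_at_least(depth, vaf, min_alt_reads):
    """P(X >= min_alt_reads) for X ~ Binomial(depth, vaf)."""
    # Sum the probabilities of seeing fewer than min_alt_reads variant reads.
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


# Lowest and highest variant allele fractions reported above, at 250x depth.
for vaf in (0.005, 0.28):
    mean_reads = expected_alt_reads(250, vaf)
    p_detect = prob_at_least(250, vaf, min_alt_reads=3)
    print(f"VAF={vaf}: expected alt reads={mean_reads:.2f}, P(>=3 reads)={p_detect:.3f}")
```

At a variant allele fraction of 0.005, the expected number of supporting reads at 250× is 250 × 0.005 = 1.25, so a single sample often contains too few variant reads to separate a true call from noise.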
76. All2: A tool for selecting mosaic mutations from comprehensive multi-cell comparisons.
, , , , Fasching L, , Tomasini L, Mariani J, Vaccarino FM,
PLoS Comput Biol 2022; 18(4):e1009487
75. CNVpytor: a tool for copy number variation detection and analysis from read depth and allele imbalance in whole-genome sequencing.
, , Diesh C, Holmes I,
Gigascience 2021; 10(11):giab074
74. LongAGE: defining breakpoints of genomic structural variants through optimal and memory efficient alignments of long reads.
Bioinformatics 2021; 37(7):1015-1017
73. Comprehensive identification of somatic nucleotide variants in human brain tissue.
, , Thorpe J, Sherman MA, Jones AG, Cho S, Daily K, Dou Y, Ganz J, Galor A, Lobon I, Pattni R, Rosenbluh C, Tomasi S, Tomasini L, Yang X, Zhou B, Akbarian S, Ball LL, Bizzotto S, Emery SB, Doan R, Fasching L, , Juan D, Lizano E, Luquette LJ, Moldovan JB, Narurkar R, Oetjens MT, Rodin RE, , Shin JH, Soriano E, Straub RE, Zhou W, Chess A, Gleeson JG, Marquès-Bonet T, Park PJ, Peters MA, Pevsner J, Walsh CA, Weinberger DR, Vaccarino FM, Moran JV, Urban AE, Kidd JM, Mills RE,
Genome Biol 2021; 22(1):92
72. Early developmental asymmetries in cell lineage trees in living individuals.
Fasching L, , Tomasi S, Schreiner J, Tomasini L, Brady MV, , , , , Szekely A, Fernandez TV, Leckman JF, , Vaccarino FM
Science 2021; 371(6535):1245-1248
71. Landmarks of human embryonic development inscribed in somatic mutations.
Bizzotto S, Dou Y, Ganz J, Doan RN, Kwon M, Bohrson CL, Kim SN, , , Park PJ, Walsh CA
Science 2021; 371(6535):1249-1253
70. Machine learning reveals bilateral distribution of somatic L1 insertions in human neurons and glia.
Zhu X, Zhou B, Pattni R, Gleason K, Tan C, Kalinowski A, Sloan S, Fiston-Lavier AS, Mariani J, Petrov D, Barres BA, Duncan L, , Vogel H, Moran JV, Vaccarino FM, Tamminga CA, Levinson DF, Urban AE
Nat Neurosci 2021; 24(2):186-196
69. PsychENCODE and beyond: transcriptomics and epigenomics of brain development and organoids.
Jourdon A, Scuderi S, Capauto D, , Vaccarino FM
Neuropsychopharmacology 2021; 46(1):70-85
68. Complex mosaic structural variations in human fetal brains.
, Tomasini L, Proukakis C, , Manlove L, , Scuderi S, Zhou B, Kalyva M, Amiri A, Mariani J, Sedlazeck FJ, Urban AE, Vaccarino FM,
Genome Res 2020; 30(12):1695-1704
67. The role of somatic mosaicism in brain disease.
Jourdon A, Fasching L, Scuderi S, , Vaccarino FM
Curr Opin Genet Dev 2020; 65:84-90
66. Adult diffuse glioma GWAS by molecular subtype identifies variants in D2HGDH and FAM20C.
Eckel-Passow JE, Drucker KL, Kollmeyer TM, Kosel ML, Decker PA, Molinaro AM, Rice T, Praska CE, Clark L, Caron A, , Batzler A, Song JS, Pekmezci M, Hansen HM, McCoy LS, Bracci PM, Wiemels J, Wiencke JK, Francis S, Burns TC, Giannini C, Lachance DH, Wrensch M, Jenkins RB
Neuro Oncol 2020; 22(11):1602-1613
65. SCELLECTOR: ranking amplification bias in single cells using shallow sequencing.
, Jourdon A, , , Vaccarino F,
BMC Bioinformatics 2020; 21(1):521
64. Cell Lineage Tracing and Cellular Diversity in Humans.
, Vaccarino FM
Annu Rev Genomics Hum Genet 2020; 21:101-116
63. Neurological safety of oxaliplatin in patients with uncommon variants in Charcot-Marie-Tooth disease genes.
Le-Rademacher JG, Lopez CL, Kanwar R, Major-Elechi B, , Banck MS, Therneau TM, Sloan JA, Loprinzi CL, Beutler AS
J Neurol Sci 2020; 411:116687
62. Combining copy number, methylation markers, and mutations as a panel for endometrial cancer detection via intravaginal tampon collection.
Sangtani A, Wang C, Weaver A, Hoppman NL, Kerr SE, , Shridhar V, Staub J, Kocher JA, Voss JS, Podratz KC, Wentzensen N, Kisiel JB, Sherman ME, Bakkum-Gamez JN
Gynecol Oncol 2020; 156(2):387-392
61. Haplotype-resolved and integrated genome analysis of the cancer cell line HepG2.
Zhou B, Ho SS, Greer SU, Spies N, Bell JM, Zhang X, Zhu X, Arthur JG, Byeon S, Pattni R, Saha I, Huang Y, Song G, Perrin D, Wong WH, Ji HP, , Urban AE
Nucleic Acids Res 2019; 47(8):3846-3861
60. Chromatin organization modulates the origin of heritable structural variations in human genome.
Nucleic Acids Res 2019; 47(6):2766-2777
59. Comprehensive, integrated, and phased whole-genome analysis of the primary ENCODE cell line K562.
Zhou B, Ho SS, Greer SU, Zhu X, Bell JM, Arthur JG, Spies N, Zhang X, Byeon S, Pattni R, Ben-Efraim N, Haney MS, Haraksingh RR, Song G, Ji HP, Perrin D, Wong WH, , Urban AE
Genome Res 2019; 29(3):472-484
58. Molecular signatures of multiple myeloma progression through single cell RNA-Seq.
Jang JS, Li Y, Mitra AK, Bi L, , van Wijnen AJ, Baughn LB, Van Ness B, Rajkumar V, Kumar S, Jen J
Blood Cancer J 2019; 9(1):2
57. Revealing the brain's molecular architecture.
Science 2018; 362(6420):1262-1263
56. Transcriptome and epigenome landscape of human cortical development modeled in organoids.
Amiri A, Coppola G, Scuderi S, Wu F, , Liu F, Pochareddy S, Shin Y, Safi A, Song L, Zhu Y, Sousa AMM, Gerstein M, Crawford GE, Sestan N, , Vaccarino FM
Science 2018; 362(6420):eaat6720
55. Molecular characterization of colorectal adenomas with and without malignancy reveals distinguishing genome, transcriptome and methylome alterations.
Druliner BR, Wang P, , Baheti S, Slettedahl S, Mahoney D, , Xu H, , Bockol M, O'Brien D, Grill D, Warner N, Munoz-Gomez M, Kossick K, Johnson R, Mouchli M, Felmlee-Devine D, Washechek-Aletto J, Smyrk T, Oberg A, Wang J, Chia N, , Ahlquist D, Boardman LA
Sci Rep 2018; 8(1):3161
54. Different mutational rates and mechanisms in human cells at pregastrulation and neurogenesis.
, Tomasini L, Mariani J, Zhou B, , Franjic D, Pletikos M, Pattni R, Chen BJ, Venturini E, Riley-Gillis B, Sestan N, Urban AE, , Vaccarino FM
Science 2018; 359(6375):550-555
53. Detection and Quantification of Mosaic Genomic DNA Variation in Primary Somatic Tissues Using ddPCR: Analysis of Mosaic Transposable-Element Insertions, Copy-Number Variants, and Single-Nucleotide Variants.
Zhou B, Haney MS, Zhu X, Pattni R, , Urban AE
Methods Mol Biol 2018; 1768:173-190
52. Inferring modes of evolution from colorectal cancer with residual polyp of origin.
, Druliner BR, , , Chia N, , Boardman LA
Oncotarget 2017; 9(6):6780-6792
51. Patient-reported (EORTC QLQ-CIPN20) versus physician-reported (CTCAE) quantification of oxaliplatin- and paclitaxel/carboplatin-induced peripheral neuropathy in NCCTG/Alliance clinical trials.
Le-Rademacher J, Kanwar R, Seisler D, Pachman DR, Qin R, , Ruddy KJ, Banck MS, Lavoie Smith EM, Dorsey SG, Aaronson NK, Sloan J, Loprinzi CL, Beutler AS
Support Care Cancer 2017; 25(11):3537-3544
50. Landscape and variation of novel retroduplications in 26 human populations.
Zhang Y, Li S, , Gerstein MB
PLoS Comput Biol 2017; 13(6):e1005567
49. Human induced pluripotent stem cells for modelling neurodevelopmental disorders.
Ardhanareeswaran K, Mariani J, Coppola G, , Vaccarino FM
Nat Rev Neurol 2017; 13(5):265-278
48. Intersection of diverse neuronal genomes and neuropsychiatric disease: The Brain Somatic Mosaicism Network.
McConnell MJ, Moran JV, , Akbarian S, , Cortes-Ciriano I, Erwin JA, Fasching L, Flasch DA, Freed D, Ganz J, Jaffe AE, Kwan KY, Kwon M, Lodato MA, Mills RE, Paquola ACM, Rodin RE, Rosenbluh C, Sestan N, Sherman MA, Shin JH, Song S, Straub RE, Thorpe J, Weinberger DR, Urban AE, Zhou B, Gage FH, Lehner T, Senthil G, Walsh CA, Chess A, Courchesne E, Gleeson JG, Kidd JM, Park PJ, Pevsner J, Vaccarino FM
Science 2017; 356(6336):eaal1641
47. Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans.
Haraksingh RR, , Urban AE
BMC Genomics 2017; 18(1):321
46. One thousand somatic SNVs per skin fibroblast cell set baseline of mosaic mutational load with patterns that suggest proliferative origin.
, Tomasini L, Zhou B, , Coppola G, Amenduni M, Pattni R, Wilson M, Gerstein M, Weissman S, Urban AE, Vaccarino FM
Genome Res 2017; 27(4):512-523
45. Genomic Mosaicism in Neurons and Other Cell Types.
, Urban AE, Vaccarino FM
Principles and Approaches for Discovery and Validation of Somatic Mosaicism in the Human Brain. Springer New York: Springer Nature; 2017; Chapter 1; p. 3-24
44. Colorectal Cancer with Residual Polyp of Origin: A Model of Malignant Transformation.
Druliner BR, Rashtak S, Ruan X, , , O'Brien D, Johnson R, Felmlee-Devine D, Washechek-Aletto J, Basu N, Liu H, Smyrk T, , Boardman LA
Transl Oncol 2016; 9(4):280-6
43. Elevated variant density around SV breakpoints in germline lineage lends support to error-prone replication hypothesis.
Genome Res 2016; 26(7):874-81
42. Single-cell analysis of targeted transcriptome predicts drug sensitivity of single cells within human myeloma tumors.
Mitra AK, Mukherjee UK, Harding T, Jang JS, Stessman H, Li Y, , Jen J, Kumar S, Rajkumar V, Van Ness B
Leukemia 2016; 30(5):1094-102
41. A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals.
Chen J, Rozowsky J, Galeev TR, Harmanci A, Kitchen R, Bedford J, , Kong Y, Regan L, Gerstein M
Nat Commun 2016; 7:11101
40. Testing of candidate single nucleotide variants associated with paclitaxel neuropathy in the trial NCCTG N08C1 (Alliance).
Boora GK, Kanwar R, Kulkarni AA, , Sloan J, Ruddy KJ, Banck MS, Loprinzi CL, Beutler AS
Cancer Med 2016; 5(4):631-9
39. Understanding genome structural variations.
, Li S, Gerstein MB
Oncotarget 2016; 7(7):7370-1
38. The PsychENCODE project.
Akbarian S, Liu C, Knowles JA, Vaccarino FM, Farnham PJ, Crawford GE, Jaffe AE, Pinto D, Dracheva S, Geschwind DH, Mill J, Nairn AC, , Pochareddy S, Prabhakar S, Weissman S, Sullivan PF, State MW, Weng Z, Peters MA, White KP, Gerstein MB, Amiri A, Armoskus C, Ashley-Koch AE, , Beckel-Mitchener A, Berman BP, Coetzee GA, Coppola G, Francoeur N, Fromer M, Gao R, Grennan K, Herstein J, Kavanagh DH, Ivanov NA, Jiang Y, Kitchen RR, Kozlenkov A, Kundakovic M, Li M, Li Z, Liu S, Mangravite LM, Mattei E, Markenscoff-Papadimitriou E, Navarro FC, North N, Omberg L, Panchision D, Parikshak N, Poschmann J, Price AJ, Purcaro M, Reddy TE, Roussos P, Schreiner S, Scuderi S, Sebra R, Shibata M, Shieh AW, Skarica M, Sun W, Swarup V, Thomas A, Tsuji J, van Bakel H, Wang D, , Wang K, Werling DM, Willsey AJ, Witt H, Won H, Wong CC, Wray GA, Wu EY, Xu X, Yao L, Senthil G, Lehner T, Sklar P, Sestan N
Nat Neurosci 2015; 18(12):1707-12
37. An integrated map of structural variation in 2,504 human genomes.
Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, , Huddleston J, Zhang Y, Ye K, Jun G, Fritz MH, Konkel MK, Malhotra A, Stütz AM, Shi X, Casale FP, Chen J, Hormozdiari F, Dayama G, Chen K, Malig M, Chaisson MJP, Walter K, Meiers S, Kashin S, Garrison E, Auton A, Lam HYK, Mu XJ, Alkan C, Antaki D, , Cerveira E, Chines P, Chong Z, Clarke L, Dal E, Ding L, Emery S, Fan X, Gujral M, Kahveci F, Kidd JM, Kong Y, Lameijer EW, McCarthy S, Flicek P, Gibbs RA, Marth G, Mason CE, Menelaou A, Muzny DM, Nelson BJ, Noor A, Parrish NF, Pendleton M, Quitadamo A, Raeder B, Schadt EE, Romanovitch M, Schlattl A, Sebra R, Shabalin AA, Untergasser A, Walker JA, Wang M, Yu F, Zhang C, Zhang J, Zheng-Bradley X, Zhou W, Zichner T, Sebat J, Batzer MA, McCarroll SA, Mills RE, Gerstein MB, Bashir A, Stegle O, Devine SE, Lee C, Eichler EE, Korbel JO
Nature 2015; 526(7571):75-81
36. A global reference for human genetic variation.
Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR
Nature 2015; 526(7571):68-74
35. MetaSV: an accurate and integrative structural-variant caller for next generation sequencing.
Mohiyuddin M, Mu JC, Li J, Bani Asadi N, Gerstein MB, , Wong WH, Lam HY
Bioinformatics 2015; 31(16):2741-4
34. FOXG1-Dependent Dysregulation of GABA/Glutamate Neuron Differentiation in Autism Spectrum Disorders.
Mariani J, Coppola G, Zhang P, , Provini L, Tomasini L, Amenduni M, Szekely A, Palejev D, Wilson M, Gerstein M, Grigorenko EL, Chawarska K, Pelphrey KA, Howe JR, Vaccarino FM
Cell 2015; 162(2):375-390
33. Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms.
, Li S, Kim DR, Mohiyuddin M, Stütz AM, Parrish NF, Mu XJ, Clark W, Chen K, Hurles M, Korbel JO, Lam HY, Lee C, Gerstein MB
Nat Commun 2015; 6:7256
32. VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications.
Mu JC, Mohiyuddin M, Li J, Bani Asadi N, Gerstein MB, , Wong WH, Lam HY
Bioinformatics 2015; 31(9):1469-71
31. Analysis of variable retroduplications in human populations suggests coupling of retrotransposition to cell division.
, Iskow R, Gokcumen O, Radke DW, Balasubramanian S, Pei B, Habegger L, Lee C, Gerstein M
Genome Res 2013; 23(12):2042-52
30. Integrative annotation of variants from 1092 humans: application to cancer genomics.
Khurana E, Fu Y, Colonna V, Mu XJ, Kang HM, Lappalainen T, Sboner A, Lochovsky L, Chen J, Harmanci A, Das J, , Balasubramanian S, Beal K, Chakravarty D, Challis D, Chen Y, Clarke D, Clarke L, Cunningham F, Evani US, Flicek P, Fragoza R, Garrison E, Gibbs R, Gümüş ZH, Herrero J, Kitabayashi N, Kong Y, Lage K, Liluashvili V, Lipkin SM, MacArthur DG, Marth G, Muzny D, Pers TH, Ritchie GRS, Rosenfeld JA, Sisu C, Wei X, Wilson M, Xue Y, Yu F, Dermitzakis ET, Yu H, Rubin MA, Tyler-Smith C, Gerstein M
Science 2013; 342(6154):1235587
29. Child development and structural variation in the human genome.
Zhang Y, Haraksingh R, Grubert F, , Gerstein M, Weissman S, Urban AE
Child Dev 2013; 84(1):34-48
28. Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells.
, Mariani J, Palejev D, Zhang Y, Haney MS, Tomasini L, Ferrandino AF, Rosenberg Belmaker LA, Szekely A, Wilson M, Kocabas A, Calixto NE, Grigorenko EL, Huttner A, Chawarska K, Weissman S, Urban AE, Gerstein M, Vaccarino FM
Nature 2012; 492(7429):438-42
27. An integrated map of genetic variation from 1,092 human genomes.
Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA
Nature 2012; 491(7422):56-65
26. Architecture of the human regulatory network derived from ENCODE data.
Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, Mu XJ, Khurana E, Rozowsky J, Alexander R, Min R, Alves P, , Addleman N, Bhardwaj N, Boyle AP, Cayting P, Charos A, Chen DZ, Cheng Y, Clarke D, Eastman C, Euskirchen G, Frietze S, Fu Y, Gertz J, Grubert F, Harmanci A, Jain P, Kasowski M, Lacroute P, Leng JJ, Lian J, Monahan H, O'Geen H, Ouyang Z, Partridge EC, Patacsil D, Pauli F, Raha D, Ramirez L, Reddy TE, Reed B, Shi M, Slifer T, Wang J, Wu L, Yang X, Yip KY, Zilberman-Schapira G, Batzoglou S, Sidow A, Farnham PJ, Myers RM, Weissman SM, Snyder M
Nature 2012; 489(7414):91-100
25. An integrated encyclopedia of DNA elements in the human genome.
Nature 2012; 489(7414):57-74
24. Regulatory element copy number differences shape primate expression profiles.
Iskow RC, Gokcumen O, , Malukiewicz J, Zhu Q, Sukumar AT, Pai AA, Mills RE, Habegger L, Cusanovich DA, Rubel MA, Perry GH, Gerstein M, Stone AC, Gilad Y, Lee C
Proc Natl Acad Sci U S A 2012; 109(31):12656-61
23. Genome-wide mapping of copy number variation in humans: comparative analysis of high resolution array platforms.
Haraksingh RR, , Gerstein M, Urban AE, Snyder M
PLoS One 2011; 6(11):e27859
22. Integration of protein motions with molecular networks reveals different mechanisms for permanent and transient interactions.
Bhardwaj N, , Clarke D, Shou C, Gerstein MB
Protein Sci 2011; 20(10):1745-54
21. AlleleSeq: analysis of allele-specific expression and binding in a network framework.
Rozowsky J, , Wang J, Alves P, Raha D, Harmanci A, Leng J, Bjornson R, Kong Y, Kitabayashi N, Bhardwaj N, Rubin M, Snyder M, Gerstein M
Mol Syst Biol 2011; 7:522
20. Identification of genomic indels and structural variations using split reads.
Zhang ZD, Du J, Lam H, , Urban AE, Snyder M, Gerstein M
BMC Genomics 2011; 12:375
19. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.
, Urban AE, Snyder M, Gerstein M
Genome Res 2011; 21(6):974-84
18. Annual Research Review: The promise of stem cell research for neuropsychiatric disorders.
Vaccarino FM, Urban AE, Stevens HE, Szekely A, , Grigorenko EL, Gerstein M, Weissman S
J Child Psychol Psychiatry 2011; 52(4):504-16
17. AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision.
, Gerstein M
Bioinformatics 2011; 27(5):595-603
16. Mapping copy number variation by population-scale genome sequencing.
Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, , Yoon SC, Ye K, Cheetham RK, Chinwalla A, Conrad DF, Fu Y, Grubert F, Hajirasouliha I, Hormozdiari F, Iakoucheva LM, Iqbal Z, Kang S, Kidd JM, Konkel MK, Korn J, Khurana E, Kural D, Lam HY, Leng J, Li R, Li Y, Lin CY, Luo R, Mu XJ, Nemesh J, Peckham HE, Rausch T, Scally A, Shi X, Stromberg MP, Stütz AM, Urban AE, Walker JA, Wu J, Zhang Y, Zhang ZD, Batzer MA, Ding L, Marth GT, McVean G, Sebat J, Snyder M, Wang J, Ye K, Eichler EE, Gerstein MB, Hurles ME, Lee C, McCarroll SA, Korbel JO
Nature 2011; 470(7332):59-65
15. A map of human genome variation from population-scale sequencing.
Abecasis GR, Altshuler D, Auton A, Brooks LD, Durbin RM, Gibbs RA, Hurles ME, McVean GA
Nature 2010; 467(7319):1061-73
14. Analysis of combinatorial regulation: scaling of partnerships between regulators with the number of governed targets.
Bhardwaj N, Carson MB, , Yan KK, Lu H, Gerstein MB
PLoS Comput Biol 2010; 6(5):e1000755
13. RigidFinder: a fast and sensitive method to detect rigid blocks in large macromolecular complexes.
, Bjornson R, Felipe M, Gerstein M
Proteins 2010; 78(2):309-24
12. PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data.
Korbel JO, , Mu XJ, Carriero N, Cayting P, Zhang Z, Snyder M, Gerstein MB
Genome Biol 2009; 10(2):R23
11. MSB: a mean-shift-based approach for the analysis of structural variation in the genome.
Wang LY, , Korbel JO, Snyder M, Gerstein M
Genome Res 2009; 19(1):106-17
10. An AP endonuclease 1-DNA polymerase beta complex: theoretical prediction of interacting surfaces.
, Uzun A, Strauss PR, Ilyin VA
PLoS Comput Biol 2008; 4(4):e1000066
9. UmuD and RecA directly modulate the mutagenic potential of the Y family DNA polymerase DinB.
Godoy VG, Jarosz DF, Simon SM, , Ilyin V, Walker GC
Mol Cell 2007; 28(6):1058-70
8. A comprehensive analysis of non-sequential alignments between all protein structures.
, Ilyin VA
BMC Struct Biol 2007; 7:78
7. Structure SNP (StSNP): a web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways.
Uzun A, Leslin CM, , Ilyin V
Nucleic Acids Res 2007; 35(Web Server issue):W384-92
6. TOPOFIT-DB, a database of protein structural alignments based on the TOPOFIT method.
Leslin CM, , Ilyin VA
Nucleic Acids Res 2007; 35(Database issue):D317-21
5. Friend, an integrated analytical front-end application for bioinformatics.
, Errami M, Leslin CM, Ilyin VA
Bioinformatics 2005; 21(18):3677-8
4. Active site prediction for comparative model structures with thematics.
Shehadi IA, , Uzun A, Wei Y, Murga LF, Ilyin V, Ondrechen MJ
J Bioinform Comput Biol 2005; 3(1):127-43
3. Structural exon database, SEDB, mapping exon boundaries on multiple protein structures.
Leslin CM, , Ilyin VA
Bioinformatics 2004; 20(11):1801-3
2. Structural alignment of proteins by a novel TOPOFIT method, as a superimposition of common volumes at a topomax point.
Ilyin VA, , Leslin CM
Protein Sci 2004; 13(7):1865-74
1. Efficiency Profile Method To Study The Hit Efficiency Of Drift Chambers.
, Bel'kov A, Lanyov A, Spiridonov A, Walter M, Hulsbergen W
Particles and Nuclei, Letters 2002; 5(114):40-52
Applications accepted year-round
Applicants are invited to apply for a post-doctoral (i.e., postdoc) position in the Abyzov lab at Mayo Clinic. The choice of project will depend on the applicant's interests and skills; however, the research must be purely computational and focus on one of the following main fields of computational biology: population/personal human omics, cancer omics, single cell and somatic omics, and the analysis of next-generation sequencing data. Specific sub-areas of interest are discovery, annotation, and functional annotation of human genomic variants, cancer genomics, cancer evolution, and somatic mosaicism in normal human cells.
The ideal applicant will have a Ph.D. in computational biology or bioinformatics, experience in one of the aforementioned research areas, a record of peer-reviewed publications, and motivation for independent research. Applicants should have a very strong understanding of biology and be skilled in programming and using computers to solve problems (e.g., experience with C/C++, Java, Python/Perl, R/ROOT, etc.). Oral and written proficiency in English is also a big plus.
To apply, please email your CV, including a list of publications and details for three references, to abyzov dot alexej at mayo dot edu. Please include the phrase “PostDoc application” and your full name in the subject of the email.
Applications accepted year-round
Applications are invited for an internship at the Mayo Clinic. Anticipated projects will be related to the analysis of whole genome sequencing data, with the aim of studying germline and somatic variants (SNPs, CNVs, etc.). The analysis will involve applying commonly used and in-house software tools and forming biological hypotheses from statistical data analysis. Intern applicants with strong programming skills will have opportunities to participate in developing new tools and improving our existing software.
To apply, please email your CV, including a list of publications, to abyzov dot alexej at mayo dot edu. Please include the phrase “Internship application” and your full name in the subject of the email.
Graduate students (M.S. or Ph.D.) wishing to conduct research in the Abyzov lab at Mayo Clinic are invited to contact Dr. Abyzov (abyzov dot alexej at mayo dot edu). The choice of project will depend on the applicant's interests and skills. However, the research must be purely computational and focus on one of the following main fields of computational biology: population/personal human omics, cancer omics, single cell and somatic omics, and the analysis of next-generation sequencing data. Specific sub-areas of interest are discovery, annotation, and functional annotation of human genomic variants such as SNPs, SNVs, indels, structural variations, retrotransposition, etc.
We are looking for candidates who possess motivation for independent research, have experience in computational biology or bioinformatics, and are familiar with one of the aforementioned research areas. They should have a very strong understanding of biology and be skilled in programming and using computers to solve problems (e.g., experience with C/C++, Java, Python/Perl, R/ROOT, etc.). A record of peer-reviewed publications and oral and written proficiency in English are also a big plus.
Please express your interest by emailing your CV, including a list of publications, to abyzov dot alexej at mayo dot edu. Please include the phrase “PhD/MS interest” and your full name in the subject of the email.
Phone: (507) 284-5569
Harwick Building, 3rd floor, Mayo Clinic
200 1st Street SW
Rochester, MN 55905, USA