Analysis of mosaic variants and cell lineage reconstruction

Mutations in DNA that accumulate during the lifespan of each individual result in mosaic bodies, in which each cell has unique variants in the genome. That phenomenon is called somatic mosaicism. Despite the prevalence of somatic mosaicism, studying it has been limited by the lack of means to detect such variants at the level of single cells. Recent advances in single-cell genomics, however, make such research possible. Our group develops computational methods for precisely detecting somatic mosaic variants by harnessing new experimental approaches, including clonal expansion and whole genome amplification. By applying those methods to human samples, we aim to answer questions about the origin, spread, and consequences of mosaic mutations, which involves determining mutation rates, differences in the number and pattern of mutations between tissues and ages, and the relevance of mutations to disease and aging. Additionally, we are developing scalable approaches for tracing cell lineages using mutations as lineage markers.

Single cell studies

Single-cell sequencing is the ultimate way to study somatic mosaicism in healthy tissues and in cancer. However, due to the scarcity of DNA in a single cell, an amplification process is required. Amplification can be achieved via clonal expansion, in which a single cell is cultured to produce a colony, or via in vitro whole genome amplification (WGA), in which DNA is amplified by using polymerases. We are currently formulating strategies for quality control of WGA and for distinguishing signal from noise that may be introduced during cell culture or DNA amplification. We are also developing approaches to estimate the contributions of signal and noise when they cannot be distinguished unambiguously.

Cancer genomics

During the past decade, high-throughput next-generation technologies coupled with computational algorithms have enabled us to better understand the biology of cancer as well as the molecular underpinnings of its development and progression. Numerous functionally significant point mutations as well as structural alterations have been identified in several types and subtypes of cancers that illustrate the diverse landscape of the cancer genome. In our laboratory, we focus on the discovery and analysis of somatic point mutations and structural alterations, including deletions, duplications, and copy number changes, in colon cancer and glioma. We are especially interested in understanding the relationship between patterns of genetic alterations and modes of evolution of cancer, as well as molecular differences between cancer-free and cancer-adjacent polyps.

CNV and CNA analysis

Copy number variation (CNV) in the genome is a complex phenomenon that remains incompletely understood. Frequent in cancers, somatic copy number alterations (CNAs) have been related to cancer susceptibility, cancer progression and invasiveness, individual response to treatment, and patients’ quality of life after treatment. The detection of CNVs and CNAs is important for addressing a wide spectrum of clinical and scientific questions. Research in our laboratory is focused on the discovery and analysis of CNVs and CNAs along with their relevance to diseases. We have developed and continue to improve a method, CNVnator/CNVpytor, for CNV discovery and genotyping from read-depth analysis of personal genome or cancer sequencing data. It currently ranks among the best and most widely used methods for CNV analysis.
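The general idea of read-depth analysis can be illustrated with a minimal sketch. It assumes read depth has already been counted in fixed-size bins; the function name, bin size, and ratio thresholds are hypothetical choices for illustration, and this simple thresholding is not the algorithm implemented in CNVnator/CNVpytor.

```python
from statistics import median


def call_cnv_segments(bin_depths, bin_size=1000, del_ratio=0.75, dup_ratio=1.25):
    """Return (start, end, type) segments from per-bin read depth.

    Each bin is compared with the median depth; consecutive bins with the
    same call are merged. Thresholds and bin size are illustrative only.
    """
    baseline = median(bin_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")

    segments = []
    current = None  # [first_bin, end_bin_exclusive, label] of the open segment
    for i, depth in enumerate(bin_depths):
        ratio = depth / baseline
        if ratio <= del_ratio:
            label = "deletion"
        elif ratio >= dup_ratio:
            label = "duplication"
        else:
            label = None  # depth consistent with the baseline copy number

        if current is not None and current[2] == label:
            current[1] = i + 1  # extend the open segment
        else:
            if current is not None:
                segments.append((current[0] * bin_size, current[1] * bin_size, current[2]))
            current = [i, i + 1, label] if label else None

    if current is not None:
        segments.append((current[0] * bin_size, current[1] * bin_size, current[2]))
    return segments


if __name__ == "__main__":
    depths = [30, 31, 29, 15, 14, 30, 45, 46, 44, 30]
    for segment in call_cnv_segments(depths):
        print(segment)
```

Real data would additionally need corrections (for example for GC content and mappability) and a statistical test for each segment, which this sketch leaves out.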

Variant function

Simultaneous advances in genomics (i.e., in variant discovery), epigenomics, and functional genomics (i.e., emergence of ChIP-seq, ATAC-seq, Hi-C, and RNA-seq techniques) provide opportunities to study both the origins and consequences of genomic variants. We are interested in understanding the epigenomic properties that predispose regions of the genome to the mutational processes generating single nucleotide variation (SNV) and structural variation (SV). Conversely, germline and somatic variants affect genome function. However, because many of those variants occur in non-coding regions of the genome, their effects remain poorly understood. In response, our laboratory is actively working to elucidate such effects with a particular focus on variants contributing to neurodevelopmental disorders such as autism spectrum disorders and Tourette syndrome.


Publication from the lab in BMC Bioinformatics

SCELLECTOR: ranking amplification bias in single cells using shallow sequencing

The study of mosaic mutation is important since it has been linked to cancer and various disorders. Single cell sequencing has become a powerful tool to study the genome of individual cells for the detection of mosaic mutations. The amount of DNA in a single cell needs to be amplified before sequencing and multiple displacement amplification (MDA) is widely used owing to its low error rate and long fragment length of amplified DNA. However, the phi29 polymerase used in MDA is sensitive to template fragmentation and the presence of sites with DNA damage that can lead to biases such as allelic imbalance, uneven coverage and overrepresentation of C to T mutations. It is therefore important to select cells with uniform amplification to decrease false positives and increase sensitivity for mosaic mutation detection. We propose a method, Scellector (single cell selector), which uses haplotype information to detect amplification quality in shallow coverage sequencing data. We tested Scellector on single human neuronal cells, obtained in vitro and amplified by MDA. Qualities were estimated from shallow sequencing with coverage as low as 0.3× per cell and then confirmed using 30× deep coverage sequencing. The high concordance between shallow and high coverage data validated the method. Scellector can potentially be used to rank amplifications obtained from single cell platforms relying on an MDA-like amplification step, such as Chromium Single Cell profiling solution.
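To illustrate why haplotype information helps at shallow coverage, here is a minimal sketch of an allelic imbalance score. It assumes reads overlapping phased heterozygous SNPs have already been assigned to haplotype 0 or 1; the function name, window size, and read threshold are hypothetical, and this is a simplified illustration rather than the published Scellector procedure.

```python
from collections import defaultdict


def allelic_imbalance_score(observations, window=1_000_000, min_reads=5):
    """Mean deviation of the haplotype-0 read fraction from 0.5 across windows.

    observations: iterable of (position, haplotype) pairs, haplotype in {0, 1},
    one pair per read overlapping a phased heterozygous SNP.
    Returns None when no window has at least min_reads reads.
    """
    counts = defaultdict(lambda: [0, 0])
    for position, haplotype in observations:
        # Pool reads from many SNPs on the same haplotype within a window,
        # because a single SNP is rarely covered more than once at ~0.3x.
        counts[position // window][haplotype] += 1

    deviations = []
    for hap0, hap1 in counts.values():
        total = hap0 + hap1
        if total >= min_reads:
            deviations.append(abs(hap0 / total - 0.5))

    if not deviations:
        return None
    return sum(deviations) / len(deviations)


if __name__ == "__main__":
    balanced = [(i * 1000, i % 2) for i in range(10)]
    skewed = [(i * 1000, 0) for i in range(9)] + [(9000, 1)]
    print(allelic_imbalance_score(balanced))
    print(allelic_imbalance_score(skewed))
```

Under these assumptions, a lower score suggests more even amplification of the two haplotypes, so cells could be ranked by it before committing to deep sequencing.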

posted Nov 13, 2020 by Alexej Abyzov

We are featured in neuroDEVELOPMENTS

Mosaics in mind

For this issue of neuroDEVELOPMENTS we focus on the startling reality of the mosaic nature of genomes in the human brain. Since the meeting of Craig Venter, Francis Collins, and Bill Clinton at the White House on June 6, 2000 to announce the first draft of the human genome, the idea that we all carry our own version of the human genetic code is commonplace. It is now clear that this is a simplified view of reality because every cell in our body does not have precisely the same genome ...

posted Oct 27, 2020 by Alexej Abyzov

Publication from the lab in Annual Review of Genomics and Human Genetics

Cell Lineage Tracing and Cellular Diversity in Humans

Tracing cell lineages is fundamental for understanding the rules governing development in multicellular organisms and delineating complex biological processes involving the differentiation of multiple cell types with distinct lineage hierarchies. In humans, experimental lineage tracing is unethical, and one has to rely on natural-mutation markers that are created within cells as they proliferate and age. Recent studies have demonstrated that it is now possible to trace lineages in normal, noncancerous cells with a variety of data types using natural variations in the nuclear and mitochondrial DNA as well as variations in DNA methylation status. It is also apparent that the scientific community is on the verge of being able to make a comprehensive and detailed cell lineage map of human embryonic and fetal development. In this review, we discuss the advantages and disadvantages of different approaches and markers for lineage tracing. We also describe the general conceptual design for how to derive a lineage map for humans.

posted Aug 12, 2020 by Alexej Abyzov

Publication from the lab in Bioinformatics

LongAGE: defining breakpoints of genomic structural variants through optimal and memory efficient alignments of long reads

Defining the precise location of structural variations (SVs) at single-nucleotide breakpoint resolution is a challenging problem due to large gaps in alignment. Previously, Alignment with Gap Excision (AGE) enabled us to define breakpoints of SVs at single-nucleotide resolution; however, AGE requires a vast amount of memory when aligning a pair of long sequences. To address this, we developed a memory-efficient implementation - LongAGE - based on the classical Hirschberg algorithm. We demonstrate an application of LongAGE for resolving breakpoints of SVs embedded into segmental duplications on Pacific Biosciences (PacBio) reads that can be longer than 10 kbp. Furthermore, we observed different breakpoints for a deletion and a duplication in the same locus, providing direct evidence that such multi-allelic copy number variants (mCNVs) arise from two or more independent ancestral mutations.
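The memory saving behind the classical Hirschberg algorithm can be sketched as follows. The example uses plain global alignment with a linear gap penalty, not the gap-excision scoring of AGE or LongAGE, and the function names and scoring values are hypothetical. It keeps only one row of the score matrix at a time and finds where an optimal alignment crosses the middle of the first sequence, which is the divide step that Hirschberg applies recursively.

```python
def nw_score_row(a, b, match=1, mismatch=-1, gap=-1):
    """Last row of the Needleman-Wunsch score matrix using O(len(b)) memory."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur  # only the previous row is ever kept
    return prev


def hirschberg_split(a, b):
    """Return (mid, j, score): an optimal alignment pairs a[:mid] with b[:j]."""
    mid = len(a) // 2
    forward = nw_score_row(a[:mid], b)
    # Aligning the reversed suffixes gives the scores of a[mid:] versus b[j:].
    backward = nw_score_row(a[mid:][::-1], b[::-1])
    totals = [forward[j] + backward[len(b) - j] for j in range(len(b) + 1)]
    best_j = max(range(len(b) + 1), key=totals.__getitem__)
    return mid, best_j, totals[best_j]


if __name__ == "__main__":
    a, b = "AGTACGCA", "TATGC"
    mid, j, score = hirschberg_split(a, b)
    # The split score equals the optimal global alignment score.
    print(mid, j, score, nw_score_row(a, b)[-1])
```

Recursing on the two halves (a[:mid] with b[:j], and a[mid:] with b[j:]) would recover the full alignment while memory stays linear in sequence length, at the cost of roughly doubling the computation.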

posted Aug 12, 2020 by Alexej Abyzov


Alexej Abyzov, Ph.D.
Associate Professor of Biomedical Informatics

Leighann Wagner
Lab Support Personnel
Administrative Assistant

Taejeong Bae, Ph.D.
Lab Member
Research Associate

Milovan Suvakov, Ph.D.
Lab Member
Research Associate

Arijit Panda, Ph.D.
Lab Member
Postdoctoral fellow

Yeongjun Jang, Ph.D.
Lab Member
Postdoctoral fellow

Yifan Wang, Ph.D.
Lab Member
Postdoctoral fellow

Vivekananda Sarangi, M.S.
Lab Member
PhD Student

Chen Wang, Ph.D.
Affiliated member
Senior Associate Consultant


  • Shobana Sekar, Ph.D.
  • Tanmoy Roychowdhury, Ph.D.
  • Quang Tran, B.S.
  • Logan J. Manlove, B.S.
  • Marcela Aguilera, B.S.
  • Dhananjay Dhokarh, Ph.D.
  • Nikolaos Vasmatzis
  • Minsoo Kim
  • Justin Wood


71. Machine learning reveals bilateral distribution of somatic L1 insertions in human neurons and glia.
Zhu X, Zhou B, Pattni R, Gleason K, Tan C, Kalinowski A, Sloan S, Fiston-Lavier AS, Mariani J, Petrov D, Barres BA, Duncan L, Abyzov A, Vogel H, Moran JV, Vaccarino FM, Tamminga CA, Levinson DF, Urban AE
Nat Neurosci 2021; [Epub ahead of print]
70. PsychENCODE and beyond: transcriptomics and epigenomics of brain development and organoids.
Jourdon A, Scuderi S, Capauto D, Abyzov A, Vaccarino FM
Neuropsychopharmacology 2021; 46(1):70-85

69. Complex mosaic structural variations in human fetal brains.
Sekar S, Tomasini L, Proukakis C, Bae T, Manlove L, Jang Y, Scuderi S, Zhou B, Kalyva M, Amiri A, Mariani J, Sedlazeck FJ, Urban AE, Vaccarino FM, Abyzov A
Genome Res 2020; 30(12):1695-1704
68. The role of somatic mosaicism in brain disease.
Jourdon A, Fasching L, Scuderi S, Abyzov A, Vaccarino FM
Curr Opin Genet Dev 2020; 65:84-90
67. Adult diffuse glioma GWAS by molecular subtype identifies variants in D2HGDH and FAM20C.
Eckel-Passow JE, Drucker KL, Kollmeyer TM, Kosel ML, Decker PA, Molinaro AM, Rice T, Praska CE, Clark L, Caron A, Abyzov A, Batzler A, Song JS, Pekmezci M, Hansen HM, McCoy LS, Bracci PM, Wiemels J, Wiencke JK, Francis S, Burns TC, Giannini C, Lachance DH, Wrensch M, Jenkins RB
Neuro Oncol 2020; 22(11):1602-1613
66. SCELLECTOR: ranking amplification bias in single cells using shallow sequencing.
Sarangi V, Jourdon A, Bae T, Panda A, Vaccarino F, Abyzov A
BMC Bioinformatics 2020; 21(1):521
65. Cell Lineage Tracing and Cellular Diversity in Humans.
Abyzov A, Vaccarino FM
Annu Rev Genomics Hum Genet 2020; 21:101-116
64. LongAGE: defining breakpoints of genomic structural variants through optimal and memory efficient alignments of long reads.
Tran Q, Abyzov A
Bioinformatics 2020; btaa703
63. Neurological safety of oxaliplatin in patients with uncommon variants in Charcot-Marie-tooth disease genes.
Le-Rademacher JG, Lopez CL, Kanwar R, Major-Elechi B, Abyzov A, Banck MS, Therneau TM, Sloan JA, Loprinzi CL, Beutler AS
J Neurol Sci 2020; 411:116687
62. Combining copy number, methylation markers, and mutations as a panel for endometrial cancer detection via intravaginal tampon collection.
Sangtani A, Wang C, Weaver A, Hoppman NL, Kerr SE, Abyzov A, Shridhar V, Staub J, Kocher JA, Voss JS, Podratz KC, Wentzensen N, Kisiel JB, Sherman ME, Bakkum-Gamez JN
Gynecol Oncol 2020; 156(2):387-392

61. Haplotype-resolved and integrated genome analysis of the cancer cell line HepG2.
Zhou B, Ho SS, Greer SU, Spies N, Bell JM, Zhang X, Zhu X, Arthur JG, Byeon S, Pattni R, Saha I, Huang Y, Song G, Perrin D, Wong WH, Ji HP, Abyzov A, Urban AE
Nucleic Acids Res 2019; 47(8):3846-3861
60. Chromatin organization modulates the origin of heritable structural variations in human genome.
Roychowdhury T, Abyzov A
Nucleic Acids Res 2019; 47(6):2766-2777
59. Comprehensive, integrated, and phased whole-genome analysis of the primary ENCODE cell line K562.
Zhou B, Ho SS, Greer SU, Zhu X, Bell JM, Arthur JG, Spies N, Zhang X, Byeon S, Pattni R, Ben-Efraim N, Haney MS, Haraksingh RR, Song G, Ji HP, Perrin D, Wong WH, Abyzov A, Urban AE
Genome Res 2019; 29(3):472-484
58. Molecular signatures of multiple myeloma progression through single cell RNA-Seq.
Jang JS, Li Y, Mitra AK, Bi L, Abyzov A, van Wijnen AJ, Baughn LB, Van Ness B, Rajkumar V, Kumar S, Jen J
Blood Cancer J 2019; 9(1):2

57. Revealing the brain's molecular architecture.

Science 2018; 362(6420):1262-1263
56. Transcriptome and epigenome landscape of human cortical development modeled in organoids.
Amiri A, Coppola G, Scuderi S, Wu F, Roychowdhury T, Liu F, Pochareddy S, Shin Y, Safi A, Song L, Zhu Y, Sousa AMM, Gerstein M, Crawford GE, Sestan N, Abyzov A, Vaccarino FM
Science 2018; 362(6420):eaat6720
55. Molecular characterization of colorectal adenomas with and without malignancy reveals distinguishing genome, transcriptome and methylome alterations.
Druliner BR, Wang P, Bae T, Baheti S, Slettedahl S, Mahoney D, Vasmatzis N, Xu H, Kim M, Bockol M, O'Brien D, Grill D, Warner N, Munoz-Gomez M, Kossick K, Johnson R, Mouchli M, Felmlee-Devine D, Washechek-Aletto J, Smyrk T, Oberg A, Wang J, Chia N, Abyzov A, Ahlquist D, Boardman LA
Sci Rep 2018; 8(1):3161
54. Different mutational rates and mechanisms in human cells at pregastrulation and neurogenesis.
Bae T, Tomasini L, Mariani J, Zhou B, Roychowdhury T, Franjic D, Pletikos M, Pattni R, Chen BJ, Venturini E, Riley-Gillis B, Sestan N, Urban AE, Abyzov A, Vaccarino FM
Science 2018; 359(6375):550-555
53. Detection and Quantification of Mosaic Genomic DNA Variation in Primary Somatic Tissues Using ddPCR: Analysis of Mosaic Transposable-Element Insertions, Copy-Number Variants, and Single-Nucleotide Variants.
Zhou B, Haney MS, Zhu X, Pattni R, Abyzov A, Urban AE
Methods Mol Biol 2018; 1768:173-190

52. Inferring modes of evolution from colorectal cancer with residual polyp of origin.
Kim M, Druliner BR, Vasmatzis N, Bae T, Chia N, Abyzov A, Boardman LA
Oncotarget 2017; 9(6):6780-6792
51. Patient-reported (EORTC QLQ-CIPN20) versus physician-reported (CTCAE) quantification of oxaliplatin- and paclitaxel/carboplatin-induced peripheral neuropathy in NCCTG/Alliance clinical trials.
Le-Rademacher J, Kanwar R, Seisler D, Pachman DR, Qin R, Abyzov A, Ruddy KJ, Banck MS, Lavoie Smith EM, Dorsey SG, Aaronson NK, Sloan J, Loprinzi CL, Beutler AS
Support Care Cancer 2017; 25(11):3537-3544
50. Landscape and variation of novel retroduplications in 26 human populations.
Zhang Y, Li S, Abyzov A, Gerstein MB
PLoS Comput Biol 2017; 13(6):e1005567
49. Human induced pluripotent stem cells for modelling neurodevelopmental disorders.
Ardhanareeswaran K, Mariani J, Coppola G, Abyzov A, Vaccarino FM
Nat Rev Neurol 2017; 13(5):265-278
48. Intersection of diverse neuronal genomes and neuropsychiatric disease: The Brain Somatic Mosaicism Network.
McConnell MJ, Moran JV, Abyzov A, Akbarian S, Bae T, Cortes-Ciriano I, Erwin JA, Fasching L, Flasch DA, Freed D, Ganz J, Jaffe AE, Kwan KY, Kwon M, Lodato MA, Mills RE, Paquola ACM, Rodin RE, Rosenbluh C, Sestan N, Sherman MA, Shin JH, Song S, Straub RE, Thorpe J, Weinberger DR, Urban AE, Zhou B, Gage FH, Lehner T, Senthil G, Walsh CA, Chess A, Courchesne E, Gleeson JG, Kidd JM, Park PJ, Pevsner J, Vaccarino FM
Science 2017; 356(6336):eaal1641
47. Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans.
Haraksingh RR, Abyzov A, Urban AE
BMC Genomics 2017; 18(1):321
46. One thousand somatic SNVs per skin fibroblast cell set baseline of mosaic mutational load with patterns that suggest proliferative origin.
Abyzov A, Tomasini L, Zhou B, Vasmatzis N, Coppola G, Amenduni M, Pattni R, Wilson M, Gerstein M, Weissman S, Urban AE, Vaccarino FM
Genome Res 2017; 27(4):512-523
45. Genomic Mosaicism in Neurons and Other Cell Types.
Abyzov A, Urban AE, Vaccarino FM
Principles and Approaches for Discovery and Validation of Somatic Mosaicism in the Human Brain, Springer New York: Springer Nature; 2017; Chapter 1.; 3-24p

44. Colorectal Cancer with Residual Polyp of Origin: A Model of Malignant Transformation.
Druliner BR, Rashtak S, Ruan X, Bae T, Vasmatzis N, O'Brien D, Johnson R, Felmlee-Devine D, Washechek-Aletto J, Basu N, Liu H, Smyrk T, Abyzov A, Boardman LA
Transl Oncol 2016; 9(4):280-6
43. Elevated variant density around SV breakpoints in germline lineage lends support to error-prone replication hypothesis.
Dhokarh D, Abyzov A
Genome Res 2016; 26(7):874-81
42. Single-cell analysis of targeted transcriptome predicts drug sensitivity of single cells within human myeloma tumors.
Mitra AK, Mukherjee UK, Harding T, Jang JS, Stessman H, Li Y, Abyzov A, Jen J, Kumar S, Rajkumar V, Van Ness B
Leukemia 2016; 30(5):1094-102
41. A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals.
Chen J, Rozowsky J, Galeev TR, Harmanci A, Kitchen R, Bedford J, Abyzov A, Kong Y, Regan L, Gerstein M
Nat Commun 2016; 7:11101
40. Testing of candidate single nucleotide variants associated with paclitaxel neuropathy in the trial NCCTG N08C1 (Alliance).
Boora GK, Kanwar R, Kulkarni AA, Abyzov A, Sloan J, Ruddy KJ, Banck MS, Loprinzi CL, Beutler AS
Cancer Med 2016; 5(4):631-9
39. Understanding genome structural variations.
Abyzov A, Li S, Gerstein MB
Oncotarget 2016; 7(7):7370-1

38. The PsychENCODE project.
Akbarian S, Liu C, Knowles JA, Vaccarino FM, Farnham PJ, Crawford GE, Jaffe AE, Pinto D, Dracheva S, Geschwind DH, Mill J, Nairn AC, Abyzov A, Pochareddy S, Prabhakar S, Weissman S, Sullivan PF, State MW, Weng Z, Peters MA, White KP, Gerstein MB, Amiri A, Armoskus C, Ashley-Koch AE, Bae T, Beckel-Mitchener A, Berman BP, Coetzee GA, Coppola G, Francoeur N, Fromer M, Gao R, Grennan K, Herstein J, Kavanagh DH, Ivanov NA, Jiang Y, Kitchen RR, Kozlenkov A, Kundakovic M, Li M, Li Z, Liu S, Mangravite LM, Mattei E, Markenscoff-Papadimitriou E, Navarro FC, North N, Omberg L, Panchision D, Parikshak N, Poschmann J, Price AJ, Purcaro M, Reddy TE, Roussos P, Schreiner S, Scuderi S, Sebra R, Shibata M, Shieh AW, Skarica M, Sun W, Swarup V, Thomas A, Tsuji J, van Bakel H, Wang D, Wang Y, Wang K, Werling DM, Willsey AJ, Witt H, Won H, Wong CC, Wray GA, Wu EY, Xu X, Yao L, Senthil G, Lehner T, Sklar P, Sestan N
Nat Neurosci 2015; 18(12):1707-12
37. An integrated map of structural variation in 2,504 human genomes.
Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Fritz MH, Konkel MK, Malhotra A, Stütz AM, Shi X, Casale FP, Chen J, Hormozdiari F, Dayama G, Chen K, Malig M, Chaisson MJP, Walter K, Meiers S, Kashin S, Garrison E, Auton A, Lam HYK, Mu XJ, Alkan C, Antaki D, Bae T, Cerveira E, Chines P, Chong Z, Clarke L, Dal E, Ding L, Emery S, Fan X, Gujral M, Kahveci F, Kidd JM, Kong Y, Lameijer EW, McCarthy S, Flicek P, Gibbs RA, Marth G, Mason CE, Menelaou A, Muzny DM, Nelson BJ, Noor A, Parrish NF, Pendleton M, Quitadamo A, Raeder B, Schadt EE, Romanovitch M, Schlattl A, Sebra R, Shabalin AA, Untergasser A, Walker JA, Wang M, Yu F, Zhang C, Zhang J, Zheng-Bradley X, Zhou W, Zichner T, Sebat J, Batzer MA, McCarroll SA, Mills RE, Gerstein MB, Bashir A, Stegle O, Devine SE, Lee C, Eichler EE, Korbel JO
Nature 2015; 526(7571):75-81
36. A global reference for human genetic variation.
Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR
Nature 2015; 526(7571):68-74
35. MetaSV: an accurate and integrative structural-variant caller for next generation sequencing.
Mohiyuddin M, Mu JC, Li J, Bani Asadi N, Gerstein MB, Abyzov A, Wong WH, Lam HY
Bioinformatics 2015; 31(16):2741-4
34. FOXG1-Dependent Dysregulation of GABA/Glutamate Neuron Differentiation in Autism Spectrum Disorders.
Mariani J, Coppola G, Zhang P, Abyzov A, Provini L, Tomasini L, Amenduni M, Szekely A, Palejev D, Wilson M, Gerstein M, Grigorenko EL, Chawarska K, Pelphrey KA, Howe JR, Vaccarino FM
Cell 2015; 162(2):375-390
33. Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms.
Abyzov A, Li S, Kim DR, Mohiyuddin M, Stütz AM, Parrish NF, Mu XJ, Clark W, Chen K, Hurles M, Korbel JO, Lam HY, Lee C, Gerstein MB
Nat Commun 2015; 6:7256
32. VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications.
Mu JC, Mohiyuddin M, Li J, Bani Asadi N, Gerstein MB, Abyzov A, Wong WH, Lam HY
Bioinformatics 2015; 31(9):1469-71

31. Analysis of variable retroduplications in human populations suggests coupling of retrotransposition to cell division.
Abyzov A, Iskow R, Gokcumen O, Radke DW, Balasubramanian S, Pei B, Habegger L, Lee C, Gerstein M
Genome Res 2013; 23(12):2042-52
30. Integrative annotation of variants from 1092 humans: application to cancer genomics.
Khurana E, Fu Y, Colonna V, Mu XJ, Kang HM, Lappalainen T, Sboner A, Lochovsky L, Chen J, Harmanci A, Das J, Abyzov A, Balasubramanian S, Beal K, Chakravarty D, Challis D, Chen Y, Clarke D, Clarke L, Cunningham F, Evani US, Flicek P, Fragoza R, Garrison E, Gibbs R, Gümüş ZH, Herrero J, Kitabayashi N, Kong Y, Lage K, Liluashvili V, Lipkin SM, MacArthur DG, Marth G, Muzny D, Pers TH, Ritchie GRS, Rosenfeld JA, Sisu C, Wei X, Wilson M, Xue Y, Yu F, Dermitzakis ET, Yu H, Rubin MA, Tyler-Smith C, Gerstein M
Science 2013; 342(6154):1235587
29. Child development and structural variation in the human genome.
Zhang Y, Haraksingh R, Grubert F, Abyzov A, Gerstein M, Weissman S, Urban AE
Child Dev 2013; 84(1):34-48

28. Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells.
Abyzov A, Mariani J, Palejev D, Zhang Y, Haney MS, Tomasini L, Ferrandino AF, Rosenberg Belmaker LA, Szekely A, Wilson M, Kocabas A, Calixto NE, Grigorenko EL, Huttner A, Chawarska K, Weissman S, Urban AE, Gerstein M, Vaccarino FM
Nature 2012; 492(7429):438-42
27. An integrated map of genetic variation from 1,092 human genomes.
Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA
Nature 2012; 491(7422):56-65
26. Architecture of the human regulatory network derived from ENCODE data.
Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, Mu XJ, Khurana E, Rozowsky J, Alexander R, Min R, Alves P, Abyzov A, Addleman N, Bhardwaj N, Boyle AP, Cayting P, Charos A, Chen DZ, Cheng Y, Clarke D, Eastman C, Euskirchen G, Frietze S, Fu Y, Gertz J, Grubert F, Harmanci A, Jain P, Kasowski M, Lacroute P, Leng JJ, Lian J, Monahan H, O'Geen H, Ouyang Z, Partridge EC, Patacsil D, Pauli F, Raha D, Ramirez L, Reddy TE, Reed B, Shi M, Slifer T, Wang J, Wu L, Yang X, Yip KY, Zilberman-Schapira G, Batzoglou S, Sidow A, Farnham PJ, Myers RM, Weissman SM, Snyder M
Nature 2012; 489(7414):91-100
25. An integrated encyclopedia of DNA elements in the human genome.

Nature 2012; 489(7414):57-74
24. Regulatory element copy number differences shape primate expression profiles.
Iskow RC, Gokcumen O, Abyzov A, Malukiewicz J, Zhu Q, Sukumar AT, Pai AA, Mills RE, Habegger L, Cusanovich DA, Rubel MA, Perry GH, Gerstein M, Stone AC, Gilad Y, Lee C
Proc Natl Acad Sci U S A 2012; 109(31):12656-61

23. Genome-wide mapping of copy number variation in humans: comparative analysis of high resolution array platforms.
Haraksingh RR, Abyzov A, Gerstein M, Urban AE, Snyder M
PLoS One 2011; 6(11):e27859
22. Integration of protein motions with molecular networks reveals different mechanisms for permanent and transient interactions.
Bhardwaj N, Abyzov A, Clarke D, Shou C, Gerstein MB
Protein Sci 2011; 20(10):1745-54
21. AlleleSeq: analysis of allele-specific expression and binding in a network framework.
Rozowsky J, Abyzov A, Wang J, Alves P, Raha D, Harmanci A, Leng J, Bjornson R, Kong Y, Kitabayashi N, Bhardwaj N, Rubin M, Snyder M, Gerstein M
Mol Syst Biol 2011; 7:522
20. Identification of genomic indels and structural variations using split reads.
Zhang ZD, Du J, Lam H, Abyzov A, Urban AE, Snyder M, Gerstein M
BMC Genomics 2011; 12:375
19. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.
Abyzov A, Urban AE, Snyder M, Gerstein M
Genome Res 2011; 21(6):974-84
18. Annual Research Review: The promise of stem cell research for neuropsychiatric disorders.
Vaccarino FM, Urban AE, Stevens HE, Szekely A, Abyzov A, Grigorenko EL, Gerstein M, Weissman S
J Child Psychol Psychiatry 2011; 52(4):504-16
17. AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision.
Abyzov A, Gerstein M
Bioinformatics 2011; 27(5):595-603
16. Mapping copy number variation by population-scale genome sequencing.
Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, Abyzov A, Yoon SC, Ye K, Cheetham RK, Chinwalla A, Conrad DF, Fu Y, Grubert F, Hajirasouliha I, Hormozdiari F, Iakoucheva LM, Iqbal Z, Kang S, Kidd JM, Konkel MK, Korn J, Khurana E, Kural D, Lam HY, Leng J, Li R, Li Y, Lin CY, Luo R, Mu XJ, Nemesh J, Peckham HE, Rausch T, Scally A, Shi X, Stromberg MP, Stütz AM, Urban AE, Walker JA, Wu J, Zhang Y, Zhang ZD, Batzer MA, Ding L, Marth GT, McVean G, Sebat J, Snyder M, Wang J, Ye K, Eichler EE, Gerstein MB, Hurles ME, Lee C, McCarroll SA, Korbel JO
Nature 2011; 470(7332):59-65

15. A map of human genome variation from population-scale sequencing.
Abecasis GR, Altshuler D, Auton A, Brooks LD, Durbin RM, Gibbs RA, Hurles ME, McVean GA
Nature 2010; 467(7319):1061-73
14. Analysis of combinatorial regulation: scaling of partnerships between regulators with the number of governed targets.
Bhardwaj N, Carson MB, Abyzov A, Yan KK, Lu H, Gerstein MB
PLoS Comput Biol 2010; 6(5):e1000755
13. RigidFinder: a fast and sensitive method to detect rigid blocks in large macromolecular complexes.
Abyzov A, Bjornson R, Felipe M, Gerstein M
Proteins 2010; 78(2):309-24

12. PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data.
Korbel JO, Abyzov A, Mu XJ, Carriero N, Cayting P, Zhang Z, Snyder M, Gerstein MB
Genome Biol 2009; 10(2):R23
11. MSB: a mean-shift-based approach for the analysis of structural variation in the genome.
Wang LY, Abyzov A, Korbel JO, Snyder M, Gerstein M
Genome Res 2009; 19(1):106-17

10. An AP endonuclease 1-DNA polymerase beta complex: theoretical prediction of interacting surfaces.
Abyzov A, Uzun A, Strauss PR, Ilyin VA
PLoS Comput Biol 2008; 4(4):e1000066

9. UmuD and RecA directly modulate the mutagenic potential of the Y family DNA polymerase DinB.
Godoy VG, Jarosz DF, Simon SM, Abyzov A, Ilyin V, Walker GC
Mol Cell 2007; 28(6):1058-70
8. A comprehensive analysis of non-sequential alignments between all protein structures.
Abyzov A, Ilyin VA
BMC Struct Biol 2007; 7:78
7. Structure SNP (StSNP): a web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways.
Uzun A, Leslin CM, Abyzov A, Ilyin V
Nucleic Acids Res 2007; 35(Web Server issue):W384-92
6. TOPOFIT-DB, a database of protein structural alignments based on the TOPOFIT method.
Leslin CM, Abyzov A, Ilyin VA
Nucleic Acids Res 2007; 35(Database issue):D317-21

5. Friend, an integrated analytical front-end application for bioinformatics.
Abyzov A, Errami M, Leslin CM, Ilyin VA
Bioinformatics 2005; 21(18):3677-8
4. Active site prediction for comparative model structures with thematics.
Shehadi IA, Abyzov A, Uzun A, Wei Y, Murga LF, Ilyin V, Ondrechen MJ
J Bioinform Comput Biol 2005; 3(1):127-43

3. Structural exon database, SEDB, mapping exon boundaries on multiple protein structures.
Leslin CM, Abyzov A, Ilyin VA
Bioinformatics 2004; 20(11):1801-3
2. Structural alignment of proteins by a novel TOPOFIT method, as a superimposition of common volumes at a topomax point.
Ilyin VA, Abyzov A, Leslin CM
Protein Sci 2004; 13(7):1865-74

1. Efficiency Profile Method To Study The Hit Efficiency Of Drift Chambers.
Abyzov A, Bel'kov A, Lanyov A, Spiridonov A, Walter M, Hulsbergen W
Particles and Nuclei, Letters 2002; 5(114):40-52



PostDoc in computational biology and bioinformatics

Applications accepted year-round


Applicants are invited to apply for a postdoctoral position in the Abyzov lab at Mayo Clinic. The choice of project will depend on the applicant's interests and skills. However, the research must be purely computational and focus on one of the following main fields of computational biology: population/personal human omics, cancer omics, single cell and somatic omics, and the analysis of next-generation sequencing data. Specific sub-areas of interest are the discovery, annotation, and functional annotation of human genomic variants, cancer genomics, cancer evolution, and somatic mosaicism in normal human cells.

The ideal applicant will have a Ph.D. in computational biology or bioinformatics, experience in one of the aforementioned research areas, a record of peer-reviewed publications, and motivation for independent research. He or she should have a very strong understanding of biology and be skilled in programming and using computers to solve problems (e.g., experience with C/C++, Java, Python/Perl, R/ROOT, etc.). Oral and written proficiency in English is also a big plus.

To apply, please email your CV, including a list of publications and details for three references, to abyzov dot alexej at mayo dot edu. Please include the phrase “PostDoc application” and your full name in the subject of the email.

Internship in computational biology and bioinformatics

Applications accepted year-round


Applications are invited for an internship at Mayo Clinic. Anticipated projects will be related to the analysis of whole genome sequencing data, with the aim of studying germline and somatic variants (SNPs, CNVs, etc.). The analysis will involve applying commonly used and in-house software tools and forming biological hypotheses from statistical data analysis. Intern applicants with strong programming skills will have opportunities to participate in developing new tools and improving our existing software.

To apply, please email your CV, including a list of publications, to abyzov dot alexej at mayo dot edu. Please include the phrase “Internship application” and your full name in the subject of the email.

Graduate student in computational biology and bioinformatics


Graduate students (M.S. or Ph.D.) wishing to conduct research in the Abyzov lab at Mayo Clinic are invited to contact Dr. Abyzov (abyzov dot alexej at mayo dot edu). The choice of project will depend on the applicant's interests and skills. However, the research must be purely computational and focus on one of the following main fields of computational biology: population/personal human omics, cancer omics, single cell and somatic omics, and the analysis of next-generation sequencing data. Specific sub-areas of interest are the discovery, annotation, and functional annotation of human genomic variants such as SNPs, SNVs, indels, structural variations, retrotransposition, etc.

We are looking for candidates who possess motivation for independent research, have experience in computational biology or bioinformatics, and are familiar with one of the aforementioned research areas. They should have a very strong understanding of biology and be skilled in programming and using computers to solve problems (e.g., experience with C/C++, Java, Python/Perl, R/ROOT, etc.). A record of peer-reviewed publications and oral and written proficiency in English are also a big plus.

Dr. Abyzov is affiliated with Mayo Clinic College of Medicine and with the program in Biomedical Informatics and Computational Biology at University of Minnesota Rochester.

Please express your interest by emailing your CV, including a list of publications, to abyzov dot alexej at mayo dot edu. Please include the phrase “PhD/MS interest” and your full name in the subject of the email.



Let's get in touch:

Alexej Abyzov, Ph.D., Principal Investigator


Leighann Wagner, Administrative Assistant

Phone: (507) 284-5569


Harwick Building 3rd floor, Mayo Clinic
200 1st Street SW
Rochester, MN 55905, USA