Population-genetic nature of copy number variations in the human genome

Kato, M., Kawaguchi, T., Ishikawa, S., Umeda, T., Nakamichi, R., Shapero, M. H., Jones, K. W., Nakamura, Y., Aburatani, H., Tsunoda, T. (2010) Population-genetic nature of copy number variations in the human genome. Human Molecular Genetics, 19 (5). pp. 761-773.

URL: http://www.ncbi.nlm.nih.gov/pubmed/19966329
DOI: 10.1093/hmg/ddp541

Abstract

Copy number variations (CNVs) are universal genetic variations, and their association with disease has been increasingly recognized. We designed high-density microarrays for CNVs, and detected 3000-4000 CNVs (4-6% of the genomic sequence) per population that included CNVs previously missed because of smaller sizes and residing in segmental duplications. The patterns of CNVs across individuals were surprisingly simple at the kilo-base scale, suggesting the applicability of a simple genetic analysis for these genetic loci. We utilized the probabilistic theory to determine integer copy numbers of CNVs and employed a recently developed phasing tool to estimate the population frequencies of integer copy number alleles and CNV-SNP haplotypes. The results showed a tendency toward a lower frequency of CNV alleles and that most of our CNVs were explained only by zero-, one- and two-copy alleles. Using the estimated population frequencies, we found several CNV regions with exceptionally high population differentiation. Investigation of CNV-SNP linkage disequilibrium (LD) for 500-900 bi- and multi-allelic CNVs per population revealed that previous conflicting reports on bi-allelic LD were unexpectedly consistent and explained by an LD increase correlated with deletion-allele frequencies. Typically, the bi-allelic LD was lower than SNP-SNP LD, whereas the multi-allelic LD was somewhat stronger than the bi-allelic LD. After further investigation of tag SNPs for CNVs, we conclude that the customary tagging strategy for disease association studies can be applicable for common deletion CNVs, but direct interrogation is needed for other types of CNVs. © The Author 2009. Published by Oxford University Press.
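
The abstract refers to CNV-SNP linkage disequilibrium for bi-allelic CNVs. As a rough illustration only, the standard r^2 statistic can be computed from phased haplotypes as sketched below. This is not the authors' pipeline: the function name `r_squared` and the 0/1 allele coding (1 = deletion allele at the CNV, 1 = one chosen allele at the SNP) are assumptions made for this example.

```python
from collections import Counter

def r_squared(haplotypes):
    """Return r^2 between a bi-allelic CNV and a SNP.

    haplotypes: list of (cnv_allele, snp_allele) pairs, each coded 0 or 1
    (hypothetical coding chosen for this sketch).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    counts = Counter(haplotypes)
    # Allele frequencies of the "1" allele at each locus.
    p_cnv = sum(c for (a, _), c in counts.items() if a == 1) / n
    p_snp = sum(c for (_, b), c in counts.items() if b == 1) / n
    # Frequency of the haplotype carrying both "1" alleles.
    p_both = counts[(1, 1)] / n
    # D measures the departure from independence between the two loci.
    d = p_both - p_cnv * p_snp
    denom = p_cnv * (1 - p_cnv) * p_snp * (1 - p_snp)
    if denom == 0:
        raise ValueError("a locus is monomorphic; r^2 is undefined")
    return d * d / denom
```

Under this coding, a SNP that perfectly tags a deletion gives r^2 close to 1, and a SNP independent of the deletion gives r^2 of 0.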

Item Type: Paper
Subjects: bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
organism description > animal
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > copy number variants
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes
organism description > animal > mammal > primates > hominids > human
Communities: CSHL labs > Krasnitz lab
Depositing User: Matt Covey
Date: 2010
Date Deposited: 20 Feb 2013 16:51
Last Modified: 20 Feb 2013 16:51
PMCID: PMC2816609
URI: https://repository.cshl.edu/id/eprint/27433
