Rank-statistics based enrichment-site prediction algorithm developed for chromatin immunoprecipitation on chip experiments

Ghosh, S., Hirsch, H. A., Sekinger, E., Struhl, K., Gingeras, T. R. (October 2006) Rank-statistics based enrichment-site prediction algorithm developed for chromatin immunoprecipitation on chip experiments. BMC Bioinformatics, 7. ISSN 1471-2105

Abstract

Background: High-density oligonucleotide tiling arrays are an effective and powerful platform for conducting unbiased genome-wide studies. The ab initio probe selection method employed in tiling arrays is unbiased, and thus ensures consistent sampling across coding and non-coding regions of the genome. Tiling arrays are increasingly used in chromatin immunoprecipitation (IP) experiments (ChIP on chip). ChIP on chip facilitates the generation of genome-wide maps of in vivo interactions between DNA and DNA-associated proteins, including transcription factors. Analysis of the hybridization of an immunoprecipitated sample to a tiling array facilitates the identification of ChIP-enriched segments of the genome. These enriched segments are putative targets of antibody-assayable regulatory elements. The enrichment response is not ubiquitous across the genome. Typically, 5 to 10% of tiled probes manifest some significant enrichment. Depending upon the factor being studied, this response can drop to less than 1%. The detection and assessment of significance for interactions that emanate from non-canonical and/or un-annotated regions of the genome is especially challenging. This is the motivation behind the proposed algorithm.
Results: We have proposed a novel rank and replicate statistics-based methodology for identifying and ascribing statistical confidence to regions of ChIP-enrichment. The algorithm is optimized for identification of sites that manifest low levels of enrichment but are true positives, as validated by alternative biochemical experiments. Although the method is described here in the context of ChIP on chip experiments, it can be generalized to any treatment-control experimental design. The results of the algorithm show a high degree of concordance with independent biochemical validation methods. The sensitivity and specificity of the algorithm have been characterized via quantitative PCR and independent computational approaches.
Conclusion: The algorithm ranks all enrichment sites based on their intra-replicate ranks and inter-replicate rank consistency. Following the ranking, the method allows segmentation of sites based on a meta p-value, a composite array signal enrichment criterion, or a composite of these two measures. The sensitivities obtained subsequent to the segmentation of data using a meta p-value of 10^-5, an array signal enrichment of 0.2, and a composite of these two values are 88%, 87% and 95%, respectively.
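
The ranking and segmentation described in the conclusion can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the statistics used, so the sketch assumes rank-based empirical p-values within each replicate, combines them across replicates with Fisher's method, and treats the composite criterion as "passes either cut". The function names (rank_pvalues, fisher_meta_pvalue, call_sites) and the input layout (one list of per-site enrichment scores per replicate) are hypothetical.

```python
import math


def rank_pvalues(scores):
    """Rank-based empirical p-values within one replicate.

    The highest-scoring site gets rank 1 and p = 1/n. Ties are broken
    by input order, which is a simplification made for this sketch.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    pvals = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        pvals[idx] = rank / n
    return pvals


def fisher_meta_pvalue(pvals):
    """Combine k p-values with Fisher's method (assumed, not from the paper).

    The statistic X = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom under the null; for even degrees of
    freedom the survival function has the closed form used below.
    """
    k = len(pvals)
    half_stat = -sum(math.log(p) for p in pvals)  # equals X / 2
    series = sum(half_stat ** j / math.factorial(j) for j in range(k))
    return math.exp(-half_stat) * series


def call_sites(replicates, p_cut=1e-5, signal_cut=0.2):
    """Rank sites by meta p-value and flag them under three criteria."""
    per_rep = [rank_pvalues(rep) for rep in replicates]
    n_sites = len(replicates[0])
    results = []
    for i in range(n_sites):
        meta_p = fisher_meta_pvalue([rep[i] for rep in per_rep])
        # Mean score across replicates stands in for the array signal.
        signal = sum(rep[i] for rep in replicates) / len(replicates)
        by_p = meta_p <= p_cut
        by_signal = signal >= signal_cut
        results.append({
            "site": i,
            "meta_p": meta_p,
            "signal": signal,
            "by_p": by_p,
            "by_signal": by_signal,
            # Assumption: the composite accepts a site passing either cut.
            "composite": by_p or by_signal,
        })
    return sorted(results, key=lambda r: r["meta_p"])
```

Calling call_sites with one score list per replicate returns the sites ordered from most to least consistently enriched. The default cuts mirror the 10^-5 and 0.2 values quoted above, although a rank-based p-value can only fall below 10^-5 when the number of probes and replicates is large, as it is on a genome-wide tiling array.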

Item Type: Paper
Additional Information: Times Cited: 9
Subjects: bioinformatics > genomics and proteomics > analysis and processing > microarray gene expression processing
Communities: CSHL labs > Gingeras lab
Depositing User: CSHL Librarian
Date: 5 October 2006
Date Deposited: 08 Mar 2012 16:50
Last Modified: 12 Jul 2013 20:18
PMCID: PMC1615882
URI: https://repository.cshl.edu/id/eprint/25305