OSCAR: one-class SVM for accurate recognition of cis-elements

Jiang, B., Zhang, M. Q., Zhang, X. (November 2007) OSCAR: one-class SVM for accurate recognition of cis-elements. Bioinformatics, 23 (21). pp. 2823-8. ISSN 1460-2059 (Electronic)

Abstract

MOTIVATION: Traditional methods for identifying potential binding sites of known transcription factors still suffer from a large number of false predictions. They mostly use sequence information in a position-specific manner and neglect other types of information hidden in the proximal promoter regions. Recent biological and computational research, however, suggests that there are not only locational preferences of binding but also correlations between transcription factors.

RESULTS: In this article, we propose a novel approach, OSCAR, which uses one-class SVM algorithms and incorporates multiple factors to aid the recognition of transcription factor binding sites. Using both synthetic and real data, we find that our method outperforms existing algorithms, especially in the high-sensitivity region. The performance of our method can be further improved by taking into account the locational preference of binding events. By testing on experimentally verified binding sites of the GATA and HNF transcription factor families, we show that our algorithm can accurately infer the true co-occurring motif pairs, and that by considering the co-occurrences of correlated motifs we not only filter out false predictions but also increase the sensitivity.

AVAILABILITY: An online server based on OSCAR is available at http://bioinfo.au.tsinghua.edu.cn/oscar.

Item Type: Paper
Uncontrolled Keywords: *Algorithms; Amino Acid Motifs; *Artificial Intelligence; Base Sequence; Binding Sites; Molecular Sequence Data; Pattern Recognition, Automated/*methods; Promoter Regions (Genetics)/*genetics; Protein Binding; Regulatory Elements, Transcriptional/*genetics; Sequence Analysis, DNA/*methods; Transcription Factors/*genetics
Subjects: bioinformatics > genomics and proteomics > annotation > gene expression profiling annotation
bioinformatics > genomics and proteomics > annotation > sequence annotation
bioinformatics > computational biology
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > protein structure, function, modification > protein types > transcription factor
CSHL Authors:
Communities: CSHL labs > Zhang lab
Depositing User: CSHL Librarian
Date: 1 November 2007
Date Deposited: 14 Nov 2011 19:44
Last Modified: 28 Mar 2018 14:29
URI: https://repository.cshl.edu/id/eprint/23053
