DWE: discriminating word enumerator

Sumazin, P., Chen, G., Hata, N., Smith, A. D., Zhang, T., Zhang, M. Q. (January 2005) DWE: discriminating word enumerator. Bioinformatics, 21 (1). pp. 31-8. ISSN 1367-4803 (Print)

URL: http://www.scopus.com/record/display.url?eid=2-s2....
DOI: 10.1093/bioinformatics/bth471


MOTIVATION: Tissue-specific transcription factor binding sites give insight into tissue-specific transcription regulation.

RESULTS: We describe a word-counting-based tool for de novo tissue-specific transcription factor binding site discovery using expression information in addition to sequence information. We incorporate tissue-specific gene expression through gene classification to positive expression and repressed expression. We present a direct statistical approach to find overrepresented transcription factor binding sites in a foreground promoter sequence set against a background promoter sequence set. Our approach naturally extends to synergistic transcription factor binding site search. We find putative transcription factor binding sites that are overrepresented in the proximal promoters of liver-specific genes relative to proximal promoters of liver-independent genes. Our results indicate that binding sites for hepatocyte nuclear factors (especially HNF-1 and HNF-4) and CCAAT/enhancer-binding protein (C/EBPbeta) are the most overrepresented in proximal promoters of liver-specific genes. Our results suggest that HNF-4 has strong synergistic relationships with HNF-1, HNF-4 and HNF-3beta and with C/EBPbeta.

AVAILABILITY: Programs are available for use over the Web at http://rulai.cshl.edu/tools/dwe.
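
ILLUSTRATION: The foreground-versus-background word counting described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the DWE implementation: the abstract does not specify the statistic, so a one-sided hypergeometric test is swapped in here as a common choice, and the function names, the word length k=6 and the per-sequence presence counting are all assumptions made for the example.

```python
from math import comb


def words_in(seq, k):
    """Return the set of distinct k-letter words that occur in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hypergeom_tail(fg_hits, n_fg, total_hits, n_total):
    """P(X >= fg_hits) for a hypergeometric draw (assumed statistic).

    n_fg sequences are drawn from n_total sequences, of which
    total_hits contain the word.
    """
    upper = min(n_fg, total_hits)
    numerator = sum(
        comb(total_hits, x) * comb(n_total - total_hits, n_fg - x)
        for x in range(fg_hits, upper + 1)
    )
    return numerator / comb(n_total, n_fg)


def enumerate_discriminating_words(foreground, background, k=6):
    """Rank words by overrepresentation in foreground vs. background.

    Hypothetical helper: counts each word at most once per sequence
    and returns (p_value, word, fg_hits, bg_hits) tuples, smallest
    p-value first.
    """
    fg_sets = [words_in(s, k) for s in foreground]
    bg_sets = [words_in(s, k) for s in background]
    n_fg = len(fg_sets)
    n_total = n_fg + len(bg_sets)

    results = []
    for word in set().union(*fg_sets):
        fg_hits = sum(word in s for s in fg_sets)
        bg_hits = sum(word in s for s in bg_sets)
        p = hypergeom_tail(fg_hits, n_fg, fg_hits + bg_hits, n_total)
        results.append((p, word, fg_hits, bg_hits))
    return sorted(results)
```

Calling enumerate_discriminating_words(liver_promoters, other_promoters), with two lists of promoter sequence strings supplied by the user, returns candidate words ordered from most to least overrepresented. A real analysis would also need to handle reverse complements, degenerate words and multiple-testing correction, none of which this sketch attempts.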

Item Type: Paper
Uncontrolled Keywords: Algorithms; Amino Acid Motifs; Animals; Binding Sites; Consensus Sequence; Conserved Sequence; Gene Expression Profiling/methods; Humans; Liver/metabolism; Nuclear Proteins/chemistry/metabolism; Promoter Regions (Genetics)/physiology; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Software; Transcription Factors/chemistry/metabolism
Subjects: bioinformatics > genomics and proteomics > databases > database construction
bioinformatics > genomics and proteomics > databases > database optimization
bioinformatics > computational biology
bioinformatics > genomics and proteomics > computers > computer software
bioinformatics > genomics and proteomics > databases > databases
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > DNA expression > promoter
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > protein structure, function, modification > protein types > transcription factor
CSHL Authors:
Communities: CSHL labs > Zhang lab
Depositing User: CSHL Librarian
Date: 1 January 2005
Date Deposited: 05 Jan 2012 15:17
Last Modified: 05 Jan 2012 15:17
URI: http://repository.cshl.edu/id/eprint/22719
