Davuluri, R. V., Suzuki, Y., Sugano, S., Zhang, M. Q. (November 2000) CART classification of human 5' UTR sequences. Genome Research, 10 (11). pp. 1807-1816. ISSN 1088-9051
Abstract
A nonredundant database of 2312 full-length human 5'-untranslated regions (UTRs) was carefully prepared using state-of-the-art experimental and computational technologies. A comprehensive computational analysis of these data was conducted to characterize the 5' UTR features. Classification and regression tree (CART) analysis was used to classify the data into three distinct classes. Class I consists of mRNAs that are believed to be poorly translated, with long 5' UTRs filled with potential inhibitory features. Class II consists of terminal oligopyrimidine tract (TOP) mRNAs that are regulated in a growth-dependent manner, and class III consists of mRNAs with favorable 5' UTR features that may help efficient translation. The most accurate tree we found has 92.5% classification accuracy as estimated by cross validation. The classification model included the presence of TOP, a secondary structure, 5' UTR length, and the presence of upstream AUGs (uAUGs) as the most relevant variables. The present classification and characterization of the 5' UTRs provide precious information for better understanding the translational regulation of human mRNAs. Furthermore, this database and classification can help people build better computational models for predicting the 5'-terminal exon and separating the 5' UTR from the coding region.
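As a rough illustration of the kind of sequence-derived variables the abstract names (5' UTR length, upstream AUGs, and the presence of a TOP), the sketch below computes three of them from a raw 5' UTR string. It is not the authors' pipeline: the names `UtrFeatures` and `extract_features`, and the TOP rule used here (a C at position 1 followed by an uninterrupted pyrimidine run, at least `min_top_len` bases in total), are assumptions made for the example. Secondary structure is left out because it would need an RNA folding program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UtrFeatures:
    """Hypothetical container for three of the variables named in the abstract."""
    length: int       # 5' UTR length in nucleotides
    uaug_count: int   # number of upstream AUG (ATG) triplets in the UTR
    has_top: bool     # True if the UTR starts with a TOP-like pyrimidine run


def extract_features(utr: str, min_top_len: int = 5) -> UtrFeatures:
    """Compute simple features from a 5' UTR given as DNA or RNA letters.

    Assumption: a TOP is called when the sequence starts with C and the
    leading uninterrupted run of pyrimidines (C/T) is at least
    `min_top_len` bases long. The threshold is illustrative only.
    """
    # Normalize to upper-case DNA so AUG and ATG are treated alike.
    seq = utr.upper().replace("U", "T")

    # "ATG" cannot overlap itself, so a plain non-overlapping count is exact.
    uaug_count = seq.count("ATG")

    # Measure the pyrimidine run that begins at the first base.
    run = 0
    for base in seq:
        if base in "CT":
            run += 1
        else:
            break
    has_top = seq.startswith("C") and run >= min_top_len

    return UtrFeatures(length=len(seq), uaug_count=uaug_count, has_top=has_top)


if __name__ == "__main__":
    # Made-up toy sequences, not entries from the paper's database.
    print(extract_features("CTTTCCGGAUGCC"))
    print(extract_features("GGCAUGAUGA"))
```

Features of this shape could then be passed to any decision-tree learner; which splits and thresholds a tree chooses would depend entirely on the training data.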
Item Type: | Paper |
---|---|
Uncontrolled Keywords: | EUKARYOTIC MESSENGER-RNAS TRANSLATIONAL CONTROL SECONDARY STRUCTURE PROTEIN-SYNTHESIS CODON USAGE INITIATION EFFICIENCY EXPRESSION POSITION-+5 PREDICTION |
Subjects: | bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification; bioinformatics > genomics and proteomics > databases; bioinformatics > genomics and proteomics; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > mRNA dynamics; bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification > mRNA |
Communities: | CSHL labs > Zhang lab |
Depositing User: | Matt Covey |
Date: | November 2000 |
Date Deposited: | 31 Jan 2014 16:30 |
Last Modified: | 31 Jan 2014 16:30 |
PMCID: | PMC310970 |
URI: | https://repository.cshl.edu/id/eprint/29361 |