Discovering Active Motifs in Sets of Related Protein Sequences and Using Them for Classification

Wang, J. T. L., Marr, T. G., Shasha, D., Shapiro, B. A., Chirn, G. W. (July 1994) Discovering Active Motifs in Sets of Related Protein Sequences and Using Them for Classification. Nucleic Acids Research, 22 (14). pp. 2769-2775. ISSN 0305-1048

Abstract

We describe a method for discovering active motifs in a set of related protein sequences. The method is an automatic two-step process: (1) find candidate motifs in a small sample of the sequences; (2) test whether these motifs are approximately present in all the sequences. To reduce the running time, we develop two optimization heuristics based on statistical estimation and pattern matching techniques. Experimental results obtained by running these algorithms on generated data and on functionally related proteins demonstrate the good performance of the presented method compared with the visual method of O'Farrell and Leopold. By combining the discovered motifs with an existing fingerprint technique, we develop a protein classifier. When we apply the classifier to the 698 groups of related proteins in the PROSITE catalog, it gives information that is complementary to the BLOCKS protein classifier of Henikoff and Henikoff. Thus, using our classifier in conjunction with theirs, one can obtain high-confidence classifications (if BLOCKS and our classifier agree) or suggest a new hypothesis (if the two disagree).
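
The two-step process described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the data structures, the motif form, or the optimization heuristics, so the sketch makes simplifying assumptions. Candidate motifs are taken to be fixed-length substrings of the sample, "approximately present" is taken to mean within a bounded edit distance of some substring, and the sample is simply the first few sequences. All function and parameter names (`candidate_motifs`, `discover_motifs`, `sample_size`, `max_mutations`, `min_occurrence`) are hypothetical.

```python
from typing import List, Optional, Set


def min_edit_distance_to_substring(motif: str, text: str) -> int:
    """Smallest edit distance between `motif` and any substring of `text`.

    Standard dynamic programming for approximate substring matching:
    the first row is all zeros so a match may start anywhere in `text`.
    """
    prev = [0] * (len(text) + 1)
    for i in range(1, len(motif) + 1):
        curr = [i] + [0] * len(text)
        for j in range(1, len(text) + 1):
            cost = 0 if motif[i - 1] == text[j - 1] else 1
            curr[j] = min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # motif residue missing from text
                curr[j - 1] + 1,     # extra residue in text
            )
        prev = curr
    return min(prev)  # a match may end anywhere in `text`


def candidate_motifs(sample: List[str], length: int) -> Set[str]:
    """Step 1 (assumed form): every substring of the given length in the sample."""
    return {
        seq[i:i + length]
        for seq in sample
        for i in range(len(seq) - length + 1)
    }


def discover_motifs(
    sequences: List[str],
    sample_size: int,
    length: int,
    max_mutations: int,
    min_occurrence: Optional[int] = None,
) -> List[str]:
    """Step 2: keep candidates that approximately occur in enough sequences.

    `min_occurrence` defaults to all sequences, following the abstract's
    wording; exposing it as a threshold is an assumption of this sketch.
    """
    if min_occurrence is None:
        min_occurrence = len(sequences)
    sample = sequences[:sample_size]  # assumption: sample = first few sequences
    active = []
    for motif in sorted(candidate_motifs(sample, length)):
        hits = sum(
            1
            for seq in sequences
            if min_edit_distance_to_substring(motif, seq) <= max_mutations
        )
        if hits >= min_occurrence:
            active.append(motif)
    return active
```

For example, with the toy sequences `["ACDEFGHIK", "LMACDQFGP", "WWACDEFGY"]`, a sample of one sequence, motif length 5, and at most one mutation, `discover_motifs` returns `["ACDEF", "CDEFG"]`. This brute-force version checks every candidate against every sequence; the heuristics mentioned in the abstract are aimed at reducing exactly that cost and are not reproduced here.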

Item Type: Paper
Uncontrolled Keywords: NUCLEIC-ACID SEQUENCES NUCLEOTIDE-SEQUENCE PATTERN-RECOGNITION ESCHERICHIA-COLI FIND HOMOLOGIES ALIGNMENT PROGRAM CONSENSUS ALGORITHM IDENTIFICATION
Subjects: bioinformatics > genomics and proteomics > genetics & nucleic acid processing > protein structure, function, modification
bioinformatics > computational biology > algorithms
bioinformatics > computational biology
Communities: CSHL labs
Depositing User: Matt Covey
Date: July 1994
Date Deposited: 04 Aug 2015 20:51
Last Modified: 04 Aug 2015 20:51
PMCID: PMC308246
URI: https://repository.cshl.edu/id/eprint/31447
