Computational analysis of protein tyrosine phosphatases: practical guide to bioinformatics and data resources

Andersen, J. N., Del Vecchio, R. L., Kannan, N., Gergel, J., Neuwald, A. F., Tonks, N. K. (January 2005) Computational analysis of protein tyrosine phosphatases: practical guide to bioinformatics and data resources. Methods, 35 (1). pp. 90-114. ISSN 1046-2023 (Print)

DOI: 10.1016/j.ymeth.2004.07.012


The exponential growth of sequence data has become a challenge to database curators and end-users alike, and biologists seeking to utilize the data effectively are faced with numerous analysis methods. Here, with practical examples from our bioinformatics analysis of the protein tyrosine phosphatases (PTPs), we show how computational analysis can be exploited to fuel hypothesis-driven experimental research through the exploration of online databases. We cover the following elements: (i) similarity searches and strategies to collect a non-redundant database of tyrosine-specific PTP domains; (ii) utilization of this database to classify human, fly, and worm PTPs (based on alignments and phylogenetic analysis); (iii) three-dimensional structural analysis to identify conserved regions (structure-function) and non-conserved selectivity-determining regions (substrate specificity); and (iv) genomic analysis, including mapping of exon structure, identification of pseudogenes, and exploration of disease databases. We discuss the importance of manual curation, illustrating examples in which pseudogenes give rise to predicted proteins in GenBank, and note that domain servers, such as PFAM and SMART, erroneously include dual-specificity and lipid phosphatases in their collection of tyrosine-specific PTPs. To capitalize on our annotated set of 402 PTP domains (from 47 species and five phyla), we identify sequence conservation across taxonomic categories and explore structure-function relationships among tandem domain receptor-like PTPs. We define three Src homology 2 domain-containing PTP genes in stingray, zebrafish, and fugu and speculate on their evolutionary relationship with human pseudogenes. Our annotated sequences, along with a web service for phylogenetic classification of PTP domains, are available online.

Item Type: Paper
Uncontrolled Keywords: Amino Acid Sequence; Animals; Computational Biology methods; Databases Protein; Genome Human; Humans; Protein Structure Tertiary; Protein-Tyrosine-Phosphatase genetics; Sequence Analysis Protein; Sequence Homology
Subjects: bioinformatics > genomics and proteomics > annotation > sequence annotation
bioinformatics > genomics and proteomics > Mapping and Rendering > Sequence Rendering
bioinformatics > computational biology
Communities: CSHL labs > Tonks lab
Depositing User: CSHL Librarian
Date: January 2005
Date Deposited: 17 Jan 2012 15:27
Last Modified: 21 Feb 2017 20:49