Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana: The Rice Annotation Project

Itoh, T., Tanaka, T., Barrero, R. A., Yamasaki, C., Fujii, Y., Hilton, P. B., Antonio, B. A., Aono, H., Apweiler, R., Bruskiewich, R., Bureau, T., Burr, F., De Oliveira, A. C., Fuks, G., Habara, T., Haberer, G., Han, B., Harada, E., Hiraki, A. T., Hirochika, H., Hoen, D., Hokari, H., Hosokawa, S., Hsing, Y. I., Ikawa, H., Ikeo, K., Imanishi, T., Ito, Y., Jaiswal, P., Kanno, M., Kawahara, Y., Kawamura, T., Kawashima, H., Khurana, J. P., Kikuchi, S., Komatsu, S., Koyanagi, K. O., Kubooka, H., Lieberherr, D., Lin, Y. C., Lonsdale, D., Matsumoto, T., Matsuya, A., McCombie, W. R., Messing, J., Miyao, A., Mulder, N., Nagamura, Y., Nam, J., Namiki, N., Numa, H., Nurimoto, S., O'Donovan, C., Ohyanagi, H., Okido, T., Oota, S., Osato, N., Palmer, L. E., Quetier, F., Raghuvanshi, S., Saichi, N., Sakai, H., Sakai, Y., Sakata, K., Sakurai, T., Sato, F., Sato, Y., Schoof, H., Seki, M., Shibata, M., Shimizu, Y., Shinozaki, K., Shinso, Y., Singh, N. K., Smith-White, B., Takeda, J. I., Tanino, M., Tatusova, T., Thongjuea, S., Todokoro, F., Tsugane, M., Tyagi, A. K., Vanavichit, A., Wang, A., Wing, R. A., Yamaguchi, K., Yamamoto, M., Yamamoto, N., Yu, Y., Zhang, H., Zhao, Q., Higo, K., Burr, B., Gojobori, T., Sasaki, T. (2007) Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana: The Rice Annotation Project. Genome Research, 17 (2). pp. 175-183. ISSN 1088-9051

URL: http://www.ncbi.nlm.nih.gov/pubmed/17210932
DOI: 10.1101/gr.5509507

Abstract

We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene. ©2007 by Cold Spring Harbor Laboratory Press.

Item Type: Paper
Uncontrolled Keywords: complementary DNA; RNA; amino acid sequence; Arabidopsis; article; comparative study; DNA sequence; evolution; gene duplication; gene function; gene locus; genomics; nonhuman; plant genetics; priority journal; protein analysis; protein domain; protein function; rice; validation process; Arabidopsis Proteins; Codon; Databases, Protein; DNA, Complementary; DNA, Plant; Evolution, Molecular; Genome, Plant; Mutagenesis, Insertional; Open Reading Frames; Oryza sativa; Plant Proteins; RNA, Messenger; RNA, Plant; RNA, Transfer; Species Specificity; Variation (Genetics); Arabidopsis thaliana; Japonica
Subjects: organism description > plant > Arabidopsis
bioinformatics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > DNA, RNA structure, function, modification
bioinformatics > genomics and proteomics > genetics & nucleic acid processing
bioinformatics > genomics and proteomics
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes
bioinformatics > genomics and proteomics > genetics & nucleic acid processing > genomes > genome annotation
organism description > plant
organism description > plant > rice
Communities: CSHL labs > McCombie lab
Depositing User: Matt Covey
Date Deposited: 25 Apr 2013 18:56
Last Modified: 25 Apr 2013 18:56
PMCID: PMC1781349
URI: http://repository.cshl.edu/id/eprint/28206
