Conference paper, Year: 2009

GnpAnnot community annotation system: features, qualifiers, values

Stéphanie Sidibe-Bocs
Fabrice Legeai
Gaëtan Droc
Mathieu Rouard
Michael M. Alaux
Nancy N. Terrier
Olivier Garsmeur
A. Simon
Claire Hoede
Hadi Quesneville

Abstract

As of January 2009, 991 complete genomes had been published and 3376 genome sequencing projects were ongoing, producing an explosion of data that needs to be stored, curated and analyzed. GnpAnnot is a green genomics project that aims to develop a system for structural and functional annotation, supported by comparative genomics and dedicated to plant and bio-aggressor genomes, allowing both automatic prediction and manual curation of genomic objects. The core of GnpAnnot is a community annotation system (CAS) based on GMOD components: Chado, GBrowse, Apollo and Artemis. The system should also make it possible to browse comparative genomics results, build queries, and export gene lists and gene reports in various formats. It should support annotation reconciliation, history, integrity, consistency and updates, as well as the management of public and private projects. To facilitate the work of curators, four steps are crucial: 1. provide homogeneous features, qualifiers and values for genomic objects; 2. share a robust CAS that runs high-quality combiners and pipelines to predict genomic objects automatically, stores them in a relational database management system, and makes them available through fast graphical and textual browsers and powerful editors; 3. define annotation rules, train the annotators and organize annotation jamborees; 4. submit the results to public sequence knowledge bases in an easy way. In this work we focus on the first and third steps. We describe a mapping between several known sources: the Sequence Ontology, the DDBJ / EMBL / GenBank feature definitions, GFF3, Chado, gene nomenclatures, transposable element classification, and annotation guidelines from various genome project consortia. We propose homogeneous feature keys, qualifiers and value formats, using controlled vocabularies wherever possible, for genes and transposable elements. We also propose rules for annotating, in a coherent way, the structure and function of genes and the structure and classification of transposable elements. These rules could be useful for both automatic prediction and manual curation. Examples of annotations on a BAC sequence of a monocot are presented.
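
The kind of mapping described above, from DDBJ / EMBL / GenBank feature keys and qualifiers to Sequence Ontology types and GFF3 attributes, can be sketched as follows. This is a minimal illustration, not the GnpAnnot mapping itself: the two lookup tables, the function names, and the sample identifiers (`BAC01`, `TE0001`) are assumptions made for the example. Only the nine-column GFF3 layout and the percent-escaping of reserved characters in the attributes column follow the GFF3 format.

```python
# Hypothetical lookup tables for illustration only; the actual GnpAnnot
# mapping is not given in the abstract.
FEATURE_KEY_TO_SO = {
    "gene": "gene",
    "mRNA": "mRNA",
    "CDS": "CDS",
    "mobile_element": "transposable_element",
}
QUALIFIER_TO_GFF3 = {
    "locus_tag": "ID",
    "gene": "Name",
    "note": "Note",
    "db_xref": "Dbxref",
}


def gff3_escape(value):
    """Percent-escape characters that are reserved in GFF3 attribute values."""
    # '%' is handled first so that already-escaped text is not escaped twice.
    for ch in "%;=&,\t\n":
        value = value.replace(ch, "%{:02X}".format(ord(ch)))
    return value


def to_gff3_line(seqid, source, feature_key, start, end, strand,
                 qualifiers, phase="."):
    """Build one tab-separated GFF3 line from a feature key and its qualifiers.

    Raises KeyError if the feature key has no entry in the (assumed)
    controlled vocabulary, so unmapped keys are not silently passed through.
    """
    so_type = FEATURE_KEY_TO_SO[feature_key]
    attributes = []
    for qualifier, value in qualifiers.items():
        # Qualifiers without a mapping keep their own name as the tag.
        tag = QUALIFIER_TO_GFF3.get(qualifier, qualifier)
        attributes.append("{}={}".format(tag, gff3_escape(value)))
    columns = [seqid, source, so_type, str(start), str(end),
               ".", strand, phase, ";".join(attributes)]
    return "\t".join(columns)


if __name__ == "__main__":
    print(to_gff3_line(
        "BAC01", "manual", "mobile_element", 100, 5200, "+",
        {"locus_tag": "TE0001", "note": "LTR retrotransposon; Gypsy"},
    ))
```

Keeping the feature-key and qualifier vocabularies in explicit tables, and rejecting unknown feature keys, is one simple way to enforce homogeneous values across automatic predictions and manual curation.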
No file deposited

Dates and versions

hal-02758372, version 1 (04-06-2020)

Identifiers

  • HAL Id: hal-02758372, version 1
  • PRODINRA: 33460

Cite

Stéphanie Sidibe-Bocs, Fabrice Legeai, Gaëtan Droc, Mathieu Rouard, Michael M. Alaux, et al. GnpAnnot community annotation system: features, qualifiers, values. 3rd International Biocuration Conference, Apr 2009, Berlin, Germany. ⟨hal-02758372⟩