Identifying discriminative classification-based motifs in biological sequences - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article, Bioinformatics, Year: 2011

Identifying discriminative classification-based motifs in biological sequences

Abstract

Motivation: Identifying conserved motifs in biological sequences is crucial for unveiling shared functions. Many tools exist for motif identification, including some that allow degenerate positions with multiple possible nucleotides or amino acids. The most efficient methods available today search for conserved motifs in a set of sequences, but do not check their specificity with respect to a set of negative sequences.

Results: We present a tool to identify degenerate motifs, based on a given classification of amino acids according to their physico-chemical properties. It returns the top K motifs that are most frequent in a positive set of sequences involved in a biological process of interest, and absent from a negative set. Thus, our method discovers discriminative motifs in biological sequences that may be used to identify new sequences involved in the same process. We used this tool to identify candidate effector proteins secreted into plant tissues by the root-knot nematode Meloidogyne incognita. Our tool identified a series of motifs specifically present in a positive set of known effectors while totally absent from a negative set of evolutionarily conserved housekeeping proteins. Scanning the proteome of M. incognita, we detected 2579 proteins that contain these specific motifs and can be considered as new putative effectors.
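The selection criterion described in the abstract (frequent in the positive set, absent from the negative set, ranked to keep the top K) can be illustrated with a small brute-force sketch. This is not the authors' tool or algorithm: the amino acid classes in `CLASSES`, the fixed motif length `k`, and the function names are all assumptions made for illustration, and the sketch only handles motifs written entirely at the class level.

```python
from collections import Counter

# Hypothetical, simplified physico-chemical classes (an assumption for
# illustration only; not the classification used in the paper).
CLASSES = {
    "h": "AVLIMFWC",  # hydrophobic
    "p": "STNQYGP",   # polar / small
    "+": "KRH",       # positively charged
    "-": "DE",        # negatively charged
}
AA_TO_CLASS = {aa: label for label, aas in CLASSES.items() for aa in aas}


def encode(seq):
    """Rewrite a protein sequence as a string of class labels ('x' = unknown)."""
    return "".join(AA_TO_CLASS.get(aa, "x") for aa in seq.upper())


def class_kmers(seq, k):
    """Return the set of distinct class-level k-mers occurring in one sequence."""
    enc = encode(seq)
    windows = (enc[i:i + k] for i in range(len(enc) - k + 1))
    return {w for w in windows if "x" not in w}


def top_discriminative_motifs(positives, negatives, k=4, top_k=5):
    """Rank class-level k-mers by the number of positive sequences that
    contain them, keeping only k-mers absent from every negative sequence."""
    forbidden = set()
    for seq in negatives:
        forbidden |= class_kmers(seq, k)

    support = Counter()
    for seq in positives:
        # Each sequence contributes at most once per motif.
        support.update(class_kmers(seq, k) - forbidden)

    # Highest support first; ties broken alphabetically for determinism.
    ranked = sorted(support.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


def contains_motif(seq, motif):
    """Scan a new sequence for a class-level motif."""
    return motif in encode(seq)
```

A call such as `top_discriminative_motifs(positive_seqs, negative_seqs, k=4, top_k=10)` returns `(motif, support)` pairs, and `contains_motif` can then be applied to every protein in a proteome to flag candidates. Exhaustive enumeration of fixed-length windows is chosen here only for brevity; it does not scale to variable-length motifs or to motifs that mix concrete residues with classes.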
Main file
47554_20110915044507777_1.pdf (109.68 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02652585, version 1 (29-05-2020)

Cite

Celine Vens, Marie-Noelle Rosso, Etienne Danchin. Identifying discriminative classification-based motifs in biological sequences. Bioinformatics, 2011, 27 (9), pp.1231-1238. ⟨10.1093/bioinformatics/btr110⟩. ⟨hal-02652585⟩