Conference poster. Year: 2010

Using structure recurrence to define protein domains

Abstract

Domains are basic units of protein structure and are essential for exploring protein fold space and structure evolution. With the NIH Protein Structure Initiative and other structural genomics initiatives worldwide, the number of protein structures in the PDB is increasing dramatically, and domain parsing needs to be done automatically. Most existing structural domain parsing programs consider the compactness of the domains and/or the number and strength of internal (intra-domain) versus external (inter-domain) contacts. Here we present a completely different approach. Taking advantage of the growing number of known structures in the PDB, the chains are parsed solely by using the recurrence of similar structures that appear in the structural database. A non-redundant set of 6373 protein chains was selected as the target data set, and 128 benchmark chains from pDomains were used as query chains. For each query chain, one-against-all structure comparisons with the target set were performed using VAST. The VAST cliques were then collected, and the protein residues were clustered using mathematical procedures akin to those used for analyzing microarray data. These clusters define the domains. NDO scores were used to compare the results with SCOP and CATH domain boundaries as well as with those from other parsing programs. Our algorithm gave results comparable to those of several existing programs. It handles segmented domains as well as it handles non-segmented domains. The structures that contribute the cliques defining a domain may contain information about distant evolutionary relationships of the domain.
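
The clustering step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not give the similarity measure, the clustering method, or the format of the VAST output. Here each clique is assumed to be a set of query residue indices, residues are compared by the Jaccard similarity of their clique memberships, and residues are merged by single linkage above a hypothetical threshold. All function and parameter names are invented for the example.

```python
from itertools import combinations


def membership_profiles(chain_length, cliques):
    """Map each residue index to the set of clique ids that cover it."""
    profiles = [set() for _ in range(chain_length)]
    for clique_id, residues in enumerate(cliques):
        for r in residues:
            profiles[r].add(clique_id)
    return profiles


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_residues(chain_length, cliques, threshold=0.5):
    """Group residues whose clique-membership profiles are similar.

    Assumption: single-linkage merging with a fixed Jaccard threshold.
    Residues covered by no clique are left unassigned.
    """
    profiles = membership_profiles(chain_length, cliques)
    parent = list(range(chain_length))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(chain_length), 2):
        if jaccard(profiles[i], profiles[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for r in range(chain_length):
        if profiles[r]:
            clusters.setdefault(find(r), []).append(r)
    return sorted(clusters.values())


# Hypothetical cliques on a 10-residue chain: residues 0-3 and 8-9 recur
# together in the same matches, residues 4-7 recur in different ones.
example_cliques = [
    {0, 1, 2, 3, 8, 9},
    {0, 1, 2, 3, 8, 9},
    {4, 5, 6, 7},
    {4, 5, 6, 7},
]
domains = cluster_residues(10, example_cliques)
```

In this toy input the first cluster is made of two separate sequence segments (0-3 and 8-9), which shows why clustering on recurrence alone places no contiguity constraint on a domain.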
Main file
49824_20120224031122697_1.pdf (40.35 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02758033, version 1 (04-06-2020)

Identifiers

  • HAL Id: hal-02758033, version 1
  • PRODINRA: 49824
  • WOS: 000208762004275

Cite

Chin-Hsien Tai, Sam Vichetra, Jean-François Gibrat, Peter Munson, Byungkook Lee, et al. Using structure recurrence to define protein domains. Biophysical Society 54th Annual Meeting, Feb 2010, San Francisco, United States. pp. CD, 2010, proceedings of the Biophysical Society 54th Annual Meeting. ⟨hal-02758033⟩

Collections

INRA INRAE MATHNUM
11 views
21 downloads
