Conference paper, Year: 2001

Sentence filtering for information extraction in genomics, a classification problem

Abstract

In some domains, Information Extraction (IE) from texts requires syntactic and semantic parsing. This analysis is computationally expensive, and IE is potentially noisy if it is applied to the whole set of documents when the relevant information is sparse. A preprocessing phase that selects the potentially relevant fragments increases the efficiency of the IE process. This phase has to be fast and based on a shallow description of the texts. We applied various classification methods (IVI, a Naive Bayes learner, and C4.5) to this fragment filtering task in the domain of functional genomics. This paper describes the results of this study. We show that the IVI and Naive Bayes methods with feature selection give the best results, compared both with the same methods without feature selection and with C4.5.
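As a rough illustration of the kind of filter the abstract describes, the sketch below trains a Naive Bayes classifier on a shallow bag-of-words description of sentences after a feature selection step. It is not the authors' implementation: the toy sentences, the function names, the selection score (difference in per-class document frequency), and the choice of a multinomial model with Laplace smoothing are all assumptions made for this example. IVI and C4.5 are not shown.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Shallow description: lowercase whitespace tokens, no parsing.
    return sentence.lower().split()


def select_features(sentences, labels, k):
    """Keep the k words whose document frequency differs most between classes.

    This score is an assumption for the example, not the paper's criterion.
    """
    pos, neg = Counter(), Counter()
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    for s, y in zip(sentences, labels):
        for w in set(tokenize(s)):
            (pos if y else neg)[w] += 1
    vocab = set(pos) | set(neg)
    score = {w: abs(pos[w] / n_pos - neg[w] / n_neg) for w in vocab}
    # Ties are broken alphabetically so the selection is deterministic.
    ranked = sorted(vocab, key=lambda w: (-score[w], w))
    return set(ranked[:k])


def train_naive_bayes(sentences, labels, features):
    """Multinomial Naive Bayes with Laplace smoothing over selected features."""
    counts = {True: Counter(), False: Counter()}
    n_docs = Counter(labels)
    for s, y in zip(sentences, labels):
        counts[y].update(w for w in tokenize(s) if w in features)
    model = {}
    for y in (True, False):
        total = sum(counts[y].values()) + len(features)
        model[y] = {
            "prior": math.log(n_docs[y] / len(labels)),
            "loglik": {w: math.log((counts[y][w] + 1) / total) for w in features},
        }
    return model


def is_relevant(model, sentence):
    """Return True if the 'relevant' class has the higher log-probability."""
    scores = {}
    for y, params in model.items():
        scores[y] = params["prior"] + sum(
            params["loglik"][w] for w in tokenize(sentence) if w in params["loglik"]
        )
    return scores[True] > scores[False]


# Hypothetical toy data: True marks a sentence worth passing on to IE.
SENTENCES = [
    "GerE stimulates cotD transcription",
    "sigK activates the expression of gerE",
    "spoIIID represses transcription of cotD",
    "cells were grown at 37 degrees",
    "the samples were stored overnight",
    "we thank the reviewers for comments",
]
LABELS = [True, True, True, False, False, False]

FEATURES = select_features(SENTENCES, LABELS, k=5)
MODEL = train_naive_bayes(SENTENCES, LABELS, FEATURES)

if __name__ == "__main__":
    for text in ("sigK controls transcription of gerE",
                 "plates were incubated overnight"):
        print(text, "->", is_relevant(MODEL, text))
```

In a pipeline of this shape, only the sentences for which `is_relevant` returns True would be sent on to the expensive syntactic and semantic analysis.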
Main file
44074_20111116035734131_1.pdf (73.13 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02764043, version 1 (04-06-2020)

Identifiers

  • HAL Id: hal-02764043, version 1
  • PRODINRA: 44074

Cite

Claire Nédellec, Mohamed Ould Abdel Vetah, Philippe Bessières. Sentence filtering for information extraction in genomics, a classification problem. 5th European Conference, PKDD'2001, Sep 2001, Freiburg, Germany. ⟨hal-02764043⟩