Numerical comparison of several approximations of the word count distribution in random sequences - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal article. Journal of Computational Biology. Year: 2001

Numerical comparison of several approximations of the word count distribution in random sequences

Abstract

The exact distribution of word counts in random sequences, along with several approximations, has been proposed in the past few years. The exact distribution has no theoretical limitation but may require prohibitive computation time. Approximate distributions, on the other hand, can be calculated rapidly but are, in practice, accurate only under specific conditions. After surveying these distributions, we compare them in terms of both accuracy and computational cost. Rules are suggested for choosing between the Gaussian approximations, the compound Poisson approximation, and the exact distribution. The work is illustrated by the detection of exceptional words in the phage Lambda genome.
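
To make the trade-off described in the abstract concrete, here is a minimal Python sketch. It is not taken from the article. It assumes the simplest possible model (independent, identically distributed letters), whereas the article may well consider richer models. Under that assumption it computes the mean and variance of the overlapping count of a word, then compares a Gaussian upper-tail approximation with a plain Poisson one. The plain Poisson tail ignores self-overlaps, which is the case the compound Poisson approximation is meant to handle; that approximation is not implemented here. The names `word_prob`, `count_mean_var`, `gaussian_upper_tail` and `poisson_upper_tail`, as well as the sequence length, letter frequencies and observed count in the usage lines, are all hypothetical.

```python
import math


def word_prob(word, freqs):
    """Probability of reading `word` at a fixed position (i.i.d. letters)."""
    p = 1.0
    for ch in word:
        p *= freqs[ch]
    return p


def count_mean_var(word, n, freqs):
    """Mean and variance of the overlapping count of `word` in an
    i.i.d. sequence of length n (assumed model, see lead-in)."""
    k = len(word)
    m = n - k + 1                      # number of possible start positions
    mu = word_prob(word, freqs)
    mean = m * mu
    var = m * mu * (1.0 - mu)          # sum of the Bernoulli variances
    for d in range(1, k):              # shifts at which two occurrences overlap
        if m - d <= 0:
            break
        # The word can overlap itself at shift d only if its suffix of
        # length k - d equals its prefix of the same length.
        overlaps = word[d:] == word[:k - d]
        joint = mu * word_prob(word[k - d:], freqs) if overlaps else 0.0
        var += 2.0 * (m - d) * (joint - mu * mu)
    return mean, var


def gaussian_upper_tail(x, mean, var):
    """Gaussian approximation of P(N >= x), without continuity correction."""
    return 0.5 * math.erfc((x - mean) / math.sqrt(2.0 * var))


def poisson_upper_tail(x, lam):
    """P(N >= x) for a Poisson variable with mean lam (overlaps ignored)."""
    term = math.exp(-lam)              # P(N = 0)
    cdf = 0.0
    for j in range(x):                 # accumulate P(N <= x - 1)
        cdf += term
        term *= lam / (j + 1)
    return max(0.0, 1.0 - cdf)


# Hypothetical usage: uniform letter frequencies, made-up length and count.
freqs = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
mean, var = count_mean_var("GCTGGTGG", 50_000, freqs)
observed = 5
print("Gaussian tail:", gaussian_upper_tail(observed, mean, var))
print("Poisson tail: ", poisson_upper_tail(observed, mean))
```

For a rare word such as the one in the usage lines, the two tail estimates can differ by orders of magnitude. This illustrates why the choice between approximations matters when flagging exceptional words.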

No file deposited

Dates and versions

hal-02675878, version 1 (31-05-2020)

Identifiers

  • HAL Id: hal-02675878, version 1
  • PRODINRA: 39661
  • WOS: 000171024100001

Cite

Stephane S. Robin, Sophie S. Schbath. Numerical comparison of several approximations of the word count distribution in random sequences. Journal of Computational Biology, 2001, 8 (4), pp.349-359. ⟨hal-02675878⟩