Journal article. Molecular Ecology Resources, Year: 2015

Genomic exploration and molecular marker development in a large and complex conifer genome using RADseq and mRNAseq

Abstract

We combined restriction site associated DNA sequencing (RADseq) using a hypomethylation-sensitive enzyme and messenger RNA sequencing (mRNAseq) to develop molecular markers for the 16 gigabase genome of Cedrus atlantica, a conifer tree species. With each method, Illumina reads from one individual were used to generate de novo assemblies. Single nucleotide polymorphisms (SNPs) from the RADseq data set were detected in a panel consisting of one individual and three pools of three individuals each. We developed a flexible script to estimate the ascertainment bias in SNP detection, considering the effects of pooling and sampling on the probability of not detecting an existing polymorphism. Gene Ontology (GO) and transposable element (TE) search analyses were applied to both data sets. The RADseq and the mRNAseq assemblies represented 0.1% and 0.6% of the genome, respectively. Genome complexity reduction resulted in 17% of the RADseq contigs potentially coding for proteins. This rate was doubled in the mRNAseq data set, suggesting that RADseq also explores noncoding low-repeat regions. The two methods gave very similar GO-slim profiles. As expected, the two assemblies were poor in TE-like sequences (<4% of contig length). We identified 17,348 SNPs in the RADseq data set and 5,714 simple sequence repeats (SSRs) in the transcriptome. A subset of 282 SNPs was validated using the Fluidigm genotyping technology, giving a conversion rate of 50.4%, which falls within the expected range for conifers. Increasing sample size had the greatest effect on reducing ascertainment bias. These results validated the utility of the RADseq approach for highly complex genomes such as those of conifers.
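The sampling component of the ascertainment bias mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' script: the function name `prob_snp_missed` is hypothetical, and the model assumes a biallelic SNP, random sampling of chromosomes from diploid individuals, and no sequencing or pooling error. Under those assumptions, a polymorphism goes undetected when every sampled chromosome carries the same allele.

```python
def prob_snp_missed(maf: float, n_individuals: int) -> float:
    """Probability that a biallelic SNP with minor allele frequency `maf`
    looks monomorphic in a random sample of diploid individuals.

    Assumption (not from the source): chromosomes are sampled
    independently and read depth or pooling effects are ignored.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("maf must lie between 0 and 0.5")
    if n_individuals < 1:
        raise ValueError("n_individuals must be at least 1")
    n_chromosomes = 2 * n_individuals
    # The SNP is missed if all chromosomes carry the major allele
    # or all carry the minor allele.
    return (1.0 - maf) ** n_chromosomes + maf ** n_chromosomes


# Compare a few panel sizes; 10 matches the panel described above
# (one individual plus three pools of three individuals).
for n in (1, 4, 10, 20):
    print(n, round(prob_snp_missed(0.05, n), 3))
```

The probability of missing a rare allele drops as more individuals are added, which is in line with the abstract's statement that sample size had the greatest effect on bias reduction. A fuller treatment would also model read depth within each pool.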

Dates and versions

hal-01175638 , version 1 (10-07-2015)

Cite

Marie-Joe Karam, Francois Lefèvre, M Bou Dagher-Kharrat, S Pinosio, G.G. Vendramin. Genomic exploration and molecular marker development in a large and complex conifer genome using RADseq and mRNAseq. Molecular Ecology Resources, 2015, 15 (3), pp.601-612. ⟨10.1111/1755-0998.12329⟩. ⟨hal-01175638⟩