Semi-quantitative characterisation of mixed pollen samples using MinION sequencing and Reverse Metagenomics (RevMet)

The ability to identify and quantify the constituent plant species that make up a mixed-species sample of pollen has important applications in ecology, conservation, and agriculture. Recently, metabarcoding protocols have been developed for pollen that can identify constituent plant species, but there are strong reasons to doubt that metabarcoding can accurately quantify their relative abundances. A PCR-free, shotgun metagenomics approach has greater potential for accurately quantifying species relative abundances, but applying metagenomics to eukaryotes is challenging due to low numbers of reference genomes. We have developed a pipeline, RevMet (Reverse Metagenomics), that allows reliable and semi-quantitative characterisation of the species composition of mixed-species eukaryote samples, such as bee-collected pollen, without requiring reference genomes. Instead, reference species are represented only by 'genome skims': low-cost, low-coverage, short-read sequence datasets. The skims are mapped to individual long reads sequenced from mixed-species samples using the MinION, a portable nanopore sequencing device, and each long read is uniquely assigned to a plant species. We genome-skimmed 49 wild UK plant species, validated our pipeline with mock DNA mixtures of known composition, and then applied RevMet to pollen loads collected from wild bees. We demonstrate that RevMet can identify plant species present in mixed-species samples at proportions of DNA >1%, with few false positives and false negatives, and reliably differentiate species represented by high versus low amounts of DNA in a sample. The RevMet pipeline could readily be adapted to generate semi-quantitative datasets for a wide range of mixed eukaryote samples, which could include characterising diets, quantifying allergenic pollen from air samples, quantifying soil fauna, and identifying the compositions of algal and diatom communities.
Our per-sample costs were GBP 90 per genome skim and GBP 60 per pollen sample, and new versions of sequencers available now will further reduce these costs.
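
The read-assignment idea in the abstract (skims mapped to individual long reads, each long read uniquely assigned to one species) can be illustrated with a minimal sketch. It assumes the mapping step has already been summarised as, for each long read, the fraction of its bases covered by each reference species' skim reads. The function names, the 0.15 coverage cutoff, the handling of ties, the species labels, and the use of read counts as a proxy for DNA proportion are all illustrative assumptions, not details of the RevMet pipeline. The 1% level comes from the detection range stated in the abstract, but applying it as a reporting filter is also an assumption.

```python
from collections import Counter


def assign_reads(coverage, min_coverage=0.15):
    """Assign each long read to the single best-covered species.

    coverage: {read_id: {species: fraction of the read covered by that
    species' skim reads}}. min_coverage is a hypothetical cutoff.
    Reads with no species above the cutoff, or with a tie for the top
    species, are left unassigned.
    """
    assignments = {}
    for read_id, per_species in coverage.items():
        if not per_species:
            continue
        ranked = sorted(per_species.items(), key=lambda kv: kv[1], reverse=True)
        best_species, best_cov = ranked[0]
        if best_cov < min_coverage:
            continue  # too little of the read is covered by any skim
        if len(ranked) > 1 and ranked[1][1] == best_cov:
            continue  # tie: the assignment would not be unique
        assignments[read_id] = best_species
    return assignments


def species_proportions(assignments, min_proportion=0.01):
    """Proportion of assigned reads per species, keeping those above 1%.

    Read counts stand in for DNA proportion here (an assumption).
    """
    counts = Counter(assignments.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        species: n / total
        for species, n in counts.items()
        if n / total > min_proportion
    }


# Made-up coverage values for two placeholder species.
coverage = {
    "read1": {"species_A": 0.62, "species_B": 0.05},
    "read2": {"species_B": 0.48},
    "read3": {"species_A": 0.03},                      # below the cutoff
    "read4": {"species_A": 0.30, "species_B": 0.30},   # tie, unassigned
    "read5": {"species_A": 0.71},
}
assignments = assign_reads(coverage)
proportions = species_proportions(assignments)
print(assignments)
print(proportions)
```

In this toy input, three of the five reads are assigned and the other two are discarded, which mirrors the requirement that each long read be assigned to one species only.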

Additional Info

Field: Value
Maintainer Email:
Article Host Type: publisher
Article Is Open Access: true
Article License Type: cc-by-nc-nd
Article Version Type: publishedVersion
Citation Report:
DOI: 10.1101/551960
Date Last Updated: 2019-07-28T00:53:07.253437
Evidence: open (via page says license)
Funder code(s):
Journal Is Open Access: false
Open Access Status: hybrid
Publisher URL: