DOI for manuscript: https://doi.org/10.12688/wellcomeopenres.19155.1
To reproduce my analysis or apply the pipeline to your host of interest, please follow the tutorial in my Jupyter Book, available here: https://aidanfoo96.github.io/MINUUR/ :)
MINUUR is a Snakemake pipeline I developed to extract non-host sequencing reads from mosquito whole genome sequencing (WGS) data and apply a range of metagenomic analyses to characterise potential host-associated microbes. It can also be applied to other host-associated WGS data. MINUUR aims to leverage pre-existing WGS data to recover microbial information about host-associated microbiomes.
MINUUR utilises:
- Kraken2: Classify taxa from unmapped read sequences
- KrakenTools: Extract classified reads for downstream analysis
- Bracken: Re-estimate taxonomic abundance from Kraken2 output
- MetaPhlAn3: Classify taxa using marker genes
- MEGAHIT: Assemble metagenomes from unmapped reads
- QUAST: Report assembly statistics for MEGAHIT assemblies
- MetaBAT2: Bin contiguous sequences from MEGAHIT
- CheckM: Assess the quality of MetaBAT2 bins
MINUUR is run using the workflow manager Snakemake.
Snakemake is best installed using the package manager Mamba.
Once Mamba is installed, run:
mamba create -c bioconda -c conda-forge --name snakemake snakemake
Use git clone https://github.com/aidanfoo96/MINUUR/ and cd MINUUR/workflow. This is the reference point from which the pipeline will be run. See the Jupyter Book page for a full tutorial on establishing the configuration to run this pipeline.
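The installation steps above can be collected into one shell function. This is only a minimal sketch, assuming Mamba and git are already on your PATH. The function name `setup_minuur` and the variables `MINUUR_REPO` and `MINUUR_ENV` are hypothetical names introduced here for illustration and are not part of MINUUR.

```shell
# Values taken from this README
MINUUR_REPO="https://github.com/aidanfoo96/MINUUR/"
MINUUR_ENV="snakemake"

# setup_minuur is a hypothetical helper name, not something MINUUR provides
setup_minuur() {
    # Create an environment containing Snakemake (same command as in this README)
    mamba create -c bioconda -c conda-forge --name "$MINUUR_ENV" snakemake || return 1

    # Fetch the pipeline
    git clone "$MINUUR_REPO" || return 1

    # Move to the directory the pipeline is run from
    cd MINUUR/workflow || return 1
}
```

Calling `setup_minuur` in an interactive shell should leave you in MINUUR/workflow. Activating the environment afterwards (for example with `conda activate snakemake`) is an assumption not spelled out here; the Jupyter Book tutorial covers the configuration and the actual run command.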
- Added GitHub Actions
- Dummy dataset now included in workflow/data; a tutorial for running this is included in the Jupyter Book page. Use this to ensure the pipeline works on your machine.
- Added the option to run BUSCO to help assess eukaryotic contamination in MAGs
For feedback or bug reports, please open an issue or contact: aidan.foo@lstmed.ac.uk