Home
Welcome to the MINUUR wiki
Developed by Aidan Foo Email: 248064@lstmed.ac.uk
Here we list the outputs of MINUUR
fastqc
fastqc report per sample of the raw unmapped sequences
fastqc_trimmed
fastqc report per sample of trimmed sequences
trimmed_fastq
trimmed fastq files
If your input was a paired fastq file, this folder contains coordinate-sorted BAM files of your raw reads aligned against the reference genome of your choosing.
A folder listing a simple set of alignment statistics for your data, including reads mapped, reads unmapped and raw total sequences. There is also a concatenated .txt file of the alignment statistics for all your samples.
A folder listing a detailed summary of the bowtie2 alignment, generated using samtools.
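The alignment statistics described above can be read programmatically. The following is a minimal sketch, assuming the statistics files contain the tab-separated `SN` summary lines written by `samtools stats`; the function name `parse_sn_lines` and the example values are hypothetical and not part of MINUUR.

```python
def parse_sn_lines(text):
    """Collect 'SN' summary lines (samtools stats style) into a dict."""
    stats = {}
    for line in text.splitlines():
        if not line.startswith("SN\t"):
            continue  # skip comments and other sections
        fields = line.split("\t")
        key = fields[1].rstrip(":")
        value = fields[2]
        # Counts are integers; rates such as the error rate are floats.
        stats[key] = int(value) if value.isdigit() else float(value)
    return stats


# Made-up example input, for illustration only.
example = (
    "SN\traw total sequences:\t1000\n"
    "SN\treads mapped:\t900\n"
    "SN\treads unmapped:\t100\n"
)
stats = parse_sn_lines(example)
unmapped_fraction = stats["reads unmapped"] / stats["raw total sequences"]
print(stats, unmapped_fraction)
```

The unmapped fraction is a quick check on how many reads are carried forward to classification.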
Kraken2 classification of unmapped reads
combined_report
: concatenated file of all taxonomic classifications from all samples
mpa_out
: mpa style report of Kraken2 classifications
mpa_report
: mpa style report with unclassified reads
report
: FINISH
classified_summary
: .txt file per sample showing proportion of reads classified and proportion of reads unclassified
concatenated_kraken_summary.txt
: concatenated file of kraken summaries
plots
:
-> classified_proportions.pdf
: proportion of reads classified and unclassified per sample.
-> species_heatmap.pdf
: log number of reads classified across species with a user provided read threshold
-> genus_heatmap.pdf
: log number of reads classified across genus with a user provided read threshold
-> species_spatial_plot.pdf
: Spatial plot of raw number of classified reads against each species
-> genus_spatial_plot.pdf
: Spatial plot of raw number of classified reads against each genus
-> stratified_heatmap.pdf
: species stratified heatmap with log number of reads
tables
: summary tables. classified_reads_table.txt gives the study in question and its total number of classified reads. species_table_tidy.txt is a summarized table of the species classified, the number of associated reads and the sample in question. kingdom_table_tidy and genus_table_tidy give the same information at these taxonomic levels.
kraken_taxon_extract
: paired fastq files of the user-specified taxon/taxa of interest. These can be used for further analysis, e.g. isolate genome assemblies or gene calling.
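The classified and unclassified proportions reported in classified_summary can be reproduced from a Kraken2 report. This is a minimal sketch, assuming the standard six-column, tab-separated Kraken2 report in which the `U` row holds unclassified reads and the `R` row holds all reads under root; the function name and the example report are hypothetical, and MINUUR's own calculation may differ.

```python
def classified_proportions(report_text):
    """Return (classified, unclassified) proportions from a Kraken2-style report."""
    classified = 0
    unclassified = 0
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # not a report row
        clade_reads = int(fields[1])  # reads covered by this clade
        rank = fields[3].strip()
        if rank == "U":
            unclassified = clade_reads
        elif rank == "R":
            classified = clade_reads
    total = classified + unclassified
    if total == 0:
        return 0.0, 0.0
    return classified / total, unclassified / total


# Made-up report rows, for illustration only.
example_report = (
    " 25.00\t250\t250\tU\t0\tunclassified\n"
    " 75.00\t750\t10\tR\t1\troot\n"
    " 70.00\t700\t0\tD\t2\t  Bacteria\n"
)
prop_classified, prop_unclassified = classified_proportions(example_report)
print(prop_classified, prop_unclassified)
```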
Reestimation of Kraken2 classified reads to infer taxonomic relative abundance
bracken_out
: .txt files with Kraken2 classified taxonomic IDs, kraken assigned reads, added reads, new estimated number of reads and fraction of total reads.
concat_bracken_out
: concatenated file of all samples and Bracken added reads
plots
:
--> added_reads_plot.pdf
: bar graph of Bracken added reads per sample
--> FINISH
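The columns listed for bracken_out relate to each other in a simple way: the new estimated reads are the Kraken2 assigned reads plus the reads Bracken added, and the fraction is each taxon's share of the total. The sketch below assumes the usual Bracken column headers (`kraken_assigned_reads`, `added_reads`, `new_est_reads`); the taxon names and counts are made up, and the helper names are hypothetical.

```python
import csv
import io


def read_bracken(text):
    """Parse a tab-separated Bracken-style table into a list of dicts."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        rows.append({
            "name": row["name"],
            "kraken_assigned_reads": int(row["kraken_assigned_reads"]),
            "added_reads": int(row["added_reads"]),
            "new_est_reads": int(row["new_est_reads"]),
        })
    return rows


def read_fractions(rows):
    """Share of the total re-estimated reads per taxon."""
    total = sum(r["new_est_reads"] for r in rows)
    if total == 0:
        return {}
    return {r["name"]: r["new_est_reads"] / total for r in rows}


# Made-up table, for illustration only.
example = (
    "name\tkraken_assigned_reads\tadded_reads\tnew_est_reads\n"
    "taxon_A\t80\t20\t100\n"
    "taxon_B\t250\t50\t300\n"
)
rows = read_bracken(example)
fractions = read_fractions(rows)
print(fractions)
```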
MetaPhlAn3 taxonomic classifications.
bowtie2_aln
: .bz2 alignment files, which can be used for further analysis or for quickly repeating the MetaPhlAn3 taxonomic classification
clean_summaries
: clean taxonomic summaries at species, genus or kingdom level, each listing the sample the taxon originated from and its relative abundance.
taxa_profile
: is the MetaPhlAn3 taxonomic class
concatenated_humann_files
: files with raw gene families (RPK), pathway abundances (RPK) and pathway coverage (0-1) concatenated across all samples. Also includes stratified and unstratified files, normalized relative abundance and gene families with relative abundance renamed with UniRef90 IDs.
plots
:
gene_and_path_hits_per_sample.pdf
: number of hits across gene and pathway abundance profiles per sample
gene_and_path_hits_per_spsecies.pdf
: number of hits across gene and pathway abundance profiles per species
gene_hits.pdf
: number of identified genes per sample and their percentage identity to their alignments. Size of each dot denotes the number of genes identified at that specific identity threshold.
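As an illustration of how the RPK values in concatenated_humann_files relate to relative abundance, the sketch below applies total-sum scaling to one sample. This is an assumption about what "normalized relative abundance" means here, not a description of how the files are actually produced; the feature names and values are made up.

```python
def to_relative_abundance(rpk_by_feature):
    """Divide each RPK value by the sample total so the values sum to 1."""
    total = sum(rpk_by_feature.values())
    if total == 0:
        return {feature: 0.0 for feature in rpk_by_feature}
    return {feature: rpk / total for feature, rpk in rpk_by_feature.items()}


# Made-up RPK values for a single sample.
sample_rpk = {"feature_1": 30.0, "feature_2": 10.0}
relab = to_relative_abundance(sample_rpk)
print(relab)
```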
Each directory contains the sample and assembled contigs, named final.contigs.fa.
quast report of assembled contigs generated from the megahit assembly. concat_transposed_report.tsv is a concatenated quast report per sample.
The binning folder contains the output of checkm and metabat.
checkm_out
: concatenated bin statistics including N50 and L50 scores of all binned genomes. QA_out contains quality assurance per bin.
metabat_out
: binned genomes in .fasta format.
plots
:
-> BarChartCompletenessContamination.pdf
: Flipped bar chart with sample on the y-axis and CheckM contamination & completeness score on the x-axis. Light blue bar = completeness and dark blue bar = contamination.
-> CompletenessVsContam_Per_Sample.pdf
: Dot plot showing completeness and contamination of each bin, colour coded by sample (note this figure is best suited to fewer than 10 samples). Orange line: completeness >= 50%; green line: completeness >= 80%.
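For readers unfamiliar with the N50 and L50 scores reported in checkm_out, the sketch below computes both from a list of contig lengths using the standard definitions: N50 is the length of the contig at which the running total, counted from the longest contig down, first reaches half of the total assembly length, and L50 is the number of contigs needed to get there. The function name and the lengths are made up for illustration.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    return 0, 0  # empty input


# Made-up contig lengths: total 300, so half is 150.
n50, l50 = n50_l50([100, 80, 60, 40, 20])
print(n50, l50)
```

A higher N50 together with a lower L50 generally indicates a less fragmented bin.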