Virus identification using geNomad

Overview

Teaching: min
Exercises: 30 min
Questions
  • How to use geNomad?

Objectives
  • Install geNomad

  • Run and interpret its result

4. Virus Identification and annotation by geNomad

Tools: geNomad

geNomad is a tool that identifies virus and plasmid genomes from nucleotide sequences. It provides state-of-the-art classification performance and can be used to quickly find mobile genetic elements from genomes, metagenomes, or metatranscriptomes.

Installation

using conda

conda create -n genomad -c conda-forge -c bioconda genomad
conda activate genomad
genomad download-database .

using docker

docker pull antoniopcamargo/genomad
docker run -ti --rm -v "$(pwd):/app" antoniopcamargo/genomad download-database .
docker run -ti --rm -v "$(pwd):/app" antoniopcamargo/genomad end-to-end PRJEB47625/illumina_sample_01_megahit.fa.gz output genomad_db

Pipeline Options

Option Description
end-to-end Executes the full pipeline
–cleanup Force geNomad to delete intermediate files
–splits 8 To make it possible to run this example in a notebook

Usage:

# Run the full geNomad pipeline (end-to-end command), taking a nucleotide FASTA file (illumina_sample_01_megahit.fa.gz) and the database (genomad_db) as input and produce output in genomad_output
genomad end-to-end --cleanup --splits 8 PRJEB47625/illumina_sample_01_megahit.fa.gz genomad_output genomad_db

geNomad identifies viral sequences within the assembled contigs and provides annotations that are crucial for understanding the viral components of your virome.

Key Points

  • geNomad