Explanation¶
These pages give the background and the reasoning behind GVClass: why the pipeline is shaped the way it is, how it turns marker placements into a taxonomic call, and what its quality numbers mean for giant viruses. They make for good reading away from the keyboard. None of it is needed to get a classification done, but it helps you read a result with the right amount of trust.
- How it works: the pipeline end to end, from gene calling across nine genetic codes through HMM marker detection, per-marker trees, and the nearest-neighbor vote that produces a call.
- Taxonomy: how the per-marker majority vote becomes a lineage, why genus and species are a nearest-reference label rather than an ICTV assignment, and how
taxonomy_confidenceis derived. - Quality metrics: what estimated completeness and contamination mean for giant viruses, and why the many eukaryote-like genes acquired by horizontal transfer are not counted as contamination.
- Species tree: the opt-in concatenated-marker tree that adds a genome-level placement and fills the four
species_tree_*columns.
Note
For step-by-step recipes see the how-to guides, and for exact flags, columns, and panels see the reference.