Serious Constellations Of Reoccurring Phylogenetically-Independent Origin
Scorpio is a Python-based command-line tool for rapid classification of SARS-CoV-2 sequences using mutation constellations. It is used to detect variants of concern (VOCs) and to track key mutation patterns.

Scorpio provides a separate command for each analysis workflow:
classify: Evaluates sequences against lineage-defining mutation patterns (constellations) and reports matches.
haplotype: Generates haplotype representations, either as strings or as tabular data.
list: Outputs constellation metadata, including lineage names and classification rules.
define: Extracts the mutations shared by grouped sequences, optionally relative to an outgroup.
Scorpio can be installed via Bioconda (recommended) or from the GitHub repository. It works alongside Pangolin for lineage analysis.
Install via Bioconda: the easiest method, with all dependencies managed for you.
Install Constellations: download the SARS-CoV-2-specific constellation definitions.
Run Classification: classify your sequences against known variant patterns.
Constellations are JSON-formatted files defining mutation patterns that characterize specific variants. Each constellation specifies sites to check and classification rules.
Sites: Mutation codes in the format gene:[ref]position[alt] (e.g., s:N501Y)
Rules: Thresholds such as minimum/maximum counts of reference, alternate, or ambiguous calls
Metadata: Name, description, WHO label, and citations for each constellation
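To make the site format concrete, here is a minimal Python sketch that splits a mutation code of the form gene:[ref]position[alt] into its parts. The names `SITE_PATTERN`, `Site`, and `parse_site` are hypothetical and are not part of Scorpio; the pattern covers only the single-residue form described above (with "-" as the alternate for a deletion) and returns None for anything else, such as insertion codes.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for the gene:[ref]position[alt] form described above.
# A "-" in the alt slot stands for a deletion, as in "s:H69-".
SITE_PATTERN = re.compile(
    r"^(?P<gene>[A-Za-z0-9]+):(?P<ref>[A-Z])(?P<pos>\d+)(?P<alt>[A-Z-])$"
)


class Site(NamedTuple):
    gene: str
    ref: str
    position: int
    alt: str


def parse_site(code: str) -> Optional[Site]:
    """Parse one mutation code; return None if it does not fit the pattern."""
    match = SITE_PATTERN.match(code)
    if match is None:
        return None
    return Site(match["gene"], match["ref"], int(match["pos"]), match["alt"])


if __name__ == "__main__":
    for code in ("s:N501Y", "s:H69-", "s:ins214EPE"):
        print(code, "->", parse_site(code))
```

Returning None for unrecognized codes, rather than raising, keeps the sketch usable on mixed site lists; a real parser would need explicit handling for insertions and any other code types.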
{
  "name": "Omicron (BA.1-like)",
  "description": "Omicron BA.1 variant",
  "citation": "WHO designation",
  "sites": [
    "s:A67V",
    "s:H69-",
    "s:V70-",
    "s:T95I",
    "s:G142D",
    "s:N211-",
    "s:ins214EPE",
    "s:G339D",
    "s:S371L",
    "s:S373P",
    "s:S375F",
    "s:K417N",
    "s:N440K",
    "s:G446S",
    "s:S477N",
    "s:T478K",
    "s:E484A",
    "s:Q493R",
    "s:G496S",
    "s:Q498R",
    "s:N501Y",
    "s:Y505H"
  ],
  "rules": {
    "min_alt": 20,
    "max_ref": 2
  }
}
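The rules in the example above can be read as count thresholds over the listed sites: at least 20 sites carrying the alternate allele and at most 2 carrying the reference. The Python sketch below illustrates that reading only. It is not Scorpio's implementation; the function name `passes_rules`, the call labels "ref", "alt", and "ambig", and the placeholder site names are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict


def passes_rules(calls: Dict[str, str], rules: Dict[str, int]) -> bool:
    """Check per-site calls against min_/max_ count thresholds.

    calls maps each site to an assumed label: "ref", "alt", or "ambig".
    rules maps keys such as "min_alt" or "max_ref" to integer thresholds.
    """
    counts = Counter(calls.values())
    for rule, threshold in rules.items():
        bound, _, kind = rule.partition("_")  # "min_alt" -> ("min", "_", "alt")
        observed = counts[kind]
        if bound == "min":
            if observed < threshold:
                return False
        elif bound == "max":
            if observed > threshold:
                return False
        else:
            raise ValueError(f"unrecognized rule: {rule}")
    return True


if __name__ == "__main__":
    rules = {"min_alt": 20, "max_ref": 2}
    # 22 placeholder sites, mirroring the number of sites in the example.
    calls = {f"site{i}": "alt" for i in range(22)}
    calls["site0"] = "ref"
    calls["site1"] = "ambig"
    print(passes_rules(calls, rules))  # 20 alt, 1 ref, 1 ambiguous call
```

With 22 sites, requiring 20 alternate calls and allowing at most 2 reference calls leaves room for a couple of reference or ambiguous positions (for example, from low-coverage regions) before a sequence stops matching.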