01.04.22 | Christopher Ruis
13.04.22 | General news
Pango lineages: guidelines for suggesting novel and recombinant lineages
Pango lineages are designated to aid fine-scale tracking of SARS-CoV-2. They represent clades within the phylogenetic tree that are defined by both an evolutionary event (at least one nonsynonymous mutation, insertion/deletion or recombination event) and an event of epidemiological significance. In most cases, the presence of a single mutation will not, by itself, be sufficient to warrant a new lineage designation. Epidemiological events include movement of the virus into a new geographical region, rapid and sustained growth in frequency compared to other co-circulating lineages, a jump into a novel host species, or acquisition of a set of mutations of particular biological interest. Pango lineages are therefore not designed simply to split the SARS-CoV-2 tree into clades. If a clade in the tree contains thousands of sequences but has no epidemiological distinction from the parental lineage, it will retain the parental lineage label.
The absolute minimum number of high-quality sequences required for a new lineage is five. The large number of global SARS-CoV-2 genomes means that we typically require many more sequences than this to designate a new lineage. However, small clades sampled in locations with low sequencing rates that meet the designation criteria may be designated.
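As a rough illustration, the criteria above can be read as a first-pass screen on a candidate clade. The sketch below is not part of any Pango tooling: the `CandidateClade` class, its field names and the `meets_minimum_criteria` function are hypothetical, and only the threshold of five sequences comes from the guidelines. Passing the screen does not imply designation, since many more sequences are typically required in practice.

```python
from dataclasses import dataclass

# Absolute minimum number of high-quality sequences stated in the guidelines.
MIN_SEQUENCES = 5


@dataclass
class CandidateClade:
    """Hypothetical summary of a clade proposed as a new lineage."""
    n_high_quality_sequences: int
    has_evolutionary_event: bool      # e.g. nonsynonymous mutation, indel, recombination
    has_epidemiological_event: bool   # e.g. new region, sustained growth, host jump


def meets_minimum_criteria(clade: CandidateClade) -> bool:
    """Return True only if every minimum requirement is satisfied.

    This is a necessary condition, not a sufficient one.
    """
    return (
        clade.n_high_quality_sequences >= MIN_SEQUENCES
        and clade.has_evolutionary_event
        and clade.has_epidemiological_event
    )
```

Under this reading, a clade of thousands of sequences with no epidemiological distinction still fails the screen, for example `meets_minimum_criteria(CandidateClade(10000, True, False))` evaluates to `False`.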
A number of recombinant lineages have been identified thus far for SARS-CoV-2. The first recombinant lineage identified arose from a recombination event between lineage B.1.177 and lineage B.1.1.7 and was designated lineage XA. As the virus has continued to evolve and diversify, we have begun to identify and designate more recombinant lineages. By Pango nomenclature rules, recombinant lineages all begin with the prefix `X` and receive a letter based on the next available recombinant designation (XB, XC, and so on). As recombination in coronaviruses is common, recombinant lineages need to exhibit epidemiological significance and show evidence of onward transmission, in the form of shared mutations internal to the lineage. New recombinant designations, as with new lineage designations, should aim to capture the diversity at the leading edge of the pandemic and include recent sequences. In most cases, we expect significant recombinant lineages to contain a minimum of 50 sequences. However, exceptions may arise if the recombinant has particular novelty or significance, with unusual breakpoints and/or parental lineages.
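The naming rule and the expected size of a recombinant lineage can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used by the Pango team: the function names and the `exceptional` flag are hypothetical, and the naming helper covers only single-letter suffixes (XA to XZ) in plain alphabetical order, because the text does not say what follows once those are used up or whether any letters are skipped.

```python
import string

# Expected minimum size for a significant recombinant lineage (from the text).
EXPECTED_MIN_RECOMBINANT_SEQUENCES = 50


def next_recombinant_name(designated: set[str]) -> str:
    """Return the first unused name of the form 'X' + letter.

    Assumption: names are handed out alphabetically (XA, XB, XC, ...).
    """
    for letter in string.ascii_uppercase:
        name = "X" + letter
        if name not in designated:
            return name
    # The guidelines quoted here do not describe what happens after XZ.
    raise ValueError("all single-letter recombinant names are in use")


def recombinant_size_ok(n_sequences: int, exceptional: bool = False) -> bool:
    """Check the expected size, allowing the stated exception.

    `exceptional` is a hypothetical flag for recombinants of particular
    novelty or significance (unusual breakpoints or parental lineages).
    """
    return exceptional or n_sequences >= EXPECTED_MIN_RECOMBINANT_SEQUENCES
```

For example, with only XA designated, `next_recombinant_name({"XA"})` yields `"XB"`. Size is only one of the requirements: epidemiological significance and evidence of onward transmission still have to be assessed separately.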