How to suggest a new lineage

The Pango team continually updates the list of designated Pango lineages. In addition, genome data producers and Pango system users can suggest the creation of a new lineage. This page provides a step-by-step guide to proposing a new lineage. If the proposal is accepted, the lineage will be created and sequences will be designated to it. Future versions of pangolin will then be able to assign new sequences to the lineage.

Lineage creation and designation decisions are made by the Lineage Designation Committee and there is no guarantee that a new lineage proposal will be accepted.

1. Does your cluster fulfil the definition of a lineage?

Refer to Section I of the Pango statement of rules to check if your cluster meets the criteria for creation of a new lineage.

2. Navigate to the pango-designation repository

Go to the repository at github.com/cov-lineages/pango-designation. Note the lineage_notes.txt and lineages.csv files. These files include details of recently created and manually curated new lineages.

3. Check the current issues and releases

Other users may already have suggested the lineage that you are proposing. It’s a good idea to check the issues list and the latest tagged releases to make sure your new lineage doesn’t already exist and isn’t already being considered.
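As a rough local complement to checking the issues and releases, the sketch below looks up which of your candidate sequences already carry a designation. It assumes you have a copy of lineages.csv with a header row and two columns named taxon and lineage. That layout, the function name and the example rows are assumptions made for illustration, so check them against the actual file before relying on the result.

```python
import csv
import io


def find_existing_designations(csv_text, candidate_names):
    """Return {name: lineage} for candidates already listed in the CSV text.

    Assumes a header row with columns 'taxon' and 'lineage' (an assumption
    about the file layout, not something this guide specifies).
    """
    wanted = set(candidate_names)
    found = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["taxon"] in wanted:
            found[row["taxon"]] = row["lineage"]
    return found


# Made-up rows for illustration only; real use would read the downloaded file.
example_csv = (
    "taxon,lineage\n"
    "South_Africa/XXXXX/2020,B.1\n"
    "England/YYYYY/2020,B.1.1\n"
)
already = find_existing_designations(
    example_csv, ["South_Africa/XXXXX/2020", "Wales/ZZZZZ/2020"]
)
print(already)
```

Any name that comes back from this lookup is already designated, which is worth mentioning in your proposal. A name that does not come back may still be under discussion in an open issue, so this does not replace checking the issues list.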

4. File a new issue in the repository

Use the example issue to create a new issue, describing why your cluster should be a new lineage and presenting evidence to support lineage creation and designation, potentially in the form of a phylogenetic tree. Note that sequences should be on GISAID and you must provide a list of sequence names or GISAID IDs that we can match to the database. Ideally, identify sequences using the same format as the lineages.csv file. We’ve previously encountered issues with the treatment of spaces in sequence names. For example, GISAID may use ‘South Africa/XXXXX/2020’. However, spaces are not tolerated in FASTA headers, so we replace each space with an underscore, creating ‘South_Africa/XXXXX/2020’. Whilst we try to catch these issues, please take care to avoid systematic changes to names that may interfere with the linking of names to sequences.
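The space-to-underscore replacement described above can be applied to your own list before you post it. The following is a minimal sketch, not the tooling the Pango team uses. The function names are hypothetical, and dropping blank entries and duplicates is an extra convenience assumed here rather than something this guide requires.

```python
def normalise_sequence_name(name):
    """Strip surrounding whitespace and replace internal spaces with underscores."""
    return name.strip().replace(" ", "_")


def normalise_names(names):
    """Normalise a list of names, dropping blanks and duplicates, keeping order."""
    seen = set()
    result = []
    for raw in names:
        clean = normalise_sequence_name(raw)
        if clean and clean not in seen:
            seen.add(clean)
            result.append(clean)
    return result


names = normalise_names(
    ["South Africa/XXXXX/2020", "South_Africa/XXXXX/2020", "  "]
)
print(names)  # ['South_Africa/XXXXX/2020']
```

Only spaces are changed. Every other character is left alone, in keeping with the request to avoid systematic changes that could break the link between names and sequences.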

5. What happens next

If your lineage proposal fits in with the lineage scheme, the following steps will take place:

i) The lineage gets a name

A name will be given to your proposed lineage according to the Pango naming rules. This name and your description will be added to the lineage_notes.txt file. Sequences designated to this new lineage will be appended to the lineages.csv file.

ii) A new Pango-designation release is tagged

A new release will be tagged at github.com/cov-lineages/pango-designation. Small lineage updates will get a minor release tag, whilst large-scale designations will get a major tag.

iii) Updates to lineage assignment software

The new set of designations will be incorporated into the reference data for pangolin and other lineage assignment software packages. This will not only allow the new lineages to be detected but also increase the accuracy of assignment for all genomes.

iv) Updates to the Pango lists

The information in the sequence designation list and lineage description list will be updated with the latest designations and assignments.