label-seqs

genome-sampler label-seqs

Modifies sequence identifiers by adding or removing metadata. If metadata and one or more columns are provided, the values of the specified metadata columns are appended to the original sequence id, each separated by delimiter. If metadata and columns are not provided, the first occurrence of delimiter and all characters following it are removed from every sequence id.
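
The two modes described above can be illustrated with a small, self-contained Python sketch. This is not the plugin's implementation: the function names `add_labels` and `strip_labels`, the plain-dict metadata row, and the example record are hypothetical stand-ins chosen for illustration only.

```python
def add_labels(seq_id, metadata_row, columns, delimiter="|"):
    """Append the selected metadata values to a sequence id."""
    values = [str(metadata_row[column]) for column in columns]
    return delimiter.join([seq_id, *values])


def strip_labels(seq_id, delimiter="|"):
    """Drop the first delimiter and everything after it."""
    return seq_id.split(delimiter, 1)[0]


# Hypothetical example record, not taken from any real dataset.
row = {"country": "USA", "date": "2020-03-01"}
labeled = add_labels("seq1", row, ["country", "date"])
print(labeled)                # seq1|USA|2020-03-01
print(strip_labels(labeled))  # seq1
```

Because stripping cuts at the first occurrence of the delimiter, an original id that itself contains the delimiter would not survive a round trip through both modes, so it may be worth choosing a delimiter that does not appear in the ids.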

Citations

Bolyen et al., 2020

Inputs

seqs: FeatureData[Sequence¹ | AlignedSequence²]

The sequences to be re-labeled. [required]

Parameters

delimiter: Str % Choices('|', ',', '+', ':', ';')

The delimiter between the sequence id and each metadata entry. [required]

metadata: Metadata

The metadata to embed in the header. [optional]

columns: List[Str]

The columns in the metadata to be used. [optional]

missing_value: Str

Value used in place of a metadata column value that is missing for a sequence. [default: 'missing']
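
As a rough illustration of how a placeholder such as missing_value might be substituted, the following Python sketch fills gaps in a metadata lookup. It is an assumption-laden example rather than the plugin's code: the helper `metadata_values` is hypothetical, and treating an absent sequence id, `None`, or an empty string as "missing" is a guess about what counts as a missing value.

```python
def metadata_values(seq_id, metadata, columns, missing_value="missing"):
    """Return one value per column, substituting missing_value for gaps."""
    # Assumption: a sequence id absent from the metadata has no values at all.
    row = metadata.get(seq_id, {})
    values = []
    for column in columns:
        value = row.get(column)
        # Assumption: None and empty strings both count as missing.
        values.append(missing_value if value in (None, "") else str(value))
    return values


# Hypothetical metadata keyed by sequence id.
metadata = {"seq1": {"country": "USA", "date": ""}}
print(metadata_values("seq1", metadata, ["country", "date"]))  # ['USA', 'missing']
print(metadata_values("seq2", metadata, ["country", "date"]))  # ['missing', 'missing']
```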

Outputs

labeled_seqs: FeatureData[Sequence¹ | AlignedSequence²]

The re-labeled sequences. [required]

References
  1. Bolyen, E., Dillon, M. R., Bokulich, N. A., Ladner, J. T., Larsen, B. B., Hepp, C. M., Lemmer, D., Sahl, J. W., Sanchez, A., Holdgraf, C., Sewell, C., Choudhury, A. G., Stachurski, J., McKay, M., Engelthaler, D. M., Worobey, M., Keim, P., & Caporaso, J. G. (2020). Reproducibly sampling SARS-CoV-2 genomes across time, geography, and viral diversity. F1000Research, 9, 657. https://doi.org/10.12688/f1000research.24751.1