filter-seqs

genome-sampler filter-seqs

Filter sequences based on their length and ambiguity.

Citations

Bolyen et al., 2020

Inputs

sequences: FeatureData[Sequence]

The sequences to be filtered. [required]

Parameters

min_length: Int % Range(1, None)

The minimum length a sequence must have to be retained. [default: 1]

max_length: Int % Range(1, None)

The maximum length a sequence may have and still be retained. [optional]

max_proportion_ambiguous: Float % Range(0, 1, inclusive_end=True)

The maximum proportion of a sequence's characters that can be ambiguous (e.g., N) for the sequence to be retained. [default: 1.0]
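
The three parameters above combine into a single keep-or-drop decision per sequence. The following is a minimal Python sketch of that decision, not the plugin's actual implementation. The helper names (`proportion_ambiguous`, `keep_sequence`, `filter_sequences`) are hypothetical, and the sketch assumes that any character other than A, C, G, or T counts as ambiguous and that all bounds are inclusive.

```python
from typing import Dict, Optional

# Assumption: anything other than an unambiguous DNA base is "ambiguous".
# The plugin's own definition may differ (e.g., in how gaps are treated).
UNAMBIGUOUS = frozenset("ACGT")


def proportion_ambiguous(seq: str) -> float:
    """Fraction of characters in seq that are not A, C, G, or T."""
    if not seq:
        return 0.0
    ambiguous = sum(1 for base in seq.upper() if base not in UNAMBIGUOUS)
    return ambiguous / len(seq)


def keep_sequence(
    seq: str,
    min_length: int = 1,
    max_length: Optional[int] = None,
    max_proportion_ambiguous: float = 1.0,
) -> bool:
    """Return True if seq passes the length and ambiguity criteria."""
    if len(seq) < min_length:
        return False
    # max_length is optional: no upper bound is applied when it is None.
    if max_length is not None and len(seq) > max_length:
        return False
    return proportion_ambiguous(seq) <= max_proportion_ambiguous


def filter_sequences(seqs: Dict[str, str], **criteria) -> Dict[str, str]:
    """Keep only the id -> sequence entries that pass keep_sequence."""
    return {
        seq_id: seq
        for seq_id, seq in seqs.items()
        if keep_sequence(seq, **criteria)
    }


example = {
    "s1": "ACGTACGT",  # length 8, no ambiguous characters
    "s2": "ACNNNNNN",  # length 8, 6 of 8 characters ambiguous
    "s3": "ACG",       # length 3, shorter than min_length below
}
retained = filter_sequences(example, min_length=4, max_proportion_ambiguous=0.5)
print(sorted(retained))  # ['s1']
```

With the defaults (`min_length` of 1, no `max_length`, `max_proportion_ambiguous` of 1.0), every non-empty sequence passes, so filtering only takes effect once at least one parameter is tightened.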

Outputs

filtered_sequences: FeatureData[Sequence]

The sequences retained after filtering. [required]

References
  1. Bolyen, E., Dillon, M. R., Bokulich, N. A., Ladner, J. T., Larsen, B. B., Hepp, C. M., Lemmer, D., Sahl, J. W., Sanchez, A., Holdgraf, C., Sewell, C., Choudhury, A. G., Stachurski, J., McKay, M., Engelthaler, D. M., Worobey, M., Keim, P., & Caporaso, J. G. (2020). Reproducibly sampling SARS-CoV-2 genomes across time, geography, and viral diversity. F1000Research, 9, 657. doi:10.12688/f1000research.24751.1