genome-sampler filter-seqs
Filter sequences based on their length and ambiguity.
Citations
Inputs
- sequences:
FeatureData[Sequence] The sequences to be filtered. [required]
Parameters
- min_length:
Int%Range(1, None) The minimum length of a sequence that will allow it to be retained. [default: 1]
- max_length:
Int%Range(1, None) The maximum length of a sequence that will allow the sequence to be retained. [optional]
- max_proportion_ambiguous:
Float%Range(0, 1, inclusive_end=True) The maximum proportion of characters in a sequence that can be ambiguous (e.g., N) for the sequence to be retained. [default: 1.0]
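The way the three parameters combine can be illustrated with a minimal, self-contained Python sketch. It is not the plugin's implementation. It assumes that "ambiguous" means an IUPAC degenerate nucleotide code, that all bounds are inclusive, and that sequences are plain strings keyed by ID. The names `AMBIGUOUS_CHARS`, `proportion_ambiguous`, `keep_sequence`, and `filter_seqs_sketch` are hypothetical and chosen only for this example.

```python
from typing import Dict, Optional

# Assumption: "ambiguous" means an IUPAC degenerate nucleotide code.
AMBIGUOUS_CHARS = frozenset("RYSWKMBDHVN")


def proportion_ambiguous(seq: str) -> float:
    """Return the fraction of characters in seq that are ambiguous."""
    if not seq:
        return 0.0
    ambiguous = sum(1 for char in seq.upper() if char in AMBIGUOUS_CHARS)
    return ambiguous / len(seq)


def keep_sequence(seq: str, min_length: int = 1,
                  max_length: Optional[int] = None,
                  max_proportion_ambiguous: float = 1.0) -> bool:
    """Decide whether one sequence passes all three filters.

    Assumption: every bound is treated as inclusive.
    """
    if len(seq) < min_length:
        return False
    if max_length is not None and len(seq) > max_length:
        return False
    return proportion_ambiguous(seq) <= max_proportion_ambiguous


def filter_seqs_sketch(sequences: Dict[str, str], **kwargs) -> Dict[str, str]:
    """Return only the id -> sequence entries that pass keep_sequence."""
    return {seq_id: seq for seq_id, seq in sequences.items()
            if keep_sequence(seq, **kwargs)}


# Hypothetical input: one clean sequence, one mostly-N sequence, one short one.
example = {"s1": "ACGTACGT", "s2": "ACNNNNNN", "s3": "ACG"}
retained = filter_seqs_sketch(example, min_length=4,
                              max_proportion_ambiguous=0.5)
print(retained)  # {'s1': 'ACGTACGT'}
```

With the defaults (`min_length=1`, no `max_length`, `max_proportion_ambiguous=1.0`), every non-empty sequence passes in this sketch, so filtering only takes effect once at least one parameter is tightened.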
Outputs
- filtered_sequences:
FeatureData[Sequence] The sequences retained after filtering. [required]
- Bolyen, E., Dillon, M. R., Bokulich, N. A., Ladner, J. T., Larsen, B. B., Hepp, C. M., Lemmer, D., Sahl, J. W., Sanchez, A., Holdgraf, C., Sewell, C., Choudhury, A. G., Stachurski, J., McKay, M., Engelthaler, D. M., Worobey, M., Keim, P., & Caporaso, J. G. (2020). Reproducibly sampling SARS-CoV-2 genomes across time, geography, and viral diversity. F1000Research, 9(657), 657. 10.12688/f1000research.24751.1