Enrichment approach for unbiased sequencing of respiratory syncytial virus directly from clinical samples
Waweru JW., de Laurent Z., Kamau E., Said K., Gicheru E., Mutunga M., Kibet C., Kinyua J., Nokes J., Sande C., Githinji G.
Background: Nasopharyngeal samples contain higher quantities of bacterial and host nucleic acids relative to viruses; presenting challenges during virus metagenomics sequencing, which underpins agnostic sequencing protocols. We aimed to develop a viral enrichment protocol for unbiased whole-genome sequencing of respiratory syncytial virus (RSV) from nasopharyngeal samples using the Oxford Nanopore Technology (ONT) MinION platform. Methods: We assessed two protocols using RSV positive samples. Protocol 1 involved physical pre-treatment of samples by centrifugal processing before RNA extraction, while Protocol 2 entailed direct RNA extraction without prior enrichment. Concentrates from Protocol 1 and RNA extracts from Protocol 2 were each divided into two fractions; one was DNase treated while the other was not. RNA was then extracted from both concentrate fractions per sample and RNA from both protocols converted to cDNA, which was then amplified using the tagged Endoh primers through Sequence-Independent Single-Primer Amplification (SISPA) approach, a library prepared, and sequencing done. Statistical significance during analysis was tested using the Wilcoxon signed-rank test. Results: DNase-treated fractions from both protocols recorded significantly reduced host and bacterial contamination unlike the untreated fractions (in each protocol p<0.01). Additionally, DNase treatment after RNA extraction (Protocol 2) enhanced host and bacterial read reduction compared to when done before (Protocol 1). However, neither protocol yielded whole RSV genomes. Sequenced reads mapped to parts of the nucleoprotein (N gene) and polymerase complex (L gene) from Protocol 1 and 2, respectively. Conclusions: DNase treatment was most effective in reducing host and bacterial contamination, but its effectiveness improved if done after RNA extraction than before. We attribute the incomplete genome segments to amplification biases resulting from the use of short length random sequence (6 bases) in tagged Endoh primers. Increasing the length of the random nucleotides from six hexamers to nine or 12 in future studies may reduce the coverage biases.