Characterization and genetic diversity of Dioscorea bacilliform viruses present in a Pacific yam germplasm collection


The Centre for Pacific Crops and Trees (CePaCT) within the Pacific Community (SPC) maintains a unique collection of in vitro yam germplasm from the Pacific region. Although this germplasm represents a potentially valuable resource to improve yam production, a lack of reliable diagnostic protocols for badnaviruses has restricted its exploitation. Sukal et al. (2017) identified three DBVs, namely DBALV2, DBESV, and DBRTV2, in the Pacific germplasm collection at CePaCT using a rolling circle amplification (RCA)-based strategy. In this study, RCA was used to identify and further characterize the molecular diversity of badnaviruses infecting the Pacific yam germplasm collection. The present study brings the number of Pacific DBV complete genome sequences to 24, suggesting geographic restriction of DBVs in the Pacific, a key consideration for future germplasm screening and dissemination.

| Samples and nucleic acid extraction
SPC-CePaCT holds 330 accessions in its in vitro germplasm collection. For this study, 224 of the 330 yam accessions from the collection were acclimatized in SPC-CePaCT's insect-proof screenhouse. After 3 months, leaf samples were collected and total nucleic acids (TNA) were extracted (Kleinow et al., 2009). The purified TNA was quantified using a spectrophotometer (NanoDrop 2000; Thermo Fisher Scientific) and the concentration was adjusted to c. 500 ng/μl with sterile nuclease-free water (NF-H2O). Episomal DBV-free yam accession DA/NGA01, a Nigerian accession obtained from the International Institute of Tropical Agriculture (IITA), was used as a virus-free control for the RCA experiments. This accession had previously tested negative for episomal DBV both at IITA using immunocapture (IC)-PCR and at SPC-CePaCT using IC-PCR and RCA according to protocols described in Seal et al. (2014). All yam accession descriptors included in the text refer to those used at SPC-CePaCT.
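The concentration adjustment described above follows the usual C1V1 = C2V2 relationship. The sketch below is illustrative only: the 20 μl final volume and the 1,250 ng/μl stock reading are hypothetical values chosen for the example, not figures from this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=500.0, final_volume_ul=20.0):
    """Return (stock_ul, water_ul) needed to dilute a TNA stock to the target.

    The default final volume is an assumption made for illustration.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    # C1 * V1 = C2 * V2, solved for V1 (volume of stock to pipette).
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical spectrophotometer reading of 1,250 ng/ul.
stock_ul, water_ul = dilution_volumes(1250.0)
print(f"{stock_ul:.1f} ul TNA + {water_ul:.1f} ul NF-H2O")
```

Extracts that read below the target cannot be adjusted by dilution, which is why the function raises an error in that case rather than returning a negative water volume.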

| RCA, cloning and Sanger sequencing
A badnavirus-biased RCA approach (Sukal et al., 2019) was used to amplify viral circular DNA. Briefly, a mixture of 32 degenerate badnavirus primers (Sukal et al., 2019) at a final concentration of 0.4 μM of each primer, 1 × Φ29 buffer (NEB) and 1 μl (c. 500 ng) of TNA was made up to a final volume of 10 μl with sterile NF-H2O and denatured at 95 °C for 3 min, cooled to 4 °C and placed on ice. A 10 μl reaction mixture consisting of 2.5 μM exo-resistant random hexamers (Thermo Fisher Scientific), 1 × Φ29 buffer, 2 ng/μl bovine serum albumin (BSA), 4 mM DTT, 15 mM dNTPs, 5 U/μl of Φ29 DNA polymerase (Thermo Fisher Scientific) and sterile NF-H2O to make up the final volume, was prepared and added to each denatured sample.
Reactions were incubated at 36 °C for 18 hr, followed by 65 °C for 10 min to denature the Φ29 DNA polymerase.
RCA products were independently digested with EcoRI and SphI (NEB). These enzymes were selected following in silico restriction analysis of published yam badnavirus genome sequences and previously published work (Sukal et al., 2019). Digest products were electrophoresed through 1.5% agarose gels, stained with SYBR Safe (Thermo Fisher Scientific) and fragments of interest were excised, purified and cloned as described in Sukal et al. (2017). Sequencing was carried out using BadnaFP/RP primers (Yang et al., 2003) and the resulting reads were queried against GenBank using the BLASTn and BLASTx algorithms to determine identity. A primer-walking approach using standard Sanger sequencing conditions was subsequently used to sequence the complete genomes of four DBALV isolates, including three samples (one each of D. alata, D. transversa, and D. trifida) from Vanuatu and one D. esculenta sample from Tonga.
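The in silico restriction analysis mentioned above can be sketched as a search for recognition sites on a circular sequence. This is a minimal illustration, not the procedure used by the authors: only the EcoRI (GAATTC) and SphI (GCATGC) recognition sequences are standard, while the function names and the toy sequence are hypothetical.

```python
import re

# Standard recognition sequences of the two enzymes used for digestion.
SITES = {"EcoRI": "GAATTC", "SphI": "GCATGC"}


def cut_positions(genome, site):
    """0-based start positions of `site` on a circular sequence.

    The sequence is extended by len(site) - 1 bases so that a site
    spanning the origin of the circle is also found.
    """
    genome = genome.upper()
    extended = genome + genome[: len(site) - 1]
    return [m.start() for m in re.finditer(f"(?={site})", extended)]


def fragment_sizes(genome, site):
    """Fragment lengths after complete digestion of a circular molecule.

    The exact cut offset inside the site shifts every cut equally,
    so fragment lengths do not depend on it.
    """
    n = len(genome)
    cuts = sorted(cut_positions(genome, site))
    sizes = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # A single cut linearizes the circle into one full-length fragment.
        sizes.append((nxt - cut) % n or n)
    return sorted(sizes, reverse=True)


# Toy 66-nt "genome" with two EcoRI sites and one SphI site (hypothetical).
genome = "A" * 10 + "GAATTC" + "C" * 20 + "GCATGC" + "T" * 14 + "GAATTC" + "G" * 4
for enzyme, site in SITES.items():
    print(enzyme, fragment_sizes(genome, site))
```

An enzyme with a single site linearizes the circular genome into one full-length fragment, whereas an enzyme with no site leaves the molecule uncut and returns an empty list.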
The presence of the putative single SphI restriction site was confirmed with PCR using sequence-specific primers (DBALV-SphI site_

| Next-generation sequencing and genome assembly
Undigested RCA products of 10 Papua New Guinea (PNG) samples and 6 Vanuatu samples were purified using the Illustra GFX PCR DNA and Gel Band Purification Kit (GE Healthcare) and sent to the Central Analytical Research Facility (CARF), Queensland University of Technology, Brisbane, Australia, for library preparation and sequencing using the Illumina MiSeq system to generate paired-end reads of 301 bp each. Similarly, undigested RCA products from a further 98 samples that did not produce any apparent visible restriction digest profiles following digestion of RCA products using EcoRI and SphI were purified and pooled by country and sent for the preparation of 13 additional libraries. This included six libraries using samples from Fiji, two libraries each for samples from Vanuatu, New Caledonia and Federated States of Micronesia (FSM), and one library for samples from PNG, which were subsequently sequenced by next-generation sequencing (NGS) as described previously.
Quality of the raw reads was assessed with FastQC v. 0.10.1 (Babraham Bioinformatics). A pipeline similar to that described in Muller et al. (2018) was used for the processing of NGS data and characterization of badnavirus diversity. Raw reads were trimmed to obtain optimum quality using the dynamic trim function of

| Pairwise sequence comparison, phylogenetic and recombination analysis
Partial coding sequences for reverse transcriptase (RT)-ribonuclease H (RNase H), delimited by the BadnaFP/RP primers, were used to determine the pairwise nucleotide sequence identity using the Sequence Demarcation Tool (SDT v. 1.2; Muhire et al., 2014). For phylogenetic analysis, multiple sequence alignments were constructed using ClustalW (Larkin et al., 2007) within MEGA 7, and the maximum-likelihood method (Kimura 2-parameter model) was used to reconstruct phylogenetic trees with 1,000 bootstrap replicates. To assess possible recombination events, sequences from the different DBALV and DBALV2 groups from this study together with representative isolates from the NCBI database were analysed using the RDP4 program (with embedded RDP, GENECONV, BOOTSCAN, MAXCHI, CHIMAERA, 3SEQ, and SISCAN tools) using the default parameters (Martin et al., 2015).
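To make the two quantities above concrete, the sketch below computes a pairwise percent identity over ungapped alignment columns and the Kimura 2-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. It does not reproduce SDT or MEGA: those programs perform their own alignment, and MEGA uses the model within a maximum-likelihood tree search. The function names and the two short sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare(seq_a, seq_b):
    """Count compared sites, transitions and transversions in an aligned pair."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return sites, transitions, transversions


def percent_identity(seq_a, seq_b):
    """Identity over columns where neither sequence has a gap."""
    sites, ts, tv = compare(seq_a, seq_b)
    return 100.0 * (sites - ts - tv) / sites


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    sites, ts, tv = compare(seq_a, seq_b)
    p, q = ts / sites, tv / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy aligned pair: one transition, one transversion, one gapped column.
a = "ACGTACGTAC-G"
b = "ACGTGCGTCCAG"
print(f"identity = {percent_identity(a, b):.1f}%")
print(f"K2P distance = {k2p_distance(a, b):.4f}")
```

The K2P distance exceeds the raw proportion of differing sites because the model corrects for multiple substitutions at the same position.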

| RCA, Sanger sequencing and NGS
Total nucleic acid extracts from 224 yam samples representing 5 yam species, including D. alata (185), D. esculenta (31), D. bulbifera (6), and 1 each of D. transversa and D. trifida, were subjected to RCA, followed by independent restriction digestion using either EcoRI or SphI. A total of 35 samples from PNG, Vanuatu, and Tonga produced restriction profiles indicative of the presence of badnaviruses (Table 1). In some instances, the total size of the EcoRI restriction digest products was larger than for a single badnavirus genome, suggesting mixed infections of different virus genotypes.
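The reasoning behind the mixed-infection inference can be written as a simple check: if the fragment sizes from one digest sum to clearly more than one genome length, more than one genome is probably present. The sketch below is an illustration under stated assumptions; the 8,000 bp genome ceiling, the 5% gel-sizing tolerance, and the sample profiles are all hypothetical and are not taken from this study.

```python
def suggests_mixed_infection(fragment_sizes_bp, max_genome_bp=8000, tolerance=0.05):
    """True when summed fragment sizes exceed one genome length.

    max_genome_bp and tolerance are illustrative assumptions; the
    tolerance allows for imprecise size estimates from agarose gels.
    """
    return sum(fragment_sizes_bp) > max_genome_bp * (1 + tolerance)


# Hypothetical EcoRI fragment sizes (bp) estimated from a gel.
profiles = {
    "sample_A": [4200, 3300],
    "sample_B": [5000, 4100, 2600, 2400],
}
flags = {name: suggests_mixed_infection(sizes) for name, sizes in profiles.items()}
for name, flag in flags.items():
    print(name, "possible mixed infection" if flag else "consistent with one genome")
```

Incomplete digestion can also inflate the summed size, so a flag from a check like this would be a prompt for sequencing rather than a conclusion.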
Of the 185 D. alata accessions tested, only 15/38 samples from PNG and 2/58 samples from Vanuatu produced restriction profiles indicative of badnavirus infection (Table 1) (Tables S1 and S2), complete badnavirus genomes were assembled. The genomes generated from all the PNG samples (Table S1) were most similar to DBALV2, while the genomes assembled from Vanuatu samples (Table S2)

| Dioscorea bacilliform AL virus from the Pacific
Using NGS, six complete DBALV genomes from Vanuatu were generated including one from D. alata, three from D. bulbifera and two from D. esculenta. In addition, four complete genomes were gen-   (Figure 3a). Phylogenetic analysis using the partial RT/RNase H-coding region revealed that the DBALV isolates from Africa were ancestral to the isolates from the Pacific (Figure 3b).
Recombination analysis did not detect any potential recombination event(s) in any of the DBALV full-length sequences.

| Dioscorea bacilliform AL virus 2
RCA combined with NGS was used to generate 10 complete ge-

| DISCUSSION
In this study, DBVs present in the Pacific yam germplasm collec-  , 06, 12, 14, 17, 20, 22, 23, 43, 45, 51, 57, 58, and 59) are present in both regions, several viruses identified in the African region have not been detected in the Pacific and vice versa. The present study shows that some of these viruses are restricted to only one or a few countries in the Pacific and, as such, special considerations must be taken to ensure that germplasm collections are thoroughly screened to prevent the dissemination of these badnavirus species to other countries. The availability of virus-tested yam germplasm is essential for the effective distribution and eventual use of yams for improved food and nutritional security in the Pacific. Testing of the Pacific yam germplasm for other viruses will also be necessary, however, before material can be made available for distribution.

ACKNOWLEDGEMENTS
The