Volume 88, Issue 10 pp. 1821-1826
Short Communication

Phylogenetic analysis of the NS5 gene of Zika virus

Rama Adiga

Division of Bioinformatics and Computational Genomics, Nitte University Centre for Science Education and Research (NUCSER), Nitte University, Paneer Campus, Mangalore, India

Correspondence to: Rama Adiga, Division of Bioinformatics and Computational Genomics, Nitte University Centre for Science Education and Research (NUCSER), Nitte University, Paneer Campus, Mangalore 5750018, India. E-mail: [email protected]

First published: 23 June 2016

Abstract

ZIKV infection has become a global threat, spreading across 31 countries in Central America, South America, and the Caribbean. However, little information is available about the molecular epidemiology of ZIKV. A shared mutation of a threonine residue to alanine at the same position in the C-terminal region of NS5 was observed in sequences from Colombia, Mexico, Panama, and Martinique. These sequences fell within the same cluster in the phylogenetic tree. Based on the shared mutation, the presence of a Latin American genotype is proposed. Comparison of African and Asian lineages yielded the R29N, N273S, H383Q, and P391S mutations. The study highlights that mutation of amino acids in NS5 may contribute to the neurotropism of ZIKV. J. Med. Virol. 88:1821–1826, 2016. © 2016 Wiley Periodicals, Inc.

INTRODUCTION

Since its appearance in humans in 1954 [Macnamara, 1954], ZIKV infection has become a global health threat. It is no longer confined to Africa and Asia [Haddow et al., 2012; Faye et al., 2014]. The first Pacific outbreak was reported on Yap Island in the Federated States of Micronesia in 2007 [Duffy et al., 2009], followed by an outbreak in French Polynesia in 2013 [Musso et al., 2014; Cao-Lormeau et al., 2016; Malkki, 2016]. The virus then spread to New Caledonia, the Cook Islands, and Easter Island [Musso et al., 2015] and later appeared in Brazil [Campos et al., 2015; Zanluca et al., 2015]. The Brazilian outbreak (2015–2016) turned pandemic, with autochthonous transmission of ZIKV causing microcephaly in newborns [Mlakar et al., 2016]. ZIKV has been associated with a rapidly developing neurotropic threat causing microcephaly and Guillain–Barré syndrome [Calvet et al., 2016; Cao-Lormeau et al., 2016]. It spread across Latin America and the Caribbean, including Colombia, El Salvador, Suriname, and Martinique (WHO situation report, March 17, 2016). Acute illness and myelitis due to ZIKV have been reported from Latin American countries and the Caribbean [Araúz et al., 2016; Rodriguez-Morales et al., 2016; Rozé et al., 2016], and a global public health emergency was declared in February 2016 [WHO, 2016].

With the onset of severe forms of ZIKV infection, it was decided to compare NS5 protein sequences of ZIKV isolates from different countries. The ZIKV NS5 gene contains unique phylogenetic signals and was found suitable for analysis [Faye et al., 2014].

METHODS

The full-length NS5 sequences were 652 amino acids in length. The partial NS5 sequences were aligned to amino acids 1–466 of full-length NS5. All protein sequences were aligned using MUSCLE under default settings. The evolutionary history was inferred using the Maximum Likelihood method based on the Le_Gascuel_2008 model. Evolutionary analyses were conducted in MEGA6. The analysis involved 55 ZIKV sequences from GenBank. The tree was displayed using FigTree.
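As a rough illustration of how full-length and partial sequences can be compared once aligned, the sketch below computes a simple proportion of differing sites (p-distance) while skipping gapped columns. This is not the Le_Gascuel_2008 maximum likelihood analysis run in MEGA6; the function name `p_distance` and the toy fragments are hypothetical and are not real NS5 sequences.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # gapped columns (e.g. from partial sequences) are ignored
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no overlapping ungapped columns")
    return differing / compared


# Toy aligned fragments (hypothetical, for illustration only).
alignment = {
    "full_1": "MKTAYHE",
    "full_2": "MKSAYHE",
    "partial": "MKTA---",
}

for name_a in alignment:
    for name_b in alignment:
        if name_a < name_b:
            d = p_distance(alignment[name_a], alignment[name_b])
            print(name_a, name_b, round(d, 3))
```

Skipping gapped columns means a partial sequence is only compared over the region it actually covers, which is the same reason the partial NS5 sequences are restricted to positions 1–466.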

RESULTS AND DISCUSSION

Phylogenetic trees identified two major lineages, the Asian and the African lineage (Fig. 1A), consistent with previous reports [Haddow et al., 2012; Faye et al., 2014]. The present study builds on the previous analyses of Faye et al. [2014]; Wang et al. [2016]; and Zhang et al. [] (Fig. 1A), using additional sequences updated up to May 2016. The sequences from Colombia, Mexico, Panama, and Martinique formed a separate clade distinct from the Brazilian sequences (Fig. 1A), referred to here as the Latin American strain. Recent ZIKV infections reported in Central America and the French West Indies, especially Martinique, have been associated with neurological complications and acute symptoms associated with Guillain–Barré syndrome [Araúz et al., 2016; Rozé et al., 2016; WHO situation report, March 2016]. The amino acid polymorphism exhibited by NS5 of ZIKV in the alignment showed a T581A substitution in nine sequences obtained from isolates from Colombia, Martinique, Mexico, and Panama (Fig. 1B). The substitution lies in the C-terminal region of NS5, which is involved in RNA-dependent RNA polymerase activity. The domains of the enzyme in flaviviruses are involved in intermolecular interaction and are responsible for de novo RNA synthesis [Zou et al., 2011]. It is suggested that mutation in the C-terminal region of NS5 may bring about changes in RNA replication, since a hydrophilic to hydrophobic substitution was observed. The sequences in the Latin American genotype that are of clinical origin include AMQ34004, AMQ34003, AMM39804, AMZ03557, AMC33116, ANB66184, ANB66182, and ANB66183. Although Zhang et al. [] did similar work clustering strains from Colombia, Mexico, and Martinique, they did not observe the T581A amino acid mutation, since they did not use the NS5 protein sequence in their analysis. Their group also did not include sequences from Panama isolates, which also bear the T581A mutation (Fig. 1B).
The polymorphism at the amino acid level was useful in genotyping the strains as Latin American strains. However, some sequences clustered with the Brazilian sequences rather than with the Latin American cluster: those from Mexico (ANF04750), Honduras (ANG09399), Suriname (ALX35659), Colombia (ANF04752), Puerto Rico (AMZ03556), and French Polynesia (AHZ13508) (Fig. 1A). The results suggest that two genotypes are circulating in Colombia and Mexico, based on the sequences found in different clusters.
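The genotyping idea described above, grouping sequences by the residue they carry at NS5 position 581, can be sketched as follows. The residues are transcribed from Table I for a handful of accessions only; the helper name `group_by_residue` is hypothetical, and the sketch is an illustration rather than the procedure used in the study.

```python
from collections import defaultdict

# Residue at NS5 position 581 for a few accessions, transcribed from Table I.
RESIDUE_581 = {
    "AMQ34003": "A",  # Mexico
    "AMC33116": "A",  # Martinique
    "ANB66182": "A",  # Panama
    "AMB18850": "T",  # Brazil
    "ALX35659": "T",  # Suriname
    "AHZ13508": "T",  # French Polynesia
}


def group_by_residue(residues: dict) -> dict:
    """Group accession numbers by the residue observed at one alignment position."""
    groups = defaultdict(list)
    for accession, residue in residues.items():
        groups[residue].append(accession)
    return {residue: sorted(accessions) for residue, accessions in groups.items()}


groups = group_by_residue(RESIDUE_581)
print("581A:", groups["A"])
print("581T:", groups["T"])
```

A single-position grouping of this kind only flags candidates; whether the carriers of 581A form one clade still has to be read from the tree.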

Figure 1. A: Molecular phylogenetic analysis of Zika NS5 by the maximum likelihood method. Phylogenetic tree of NS5 protein sequences of Zika. Sequences having T581A belonged to a single cluster. The tree with the highest log likelihood (−2327.2252) is shown. The percentage of trees in which the associated taxa clustered together is shown above the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with superior log likelihood value. The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 58.9110% sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. Scale represents a genetic distance of 10%. The Spondweni lineage isolated in South Africa was used as the outgroup to root the tree. The branch tips of the tree include the accession number followed by country and year. Evolutionary analyses were conducted in MEGA6. FigTree was used for display; the colored sequences form the cluster of Latin American strains. B: Multiple sequence alignment showing the shared mutation in sequences from various countries downloaded from NCBI. Accession numbers are those of the protein sequences; those given in brackets indicate the genomic ID.

Quiñonez et al. [2016] reported the detection of ZIKV in Mexico in a traveller to Colombia in 2015. The present study supports the hypothesis that ZIKV may have been transmitted and spread to Mexico via Colombia. The sequences from Mexico (AMQ34003 and AMQ34004) belong to the T581A cluster (Fig. 1A). However, the Mexican sequences (AMQ34003 and AMQ34004) also revealed additional amino acid substitutions, H38Q and G359V, at positions 38 and 359, respectively (Table I).

Table I. The GenBank Accession No. of Amino Acid Sequences, Country and Year of Isolation, Geographical Origin if Not Autochthonous, Clinical Isolation Source, and Mutation of Amino Acid at Positions Marked
Mutation at amino acid position number indicated
Accession no. Country/year Origin country Clinical source 29 36 38 60 62 125 138 203 274 359 391 452 581
1 AMK79469 China, 2016 Venezuela Imported into China Serum N A K D A S H I R G S D T
2 ALY05362 Brazil, 2015 Serum N S H E P S H N G S D T
3 AMM39806 China, 2016 Serum N S H E P S H N R G S D T
4 AMO03410 China, 2016 American Samoa, imported into China Serum N S H E P S H N R G S D T
5 AMM39804 Colombia, 2015 Serum N S H E P S H N R G S D
6 AAC58803* USA,1997 Serum N S H E H N R G S D T
7 ABY86749* Australia, 2007 Serum R S H E P N R N G S
8 AHL43499* Senegal, 1979 Serum N H R G P S T
9 AMR39830 China, 2016 Serum N S H E P S H N R G S D T
10 AMD61710 Thailand, 2014 Serum N S H E P S H N R G S D T
11 AMD61711 Philippines, 2012 Serum N S H E P S H N R G P D T
12 ABI54475 Uganda MR766, 2006 Serum R S H E P N R N R G P S T
13 AHL43471* Cote, 1999 Serum N H R G P S T
14 AHL43470* Senegal, 1969 Serum N H R G P S T
15 AHL43469* Senegal, 1991 Serum N H R G P S T
16 AHL43484* Senegal, 1997 Serum N H R G P S T
17 AHL43482* Senegal, 1997 Serum N H R G P S T
18 AHL43491* Cote, 1990 Serum N H R G P S T
19 AMB18850 Brazil, 2015 Fetus brain N S H E P S H N R G S D T
20 AMD16557 Brazil, 2015 Amniotic fluid N S H E P S H N R G S D T
21 AMK49165 Brazil, 2015 Serum D S H E P S H N R G P D T
22 ALX35659 Suriname, 2015 Serum N S H E P S H N R G S D T
23 AHZ13508 French Polynesia, 2013 Serum N S H E P S H N R G S D T
24 AEN75265 Nigeria, 1968 Serum R N H E P N R N R G P S T
25 AMN14619 Italy, 2016 Dominican Republic Urine N S H E P S H N R G S D T
26 AMK49164 Brazil, 2015 BeH823339 Serum D S H E P S H N R G S D T
27 AML82110 China, 2016 Saliva N S H E P S H N R G S D T
28 AMD16557 Brazil, 2015 Amniotic fluid N S H V P S H N R G S D T
29 AMH87239 Brazil, Bahia, 2015 Serum N S H E P S H N R G S D T
30 AMQ34003 Mexico, 2016 Cerebrospinal fluid D S Q E P S H N R V S D A
31 AHF49785 Central African Republic, 2013 Mosquito host K S H E P N R N R G P S T
32 ACD75819 Micronesia, 2007 Serum N S H E P S H N R G S D T
33 AMQ48981 Brazil, 2016 Urine N S H E P S H N R G S D T
34 ABI54480 Spondweni, 2006 Serum S S Y E P A N N R G P S T
35 AMQ48986 Guatemala USA Fetal brain N S H E P S H N C G S D T
36 ANG09399 Honduras N S H E P S H N C G S D T
37 ANF04750 Mexico N S H E P S H N C G S D T
38 ANF04752 Colombia Blood N S H E P S H N R G S D A
39 AFD30972 Cambodia Vero cells N S H E P S H N R G S D T
40 AMZ03556 Puerto Rico Vero cells N S H E P S H N R G S D T
41 AMZ03557 Colombia Barranquilla Vero cells N S H E P S H N R G S D A
42 AMS00611 Italy N S H E P S H N R G S D T
43 AMQ34004 Mexico Saliva N S Q E P S H N R G S D A
44 AMM39805 China Urine N S H E P S H N R G S D T
45 ANB66182 Panama N S H E P S H N R G S D A
46 ANB66184 Panama N S H E P S H N R G S D A
47 ANC90428 Panama N S H E P S H N R G S D A
48 ANB66183 Panama N S H E P S H N R G S D A
49 AMC33116 Martinique Vero cells N S H E P S H N R G S D A
50 AMR39831 China Serum N S H E P S H N R G S D T
51 ANC90426 Brazil N S H E P S H N R G S D T
52 AME17085 Brazil N S H E P S H N R G S D T
53 AMQ48982 Brazil N S H E P S H N R G S D T
54 AMZ03557 Colombia N S H E P S H N R G S D A
55 AMC39589 Mexico N S H E P S H N R G S D A
  • The partial sequences are shown with * and the − dash indicates gaps in the alignment of partial sequences. Origin/country indicated only where travel importation was reported.
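Substitution labels such as H38Q can be read off Table I by comparing a row against a reference row at the listed positions. The sketch below does this for the Mexican sequence AMQ34003 (row 30), taking AMM39806 (row 3) as the reference. The choice of reference is an assumption made for illustration, and the direction of each label depends on it; the function name `substitutions` is hypothetical.

```python
# Alignment positions listed in the header of Table I.
POSITIONS = [29, 36, 38, 60, 62, 125, 138, 203, 274, 359, 391, 452, 581]

# Residues at those positions, transcribed from Table I.
REFERENCE = "NSHEPSHNRGSDT"  # AMM39806, row 3 (assumed reference for this sketch)
QUERY = "DSQEPSHNRVSDA"      # AMQ34003, row 30 (Mexico, cerebrospinal fluid)


def substitutions(reference: str, query: str, positions: list) -> list:
    """Return labels like 'H38Q' for every listed position where the residues differ."""
    if not (len(reference) == len(query) == len(positions)):
        raise ValueError("reference, query and positions must have equal length")
    return [
        f"{ref}{pos}{alt}"
        for ref, alt, pos in zip(reference, query, positions)
        if ref != alt
    ]


print(substitutions(REFERENCE, QUERY, POSITIONS))
```

Rows with fewer than 13 residues (the partial sequences and rows with missing cells) cannot be compared this way without knowing which positions the gaps occupy, so the length check rejects them.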

Other clusters obtained from the analysis include the sequences from China. ZIKV was introduced into China through more than one route, and these introductions were important contributory factors for the 2016 spread in China [Zhang et al., ]. Seven sequences from China associated with ZIKV cases were used in the analysis. The present study using NS5 gave results similar to those of Zhang et al. [], who used genome sequences. This study included two additional sequences, AMR39830 (KU955589) and AMR39831 (KU955590), which were deposited in GenBank on 23 March 2016 and hence were not included in the analysis by Zhang et al. []. These two sequences segregated into two different clusters (Fig. 1A). However, the NS5 amino acid mutations of the Chinese isolate imported from Venezuela (AMK79469, KU744693) were not discussed by Zhang et al. []. The substitutions observed were D60E, P62A, H38K, N203I, K38H, and S36A, a total of six substitutions (Table I). Three other sequences analysed were those from Honduras (collected in 2016), Guatemala (from fetal brain, also collected in 2016), and Mexico (ANF04750, collected in 2015). All three sequences show the R274C mutation, which introduces a cysteine.

Phylogenetic analysis of ZIKV sequences from Brazil was reported by Wang et al. [2016]. The present analysis builds on that of Wang et al. [2016] using additional sequences. A mutation of E to V was observed in the Natal sequence (AMB18850) of NS5 in the present study. However, the mutation of V to A in Rio S1 (AMQ48982) of NS5 discussed by those authors was not observed in the present analysis. Substitutions at NS5 in the Asian lineages were compared with the African lineages. The mutations observed in the Asian lineages were N125S, R138H, H198Q, A275I, G336K, R396N, and E309K (Table I), the same as those observed by Wang et al. [2016]. However, the present analysis observed the additional mutations R29N, N273S, H383Q, and P391S at NS5, which were not observed by Wang et al. [2016] (Table I). Thus, two N to S mutations and two H to Q mutations were found in the analysis. The proline to serine mutation was also an interesting observation, and it is suggested that the sequence from the Philippines collected in 2012 (AMD61711) and/or the sequence from Brazil collected in 2015 (AMQ76465) may have reverted the mutation, appearing as proline.
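The lineage comparison can be illustrated with a small sketch that looks for Table I positions where a set of African sequences and a set of Asian sequences share no residue. Only three rows per lineage are used here, chosen for illustration, so the output depends on this subset: for example, adding the Philippines sequence (AMD61711), which carries proline at position 391, would remove that position from the result. The function name `separating_positions` is hypothetical.

```python
# Alignment positions listed in the header of Table I.
POSITIONS = [29, 36, 38, 60, 62, 125, 138, 203, 274, 359, 391, 452, 581]

# Residues at those positions, transcribed from Table I (illustrative subset).
AFRICAN = {
    "ABI54475": "RSHEPNRNRGPST",  # Uganda MR766
    "AEN75265": "RNHEPNRNRGPST",  # Nigeria
    "AHF49785": "KSHEPNRNRGPST",  # Central African Republic
}
ASIAN = {
    "AMD61710": "NSHEPSHNRGSDT",  # Thailand
    "ACD75819": "NSHEPSHNRGSDT",  # Micronesia
    "AHZ13508": "NSHEPSHNRGSDT",  # French Polynesia
}


def separating_positions(group_a: dict, group_b: dict, positions: list) -> list:
    """Positions at which the two groups have no residue in common."""
    result = []
    for column, position in enumerate(positions):
        residues_a = {seq[column] for seq in group_a.values()}
        residues_b = {seq[column] for seq in group_b.values()}
        if residues_a.isdisjoint(residues_b):
            result.append(position)
    return result


print(separating_positions(AFRICAN, ASIAN, POSITIONS))
```

Position 36 is not reported even though the Nigerian sequence differs there, because the other African sequences share serine with the Asian set.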

The Asian lineage might be hypothesized to have originated from African ZIKV, based on a partial sequence used in the analysis. Interestingly, the partial NS5 sequence from an isolate submitted by the Australian security CRC, Queensland, in 2007 (accession no. ABY86749; isolation source unknown, probably isolated from the Pacific region) fell within the Ugandan MR766 2006 clade (Fig. 1). This partial sequence is referred to here as the outbreak clade of 2007, the same year the first transmission of Zika virus was reported outside Africa and Asia. The results suggest that the 2007 outbreak in the Pacific resulted from the introduction of the Ugandan strain MR766 into the Pacific region, probably by travel importation. The partial sequence did not show the mutations at positions 29, 125, 138, and 198 (Table I), which is consistent with an African origin.

To conclude, the co-circulation of two ZIKV genotypes is suggested. Based on the results of the NS5 sequence analysis, a Brazilian genotype and a Latin American genotype of Mexico, Panama, and the Caribbean are proposed. However, it is possible that other sequences reflecting co-circulation passed unnoticed because of limited sampling and sequencing.

ACKNOWLEDGMENTS

The author wishes to thank Indrani Karunasagar, Director (R&D), Nitte University Centre for Science Education and Research and the Management of Nitte University, Deralakatte, Mangalore, Karnataka, India for the support in establishing the center, providing facilities and continuous encouragement in research, including the present work.
