Volume 88, Issue 10 pp. 1672-1676
Research Article

Bayesian coalescent inference reveals high evolutionary rates and diversification of Zika virus populations

Alvaro Fajardo

Molecular Virology Laboratory, CIN, Facultad de Ciencias, Universidad de la Republica, Montevideo, Uruguay

Martín Soñora

Molecular Virology Laboratory, CIN, Facultad de Ciencias, Universidad de la Republica, Montevideo, Uruguay

Pilar Moreno

Molecular Virology Laboratory, CIN, Facultad de Ciencias, Universidad de la Republica, Montevideo, Uruguay

Gonzalo Moratorio

Molecular Virology Laboratory, CIN, Facultad de Ciencias, Universidad de la Republica, Montevideo, Uruguay

Viral Populations and Pathogenesis Laboratory, Institut Pasteur, Paris, France

Juan Cristina

Corresponding Author

Molecular Virology Laboratory, CIN, Facultad de Ciencias, Universidad de la Republica, Montevideo, Uruguay

Correspondence to: Juan Cristina, Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Facultad de Ciencias, Universidad de la Republica, Igua 4225, 11400 Montevideo, Uruguay. E-mail: [email protected]

First published: 09 June 2016
Conflict of interest: None.

Abstract

Zika virus (ZIKV) is a member of the family Flaviviridae. In 2015, ZIKV triggered an epidemic in Brazil and spread across Latin America. By May 2016, the World Health Organization had warned of the spread of ZIKV beyond this region. Detailed studies of the mode of evolution of ZIKV strains are extremely important for understanding the emergence and spread of ZIKV populations. To gain insight into these matters, a Bayesian coalescent Markov Chain Monte Carlo analysis of complete genome sequences of recently isolated ZIKV strains was performed. The analysis revealed a mean rate of evolution of 1.20 × 10⁻³ nucleotide substitutions per site per year (s/s/y) for the ZIKV strains included in this study. Several variants isolated in China grouped together with all strains isolated in Latin America. Another genetic group composed exclusively of Chinese strains was also observed, suggesting the co-circulation of different genetic lineages in China. These findings indicate a high level of diversification of ZIKV populations. Strains isolated from microcephaly cases do not share amino acid substitutions, suggesting that factors other than viral genetic differences may play a role in the proposed pathogenesis caused by ZIKV infection. J. Med. Virol. 88:1672–1676, 2016. © 2016 Wiley Periodicals, Inc.

INTRODUCTION

Zika virus (ZIKV) is a flavivirus whose natural transmission cycle involves mosquito vectors from the Aedes (Ae.) genus, while humans are occasional hosts [Hayes, 2009]. Clinical manifestations of disease range from asymptomatic cases to fever, headache, malaise, and cutaneous rash. ZIKV is transmitted primarily by Ae. aegypti mosquitoes [Hayes, 2009]. Other mosquito species, such as Ae. albopictus, can also transmit the virus. Both mosquito species are found throughout the Americas, where they also transmit Dengue and Chikungunya viruses [Hennessey et al., 2016].

The ZIKV genome consists of a single-stranded, positive-sense RNA molecule of 10,794 nt in length, with non-coding regions at the 5′ and 3′ ends. The genome contains a single long open reading frame encoding a polyprotein that is cleaved into capsid (C), precursor of membrane (prM), envelope (E), and seven non-structural (NS) proteins [Kuno and Chang, 2007].

ZIKV was isolated for the first time in 1947, from the blood of a sentinel Rhesus monkey stationed in the Zika forest, Uganda [Dick et al., 1952]. Although ZIKV enzootic activity was reported in diverse countries of Africa and Asia, few human cases were reported until 2007, when an epidemic took place in Micronesia [Duffy et al., 2009]. A large ZIKV outbreak took place in French Polynesia during 2013–2014 and then spread to other Pacific Islands [Musso, 2015]. In early 2015, a ZIKV epidemic outbreak took place in Brazil, currently estimated at 440,000–1,300,000 cases [Campos et al., 2015]. By January 20th, 2016, locally transmitted ZIKV cases had been reported to the Pan American Health Organization from Puerto Rico and 19 other countries or territories in the American region [Hennessey et al., 2016]. Several studies have raised concern about a possible relation between microcephaly and ZIKV infection [Schuler-Faccini et al., 2016].

Recent phylogenetic studies estimate that a single introduction of ZIKV into the American region occurred between May and December of 2013. This is more than 12 months prior to the detection of ZIKV in Brazil. This date coincides with an increase in passengers flying to Brazil from ZIKV endemic areas, and with outbreaks in the Pacific Islands [Faria et al., 2016].

By May 2016, the World Health Organization had expressed concern over the spread of ZIKV beyond Latin America [Gulland, 2016]. To gain insight into the current situation of the ZIKV outbreak, a Bayesian coalescent analysis of recently isolated ZIKV strains for which complete genomes are known, including strains isolated from microcephaly cases, was performed to investigate evolutionary rates, population dynamics, and patterns of evolution.

MATERIALS AND METHODS

Sequences

Complete coding sequences of 39 available and comparable ZIKV strains (10,269 nucleotides) were obtained from GenBank (available at: http://www.ncbi.nlm.nih.gov). For strain names and accession numbers see Supplementary Material Table SI. Sequences were aligned using the MUSCLE program [Edgar, 2004].

Bayesian Coalescent Markov Chain Monte Carlo (MCMC) Analysis

To gain insight into the evolutionary rate and mode of evolution of currently circulating ZIKV strains, we used a Bayesian MCMC approach as implemented in the BEAST package v.1.8.0 [Drummond and Rambaut, 2007]. First, the evolutionary model that best fitted our sequence dataset was identified using FindModel software (available at: http://hiv.lanl.gov/content/sequence/findmodel/findmodel.html). The Akaike Information Criterion (AIC) and the log of the likelihood (LnL) indicated that the GTR+Γ model was the most suitable model. Both strict and relaxed molecular clock models were used to test different dynamic models (constant population size, exponential population growth, expansion population growth, logistic population growth, and Bayesian Skyline). To account for uncertainty in sampling dates, precision values were included for sequences for which only the sampling year was indicated. Statistical uncertainty in the data was reflected by the 95% highest posterior density (HPD) values. Results were examined using the TRACER v1.6 program (available from http://beast.bio.ed.ac.uk/Tracer). Convergence was obtained for two independent runs of 40 million generations, after a burn-in of four million steps, which were sufficient to obtain a proper sample of the posterior, as assessed by effective sample sizes (ESS) above 200. Models were compared by AICM, computed from the posterior output of each model using the TRACER v1.6 program. Lower AICM values indicate better model fit. The Bayesian Skyline model was the best model to analyze the data. Maximum clade credibility trees were generated using the Tree Annotator program from the BEAST package. Annotated trees were visualized using the FigTree program v1.4.2 (available at: http://tree.bio.ed.ac.uk).
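The AICM comparison described above can be illustrated with a minimal sketch. It assumes the commonly used estimator AICM = 2·s² − 2·mean, where the mean and sample variance s² are taken over the post-burn-in posterior log-likelihood samples. The function name `aicm`, the 10% burn-in default (mirroring four million of 40 million steps), and the two toy traces are hypothetical and are not taken from the study's BEAST output.

```python
from statistics import fmean, variance

def aicm(log_likelihoods, burn_in_fraction=0.1):
    """AICM from posterior log-likelihood samples (lower is better).

    Assumed estimator: 2 * sample variance - 2 * sample mean,
    computed after discarding the initial burn-in fraction.
    """
    n_burn = int(len(log_likelihoods) * burn_in_fraction)
    kept = log_likelihoods[n_burn:]
    return 2.0 * variance(kept) - 2.0 * fmean(kept)

# Hypothetical log-likelihood traces for two candidate models.
# The first sample of each trace plays the role of burn-in.
trace_a = [-19950.0, -19910.0, -19908.0, -19911.0, -19909.0,
           -19907.0, -19910.0, -19912.0, -19908.0, -19909.0]
trace_b = [x - 20.0 for x in trace_a]  # same spread, lower likelihood

scores = {"model_a": aicm(trace_a), "model_b": aicm(trace_b)}
best = min(scores, key=scores.get)  # the model with the lowest AICM
print(scores, "->", best)
```

In this toy case both traces have the same variance, so the model with the higher mean log-likelihood obtains the lower AICM and is preferred.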

RESULTS

In order to determine the evolutionary rate and mode of evolution of the currently circulating ZIKV strains, a Bayesian MCMC approach was employed as implemented in the BEAST package v.1.8.0 [Drummond and Rambaut, 2007]. The results shown in Table I are the outcome of 40 million steps of the MCMC, using the GTR+Γ model, a strict molecular clock, and the Bayesian Skyline model.

Table I. Bayesian Coalescent Inference of ZIKV Strains

Group: ZIKV full-length coding sequence

Parameter                    Value          HPD                          ESS
Prior                        −2657          −2673 to −2640               2232
Posterior                    −22566         −22584 to −22548             3101
Log likelihood               −19909         −19920 to −19899             6369
Clock rate                   1.20 × 10⁻³    9.51 × 10⁻⁴ to 1.41 × 10⁻³   1967
tMRCA American clade         2.063          2.4398 to 1.7244             1773
  (date)                     11/02/2014     27/09/2013 to 15/06/2014
tMRCA American-China clade   2.774          3.2125 to 2.4054             2663
  (date)                     28/05/2013     19/12/2012 to 09/10/2013

  • a See Supplementary Material Table SI for strains included in this analysis.
  • b In all cases, the mean values are shown.
  • c HPD, highest posterior density values.
  • d ESS, effective sample size.
  • e Clock rate was calculated in substitutions/site/year.
  • f tMRCA, time of the most recent common ancestor, is shown in years. The estimated date (day/month/year) is given on the line below each value.
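The relation between a tMRCA expressed in years and a calendar date can be sketched as follows: the tMRCA is subtracted from the sampling date of the most recent sequence, and the resulting decimal year is converted to a date. The most recent sampling date is not stated in the text, so the value 2016.18 used below is a hypothetical placeholder, as are the function names; the sketch is not expected to reproduce the dates in Table I exactly.

```python
from datetime import date, timedelta

def decimal_year_to_date(decimal_year):
    """Convert a decimal year (e.g. 2014.25) to a calendar date."""
    year = int(decimal_year)
    start = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - start).days  # 365 or 366
    offset = round((decimal_year - year) * days_in_year)
    return start + timedelta(days=offset)

def tmrca_to_date(most_recent_sample, tmrca_years):
    """Date of the common ancestor, counted back from the latest sample."""
    return decimal_year_to_date(most_recent_sample - tmrca_years)

# Hypothetical most recent sampling date, in decimal years (assumption).
MOST_RECENT_SAMPLE = 2016.18

# Mean tMRCA values (in years) from Table I.
print(tmrca_to_date(MOST_RECENT_SAMPLE, 2.063))  # American clade
print(tmrca_to_date(MOST_RECENT_SAMPLE, 2.774))  # American-China clade
```

The same conversion applies to the HPD bounds, which is why the interval for each tMRCA is listed from the older to the more recent date.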

A mean rate of 1.20 × 10⁻³ nucleotide substitutions per site per year (s/s/y) was obtained for the ZIKV strains included in this study, using exclusively complete coding sequences (Table I). Phylogenetic relations among currently circulating ZIKV strains were explored and summarized in a maximum clade credibility tree, shown in Figure 1. ZIKV strains that emerged in Latin America in 2015, as well as strains isolated in that region in 2016, cluster together and are closely related to the only French Polynesian variant available, in agreement with recent results [Musso, 2015]. This Latin American cluster also includes several strains isolated in China in 2016 (Fig. 1, in red). Interestingly, another genetic group composed exclusively of Chinese strains can be observed (Fig. 1, in blue), revealing that two different genetic lineages of ZIKV co-circulate in China. Strains isolated from microcephaly cases in 2015 and 2016 are not identical and cluster in different branches of the Latin American group (Fig. 1, in green).

Figure 1. MCC tree of ZIKV complete coding sequences. A maximum clade credibility tree was obtained using the GTR+Γ model, the Bayesian Skyline model, and a strict clock. The tree is rooted to the MRCA of the strains included. The x axis indicates years. Strains in the tree are shown by their name, geographical location, and year of isolation expressed in decimal format. ZIKV strains isolated in China that cluster with Latin American variants are highlighted in red, while other ZIKV variants isolated in that country that cluster apart are indicated in blue. Strains isolated from microcephaly cases are indicated in green.

DISCUSSION

The evolutionary rate estimated in this study for currently circulating ZIKV strains (1.20 × 10⁻³ s/s/y) is slightly higher than the one reported in recent studies on the early stages of the ZIKV outbreak in the Americas (0.98–1.06 × 10⁻³ s/s/y) [Faria et al., 2016]. The difference between these estimates can be explained by the fact that the present study includes more Latin American strains, as more recently obtained sequences were considered. Therefore, it is possible that the epidemic is still in its expansive phase, as suggested by previous analyses of epidemics, which showed that the high evolutionary rates observed early on tend to decline as the epidemic progresses [Meyer et al., 2015; Park et al., 2015].
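To put these rates in more concrete terms, the short sketch below converts a per-site rate into an expected number of substitutions per year across the 10,269-nucleotide coding region used in this study, and checks whether the 95% HPD interval from Table I overlaps the range quoted from Faria et al. [2016]. The function and constant names are hypothetical, and the overlap check is only a rough arithmetic illustration, not a formal statistical comparison.

```python
CODING_SITES = 10_269            # coding-sequence length used in this study
RATE_MEAN = 1.20e-3              # mean clock rate (s/s/y), Table I
RATE_HPD = (9.51e-4, 1.41e-3)    # 95% HPD of the clock rate, Table I
EARLIER_RANGE = (0.98e-3, 1.06e-3)  # range quoted from Faria et al., 2016

def substitutions_per_year(rate, sites=CODING_SITES):
    """Expected substitutions per year over the whole coding region."""
    return rate * sites

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

print(f"mean: {substitutions_per_year(RATE_MEAN):.1f} substitutions/year")
print("earlier range:",
      [round(substitutions_per_year(r), 1) for r in EARLIER_RANGE])
print("HPD overlaps earlier range:", intervals_overlap(RATE_HPD, EARLIER_RANGE))
```

Under these figures the mean rate corresponds to roughly 12 substitutions per year over the coding region, against about 10 to 11 for the earlier range.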

Previous studies have shown that ZIKV strains isolated in Latin America belong to the Asian cluster [Campos et al., 2015; Musso, 2015]. In this study, using full-length coding sequences, a cluster exclusively composed of ZIKV strains that emerged in Latin America was observed, in agreement with recent results [Faria et al., 2016] (see Fig. 1).

Recent studies revealed significant sequence variation in ZIKV genomes between the African and Asian lineages, as well as among different strains within the Asian lineage, as the clinical disease caused by ZIKV has changed from a benign illness to include severe neuropathology [Wang et al., 2016]. The ZIKV prM protein had the highest percentage variability between the Asian human and the African mosquito ZIKV strains. Amino acid substitutions in that protein resulted in a dramatic predicted structural change between the African and Asian strains [Wang et al., 2016]. These changes in prM could play a role in virulence or improved fitness. More studies will be needed to address this important issue.

Previous phylogenetic analyses characterized two major genetic lineages of ZIKV, the African and Asian lineages [Haddow et al., 2012; Faye et al., 2014]. Very recently, phylogenetic analyses based on the ZIKV E and NS5 genes revealed the presence of three distinct lineages (Asian/American lineage, African lineage 1, and African lineage 2) [Shen et al., 2016]. These studies also revealed important phylogeographic roles of two African countries, Senegal and Cote d'Ivoire, in ZIKV evolution and divergence [Shen et al., 2016]. Moreover, these studies revealed the migration of ZIKV from Senegal to Asian countries and the Pacific Islands. Senegal was also suggested as the geographic origin of all known ZIKV epidemics outside Africa [Shen et al., 2016]. This is in agreement with the results of this work, since ZIKV strains belonging to the cluster exclusively composed of Latin American strains have a close genetic relation with strains isolated in French Polynesia (see Fig. 1). Moreover, this is also in agreement with recent studies revealing that the closest strain to the one that emerged in Brazil was isolated from samples from French Polynesia and spread to the Pacific Islands [Musso, 2015].

The time of the most recent common ancestor (tMRCA) of all Latin American ZIKV strains was estimated at around February 2014 (95% HPD: September 2013 to June 2014) (Table I). This result is in line with recent studies that estimate the tMRCA of Brazilian isolates between August 2013 and April 2014 [Faria et al., 2016]. This clade is not exclusively composed of Latin American strains, as several Chinese strains can be observed (Fig. 1, in red). These strains seem to derive from imported cases of Latin American variants. This has also been observed in other regions of the world [Massad et al., 2016], as is the case of strain Brazil/2016/INMI1, of Brazilian origin, isolated in Rome, Italy. Interestingly, another cluster composed exclusively of Chinese strains can be observed (Fig. 1, in blue). This Chinese genetic lineage shares a common ancestor with Latin American strains, which was dated to May 2013 (95% HPD: December 2012 to October 2013), shortly before the circulation of an ancestral variant that gave rise to both Latin American and French Polynesian strains (Fig. 1), recently dated to May 2013 (confidence interval: December 2012 to September 2013) [Faria et al., 2016]. This finding suggests that this Chinese genetic lineage probably evolved from an ancestor that circulated in the Pacific Islands outbreak in 2013. This co-temporal circulation of viral lineages with different evolutionary histories suggests that ZIKV diversification may be greatly underestimated. This may be related to the fact that most ZIKV infections are asymptomatic, and symptomatic disease is generally mild, with clinical manifestations that can be mistaken for other arboviral infections, leading to misdiagnosis and underreporting [Haddow et al., 2012]. This was the case in the Micronesian outbreak of 2007, where patients were initially diagnosed with dengue fever [Lanciotti et al., 2008; Duffy et al., 2009].

Another important aspect that supports this observation is that the ZIKV natural transmission cycle involves mainly Aedes mosquitoes and monkeys, while humans and other mammals act as occasional hosts [Darwish et al., 1983; Hayes, 2009; Faye et al., 2014]. However, humans may act as potential reservoir hosts in urban cycles if they exhibit a high and sustained level of viremia [Duffy et al., 2009]. Therefore, although it has been suggested that ZIKV is mainly maintained in nature in its sylvatic cycle, the serological evidence suggests a high incidence of ZIKV circulation in humans [Duffy et al., 2009; Faye et al., 2014]. Moreover, the potential of ZIKV as an emerging disease that can easily spread around the world is supported by the extent of the current Latin American outbreak, as well as by the observation of a different genetic lineage circulating in China, which would probably have remained undetected in the absence of this American epidemic.

Interestingly, although all Latin American strains included in this study are assigned to the same cluster, strains isolated from microcephaly cases are not identical and fall in different branches of that cluster (see Fig. 1). A more detailed study of the genome of ZIKV strain NatalRGN/Brazil/2015, isolated from fetal brain tissue from a microcephaly case [Mlakar et al., 2016], revealed four unique amino acid substitutions in relation to all other strains included in this study, three of them in the NS1 protein (substitutions K143E, T230A, and M346V) and one in the NS5 protein (T3I). A similar analysis of ZIKV strain ZKV2015/Brazil/2015, isolated from amniotic fluid of fetuses with microcephaly [Calvet et al., 2016], revealed three unique amino acid substitutions in relation to all other strains included in this study, one in the envelope (E) protein (substitution S287E), one in the NS2A protein (substitution L107F), and one in the NS5 protein (substitution E579V). Strain FB-GWUH-2016, isolated in 2016 from fetal brain [Driggers et al., 2016], revealed nine unique amino acid substitutions in relation to all other strains included in this study: F147L in pre-membrane, V390I in envelope, G108A and K224Q in NS1, M38V in NS2B, T462A and M467L in NS3, and A171T and R538C in NS5. As can be observed, none of these substitutions are shared among recently published microcephaly-case genomes. This is in agreement with recent and previous reports [Calvet et al., 2016; Faria et al., 2016; Mlakar et al., 2016].
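The kind of comparison described above can be sketched as follows. This is not the procedure used in the study. It assumes an aligned set of equal-length protein sequences and reports positions where one strain carries a residue found in no other strain, written in the usual notation (majority residue among the other strains, 1-based position, residue in the strain of interest). The function name, strain names, and sequences are all hypothetical toy data.

```python
from collections import Counter

def unique_substitutions(alignment, target):
    """List substitutions found only in `target`, e.g. 'K2E'.

    `alignment` maps strain names to aligned, equal-length protein
    sequences. A position counts as unique when the target residue
    does not occur in any other strain at that position.
    """
    target_seq = alignment[target]
    others = [seq for name, seq in alignment.items() if name != target]
    found = []
    for i, residue in enumerate(target_seq):
        column = [seq[i] for seq in others]
        if residue != "-" and residue not in column:
            majority = Counter(column).most_common(1)[0][0]
            found.append(f"{majority}{i + 1}{residue}")
    return found

# Hypothetical toy alignment (not real ZIKV sequences).
toy = {
    "strain_A": "MKTAYIAK",
    "strain_B": "MKTAYIAK",
    "case_X":   "METAYIAR",
    "case_Y":   "MKTVYIAK",
}

per_case = {name: set(unique_substitutions(toy, name))
            for name in ("case_X", "case_Y")}
shared = set.intersection(*per_case.values())
print(per_case, "shared:", shared)
```

In this toy example each case strain carries its own unique substitutions and the shared set is empty, which is the pattern reported above for the microcephaly-case genomes.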

CONCLUSIONS

Taking all these results together, high evolutionary rates and fast population growth characterize the population dynamics of ZIKV strains that emerged in the Latin American region. Difficulties in diagnosing ZIKV infections have historically led to a low number of reported cases and limited sequence availability. However, both the serological evidence of a high level of ZIKV circulation in humans and the observation of a genetic lineage in China that clusters apart from Latin American outbreak strains indicate a high level of diversification of this viral agent. These observations also suggest the potential of ZIKV as an emerging disease capable of rapid spread to different regions of the world. Strains isolated from microcephaly cases do not share amino acid substitutions, suggesting that factors other than viral genetic differences may play a role in the proposed pathogenesis caused by ZIKV infection.
