The evolutionary and transmission characteristics of HIV-1 CRF07_BC in Nanjing, Jiangsu
Wei Li, Xiaoshan Li, and Yan He contributed equally to this work.
Abstract
This study aimed to understand the epidemiological, evolutionary, and transmission characteristics of HIV-1 CRF07_BC in Nanjing, China. One hundred and fifty-nine patients with HIV-1 CRF07_BC were recruited. DNA sequencing, phylogenetic analysis, and molecular transmission cluster analysis were conducted to determine the molecular epidemiology and evolutionary characteristics. Of these HIV-1-infected patients, 95.6% were male, and men who have sex with men (MSM; 76.7%) accounted for the main transmission route. Only 34.0% of these cases were born in Nanjing, and most of them (64.8%) reported having multiple sex partners in the last 6 months. The maximum likelihood phylogenetic analyses of HIV-1 CRF07_BC revealed two lineages. Overall, 67.3% of Nanjing sequences were connected to at least one other individual, distributed across 11 clusters, and the average degree was 21.2 (range, 1-178). The clustered patients were more likely to be male. The time to the most recent common ancestor of the early HIV-1 CRF07_BC circulating in Nanjing was estimated to be 1998.71 [1997.36-2001.07]. The mean estimated evolutionary rate for the epidemic cluster was slightly lower than the overall rate, at 2.38 [2.12-2.65] × 10⁻³ substitutions per site per year with the relaxed exponential clock model. HIV-1 CRF07_BC was transmitted into Nanjing more than 20 years ago from Yunnan and has become one of the most predominant subtypes, with a higher evolutionary rate than before.
Highlights
- We performed a comprehensive phylodynamic analysis of HIV-1 CRF07_BC sequences from Nanjing.
- HIV-1 CRF07_BC was transmitted into Nanjing more than 20 years ago from Yunnan.
- There were two waves of HIV-1 CRF07_BC infection transmitted to Nanjing.
- MSM may have significantly contributed to the complicated transmission pattern of HIV-1 CRF07_BC.
1 INTRODUCTION
The HIV epidemic remains a persistent public health issue in China and worldwide. Although great progress has been made in preventing and treating HIV in China, there is still much to do. There are at least nine genetically distinct subtypes of HIV-1 group M. Additionally, different subtypes can combine genetic material to form a hybrid virus, known as a 'circulating recombinant form' (CRF).1 Three key CRFs of HIV-1 predominate in China (CRF01_AE, CRF07_BC, and CRF08_BC).2, 3 HIV-1 CRF07_BC presumably arose in Yunnan Province, initially spread to Xinjiang, Guangxi, and Sichuan provinces between 1994 and 1996, and then spread to Jiangsu and Liaoning provinces between 1997 and 1998.4-9 This CRF was first identified in the intravenous drug use (IDU) population. A few years later, CRF07_BC strains were introduced into the men who have sex with men (MSM) and heterosexual (HE) populations. According to a nationwide molecular epidemiological survey in 2006, HIV-1 CRF07_BC has become one of the most widespread subtypes circulating in China.10 In addition, CRF07_BC has been involved in many newly reported CRFs through secondary recombination with other strains. Novel HIV-1 second-generation recombinant forms (CRF01_AE/CRF07_BC), (CRF07_BC/CRF55_01B), and (CRF01_AE/CRF07_BC/CRF08_BC) were isolated in some provinces of China.11-14 The emergence of these second-generation recombinant forms exemplifies the diversity and increasing complexity of the HIV-1 epidemic in China. Consequently, further molecular epidemiological investigations of HIV-1 CRF07_BC are needed to track the genetic evolution of HIV-1 strains and help prevent HIV transmission.
Nanjing is the capital of Jiangsu province and a megacity in eastern China, and like most major cities in the country, it attracts a large number of domestic migrants who come to study, work, and settle in the region. In Nanjing, a total of 3929 newly diagnosed HIV cases were reported in 2011-2016, with the number of newly diagnosed cases increasing year by year.15 Nanjing is the main HIV epidemic area in Jiangsu province, with a relatively high prevalence of CRF07_BC.16 Because of the rapid increase in the HIV-1 CRF07_BC infection rate in China, it is critically important to understand the current genetic variation of the CRF07_BC strains circulating in Nanjing. In this study, we conducted a comprehensive phylodynamic analysis of HIV-1 CRF07_BC sequences from Nanjing to reconstruct the epidemic history of the HIV-1 CRF07_BC strains circulating in this region.
2 METHODS
2.1 Study population and sample collection
Between September 1, 2015 and July 31, 2017, study participants were enrolled from five districts of Nanjing. All eligible participants met the following criteria: (a) aged 18 years or above; (b) newly diagnosed and antiretroviral-naïve; (c) agreed to enroll in this study with verbal or written informed consent. Blood samples were collected at the first follow-up after HIV diagnosis. A 10 mL peripheral blood sample was collected in an anticoagulant blood tube with EDTA. Plasma was separated from the whole blood within 12 hours after collection and stored at −80°C for further analysis.
2.2 HIV viral RNA extraction and Pol gene amplification
HIV viral RNA was extracted from 200 µL of plasma specimens using the QIAamp Viral RNA Mini Kit (Qiagen, Valencia, CA) according to the manufacturer's instructions. Reverse transcription of RNA to single-stranded cDNA was performed with SuperScript III Reverse Transcriptase using the Oligo(dT)20 primer (Invitrogen Life Technologies, Carlsbad, CA), following the manufacturer's recommendations and previously described methods.17 The RNA was then used in reverse transcription-polymerase chain reaction (RT-PCR) and nested PCR to generate the pol fragments. The pol fragment covered the entire protease (PR) gene and the first 300 codons of the reverse transcriptase (RT) gene, with an amplification product size of 1060 bp, as described previously.18 RT-PCR was conducted with the primers MAW25, 5′-TTGGAAATGTGGAAAGGAAGGAC-3′ (nt2018-2050) and RT21, 5′-CTGTATTTCTGCTATTAAGTCTTTTGATGGG-3′ (nt3539-3509); nested PCR was conducted with the primers PRO-1, 5′-CAGAGCCAACAGCCCCACCA-3′ (nt2147-2166) and RT20, 5′-CTGCCAGTTCTAGCTCTGCTTC-3′ (nt3462-3441). The PCR products were sent to Sangon Biotech for sequencing on an ABI 3730XL automated DNA sequencer. In total, 160 HIV-1 CRF07_BC pol sequences covering 1060 base pairs (HXB2: 2253-3312) were successfully obtained.
2.3 Phylogenetic analysis
HIV-1 subtypes were determined based on the branching topology, clustering, and split support of the analyzed sequences and their phylogenetic relationships with HIV-1 reference sequences, as described elsewhere.19 The reference sequences were downloaded from the Los Alamos HIV Databases using the HIV-BLAST online tool (www.hiv.lanl.gov/content/sequence/BASCI_BLAST/basic_blast.html); duplicate sequences were removed based on sequence identifiers and accession numbers, and sequences with 5% or more ambiguous nucleotides were excluded, as described in our previous work.20 A total of 424 reference sequences were identified. Nanjing sequences and reference sequences were aligned and manually edited using CLUSTAL X21 and BioEdit software version 7.2,22 and the best-fitting substitution model, estimated with the MEGA X software package,23 was the general time-reversible (GTR) model with γ-distributed rate variation among sites. A maximum-likelihood (ML) tree was reconstructed using FastTree under the GTR model of nucleotide substitution with rate variation among sites. To assess clade support, we applied the Shimodaira-Hasegawa approximate likelihood ratio test (SH-aLRT) with 1000 pseudo-replicates.
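The ambiguity filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and the choice to ignore alignment gaps when computing the ambiguous fraction is an assumption that the text does not specify.

```python
# Hypothetical helpers illustrating the ">= 5% ambiguous nucleotides" exclusion rule.
UNAMBIGUOUS = set("ACGT")
GAP_CHARS = set("-.")  # assumption: alignment gaps are not counted as ambiguous


def ambiguous_fraction(seq: str) -> float:
    """Fraction of non-gap positions that are not A, C, G, or T."""
    bases = [b for b in seq.upper() if b not in GAP_CHARS]
    if not bases:
        return 1.0  # an empty sequence carries no usable signal
    ambiguous = sum(1 for b in bases if b not in UNAMBIGUOUS)
    return ambiguous / len(bases)


def filter_sequences(records: dict, max_fraction: float = 0.05) -> dict:
    """Keep only sequences whose ambiguous fraction is below max_fraction."""
    return {
        name: seq
        for name, seq in records.items()
        if ambiguous_fraction(seq) < max_fraction
    }


# Toy records (made-up identifiers and sequences, 100 positions each).
example = {
    "kept": "ACGT" * 24 + "NNNN",      # 4% ambiguous
    "dropped": "ACGT" * 23 + "ACGRYNNN",  # 5% ambiguous
}
filtered = filter_sequences(example)
```

A sequence with exactly 5% ambiguous positions is dropped, which matches the "5% or more" wording of the exclusion rule.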
2.4 HIV-1 CRF07_BC transmission cluster inference
HIV-1 pol sequences from 583 HIV-infected individuals (424 reference sequences and 159 Nanjing isolates) were used to infer a partial HIV transmission network. Genetic distances between sequence pairs were calculated using the Tamura–Nei 93 (TN93) model. Subsequently, a genetic distance threshold of ≤1.0% was used to identify potential transmission clusters.24 Network diagrams were plotted using Cytoscape 3.7.2.25
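As a rough sketch of the thresholding step, the snippet below links any two sequences whose pairwise distance is at most 1.0% and reports the connected components as clusters, together with the mean degree. It assumes that the TN93 distances have already been computed by another tool; the sequence identifiers and distance values are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict


def build_network(distances: dict, threshold: float = 0.01) -> dict:
    """Link two sequences when their pairwise distance is <= threshold.

    distances maps (id_a, id_b) -> precomputed genetic distance.
    """
    adjacency = defaultdict(set)
    for (a, b), d in distances.items():
        if a != b and d <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def find_clusters(adjacency: dict) -> list:
    """Return the connected components of the network (iterative DFS)."""
    seen, clusters = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


def mean_degree(adjacency: dict) -> float:
    """Average number of links per node among nodes with at least one link."""
    return sum(len(neighbors) for neighbors in adjacency.values()) / len(adjacency)


# Invented distances between hypothetical Nanjing (NJ) and reference (REF) sequences.
toy_distances = {
    ("NJ01", "REF_A"): 0.004,
    ("NJ01", "NJ02"): 0.009,
    ("NJ02", "REF_A"): 0.015,
    ("NJ03", "REF_B"): 0.010,
    ("NJ04", "REF_B"): 0.032,
}
network = build_network(toy_distances)
clusters = find_clusters(network)
```

Sequences with no link at or below the threshold (here `NJ04`) never enter the network, which corresponds to the nonclustered individuals in the analysis.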
2.5 Bayesian Markov Chain Monte Carlo evolutionary analysis
To better understand the evolutionary history of HIV-1 CRF07_BC in Nanjing, we estimated the time to the most recent common ancestor (tMRCA) and the evolutionary rate of the Nanjing CRF07_BC clusters. A Bayesian Markov Chain Monte Carlo approach was implemented in the BEAST package v1.10.5. The analyses used an uncorrelated lognormal relaxed molecular clock under the general time-reversible (GTR) nucleotide substitution model with a gamma distribution (G) and a proportion of invariable sites (I). Convergence was assessed with TRACER v1.6 (available at http://tree.bio.ed.ac.uk/software/tracer/). Only traces with an effective sample size (ESS) of more than 200 were accepted. All the trees were visualized and edited with FigTree v1.4.0 (http://tree.bio.ed.ac.uk/software/figtree).
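To illustrate what the ESS criterion measures, the sketch below estimates the ESS of a sampled trace as the chain length divided by an autocorrelation-based inflation factor. It is a simplified estimator that truncates the sum at the first non-positive autocorrelation and is not claimed to reproduce the exact algorithm used by TRACER; both traces are simulated, not study data.

```python
import random


def effective_sample_size(trace: list) -> float:
    """Estimate ESS as n / (1 + 2 * sum of positive-lag autocorrelations)."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        raise ValueError("ESS is undefined for a constant trace")
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / variance
        if rho <= 0:  # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Simulated traces: independent draws versus a strongly autocorrelated AR(1) chain.
rng = random.Random(0)
independent = [rng.gauss(0, 1) for _ in range(2000)]
correlated = [0.0]
for _ in range(1999):
    correlated.append(0.95 * correlated[-1] + rng.gauss(0, 1))

ess_independent = effective_sample_size(independent)
ess_correlated = effective_sample_size(correlated)
```

With the same number of samples, the autocorrelated chain carries far less independent information, which is why a run with ESS below 200 would need a longer chain or less frequent sampling before its estimates are accepted.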
2.6 Statistical analyses
Statistical analyses were conducted using R version 3.6.2. The χ2 test, Fisher's exact test, and t test were applied to compare sociodemographic and genotype distributions between clustered and nonclustered patient samples. P < .05 was considered statistically significant.
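For readers unfamiliar with the exact test, the following sketch computes a two-sided Fisher's exact P value for a 2 × 2 table from the hypergeometric distribution. The study itself used R; this Python version is only an illustration, the function name is hypothetical, and the counts are arbitrary rather than taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * tolerance
    )


# Arbitrary illustrative counts (rows: group 1 / group 2; columns: yes / no).
p_value = fisher_exact_two_sided(8, 2, 1, 5)
significant = p_value < 0.05
```

The exact test is usually preferred over the χ2 test when one or more expected cell counts are small, as can happen for rare categories such as female patients or IDU transmission.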
2.7 Ethics statement
This study was reviewed and approved by the Human Research Ethics Committee of the Zhongda Hospital Southeast University, Jiangsu Province, China (Approval ID: 2017ZDKYSB045).
3 RESULTS
3.1 Study population demographics of HIV-1 CRF07_BC cases
Based on our previous study, HIV-1 CRF07_BC was one of the most dominant circulating strains (30.9%, 160/518) in Nanjing from 2015 to 2017 (unpublished data). One sequence with 5% or more ambiguous nucleotides was excluded from this study. We investigated 159 HIV-1 CRF07_BC cases with an average age at diagnosis of 30.4 ± 11.2 years, and more than 70% were younger than 35 years at diagnosis. Most (152, 95.6%) were male, and MSM was the main transmission route (122, 76.7%). Only 54 (34.0%) of these cases were born in Nanjing, while most were migrants from other cities of China (66.0%). Han ethnicity accounted for the majority of the subjects (149, 93.7%), and 144 (90.6%) declared they had no religion. Among the 159 HIV-1 CRF07_BC-positive cases, 90 (56.6%) were identified through active testing, whereas 69 (43.4%) were passively tested. Most of them (103, 64.8%) reported having multiple sex partners in the last 6 months, and 45 (28.3%) were students. The mean CD4+ T-cell count was 430.85 ± 241.10 cells/μL (range, 16-1897) (Table 1).
| Characteristics | Category | Frequency/Mean | Percentage |
|---|---|---|---|
| Sex | Female | 7 | 4.4% |
| | Male | 152 | 95.6% |
| Age at diagnosis, y | <35 | 117 | 73.6% |
| | ≥35 | 42 | 26.4% |
| Ethnicity | Han | 149 | 93.7% |
| | Minority nationalities | 10 | 6.3% |
| Religion | Yes | 15 | 9.4% |
| | No | 144 | 90.6% |
| Birthplace | Nanjing | 54 | 34.0% |
| | Other cities | 105 | 66.0% |
| Infection route | MSM | 122 | 76.7% |
| | HE | 35 | 22.0% |
| | IDU | 1 | 0.6% |
| | Other | 1 | 0.6% |
| Approaches to discovering HIV infection^a | Active test | 90 | 56.6% |
| | Passively tested | 69 | 43.4% |
| Multiple sex partners in the last 6 mo (n > 1) | Yes | 103 | 64.8% |
| | No | 56 | 35.2% |
| Occupation | Student | 45 | 28.3% |
| | Other | 114 | 71.7% |
| CD4+ T-cell count, cells/μL | / | 430.85 ± 241.10 | / |
- Abbreviations: HE, heterosexual; IDU, intravenous drug use; MSM, men who have sex with men.
- a Active test: Voluntary counseling and testing, Seeking medical attention as patient suspects an infection, Testing after high-risk behavior, HIV self-test; Passively tested: HIV testing as part of physical examination, HIV test before surgery, premarital HIV testing, prenatal screening for HIV infection.
3.2 Phylogenetic characteristics of CRF07_BC isolates from Nanjing
Sequences from Nanjing were combined with 424 reference sequences identified as the most similar sequences in the Los Alamos National Laboratory HIV Sequence Database (http://www.hiv.lanl.gov/). Information on the transmission route was available for 392 of the 424 reference patients. Of these, 335 (85.5%) reported transmission through MSM contact, 47 (12.0%) through IDU, nine (2.3%) through HE contact, and one through blood transfusion. The ML phylogenetic analyses of HIV-1 CRF07_BC revealed two lineages in our CRF07_BC samples (Figure 1). Lineage I comprised 26 Nanjing strains and 101 reference sequences; all sequences transmitted through IDU belonged to lineage I, and 64.6% (82/127) of the lineage I sequences were collected before 2010. Lineage II was a large epidemic lineage that included 133 Nanjing isolates and 323 reference sequences; MSM contact (89.5%, 408/456) was the main transmission route among lineage II sequences, and 76.8% (350/456) of the lineage II sequences were isolated in 2010 or later. Detailed information about the geographic source, sampling year, number, and risk factor of the reference strains is shown in supplemental Table S1.

3.3 Transmission clusters
This study characterized a molecular transmission network among Nanjing and reference sequences. A total of 20 molecular transmission clusters containing 476 sequences (81.7%) were identified. The average cluster size was 2.6, with a minimum of two (12 clusters) and a maximum of 392 (1 cluster). The average degree was 103.3 (range, 1-310) (Figure 2). Among all the links in the network, 99.5% (24 439/24 578) involved MSM. Of those 24 439 links, 88.2% (21 548/24 439) were between MSM, followed by 6.0% (1453/24 439) between MSM and individuals with an unknown infection route, 4.7% (1152/24 439) between MSM and HE, 1.0% (246/24 439) between MSM and those infected through blood transfusion, and 0.2% (40/24 439) between MSM and IDU.

Overall, 67.3% of Nanjing sequences (n = 94) were connected to at least one other individual, distributed across 11 clusters, and the average degree was 21.2 (range, 1-178). A total of 1939 molecular transmission links involved Nanjing sequences; 1887 (97.3%) of these were direct links to reference sequences, whereas only 52 (2.7%) were links between Nanjing sequences. An overview of the characteristics of the clustered and nonclustered patients from Nanjing is given in Table S2. The clustered patients were more likely to be male; however, infection route, approach to discovering infection, multiple sexual partners, and domicile were not found to be associated with cluster membership.
3.4 Estimated timeline of CRF07_BC cluster transmission in Nanjing
The tMRCA of HIV-1 CRF07_BC in our study was estimated to be 1990.23 [1986.64-1995.25]. The tMRCA of the early HIV-1 CRF07_BC lineage circulating in Nanjing, which was transmitted primarily through IDU, was estimated to be 1998.71 [1997.36-2001.07], whereas the tMRCA of the larger epidemic lineage, which was transmitted through MSM and heterosexual routes, was estimated to be around 2001.52 [2000.11-2003.58] (Figure 3). Under the relaxed exponential clock model, the evolutionary rate of HIV-1 CRF07_BC was 2.45 [2.06-2.88] × 10⁻³ substitutions per site per year. The mean estimated evolutionary rate for the epidemic cluster was slightly lower, at 2.38 [2.12-2.65] × 10⁻³ substitutions per site per year under the same model.
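The tMRCA estimates above are expressed as decimal years. As a reading aid, the sketch below converts a decimal year into an approximate calendar date by scaling the fractional part by the number of days in that year. This is one simple convention, assumed here for illustration; the date convention of the original analysis is not stated, so the resulting days should be read as approximate.

```python
import calendar
from datetime import date, timedelta


def decimal_year_to_date(decimal_year: float) -> date:
    """Convert a decimal year (e.g. 1998.71) to an approximate calendar date."""
    year = int(decimal_year)
    days_in_year = 366 if calendar.isleap(year) else 365
    offset_days = int((decimal_year - year) * days_in_year)
    return date(year, 1, 1) + timedelta(days=offset_days)


# Point estimates reported in the text, keyed by a short descriptive label.
tmrca_estimates = {
    "CRF07_BC overall": 1990.23,
    "early Nanjing lineage": 1998.71,
    "larger epidemic lineage": 2001.52,
}
tmrca_dates = {
    label: decimal_year_to_date(value) for label, value in tmrca_estimates.items()
}
```

The same conversion can be applied to the interval bounds, although the width of those intervals (several years) means that only the year is informative in practice.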

4 DISCUSSION
HIV-1 CRF07_BC has spread rapidly in China for nearly 20 years since it was first reported,26 and it accounts for a large proportion of the HIV-1 strains circulating in China. Since the first case of HIV/AIDS was reported in Nanjing in 1994, the epidemic has spread to different populations in the city, and according to the national direct network HIV/AIDS Case Reporting System, the annual number of newly reported HIV cases has increased.15 As a result of treatment advances in recent years, the number of people living with HIV has increased dramatically. We found that MSM was the subpopulation most affected by HIV-1 CRF07_BC in Nanjing. In addition, the number of new infections among those younger than 35 years had increased. Multiple sex partners and low rates of active HIV testing were common among individuals infected with HIV-1 CRF07_BC in Nanjing, consistent with our previous study.27 These HIV risk behaviors continue to pose a great challenge to public health programs designed to reduce HIV incidence.28
In this study, we performed a comprehensive phylodynamic analysis of CRF07_BC sequences from Nanjing. In total, 159 CRF07_BC sequences obtained in our laboratory were added to make the data set more representative. We identified two epidemic lineages: a small lineage comprising sequences mainly from patients infected through IDU, and a larger lineage in which the great majority of sequences came from MSM. In the late 1990s and early 2000s, HIV-1 CRF07_BC was mainly circulating among IDUs in Yunnan, Xinjiang, Guangxi, Jiangsu, and Sichuan in China. A few years later, HIV-1 CRF07_BC strains spread from IDUs to MSM and the general population.29 Our previous study demonstrated that HIV-1 CRF07_BC accounted for 28.9% of all national infections.30 The phylogenetic tree and transmission cluster analysis in our study suggest that outbreaks of HIV-1 CRF07_BC strains have occurred across the country and that the strain is no longer confined to poverty-stricken areas or to certain populations or regions, with potential transmission links between people infected with HIV through different routes of transmission. Two waves of rapid expansion in the effective population size of CRF07_BC infections were identified in the ML tree. The second wave coincided with the expansion of the MSM cluster. These results indicate that controlling CRF07_BC infections in MSM would help to curb the HIV epidemic in China. The network analysis of HIV-1 CRF07_BC revealed a large transmission cluster dominated by MSM, showed that male cases were more likely to be in clusters, and showed that most of the potential transmission links occurred between Nanjing sequences and sequences from other provinces and cities, such as Beijing, Shanghai, and Shenzhen. Together, these findings suggest that migration has had a strong impact on active cross-regional transmission.31
Our analyses dated the first introduction of HIV-1 CRF07_BC into Nanjing to 1998, among IDUs, with subsequent spread to MSM and HE populations. The second epidemic wave was introduced into Nanjing in 2001 through MSM, which was consistent with previous studies.32, 33 Our results indicated that CRF07_BC in Nanjing likely originated from similar strains previously found among IDUs in Yunnan province. We also note that HIV-1 CRF07_BC has adapted for rapid sexual transmission, resulting in the surging HIV-1 epidemic and the emergence of second-generation recombinant strains (CRF07_BC/CRF01_AE) in Nanjing.34, 35 This suggests that recombination between CRF07_BC and other subtypes in Nanjing likely began soon after the epidemic among IDUs. Meanwhile, we found that the mean estimated evolutionary rate of our epidemic CRF07_BC cluster was slightly higher than that reported in another study.29 Compared with old CRF07_BC isolates, new isolates showed higher genetic variation at both the genomic and subgenomic levels.36 Previous research identified one large CRF07_BC lineage,37-39 whereas we found two lineages, which might indicate a higher evolutionary rate of CRF07_BC in China now. This also demonstrated that there were two waves of HIV-1 CRF07_BC infection transmitted to Nanjing. This increase was accompanied by the emergence of complex patterns of viral recombination, including multiple hybrid variants derived from CRF07_BC and other subtypes.
This study analyzed a cross-sectional survey data set and has a few limitations. First, our study was limited to five districts of Nanjing, so the study cases may not be representative of all HIV-1 cases in Nanjing. Second, due to HIV-related stigma and discrimination, people living with HIV might avoid going to clinics for fear of having their status disclosed or of suffering further stigma and discrimination based on their HIV status. Further, people living with HIV in our study might have concealed information on HIV-related risk behaviors due to social desirability bias. Third, all reference sequences in the study were obtained from a public database, so we conducted a down-sampling procedure to avoid potential sampling bias.
5 CONCLUSION
HIV-1 CRF07_BC was transmitted into Nanjing more than 20 years ago from Yunnan and has become one of the most predominant subtypes, with a higher evolutionary rate. MSM have a bridging role in the spread of HIV-1 among different infection routes,40 which, combined with frequent cross-regional transmission, might have significantly contributed to the complicated transmission pattern of HIV-1 CRF07_BC. We thus recommend the development of more prevention and control efforts tailored to MSM and migrant populations.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the college students in Nanjing who took part in this study and the staff of Jiangsu CDC, Nanjing CDC, and the five district CDCs of Nanjing for patient recruitment and blood sample collection. This work was supported by the Humanities and Social Sciences of Ministry of Education Planning Fund of China (no. 16YJA840014) and the Medical Science and Technology Development Foundation, Nanjing Department of Health (grant number YKK18176).
CONFLICT OF INTERESTS
The authors declare that there is no conflict of interests.
AUTHOR CONTRIBUTIONS
PW and LZ were responsible for the conception and design of the study; WL, YG, and YH were major contributors in writing the original draft; XL and XD were responsible for the acquisition of data; JC and YH contributed to the investigation; THM, SC, and QN helped in the analysis and interpretation of data; XL and JO contributed to the final approval of the version to be submitted. All authors read and approved the final version of the manuscript.