Reply to “Misuse of molecular tools results in misleading dates for the ancestor of the Indo-Pacific humpback dolphin” by Chen
In our recent study (Zhao et al., 2021), based on a large sample-set collected from the western Pacific region to the eastern Indian Ocean, we reconstructed the population history of the Indo-Pacific humpback dolphin (Sousa chinensis). Through calculating two genetic divergence parameters (Nei's dA and FST) of the mtDNA control region (CR), we reported a species-level divergence between dolphins from the Pacific and Indian Ocean regions, and subspecies level divergence between the dolphins from Chinese waters and the Gulf of Thailand. Complete lineage sorting seems to be supported by the fixation of mutations between the samples of these three regions/subregions, except for the three likely occasional migrants found in the Indian Ocean region. Finally, we tried to estimate the substitution rate of the taxon, which was further used to date the historic gene flow from the Gulf of Thailand to the Chinese waters.
Chen (2021) queries many aspects of our research, which fall into three major categories: (1) misuse of the calibration points, which led to an underestimated substitution rate and therefore an upward bias in dating the speciation; (2) inappropriate choice of genetic marker (mtDNA control region) for the molecular clock study; and (3) insufficient sampling effort (n < 20) at many of our sampling sites, which led to a misleading conclusion about population genetic structure. We accept some of Chen's concerns but push back on others, as several of her arguments miss the target.
Underestimation of the substitution rate
Chen's (2021) key criticism was that the calibration points we used were too deep compared to the divergence within the Indo-Pacific humpback dolphins, so that the intraspecific or interpopulation mutation rate (relatively high) was replaced with the interspecific substitution rate (relatively low). By calling it a “short-term mutation rate,” she seemingly followed the current taxonomic definition of the genus Sousa, which classifies the humpback dolphins from east Asia, Southeast Asia, and the Asian region of the Indian Ocean as one species (Jefferson & Curry, 2015). But what if this definition is overly conservative (Taylor et al., 2017)?
Previous sampling and research efforts were highly uneven across the “species' range,” and mostly confined to a few locations in China. The data and results based on these efforts probably did not reflect the phylogenetic relationships or population genetic structure of humpback dolphins across the Indo-Pacific region. Prior to our study, the work of Amaral et al. (2017) represented the only sampling effort of the species' range in the Indian Ocean region, and it found a highly diverged lineage in Bangladesh waters compared to those of China and the east coast of China-Indochina and Malay Peninsulas. Accordingly, it was proposed that “humpback dolphins in the Bay of Bengal may comprise a fifth species” (Jefferson & Curry, 2015) and that “the taxonomic affinities of humpback dolphins in the entire Bay of Bengal (i.e., eastern India, Sri Lanka, Bangladesh, and Myanmar) urgently need to be re-examined” (Jefferson & Smith, 2016).
Our findings suggest that the spatial range of this “fifth species” is not restricted to the area defined by Jefferson & Smith (2016), but extends southwards to the west Thai waters, where the divergence (dA) from dolphins of the Pacific region was more than twice the species-level divergence threshold for marine mammals (Rosel et al., 2017). Divergence was also observed, albeit at a lower level, between the east Thai waters (the Gulf of Thailand) and the Chinese waters (Mendez et al., 2013; Zhao et al., 2021). Though it is premature to be conclusive, the divergence among these three regions/subregions (the area from the Bangladesh waters to the west coast of the Malay Peninsula, the Gulf of Thailand, and the Chinese waters) is not equivalent to the interpopulation or short-term divergence indicated by Chen (2021).
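For readers less familiar with the parameter, Nei's net divergence subtracts the average within-group diversity from the mean between-group distance, dA = dXY − (dX + dY)/2. The following is a minimal sketch of that calculation, assuming aligned sequences of equal length and an uncorrected p-distance; published analyses typically use a model-corrected distance instead. The function names and the toy sequences are hypothetical and are not data from our study.

```python
from itertools import combinations, product


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_within(seqs):
    """Mean pairwise distance within one group (0.0 for a single sequence)."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(xs, ys):
    """Mean pairwise distance between two groups (dXY)."""
    return sum(p_distance(a, b) for a, b in product(xs, ys)) / (len(xs) * len(ys))


def net_divergence(xs, ys):
    """Nei's net divergence: dA = dXY - (dX + dY) / 2."""
    return mean_between(xs, ys) - (mean_within(xs) + mean_within(ys)) / 2


# Toy alignment (hypothetical sequences, for illustration only).
pop_x = ["ACGTACGTAC", "ACGTACGTAT"]
pop_y = ["ACGAACGTTC", "ACGAACGTTC"]
print(round(net_divergence(pop_x, pop_y), 4))
```

Because the within-group term is subtracted, dA is less sensitive than the raw between-group distance to ancestral polymorphism carried by either group, which is why it is commonly compared against taxonomic thresholds.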
Chen (2021) disputed our explanation that the low substitution rate of the humpback dolphin was driven by drift, arguing that the mutation rate of neutral loci is independent of population fluctuation (Lanfear et al., 2014). This overlooks the fact that the conclusion was built on a simplified population model without overlapping generations. In a more realistic model that incorporates overlapping generations, as in dolphin populations, the substitution rates of neutral loci were found to be strongly associated with demographic fluctuation (Balloux & Lehmann, 2012).
Declining effective population size (Ne) was first reported for the largest population of humpback dolphins in the Pearl River Delta region, which was attributed to the shrinking estuarine habitat as a result of the sedimentation process (Lin et al., 2016). Declining Ne was also detected in the populations from the Leizhou Peninsula and the Beibu Gulf (Zhang et al., 2020), and may be shared by most of the unexamined populations. Therefore, variation in the substitution rate of humpback dolphins compared to other delphinids cannot be excluded.
Inappropriate choice of genetic marker for the molecular clock study
We accept this criticism. We used CR for the molecular clock analysis in order to date the most recent gene flow detected between the Gulf of Thailand and the Chinese waters, and in this context CR is the only genetic marker available for most of the study locations.
To reestimate the substitution rate of the CR used in our study, we reran the BEAST analysis with six calibration points (Cetacea = 36.4–52.4 mya; Mysticeti = 25.2–36.4 mya; Ziphiidae = 13.2–23 mya; Phocoenidae + Monodontidae = 7.5–19.5 mya; Delphinidae = 8.5–19.5 mya; Delphininae = 3.98–8.5 mya; McGowen et al., 2020), which resulted in a point estimate of the time to the most recent common ancestor (TMRCA) of the currently recognized Indo-Pacific humpback dolphins of 6.62 mya (95% HPD: 4.46–8.54 mya). This estimate is substantially younger than the one we previously reported, but still older than those based on multiple gene markers (2.57–5.23 mya; McGowen et al., 2009), on target sequence capture (4.39–5.23 mya; McGowen et al., 2020), on whole mitochondrial genomes (2.4 mya; Xiong et al., 2009), or on genomic data (4.8 mya; Zhang et al., 2020). Given that CR does not evolve in a clock-like mode, the choice of genetic marker may account for part of this discrepancy. In addition, the samples from the Thai waters may provide more ancestral phylogenetic information than those from the Chinese waters, which were the most heavily used in the earlier studies (McGowen et al., 2009, 2020; Xiong et al., 2009; Zhang et al., 2020). The sampling scheme may therefore also have affected the results.
The recalculated CR substitution rate for the Indo-Pacific humpback dolphins was 8.01 × 10⁻⁹ (95% HPD 6.32 × 10⁻⁹–9.12 × 10⁻⁸). Accordingly, the date of the most recent gene flow from the Gulf of Thailand to the South China Sea was translated into 31.2 ka, 95% CI [27.4, 39.6], which corresponds to the last period when the sea level fell to −80 m or lower (Zong et al., 2016). This new date is more congruent with the earlier hypothesis that the population biology of the species may be linked to the coastline development caused by sea level fluctuation (Lin et al., 2010): during the marine regression the coastline was shortened, and with it the distance between the humpback dolphins of the eastern Thai waters and those of the South China Sea (Voris, 2000), which should have triggered the last detectable gene flow within this region.
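To illustrate why the date is so sensitive to the rate, the sketch below rescales a pairwise sequence divergence into years with the simple strict-clock relation t = d / (2μ). It is only an illustration of that relation, not the procedure used to obtain the date reported above. It assumes the rate is expressed in substitutions per site per year, and the divergence value is hypothetical; the function name is likewise an assumption.

```python
def divergence_time_years(pairwise_divergence, rate_per_site_per_year):
    """Strict-clock conversion t = d / (2 * mu).

    The factor of 2 reflects that substitutions accumulate independently
    along both lineages since their separation.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    if pairwise_divergence < 0:
        raise ValueError("divergence cannot be negative")
    return pairwise_divergence / (2.0 * rate_per_site_per_year)


# Hypothetical divergence (substitutions per site), for illustration only.
d = 5.0e-4

# Point estimate of the recalculated CR rate quoted in the text,
# assumed here to be per site per year.
rate = 8.01e-9

print(f"{divergence_time_years(d, rate) / 1000:.1f} ka")

# Halving the assumed rate doubles the inferred time.
print(f"{divergence_time_years(d, rate / 2) / 1000:.1f} ka")
```

Time scales inversely with the rate under this relation, so an underestimated rate inflates every date derived from it by the same factor.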
Insufficient sampling effort
Because our sample size was <20 individuals at most sampling sites, Chen (2021) dismissed our results on the population genetic structure of Indo-Pacific humpback dolphins. We are fully aware that the population genetic structure will be depicted more precisely with increasing sample size and number of genetic markers. However, collecting a perfectly representative data set of this kind for humpback dolphins is extremely difficult, and in fact has never been achieved in the past several decades (Amaral et al., 2017, 2020; Frère et al., 2008, 2011; Lin et al., 2010, 2012; Mendez et al., 2013; Zhang et al., 2020). Moreover, for five of the seven sampled populations the sample size was >15, which comprises a relatively high proportion of these small populations (e.g., 26% for the Xiamen population and 10% for the Beibu Gulf population; Lin et al., 2021; Zeng et al., 2020), and therefore already represents the populations better than Chen (2021) seems to believe. Even if a conclusive result is the goal, waiting for a perfect data set that is not feasible is not the best way to advance science. Every small step should be encouraged, as long as the potential limitations of the study (e.g., sample size) are well acknowledged and the results are not overinterpreted or presented as definitive and validated. After all, calling a result “misleading” should be substantiated with new data or evidence, especially when this result “seems concordant with the results derived from genomic single nucleotide polymorphism (SNP) and chromosome conformation capture (Hi-C) data.”
ACKNOWLEDGMENTS
This study was funded by grants from the National Natural Science Foundation of China (42076159), the Natural Science Foundation of Fujian Province, China (2021J06031), the China-ASEAN Maritime Cooperation Fund (HX04-210901), and the Natural Science Foundation of Guangdong Province, China (2018A030313870).
AUTHOR CONTRIBUTIONS
Liyuan Zhao: Conceptualization; formal analysis; validation; writing – review and editing. Watchara Sakornwimon: Resources; validation. Wenzhi Lin: Conceptualization; methodology; writing – original draft; software; writing – review and editing; supervision. Peijun Zhang: Resources; validation. Rachawadee Chantra: Resources; validation. Yufei Dai: Validation. Reyilamu Aierken: Investigation; validation. Fuxing Wu: Validation. Songhai Li: Validation. Kongkiat Kittiwattanawong: Resources; validation; supervision. Xianyan Wang: Conceptualization; supervision; validation.