Highly variable mitochondrial chromosome content in a holoparasitic plant due to recurrent gains of foreign circular DNA
Abstract
Multichromosomal mitochondrial genomes (mtDNAs) in eukaryotes exhibit remarkable structural diversity, yet intraspecific variability and the origin of the individual chromosomes remain poorly understood. We focus on a holoparasitic angiosperm with an mtDNA consisting of 65 chromosomes largely composed of foreign DNA acquired by horizontal gene transfer (HGT) from its mimosoid hosts. The frequency, timing and population dynamics of these HGT events have not been examined. Here, we sampled different individuals of the holoparasite Lophophytum mirabile, along with their host plants, to assess mtDNA intraspecific variability and capture recent events that may bring insights into the HGT process. We also gathered mitochondrial data from 43 mimosoids to identify older and recent HGT events and assess precisely the proportion of foreign DNA. Through comparative genomic and evolutionary analyses, we uncovered great intraspecific variability in chromosome content and defined the mitochondrial pangenome of L. mirabile with 105 distinct chromosomes. The estimated foreign content reaches 93.5% of the mtDNA, including 73 fully foreign chromosomes that support the circle-mediated HGT model as a key mechanism for their acquisition. We inferred recurrent DNA transfers from the host plants, leading to new mitochondrial chromosomes that replicate autonomously. Our results emphasize the importance of adopting a pangenomic approach to fully capture the genetic diversity and evolution of multichromosomal mitochondrial genomes. This study shows that HGT can strongly influence the mtDNA content and generate enormous intraspecific variability even in geographically close individuals.
1 INTRODUCTION
The mitochondrial genome (mtDNA) architecture across eukaryotes is diverse, with the traditional view centred around a single circular chromosome that retains the ancestral bacterial structure (Smith & Keeling, 2015). However, many eukaryotic lineages, such as plants, parasitic metazoa, red algae, or protists, possess mtDNAs with a multichromosomal or fragmented architecture (Lavrov & Pett, 2016; Lee et al., 2023; Lukeš et al., 2005; Smith et al., 2012; Spencer & Gray, 2011; Sweet et al., 2022; Vlcek et al., 2011; Wu et al., 2022). Our understanding of the intraspecific variability of multichromosomal mtDNAs is limited, especially in terms of the chromosome content, the origin, and the evolutionary forces that govern their population dynamics (Lee et al., 2023; Spencer & Gray, 2011; Yu et al., 2022; Zhou et al., 2023). This knowledge gap requires a pangenomic approach that captures the spectrum of genetic diversity in species with multichromosomal mtDNAs.
A pangenome encompasses the entire set of genomic elements within a species or clade, providing a more comprehensive perspective than single reference-based genomic approaches (Golicz et al., 2020). It consists of core elements, which include sequences shared by all individuals, and variable or accessory elements, which are absent in one or more individuals. Initially, pangenomic analyses were employed to explore bacterial genetic diversity, but their application has since expanded to eukaryotic lineages. Pangenomic studies of plant mtDNAs focused on species with a single circular mitochondrial chromosome and examined structural rearrangements and genetic diversity (Allen et al., 2007; Darracq et al., 2011; Davila et al., 2011; Khachaturyan et al., 2023; Li & Cullis, 2023; Liu et al., 2022; Van de Paer et al., 2018; Wu et al., 2020). However, a pangenomic approach has rarely been applied to multichromosomal mtDNAs in plants (Štorchová et al., 2018; Wu et al., 2015; Zhou et al., 2023), although they are present in numerous flowering plants (Shen et al., 2024; Wu et al., 2022).
A key study on Silene noctiflora (Caryophyllaceae) revealed sequence and structural similarities among the multichromosomal mtDNAs of different individuals (Wu et al., 2015). These mtDNAs differ in chromosome number, particularly in the presence or absence of non-coding chromosomes, whose origin remains unknown (Wu & Sloan, 2019). The precise molecular mechanisms underlying chromosome gain or genome fragmentation into autonomous subgenomic molecules are still uncertain (Lee et al., 2023; Sweet et al., 2022; Wu & Sloan, 2019; Yu et al., 2022; Zhou et al., 2023). Plausible mechanisms underlying chromosome or DNA gains in plant mitochondria include endosymbiotic gene transfer from the nuclear or plastid genomes or duplications followed by rearrangements and sequence divergence (Mower et al., 2012). Alternatively, horizontal gene transfer (HGT) could contribute to diverse DNA tracts and play a crucial role in the evolution of mitochondrial and nuclear pangenomes (Brockhurst et al., 2019; Raimondeau et al., 2023; Roulet et al., 2024).
HGT involves the transfer of DNA between unrelated organisms, and parasitic plants are particularly prone to acquiring foreign DNA (Petersen et al., 2020; Sanchez-Puerta et al., 2023). The holoparasitic angiosperm Lophophytum mirabile ssp. bolivianum (Balanophoraceae) stands out because the foreign DNA content acquired from its mimosoid hosts (Fabaceae, tribe Mimoseae) reaches 74% of its multichromosomal mtDNA (Roulet et al., 2024; Sanchez-Puerta et al., 2017). Mimosoid-derived regions are distributed across all 65 circular chromosomes, with 32 mostly foreign chromosomes (Roulet et al., 2024). Preliminary comparisons among individuals of L. mirabile revealed variable mitochondrial chromosome content similar to that observed in Silene (Roulet et al., 2024; Wu et al., 2015). Moreover, foreign chromosomes may result from the circularization of 7–18 kb regions of the mimosoid mtDNAs (Roulet et al., 2024), potentially driving genomic diversity in the mtDNA of holoparasitic Lophophytum spp. via pervasive, ancestral, and recent transfers from their hosts.
The holoparasite L. mirabile emerges as a valuable model system to examine the population dynamics of the foreign DNA content and the evolutionary forces and molecular mechanisms that act on an HGT event. This study is timely given the availability of a large set of mimosoid genomic data, as well as comparative mitochondrial data from four genera within this small family of holoparasites. Here, we gathered mtDNA data from five individuals of L. mirabile and from more than 40 mimosoid species, including the actual hosts, to address the following questions: (1) How large is the mitochondrial pangenome of L. mirabile?; (2) How does mitochondrial chromosome content vary among individuals?; (3) What is the variability of L. mirabile individuals in terms of genetic diversity and gene content and origin?; (4) Is there evidence of chromosome gain or loss?; (5) Do the foreign chromosomes recombine with native chromosomes upon HGT?
2 MATERIALS AND METHODS
2.1 Sample collection
We analyzed the mitochondrial DNA (mtDNA) of five Lophophytum mirabile individuals. Two of these, Lm_Calilegua1 (Sanchez-Puerta et al., 2017) and Lm_Calilegua3 (Garcia et al., 2021), were the subjects of previous studies (Table S1). The remaining samples were collected in Santa Clara (Jujuy, Argentina; Lm_SantaClara), Calilegua (Jujuy, Argentina; Lm_Calilegua2), and Santa Cruz (Bolivia; Lm_Bolivia) (Figure 1; Table S1). The geographic distances between individuals were calculated using the ‘geosphere’ package in R (Figure 1). Lm_Calilegua1 and Lm_Calilegua3 were collected in the same location but four years apart.
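As an illustration of how pairwise geographic distances between collection sites can be obtained, the following minimal Python sketch computes a great-circle (haversine) distance. It is not the ‘geosphere’ implementation used in the study: the specific ‘geosphere’ function is not stated in the text, and the Earth radius and the function name `haversine_km` are assumptions of this sketch.

```python
import math

# Mean Earth radius in km; an assumption of this sketch, not a value from the study.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula for the central angle between the two points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Applied to the coordinates of each pair of collection sites (Table S1), such a function would yield a distance matrix comparable in spirit to the one summarized in Figure 1.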

Additionally, we collected seeds from the two mimosoid host plant species reported for these five individuals (Table S1). Lm_SantaClara was parasitizing Senegalia praecox (Tribe Mimoseae, Fabaceae), while individuals Lm_Calilegua1-2-3 and Lm_Bolivia were associated with Anadenanthera colubrina var. cebil (Tribe Mimoseae, Fabaceae). The correct taxonomic identification of these plants presented challenges due to the significant morphological homoplasy within the Tribe Mimoseae (Ringelberg et al., 2022). To accurately determine their identity, we conducted maximum likelihood phylogenetic analyses based on the matK gene, including 456 publicly available sequences from mimosoid species (Azani et al., 2017).
2.2 DNA extraction and sequencing
Total DNA was extracted from the inflorescence of Lm_SantaClara, Lm_Calilegua2, and Lm_Bolivia using the DNeasy Plant Maxi kit (Qiagen) (Table S1). For Senegalia, DNA was extracted from young leaves using the cetyl-trimethyl-ammonium bromide (CTAB) method (Doyle, 1991), with the addition of polyvinyl polypyrrolidone. The mtDNA of Anadenanthera was previously assembled by our group (Roulet et al., 2024). Sequencing of Lm_Calilegua2, Lm_SantaClara, Lm_Bolivia, and Senegalia was performed using DNBseq technology, generating 100- or 150-bp reads, reaching a total of approximately 6 GB of clean data per sample.
2.3 Assembly of mitochondrial genomes
The mtDNAs of Senegalia, Lm_SantaClara, Lm_Calilegua2 and Lm_Bolivia were assembled using SPAdes v.3.15.2 (Bankevich et al., 2012) (parameters: --careful -k 31,51,77 --only-assembler). Mitochondrial contigs were visualized in Bandage v.0.8.1 (Wick et al., 2015) through BLASTn analysis (Camacho et al., 2009) using a custom database of Mimoseae and Balanophoraceae mitochondrial DNA downloaded from NCBI (Table S2). The selected mitochondrial contigs were extended using SSAKE v.3.8.5 (Warren et al., 2007) and assembled with GetOrganelle v.1.7.5 (Jin et al., 2020). Subsequently, manual joining was performed with CONSED v.29 based on paired-end reads to ensure accurate assembly (Gordon & Green, 2013). The circular architecture of the mitochondrial chromosomes was confirmed by paired-end reads exhibiting pairwise mappings at the beginning and end of individual contigs, respectively.
RNAseq data can be used to reconstruct complete mitochondrial genomes (Smith, 2013; Tian & Smith, 2016). Using this approach, the mtDNA of Lm_Calilegua3 was assembled from RNAseq data (Garcia et al., 2021). The RNAseq data was derived from a total RNA extraction depleted in rRNA and sequenced with Illumina Hiseq2500 technology, generating 562,330,328 paired reads of 101 bp in length with an average insertion size of 138 bp (Table S1). The assembly was carried out using SPAdes v.3.15.2 (Bankevich et al., 2012) (parameters: --rna --careful -k 31,51 --only-assembler). Mitochondrial contigs were subsequently identified, extended, and joined using the same strategy applied for the DNA sequencing data.
In addition, we took advantage of available DNAseq data from 29 mimosoid species in public databases to assemble de novo mitochondrial genomic data (Table S3). DNAseq sequence data were downloaded from the SRA database, selecting data sets larger than 500 MB. Depending on the type of sequencing, different assembly strategies were employed for each mitochondrial genome using SPAdes (Table S3). Mitochondrial contigs were identified as previously described using BLASTn and visualized with Bandage v.0.8.1 (Wick et al., 2015). To minimize redundancy from different assemblies and reduce the number of contigs, contigs from each species were merged based on sequence overlap using Geneious Prime v.2024.0.3 (Kearse et al., 2012). The parameters applied were a minimum of 100% identity in the overlap and a minimum overlap length of 200 bp.
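The overlap criterion used for merging contigs (100% identity over at least 200 bp) can be sketched as follows. This is a simplified illustration rather than the Geneious Prime procedure: it only considers an exact suffix-prefix overlap on the same strand, ignores reverse-complement overlaps, and the function name `merge_by_overlap` is hypothetical.

```python
def merge_by_overlap(left, right, min_overlap=200):
    """Merge two contigs when a suffix of `left` exactly matches a prefix of `right`.

    Returns the merged sequence, or None when no exact overlap of at least
    `min_overlap` bases exists. Only same-strand overlaps are considered.
    """
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first and shrink until the threshold.
    for k in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None
```

With the default threshold, two contigs are joined only if they share an identical terminal stretch of 200 bp or more, so a single mismatch within the candidate overlap prevents the merge.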
2.4 Annotation of mitochondrial genomes
The mitochondrial genes of L. mirabile individuals were annotated based on the mtDNA of Lm_Calilegua1 (Sanchez-Puerta et al., 2017), while those of Senegalia were annotated based on the mtDNA of Acacia ligulata (Sanchez-Puerta et al., 2019). All annotations were conducted using Geneious Prime v.2024.0.3 (Kearse et al., 2012). Transfer RNA (tRNA) genes were identified using tRNAscan-SE (Lowe & Eddy, 1997), with default search parameters and “infernal without HMM filter”, with bacterial, organellar, and general sequences as references. Detection of plastid-derived sequences longer than 200 bp was achieved through BLASTn searches against a custom plastid DNA database (Table S2). Genome-wide repeat content analysis was executed using the get_repeats.sh script (Gandini et al., 2019).
The DNA read depth for each mitochondrial chromosome in L. mirabile individuals and in the mtDNA of Senegalia was calculated using Bowtie2 (Langmead & Salzberg, 2012) with the following parameters: --end-to-end --very-fast. Graphical representations of the organellar genomes and their read depth were generated in R. The percentage of mitochondrial-like DNA in the mtDNAs was estimated by BLASTn analysis against a custom mitochondrial database (Table S2) while excluding any BLASTn matches involving L. mirabile individuals.
2.5 Analysis of intraspecific variability
A collinearity analysis was conducted using BLASTn to identify similar sequences and investigate the intraspecific variability of the mtDNA across the five individuals of L. mirabile. A chromosome or contig (>6 kb) found in two individuals was considered homologous if either the query coverage or the subject coverage exceeded 70%, or if both coverages exceeded 60%. The numbers of mismatches, gap openings and gaps were obtained from these alignments. The intraspecific variability of the available plastomes of Lm_Calilegua2, Lm_SantaClara and Lm_Bolivia was also analyzed.
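The homology rule described above can be written as a short predicate. The sketch below is an illustration under the stated thresholds; the function and parameter names are hypothetical, and coverages are assumed to be expressed as percentages.

```python
def is_homologous(query_cov, subject_cov, length_bp, min_len_bp=6000):
    """Apply the coverage rule to a pair of chromosomes or contigs.

    query_cov and subject_cov are percentages (0-100) taken from the BLASTn
    comparison; length_bp is the length of the chromosome or contig tested.
    """
    if length_bp <= min_len_bp:
        # Only chromosomes or contigs longer than 6 kb are considered.
        return False
    either_high = query_cov > 70 or subject_cov > 70
    both_moderate = query_cov > 60 and subject_cov > 60
    return either_high or both_moderate
```

For example, a pair with coverages of 65% and 65% qualifies through the second criterion, whereas a pair with 65% and 50% does not qualify under either.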
Collinearity graphs between mtDNAs were obtained based on whole-genome alignments conducted with LASTZ v1.0.4 implemented in AliTV v1.0.6 (Ankenbrand et al., 2017). Pairwise mitochondrial genome comparisons were made for individuals of L. mirabile and for selected mimosoids.
2.6 Phylogenetic origin of the mitochondrial chromosomes
To investigate the origin of the chromosomes in the L. mirabile individuals, BLASTn searches of each chromosome were conducted against a custom mtDNA database (Table S2), including the mitochondrial data assembled from different mimosoid plants (Table S1). The percentage of foreign content for each chromosome was calculated as the proportion of the chromosome covered by BLASTn hits to mimosoid plants with >90% identity and >200 bp in length. The proportions of foreign content in each chromosome were visually represented using ggplot2 in R (Wickham, 2011).
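One way to read the foreign-content calculation is as the fraction of a chromosome covered by qualifying BLASTn hits, counting overlapping hits only once. The sketch below illustrates this interpretation; the input layout (tuples of start, end and percent identity in 1-based inclusive coordinates) and the function name `foreign_fraction` are assumptions, not the authors' script.

```python
def foreign_fraction(hits, chrom_len, min_identity=90.0, min_len=200):
    """Fraction of a chromosome covered by hits passing both thresholds.

    hits: iterable of (start, end, pct_identity) with 1-based inclusive
    coordinates on the chromosome. Overlapping hits are merged before summing.
    """
    kept = sorted(
        (min(s, e), max(s, e))
        for s, e, pid in hits
        if pid > min_identity and abs(e - s) + 1 > min_len
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in kept:
        if cur_end is None or s > cur_end:
            # Close the previous merged interval and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = s, e
        else:
            # Overlapping hit: extend the current merged interval.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / chrom_len
```

Merging intervals before summing avoids inflating the estimate when several mimosoid species match the same region of a chromosome.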
Results from BLASTn searches with an e-value < 2 × 10⁻¹⁰ were visualized using the Sushi R package v.1.20.0 (Phanstiel et al., 2014). BLASTn hits were organized in rows based on different taxonomic categories for better visualization. These categories encompassed individuals of L. mirabile, other species of Balanophoraceae, as well as species from the order Santalales, mimosoid plants, other legumes, and other angiosperms (Table S2). In cases where multiple overlapping matches occurred within the same taxonomic category, hits with the highest percentage identity were given priority, placing them above others.
2.7 Estimating divergence time of L. mirabile individuals and HGT events
The divergence time among the L. mirabile individuals and the timing of the HGT events were inferred under the assumption of a uniform substitution rate (Zuckerkandl & Pauling, 1965) using the mitochondrial substitution rate estimated for intergenic regions of Arabidopsis (Christensen, 2013). For this, the number of substitutions in non-coding native mitochondrial tracts of L. mirabile was calculated in pairwise comparisons among L. mirabile individuals, and in non-coding foreign tracts against mimosoid legumes.
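Under a uniform substitution rate, a pairwise divergence time can be approximated as T = K / (2μ), where K is the number of substitutions per site between two sequences and μ is the substitution rate per site per year. The sketch below illustrates this arithmetic only. It applies no correction for multiple substitutions, the rate must be supplied by the user (the value from Christensen, 2013 is not reproduced here), and the function name is hypothetical.

```python
def divergence_time_years(n_substitutions, aligned_sites, rate_per_site_per_year):
    """Approximate divergence time between two sequences under a strict clock.

    Uses T = K / (2 * mu), where K = n_substitutions / aligned_sites. The
    factor 2 assumes both lineages accumulate substitutions at the same rate
    since they diverged.
    """
    if aligned_sites <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("aligned_sites and rate must be positive")
    k = n_substitutions / aligned_sites
    return k / (2.0 * rate_per_site_per_year)
```

The same expression can be applied to native tracts compared between L. mirabile individuals and to foreign tracts compared with their mimosoid counterparts, provided the same rate is assumed for both lineages.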
2.8 Phylogenetic origin of the genes
Phylogenetic analyses were performed for mitochondrial genes. Multiple nucleotide sequence alignments were constructed by including nucleotide sequences from various angiosperms (Table S4). For genes shorter than 120 bp, phylogenetic analyses were performed using 300 bp upstream and downstream of the gene to provide sufficient context for reliable inference. Nucleotide sequences were aligned using MUSCLE in AliView v.1.27 (Larsson, 2014). We identified the best-fit nucleotide substitution model using jModelTest and constructed maximum likelihood trees in RAxML v.8.2.11 (Stamatakis, 2014) with the GTRGAMMAI model, including 1,000 rapid bootstrap pseudoreplicates. The resulting trees were visualized using FigTree v.1.4.4. Upon visual examination of gene alignments, the chimeric nature of two genes was evaluated with Geneconv v.1.81a (Sawyer, 1989). For chimeric genes (i.e. composed of both native and foreign sequences), separate phylogenetic trees were constructed for native and foreign regions.
3 RESULTS
3.1 Mitochondrial genomes of Lophophytum mirabile individuals
The mitochondrial DNA (mtDNA) of the five Lophophytum mirabile individuals exhibited similar structural characteristics, such as a multichromosomal organization with circular chromosomes ranging between 6.6 and 22.5 kb (Table 1). The circular nature of a few of these chromosomes was previously confirmed by Southern blots (Roulet et al., 2024). However, differences were observed in chromosome number and total genome size. Among the individuals, Lm_Calilegua3 possessed the largest mitochondrial genome, while Lm_Calilegua2 had the smallest. The DNA read depth was even within each mitochondrial chromosome and varied 2-fold among chromosomes within each individual (Figure S1; Table S5). The lower mitochondrial DNA read depth in Lm_Bolivia precluded the complete assembly of the mtDNA. The total repeat content of these individuals ranged from 10.63% to 16.28% of the mtDNA, with a GC content spanning 44.1% to 44.5% (Table 1). Excluding duplicates, the protein and rRNA gene content was the same in all individuals, while the tRNA content and the number of duplicated protein-coding genes varied (Table 1).
Lophophytum mirabile individuals | Lm_Calilegua1 | Lm_Calilegua2 | Lm_Calilegua3 | Lm_SantaClara | Lm_Bolivia |
---|---|---|---|---|---|
Genome length (bp) | 821,906 | 749,777 | 867,716 | 811,032 | 763,684 |
Number of chromosomes | 65a | 60 | 67 circular chr, 3 linear contigs | 66 | 35 circular chr, 62 linear contigs |
GC content (%) | 44.5 | 44.4 | 44.4 | 44.4 | 44.1 |
Protein-coding genes (including repeats) | 35 (44) | 35 (42) | 35 (42) | 35 (46) | 35 (37) |
rRNA-coding genes | 3 | 3 | 3 | 3 | 3 |
tRNA-coding genes | 5 | 5 | 5 | 7 | 12 |
Total repetitive content (bp; % of genome) | 133,816; 16.28% | 96,612; 12.89% | 132,651; 15.28% | 118,388; 14.6% | 81,142; 10.63% |
Chloroplast-derived sequences (% of genome) | 0.95% | 1.00% | 1.69% | 0.87% | 0.95% |
Mitochondrial-like sequences (% of genome) | 97.76% | 97.54% | 97.87% | 97.72% | 96.33% |
References | Sanchez-Puerta et al. 2017; Roulet et al. 2024 | This study | This study | This study | This study |
Status of mtDNA | Complete | Complete | Draft | Complete | Draft |
GenBank accession numbers | KU992322-KU992380, KX792461 | PV200216-PV200275 | b | PV200790-PV200855 | PV247579-PV247675 |
- a Sanchez-Puerta et al. (2017) assembled the mtDNA into 60 circular chromosomes. However, based on new evidence, Roulet et al. (2024) showed that the mtDNA consists of 65 circular chromosomes, by splitting four larger chromosomes into nine smaller ones.
- b The mitochondrial contigs of this individual based on RNAseq data are available in figshare (https://doi.org/10.6084/m9.figshare.28458254).
3.2 Great intraspecific variability in L. mirabile mtDNA in terms of chromosome content
Collinearity analyses among L. mirabile individuals revealed that their mtDNAs differed in the presence or absence of entire mitochondrial chromosomes (Figure 2). The degree of similarity in chromosome content was correlated with the geographic distance (Figure 1) and phylogenetic relationships (Ceriotti et al., 2025) among individuals. Lm_Calilegua1, Lm_Calilegua2, and Lm_Calilegua3 exhibit the highest level of chromosome sharing, whereas the lowest level is observed between Lm_Bolivia and the other individuals. The proportion of shared mtDNA among L. mirabile individuals ranges from 50 to 100% (Figure 2b). The variability in mtDNA content is even more striking when comparing L. mirabile with its close relatives L. pyramidale and Ombrophytum subterraneum (Figure 2), as previously described (Roulet et al., 2024). In contrast, the plastid genome exhibits a high level of conservation, with >99% of the genome shared between L. mirabile individuals and over 90% with their closest relatives (Figure 2b).

The decoding of the mtDNAs of five individuals led to the first report of a mitochondrial pangenome for L. mirabile. The mitochondrial pangenome comprises at least 293 circular chromosomes and 65 linear contigs >1 kb, which may represent additional circular chromosomes leading to an increase in the pangenome size (Table 1). Within this pangenome, we identified 105 distinct types of mitochondrial chromosomes (circular or linear >6 kb), collectively forming the mitochondrial “panchromosome” of L. mirabile (Figure 3). Among the distinct mitochondrial chromosomes, 30 are shared across all individuals, termed “core chromosomes”. The remaining chromosomes (75) exhibit variable presence among individuals, including “soft-core” (only absent in a single individual) and “shell chromosomes” (absent in two or three individuals). Among these variable chromosomes, we identified 26 as soft-core, with 24 absent in Lm_Bolivia but present in the other four individuals. Given the unfinished status of the mtDNA of Lm_Bolivia, it is possible that some of these soft-core chromosomes are actually present in Lm_Bolivia. We also identified eight shell chromosomes shared by different combinations of three individuals. Lastly, we identified 41 “unique chromosomes”, which were exclusively identified in a single individual. Lm_Calilegua3 possesses six, Lm_Calilegua1 has two, Lm_SantaClara has ten, and Lm_Bolivia has 23 unique chromosomes, while no unique chromosomes were found in Lm_Calilegua2.
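The panchromosome categories defined above follow directly from the number of individuals in which each distinct chromosome is found. The sketch below illustrates this classification for five individuals; the input layout (a mapping from chromosome identifier to the set of individuals carrying it) and all names are hypothetical.

```python
def classify_chromosome(carriers, n_individuals=5):
    """Assign a panchromosome category from the set of individuals carrying it."""
    n = len(set(carriers))
    if n == 0 or n > n_individuals:
        raise ValueError("carrier count must be between 1 and n_individuals")
    if n == n_individuals:
        return "core"        # shared by all individuals
    if n == n_individuals - 1:
        return "soft-core"   # absent in a single individual
    if n == 1:
        return "unique"      # found in a single individual
    return "shell"           # absent in two or three individuals


def classify_panchromosome(presence, n_individuals=5):
    """Classify every chromosome in a {chromosome_id: carriers} mapping."""
    return {chrom: classify_chromosome(ind, n_individuals) for chrom, ind in presence.items()}
```

Counting the resulting labels over all 105 distinct chromosomes would reproduce the tallies reported in this section (30 core, 26 soft-core, eight shell and 41 unique), given the presence/absence calls summarized in Figure 3.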

The mitochondrial panchromosome of L. mirabile was also classified into coding and non-coding chromosomes based on the presence of functional protein or rRNA coding genes identified in Lm_Calilegua3 (Garcia et al., 2021). Note that the functionality of the tRNAs remains unknown. The 24 coding chromosomes are present in all individuals and represent the majority of the core chromosomes (Figure 3). The non-coding chromosomes carry tRNAs or non-functional genes, or are entirely devoid of known genes. They are variably present among individuals (except four) and include all the soft-core, shell and unique chromosomes (Figure 3).
The chromosomes shared between L. mirabile individuals exhibit high sequence identity (Tables S6-S10). In general, single nucleotide polymorphisms (SNPs) outnumbered gap openings, and structural variations were infrequent. The frequencies of SNPs, gaps and chromosome length variations increase with the geographic distance between individuals (Figure S2a). For example, the alignments (totalling ~750 kb) between the mitochondrial assemblies of Lm_Calilegua1 and Lm_Calilegua2 show 46 mismatches and 22 gap openings, with no genomic rearrangements and no large differences in chromosome lengths. In contrast, Lm_Calilegua1 and Lm_Bolivia mitochondrial sequences differ by 955 SNPs and 621 gap openings across the 260 kb of shared regions dispersed across 26 chromosomes, which also differ in length. A similar pattern is observed in the plastid genomes when comparing Lm_Calilegua2, Lm_SantaClara, and Lm_Bolivia (Figure S2b).
3.3 The mitochondrial panchromosome of L. mirabile is fully foreign, except for a few native genes
We wished to evaluate the phylogenetic origin of the L. mirabile mitochondrial panchromosome and assess the extent of foreign DNA donated from mimosoid hosts. For this, we increased the diversity of mimosoid mitochondrial data available to include not only the current host species but a greater diversity to recognize both recent and older transfer events (Figure S3). We successfully assembled de novo mitochondrial genomic data from 29 mimosoid species using DNAseq data available in public databases (Table S3). We also analyzed the two host species that are parasitized by the individuals of L. mirabile studied here. Phylogenetic analyses of the matK gene confirmed that the host species of L. mirabile are Anadenanthera colubrina and Senegalia praecox (Figure S4). The mtDNA of Senegalia was larger (819 kb) than that of Anadenanthera (678 kb) and assembled into five linear contigs (Figure S5). Overall, we compiled mitochondrial data from a total of 43 mimosoids distributed across 12 of the 17 lower-level clades of the tribe Mimoseae (Bruneau et al., 2024) and one unclassified genus (Figure S3). This diversity of mitochondrial data provides a broad representation of the Mimoseae, encompassing 70% of the lower-level clades. Collinearity analyses revealed inter- and intraspecific variability among the mimosoid mitochondrial data, with only a few relatively short syntenic blocks and several structural rearrangements (Figure S6).
We inferred the phylogenetic origin of the L. mirabile mitochondrial panchromosome based on the sequence identity and the length of the homologous regions when compared with the mitochondrial data from mimosoids and with all other available angiosperms. The mitochondrial panchromosome of L. mirabile (i.e. 105 distinct chromosomes) exhibits 93.52% foreign content based on BLASTn hits to mimosoid mitochondrial data with >200 bp and >90% identity. Hits against mimosoids were larger and showed a higher degree of identity than those to the other taxonomic categories, including other genera of Balanophoraceae or Santalales (Figures S7-S8; Table S11). In fact, the regions that were vertically inherited in the mtDNA of L. mirabile are limited to the few native and chimeric genes (see below) and their flanking regions; the rest have been lost or replaced by foreign mitochondrial DNA.
The foreign DNA is distributed across all chromosomes, with 73 fully foreign chromosomes (each showing 90–100% sequence coverage relative to mimosoid mitochondrial data); 23 mostly foreign chromosomes (with 60–89% foreign content); and nine chromosomes with less than 60% foreign content (Figures 3, S7). The majority of the fully foreign chromosomes are non-coding (94.5%), including 32 unique and 24 soft-core chromosomes (Figure 3; Figure S7). In contrast, over 60% of the mostly foreign and less-foreign chromosomes contain coding genes and represent mainly core chromosomes.
3.4 Several foreign chromosomes derive from continuous mitochondrial tracts in mimosoid hosts flanked by direct repeats
Out of the 73 fully foreign chromosomes, 17 show a continuous BLASTn hit with mitochondrial data from a single mimosoid plant species that covers the whole chromosome length (Figure 3; Table S12). The mimosoid species that share these chromosome-long hits include mainly Anadenanthera and other species from the Parkia clade (Vachellia and Parkia). Continuous hits with the host species Anadenanthera colubrina are primarily found with the unique chromosomes of Lm_Calilegua1 and Lm_Calilegua3, while those with Vachellia appear predominantly with the soft-core chromosomes (Table S12). Notably, the continuous BLASTn hits in the mimosoid mtDNAs are flanked by short, direct repeats (6–117 bp, Table S12), which were likely involved in the repair and recombination process that led to the circular chromosomes, as previously proposed (Roulet et al., 2024).
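To illustrate what a tract flanked by short direct repeats looks like in a donor sequence, the sketch below measures the longest exact match between the sequence at the start of a tract and the sequence immediately after its end. This is a toy check on 0-based, end-exclusive coordinates, not the procedure used to build Table S12; the function name and the `max_len` cap are assumptions.

```python
def flanking_direct_repeat_len(donor, start, end, max_len=200):
    """Length of the exact direct repeat shared by donor[start:] and donor[end:].

    `start` and `end` are 0-based, end-exclusive tract boundaries on the donor
    sequence. Returns 0 when the two positions do not begin with the same bases.
    """
    k = 0
    while (
        k < max_len
        and start + k < end
        and end + k < len(donor)
        and donor[start + k] == donor[end + k]
    ):
        k += 1
    return k


# Toy donor: a 7-bp repeat (GATTACA) at the tract start and right after its end.
toy_donor = "TTT" + "GATTACA" + "CCCCGGGGAAAA" + "GATTACA" + "TTT"
repeat_len = flanking_direct_repeat_len(toy_donor, 3, 22)
```

In this toy example, recombination between the two repeat copies would be expected to release a circle corresponding to `toy_donor[3:22]`, which retains a single copy of the repeat.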
In contrast, virtually no continuous hits were observed in the unique chromosomes of Lm_Bolivia or Lm_SantaClara, not even with the current host of Lm_SantaClara, Senegalia praecox. This suggests that those unique chromosomes in Lm_SantaClara were recently acquired from unsampled mimosoids or represent older transfers from ancestral mimosoids. Meanwhile, even though the host species of Lm_Bolivia is Anadenanthera colubrina, the actual host plant (or any other individual growing in the area) was not sampled. The high intraspecific divergence of the mtDNAs between geographically distant individuals of A. colubrina (Figure S6) may preclude the identification of large, continuous mitochondrial tracts with the unique chromosomes of Lm_Bolivia. It is possible that these chromosomes in Lm_Bolivia were recently acquired from an unsampled population of A. colubrina.
3.5 Ancestral and more recent acquisitions of non-coding chromosomes in the mitochondrial pangenome of L. mirabile
The timing of the horizontal acquisitions of non-coding chromosomes was estimated based on the DNA sequence divergence assuming uniform substitution rates (Figure 4). First, the divergence time among L. mirabile individuals was inferred based on the few native mitochondrial regions. The phylogenetic relationships among individuals match the geographic distance among them. The range of sequence identity of the foreign non-coding chromosomes among L. mirabile individuals (Tables S6-S10) and with the mimosoids indicates that both ancient and recent horizontal transfers occurred (DataSet S1). A single ancestral acquisition of a non-coding core chromosome maintained by all the L. mirabile individuals was identified in branch A (Figures 4, S7). The divergence of this chromosome among the L. mirabile individuals is equivalent to that of the native regions, indicating that it was vertically inherited. Other ancestral acquisitions are concentrated in the common ancestor of the Calilegua and Santa Clara individuals (branch B), and include the majority of the soft-core chromosomes. Additional ancestral but more recent acquisitions are specific to the Calilegua individuals (branch C) and include most of the shell chromosomes. These chromosomes show the expected divergence among the Calilegua individuals. We identified only three potential cases of chromosome loss in Lm_Calilegua2 (Figure S7). On the other hand, unique chromosomes of the individuals of Calilegua represent recent acquisitions to a single individual (instead of ancestral events followed by chromosome losses), as indicated by the much greater sequence identity with mimosoid donors, in particular with the individual of Anadenanthera from Jujuy (DataSet S1), than the expected divergence in ancestral gains.

Unlike the Calilegua individuals, the origin of the fully foreign, unique chromosomes in Lm_Bolivia and Lm_SantaClara remains enigmatic. In the case of Lm_Bolivia, we hypothesize that the unique chromosomes came from the Anadenanthera individual it parasitizes. However, the lack of data on Anadenanthera samples from Bolivia limits the support for this conclusion, as intraspecific variability in Anadenanthera is high (Figure S6). The case of Lm_SantaClara is puzzling as we expected to find evidence of recent acquisitions from its current host, Senegalia. Instead, the unique chromosomes share larger tracts and have a higher identity with the mtDNAs of different species within the Parkia clade and not with Senegalia (DataSet S1). These findings indicate that the donor belongs to the Parkia clade, but the specific donor and timing of the HGT events remain unknown. To address this gap, future efforts should prioritize expanding the sampling of species of the Parkia clade across South America, particularly in regions overlapping the distribution of L. mirabile.
Some of the shared foreign DNA tracts in different L. mirabile individuals are the result of convergent HGT, involving the transfer of the same tract of mtDNA to different lineages independently (Figure 4). This result is supported by a much higher sequence identity of the foreign tracts with mimosoids than among L. mirabile individuals (DataSet S1). These convergent gains include five core, non-coding chromosomes (which should be renamed as soft-core chromosomes) and a few soft-core and shell chromosomes (Figure S7). This observation is expected, given the high frequency of HGT in L. mirabile.
3.6 Intraspecific variability in the gene content of L. mirabile
We identified 53 distinct genes in the mitochondrial pangenome of L. mirabile, including 42 protein-coding genes (PCGs), three ribosomal RNA genes, and eight transfer RNA genes (Figure S9). All 42 PCGs are present across the five L. mirabile individuals; 32 are single-copy genes, while 10 are duplicated in at least one individual. A previous study of Lm_Calilegua3 confirmed that a single copy of each PCG is functional (Garcia et al., 2021) (Figure 3; Figure S9). PCG copies exhibit a percent identity exceeding 96% across all individuals, with Lm_Bolivia showing the highest divergence (Figure S10).
Duplicate gene copies vary in their presence/absence across individuals (Figure S9). For example, two copies of atp8 are present across all L. mirabile individuals except Lm_Bolivia, which contains three copies. Two other genes (rpl10 and rps19) have duplicate copies in all individuals except Lm_Bolivia. Additionally, two genes (atp9 and rps4) are duplicated exclusively in the Calilegua individuals. Some duplicate PCGs are specific to a single individual, such as Lm_Calilegua1 (cox3 and sdh4), Lm_SantaClara (cox2 and nad5x3), and Lm_Bolivia (atp8). This variability is correlated with the presence/absence of chromosomes (Figure 3).
All L. mirabile individuals contain single copies of the three ribosomal genes (rrn5, rrnL, and rrnS), while there is variability in the presence of the eight tRNAs (Figure S9). Each individual has at least one copy of trnfM-CAU, trnI-CAU, and trnQ-UUG, while the presence of the remaining tRNAs is sporadic. The functionality of tRNAs in L. mirabile has not been examined (Garcia et al., 2021), but we hypothesize that the variably present tRNAs are likely not functional.
3.7 Origin of genes in L. mirabile individuals
Across all L. mirabile individuals, 33 of the 35 functional PCGs share the same origin (foreign, chimeric, or native) and were vertically inherited from their common ancestor (Figure 5; Figure S11). Among these, 20 genes are entirely foreign, three are chimeric, and 12 are native (Roulet et al., 2020; Sanchez-Puerta et al., 2019). The other two PCGs, cox1 and nad9, are chimeric in Lm_Bolivia, while native in the other individuals (Figure S11). Additionally, duplicate copies of the PCGs are either foreign or chimeric. For instance, all copies of the cox2 gene from L. mirabile individuals cluster within the Mimoseae clade with strong bootstrap support (Figure 6a). Conversely, most native genes from L. mirabile individuals group within the Balanophoraceae clade (order Santalales) with high bootstrap support (≥ 70%). For instance, the ccmFn genes of the L. mirabile individuals cluster with other Balanophoraceae species (bootstrap support = 100%) (Figure 6b). All three ribosomal genes shared across L. mirabile individuals are native, while the eight tRNAs are foreign based on phylogenetic analyses and BLASTn results (Figures S8, S9; Figure S11).


4 DISCUSSION
Our study reveals the extensive mitochondrial chromosome diversity in Lophophytum mirabile, highlighting a mitochondrial pangenome enriched with foreign chromosomes from mimosoid host plants. The intraspecific variability points to horizontal gene transfer (HGT) as the primary evolutionary force driving both the incorporation of new chromosomes from host plants and the differentiation of the mitochondrial genomes among individuals. HGT not only generates variability within L. mirabile but also distinguishes it from closely related species in the Balanophoraceae family, emphasizing the role of HGT in shaping its mitochondrial pangenome. These findings contrast with the multichromosomal mtDNAs from different individuals of Thonningia sanguinea (Balanophoraceae) that exhibit identical chromosome content and minimal HGT impact (Zhou et al., 2023).
4.1 The L. mirabile mitochondrial pangenome was largely modified by recurrent transfers of host-derived mitochondrial DNA
The mitochondrial pangenome of L. mirabile shows substantial intraspecific variability, driven by the repeated gain of foreign mitochondrial chromosomes. Within this pangenome, we identified 105 distinct chromosomes larger than 6 kb that constitute the “mitochondrial panchromosome”. These chromosomes are classified into four groups: 30 core (shared by all individuals), 26 soft-core (absent from a single individual), eight shell (detected in three or two individuals), and 41 unique (present in a single individual) chromosomes. The structure of the mitochondrial panchromosome suggests that a larger sampling of individuals could uncover new chromosomes, expanding the L. mirabile mitochondrial pangenome and panchromosome.
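The four panchromosome categories follow directly from the number of individuals that carry each chromosome. As a minimal sketch, assuming five sampled individuals and a simple presence table (the chromosome names chr_a to chr_d and the helper function are hypothetical, used only for illustration), the classification could be expressed in Python as:

```python
from collections import Counter

INDIVIDUALS = ["Lm_Calilegua1", "Lm_Calilegua2", "Lm_Calilegua3",
               "Lm_SantaClara", "Lm_Bolivia"]


def classify_chromosome(n_present, n_individuals=5):
    """Return the pangenome category for a chromosome found in n_present
    of n_individuals sampled individuals (definitions as in the text)."""
    if not 1 <= n_present <= n_individuals:
        raise ValueError("n_present must be between 1 and n_individuals")
    if n_present == n_individuals:
        return "core"        # shared by all individuals
    if n_present == n_individuals - 1:
        return "soft-core"   # absent from a single individual
    if n_present == 1:
        return "unique"      # present in a single individual
    return "shell"           # detected in three or two of five individuals


# Toy presence table with hypothetical chromosome names.
presence = {
    "chr_a": set(INDIVIDUALS),
    "chr_b": set(INDIVIDUALS) - {"Lm_Bolivia"},
    "chr_c": {"Lm_Calilegua1", "Lm_Calilegua3"},
    "chr_d": {"Lm_SantaClara"},
}
counts = Counter(classify_chromosome(len(inds)) for inds in presence.values())
print(counts)

# The category sizes reported in the text add up to the panchromosome.
assert 30 + 26 + 8 + 41 == 105
```

Because the categories depend on the number of individuals sampled, adding individuals would be expected to move some chromosomes between categories as well as reveal new ones.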
Earlier studies based on a single individual of L. mirabile underestimated the true content of foreign DNA as a result of the limited amount of mimosoid data available, which included only six of the 3,500 mimosoid species representing five of 17 described lower-level clades (Bruneau et al., 2024; Roulet et al., 2024; Sanchez-Puerta et al., 2017, 2019). This study also confirmed that analyzing foreign DNA in a single individual overlooks significant amounts of foreign DNA that vary both among and within populations. With mitochondrial data from five individuals of L. mirabile and from 43 diverse mimosoids (sampling 11 of the 17 lower-level clades and one of the two grades), the estimated foreign DNA content in the mitochondrial pangenome of L. mirabile reaches 93.52%. The mtDNA of L. mirabile is a chimera consisting of foreign mitochondrial DNA acquired from different mimosoid hosts throughout its evolution, in addition to a minimal fraction of native DNA. This is the largest proportion of foreign DNA reported for a mitochondrial genome or any other genome across the tree of life, although larger absolute amounts of foreign DNA have been identified in much larger bacterial, mitochondrial or nuclear genomes (Dunning Hotopp, 2011; Kloub et al., 2021; Rice et al., 2013).
4.2 Evidence of circle-mediated HGT in L. mirabile
The broad sampling of mimosoids allowed us to identify 73 fully foreign chromosomes. Many of these chromosomes exhibit high similarity to continuous mitochondrial tracts in donor mimosoid plants. Furthermore, in numerous instances, the horizontally transferred DNA tracts were flanked by short, direct repeats in the donor mimosoid plants. These results support the “circle-mediated HGT model”, which postulates that foreign DNA tracts circularize and form autonomous foreign chromosomes without the need to be integrated into the resident mitochondrial genome (Roulet et al., 2024). This model explains the chromosomal diversity observed in L. mirabile.
After germination, L. mirabile penetrates the root of its mimosoid host and establishes vascular connections that facilitate the exchange of macromolecules and organelles, as demonstrated in experimental grafts (Hertle et al., 2021). These transfers may occur at the endophytic phase, when a small number of parasitic cells reside within the host roots (Gonzalez & Sato, 2016). Upon mitochondrial transfer, host mitochondria fuse with the native ones, as proposed in the “fusion-compatibility model” (Rice et al., 2013). Rather than immediately integrating into the native mtDNA, regions of the foreign DNA can circularize into smaller molecules, providing stability against exonuclease degradation. These circular molecules may originate through microhomology-mediated repair, using the flanking short, direct repeats in the foreign mitochondrial DNA, as proposed in the “circle-mediated HGT model” (Roulet et al., 2024).
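The signature expected under this model, a transferred tract delimited by short direct repeats in the donor mtDNA, can be illustrated with a simple check. The Python sketch below assumes that the tract boundaries in the donor are known exactly (0-based, half-open coordinates) and that one repeat copy begins at each boundary; the function name, the toy sequence, and the maximum repeat length are hypothetical and are not taken from the analyses in this study.

```python
def flanking_direct_repeat(donor, start, end, max_len=30):
    """Return the longest direct repeat (up to max_len bases) that begins
    at both `start` and `end` in the donor sequence.

    The transferred tract is donor[start:end]; recombination between the
    two repeat copies would excise a circle carrying that tract with a
    single copy of the repeat.
    """
    k = 0
    while (k < max_len
           and start + k < end
           and end + k < len(donor)
           and donor[start + k] == donor[end + k]):
        k += 1
    return donor[start:start + k]


# Toy donor sequence: the 6-bp repeat ACGTAC begins at positions 4 and 20.
donor = "TTTT" + "ACGTAC" + "GGGGCCCCAA" + "ACGTAC" + "TTGG"
repeat = flanking_direct_repeat(donor, 4, 20)
circle = donor[4:20]  # sequence expected in the excised circle
print(repeat, len(circle))
```

With real data the boundaries inferred from alignments are rarely exact, so a search within a small window around each boundary, allowing mismatches, would be a more realistic variant of this check.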
4.3 Multiple cases of ancestral, convergent, and recent gains of mitochondrial chromosomes via HGT, and rare chromosome losses
The 81 non-coding mitochondrial chromosomes of L. mirabile exhibit variable presence across individuals, which could result from recent acquisitions in specific individuals (or convergent acquisitions in several individuals independently), or from ancestral acquisitions followed by losses in certain lineages or individuals. Temporal analyses indicate that the origin of non-coding chromosomes results from recent, ancestral, and convergent HGT events, with only rare outright losses (Figure 4). These findings underscore the dynamic nature of the L. mirabile mitochondrial pangenome.
The individuals Lm_Calilegua1 and Lm_Calilegua3 harbour fully foreign, unique chromosomes that exhibit homologous and continuous segments with Anadenanthera mitochondrial DNA, with exceptionally high sequence identity. This pattern strongly suggests recent HGT events specific to these individuals from their current host, Anadenanthera.
Many non-coding chromosomes are shared among subsets of L. mirabile individuals as a result of ancestral acquisitions that predate their divergence. Only one non-coding chromosome appears to have been ancestrally acquired and vertically inherited across all L. mirabile individuals. Other chromosomes were likely acquired in the common ancestors of specific groups, such as the Calilegua and Santa Clara individuals, or exclusively within the Calilegua lineage. Remarkably, 70% of these ancestral foreign chromosomes show high sequence identity with species from the Parkia clade, suggesting that the ancestral mimosoid donor likely belonged to this lineage.
The dynamic nature of HGT in L. mirabile is further emphasized by cases of convergent acquisitions. Convergent HGT events are not only evident among L. mirabile individuals but also between L. mirabile and L. pyramidale. Temporal analyses of L. mirabile individuals indicate that five non-coding core chromosomes were independently acquired by Lm_Bolivia and the common ancestor of the other individuals, highlighting the repeated and opportunistic nature of HGT in this system.
In contrast to the multiple chromosome gains inferred, only three putative chromosome losses in Lm_Calilegua2 were detected (Figure 4); however, it is also possible that this individual never carried those chromosomes. Other potential chromosome losses could be inferred from the evolutionary history of the unique chromosomes found in Lm_SantaClara, which could be the result of ancestral gains followed by losses in the individuals from Calilegua.
4.4 The fate of the foreign non-coding chromosomes in L. mirabile: autonomous replication until they merge
Upon HGT, the foreign circular chromosomes are maintained separately from the other chromosomes, replicating autonomously. Many of these non-coding chromosomes were acquired thousands of years ago in the ancestor of the Calilegua and Santa Clara individuals, indicating that once the chromosomes are able to replicate, they are not frequently lost, even though they are likely evolving under genetic drift. It is possible that the high number of copies of each chromosome might delay their loss through random sorting. Rolling-circle replication has been reported for the circular mitochondrial chromosomes in the close relative Rhopalocnemis phalloides (Yu et al., 2022) and might also take place in L. mirabile. In addition, the recently acquired chromosomes do not recombine or integrate with other chromosomes immediately, likely because the very few repeats found in these chromosomes preclude homologous recombination, which is the most frequent recombination pathway in plant mitochondria (Chevigny et al., 2020; Gandini et al., 2023; Garcia et al., 2019; Gualberto & Newton, 2017). Eventually, they can recombine with other chromosomes by homologous recombination across homologous genes, leading to chimeric chromosomes as we observed in the core coding chromosomes (e.g. LmBolivia_contig33 carries a chimeric cox1 gene as a result of a recombination event specific to this lineage). This can lead to a situation in which a single chromosome hosts genes of different origins or a fully foreign chromosome carries a native gene. Fully foreign chromosomes may also undergo homologous recombination using homologous foreign tracts acquired in a previous HGT event. Alternatively, chromosomes may undergo rearrangements and merge with other chromosomes through less frequent non-homologous pathways.
Overall, the results suggest that an acquired foreign chromosome is more often lost (or unrecognized) through its recombination with other chromosomes than through random sorting of intact circular chromosomes.
4.5 Gene content and phylogenetic origin of mitochondrial genes in L. mirabile
The mitochondrial pangenome of L. mirabile reveals variability in both the presence and phylogenetic origin of genes, including native, foreign, and chimeric genes. However, regardless of their origin, all individuals retain at least one copy of the mitochondrial genes essential for splicing, protein synthesis, and respiration that are conserved across angiosperms (Adams et al., 2002; Richardson et al., 2013). This is consistent with the evidence indicating that L. mirabile mitochondria maintain a functional oxidative phosphorylation system (Gatica-Soria et al., 2024).
Many of the foreign or chimeric mitochondrial genes are ancestral and were maintained in the five individuals studied. The functionality analysis of protein-coding genes in Lm_Calilegua3 indicates that most of the genes acquired in the common ancestor of the five L. mirabile individuals or earlier (in the shared ancestor of L. pyramidale and L. mirabile) are functional, with the exception of duplicate copies of rps12 and atp8. In contrast, genes acquired after the divergence of the L. mirabile individuals are likely nonfunctional because they coexist with the functional copy, except for those that recombined with that copy to produce a chimeric functional gene, i.e. cox1 and nad9 in Lm_Bolivia.
5 CONCLUSIONS
Studying the mitochondrial DNA from five L. mirabile individuals revealed remarkable variability in the presence and absence of mitochondrial chromosomes. These chromosomes are acquired through circle-mediated HGT from host plants, resulting in a mitochondrial pangenome enriched with foreign genetic material from mimosoid donors. Our findings emphasize the importance of understanding intraspecific variation in multichromosomal mitochondrial genomes and adopting a pangenomic approach to fully capture the genetic diversity in angiosperms, which generally show highly dynamic mitochondrial DNA. This study shows that HGT can strongly influence the mtDNA content and generate enormous intraspecific variability even in geographically close individuals.
AUTHOR CONTRIBUTIONS
LG-S and MVS-P conceived and designed the research.
LG-S, MER, and WDT performed data analysis.
LG-S and MER extracted DNA from L. mirabile and its hosts.
HS identified and collected L. mirabile and host species.
MEB identified, collected, and extracted DNA from mimosoid plants.
LG-S, MER, and MVS-P drafted and revised the manuscript.
ACKNOWLEDGEMENTS
We thank C. L. Gandini and L.F. Ceriotti for their help with the computational analyses. This work used the SARTOI Cluster from IBAM (CONICET-UNCuyo).
FUNDING INFORMATION
This work was supported by grants from Fondo para la Investigación Científica y Tecnológica (grant numbers PICT2020-01018 and PICT2021-GRT_TI-00435) and Universidad Nacional de Cuyo (grant number 06/A092-T1).
Open Research
DATA AVAILABILITY STATEMENT
The sequence data and the mitochondrial sequences generated or analyzed in this study are available in GenBank (PV247579–PV247675, PV200216–PV200275, PV200790–PV200855, PV200785–PV200789), the Sequence Read Archive (SRA: SRR31569850, SRR31569849, SRR31569847, SRR31707307), and Figshare (https://doi.org/10.6084/m9.figshare.28458254).