Metabolome and transcriptome reveal the biosynthesis of flavonoids and amino acids in Isatis indigotica fruit during development
Abstract
Isatis indigotica Fort. is a famous medicinal plant that is also used as a natural dye and functional vegetable. The characteristics of I. indigotica fruit during development remain largely unknown, although this information is essential for the exploitation and seedling cultivation of the species. In this study, the biochemical and metabolite characteristics and the gene expression profiles of I. indigotica fruit at four developmental stages were investigated. A total of 428 metabolites were detected and grouped into 17 categories. High contents of anthocyanins, especially cyanidin 3-glucoside, might contribute to the purple colouration of I. indigotica fruits. Moreover, dozens of flavonoid components, including taxifolin, quercetin, astragalin and isovitexin 2″-O-beta-D-glucoside, and several other active components were also up-regulated in mature fruits. The abundance of antioxidants might endow mature I. indigotica fruits with significantly stronger antioxidant activity than many other reported species. Enrichment analyses revealed that flavonoid and anthocyanin biosynthesis genes were mostly enriched in up-regulated gene sets during fruit development. The up-regulated structural genes, including IiCHS, IiCHI, IiF3H, IiDFR, IiANS, IiFLS and IiUGT, and transcription factors such as IiMYBs, IibHLHs and IiNACs were identified as candidate regulators of the flavonoid and anthocyanin biosynthetic pathways. Furthermore, biosynthesis of amino acids was enriched in all pairwise comparisons of metabolites in fruits at the four developmental stages. The differential accumulation of amino acids might result from the differentially expressed genes involved in amino acid biosynthesis. Taken together, these findings provide a comprehensive understanding of metabolite profiles and gene expression patterns in I. indigotica fruit during maturation, which is useful for pharmaceutical extraction and seedling cultivation of I. indigotica.
1 INTRODUCTION
Isatis indigotica Fort. is a biennial herb of the Brassicaceae family and is widely cultivated throughout China, India and Southeast Asia (Liang et al., 2016). This medicinal plant has significant applications in traditional medicine and the pharmaceutical industry.
Both the root (Banlangen) and leaf (Daqingye) of I. indigotica have been included in the Chinese Pharmacopoeia since 1985. They are often used to alleviate symptoms such as fever, headache, and sore throat. Notably, Banlangen has been employed to treat mild symptoms of coronavirus disease 2019 (COVID-19) (Yin and Chang, 2021). Additionally, Qingdai (Indigo naturalis), derived from the leaves and stems via fermentation, is used to treat wounds, acne, erysipelas and carbuncles (Chen, 2013).
I. indigotica contains a wide array of bioactive compounds, including alkaloids, organic acids, flavonoids, polysaccharides, lignans, nucleosides, amino acids, and steroids. Nie et al. (2022) identified 392 compounds in 17 categories in Banlangen. Among these, Banlangen polysaccharide (IRPS) is noted for its anti-tumour effect and immune-enhancing properties (Wang et al., 2020a). D-mannose is known to interfere with glucose metabolism, reduce fat deposition, regulate intestinal flora, and play a role in immune regulation (Yang, 2015; Xu et al., 2021). In recent years, studies have also shown that indigo and indirubin, both of which are present in I. indigotica and have been used in the textile industry for hundreds of years (Liu et al., 2014), have pharmacological activities such as liver protection and anti-microbial activity, with indirubin also demonstrating anti-tumour effects (Li et al., 2016a). Owing to its rich bioactive components and health-promoting properties, I. indigotica is also consumed as a functional vegetable with antiviral benefits. Despite the progress in identifying the chemical constituents of I. indigotica leaves and roots and their pharmacological activities, the biochemical and genetic profile of its fruit during development has remained largely unexplored.
Fruits, which develop from the gynoecium, are essential for protecting and nourishing seeds (Müntz et al., 1978). In medicinal plants, fruits often accumulate numerous bioactive components that provide significant therapeutic benefits (Wang et al., 2018; Zhao et al., 2020; Ma et al., 2020). For instance, wolfberry (Lycium barbarum) fruits are rich in secondary metabolites like anthocyanins, flavonoids, and betalains, which contribute to their medicinal properties (Wang et al., 2018). The biosynthesis of flavonoids and phenylpropanoids in wolfberry is especially prominent in the later stages of development (Zhao et al., 2020). Similarly, sea buckthorn (Hippophae rhamnoides L.) fruits are valued for their high flavonoid content and pharmacological benefits (Ma et al., 2020; Tkacz et al., 2020). Despite the well-documented importance of fruit bioactivities in other medicinal plants, the metabolomic profile of I. indigotica fruit, as well as the gene expression patterns regulating the accumulation of these metabolites, remains largely unknown.
Metabolomics has become a widely used tool for biomarker discovery and for identifying metabolites that can influence the phenotype of a cell or an organism (Rinschen et al., 2019). For I. indigotica, metabolomic analysis has provided valuable insights. For example, Li et al. (2017) found that high concentrations of CO2 reduced flavonoid biosynthesis, sucrose metabolism and pyrimidine metabolism. Similarly, nitrogen deficiency has been shown to modify both the primary and secondary metabolic profiles in the roots and leaves of I. indigotica (Cao et al., 2019). Additionally, Liu et al. (2022) indicated that treatment with methyl jasmonate increased flavonoid, lignin and soluble protein contents while reducing soluble sugar levels in I. indigotica. These findings underscore the usefulness of metabolomics in screening and identifying active compounds and offer guidance for determining the optimal harvesting period for I. indigotica.
This study revealed apparent phenotypic differences of I. indigotica fruits at different developmental stages. The observed increase in total anthocyanin and flavonoids likely explains the gradual deepening in fruit colour and the increased activity of 2,2-diphenyl-1-picrylhydrazyl (DPPH) and 2,2′-azinobis-(3-ethylbenzthiazoline-6-sulphonate) (ABTS).
Furthermore, the expression patterns of differentially expressed genes involved in flavonoids and amino acid biosynthesis were investigated during the development of I. indigotica fruits, providing insights into the metabolic dynamics during fruit development. This study offers a preliminary understanding of the complex regulatory networks governing metabolite accumulation, which will expand the utilization of I. indigotica.
2 MATERIALS AND METHODS
2.1 Plant material
Isatis indigotica plants were planted in plastic pots measuring 15 cm in diameter and 25 cm in height. Each pot containing two seedlings was filled with a substrate consisting of a 2:1 (v/v) mixture of nutrition soil and vermiculite. The plants were grown in a greenhouse at Huaihua University (27°33′39″N, 109°59′08″E), Hunan Province, China, under natural growth conditions from April 2021 to May 2022. The seedlings were fertilized every two weeks with fertilizer nutrient solution (N-P2O5-K2O), and watered as needed. The fruits were collected at four development stages, namely: 30 days after flowering (DAF) (Iin1), 37 DAF (Iin2), 44 DAF (Iin3) and 51 DAF (Iin4) (Figure 1). For each biological replicate, the fruits were collected from at least four individual plants from the fifth to the fifteenth fruits of the spike. Thereafter, these samples were mixed together, frozen in liquid nitrogen, and stored at −80°C for further analysis.

2.2 DPPH radical scavenging assay
The ability of the plant extracts to scavenge DPPH free radicals was determined using specific detection kits (A153-1-1, Nanjing Jiancheng Bioengineering Institute) according to the manufacturer's instructions. A total of 40 μL sample extract, freshly made in methanol, was mixed with 60 μL DPPH reaction solution. After 30 min of incubation in the dark at room temperature, the absorbance was measured at 517 nm on an ELISA assay system (Infinite M200 pro). Methanol was used as a blank. A standard curve was prepared with different concentrations of Trolox (20 ~ 140 μM), and the results were calculated using the following equation:
DPPH value (μmol TE g−1 FW) = c × V × t/m, where c is the Trolox concentration (μmol mL−1) obtained from the standard curve for the diluted sample, V is the sample volume (mL), t is the dilution factor, and m is the fresh weight of the sample (g).
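As a worked illustration of the equation above, the following minimal Python sketch evaluates c × V × t/m. The function name `te_value` and all numeric inputs are hypothetical and are not taken from the kit protocol or from the data of this study.

```python
def te_value(c_umol_per_ml, volume_ml, dilution_factor, fresh_weight_g):
    """Trolox equivalents (umol TE per g FW) computed as c * V * t / m.

    c_umol_per_ml   : Trolox concentration read from the standard curve
    volume_ml       : sample (extract) volume
    dilution_factor : dilution applied before the measurement
    fresh_weight_g  : fresh weight of the extracted sample
    """
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    return c_umol_per_ml * volume_ml * dilution_factor / fresh_weight_g


# Hypothetical inputs chosen only to show the arithmetic.
example = te_value(c_umol_per_ml=0.08, volume_ml=1.0,
                   dilution_factor=5.0, fresh_weight_g=0.1)
print(f"{example:.2f} umol TE per g FW")
```

The same function applies to the ABTS assay, since both assays are expressed as Trolox equivalents per gram of fresh weight.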
2.3 ABTS radical cation scavenging activity
The ABTS assay was performed using a specific detection kit (A015-2-1, Nanjing Jiancheng Bioengineering Institute) according to the manufacturer's instructions. The [2,2′-azino-bis (3-ethylbenzthiazoline-6-sulfonic acid), ABTS] radical cation was made in 75 mM potassium phosphate buffer saline solution (PBS) (pH 7.4). The working solution of ABTS radical cation (potassium persulphate, 140 mM, 0.088 mL) and ABTS (7 mM, 10 mL) in PBS was prepared at the time of analysis, and the stock solution was adjusted by PBS to reach an absorbance value of 0.70 ± 0.20 (734 nm). Briefly, 5 μL of each sample and 195 μL ABTS radical cation solution were added to a tube and incubated for 6 min. Then, the absorbance was read at 734 nm. The control assay was made with distilled water in place of the samples. A standard curve was prepared using Trolox at different concentrations (0.15 ~ 1.5 mM), and the results were expressed as μmol of Trolox equivalents per g of sample (μmol TE g−1 FW) according to the calculation methods of DPPH.
2.4 Measurements of anthocyanin, chlorophylls and carotenoid contents
The content of total anthocyanins in fruit was determined as previously described (Zeng et al., 2014). Briefly, powdered fruit samples (0.1 g) were extracted in 1 mL methanol (2% formic acid) for 24 h, at 4°C in darkness. Subsequently, samples were centrifuged at 13,400 g for 20 min at 4°C. The absorbance was measured at 534 nm.
The contents of chlorophyll a, chlorophyll b, and carotenoids in fruit were determined as described by Wintermans and De Mots (1965). Briefly, 0.2 g fresh samples were added to 80% ethanol, shaken, and placed in darkness for 24 h. The chlorophyll a, chlorophyll b, and carotenoid contents were measured at 663, 645 and 470 nm and then were calculated according to Wintermans and De Mots (1965).
2.5 Metabolomics analysis
For metabolite extraction, 0.1 g of sample was added to 600 μL of methanol (stored at −20°C) containing 2-amino-3-(2-chloro-phenyl)-propionic acid (4 ppm) and vortexed for 30 s, followed by tissue grinding for 90 s at 60 Hz. The sample was then sonicated for 15 min at 40 kHz in an ice-water bath. The mixture was centrifuged at 13,400 g and 4°C for 10 min. The supernatant was filtered through a 0.22 μm membrane and transferred into a detection vial for ultra-performance liquid chromatography–tandem mass spectrometry (UPLC–MS/MS) analysis.
Metabolomics detection was carried out on an ACQUITY UPLC System (Waters Corp.) equipped with an ACQUITY UPLC HSS T3 column (150 × 2.1 mm, 1.8 μm) (Waters Corp.) to separate the metabolic extracts. The column was maintained at 40°C. The flow rate and injection volume were set at 0.25 mL min−1 and 2 μL, respectively. For LC-ESI (+)-MS analysis, the mobile phases consisted of (C) 0.1% formic acid in acetonitrile (v/v) and (D) 0.1% formic acid in water (v/v). Separation was conducted under the following gradient: 2% C held for 1 min, 2 to 50% C over 8 min, 50 to 98% C over 3 min, 98% C held for 1.5 min, 98 to 2% C over 0.5 min, and finally 2% C held for 6 min. For LC-ESI (−)-MS analysis, the mobile phases consisted of (A) acetonitrile and (B) ammonium formate (5 mM). Separation was conducted under the following gradient: 2% A held for 1 min, 2 to 50% A over 8 min, 50 to 98% A over 3 min, 98% A held for 1.5 min, 98 to 2% A over 0.5 min, and finally 2% A held for 3 min, using a flow rate of 0.40 mL min−1. The MS operating conditions were as follows: spray voltage, 3.50 kV and −2.50 kV for ESI (+) and ESI (−), respectively; capillary temperature, 325°C. ESI (+) and ESI (−) ion scan modes with a range (m/z) of 100–1000 Da were used to acquire signals. The declustering potential was 80 V; spray gas, 50 psi; auxiliary heating gas, 50 psi; curtain gas, 30 psi; source heating temperature, 500°C; normalized collision energy, 30 eV.
The raw mass spectrometry files generated by LC–MS were converted to mzXML format using the MSConvert utility in the ProteoWizard package (v3.0.8789) (Smith et al., 2006). The XCMS package (Navarro-Reig et al., 2015) was used for data processing, including baseline filtering, peak area extraction and alignment, retention time correction, noise removal, and deconvolution. The mass spectrometry peak area data were normalized using Pareto scaling. In addition, variables with relative standard deviations (RSD) > 30% in the quality control (QC) samples were excluded, and log10 transformation was performed to obtain the final data matrix for subsequent analysis (Dunn et al., 2011). The MS/MS spectra were then matched against the authentic standards available in the Human Metabolome Database (HMDB) (https://www.hmdb.ca/), MassBank (Horai et al., 2010), Kyoto Encyclopedia of Genes and Genomes (KEGG) (Ogata et al., 1999) and Metlin (http://metlin.scripps.edu/) databases for metabolite identification.
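The filtering and scaling steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline used in the study: the function names and the toy peak areas are hypothetical, and the order shown (QC filter, then log10, then Pareto scaling) is one plausible reading of the text. Pareto scaling is taken here in its usual form, mean-centring followed by division by the square root of the standard deviation.

```python
import math
import statistics


def rsd_percent(values):
    """Relative standard deviation (%) = 100 * sample SD / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def filter_by_qc_rsd(qc_areas, max_rsd=30.0):
    """Keep features whose RSD across QC injections does not exceed max_rsd."""
    return [name for name, areas in qc_areas.items()
            if rsd_percent(areas) <= max_rsd]


def log10_transform(values):
    """log10 of each (strictly positive) peak area."""
    return [math.log10(v) for v in values]


def pareto_scale(values):
    """Mean-centre, then divide by sqrt(sample SD); fails if SD is zero."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / math.sqrt(sd) for v in values]


# Hypothetical QC peak areas for two features.
qc = {"feat_A": [100.0, 105.0, 95.0],   # RSD = 5 %, kept
      "feat_B": [100.0, 200.0, 30.0]}   # RSD well above 30 %, dropped
kept = filter_by_qc_rsd(qc)
scaled = pareto_scale(log10_transform([10.0, 100.0, 1000.0]))
print(kept, scaled)
```

A feature with identical values in every sample has a standard deviation of zero and would need to be removed before scaling.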
2.6 Transcriptomic analysis
Total RNA was extracted from samples using the RNAprep pure Plant Kit (DP441, TIANGEN), and RNA integrity was assessed as previously described (Huang et al., 2021). The purified RNA from each sample was used for library construction with the Illumina Library Prep Kit (Illumina), and library quality was assessed using the Bioanalyzer 2100 system (Agilent Technologies). The libraries were then sequenced on the Illumina HiSeq X Ten platform. Raw reads were processed with Trimmomatic to remove adaptors and low-quality and ambiguous bases, yielding clean reads (Bolger et al., 2014). Next, the clean reads were mapped to the Isatis indigotica reference genome (NCBI accession number VHIU00000000) (Kang et al., 2020) using HISAT2 (Kim et al., 2015). Fragments per kilobase of transcript per million mapped reads (FPKM) values were calculated for gene-level quantification using StringTie (v1.3.3b) (Pertea et al., 2015). Differentially expressed genes (DEGs) were identified using DESeq2 (v1.24.0) with a false discovery rate (FDR) < 0.05 and |log2 fold-change| ≥ 1. GO and KEGG enrichment analyses were performed on the annotated DEGs using the topGO and clusterProfiler packages, respectively (Kanehisa et al., 2008). The RNA-Seq data are available in the National Center for Biotechnology Information (NCBI) under accession number PRJNA953646.
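To make the quantification and thresholding rules concrete, the short Python sketch below computes an FPKM value from its standard definition and applies the DEG criteria stated above (FDR < 0.05 and |log2 fold-change| ≥ 1). It does not reproduce StringTie or DESeq2; the function names and the example numbers are hypothetical.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def classify_gene(fold_change, fdr, fdr_cutoff=0.05, log2fc_cutoff=1.0):
    """Label a gene 'up', 'down' or 'not significant' using the DEG criteria."""
    log2fc = math.log2(fold_change)
    if fdr >= fdr_cutoff or abs(log2fc) < log2fc_cutoff:
        return "not significant"
    return "up" if log2fc > 0 else "down"


# Hypothetical values: 500 fragments on a 2 kb gene, 25 million mapped fragments.
print(fpkm(500, 2000, 25_000_000))
print(classify_gene(fold_change=2.0, fdr=0.01))   # passes both thresholds
print(classify_gene(fold_change=1.5, fdr=0.01))   # fold change too small
```

Note that the threshold is applied to the absolute value of the log2 fold change, so a halving of expression (fold change 0.5) qualifies in the same way as a doubling.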
2.7 Quantitative real-time polymerase chain reaction (qRT-PCR)
The qualified RNA was used as the template to synthesize first-strand cDNA using the FastKing RT kit (Tiangen). Specific primers were designed using Primer 3 and are listed in Table S1. PCR was performed according to the protocol described previously (Huang et al., 2021). The IiEF-1-delta gene was used as the internal reference gene. qRT-PCR was performed using the SYBR Green PCR kit (Tiangen) on the Bio-Rad CFX96 Touch q-PCR System (Bio-Rad). The 2^−ΔΔCT method was used to analyze relative transcript abundances. All selected genes were analyzed in triplicate to verify the accuracy of the transcriptome data.
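For readers unfamiliar with the 2^−ΔΔCT calculation, a minimal Python sketch follows. It assumes roughly 100% amplification efficiency for both the target and the reference gene, which is the standard assumption of the method. The function name, the choice of calibrator sample and the CT values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative transcript abundance by the 2^(-ddCT) method.

    dCT  = CT(target) - CT(reference gene), computed per sample
    ddCT = dCT(sample) - dCT(calibrator)
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical CT values: the target crosses threshold 3 cycles earlier in the
# sample than in the calibrator, with an unchanged reference gene.
print(relative_expression(22.0, 18.0, 25.0, 18.0))  # ddCT = -3
```

In practice the mean CT of the technical replicates would be used for each input.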
2.8 Integrated transcriptomic and metabolomic analysis
According to the results of the metabolomic and transcriptomic analyses, differential flavonoid metabolites, DEGs involved in the flavonoid biosynthesis pathway, and DEGs encoding transcription factors (TFs) were selected for the integrative analysis. Correlation coefficients between metabolites and transcripts were calculated using Pearson's correlation analysis (p < 0.05). Correlations with R2 > 0.9 were selected to construct a network, which was visualized using the Cytoscape software (v3.5.1).
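The edge-selection step described above can be illustrated with a small Python sketch. It computes Pearson's r for every gene-metabolite pair and keeps pairs with r² above 0.9. The significance test (p < 0.05) is omitted for brevity, and the function names, identifiers and abundance values are hypothetical rather than taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def network_edges(genes, metabolites, r2_cutoff=0.9):
    """Return (gene, metabolite, r) for every pair with r**2 above the cutoff."""
    edges = []
    for gene, expression in genes.items():
        for metabolite, abundance in metabolites.items():
            r = pearson_r(expression, abundance)
            if r * r > r2_cutoff:
                edges.append((gene, metabolite, round(r, 3)))
    return edges


# Hypothetical profiles across four stages.
genes = {"gene_up": [1.0, 2.0, 3.0, 4.0], "gene_flat": [2.0, 1.0, 2.0, 1.0]}
metabolites = {"met_X": [2.0, 4.0, 6.0, 8.5]}
edges = network_edges(genes, metabolites)
print(edges)
```

Because the cutoff is applied to r², strongly negative correlations are retained as edges alongside strongly positive ones; the sign of r distinguishes them in the resulting network.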
2.9 Statistical analysis
Data are presented as means ± standard deviation (n = 3). Student's t-test was performed using SPSS 22.0 software to determine the levels of significance (p < 0.05). Asterisks represent statistical differences between samples (*p < 0.05, **p < 0.01).
3 RESULTS
3.1 Changes in phenotypes, and anthocyanin, chlorophyll and carotenoid contents
I. indigotica fruits were collected at four developmental stages (30, 37, 44 and 51 DAF) to characterize the phenotypic and physiological changes. As the fruit matured, the colour gradually shifted from green (30 DAF) to dark purple (51 DAF), and the total anthocyanin content increased significantly, rising more than 500-fold from 0.09 ± 0.04 OD530-OD600 g−1 FW at 30 DAF to 45.54 ± 4.40 OD530-OD600 g−1 FW at 51 DAF (Figure 1B). In contrast, the contents of chlorophyll a and chlorophyll b decreased by 41 and 36%, respectively, by Iin4. Carotenoid levels initially rose, peaking at Iin3, and then decreased (Figure 1B).
3.2 DPPH and ABTS antioxidant assay
Considering the strong antioxidant properties of anthocyanins, we evaluated the antioxidant capacity of I. indigotica fruit extracts using the DPPH and ABTS radical scavenging assays. The extracts showed significant antioxidant capacity, especially at Iin4 (Figure 1B). The DPPH radical scavenging capacity increased significantly during fruit maturation, from 2.0316 ± 0.25 mmol to 3.9874 ± 0.14 mmol TE g−1 FW. Similarly, the ABTS radical scavenging activity increased with fruit development, increasing 6.81-fold in Iin3 and 7.86-fold in Iin4 compared with Iin1.
3.3 Metabolomic profiling of I. indigotica fruits at four growth stages
To obtain an overview of the metabolic changes during fruit maturation, we performed an untargeted metabolomic analysis on fruits collected at four developmental stages. Principal component analysis (PCA) showed a distinct separation between the developmental stages, with high consistency across the biological replicates, indicating that the metabolome data had good repeatability and reliability (Figure S1). A total of 428 metabolites were identified and grouped into 17 categories, including amino acids and peptides (16.36%), carbohydrates and their conjugates (14.25%), and fatty acyls (12.15%), among others (Figure 2A, Table S2). A comparison of the relative metabolite contents (peak area) of the 17 categories showed that amino acids and peptides were the most abundant metabolites, followed by carbohydrates and carbohydrate conjugates and by benzene and substituted derivatives (Figure 2B).

We performed pairwise comparisons of the four groups to identify differentially accumulated metabolites (DAMs). Using a significance cutoff of p ≤ 0.05 and variable importance in projection (VIP) > 1, we identified 44 DAMs (9 up- and 35 down-regulated) between Iin1 and Iin2. Between Iin2 and Iin3, 101 DAMs were detected (49 up- and 52 down-regulated), and 71 DAMs (34 up- and 37 down-regulated) were identified between Iin3 and Iin4 (Figure 2C). Notably, the largest number of changes occurred between Iin2 and Iin3, marking this as a key period for metabolite accumulation in I. indigotica fruit.
Between Iin1 and Iin2, 18 pathways were enriched (p < 0.05), including amino acid biosynthesis, arginine biosynthesis, and 2-oxocarboxylic acid metabolism. In the Iin2 vs. Iin3 comparison, 26 pathways were enriched, such as ABC transporters, sphingolipid signaling, and phenylpropanoid biosynthesis. Finally, the transition from Iin3 to Iin4 showed enrichment in pathways like amino acid biosynthesis, galactose metabolism, and flavonoid biosynthesis, highlighting the importance of these metabolic processes during fruit maturation (Figure S2).
3.4 Comparative analysis of DEGs among fruits from four growth stages
To investigate the gene expression patterns of I. indigotica fruits, we conducted a comparative transcriptomic analysis of fruits from four developmental stages. PCA showed a clear clustering of samples within each developmental stage, with a distinct separation between clusters (Figure S3A). A total of 5,910 DEGs were identified. In the Iin1 vs. Iin2 comparison, 2,746 DEGs were found, with a nearly even split between up-regulated (1,372) and down-regulated (1,374) genes. The largest number of DEGs (3,872; 1,612 up-regulated and 2,260 down-regulated) was found in the Iin2 vs. Iin3 comparison, while the smallest number was found in Iin3 vs. Iin4 (612 DEGs; 307 up-regulated and 305 down-regulated) (Figure S3B). In total, 1,700, 2,704 and 256 DEGs were unique to Iin1 vs. Iin2, Iin2 vs. Iin3 and Iin3 vs. Iin4, respectively (Figure S3C). KEGG enrichment analysis highlighted several key pathways involved in fruit development. Between Iin1 and Iin2, over-represented pathways included starch and sucrose metabolism, base excision repair, and biosynthesis of secondary metabolites (Figure S3D). The Iin2 vs. Iin3 comparison revealed enrichment in pathways related to photosynthesis, flavonoid biosynthesis, and phenylalanine metabolism. For Iin3 vs. Iin4, pathways such as secondary metabolite biosynthesis and glucosinolate biosynthesis were prominent. These findings underscore the dynamic metabolic and developmental processes occurring during I. indigotica fruit maturation and suggest that the most pronounced transcriptional changes occur between the early and mid-stages of fruit development.
3.5 Clustering analysis of all DEGs
To investigate the expression patterns of all DEGs during fruit development, clustering analysis was performed using the K-means clustering algorithm (Figure 3). The analysis revealed four distinct clusters, with clusters 1 and 2 showing down-regulation and clusters 3 and 4 showing up-regulation. DEGs in cluster 1 showed a general decreasing trend, with pathways related to the pentose phosphate pathway, phenylpropanoid biosynthesis, and carbon metabolism being over-represented (Figure 3). In contrast, genes in cluster 4 displayed increasing expression levels, particularly those involved in flavonoid and anthocyanin biosynthesis (Figure 3). These genes likely play a crucial role in metabolite accumulation during fruit maturation.
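The idea behind K-means clustering of expression profiles can be sketched as follows. This is a simplified illustration and not the analysis performed in the study: it uses two clusters instead of four, fixed starting centroids instead of random initialization, and z-score standardization of each profile (an assumption; the standardization used by the authors is not stated). All gene names and values are hypothetical.

```python
import statistics


def zscore(profile):
    """Standardize one expression profile (fails if the profile is constant)."""
    mean = statistics.mean(profile)
    sd = statistics.pstdev(profile)
    return [(v - mean) / sd for v in profile]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd iteration: assign to nearest centroid, then recompute means."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(point, centroids[k]))
            for point in points
        ]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:               # keep the old centroid if a cluster empties
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical expression values at Iin1..Iin4 for four genes.
profiles = {
    "gene_a": [10.0, 8.0, 4.0, 2.0],    # decreasing
    "gene_b": [20.0, 15.0, 9.0, 5.0],   # decreasing
    "gene_c": [1.0, 2.0, 6.0, 9.0],     # increasing
    "gene_d": [3.0, 4.0, 10.0, 16.0],   # increasing
}
points = [zscore(p) for p in profiles.values()]
labels, centroids = kmeans(points, initial_centroids=[points[0], points[2]])
print(dict(zip(profiles, labels)))
```

Standardizing each profile first means genes are grouped by the shape of their trend across stages rather than by absolute expression level.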

3.6 DEGs related to flavonoid biosynthetic processes
The KEGG enrichment analysis between Iin2 and Iin3, a critical stage in fruit development, revealed that many genes were enriched in the flavonoid biosynthesis pathway (Figures 3 and S3). This suggests that flavonoid biosynthesis plays an important role in the morphological changes and metabolite accumulation of I. indigotica fruits. A gene-metabolite regulatory network was constructed to identify the key candidate genes involved (Figure 4). A total of 20 flavonoid metabolites were identified, including epicatechin, catechin, isoquercitrin and cyanidin 3-glucoside (Figures 4 and 5A and Table S2). Among these, epicatechin, catechin, and glycitein were most abundant in Iin1. Isoquercitrin, isorhamnetin, luteolin, naringenin, procyanidin B2, fisetin and eriodictyol were highly accumulated in Iin2. In Iin3, taxifolin, cyanidin 3-glucoside, and luteolin 7-glucoside showed significant accumulation. The contents of taxifolin, cyanidin 3-glucoside, kaempferide, quercetin, astragalin, isovitexin 2″-O-beta-D-glucoside, 1-O-galloyl-beta-D-glucose, (−)-epigallocatechin and quercetin 3-O-beta-D-glucosyl-(1→2)-beta-D-glucoside significantly increased, and the levels of isoquercitrin, naringenin, epicatechin and catechin significantly decreased, in Iin4 compared with Iin1.


The high levels of flavonoids in Iin3 and Iin4 are consistent with the elevated expression of several structural genes involved in flavonoid biosynthesis, including 4-coumarate:coenzyme A ligase (Ii4CL) (EVM0026468), phenylalanine ammonia-lyase (IiPAL) (EVM0015021 and EVM0019502), cinnamate 4-hydroxylase (IiC4H) (EVM0016195 and EVM0003767), chalcone isomerase (IiCHI) (EVM0031582 and EVM0002220), chalcone synthase (IiCHS) (EVM0029225), flavonol synthase (IiFLS) (EVM0018542), anthocyanidin synthase (IiANS) (novel.1702), dihydroflavonol 4-reductase (IiDFR) (EVM0012564), flavanone 3-hydroxylase (IiF3H) (EVM0008651) and four UDP-glycosyltransferases (IiUGT) (EVM0010027, EVM0004999, EVM0000307 and EVM0021425). In contrast, only a few genes related to flavonoid biosynthesis exhibited elevated expression levels in Iin1 and Iin2, namely two Ii4CLs (novel.1993 and EVM0010151) and two IiCHIs (EVM0019437 and EVM0022819).
It is well known that structural genes and transcription factors (TFs) work synergistically to modulate flavonoid biosynthesis. Our analysis identified a total of 329 TFs in the DEG set, including 52 IiMYBs and MYB-related TFs, 32 IibHLHs, 30 IiNACs, 20 IiB3s, 15 IiHD-ZIPs, 14 IiMADSs, 14 IibZIPs, 13 IiWRKYs and 139 other genes encoding TFs. To further identify key candidate genes and TFs associated with flavonoid biosynthesis, we investigated the correlations between structural genes, TFs and flavonoid metabolites (Figure 5). According to the correlation analysis between structural genes and flavonoids (Figure 5B), all identified flavonoid glycosides were positively correlated with most structural genes, apart from two Ii4CLs (EVM0010151 and novel.1993) and two IiCHIs (EVM0022819 and EVM0019437). Notably, taxifolin showed a significant positive correlation with Ii4CL (EVM0026468), IiCHS (EVM0029225), IiF3H (EVM0008651) and IiDFR (EVM0012564) (correlation coefficient, CC = 1.0), whereas it was negatively correlated with IiCHI (EVM0019437) (CC = −1.0). Cyanidin 3-glucoside exhibited a close relationship with IiUGT (EVM0021425), with a CC of 0.86. Further, Pearson's correlation analysis showed that 55 TF-encoding genes had strong correlation coefficient values (R2 ≥ 0.9, p ≤ 0.01) (Figure 5C). These TF-encoding genes included six IiZF-HDs, three IiMYBs, eight IiNACs, seven IibHLHs, four IiERFs, three IiMIKC_MADSs and 24 others. We identified 21 TFs that had a close relationship with (−)-epigallocatechin, including three IibHLHs (EVM0004975, EVM0012918 and EVM0019907), one IiMYB (EVM0023366) and three IiNACs (EVM0022095, EVM0023808 and EVM0028280). With CC ≥ 0.9, 21 TFs had a close relationship with taxifolin, including one IiMYB (EVM0018813), four IibHLHs (EVM0002821, EVM0009452, EVM0029097 and novel.952), three IiNACs (EVM0011715, EVM0011969 and EVM0023808) and two IiZF-HDs (EVM0018524 and EVM0019703).
3.7 DEGs involved in amino acids metabolism
A diverse array of amino acids was detected in I. indigotica fruits, and their levels changed throughout fruit development (Figure 6). The contents of histidine (His), phenylalanine (Phe), tryptophan (Trp) and citrulline (Cit) significantly increased in Iin3 and Iin4, especially those of Phe and His. In contrast, the levels of arginine (Arg), leucine (Leu), isoleucine (Ile), aspartate (Asp), glutamate (Glu) and asparagine (Asn) decreased during the development of I. indigotica fruits. The levels of glutamine (Gln), serine (Ser), threonine (Thr), proline (Pro), tyrosine (Tyr), methionine (Met) and lysine (Lys) initially increased and then decreased in Iin3 and Iin4. Notably, amino acids including Gln, Ser, Pro, Asp, Tyr, Asn, Met and Lys were found to be abundant in Iin2.

To elucidate the molecular basis of amino acid accumulation at the four developmental stages of I. indigotica fruits, we systematically identified genes encoding biosynthetic enzymes as well as those involved in the initial steps of amino acid catabolism. A total of 31 DEGs encoding 17 enzymes were identified, and their expression levels are presented in Figure 6. The high expression levels of shikimate dehydrogenases (IiMDHs) (novel.937 and EVM0027691), aspartate kinases (IiAKs) (EVM0012157 and EVM0014845) and asparagine synthase (IiAspS) (EVM0009283) correlated with the increases in Asp, Asn, Lys, Met and Thr in Iin1 and Iin2. The high accumulation of Phe may be due to the elevated expression levels of chorismate mutases (IiCMs) (EVM0007160 and EVM0020344) in Iin3 and Iin4. Similarly, the higher Pro content was accompanied by a higher expression level of glutamate-5-semialdehyde dehydrogenase (IiGSADH) in Iin2.
3.8 Verification of the DEGs
We used qRT-PCR to assess the transcript abundance of 15 significant DEGs in I. indigotica fruits at the four growth stages to confirm the expression of the putative DEGs linked to metabolite accumulation (Figure 7). The RNA-seq expression results for these selected DEGs were generally consistent with their qRT-PCR expression patterns. Thus, the data support the validity of the DEGs predicted to participate in flavonoid biosynthesis in I. indigotica fruits across the four growth stages.

4 DISCUSSION
In recent years, the growing interest in functional foods has largely been driven by a shift in consumer preference toward natural ingredients and products with health-enhancing properties (Nikmaram et al., 2017). I. indigotica is highly valued in traditional medicine and the pharmaceutical industry, and as a functional food due to its antiviral properties. Consequently, the investigation of the underused fruits of I. indigotica presents an opportunity to explore new sources of natural functional ingredients. Additionally, the study is important for optimizing the cultivation of I. indigotica, which has direct implications for its use in both the herbal medicine industry and agriculture. The findings could also support efforts to enhance public health and promote economic development.
4.1 Purple colouration and antioxidation activity of I. indigotica fruit associated with flavonoid biosynthesis and accumulation
Flavonoids display diverse biological functions and have potential medicinal value due to their strong antioxidant and free-radical scavenging activities (Shen et al., 2022). As an important subgroup of flavonoids, anthocyanins are natural dietary antioxidants that protect against the harmful effects of oxidative stress (Ullah et al., 2019; Fallah et al., 2020). In this study, the increase in total anthocyanin content during fruit development was accompanied by a parallel increase in antioxidant activity, with mature fruits (Iin4) having the highest total anthocyanin content and antioxidant activity. Previous studies have indicated that the anthocyanidins cyanidin and peonidin are associated with bluish-red, magenta and crimson colours (Zhang et al., 2022; Liu et al., 2024). Consistent with these studies, we identified cyanidin 3-glucoside as the most significantly up-regulated compound in mature I. indigotica fruits. This suggests that cyanidin 3-glucoside plays a pivotal role in the accumulation of the purple pigment in I. indigotica fruit. Similarly, cyanidin 3-glucoside is also prominent in other red-to-blue fruits, such as raspberries, strawberries, Chinese bayberry fruit, and mulberry, contributing to their characteristic red hues (Sun et al., 2012).
In addition to cyanidin-3-O-glucoside, we identified 19 differentially accumulated flavonoids in I. indigotica fruits. These include taxifolin, kaempferide, quercetin, astragalin, isovitexin 2″-O-beta-D-glucoside, 1-O-galloyl-beta-D-glucose and (−)-epigallocatechin, which were more abundant in Iin4 than in the three other stages (Figure 5A). Taxifolin and quercetin, both flavonols, have attracted broad attention from dietitians and medicinal chemists due to their wide range of health benefits, including the prevention of various malignancies (Brito et al., 2015; Das et al., 2021). Moreover, flavonols are known to stabilise anthocyanin colouration through co-pigmentation (Houghton et al., 2021). Kaempferide has been confirmed to have anticancer potential and to be pharmacologically safe (Nath et al., 2015). Previous research has identified isovitexin as a potential therapeutic agent due to its antioxidant and anti-inflammatory properties (Zhang et al., 2011; He et al., 2016). Grzesik et al. (2018) indicated that catechins and epicatechin are effective radical scavengers with potential for drug development to promote healthy ageing.
The significant accumulation of these flavonoids likely enhances the antioxidant activity of mature I. indigotica fruit, as supported by the ABTS and DPPH antioxidant activity assays. The ABTS antioxidant activity of mature I. indigotica fruit (20.38 ± 0.24 mmol TE g−1 FW) is substantially higher than that reported for many other species, such as red and black rice (ranging from 80.29 to 101.43 μmol TE g−1) (Chen et al., 2022), blueberry (149.8 μmol TE g−1 DW), blackberry (114.8 μmol TE g−1 DW) and strawberry (44.4 μmol TE g−1 DW) (Huang et al., 2012). Similarly, the DPPH antioxidant activity of mature I. indigotica fruit (3.9874 ± 0.14 mmol TE g−1 FW) is also much higher than that of many species, including 25 varieties of Hibiscus sabdariffa (ranging from 27.3 ± 0.3 to 112 ± 8 μmol TE g−1 DW) (Borrás-Linares et al., 2015) and two varieties of mustard grains (Brassica nigra and Sinapis alba) (2.01 to 58.7 μmol TE g−1) (Rasera et al., 2019).
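Because the I. indigotica values are reported in mmol TE g−1 while the literature values are in μmol TE g−1, the comparison is easier to read once everything is placed on one unit. The following is a minimal sketch of that arithmetic using only the ABTS values quoted above; the function and variable names are illustrative and not part of the study. It also assumes that values on a fresh-weight (FW) basis and a dry-weight (DW) basis can be compared directly, which is only a rough approximation.

```python
# Minimal sketch: put the quoted ABTS values on a common unit (umol TE per g)
# and compute rough fold differences. Names are illustrative, not from the study.
# Caveat: the I. indigotica value is per g FW, the references are per g DW,
# so the ratios are indicative only.

UMOL_PER_MMOL = 1000.0


def mmol_to_umol(value_mmol):
    """Convert a value in mmol TE per g to umol TE per g."""
    return value_mmol * UMOL_PER_MMOL


def fold_difference(sample, reference):
    """Return sample / reference; both must be in the same unit."""
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return sample / reference


# ABTS value for mature I. indigotica fruit quoted in the text (mmol TE per g FW)
isatis_abts_umol = mmol_to_umol(20.38)

# Literature ABTS values quoted in the text (umol TE per g DW)
reference_abts_umol = {
    "blueberry": 149.8,
    "blackberry": 114.8,
    "strawberry": 44.4,
}

abts_folds = {
    name: fold_difference(isatis_abts_umol, value)
    for name, value in reference_abts_umol.items()
}

for name, fold in abts_folds.items():
    print(f"I. indigotica vs {name}: about {fold:.0f}-fold")
```

Differences of two orders of magnitude should be interpreted with the FW/DW basis and assay conditions of each study in mind.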
Additionally, mature I. indigotica fruits are rich in several other active components, such as melibiose and isochlorogenic acid B. Melibiose, an important reducing disaccharide, has been shown to have beneficial applications in both medicine and agriculture (Xu et al., 2017). Isochlorogenic acid has multiple biological and pharmacological effects, including antioxidant, anti-inflammatory, antimicrobial, hypoglycemic, neuroprotective, cardiovascular protective, and hepatoprotective properties (Wang et al., 2020b). A high dietary intake of fruits and vegetables rich in antioxidants has been linked to preventive effects against oxidative stress-mediated diseases (Pinto et al., 2023). Consequently, I. indigotica fruit can be presumed to have pharmacological benefits and to be promising as an antioxidant additive.
4.2 Candidate structural genes responsible for flavonoid biosynthesis and accumulation
The biosynthesis of flavonoids begins with phenylalanine, which is converted into phenylpropanoids that enter the flavonoid-anthocyanin pathway. The synthesis of flavonoids is controlled by multiple structural genes (such as CHS, CHI, F3'H, F3'5'H, FLS and ANS) (Hichri et al., 2011) and transcription factors (TFs) (such as MYB, bHLH, bZIP and WD40) (Malacarne et al., 2016; Wang et al., 2017). In this study, we identified 20 differentially expressed structural genes involved in flavonoid biosynthesis, including IiPAL, IiC4H, Ii4CL, IiCHS, IiCHI, IiF3H, IiDFR, IiANS, IiFLS and IiUGT. Most of these genes were highly expressed in Iin3 and Iin4, as confirmed by qRT-PCR (Figure 7), and their expression correlates with the high flavonoid content at these developmental stages. As key upstream enzyme genes of the phenylpropanoid pathway, PAL, C4H, 4CL, CHS and CHI play essential roles in determining the metabolic flow towards flavonoids, phenolic acids, and lignin (Ma et al., 2013). The imbalance in expression between the Ii4CL and IiCHI genes was likely responsible for the diversity of phenylpropanoid derivatives found in I. indigotica fruits. The expression levels of IiPAL, IiC4H and IiCHS were significantly higher in Iin3 and Iin4 than in Iin1 and Iin2, which is consistent with the higher contents of many flavonoids in the more mature stages of the fruit (Figure 4). Over-expression of the CHS gene enhances resistance to high light by increasing anthocyanin synthesis (Zhang et al., 2018). F3H is one of the core enzymes acting at the bifurcation between anthocyanin and flavonol biosynthesis and converts flavanones into dihydroflavonols, which are intermediates in the biosynthesis of flavonols (Prescott and John, 1996). Thus, we speculate that the high expression of IiF3H in Iin3 and Iin4 accounts for the significant accumulation of taxifolin, a common precursor for anthocyanins and condensed tannins (Xie et al., 2004).
FLS is a key enzyme directing the metabolic flux toward flavonol production (Winkel-Shirley, 2001), and IiFLS expression was particularly high in Iin4. This is likely the cause of the increased accumulation of kaempferide and quercetin in Iin4 (Figure 4). In the anthocyanin biosynthesis pathway, the key genes DFR and ANS control the flux of flavonoids into anthocyanins and have key roles in regulating the biosynthesis of anthocyanins in purple stems of Astragalus membranaceus (Dong and Lin, 2021). The elevated expression of IiDFR and IiANS in Iin3 and Iin4, combined with the high expression of IiUGT, suggests a coordinated upregulation of the anthocyanin biosynthetic pathway, leading to the observed accumulation of cyanidin 3-glucoside and other anthocyanins in these stages.
Several studies have confirmed the role of UGT genes in flavonoid 3-O-glycoside accumulation (Ono et al., 2010; Xie et al., 2022). In our study, the correlation analysis of DEGs and DAMs related to flavonoids indicated that the expression levels of many DEGs were positively associated with the accumulation of flavonoid glycosides.
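A gene-metabolite correlation analysis of this kind can be sketched as follows. This is an illustration only: it assumes a Pearson correlation across the four developmental stages, which may differ from the method actually used, and the gene names, metabolite name and all numerical values are hypothetical placeholders rather than data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(ss_x * ss_y)


# HYPOTHETICAL expression values (e.g. FPKM) at stages Iin1..Iin4
gene_expression = {
    "gene_A": [1.0, 2.0, 8.0, 12.0],  # rises during development
    "gene_B": [9.0, 7.0, 3.0, 1.0],   # falls during development
}

# HYPOTHETICAL relative abundance of one flavonoid glycoside at Iin1..Iin4
metabolite_abundance = {
    "metabolite_X": [0.5, 1.0, 4.0, 6.5],
}

# Correlate every gene with every metabolite across the stages
correlations = {
    (gene, metabolite): pearson_r(expr, abund)
    for gene, expr in gene_expression.items()
    for metabolite, abund in metabolite_abundance.items()
}

for (gene, metabolite), r in correlations.items():
    print(f"{gene} vs {metabolite}: r = {r:.3f}")
```

With only four stage means, such coefficients are descriptive; in practice they would be computed over all biological replicates and filtered by a significance threshold.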
4.3 Candidate transcription factors related to flavonoid biosynthesis and accumulation
As widely recognized, the biosynthesis and distribution of flavonoids in plant tissues are fine-tuned and tightly regulated by structural genes and transcription factors. To date, transcription factors, such as MYB, bHLH, WD40, bZIP, MADS, and WRKY proteins, have been proven to be involved in flavonoid regulation (Amato et al., 2019; Jaakola, 2013; Li et al., 2016b).
In this study, three IiMYB and five IibHLH genes exhibited a positive correlation with flavonoid accumulation. The WRKY, NAC, MADS-box, and bZIP families act as positive regulators of anthocyanin biosynthesis by interacting directly or indirectly with the MYB-bHLH-WD40 complex, which serves as a major regulator determining the activation and the spatial and temporal expression of anthocyanin biosynthesis genes (Amato et al., 2019; Jaakola, 2013; Li et al., 2016b).
In the present study, we observed that two IiNAC genes showed a positive correlation with cyanidin 3-glucoside content, and seven IiNAC genes showed a positive relationship with taxifolin, epigallocatechin and 1-O-galloyl-beta-D-glucose. This suggests a potential functional divergence among members of the NAC gene family in the anthocyanin biosynthesis of I. indigotica fruit. In addition, two lateral organ boundary domain (IiLBD) genes were positively correlated with naringenin and luteolin. Zhang et al. (2019) found that CsLBDs are involved in the regulation of flavonoid synthesis in Camellia sinensis. We identified six IiZF-HDs that displayed a close relationship with flavonoids, including (−)-epigallocatechin, naringenin, 1-O-galloyl-beta-D-glucose and taxifolin. The ZF-HDs are involved in various biological processes, such as the response to abiotic stress and the development of phytohormones (Tan and Irish, 2006; Wang et al., 2016).
4.4 Key structural genes responsible for amino acid biosynthesis and accumulation
Essential amino acids are vital for human and animal nutrition, serving as intermediates for pharmaceuticals (Yamamoto et al., 2017). In this study, we identified 18 amino acids in I. indigotica fruits (Figure 6). Among them, His, Phe, Trp and Cit showed an increase with the development of I. indigotica fruits. Phe is a key metabolic node that plays an essential role in the interconnection between primary and secondary metabolism in plants and serves as a precursor for numerous plant compounds, such as flavonoids. The abundance of Phe provides essential precursors for the synthesis of flavonoids, which might contribute to the enhanced accumulation of flavonoids in Iin3 and Iin4. Additionally, Phe and Trp are needed for the production of chemical messengers (neurotransmitters) in the brain, including dopamine, epinephrine, and norepinephrine (Fernstrom and Fernstrom, 2007).
In mature I. indigotica fruit, the high accumulation of Phe might be due to the increased expression of IiCMs (EVM0007160 and EVM0020344) (Figure 6). Over-expression of PhCM2 in petunia has been shown to increase the flux in cytosolic Phe biosynthesis (Lynch et al., 2020). We found abundant Asn, Lys, Met and Thr in Iin2, which might be linked to the higher expression levels of IiAspS (EVM0009283) and IiAK (EVM0012157 and EVM0014845).
Similarly, the higher Pro content in Iin2 was correlated with high expression of pyrroline-5-carboxylate synthase (IiP5CS) (novel.2437). The P5CS gene encodes a bifunctional enzyme that catalyzes the rate-limiting reaction in Pro biosynthesis and is commonly used in metabolic engineering for Pro overproduction (Rai and Penna, 2013). The accumulation of Glu in immature I. indigotica fruits might be attributed to the high expression of N-acetylglutamate synthase (NAGS). As the fruit ripens, Glu serves as a precursor that is converted into different metabolites, including Arg, Cit, Gln and His, which in turn govern osmotic adjustment, protein synthesis, redox balance, and other aspects of cellular metabolism (Brosnan and Bronson, 2013).
5 CONCLUSION
I. indigotica is a well-known and important medicinal plant that has been used in medical treatments, the pharmaceutical industry, and the handmade textile industry for hundreds of years. In this study, the biochemical properties, metabolomic profiles and gene expression patterns of I. indigotica fruits at four developmental stages were examined (Figure 8). Cyanidin-3-O-glucoside may play a pivotal role in the accumulation of the purple pigment in I. indigotica fruit. The mature I. indigotica fruit exhibited significantly higher antioxidant activities than those of many reported species, which might be attributed to the abundant flavonoids and several other active components in mature I. indigotica fruit. The high accumulation of these flavonoids is closely related to the highly expressed structural genes (IiCHS, IiF3H, IiFLS, IiDFR, IiANS and IiUGT) and TF-encoding genes (IiMYB, IibHLH, IibZIP and IiNAC) involved in flavonoid biosynthesis. Additionally, many amino acids were also abundant in mature I. indigotica fruit, likely due to the high expression of genes involved in amino acid biosynthesis. These results suggest that I. indigotica fruit may have pharmacological benefits and potential as an antioxidant additive.

AUTHOR CONTRIBUTIONS
Hui Huang planned and designed the research, analyzed data and wrote the manuscript. Li Zhang performed the experiments, analyzed data and wrote the original draft. Liye Guan and Libin Zhang collected resources and performed the experiments.
FUNDING INFORMATION
This work was supported by the National Natural Science Foundation of China (32270249); Key Scientific Research Projects of Hunan Education Department (22A0542); a grant from the Foundation of Hunan Double First-rate Discipline Construction Projects of Bioengineering and Key Laboratory of Research and Utilization of Ethnomedicinal Plant Resources of Hunan Province (SWGC-04); Postdoctoral Directional Training Foundation of Yunnan Province (E23178L261); and the Innovation and Entrepreneurship Training Program for College Students of Hunan Provincial Education Department (S202210548051).
Open Research
DATA AVAILABILITY STATEMENTS
The full RNA-seq data have been submitted to the Sequence Read Archive (SRA) of the NCBI under BioProject accession PRJNA953646 (https://www.ncbi.nlm.nih.gov/sra/PRJNA953646).