Volume 176, Issue 6 e14653
ORIGINAL RESEARCH

Harnessing light-harvesting chlorophyll a/b-binding proteins for multiple abiotic stress tolerance in Chlamydomonas reinhardtii: Insights from genomic and physiological analysis

Ali Raza

Guangdong Key Laboratory of Plant Epigenetics, Shenzhen Engineering Laboratory for Marine Algal Biotechnology, Guangdong Technology Research Center for Marine Algal Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen, China

Shenzhen Collaborative Innovation Public Service Platform for Marine Algae Industry, Longhua Innovation Institute for Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Yiran Li

Guangdong Key Laboratory of Plant Epigenetics, Shenzhen Engineering Laboratory for Marine Algal Biotechnology, Guangdong Technology Research Center for Marine Algal Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen, China

Shenzhen Collaborative Innovation Public Service Platform for Marine Algae Industry, Longhua Innovation Institute for Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Hafiz Muhammad Rizwan

Shenzhen Key Laboratory of Food Nutrition and Health, College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen, China

Asadullah Khan

Guangdong Key Laboratory of Plant Epigenetics, Shenzhen Engineering Laboratory for Marine Algal Biotechnology, Guangdong Technology Research Center for Marine Algal Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen, China

Shenzhen Collaborative Innovation Public Service Platform for Marine Algae Industry, Longhua Innovation Institute for Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Yuqi Peng

Guangdong Key Laboratory of Plant Epigenetics, Shenzhen Engineering Laboratory for Marine Algal Biotechnology, Guangdong Technology Research Center for Marine Algal Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Shenzhen Collaborative Innovation Public Service Platform for Marine Algae Industry, Longhua Innovation Institute for Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Chunli Guo

Guangdong Key Laboratory of Plant Epigenetics, Shenzhen Engineering Laboratory for Marine Algal Biotechnology, Guangdong Technology Research Center for Marine Algal Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Shenzhen Collaborative Innovation Public Service Platform for Marine Algae Industry, Longhua Innovation Institute for Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Zhangli Hu

Corresponding Author

Guangdong Key Laboratory of Plant Epigenetics, Shenzhen Engineering Laboratory for Marine Algal Biotechnology, Guangdong Technology Research Center for Marine Algal Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Shenzhen Collaborative Innovation Public Service Platform for Marine Algae Industry, Longhua Innovation Institute for Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China

Correspondence

Zhangli Hu,

Email: [email protected]

First published: 11 December 2024
Citations: 2
Edited by C.H. Foyer

Abstract

Light-harvesting chlorophyll a/b-binding proteins (LHC) of photosystem II perform key functions in various processes, e.g., photosynthesis, development, and abiotic stress responses. Nonetheless, a comprehensive genome-wide investigation of LHC family genes (CrLHCs) has not been reported in the single-celled alga Chlamydomonas reinhardtii. Here, we identified 61 putative CrLHC genes in the C. reinhardtii genome and observed that most genes show conserved exon-intron and motif configurations. We predicted five phytohormone-related and six abiotic stress-related cis-regulatory elements in the promoter regions of CrLHC genes. Likewise, 19 miRNAs targeting 42 CrLHC genes from 16 unique families were identified. In addition, we identified 400 transcription factors from 13 families, including ERF, GATA, CPP, bZIP, C3H, MYB, SBP, Dof, bHLH, C2H2, and G2-like. Protein–protein interactions and 3D structures provided insight into CrLHC proteins. Gene ontology and KEGG-based enrichment supported their role in light responses, photosynthesis, and energy metabolism. Expression analysis highlighted the shared and unique roles of many CrLHC genes under different abiotic stresses (UV-C, green light, heat, nitric oxide, cadmium, nitrogen starvation, and salinity). Under salinity stress, antioxidant enzyme activity, reactive oxygen species markers, photosynthesis-related traits and pigments were significantly affected. In brief, this comprehensive genomic and physiological study sheds light on the impact of CrLHC genes on abiotic stress tolerance and sets the path for future genetic engineering experiments.

1 INTRODUCTION

Green plants convert light energy into chemical energy through photosynthesis, which is vital for cellular functions. This process depends on the capture and transfer of light via chlorophyll (Chl) (Wang and Grimm 2021; Kirilovsky and Büchel 2019). Among the most abundant thylakoid membrane proteins are the light-harvesting chlorophyll a/b-binding (LHC) superfamily proteins encoded by the nuclear genome, which are crucial for maximizing light capture, photosynthetic efficiency, photoprotection, and overall yield across photosynthetic systems (Jansson 1999; Teramoto & Minagawa, 2001; Engelken et al. 2010; Caffarri et al. 2014; Kirilovsky and Büchel 2019; Wang and Grimm 2021). In the green alga Chlamydomonas reinhardtii, the arrangement of LHC proteins is very close to that of higher plants. Still, studies have revealed differences in their molecular structures and gene family organization, which provide insights into their evolutionary adaptation to diverse environmental settings (Bassi and Wollman 1991; Minagawa et al. 1998). Specific LHC proteins, e.g., those encoded by the Lhcb5 and cabII-1 genes, play vital roles in the regulation and function of the “light-harvesting antenna” in C. reinhardtii, like their homologs in plants (Imbault et al. 1988; Hwang and Herrin 1993). The LHC family is distinguished by a well-conserved Chl-binding domain and is principally divided into two groups (i.e., Lhca and Lhcb), which are associated with the antenna proteins of photosystem I (PSI) and photosystem II (PSII), respectively, and are also known as “light-harvesting complex” antenna superfamily proteins (Jansson 1999; Andersson et al. 2003; Caffarri et al. 2014). In general, these proteins comprise three transmembrane helices and form complexes with Chl (including a, b, or c), carotenoids, and lipids, which are required to capture and transfer solar energy during photosynthesis (Jansson 1999; Caffarri et al. 2014).
Notably, PSI proteins are very effective in energy transfer in C. reinhardtii. PSI proteins exist in two complexes: one containing a light-harvesting reaction center and the other forming a PSI-LHC supercomplex (Su et al. 2019). PSII is associated with Lhcb proteins, and the 3D structure of this system has been resolved in species such as Synechococcus elongatus and C. reinhardtii (Nield et al. 2000).

LHC proteins are among the most abundant proteins on Earth and perform important roles in the capture, control, and distribution of excitation energy between PSI and PSII, providing photoprotection and assisting in responses to various stresses. These proteins are therefore necessary not only for light capture but also for sustaining photoprotection under stressful environments in plants, algae, and cyanobacteria (Jansson 1999; Engelken et al. 2010; Caffarri et al. 2014; Pietrzykowska et al. 2014; Ruban 2015; Dall'Osto et al. 2017; Dall'Osto et al. 2020). For instance, the overexpression of BnLhcb3 from rapeseed (Brassica napus L.) improves cold stress tolerance in transgenic Arabidopsis by regulating several physio-biochemical parameters, including Chl fluorescence (Zhang et al. 2022b). Likewise, the overexpression of MdLhcb4.3 from apple (Malus domestica) in apple callus and transgenic Arabidopsis improved osmotic and drought stress tolerance (Zhao et al. 2020). Another protein, Arabidopsis Lhcb6, has been reported to be associated with improved oxidative stress tolerance by regulating reactive oxygen species (ROS) homeostasis and boosting photoprotection in natural conditions (Chen et al. 2018). Many other investigations have reported similar findings in different photosynthetic organisms (Zhao et al. 2020; Zhang et al. 2021b; Luo et al. 2022; Zhang et al. 2022b; Wu et al. 2023; Kosugi et al. 2024).

Since the 1960s, the unicellular alga C. reinhardtii has served as a model system for photosynthesis research, yielding valuable insights into the biogenesis of photosynthetic complexes, regulation of photosynthesis-related gene expression, biosynthesis of useful pigments, and modulation of photosynthetic activities under diverse stress conditions (Sasso et al. 2018; Lu et al. 2020a). Although some LHC genes have been functionally characterized previously, as reviewed by Lu et al. (2020a), a comprehensive genomic analysis of the LHC gene family of C. reinhardtii (hereafter CrLHC) is still lacking and calls for in-depth in silico analysis. To address this research gap, we employed diverse genomic tools to identify and characterize CrLHC gene family members in C. reinhardtii. Moreover, we examined their expression levels under diverse abiotic stresses, including UV-C, green light, heat, nitric oxide, cadmium, nitrogen starvation, and salinity stress. To gain further insights, we also examined key physiological traits in response to salinity stress, including photosynthesis-related parameters, antioxidant activities, ROS markers, and photosynthetic pigment contents. Our comprehensive genomic and physiological study will help clarify how CrLHC genes respond to diverse stress conditions and lay the foundation for future functional investigations in C. reinhardtii.

2 MATERIALS AND METHODS

2.1 Discovery and physicochemical aspects of LHC members in C. reinhardtii

To identify the LHC members in the C. reinhardtii genome, the latest genome of C. reinhardtii CC-4532 v6.1 was obtained from Phytozome. The HMM profile (PF00504) of the LHC domain was retrieved from the Pfam database and used for an HMMER search of the C. reinhardtii genome via TBTools-II v2.076 with default settings (Chen et al. 2023a). Secondly, 21 Arabidopsis thaliana LHC members were used as queries in a BLASTp search against the C. reinhardtii genome. The LHC domain was confirmed through NCBI-CDD (Lu et al. 2020b) and SMART (Letunic et al. 2021). Sequences containing the “PF00504 domain” were designated as putative CrLHC genes. We then manually removed redundant isoforms and kept only the unique IDs with different genomic positions, resulting in 61 CrLHC genes (CrLHC1-CrLHC61). We simplified the gene names and collectively termed them LHC following previous studies (Teramoto & Minagawa, 2001; Teramoto et al. 2002; Luo et al. 2022; Lan et al. 2022).

Physicochemical properties of CrLHC proteins were evaluated using the ProtParam tool on the ExPASy site (Gasteiger et al. 2005). The subcellular locations of CrLHC proteins were predicted with WoLF PSORT (Horton et al. 2007). Exon-intron structures of all CrLHC genes were determined using TBTools-II v2.076 software. Conserved motifs in CrLHC proteins were identified using the MEME (v5.5.5) website (Bailey et al. 2009).
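Two of the ProtParam indices reported later in Table 1 can be reproduced from sequence alone. The following is a minimal sketch, assuming the standard Kyte-Doolittle hydropathy scale for GRAVY and Ikai's coefficients (2.9 for Val, 3.9 for Ile and Leu) for the aliphatic index; the function names are illustrative and are not part of ProtParam or of the authors' pipeline.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    n = len(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


if __name__ == "__main__":
    toy = "MAVILKAG"  # hypothetical peptide, not a CrLHC sequence
    print(round(gravy(toy), 3), round(aliphatic_index(toy), 2))
```

Positive GRAVY values indicate an overall hydrophobic protein, and a higher aliphatic index is usually read as a sign of greater thermostability, which matches the interpretation given in the note to Table 1.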

2.2 Evaluation of chromosomal positions and phylogenetic relationship

The chromosomal positions of CrLHC genes were acquired from Phytozome and plotted on chromosomes using TBTools-II. To determine the evolutionary relationships of LHC proteins, a phylogenetic tree was constructed from C. reinhardtii and A. thaliana members. Multiple sequence alignment and tree construction were performed with MEGA7 software (Kumar et al. 2018) using the neighbour-joining (NJ) method with 1000 bootstrap replicates, and the tree was further refined using iTOL (Letunic and Bork 2021).

2.3 Prediction of cis-regulatory elements

To predict the putative cis-regulatory elements in CrLHC promoters, 2 kb sequences upstream of the start codons were extracted from the C. reinhardtii genome. We selected the 2 kb upstream region because it is widely considered to harbour the key cis-regulatory elements that regulate gene transcription, while limiting interference from non-functional sequence. The promoter regions were then analyzed via the PlantCARE website (Lescot et al. 2002), and the findings were visualized using TBTools-II software.
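The coordinate arithmetic behind extracting a 2 kb upstream window can be sketched as follows. This is an illustration only, assuming 1-based inclusive coordinates and a known strand for each gene; `upstream_window` is a hypothetical helper, not a TBTools-II function, and the source does not describe how strand was handled.

```python
from typing import Optional, Tuple


def upstream_window(
    feature_start: int,
    feature_end: int,
    strand: str,
    chrom_len: int,
    length: int = 2000,
) -> Optional[Tuple[int, int]]:
    """Return the (start, end) of the upstream region, 1-based inclusive.

    On the plus strand the window ends one base before the feature start.
    On the minus strand it begins one base after the feature end, and the
    extracted sequence would then need to be reverse-complemented.
    The window is clipped at the chromosome boundaries.
    """
    if strand == "+":
        end = feature_start - 1
        start = max(1, end - length + 1)
    elif strand == "-":
        start = feature_end + 1
        end = min(chrom_len, start + length - 1)
    else:
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    if start > end:
        return None  # feature sits at the chromosome edge; nothing upstream
    return (start, end)


if __name__ == "__main__":
    # Hypothetical coordinates on a 10 kb toy chromosome.
    print(upstream_window(5000, 6000, "+", 10000))
    print(upstream_window(4000, 5000, "-", 10000))
```

The same helper covers the shorter 500 bp window used for transcription factor prediction by passing `length=500`.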

2.4 miRNA target prediction and functional annotation

The coding sequences of all CrLHC genes were analyzed using the psRNATarget site (Dai et al. 2018) with default parameters to predict possible miRNA target sites. The interaction network between miRNAs and CrLHC genes was visualized with Cytoscape (v3.10.2) (Shannon et al. 2003). The synteny plot was created using an online Bioinformatics platform (Tang et al. 2023). GO and KEGG enrichment analyses were performed using eggNOG v6.0 (Hernández-Plaza et al. 2023), and the enrichment plots were created using the online Bioinformatics platform.

2.5 TF regulatory network prediction

The 500 bp upstream regions of CrLHC genes, which usually comprise the core promoter elements and TF binding sites essential for transcriptional regulation, were examined with PlantRegMap to predict putative TFs (Tian et al. 2020). A threshold of p ≤ 1e−6 was used to keep the analysis focused and stringent. The predicted TF-CrLHC gene connections were visualized using Cytoscape software.

2.6 Prediction of protein–protein interaction and the 3D structures

The CrLHC protein sequences were analyzed using the STRING v12.0 database (Szklarczyk et al. 2023), with parameters set to a full STRING network, up to ten interactors, and a minimum interaction score of 0.4 (medium confidence). Secondary structure prediction and 3D modelling of all CrLHC proteins were performed via the Phyre2 online site with default settings.

2.7 Expression profiling through transcriptome datasets

Expression profiles of all CrLHC genes under different stress conditions, including UV-C, green light, heat, nitric oxide, cadmium, and nitrogen starvation, were evaluated using transcriptome datasets. The expression data for UV-C (0.1 W m−2 at a wavelength of 250 nm) were obtained from Colina et al. (2020), and samples were harvested at 0 (control, CT), 5, and 24 h. Green light (80 ± 5 μmol photons m−2 s−1 at a wavelength of 505 nm) data were obtained from CNGBdb (https://db.cngb.org/) with accession number CNP0004104. Samples were collected at various intervals, including 0 (CT), 0.5, 1, 2, 4, 8, and 12 h, with 3 biological replicates, as explained by Liu et al. (2024). Heat stress (35 and 40°C) data were obtained from GEO with accession code GSE182207 (Zhang et al. 2022a). The samples were harvested at pre-heat, 0 (CT), 0.5, 1, 2, 4, 8, 16, and 24 h with 3 biological replicates under both temperature conditions; see Zhang et al. (2022a) for more information.

Nitric oxide treatments used S-nitroso-N-acetylpenicillamine (SNAP; 0.3 mM) and 2-(4-carboxyphenyl)-4,4,5,5-tetramethylimidazoline-1-oxyl-3-oxide (cPTIO; 0.4 mM) for 1 h, and the data were obtained from SRA using the BioProject accession code PRJNA629395. The experimental details are given by Kuo and Lee (2021). The expression data for cadmium stress (110 μM) were obtained from Zhang et al. (2023), and the samples were harvested after 3 h with three biological replicates. The data for nitrogen starvation were obtained from SRA with accession number PRJNA255778, and the samples were collected at nine time points, including 0 h (CT), 10 min, 30 min, 1 h, 2 h, 6 h, 8 h, 24 h, and 48 h, with two replicates, as explained by Yang et al. (2022). The transcriptome datasets were processed as described in a previous study (Raza et al. 2021). To account for large differences in expression levels, the transcripts per million (TPM) values were log10-transformed for all heatmaps. The expression heatmaps were generated with TBTools-II.
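The log10 transformation of TPM values can be illustrated with a short sketch. The pseudocount of 1 used here is an assumption added so that zero TPM values remain defined; the source does not state whether or which pseudocount was applied, and `log10_tpm` is a hypothetical helper name.

```python
import math
from typing import Iterable, List


def log10_tpm(values: Iterable[float], pseudocount: float = 1.0) -> List[float]:
    """Return log10(TPM + pseudocount) for each value.

    The pseudocount (assumed, not taken from the source) keeps genes with
    TPM = 0 at a finite value instead of raising a math domain error.
    """
    return [math.log10(v + pseudocount) for v in values]


if __name__ == "__main__":
    # Hypothetical TPM values spanning several orders of magnitude.
    print(log10_tpm([0.0, 9.0, 99.0, 999.0]))
```

Compressing the scale this way keeps a few very highly expressed genes from dominating the heatmap colour range.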

2.8 Chlamydomonas material preparation and stress treatments

For stress treatments, the wild-type strain “CC5325” of C. reinhardtii was acquired from the Chlamydomonas Resource Center (St. Paul, MN 55108, USA). The CC5325 strain was cultivated in TAP medium at 25°C under standard light (40–50 μE m−2 s−1) conditions. When the algae reached the logarithmic growth phase, cold (4°C, based on our preliminary examination) and salinity (200 mM NaCl, based on a previous study by Fal et al. (2022)) stress treatments were applied. The relevant physiological indexes were measured in samples taken at 0, 6, 9, 12, and 24 h. Each treatment group was repeated three times.

2.9 Measurement of antioxidant enzymes, ROS markers, and photosynthetic pigments under salinity stress

To examine the physiological adjustments during salinity stress, we measured antioxidant enzyme activities [i.e., catalase (CAT), superoxide dismutase (SOD), and peroxidase (POD)] with commercial kits, following the manufacturer's instructions. Likewise, the contents of malondialdehyde (MDA) and hydrogen peroxide (H2O2) were determined using the respective kits. The kits for CAT (AKAO003-2 M), SOD (AKAO001M), POD (AKAO005M), MDA (AKFA013M), and H2O2 (AKAO009M) were purchased from Boxbio Science & Technology Co., Ltd. Complete protocol booklets are accessible on the company's site (https://www.boxbio.cn/?bd_vid=11884392139909970175) using the aforementioned kit IDs. All traits were determined in triplicate using an Epoch microplate reader (BioTek Instruments, Inc.).

Photosynthetic pigments, including Chl a, Chl b, and carotenoid contents, were measured with a microplate-based assay and calculated following the formula of Sartory and Grobbelaar (1984). Briefly, 2 mL of algal culture was sampled at room temperature and centrifuged at 737.5 × g at 4°C for 1 min. After discarding the supernatant, an equal volume (2 mL) of 95% ethanol was added, and the sample was kept in a refrigerator at 4°C in the dark for 24 h. The OD values of the extract at 665, 649, and 470 nm were measured with a UV–VIS spectrophotometer.
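To show how the three absorbance readings translate into pigment concentrations, the sketch below uses the widely cited 95% ethanol coefficients attributed to Lichtenthaler and Wellburn (1983). This is a swapped-in formula for illustration: the source cites Sartory and Grobbelaar (1984), whose exact equation is not reproduced in the text and may use different coefficients. The function name is hypothetical, and results are in μg per mL of extract.

```python
from typing import Tuple


def pigments_ethanol95(a665: float, a649: float, a470: float) -> Tuple[float, float, float]:
    """Estimate Chl a, Chl b and total carotenoids (ug/mL extract).

    Coefficients are the commonly used values for 95% ethanol extracts
    (assumed here; not confirmed to be the formula used in the study).
    """
    chl_a = 13.95 * a665 - 6.88 * a649
    chl_b = 24.96 * a649 - 7.32 * a665
    carotenoids = (1000.0 * a470 - 2.05 * chl_a - 114.8 * chl_b) / 245.0
    return chl_a, chl_b, carotenoids


if __name__ == "__main__":
    # Hypothetical OD readings for a single extract.
    chl_a, chl_b, car = pigments_ethanol95(a665=1.0, a649=0.5, a470=1.0)
    print(round(chl_a, 2), round(chl_b, 2), round(car, 2))
```

Because the extraction volume equals the culture volume (2 mL each), the values per mL of extract can be read directly as values per mL of culture under this protocol.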

2.10 Measurement of photosynthesis-related parameters under salinity stress

The following parameters were measured using the PAM-fluorescence method (Heinz Walz GmbH, PAM2500) for all biological replicates in each group: Y(II) (effective quantum yield of PSII photochemistry), ETR (electron transport rate), NPQ (non-photochemical quenching), Y(NPQ) (yield of non-photochemical quenching), Y(NO) (yield of non-regulated energy dissipation), qN (non-photochemical quenching coefficient), qP (photochemical quenching coefficient), and qL (fraction of open PSII centers). Cells were harvested on the fourth day during the exponential growth phase and acclimated in darkness for 15 min prior to measuring quantum yield and ETR.
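The listed parameters are all derived from a handful of raw fluorescence levels. The sketch below uses the commonly cited textbook definitions (Genty-type Y(II), the simplified Y(NO) = F/Fm and Y(NPQ) = F/Fm' − F/Fm partitioning, and the lake-model qL); it is an assumption that these match the settings of the instrument software used in the study. The absorptance of 0.84 and the PSII fraction of 0.5 in the ETR term are conventional defaults, also assumed.

```python
from typing import Dict


def pam_parameters(
    fo: float,        # minimal fluorescence, dark-adapted
    fm: float,        # maximal fluorescence, dark-adapted
    f: float,         # steady-state fluorescence in the light
    fm_prime: float,  # maximal fluorescence in the light
    fo_prime: float,  # minimal fluorescence in the light
    par: float,       # actinic light, umol photons m-2 s-1
    absorptance: float = 0.84,    # assumed default
    psii_fraction: float = 0.5,   # assumed default
) -> Dict[str, float]:
    """Compute standard quenching parameters from raw fluorescence levels."""
    y_ii = (fm_prime - f) / fm_prime
    y_no = f / fm
    y_npq = f / fm_prime - f / fm
    npq = (fm - fm_prime) / fm_prime
    qp = (fm_prime - f) / (fm_prime - fo_prime)
    ql = qp * fo_prime / f
    qn = 1.0 - (fm_prime - fo_prime) / (fm - fo)
    etr = y_ii * par * absorptance * psii_fraction
    return {
        "Y(II)": y_ii, "Y(NO)": y_no, "Y(NPQ)": y_npq, "NPQ": npq,
        "qP": qp, "qL": ql, "qN": qn, "ETR": etr,
    }


if __name__ == "__main__":
    # Hypothetical fluorescence readings in arbitrary units.
    for name, value in pam_parameters(200, 1000, 400, 800, 180, 100).items():
        print(f"{name}: {value:.3f}")
```

With these definitions the three yields Y(II), Y(NPQ), and Y(NO) sum to 1, which is a quick internal check on any set of readings.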

2.11 RNA extraction and qRT-PCR-based expression profiling

Total RNA was extracted using the Plant RNA Extraction Kit (code: AG21019, Accurate Biology, Co., Ltd.), followed by reverse transcription with the Reverse Transcription Kit (code: AG11705, Accurate Biology, Co., Ltd.). qRT-PCR was performed on a qTOWER3 real-time PCR system (Analytik Jena) using SYBR Green Pro Taq HS Premix (code: AG11701, Accurate Biology, Co., Ltd.), with ACTIN (Cre13.g603700) as the reference gene (Tóth et al. 2024).

The qRT-PCR reaction was run as follows: 95°C for 30 s, followed by 40 cycles of 95°C for 5 s and 60°C for 30 s, and then a dissociation stage. Relative transcript levels were calculated with the 2^(−ΔΔCT) method (Livak and Schmittgen, 2001). Each biological sample was examined in triplicate, with technical triplicates performed for each biological replicate. Primers used for qRT-PCR are given in Table S1.
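The 2^(−ΔΔCT) calculation can be written out as a short sketch. Technical replicates are averaged before the reference gene is subtracted; the function name and the example CT values are hypothetical, and the approach assumes roughly equal amplification efficiency for target and reference, as the method itself does.

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    ct_target_treated: Sequence[float],
    ct_ref_treated: Sequence[float],
    ct_target_control: Sequence[float],
    ct_ref_control: Sequence[float],
) -> float:
    """Fold change of a target gene by the 2^(-ddCT) method.

    Each argument holds the technical-replicate CT values of one sample.
    """
    delta_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical CT values: target gene vs. reference gene (e.g. ACTIN).
    fold = relative_expression(
        ct_target_treated=[22.1, 21.9, 22.0],
        ct_ref_treated=[18.0, 18.1, 17.9],
        ct_target_control=[24.0, 24.1, 23.9],
        ct_ref_control=[18.0, 18.0, 18.0],
    )
    print(round(fold, 2))
```

A value above 1 indicates higher transcript abundance in the treated sample than in the control, and a value below 1 indicates lower abundance.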

2.12 Statistical data analysis and resources

The data were analyzed with GraphPad Prism V9 (Swift 1997). Statistical significance was assessed by one-way ANOVA followed by Dunnett's test for multiple comparisons, with significance levels specified as ****p < 0.0001, ***p ≤ 0.001, **p ≤ 0.01, *p ≤ 0.05. Results are presented as mean (± SD) from three biological replicates (n = 3). All graphs were prepared in GraphPad Prism.
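As a rough illustration of the first step of this analysis, the sketch below computes the omnibus F statistic of a one-way ANOVA by hand and maps a p-value to the star notation listed above. It deliberately stops short of Dunnett's post hoc comparisons and of computing p-values, which are left to statistical software; the function names and the "ns" label for non-significant results are assumptions.

```python
from statistics import mean
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def significance_stars(p: float) -> str:
    """Map a p-value to the star notation used in the text."""
    if p < 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"  # assumed label for non-significant results


if __name__ == "__main__":
    # Hypothetical data: control plus two time points, n = 3 each.
    print(one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]))
    print(significance_stars(0.003))
```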

A list of URLs for all tools and databases used in the comprehensive genomic analysis is provided in Table S1.

3 RESULTS

3.1 Comprehensive characterization of CrLHC members in C. reinhardtii genome

In this study, we identified 61 CrLHC genes in the C. reinhardtii genome (Tables 1 and S2). Henceforth, these genes are referred to as “CrLHC1–CrLHC61”; they were unevenly distributed across 15 chromosomes of the C. reinhardtii genome (Figures 1A and S1). The largest number of CrLHC genes (13) mapped to Chr06, followed by Chr10 with 9 genes, Chr02 with 7 genes, Chr08 with 6 genes, Chr07 with 5 genes, Chr12 and Chr16 with 4 genes each, Chr04 with 3 genes, and Chr01, Chr03, and Chr17 with 2 genes each. A single CrLHC gene was found on each of Chr09, Chr11, Chr13, and Chr14. No genes were located on Chr05 or Chr15 (Figures 1A and S1).
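The per-chromosome tallies above can be reproduced from the "Genomic position" column of Table 1. The following sketch assumes positions are written as `ChrNN (start:end)`, as they appear in the table; the helper name is hypothetical, and the example list contains only four of the 61 entries.

```python
import re
from collections import Counter
from typing import Iterable

# Matches strings such as "Chr06 (2884129:2886577)".
POSITION_PATTERN = re.compile(r"^(Chr\d+)\s*\((\d+):(\d+)\)$")


def chromosome_counts(positions: Iterable[str]) -> Counter:
    """Count how many genes map to each chromosome."""
    counts: Counter = Counter()
    for position in positions:
        match = POSITION_PATTERN.match(position.strip())
        if match is None:
            raise ValueError(f"unrecognized genomic position: {position!r}")
        counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # A small subset of Table 1 positions, for illustration only.
    subset = [
        "Chr01 (8133816:8135652)",
        "Chr01 (8133803:8135637)",
        "Chr02 (5003121:5006432)",
        "Chr09 (3147292:3148442)",
    ]
    for chrom, n in sorted(chromosome_counts(subset).items()):
        print(chrom, n)
```

Counting by the chromosome named in the position string, rather than by the gene ID prefix, matters for entries such as CrLHC37 and CrLHC38, whose IDs begin with Cre09 but whose listed positions are on Chr02.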

TABLE 1. Key physicochemical features of CrLHC proteins.
Gene ID | Gene name | Genomic position | Protein length (aa) | CDS length (bp) | Exons | Introns | Molecular weight (kDa) | Theoretical pI | Aliphatic index | GRAVY | Subcellular location
Cre01.g066917_4532.1 | CrLHC1 | Chr01 (8133816:8135652) | 257 | 771 | 6 | 5 | 27.57 | 5.96 | 79.77 | −0.06 | Chloroplast
Cre01.g066917_4532.2 | CrLHC2 | Chr01 (8133803:8135637) | 160 | 480 | 5 | 4 | 16.83 | 4.82 | 92.7 | 0.19 | Nucleus
Cre02.g110750_4532.1 | CrLHC3 | Chr02 (5003121:5006432) | 400 | 1200 | 10 | 9 | 42.47 | 8.16 | 94.51 | 0.058 | Chloroplast
Cre02.g143550_4532.1 | CrLHC4 | Chr02 (8246098:8248490) | 323 | 969 | 2 | 1 | 34.45 | 9.3 | 99.97 | 0.114 | Chloroplast
Cre02.g143550_4532.2 | CrLHC5 | Chr02 (8245849:8248492) | 281 | 843 | 3 | 2 | 29.81 | 5.92 | 103.82 | 0.249 | Chloroplast
Cre02.g143550_4532.3 | CrLHC6 | Chr02 (8246084:8248489) | 302 | 906 | 3 | 2 | 32.17 | 6.18 | 101.5 | 0.103 | Chloroplast
Cre02.g143550_4532.4 | CrLHC7 | Chr02 (8246104:8248490) | 264 | 792 | 4 | 3 | 27.87 | 5.9 | 105.36 | 0.262 | Chloroplast
Cre03.g146147_4532.1 | CrLHC8 | Chr03 (778778:781411) | 254 | 762 | 8 | 7 | 27.91 | 9.29 | 89.84 | 0.046 | Plasma membrane
Cre03.g156900_4532.1 | CrLHC9 | Chr03 (2152765:2155795) | 269 | 807 | 6 | 5 | 28.69 | 7.79 | 81.68 | −0.097 | Chloroplast
Cre04.g211850_4532.1 | CrLHC10 | Chr04 (1771656:1774799) | 237 | 711 | 5 | 4 | 24.29 | 8.71 | 96.86 | 0.234 | Chloroplast
Cre04.g232104_4532.1 | CrLHC11 | Chr04 (3990539:3993416) | 258 | 774 | 5 | 4 | 27.38 | 5.68 | 77.98 | −0.072 | Chloroplast
Cre04.g232104_4532.2 | CrLHC12 | Chr04 (3991279:3993226) | 229 | 687 | 3 | 2 | 24.5 | 4.86 | 83.51 | −0.005 | Chloroplast
Cre06.g272650_4532.1 | CrLHC13 | Chr06 (2884129:2886577) | 244 | 732 | 7 | 6 | 25.92 | 9.11 | 87.65 | 0.014 | Chloroplast
Cre06.g272650_4532.2 | CrLHC14 | Chr06 (2884202:2886528) | 164 | 492 | 7 | 6 | 17.4 | 6.73 | 89.88 | 0.045 | Chloroplast
Cre06.g278213_4532.1 | CrLHC15 | Chr06 (4019248:4021593) | 257 | 771 | 6 | 5 | 27.6 | 9.06 | 92.62 | 0.112 | Chloroplast
Cre06.g278213_4532.2 | CrLHC16 | Chr06 (4019248:4021794) | 256 | 768 | 6 | 5 | 27.45 | 9.06 | 92.98 | 0.101 | Chloroplast
Cre06.g278213_4532.3 | CrLHC17 | Chr06 (4019246:4021606) | 212 | 636 | 7 | 6 | 22.84 | 9.27 | 92.46 | 0.095 | Chloroplast
Cre06.g283050_4532.1 | CrLHC18 | Chr06 (5132577:5135243) | 229 | 687 | 8 | 7 | 23.9 | 7.97 | 81.05 | 0.078 | Chloroplast
Cre06.g283950_4532.1 | CrLHC19 | Chr06 (5226453:5228185) | 255 | 765 | 4 | 3 | 27.1 | 5.96 | 78.90 | −0.063 | Chloroplast
Cre06.g283950_4532.2 | CrLHC20 | Chr06 (5226992:5228178) | 161 | 483 | 2 | 1 | 16.88 | 4.74 | 92.12 | 0.197 | Chloroplast
Cre06.g284200_4532.1 | CrLHC21 | Chr06 (5250300:5254185) | 255 | 765 | 6 | 5 | 27.08 | 5.96 | 82.72 | −0.080 | Chloroplast
Cre06.g284200_4532.2 | CrLHC22 | Chr06 (5250300:5252483) | 255 | 765 | 5 | 4 | 27.08 | 5.96 | 82.72 | −0.080 | Chloroplast
Cre06.g284250_4532.1 | CrLHC23 | Chr06 (5255841:5258071) | 255 | 765 | 4 | 3 | 27.03 | 5.96 | 78.90 | −0.051 | Chloroplast
Cre06.g285250_4532.1 | CrLHC24 | Chr06 (5368741:5371542) | 254 | 762 | 4 | 3 | 26.96 | 5.96 | 79.60 | −0.039 | Chloroplast
Cre06.g285250_4532.2 | CrLHC25 | Chr06 (5369439:5371462) | 225 | 675 | 2 | 1 | 23.92 | 4.76 | 83.30 | 0.075 | Chloroplast
Cre07.g320400_4532.1 | CrLHC26 | Chr07 (1099540:1101108) | 194 | 582 | 3 | 2 | 19.97 | 9.03 | 98.60 | 0.408 | Chloroplast
Cre07.g320400_4532.2 | CrLHC27 | Chr07 (1099241:1101163) | 191 | 573 | 3 | 2 | 19.66 | 9.03 | 98.63 | 0.415 | Chloroplast
Cre07.g320450_4532.1 | CrLHC28 | Chr07 (1102345:1104022) | 191 | 573 | 3 | 2 | 19.66 | 9.03 | 98.63 | 0.415 | Chloroplast
Cre07.g344950_4532.1 | CrLHC29 | Chr07 (4675487:4677302) | 214 | 642 | 5 | 4 | 22.84 | 9.04 | 82.63 | −0.036 | Chloroplast
Cre07.g344950_4532.2 | CrLHC30 | Chr07 (4675487:4677284) | 214 | 642 | 5 | 4 | 22.84 | 9.04 | 82.63 | −0.036 | Chloroplast
Cre08.g365900_4532.1 | CrLHC31 | Chr08 (1648653:1650805) | 256 | 768 | 3 | 2 | 28.07 | 4.61 | 105.22 | 0.211 | Plasma membrane
Cre08.g365900_4532.2 | CrLHC32 | Chr08 (1648651:1651311) | 254 | 762 | 4 | 3 | 27.56 | 4.94 | 94.86 | 0.031 | Chloroplast
Cre08.g365900_4532.4 | CrLHC33 | Chr08 (1648651:1650811) | 229 | 687 | 4 | 3 | 24.84 | 4.97 | 91.97 | −0.003 | Chloroplast
Cre08.g367400_4532.1 | CrLHC34 | Chr08 (1874560:1876960) | 260 | 780 | 4 | 3 | 28.22 | 4.88 | 92.70 | −0.031 | Chloroplast
Cre08.g367500_4532.1 | CrLHC35 | Chr08 (1886059:1888346) | 260 | 780 | 4 | 3 | 28.22 | 4.88 | 92.70 | −0.031 | Chloroplast
Cre08.g384650_4532.1 | CrLHC36 | Chr08 (4337682:4340366) | 231 | 693 | 5 | 4 | 23.36 | 10.00 | 80.13 | 0.248 | Chloroplast
Cre09.g393173_4532.1 | CrLHC37 | Chr02 (7567708:7569606) | 176 | 528 | 2 | 1 | 18.75 | 9.30 | 99.77 | 0.264 | Chloroplast
Cre09.g393173_4532.2 | CrLHC38 | Chr02 (7567988:7568841) | 108 | 324 | 1 | 0 | 11.44 | 4.97 | 119.44 | 0.620 | Chloroplast
Cre09.g394325_4532.1 | CrLHC39 | Chr09 (3147292:3148442) | 198 | 594 | 4 | 3 | 20.71 | 8.85 | 82.84 | 0.187 | Chloroplast
Cre10.g425900_4532.1 | CrLHC40 | Chr10 (1182114:1184624) | 258 | 774 | 7 | 6 | 28.23 | 8.91 | 86.96 | −0.106 | Chloroplast
Cre10.g425900_4532.3 | CrLHC41 | Chr10 (1182114:1184622) | 258 | 774 | 7 | 6 | 28.23 | 8.91 | 86.96 | −0.106 | Chloroplast
Cre10.g425900_4532.4 | CrLHC42 | Chr10 (1182114:1184621) | 258 | 774 | 7 | 6 | 28.23 | 8.91 | 86.96 | −0.106 | Chloroplast
Cre10.g452050_4532.1 | CrLHC43 | Chr10 (4672593:4676454) | 347 | 1041 | 9 | 8 | 37.94 | 9.90 | 76.47 | −0.225 | Chloroplast
Cre10.g452050_4532.2 | CrLHC44 | Chr10 (4672783:4676326) | 347 | 1041 | 10 | 9 | 37.94 | 9.90 | 76.47 | −0.225 | Chloroplast
Cre10.g454734_4532.1 | CrLHC45 | Chr10 (5030328:5036507) | 830 | 2490 | 7 | 6 | 85.44 | 9.73 | 77.76 | −0.016 | Chloroplast
Cre10.g454734_4532.2 | CrLHC46 | Chr10 (5030392:5032644) | 555 | 1665 | 1 | 0 | 55.82 | 9.24 | 69.33 | −0.146 | Chloroplast
Cre10.g454734_4532.3 | CrLHC47 | Chr10 (5030390:5032509) | 469 | 1407 | 1 | 0 | 47.27 | 8.88 | 71.00 | −0.056 | Chloroplast
Cre10.g454734_4532.4 | CrLHC48 | Chr10 (5030388:5032352) | 453 | 1359 | 1 | 0 | 45.84 | 8.53 | 71.57 | −0.060 | Chloroplast
Cre11.g467573_4532.1 | CrLHC49 | Chr11 (331144:334255) | 268 | 804 | 9 | 8 | 28.89 | 8.40 | 86.37 | 0.018 | Chloroplast
Cre12.g508750_4532.1 | CrLHC50 | Chr12 (2228735:2231394) | 247 | 741 | 6 | 5 | 26.92 | 8.76 | 85.69 | −0.001 | Chloroplast
Cre12.g508750_4532.2 | CrLHC51 | Chr12 (2228731:2231396) | 246 | 738 | 6 | 5 | 26.79 | 8.44 | 86.04 | 0.015 | Chloroplast
Cre12.g548400_4532.1 | CrLHC52 | Chr12 (8612923:8616364) | 250 | 750 | 4 | 3 | 26.65 | 5.41 | 83.61 | 0.022 | Chloroplast
Cre12.g548950_4532.1 | CrLHC53 | Chr12 (8557859:8559546) | 250 | 750 | 5 | 4 | 26.69 | 5.41 | 83.21 | 0.013 | Chloroplast
Cre13.g576760_4532.1 | CrLHC54 | Chr13 (2052203:2055517) | 163 | 489 | 5 | 4 | 16.8 | 9.77 | 112.59 | 0.463 | Chloroplast
Cre14.g626750_4532.1 | CrLHC55 | Chr14 (2828954:2832441) | 193 | 579 | 7 | 6 | 20.42 | 7.79 | 101.15 | 0.278 | Chloroplast
Cre16.g673650_4532.1 | CrLHC56 | Chr16 (6634283:6637222) | 290 | 870 | 4 | 3 | 30.71 | 5.38 | 83.77 | −0.154 | Chloroplast
Cre16.g687900_4532.1 | CrLHC57 | Chr16 (4757396:4760486) | 242 | 726 | 8 | 7 | 26.22 | 7.81 | 75.85 | −0.144 | Chloroplast
Cre16.g687900_4532.2 | CrLHC58 | Chr16 (4757535:4760382) | 242 | 726 | 9 | 8 | 26.22 | 7.81 | 75.85 | −0.144 | Chloroplast
Cre16.g687900_4532.3 | CrLHC59 | Chr16 (4757539:4760372) | 242 | 726 | 9 | 8 | 26.22 | 7.81 | 75.85 | −0.144 | Chloroplast
Cre17.g720250_4532.1 | CrLHC60 | Chr17 (3061385:3062772) | 281 | 843 | 3 | 2 | 29.94 | 6.22 | 77.75 | −0.231 | Chloroplast
Cre17.g740950_4532.1 | CrLHC61 | Chr17 (5753639:5756744) | 286 | 858 | 7 | 6 | 30.16 | 9.01 | 75.47 | 0.002 | Chloroplast
Note: Theoretical pI is the pH at which the protein carries no net charge, which helps in understanding its behaviour under different pH conditions. The aliphatic index is an indicator of protein thermostability, where higher values are associated with higher stability. GRAVY is the grand average of hydropathicity of the protein, where positive and negative values indicate hydrophobicity and hydrophilicity, respectively.
FIGURE 1. Gene mapping and evolutionary tree. (A) Graph indicating the number of CrLHC genes localized on each chromosome. (B) A Neighbour-Joining, un-rooted evolutionary tree of LHC proteins from Chlamydomonas reinhardtii and Arabidopsis thaliana. Small red circles show the C. reinhardtii proteins, and blue circles show the A. thaliana proteins. Bootstrap values (%) are displayed on the branches.

Key features of all 61 identified CrLHC proteins are displayed in Table 1. In short, protein length ranged from 108 (CrLHC38) to 830 aa (CrLHC45), and CDS length ranged from 324 (CrLHC38) to 2490 bp (CrLHC45). Exon numbers ranged from one (CrLHC38/46/47/48) to 10 (CrLHC3 and CrLHC44) (Table 1). Specifically, six genes (CrLHC3/43/44/49/58/59) contained the most introns (8 or 9), and four genes lacked introns (CrLHC38/46/47/48) (Table 1). Predicted molecular weights ranged from 11.44 (CrLHC38) to 85.44 kDa (CrLHC45), the pI from 4.61 (CrLHC31) to 10.00 (CrLHC36), the aliphatic index from 69.33 (CrLHC46) to 119.44 (CrLHC38), and the GRAVY from −0.231 (CrLHC60) to 0.62 (CrLHC38). The variation in MW and pI is principally owing to differences in the proportion of basic amino acids and to post-translational modifications. Subcellular localization prediction indicated that 58 proteins are located in the chloroplast, two proteins (CrLHC8 and CrLHC31) on the plasma membrane, and one protein (CrLHC2) in the nucleus (Table 1).

Additionally, 21 LHC genes from the A. thaliana genome were retrieved to examine the phylogenetic relationship between the green alga and Arabidopsis (Table S2).

3.2 Insights from phylogenetic relationships of CrLHC proteins

To uncover the evolutionary relationships between the CrLHC (61 members) and AtLHC (21 members) proteins, a phylogenetic tree was constructed, which resolved into four main groups: Lhcb, Lhca, ELIP, and PsbS (Figure 1B). As shown in Figure 1B, the Lhcb group comprised 17 CrLHC members and 12 Arabidopsis members, the Lhca group comprised 28 CrLHC members and 6 Arabidopsis members, the ELIP group comprised 13 CrLHC members and 2 Arabidopsis members, and the PsbS group comprised 3 CrLHC members and 1 Arabidopsis member. The Lhca and Lhcb groups gathered the greatest number of proteins from both organisms, and LHC members clustered in the same sub-group may hold analogous functions.

FIGURE 2. Gene structure (intron-exon arrangements), domain organization, motif evaluation, and GO and KEGG pathway enrichment analysis of CrLHCs. (A) Coloured boxes represent predicted motifs in CrLHCs, and each colour represents a distinct motif. (B) Predicted domains in CrLHCs. (C) Gene structure of CrLHCs. Light green specifies UTR regions, blue indicates CDS regions, and grey indicates introns. (D) Significantly enriched GO terms, classified into molecular function (MF), cellular component (CC), and biological process (BP) for CrLHC genes. (E) Top enriched KEGG pathways found in CrLHC genes.

3.3 Insights into gene structure, domain, and conserved motifs

Conserved motif analysis revealed that CrLHC genes possessed varying numbers of motifs, ranging from 2 (CrLHC8/36/26/27/28/61) to 8 (CrLHC11/19/21/22/23/24/52/53) (Figure 2A). Like gene structure, the distribution of motifs was consistent within sub-trees (Figure 2A). However, certain motifs were restricted to specific genes. For example, some genes, such as CrLHC8/26/27/28/33/38/54/61, mainly contained motif 2, whereas motifs 6 and 9 were exclusive to CrLHC31-CrLHC35, CrLHC4-CrLHC7, and CrLHC10. Likewise, motif 4 was exclusive to some genes such as CrLHC1/2/3/9/11/19/25/52. Motifs 1, 2, 3, 5, and 7 were the most widely distributed; they were absent in only a few genes (Figure 2A). Domain analysis showed that the Chloroa_b-bind domain is present in almost all of the genes, whereas the Chloroa_b-bind superfamily domain is only present in CrLHC8/27/28/36/38/54/55/61 (Figure 2B).

The gene structure analysis showed that the number of exons ranged from 1 to 10, while the number of introns ranged from 0 to 9 (Table 1; Figure 2C). Notably, genes within the same sub-tree showed similar structural patterns, with a few exceptions (Figure 2C). The CrLHC45 gene had the longest structure, and most of the other genes showed complex structures (Figure 2C). Instances of exon loss or gain were noted over the evolution of the CrLHC family genes. Even so, these results suggest that CrLHC genes maintained a fairly consistent exon-intron organization during the evolutionary history of the C. reinhardtii genome. Moreover, CrLHC genes within the same sub-tree display highly similar gene structures, in close agreement with their phylogenetic classification.

3.4 Insights into GO and KEGG enrichment survey

To explore the functional roles of CrLHC genes, GO and KEGG enrichment analyses were performed (Figure 2D-E). GO annotation revealed diverse enriched terms from three classes (Figure 2D). In the molecular function class, the most enriched terms were pigment binding (GO:0031409) and chlorophyll binding (GO:0016168). In the cellular component class, the most enriched terms were obsolete chloroplast part (GO:0044434), obsolete plastid part (GO:0044435), obsolete thylakoid part (GO:0044436), thylakoid membrane (GO:0042651), photosynthetic membrane (GO:0034357), plastid envelope (GO:0009526), plastid membrane (GO:0042170), chloroplast/plastid thylakoid (GO:0009534/GO:0031976), organelle subcompartment (GO:0031984), and others. In the biological process class, the most enriched terms were response to red light (GO:0010114), response to red or far-red light (GO:0009639), response to far-red light (GO:0010218), response to blue light (GO:0009637), response to high light intensity (GO:0009644), response to temperature stimulus (GO:0009266), and cellular response to UV-A/UV (GO:0071492/GO:0034644) (Figure 2D).

KEGG pathway enrichment analysis identified a total of 7 pathways. The most enriched pathways were metabolism (A09100), photosynthesis-antenna proteins (00196), energy metabolism (B09102), photosynthesis proteins (00194), and protein families: metabolism (B09181) (Figure 2E). These enrichment analyses support a functional role for CrLHC genes in vital activities such as light responses, Chl binding, and photosynthesis.

3.5 Insights into cis-regulatory elements

To investigate the role of CrLHC genes in multiple stress tolerance, we identified cis-regulatory elements in their promoters, focusing on two key types: abiotic stress- and phytohormone-responsive elements (Figure 3; Table S3). For abiotic stress, we identified elements belonging to seven categories (light, drought, anoxic, low temperature, anaerobic, defense and stress, and circadian control). These elements include Sp1, GTGGC-motif, GATA-motif, G-Box, ACE, Box II, I-box, etc. (light, 72.52%), GC-motif (anoxic, 13.96%), MBS (drought, 5.59%), LTR (low temperature, 3.69%), ARE (anaerobic, 3.60%), TC-rich repeats (defense and stress, 0.36%), and circadian (circadian control, 0.36%) (Figure 3; Table S3).

Cis-regulatory element exploration in CrLHC promoter regions. (A) Diverse cis-elements involved in abiotic stress and phytohormones responses are shown with diverse colors. (B, D) The total number of CrLHC genes harbouring elements associated with abiotic stress, and phytohormone response groups. (C, E) Pie charts showing the % ratio of various elements within (C) abiotic stress-responsive elements, and (E) phytohormone-responsive elements. Distinct colours denote specific elements and their proportionality found in CrLHC genes.

Similarly, elements related to five phytohormones [methyl jasmonate (MeJA), abscisic acid, auxin, gibberellin, and salicylic acid] were identified, consisting of CGTCA-motif/TGACG-motif (MeJA, 53.33%), ABRE (abscisic acid, 36.81%), AuxRR-core/TGA-element/TGA-box (auxin, 5.65%), P-box/TATC-box/GARE-motif (gibberellin, 2.71%), and TCA-element (salicylic acid, 1.52%) (Figure 3; Table S3).

Overall, the findings showed that many abiotic stress- and phytohormone-related elements were predicted to be gene-specific and unevenly distributed. Consequently, genes harbouring these specific elements could be prime candidates for further functional experiments to uncover their protective roles under stress conditions and phytohormone treatments.
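As a quick consistency check on the shares reported in this section, the percentages can be tallied per element class. The following is a minimal Python sketch: the two dictionaries simply transcribe the values quoted in the text, and the helper name `summarize` is illustrative and not part of the original analysis.

```python
# Percentage shares of cis-regulatory elements, transcribed from the text.
abiotic = {
    "light": 72.52,
    "anoxic": 13.96,
    "drought": 5.59,
    "low temperature": 3.69,
    "anaerobic": 3.60,
    "defense and stress": 0.36,
    "circadian control": 0.36,
}
phytohormone = {
    "MeJA": 53.33,
    "abscisic acid": 36.81,
    "auxin": 5.65,
    "gibberellin": 2.71,
    "salicylic acid": 1.52,
}


def summarize(shares):
    """Return the dominant category and the rounded sum of all shares."""
    top = max(shares, key=shares.get)
    total = round(sum(shares.values()), 2)
    return top, total


for label, shares in (("abiotic stress", abiotic), ("phytohormone", phytohormone)):
    top, total = summarize(shares)
    print(f"{label}: dominant = {top}, shares sum to {total}%")
```

Totals close to 100% indicate that the listed categories account for essentially all elements within each class.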

3.6 Discovery of miRNAs targeting CrLHC genes

This study identified 19 miRNAs targeting 42 genes belonging to 16 unique families (Figure 4A; Table S4). Overall, 15 miRNAs targeted 35 CrLHC genes via cleavage, and 7 miRNAs targeted 12 CrLHC genes via translational inhibition. Three miRNAs (cre-miR1147.2, cre-miR1157-3p, and cre-miR1150.3) showed both modes of inhibition, but on different genes (Figure 4B; Table S4). These findings indicate that various miRNAs contribute to the post-transcriptional regulation of CrLHC genes through cleavage and translational inhibition.

miRNA interactions with CrLHC genes. (A) Network demonstrating predicted miRNA interactions with CrLHC genes. (B) Sankey diagram illustrating the relationships between miRNAs, target genes, and their inhibitory effects. (C) Graphics indicate that CrLHC61, found on Chr17, is targeted by two miRNAs (cre-miR1149.2 and cre-miR1150.3). (D) Graphics indicate that CrLHC3, found on Chr02, is targeted by two miRNAs (cre-miR913-5p and cre-miR1166.1). The complementary RNA sequences (5′ to 3′) and the anticipated miRNA sequences (3′ to 5′) are illustrated in pink and blue boxes, correspondingly.

For a detailed view, the miRNA-targeted positions of CrLHC61 and CrLHC3 are illustrated in Figure 4C-D. The results showed that cre-miR1147.1 (8 genes) and cre-miR1144a.1 (6 genes) targeted the largest number of genes. Two miRNAs, cre-miR1147.2 and cre-miR1145.1, each targeted 5 genes, followed by cre-miR9897-5p, which targeted 4 genes. Four miRNAs (cre-miR907, cre-miR1157-3p, cre-miR1152, and cre-miR1153-5p.2) each targeted 3 genes. Likewise, three miRNAs (cre-miR1150.3, cre-miR1158, and cre-miR911) each targeted 2 genes. The remaining 7 miRNAs each targeted a single gene (Figure 4A; Table S4). A few genes, e.g., CrLHC3, CrLHC39, and CrLHC61 (2 miRNAs), and CrLHC15, CrLHC16, CrLHC17, and CrLHC18 (3 miRNAs), were predicted to be targeted by more than one miRNA. The remaining genes, CrLHC9/11/13/14/29/30/31/32/33/34/35/40/41/42/50/51/52/53/60, were not targeted by any miRNA. Therefore, it is vital to verify the expression patterns of the predicted miRNA-CrLHC modules to identify their biological functions in C. reinhardtii.
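The tallies in this section can be cross-checked from two directions: by miRNA (how many genes each targets) and by gene (how many miRNAs target it). The sketch below is a minimal Python illustration using only numbers quoted in the text. It assumes that genes are numbered CrLHC1 to CrLHC61 without gaps and that the genes named as multi-targeted form the complete list; both are assumptions for illustration, and Table S4 remains the authoritative source.

```python
# (number of miRNAs, genes targeted by each of them), transcribed from the text.
mirna_groups = [(1, 8), (1, 6), (2, 5), (1, 4), (4, 3), (3, 2), (7, 1)]
n_mirnas = sum(n for n, _ in mirna_groups)
pairs_by_mirna = sum(n * k for n, k in mirna_groups)

# Genes reported as not targeted by any miRNA.
untargeted = {
    int(x)
    for x in "9/11/13/14/29/30/31/32/33/34/35/40/41/42/50/51/52/53/60".split("/")
}

# Assumption: genes are numbered CrLHC1..CrLHC61 without gaps.
targeted = set(range(1, 62)) - untargeted

# Genes reported with more than one miRNA (assumed here to be the full list).
multi_targeted = {3: 2, 39: 2, 61: 2, 15: 3, 16: 3, 17: 3, 18: 3}
pairs_by_gene = sum(multi_targeted.get(g, 1) for g in targeted)

print("miRNAs:", n_mirnas)
print("targeted genes:", len(targeted))
print("miRNA-gene pairs (by miRNA / by gene):", pairs_by_mirna, pairs_by_gene)
```

Agreement between the two pair counts would suggest that the per-miRNA and per-gene descriptions refer to the same set of predicted interactions.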

3.7 TFs regulatory network of CrLHC genes

This study identified 400 TFs associated with the 61 CrLHC genes, providing insights into the regulatory function of diverse TFs in controlling the transcription of CrLHC genes (Figure 5; Table S5). These TFs were classified into 13 distinct families: ERF, GATA, CPP, bZIP, C3H, MYB, SBP, Dof, Nin-like, bHLH, MYB_related, C2H2, and G2-like (Figure 5; Table S5). The most abundant TFs were ERF (99 members), GATA (54 members), SBP (51 members), CPP (41 members), MYB (39 members), C3H (19 members), MYB_related (14 members), G2-like and bHLH (11 members), and Dof and Nin-like (10 members) (Figure 5; Table S5). C2H2 (7 members) was the least abundant family. Almost all CrLHC genes were predicted to be associated with multiple TF families. For instance, CrLHC46 was associated with 26 TFs, followed by CrLHC48 with 18 TFs, CrLHC5/6/7/47 with 16 TFs, CrLHC4 with 14 TFs, CrLHC61 with 11 TFs, CrLHC3/8/13/45 with 10 TFs, CrLHC21/22/38 with 9 TFs, CrLHC60 with 8 TFs, CrLHC26/27/36/37/43/44 with 7 TFs, and CrLHC10/12/28/34/35 with 6 TFs. In comparison, the remaining genes were associated with five or fewer TFs. For example, CrLHC57/58/59 were associated with only 1 TF, followed by CrLHC1/2/19/20/25/32/33/50/51/52 with 2 TFs, CrLHC11/18/23/31 with 3 TFs, CrLHC14/24/41/53 with 4 TFs, and CrLHC9/15/16/17/29/30/39/40/42/49/54/55/56 with 5 TFs (Figure 5; Table S5). These results indicate that genes encoding TFs, together with cis-elements associated with abiotic stress and phytohormone responses, might be genetically manipulated to boost stress tolerance.
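The per-gene TF counts above are given in compact slash notation, which is easier to work with once expanded into one entry per gene. The following minimal Python sketch does this using only the values quoted in the text; the variable names are illustrative, and Table S5 remains the authoritative source for the underlying predictions.

```python
# TF count -> genes with that count, transcribed from the text (slash notation).
tf_count_to_genes = {
    26: "46",
    18: "48",
    16: "5/6/7/47",
    14: "4",
    11: "61",
    10: "3/8/13/45",
    9: "21/22/38",
    8: "60",
    7: "26/27/36/37/43/44",
    6: "10/12/28/34/35",
    5: "9/15/16/17/29/30/39/40/42/49/54/55/56",
    4: "14/24/41/53",
    3: "11/18/23/31",
    2: "1/2/19/20/25/32/33/50/51/52",
    1: "57/58/59",
}

# Expand to one entry per gene, e.g. {"CrLHC46": 26, ...}.
per_gene = {
    f"CrLHC{gene}": count
    for count, genes in tf_count_to_genes.items()
    for gene in genes.split("/")
}

# Rank genes by the number of associated TFs (highest first).
ranked = sorted(per_gene.items(), key=lambda item: item[1], reverse=True)
low_tf_genes = [gene for gene, count in per_gene.items() if count <= 5]

print("genes covered:", len(per_gene))
print("top three:", ranked[:3])
print("genes with five or fewer TFs:", len(low_tf_genes))
```

Expanding the notation this way makes it straightforward to confirm that every gene appears exactly once and to pick out the most heavily regulated candidates.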

Potential transcription factors (TFs) allied with CrLHC genes. (A) Circular network graph demonstrating TFs interactions with CrLHC genes. The CrLHC genes are presented around the edges, with shape sizes and colours reflecting interaction levels based on q-values. Background line colours specify interactions corresponding to p-values. The large inner circles represent TFs families. (B) Bar graph shows the CrLHC gene numbers accompanying different families of predicted TFs.

3.8 Insights into PPI and 3D structures of CrLHC proteins

All 61 CrLHC members showed interactions with known proteins and were classified into diverse groups, probably reflecting diverse functions. Thicker lines between proteins correspond to stronger interactions, while thinner lines denote weaker ones (Figure S2; Table S6). For instance, the CrLHC2/3/12/17/18/20/25/30/35 proteins interact with PSBP1, PSBP2, PSAG, PSBY2, ycf12, psbM, psaN, and others. The detailed results are shown in Table S6. In short, these findings indicate that CrLHC proteins interact strongly with diverse LHC members and may carry out complex biological functions.

The secondary structure analysis showed that the alpha helix content ranged from 19% (CrLHC46) to 62% (CrLHC38), beta-strand content ranged from 0% (multiple proteins) to 18% (CrLHC7 and CrLHC5), transmembrane (TM) helix content ranged from 10% (CrLHC46) to 31% (CrLHC38 and CrLHC54), and disordered regions ranged from 0% (CrLHC2 and CrLHC14) to 54% (CrLHC46) (Table S7). These values suggest considerable variability in the structural elements of CrLHC proteins, implying diverse functional roles. The wide range of secondary structure elements highlights the structural complexity and possible adaptability of these proteins.

Additionally, all CrLHC proteins were subjected to 3D modelling (Figure S3), and the detailed results of the predicted templates are presented in Table S7. Overall, 29 unique templates were identified; of these, 20 templates (c6ijoS, c6kacW, c6kadW, c6zzxQ, c7bgiN, c7d0jQ, c7dz8P, c7dz8X, c7dz8Z, c7ouiA, c7ouiX, c7pi0U, c7wfdN, c7wyiH, c7xqpZ, c7zqcO, c7zqcQ, c7zqcR, c7zqcS, and c7zqcT) showed a 100% confidence level. The high confidence levels of these templates suggest reliable structural predictions. The alignment coverage ranged from 28% (CrLHC45) to 99% (CrLHC2 and CrLHC14) (Figure S3; Table S7). Briefly, 8 proteins (CrLHC2/9/19/20/23/24/25/33) were modelled on the "c7d0jQ" template, 7 proteins (CrLHC8/26/27/28/37/38/39) were modelled on the "c4ri2A" template, and the other proteins were unevenly distributed among the remaining templates (see Table S7 for detailed results). These results imply that CrLHC proteins have diverse and flexible structures, as indicated by the use of 29 unique templates with varying alignment coverage. The flexible structures, emphasized by the presence of coils, may aid the functional adaptability of CrLHC proteins.

3.9 Transcriptome-based expression profiling under multiple stresses

To explore the role of CrLHC genes in multiple stress tolerance, we examined transcriptome-based expression under 6 types of stress: UV-C, green light, heat, nitric oxide, cadmium, and nitrogen starvation. As shown in Figure 6, only a few key genes (which differed among stress types) showed high expression under a given stress. For example, in response to UV-C stress, genes belonging to group II (e.g., CrLHC1/9/11/14/18/24/31/32/37/43/44/50/58/60 and others) showed extremely high expression under CT and stress conditions (5 and 24 h) (Figure 6A). Genes belonging to group III (e.g., CrLHC15/16/17/29/30) showed relatively high expression at CT, and certain genes were also expressed at 5 h (CrLHC15-17) and 24 h (CrLHC29 and CrLHC30). Likewise, 4 genes from group I (mainly CrLHC45-48) were moderately expressed at 5 h only (Figure 6A).

Transcriptome-based expression analysis. (A) Expression profiling in response to UV-C stress. (B) Expression profiling in response to green light stress. (C) Expression profiling in response to heat stress (35°C). (D) Expression profiling of differentially expressed genes in response to nitric oxide stress; samples were harvested under control conditions without NO treatment (CT) and after treatment with S-nitroso-N-acetylpenicillamine (SNAP) or 2-(4-carboxyphenyl)-4,4,5,5-tetramethylimidazoline-1-oxyl-3-oxide (cPTIO). Combined treatment refers to cPTIO+SNAP. (E) Expression profiling in response to cadmium stress; samples were harvested after 3 h of cadmium treatment. (F) Expression profiling in response to nitrogen starvation. To account for large changes in expression levels, the transcripts per million (TPM) values were log10-transformed for the first five heatmaps (A-E), and log10 of the TPM values was also taken in panel F, to generate all final heatmaps. The colour gradient in the expression bar, from red through white to blue, indicates high, medium, and low expression levels, respectively.
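The log10 transformation mentioned in the caption compresses the wide dynamic range of TPM values so that highly and weakly expressed genes can share one colour scale. A minimal Python sketch is given below. The pseudocount of 1 is an assumption added here so that zero TPM values remain defined; the caption does not state whether one was used, and the gene names and values are hypothetical.

```python
import math


def log10_tpm(tpm_by_gene, pseudocount=1.0):
    """Log10-transform TPM values gene by gene.

    The pseudocount is an illustrative assumption that keeps zero TPM finite.
    """
    return {
        gene: [math.log10(value + pseudocount) for value in values]
        for gene, values in tpm_by_gene.items()
    }


# Hypothetical TPM values for two genes across three samples.
example_tpm = {
    "geneA": [0.0, 9.0, 99.0],
    "geneB": [999.0, 99.0, 9.0],
}

transformed = log10_tpm(example_tpm)
for gene, values in transformed.items():
    print(gene, [round(v, 2) for v in values])
```

After the transformation, a thousand-fold difference in TPM spans only about three units, which keeps moderately expressed genes visible in the heatmap.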

In response to green light stress, genes belonging to groups I and II were more highly expressed than those of group III. A few genes from group III (e.g., CrLHC31/32/33/54/21/22/34/35/39/61) were also expressed at distinct intervals (e.g., 0.5, 1, 2, 4, 8 and 12 h), suggesting a role in both short- and long-term stress responses (Figure 6B). Under heat stress (35°C), diverse genes belonging to groups II and III showed high expression throughout the stress conditions. In particular, three genes (CrLHC1/2/53) were more highly expressed than the rest of the genes from group II, while 27 genes from group I did not respond strongly to heat stress, suggesting that these genes likely do not influence heat tolerance (Figure 6C). Similar results were obtained at 40°C, suggesting that increasing the temperature does not necessarily alter gene expression (Figure S4).

In response to nitric oxide treatment, genes belonging to group I (e.g., CrLHC9/13/14/19/29/30/40/42/44/50/51/52/54/56/60, etc.) were significantly expressed under CT, cPTIO, SNAP and SNAP+cPTIO (combined) treatments. Notably, group II genes did not respond to any of the treatments. Some genes (such as CrLHC55) showed relatively high expression under CT, cPTIO, and SNAP+cPTIO (combined) treatments (Figure 6D). Under cadmium stress, only one gene from group II (CrLHC35) was highly expressed, and 6 genes belonging to group III (CrLHC10/21/22/34/36/39) were also expressed at 3 h. Notably, all genes from group I showed reduced expression under cadmium stress, suggesting a negative regulatory role in metal stress tolerance (Figure 6E). In response to nitrogen starvation, most of the genes from group I (e.g., CrLHC1/2/9/12/19/24/25/30/40/50/52/53/59 and others) were highly expressed throughout the stress conditions except at 24 and 48 h. Likewise, group III genes were slightly expressed throughout the stress conditions. Notably, group II and group IV genes did not respond to nitrogen starvation (Figure 6F).

Certain genes were specific to particular stress conditions, whereas others showed high expression under all stress conditions. Furthermore, a few genes showed moderate expression levels across diverse stresses, indicating a potential role in broad-spectrum stress tolerance. Overall, the expression data suggest that some specific genes may play significant roles in multiple stress responses in C. reinhardtii. Thus, future studies should target the functional manipulation of key stress-associated genes to further understand their contributions to stress tolerance.

3.10 qRT-PCR-driven expression profiling against salinity and cold stress

Ten CrLHC genes were chosen for qRT-PCR assessment based on the presence of stress-associated cis-elements in their promoter regions, in order to validate the in-silico predictions ahead of further functional characterization. Under both cold and salinity conditions, most genes were up-regulated throughout the stress duration (Figure 7). Under cold stress, eight genes (CrLHC1/5/8/14/20/26/49/56) showed consistently elevated expression during the stress duration compared to CT, suggesting a key role in cold stress tolerance. In contrast, two genes (CrLHC32 and CrLHC40) showed high expression at 6 and 12 h, but their expression was lower than CT at 24 and 48 h (Figure 7A). In response to salinity stress, all genes showed elevated expression during the stress duration compared to CT. Higher expression was particularly prominent at 12, 24 and 48 h, indicating that these genes might be key targets for genetic manipulation to enhance tolerance to prolonged stress under real-world conditions in C. reinhardtii (Figure 7B).

qRT-PCR-based expression assessment under (A) cold and (B) salinity stress. After the stress treatment, the samples were collected at diverse time intervals: 0 (CT), 6, 12, 24, and 48 hours for expression analysis. The data correspond to the mean (± SD) of three biological replicates (n = 3). Asterisks show significant levels at ****p < 0.0001, ***p ≤ 0.001, **p ≤ 0.01, *p ≤ 0.05, and ns means non-significant.
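For readers unfamiliar with how relative expression values such as those in Figure 7 are typically derived, the sketch below illustrates the widely used 2^-ΔΔCt calculation with a mean and SD over three biological replicates. This section does not state the quantification method, so its use here is an assumption; the Ct values, the reference gene, and the function name are all hypothetical.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change by the 2^-ddCt method, returned as (mean, SD) over replicates.

    Each argument is a list of Ct values, one per biological replicate.
    """
    # Calibrator: mean delta-Ct of the control (CT, 0 h) samples.
    d_ct_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)
    folds = []
    for target, ref in zip(ct_target, ct_ref):
        dd_ct = (target - ref) - d_ct_ctrl
        folds.append(2 ** (-dd_ct))
    return mean(folds), stdev(folds)


# Hypothetical Ct values (three biological replicates per condition).
control_target = [25.0, 25.0, 25.0]
control_ref = [20.0, 20.0, 20.0]
stress_target = [23.0, 22.0, 24.0]
stress_ref = [20.0, 20.0, 20.0]

fold_mean, fold_sd = relative_expression(
    stress_target, stress_ref, control_target, control_ref
)
print(f"fold change = {fold_mean:.2f} ± {fold_sd:.2f}")
```

A fold change above 1 corresponds to up-regulation relative to the control, and a value below 1 to down-regulation.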

3.11 Salinity stress alters the antioxidant enzymes, ROS markers and photosynthetic pigments

Maintaining redox balance and optimum photosynthetic pigment levels is vital for coping with stressful conditions. Under salinity, we observed a marked decline in antioxidant enzyme activities (CAT, SOD, and POD) compared to CT (Figure S5). H2O2 content was also reduced under salinity stress, but there was no significant variation in MDA content during the stress conditions (Figure S5). Salinity stress also reduced the Chl a/b and carotenoid contents. Notably, more significant reductions were observed in Chl a/b contents, both compared to CT (within each parameter) and compared to carotenoid contents (between parameters) (Figure S5). These results suggest that salinity stress disrupts both antioxidant defense mechanisms and photosynthetic pigment content, which are necessary for stress tolerance in C. reinhardtii.

Graphic demonstration of approaches for engineering stress-smart future C. reinhardtii. This overview demonstrates the genome-wide characterization of light-harvesting chlorophyll-binding proteins (CrLHCs) in the C. reinhardtii genome using various in-silico tools and expression data, leading to the discovery of candidate genes for genetic engineering. These genes can be manipulated using several state-of-the-art methods and such alterations can improve critical molecular and cellular processes, which contribute to improved abiotic stress tolerance and adaptation.

3.12 Salinity stress alters the photosynthesis-related parameters

The effect of varying PAR levels on several photosynthesis-related parameters was examined at different time points (0, 6, 12, 24, and 48 h) under salinity stress (Figure S6). Y(II) declined gradually with increasing PAR levels at all time points. The highest Y(II) was observed at 0 μmol photons m−2 s−1, with a significant drop as PAR intensity increased, indicating decreased PSII efficiency under higher light conditions intensified by salinity stress (Figure S6). ETR showed a bell-shaped response to increasing PAR, peaking at around 100 to 197 μmol photons m−2 s−1 and then decreasing at higher PAR levels. This trend was consistent across all time points, with the maximum ETR observed at 0 h, indicating that salinity stress affects ETR more strongly at moderate to high light intensities (Figure S6). NPQ increased with higher PAR, particularly after 6 h of stress, suggesting enhanced energy dissipation under higher light intensities as a protective response to salinity stress (Figure S6). Y(NPQ) increased with PAR at 0, 24, and 48 h, indicating a rise in protective energy dissipation over time under salinity stress (Figure S6). Y(NO) remained comparatively stable across PAR levels and time points, indicating that Y(NO) was not largely affected by light intensity or stress duration (Figure S6). qN increased with increasing PAR, like NPQ, and varied across time points, with clear increases at 24 and 48 h, highlighting the adaptive response of C. reinhardtii to salinity stress over time (Figure S6). qP declined with increasing PAR levels, particularly at 0 and 28 h, indicating reduced qP at higher light intensities under salinity stress. It remained constant at 12 h and increased slightly at 6 h with increasing PAR levels (Figure S6). qL declined sharply with increasing PAR levels, indicating that higher light intensities lead to lower qL under salinity stress (Figure S6).

These results show that increased PAR levels lead to lower photochemical efficiency and enhanced energy dissipation in C. reinhardtii under salinity stress. The time-dependent changes suggest that prolonged exposure to high light intensities alters these effects, emphasizing the dynamic response of photosynthetic parameters to combined light and salinity stress in C. reinhardtii.
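The parameters discussed in this section are conventionally derived from a small set of chlorophyll fluorescence readings. The sketch below uses the standard PAM fluorometry definitions found in the general literature; it is an assumption that the same definitions apply to the measurements reported here. The fluorescence values are hypothetical, and the ETR factors 0.84 (absorptance) and 0.5 (PSII share of absorbed light) are commonly used defaults, not values taken from this study.

```python
def pam_parameters(F, Fm_prime, Fo_prime, Fo, Fm, par,
                   absorptance=0.84, psii_fraction=0.5):
    """Standard PAM quenching parameters from fluorescence readings.

    F: steady-state fluorescence in the light; Fm_prime, Fo_prime: maximum and
    minimum fluorescence in the light-adapted state; Fo, Fm: dark-adapted
    minimum and maximum; par: light intensity in umol photons m-2 s-1.
    """
    y_ii = (Fm_prime - F) / Fm_prime           # effective PSII quantum yield
    y_no = F / Fm                              # non-regulated energy loss
    y_npq = F / Fm_prime - F / Fm              # regulated energy dissipation
    npq = (Fm - Fm_prime) / Fm_prime           # Stern-Volmer NPQ
    qp = (Fm_prime - F) / (Fm_prime - Fo_prime)        # puddle model
    ql = qp * Fo_prime / F                             # lake model
    qn = 1 - (Fm_prime - Fo_prime) / (Fm - Fo)         # non-photochemical coefficient
    etr = y_ii * par * absorptance * psii_fraction     # relative electron transport rate
    return {"Y(II)": y_ii, "Y(NO)": y_no, "Y(NPQ)": y_npq, "NPQ": npq,
            "qP": qp, "qL": ql, "qN": qn, "ETR": etr}


# Hypothetical readings in arbitrary fluorescence units.
params = pam_parameters(F=400.0, Fm_prime=600.0, Fo_prime=250.0,
                        Fo=300.0, Fm=1000.0, par=100.0)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, Y(II), Y(NPQ), and Y(NO) sum to 1, so a fall in Y(II) with rising PAR must be balanced by an increase in Y(NPQ), Y(NO), or both.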

4 DISCUSSION

4.1 Characterization of LHC gene families

Over recent decades, the role of LHC genes has been studied in many species; yet their vital roles, particularly in C. reinhardtii, still require further examination. Therefore, we performed a comprehensive genomic analysis and identified 61 CrLHC genes in the C. reinhardtii genome (Table 1), allowing us to explore their functional and protective mechanisms under stressful conditions. Previously, LHC family genes have been reported in different species, including 35 BnLhcbs genes in rapeseed (Zhang et al. 2022b), 28 PbrLhc genes in pear (Pyrus bretschneideri) (Wu et al. 2023), 86 GhLhcs genes in cotton (Gossypium hirsutum L.) (Zhang et al. 2021b), 42 (Actinidia chinensis) and 45 (A. eriantha) LHC genes in kiwifruit (Luo et al. 2022), 27 MdLhcs genes in apple (Malus × domestica Borkh.) (Zhao et al. 2020), 26 genes in green algae (Prasiola crispa) (Kosugi et al. 2024), 212 Lhcb genes among 9 Rosaceae species, including 23 in Fragaria vesca, 20 in Prunus armeniaca, 21 in P. persica, 20 in P. mume, 15 in P. salicina, 33 in M. domestica 'Gala', 33 in Rosa chinensis, 29 in Pyrus bretschneideri, and 18 in Rubus occidentalis (Li et al. 2023), 35 MeLhcs genes in cassava (Manihot esculenta Crantz) (Zou and Yang 2019), 19 PpLhcs genes in peach (Prunus persica L.) (Wang et al. 2023), 20 CsLHCBs genes in tea (Camellia sinensis) (Ye et al. 2024), 13 ZmLhcs genes in eelgrass (Zostera marina L.) (Kong et al. 2016), 127 TaLHCs genes in wheat (Triticum aestivum L.) (Chen et al. 2023b), and 26 StLhcs genes in potato (Solanum tuberosum L.) (Cao et al. 2019). Variations in the number of LHC members between species are likely linked to gene duplication events (e.g., tandem and segmental), which play a vital role in expanding and diversifying the LHC family. For example, segmental duplication was the main driving factor behind the expansion in apple, and most duplicated genes experienced purifying selection, suggesting their functional conservation (Zhao et al. 2020).

Homologous genes in diverse species are also likely to retain similar functions throughout evolution, as supported by phylogenetic relationships (Zhao et al. 2020; Zhang et al. 2021b; Luo et al. 2022; Li et al. 2023; Wang et al. 2023; Ye et al. 2024). These evolutionary patterns are key to understanding how LHC genes adapt to diverse conditions and species-specific functional constraints. In the present study, we manually removed the isoforms of many genes and kept only the unique IDs with different genomic positions, resulting in 61 CrLHC genes in the C. reinhardtii genome.

The LHC gene families cited above show that LHC members have been placed into 3–7 groups based on Arabidopsis sequences, domains, sequence similarities, and annotations, indicating that there are no standard criteria for gene grouping. In the present study, 61 CrLHC genes in C. reinhardtii and 21 from Arabidopsis were classified into four groups (Lhca, Lhcb, ELIP, and PsbS), which is in agreement with previous studies of cotton (Zhang et al. 2021b), pear (Wu et al. 2023), apple (Zhao et al. 2020), and kiwifruit (Luo et al. 2022), where LHC genes were classified into 4–7 groups based on Lhca, Lhcb, ELIP, PsbS, etc. Similarly, gene structure analysis, specifically of intron-exon patterns, also improves our knowledge of the evolution of LHC genes. Genes within the same phylogenetic sub-groups display conserved structures and motifs, which reflect their evolutionary relationships and functional similarities. For instance, differences in intron-exon organization between sub-groups suggest divergence during evolution to meet distinct functional challenges, as observed in apple (Zhao et al. 2020) and peach (Wang et al. 2023). Likewise, in this study, LHC genes belonging to the same group showed largely similar gene structures (exon-intron patterns) and conserved motifs, suggesting that these genes may perform similar functions, e.g., in stress tolerance. In a nutshell, the consistency of gene grouping within the tree was supported by the motif patterns, gene structures, and phylogenetic relationships. This shows that CrLHC proteins have well-conserved amino acid sequences, and CrLHC members within the same sub-tree are expected to perform similar functions.

The PPI and 3D structure assessment of CrLHC proteins provides a comprehensive basis for understanding their functional diversity and structural flexibility. The strong PPIs observed, for instance those between CrLHC12/17/18 and PSBP1, PSAG, and others, highlight their potential contribution to stabilizing photosynthetic complexes and enabling energy transfer. These findings align with reports on the structural organization of the PSI-LHCI supercomplex in C. reinhardtii, which shows strong interactions between PSI core subunits and LHCIs that maintain stability under stress conditions (Yadavalli et al. 2011). In addition, the secondary and tertiary structural flexibility of CrLHC proteins, reflected by variable alpha helices, beta strands, and transmembrane helices, is consistent with the structural complexity observed in PSII-LHCII supercomplexes, which contain distinctive arrangements of LHCII trimers and antenna subunits in C. reinhardtii (Drop et al. 2014; Kubota-Kawai et al. 2019). This structural variety is vital for adjusting and adapting to stressful conditions, supporting efficient light harvesting, and participating in stress tolerance mechanisms.

4.2 Role of LHC genes in stress tolerance mechanisms

Examination of the cis-regulatory elements (CREs) of CrLHC genes suggests that LHCs differ greatly in cis-element composition. Among the various abiotic stress- and phytohormone-related elements, light- and MeJA-related elements were the most abundant in CrLHC genes. These outcomes are similar to earlier findings in Rosaceae species (Li et al. 2023) and peach (Wang et al. 2023), where diverse stress-responsive (mainly light) and phytohormone-responsive elements were identified and proposed to play a key role in stress tolerance. In C. reinhardtii, the characterization of certain activating CREs further increases our knowledge of gene regulation under stress conditions. For example, CREs such as heat-shock elements, iron deficiency-responsive elements, and singlet oxygen-responsive elements have been identified and linked to stress-driven gene expression (Fischer et al. 2009; Fei et al. 2010; Strenkert et al. 2011). Functional investigations showed that CREs (e.g., Eupstr and Ehist cons) can substantially improve gene expression when combined with promoters, suggesting their potential as genetic tools for engineering stress tolerance in C. reinhardtii (Lihanova et al. 2024). In the present study, the cis-element results were further supported by GO and KEGG analysis (Figure 2D-E), which also suggests that many CrLHC genes are largely associated with light stress, energy, and photosynthesis pathways. A previous study identified CREs that drive high transgene expression, including minimal motifs that can up-regulate expression levels (McQuillan et al. 2022). Such CREs, derived from abiotic stress-responsive genes, can play a vital role in promoter engineering (by designing synthetic promoters) to enhance stress tolerance.

To gain further insight into the contribution of CrLHC genes, we examined their expression patterns under numerous stresses. Our results showed that many genes responded significantly to different stresses (Figures 6, 7). These outcomes further confirmed that genes containing stress-related elements (mainly light-responsive) also showed higher expression under UV-C and green light conditions. Earlier studies also emphasized the role of LHCs in response to different stresses across species. For example, many BnLhcb genes were up-regulated under cold stress in rapeseed (Zhang et al. 2022b). Different LHC genes responded to temperature and ethylene treatments in two kiwifruit species (Actinidia chinensis and A. eriantha) (Luo et al. 2022). Under drought stress, 20 MdLhcs genes showed considerable expression levels in apple (Zhao et al. 2020), many PbrLhc genes showed higher expression in pear (Wu et al. 2023), and some of the peach PpLhcs genes were also up-regulated (Wang et al. 2023). In tea plants, CsLHCB genes were up-regulated under temperature (heat and cold) stress (Ye et al. 2024). These examples highlight the dynamic role of LHC genes in multiple stress tolerance across species.

Recent data also suggest that genetic manipulation of LHC genes contributes to stress tolerance across species. For example, transgenic Arabidopsis overexpressing the BnLhcb3 gene from rapeseed showed enhanced cold tolerance (Zhang et al. 2022b), MdLhcb4.3 from apple improved drought and osmotic stress tolerance (Zhao et al. 2020), and the Arabidopsis Lhcb6 protein improved oxidative stress tolerance (Chen et al. 2018). In wheat, salinity tolerance was reduced by BSMV-VIGS-mediated silencing of the TaLHC86 gene, which affected the photosynthetic rate and electron transport (Chen et al. 2023b). Conversely, its overexpression in transgenic Arabidopsis enhanced salinity tolerance (Chen et al. 2023b). Expression and VIGS-based silencing analysis of the TaLhc2 gene suggested its role in tolerance to different abiotic and biotic stresses in wheat (Han et al. 2023). In C. reinhardtii, the Lhl4 gene contributed to light stress adaptation, with its transcript strongly induced by high light and triggered by blue and UV-A light receptors (Teramoto et al. 2006). The LHCSR3 protein, vital for NPQ, accumulates under high light and nutrient deficiency, shielding the PSII core from oxidative injury and facilitating osmotic stress tolerance (Madireddi et al. 2024). Additionally, npq4 mutants lacking the LHCSR3.1 and LHCSR3.2 genes show decreased stress tolerance, which can be restored by gene complementation, emphasizing their significance in high-light stress responses in C. reinhardtii (Peers et al. 2009; Maruyama et al. 2014). Together, these findings highlight the dynamic roles of LHC genes in improving stress tolerance in diverse plant and algal species.

4.3 Role of miRNAs and TFs in diverse processes

MicroRNAs (miRNAs) are small non-coding RNAs generated from single-stranded hairpin RNA precursors. They modulate gene transcripts by binding to complementary regions on target mRNAs (Molnár et al. 2007; Raza et al. 2023). Previously, several experiments have been performed to identify the targets of C. reinhardtii miRNAs that contribute to various processes (Molnár et al. 2007; Voshall et al. 2015; Gao et al. 2016; Voshall et al. 2017; Wang et al. 2017; Lou et al. 2018; Navarro and Baulcombe 2019; Sun et al. 2020; Zhang et al. 2021a). To explore the role of miRNAs targeting LHC genes, we identified 19 miRNAs targeting 42 genes. Of these, 15 and 12 CrLHC genes were targeted via cleavage and translational inhibition, respectively (Figure 4). Notably, most previous genomic studies did not focus on predicting the miRNAs targeting LHC genes. Therefore, the miRNAs identified in the present study could be considered key elements (LHC-miRNA modules) for genetic engineering to enhance different processes in C. reinhardtii, including stress tolerance.

Moreover, we identified 400 TFs from 13 families (including ERF, GATA, CPP, bZIP, C3H, MYB, SBP, Dof, Nin-like, bHLH, MYB_related, C2H2, and G2-like) for 61 CrLHC genes, suggesting their role in regulating the transcription of CrLHC genes in C. reinhardtii (Figure 5). Likewise, a previous study identified 12 TF families (e.g., ERF, bHLH, bZIP, C2H2, Dof, GATA, MYB, NAC, etc.) for 28 PbrLhc genes in pear (Wu et al. 2023). In C. reinhardtii, only a few members of these TF families have been functionally characterized. For instance, overexpression of the MYB1 TF increases triacylglycerol and starch accumulation and biomass production in C. reinhardtii, suggesting that MYB1 is an ideal target for biofuel production (Zhao et al. 2023). The CrbZIP2 TF was found to control lipid and pigment metabolism in C. reinhardtii (Bai et al. 2021). Another TF, CrbZIP1, contributed to endoplasmic reticulum stress management by regulating lipid remodelling in C. reinhardtii (Yamaoka et al. 2019). A recent study identified 12 GATA TFs in C. reinhardtii and suggested their role in nitrogen and carbon metabolism and in light- and circadian rhythm-related processes (Virolainen and Chekunova 2024). These studies suggest that the genetic manipulation of TFs might be a fruitful direction to improve C. reinhardtii performance under stressful conditions and to produce beneficial products.

4.4 Role of physiological traits in stress tolerance

To explore the role of several physiological traits in stress tolerance, we examined photosynthesis-related traits, antioxidants, ROS markers, and photosynthetic pigments in response to salinity stress (Figures S5 and S6). Under salinity, we observed a significant decline in almost all the traits, suggesting that salinity stress disrupts both antioxidant defense mechanisms and photosynthetic pigment levels, which are necessary for stress tolerance in C. reinhardtii. Similar findings were reported by Fal et al. (2022), where salinity stress reduced growth rate, Chl a/b levels, ROS generation, and the activities of enzymes (SOD, CAT, POD, and APX). Likewise, salinity stress reduced cell count and Chl a, carbohydrate, and FAME contents in C. reinhardtii (Hounslow et al. 2021), and photosynthesis-related traits and pigment production were found to be light- and nutrient-dependent (Mogany et al. 2022). In another C. reinhardtii study, photosynthetic pigments and photosynthesis-related traits (i.e., Fv/Fm, ETR, qP, and NPQ) were significantly reduced in response to salinity (chloride and carbonate) stresses (Zuo et al. 2014). From these studies, it can be concluded that salinity stress significantly reduces physiological traits, indicating the susceptibility of C. reinhardtii to salinity and the need for targeted approaches to improve stress tolerance and retain cellular functions.
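
For readers less familiar with the chlorophyll fluorescence parameters mentioned above (Fv/Fm, ETR, qP, and NPQ), the following minimal Python sketch shows how they are conventionally derived from PAM fluorometry readings. It is illustrative only and is not the analysis pipeline of this study or of the cited studies. The class name PamReading, the function names, and the example values are hypothetical. The ETR defaults (absorptance 0.84, PSII fraction 0.5) are commonly used conventions that are assumed here and may not suit algal suspensions.

```python
from dataclasses import dataclass


@dataclass
class PamReading:
    """Hypothetical container for one PAM fluorescence measurement."""
    f0: float        # minimal fluorescence, dark-adapted sample
    fm: float        # maximal fluorescence, dark-adapted sample
    fs: float        # steady-state fluorescence under actinic light
    fm_prime: float  # maximal fluorescence under actinic light
    f0_prime: float  # minimal fluorescence under actinic light


def fv_fm(r: PamReading) -> float:
    """Maximum quantum yield of PSII: (Fm - F0) / Fm."""
    return (r.fm - r.f0) / r.fm


def npq(r: PamReading) -> float:
    """Non-photochemical quenching: (Fm - Fm') / Fm'."""
    return (r.fm - r.fm_prime) / r.fm_prime


def qp(r: PamReading) -> float:
    """Photochemical quenching: (Fm' - Fs) / (Fm' - F0')."""
    return (r.fm_prime - r.fs) / (r.fm_prime - r.f0_prime)


def phi_psii(r: PamReading) -> float:
    """Effective quantum yield of PSII: (Fm' - Fs) / Fm'."""
    return (r.fm_prime - r.fs) / r.fm_prime


def etr(r: PamReading, par: float,
        absorptance: float = 0.84, psii_fraction: float = 0.5) -> float:
    """Electron transport rate: PhiPSII * PAR * absorptance * PSII fraction.

    The default absorptance and PSII fraction are assumed conventions.
    """
    return phi_psii(r) * par * absorptance * psii_fraction


if __name__ == "__main__":
    # Made-up readings, for illustration only (not data from this study).
    control = PamReading(f0=0.2, fm=1.0, fs=0.4, fm_prime=0.8, f0_prime=0.2)
    print("Fv/Fm:", fv_fm(control))
    print("NPQ:", npq(control))
    print("qP:", qp(control))
    print("ETR:", etr(control, par=100.0))
```

Comparing these values between control and salt-treated cultures is one way to express the reductions described above as numbers.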

5 CONCLUSION

In summary, we identified 61 putative CrLHC genes in the C. reinhardtii genome, which were unevenly distributed across 15 chromosomes. A comprehensive genomic analysis of the CrLHC genes was performed to improve our understanding of these genes in C. reinhardtii (as highlighted in Figure 8). We also examined their expression under diverse abiotic stress conditions, highlighting the key role of many CrLHC genes in tolerance to diverse stresses (e.g., CrLHC1/5/8/14/20/26/49/53/56, etc.). Moreover, we observed that salinity stress significantly affected several physiological traits in C. reinhardtii. This study lays the foundation for further genetic engineering experiments on selected candidate genes, which could help design advanced C. reinhardtii strains with improved stress tolerance (Figure 8).

AUTHOR CONTRIBUTIONS

ZH and AR conceived the idea. AR analyzed the data and wrote the manuscript. LY and PY helped with expression and physiological analysis. HMR helped with genomic analysis. AK and CG helped with proofreading and literature search. ZH supervised, reviewed, and edited the manuscript. All authors have read and approved the final version of the manuscript.

ACKNOWLEDGMENTS

We thank all the lab members for their support throughout the study. We are also thankful to the College of Life Sciences and Oceanography, Shenzhen University, China, for providing the research environment and financial support.

FUNDING INFORMATION

This work was supported by the National Natural Science Foundation of China (32273118), the Shenzhen Special Fund for Sustainable Development (KCXFZ20211020164013021), the Guangxi Major Program for Science and Technology (GuikeAA24263042), the Engineering Research Center Support Program from the Development and Reform Commission of Shenzhen Municipality (XMHT20220104019), and the Shenzhen University 2035 Program for Excellent Research (2022B010) to ZH.

DATA AVAILABILITY STATEMENT

The datasets used and/or analyzed during the current study are provided in the supplementary files. For abiotic stress expression profiling, we downloaded transcriptome data from different studies, as described in the Methods section. The Chlamydomonas reinhardtii and Arabidopsis thaliana genome sequences were downloaded from the Phytozome database (https://phytozome-next.jgi.doe.gov/).
