Volume 23, Issue 5 pp. 1565-1584
Research Article
Open Access

A system genetics analysis uncovers the regulatory variants controlling drought response in wheat

Bin Chen

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

These authors contributed equally to this article.

Yuling Liu

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

These authors contributed equally to this article.

Yanyan Yang

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

Qiannan Wang

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

Shumin Li

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Agronomy, Northwest A&F University, Yangling, Shaanxi, China

Fangfang Li

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

Linying Du

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Agronomy, Northwest A&F University, Yangling, Shaanxi, China

Peiyin Zhang

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

Xuemin Wang

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

Shuangxing Zhang

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Agronomy, Northwest A&F University, Yangling, Shaanxi, China

Xiaoke Zhang

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Agronomy, Northwest A&F University, Yangling, Shaanxi, China

Zhensheng Kang

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

Xiaojie Wang

Corresponding Author

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi, China

Correspondence (Tel 029-87080063; fax 029-87080063; email [email protected]; Tel 029-87081317; fax 029-87081317; email [email protected])

Hude Mao

Corresponding Author

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, College of Agronomy, Northwest A&F University, Yangling, Shaanxi, China

First published: 20 February 2025
Citations: 5

Summary

Plants activate a variable response to drought stress by modulating transcription of key genes. However, our knowledge of genetic variations governing gene expression in response to drought stress remains limited in natural germplasm. Here, we performed a comprehensive analysis of the transcriptional variability of 200 wheat accessions in response to drought stress by using a systems genetics approach integrating pan-transcriptome, co-expression networks, transcriptome-wide association study (TWAS), and expression quantitative trait loci (eQTLs) mapping. We identified 1621 genes and eight co-expression modules significantly correlated with wheat drought tolerance. We also defined 620 664 and 654 798 independent eQTLs associated with the expression of 17 429 and 18 080 eGenes under normal and drought stress conditions, respectively. Focusing on dynamic regulatory variants, we further identified 572 eQTL hotspots and constructed a transcription factor-governed drought-responsive network using an XGBoost model. Subsequently, by combining these results with a genome-wide association study (GWAS), we uncovered a 369-bp insertion variant in the TaKCS3 promoter containing multiple cis-regulatory elements recognized by eQTL hotspot-associated transcription factors that enhance its transcription. Further functional analysis indicated that elevating TaKCS3 expression affects cuticular wax composition to reduce water loss during drought stress, thereby increasing drought tolerance. This study sheds light on the genome-wide genetic variants that influence dynamic transcriptional changes during drought stress and provides a valuable resource for the mining of drought-tolerant genes in the future.

Introduction

Drought is a primary abiotic stress that limits yield and plant survival in crops and has been likened to ‘cancer’ for plants owing to its complexity and destructiveness (Pennisi, 2008). Among crops affected by drought, wheat (Triticum aestivum L.) is a staple cereal that is cultivated worldwide, accounting for more than 20% of total protein and calories consumed every day. Wheat was domesticated in the Fertile Crescent, and subsequently spread along with human migration, until reaching its current position as one of the few grains that can be widely grown under arid or semi-arid conditions (Salamini et al., 2002; Zhao et al., 2023a, 2023b, 2023c; Zhou et al., 2020). Poorly predictable drought events arising from climate change present a major threat to wheat producers (Eckardt et al., 2023; Gupta et al., 2020a, 2020b; Lesk et al., 2016). Meanwhile, the global human population will reach ~9 billion by 2050, and food production will need to increase by 70% relative to present capacity to ensure global food security (Hickey et al., 2019). Under this projection, annual wheat production will need to expand from 763 million tonnes to 858 million tonnes (Gupta et al., 2020a, 2020b). Therefore, improving wheat tolerance to water deficit represents a holy grail of molecular genetic wheat breeding programmes.

However, drought tolerance is a complex quantitative trait controlled by many genes with minor effects. Identification of the causal gene(s) or DNA variant(s) that mechanistically affect phenotype is still challenging (Hu and Xiong, 2014; Liang et al., 2021; Mao et al., 2023; Yang et al., 2023), especially in hexaploid wheat. At present, although ~1200 quantitative trait loci (QTLs) involved in wheat tolerance to drought stress have been identified through linkage analysis and genome-wide association studies (GWAS) (Gupta et al., 2020a, 2020b; Mao et al., 2023), only a small number of causal genes have been uncovered, such as TaWD40-4B.1, TaNAC071-A and TaDTG6-B (Mao et al., 2022; Mei et al., 2022; Tian et al., 2023a, 2023b). This issue is not only present in wheat but is also a long-standing problem for other major cereal crops such as maize and rice. Indeed, although dozens of QTLs related to drought tolerance have been identified in maize and rice (Almeida et al., 2013; Yue et al., 2005, 2006; Zhao et al., 2018a, 2018b), only a few genes, such as DEEPER ROOTING 1 (DRO1) (Uga et al., 2013), have been cloned through fine mapping. Advances in multi-omics analysis, together with expanded application of GWAS and candidate gene association analysis, have driven recent discoveries of drought tolerance-related DNA polymorphisms or alleles (Guo et al., 2018; Li et al., 2020a, 2020b; Mao et al., 2015; Sun et al., 2022, 2023; Tian et al., 2023a, 2023b; Wang et al., 2016, 2024), such as ZmVPP1 (Wang et al., 2016), ZmRtn16 (Tian et al., 2023a, 2023b), DRESH8 (Sun et al., 2023), DROT1 (Sun et al., 2022), and BnaA9.NF-YA7 (Wang et al., 2024). These reports suggest that establishing a landscape perspective of the regulatory mechanisms in wheat drought tolerance requires new approaches for genetic analysis and efficient pipelines for characterizing candidate genes.

Large-scale genome resequencing of natural populations of different crops has revealed a multitude of single-nucleotide polymorphisms (SNPs) and insertions/deletions (Indels) in non-coding DNA regions (Chen et al., 2022; Liu et al., 2020; Zhao et al., 2018a, 2018b; Zhou et al., 2020). Unlike typical loss-of-function mutations in coding regions, the genetic variations in regulatory regions or cis-regulatory elements (CREs) can lead to gene expression changes that impact phenotype, potentially driving evolutionary changes, suitability for domestication, or innovations in breeding (Springer et al., 2019; Swinnen et al., 2016; Wittkopp and Kalay, 2011). Among such variants, a so-called super eQTL hotspot, DRESH8 (~21.4 kb transposon in ZmPP2C16), was shown to control a resistant phenotype in maize by serving as the source of siRNAs that deplete drought-related mRNAs (Sun et al., 2023). By contrast, three MYB CREs detected within a 366-bp insertion in the ZmVPP1 promoter together enhance drought tolerance by increasing its transcription levels (Wang et al., 2016). Two SNPs in a CCAAT motif were also found to suppress BnaA9.NF-YA7 transcription, thus attenuating drought resistance in Brassica napus (Wang et al., 2024), while an SNP in the DROT1 promoter region enhances its expression, leading to stronger drought tolerance in upland rice varieties and suggesting an adaptive mechanism for improving drought response (Sun et al., 2022). In addition to these natural variations, genome editing strategies that precisely tune gene dosage by modulating transcription levels confer the ability to induce genetic pleiotropy and in turn modify the flavour or size of fruit, plant morphology, grain yield, and stress response phenotype (Alonge et al., 2020; Hendelman et al., 2021; Liu et al., 2021). These numerous reports highlight transcriptional regulation by non-coding cis-variants or trans-factors as a major contributor to the phenotypic diversity necessary for key plant adaptations to drought stress.

The development of powerful tools, especially high-throughput transcriptomic profiling and expression quantitative trait locus (eQTL) mapping, has enabled in-depth study of the genetic architectures responsible for changes in gene expression that lead to phenotypic diversity in a variety of different model organisms (Zhu et al., 2016), and notably facilitated the recent identification of several essential genes and regulatory networks underlying plant traits (Li, Wang, et al., 2020; Liang et al., 2022; Liu et al., 2020, 2022; Tan et al., 2022; Wei et al., 2024; You et al., 2023; Zhang et al., 2017). Central to these discoveries are dynamic changes in eQTL architecture associated with developmental stages or different stress stimuli that arise from the dependence of transcriptomic profiles on spatiotemporal conditions. Interrogation of these changing eQTL patterns can facilitate a more comprehensive understanding of gene regulation mechanisms and help guide research objectives. More importantly, recently developed strategies, such as Mendelian randomization (MR) analysis or transcriptome-wide association study (TWAS), can be used to screen gene expression patterns associated with specific traits, thus helping to prioritize candidate genes or variants that affect traits (Gusev et al., 2016; Zhu et al., 2016). Therefore, although difficult, improving wheat tolerance to drought requires an accurate and thorough understanding of plasticity in transcriptional architecture associated with drought tolerance.

Here, we examined genomic data from 200 wheat accessions and their 400 corresponding transcriptomes obtained under well-watered or drought stress conditions. By combining GWAS, co-expression networks, TWAS, and eQTL mapping, we identified several genetic loci, co-expression modules, and candidate genes significantly associated with drought tolerance in wheat. To further define the precise regulatory relationships and necessary regulatory elements, we also constructed XGBoost models to predict key transcription factors (TFs) and regulatory mechanisms affecting drought tolerance. We also systematically validated the critical regulatory role of TaKCS3 in wax biosynthesis and drought tolerance through a set of molecular genetic and biochemical methods. These findings shed light on the complex basis of genetic regulation for drought tolerance and provide a valuable resource for breeding drought-tolerant wheat varieties.

Results

Genome-wide markers associated with drought tolerance in wheat

To screen candidate genes involved in drought tolerance, we first constructed a whole-genome genetic variation map for wheat using genotype data from 200 wheat accessions. High-coverage whole-genome sequencing (~14.43×) yielded a total of ~42.291 Tb of data comprising ~283.68 billion 150-bp paired-end reads (Table S1). After filtering for quality, alignment of trimmed reads to the wheat reference genome (IWGSC RefSeq v1.0) (IWGSC, 2018) revealed 337 302 635 total SNPs and 27 152 842 indels. Using cut-off values of minor allele frequency (MAF) ≥ 0.05 and missing rate ≤ 0.1, we retained 48 168 951 high-quality SNPs (A genome, 20 893 779; B genome, 24 596 047; D genome, 2 466 299) for further analysis (Figure 1a; Figure S1; Table S2). Among these variants, 44 950 144 (93.32%) SNPs were located in intergenic regions, while only 687 327 (1.43%) mapped to exonic regions (Table S3). In addition, we observed that the A and B subgenomes had similar nucleotide diversity (π = 1.27 × 10⁻³ and 1.53 × 10⁻³, respectively), while the D subgenome (π = 0.16 × 10⁻³) had only ~10% of the genetic diversity of the A or B subgenomes (Figure S1). This extremely low nucleotide diversity in the D subgenome of hexaploid wheat may be explained by the severe bottleneck that occurred during hexaploidization (Akhunov et al., 2010; Zhou et al., 2020).
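The MAF and missing-rate cut-offs described above can be illustrated with a minimal sketch. It assumes biallelic SNPs coded as alternate-allele counts (0, 1, 2) with `None` for a missing call; the function name `passes_filters` and this encoding are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Optional, Sequence


def passes_filters(genotypes: Sequence[Optional[int]],
                   min_maf: float = 0.05,
                   max_missing: float = 0.1) -> bool:
    """Return True if a biallelic SNP passes the MAF and missing-rate cut-offs.

    genotypes: alternate-allele count per accession (0, 1 or 2),
               or None where the genotype call is missing (assumed encoding).
    """
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n == 0 or not called:
        return False
    # Fraction of accessions without a genotype call.
    missing_rate = 1 - len(called) / n
    if missing_rate > max_missing:
        return False
    # Each called diploid genotype contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= min_maf
```

With 20 accessions, a SNP carried homozygously by two of them has MAF 0.1 and is kept, whereas a monomorphic site or a site with 20% missing calls is dropped.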

Figure 1. Genetic diversity, population structure, and GWAS in a natural diversity panel of 200 wheat accessions. (a) The distribution of SNPs across different wheat chromosomes. Red signifies a higher number of SNPs per Mb, green denotes fewer SNPs/Mb. (b) Neighbour-joining phylogenetic tree of all accessions based on whole-genome SNPs. The colours indicate the seven groups (P1–P7). (c) Principal components analysis (PCA) of the first two components (PC1 and PC2) of the 200 wheat accessions. (d) Population structure of the wheat accessions inferred using K = 7. The y-axis quantifies cluster membership, and the x-axis represents the different accessions. (e) Manhattan plot of GWAS for SR. The horizontal dashed line represents the significance threshold (P < 1 × 10⁻⁵), and candidate genes associated with drought tolerance are labelled.

We next constructed a neighbour-joining phylogenetic tree (Figure 1b), performed principal component analysis (PCA) (Figure 1c), and conducted population structure analysis (Figure 1d) of these 200 wheat accessions. As K increased in the genetic assignment analysis, the corresponding cross-validation error decreased and reached a trough at K = 7. The 200 accessions were hence divided into seven subpopulations (P1–P7), respectively comprising 30, 26, 29, 34, 59, 9, and 13 accessions (Figure 1d). These results were further supported by clustering patterns in both the phylogenetic tree (Figure 1b) and PCA plots (Figure 1c), which suggested that geography may be the factor with the strongest influence on the genetic structure of this diversity panel. Assessment of the extent of linkage disequilibrium (LD, as estimated by r²) showed that the D subgenome exhibited lower LD than either the A or B subgenomes in whole genome and intergenic regions, whereas the opposite trend was observed in genic regions (Figure S2). These differences in patterns of LD among groups were consistent with a previous report (Hao et al., 2020), which supported the idea that faster LD decay in the D subgenome could be associated with higher and more evenly distributed recombination events along most of the D subgenome chromosomes.

To further screen the wheat genome for loci that potentially contribute to drought tolerance, we conducted a GWAS based on best linear unbiased prediction of survival rate (SR), as determined in our previous studies (Mao et al., 2022; Mei et al., 2022). Under a standard mixed linear model (MLM), we obtained 180 SNPs significantly associated with drought tolerance (P < 1 × 10⁻⁵) (Figure 1e; Figure S3A; Table S4). We then examined the 1 Mb regions upstream and downstream of these significantly associated SNPs, which identified 746 candidate genes (Table S4). Gene Ontology (GO) analysis showed that these candidate genes were mainly enriched in plant stress response pathways, protein phosphorylation, flavonoid and lignin biosynthesis, and transcription regulation (Figure S3B), including several genes related to drought response, such as the wheat homologues of Arabidopsis RD21 (Kim and Kim, 2013), DIL9-3 (Qin et al., 2014), VOZ1 (Nakai et al., 2013), DREB1A (Kasuga et al., 1999), ANAC2 (Wu et al., 2009), ERF71 (Park et al., 2011), ABCG40 (Kuromori and Shinozaki, 2010), DST1 (Huang et al., 2009), and AVP1 (Gaxiola et al., 2001) (Figure S3).
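The candidate-gene step, collecting genes within 1 Mb upstream or downstream of each significant SNP, can be sketched as a simple interval overlap. The tuple layouts and the function name `candidate_genes` are illustrative assumptions; a gene is counted here if any part of it overlaps the window, which may differ from the exact rule the authors applied.

```python
def candidate_genes(sig_snps, genes, window=1_000_000):
    """Return sorted IDs of genes overlapping a +/- window around any significant SNP.

    sig_snps: iterable of (chrom, pos) tuples (assumed layout).
    genes:    iterable of (gene_id, chrom, start, end) tuples (assumed layout).
    """
    sig_snps = list(sig_snps)
    hits = set()
    for gene_id, chrom, start, end in genes:
        for snp_chrom, pos in sig_snps:
            # Overlap between the gene body and [pos - window, pos + window].
            if snp_chrom == chrom and start <= pos + window and end >= pos - window:
                hits.add(gene_id)
                break
    return sorted(hits)
```

For a SNP at 5 Mb on chromosome 1D, a gene at 4.2 Mb on 1D is returned, while a gene at 6.5 Mb on 1D or any gene on another chromosome is not.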

Pan-transcriptome analysis of wheat in response to drought stress

To investigate variants in the wheat genome that may alter the transcriptional patterns of genes participating in drought response, we conducted pan-transcriptome analysis of the full wheat diversity panel under both well-watered (WW) and drought stress (DS) conditions. For this purpose, seedlings were grown in soil until the two-leaf stage, at which point irrigation was withdrawn. Soil moisture content and relative leaf water content (RLWC) were continuously monitored until RLWC reached ~60% in the drought stress group, while RLWC was sustained at 90% in the well-watered group (Figure 2a). Leaf samples from five plants per genotype in each treatment group were pooled for bulk RNA-seq analysis. We obtained 32.59 billion total high-quality reads, with an average of 81.48 million reads per sample (Table S5). Transcription levels for each gene were quantified by mapping unique reads to the wheat reference genome (IWGSC RefSeq v1.0), which resulted in a subset of 68 695 genes with detectable mRNA levels (Figure 2b). Evaluation of transcriptomic variability among samples by PCA showed that the drought stress and well-watered samples clustered into distinct groups, thus confirming the obvious difference in transcriptomic programme between drought and well-watered growth conditions (Figure 2c).

Figure 2. Pan-transcriptome analysis of 200 wheat accessions subjected to well-watered (WW) and drought stress (DS) conditions. (a) Changes in relative leaf water content (RLWC) throughout the drought stress treatment and at the points where WW and DS plants were sampled. (b) Venn diagram illustrating the number of overlapping and unique expressed genes under WW and DS conditions. (c) PCA separates the expression profiles of identical genotypes under WW and DS conditions. (d) Clustering of expression patterns of 57 807 differentially expressed genes (DEGs) in response to drought stress. The colour of the bar indicates the level of normalized gene expression. (e) The number and percentage of three clusters of DEGs, where drought-inducible and drought-repressed genes are further categorized into three classes: Class A (60 ≤ n < 100), Class B (100 ≤ n < 180), and Class C (180 ≤ n ≤ 200), with n representing the number of samples. (f) Representative gene ontology (GO) terms associated with drought-responsive genes across the three clusters. (g) Distribution of log2 fold change of DS/WW for genes responsive to ABA and water deprivation.

We then defined differentially expressed genes (DEGs) as genes with an absolute log2(fold change) ≥ 1 relative to well-watered samples in more than 20% of the accessions. Based on these criteria, a total of 57 807 DEGs were uncovered, including 30 751 (53.19%) up-regulated and 26 705 (46.20%) down-regulated genes. Considering the expression patterns under both conditions, we further separated these DEGs into drought-inducible or drought-repressed genes (genes up-regulated or down-regulated across more than 60 varieties), accounting for 43.41% (25 096) and 38.52% (22 267), respectively. In addition to these gene clusters, we characterized 10 444 genes (18.07%) as responsive-variable genes, which exhibited both up- and down-regulation across different wheat accessions (Figure 2d,e). These drought-inducible or drought-repressed DEGs were then further categorized according to the percentage of samples in which they were detected (class A, 60 < n ≤ 100, 30%–50%; class B, 100 < n ≤ 180, 50%–90%; class C, 180 < n ≤ 200, 90%–100%) (Figure 2e). Heatmap visualization of these inducible, repressed, and variable drought-responsive genes across the wheat diversity panel showed inconsistent patterns among accessions, suggesting differences in drought response depending on genetic background (Figure 2d).
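As an illustration of the class assignment, the sketch below counts, for one gene, the accessions in which it is up- or down-regulated and then bins that count into class A, B, or C. The bin boundaries follow the ranges given in the paragraph above; the function names, the symmetric use of the threshold for down-regulation, and the input layout (one log2 fold change per accession) are assumptions made for the example.

```python
def count_responses(log2fc, threshold=1.0):
    """Count accessions with up- or down-regulation beyond the threshold.

    log2fc: one log2(DS/WW) fold change per accession (assumed layout).
    Returns (n_up, n_down).
    """
    n_up = sum(1 for v in log2fc if v >= threshold)
    n_down = sum(1 for v in log2fc if v <= -threshold)
    return n_up, n_down


def response_class(n):
    """Bin the number of responding accessions (panel of 200) into class A, B or C."""
    if 60 < n <= 100:
        return "A"
    if 100 < n <= 180:
        return "B"
    if 180 < n <= 200:
        return "C"
    return None  # 60 or fewer accessions: not assigned to a class
```

A gene up-regulated in 120 of 200 accessions would be counted as `(120, n_down)` and assigned to class B.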

Further, GO analysis showed significant enrichment for terms such as response to stimulus, response to water deprivation, response to abscisic acid (ABA), protein folding, phenylalanine catabolic process, and proline and cuticle biosynthetic process among the drought-inducible DEGs (Figure 2f). We also noted that several well-characterized markers of drought response and ABA signalling were consistently activated across drought stress samples in most wheat accessions (Figure 2g). By contrast, GO analysis suggested that repressed DEGs were enriched in terms related to photosynthesis and hormone response, including chlorophyll biosynthetic process, carbon fixation, response to auxin, and response to cytokinin. We also found ‘response to water deprivation’ among the GO terms enriched in drought-repressed DEGs (Figure 2f). Similar GO analysis of the 10 444 variable DEGs revealed enrichment for terms such as vesicle-mediated transport, autophagy, lipid glycosylation, protein ubiquitination, and cell redox homeostasis (Figure 2f). These results supported the likelihood that natural variability contributed to differences in gene regulation among accessions. Furthermore, by cross-referencing our prior GWAS results, we found that 372 of the 746 GWAS candidates were also differentially expressed in the DEG dataset (Figure S3C; Table S4). Clustering analysis of these 372 overlapping DEGs based on their expression patterns identified 187 candidates that were generally up-regulated across wheat accessions, including several transcription factors (NAC071, WRKY70, ERF1, AKS1, MYB15) and wax biosynthesis genes (KCS3 and CER3). The remaining 185 genes, such as DST1 and CTL1, were generally down-regulated under drought stress (Figure S3C,D).

Moreover, as the biological functions of gene sets are often tightly coordinated through regulatory modules, we sought to home in on such modules (especially those correlated with drought tolerance) to better understand the molecular mechanisms through which they affect drought response. Co-expression clustering analysis of the 68 695 total genes resulted in classifying 29 436 genes into 33 co-expression modules (ME1–ME33), each containing 59–5697 genes (Figure 3a; Figure S4). Expression levels of the eigengenes for modules ME1, ME2, ME4, ME6, ME7, ME11, ME14, ME16, ME17, ME18, ME20, ME21, ME25, ME27, ME28, ME32, and ME33 were significantly higher in drought stress samples compared to well-watered conditions, while the eigengenes for modules ME3, ME5, ME9, ME10, ME12, ME13, ME15, ME19, ME22, ME23, ME24, ME26, ME30, and ME31 were significantly lower under drought stress conditions than under well-watered conditions (Figure 3b), which indicated that these gene modules were respectively induced or repressed by drought stress. Further trait-module correlation analysis indicated that eigengene expression levels for modules ME5, ME6, ME9, ME12, ME13, ME23, ME24, and ME27 were significantly associated with seedling SR (P < 1 × 10⁻¹⁰; Figure 3a). GO analysis indicated that module ME6, including 1416 predominantly up-regulated genes (Figure S5A; Table S6), was enriched in terms such as ‘response to water deprivation’, ‘response to abscisic acid’, ‘response to osmotic stress’, and ‘proline biosynthetic process’ (Figure 3c), suggesting that genes in this module could play a prominent role in wheat response to drought stress. Further exploration of hub genes in this module using module membership values (kME) (Langfelder and Horvath, 2008) uncovered 75 hub genes with |kME| > 0.85, including several TFs from the bZIP, HD-Zip, NAC, MYB, and AP2/ERF families (Figure S5B; Table S6).
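Module membership (kME) is the correlation between a gene's expression profile and the module eigengene, so the hub-gene cut-off |kME| > 0.85 can be sketched as below. The eigengene is taken as a precomputed input, and Pearson correlation, the dictionary layout, and the name `hub_genes` are assumptions made for this example rather than details confirmed by the text.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    # A constant profile has no defined correlation; treat it as 0 here.
    return sxy / denom if denom > 0 else 0.0


def hub_genes(expr, eigengene, cutoff=0.85):
    """Return sorted gene IDs whose |kME| exceeds the cut-off.

    expr:      dict mapping gene ID -> expression values across samples.
    eigengene: module eigengene values across the same samples.
    """
    return sorted(g for g, values in expr.items()
                  if abs(pearson(values, eigengene)) > cutoff)
```

Because the absolute value is used, genes that are strongly anti-correlated with the eigengene are also reported as hubs.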

Figure 3. Identification of co-expressed modules and genes significantly associated with drought tolerance. (a) The relationships and correlations among module eigengenes. The heatmap at the bottom illustrates the correlations between module eigengenes and the SR trait. (b) Comparison of expression profiles for module eigengenes under WW and DS conditions. The order of the modules corresponds to their relationships based on the dendrogram in (a), and all comparisons are based on the Wilcoxon rank-sum test (*P < 0.05; **P < 0.01; ***P < 0.001). (c) GO enrichment analysis of different co-expressed modules. (d) Venn diagram showing shared genes between significant genes identified by TWAS under WW and DS conditions and candidate genes identified by GWAS. (e) GO enrichment analysis of significant genes identified by TWAS under WW and DS conditions. (f) Manhattan plot of TWAS for SR under WW and DS conditions. Each point represents a single gene tested, with genomic positions on the x-axis and log-transformed P-values on the y-axis. Genes positively or negatively associated with the SR are plotted above or below the black bold line, respectively. The red dashed lines indicate the significance threshold (P < 0.01). Points exceeding the threshold are coloured to indicate whether the significant associations were detected under WW (blue) or DS (red) conditions, and those genes involved in drought response are labelled.

TWAS screen for genes associated with drought tolerance

To further define the key genes and their associated regulatory mechanisms contributing to drought tolerance in wheat, we next performed TWAS to correlate gene expression levels with the SR trait for 200 wheat accessions. Using a threshold for statistical significance of P < 0.01, we identified 492 genes whose expression levels in well-watered wheat accessions were significantly associated with SR, including 217 positively associated and 275 negatively associated genes. By contrast, the expression levels of 1159 genes emerged as significantly associated with SR in drought stress wheat accessions, with 905 positively associated and 254 negatively associated. Moreover, 30 TWAS-significant genes were jointly detected in both water regimes (Figure 3d; Table S7). Analysis of GO functional annotations of significant genes (FDR < 0.05) identified by TWAS showed enrichment for terms related to ‘response to water deprivation’, ‘response to abscisic acid’, ‘response to osmotic stress’, as well as ‘transcriptional regulation’, ‘vesicle-mediated transport’, ‘protein ubiquitination’, and ‘lignin biosynthetic process’ (Figure 3e). Notably, differences in the strength of TWAS associations, and the respective positive or negative relationships, suggested that these significant genes were indeed subject to conditional (i.e. drought-responsive) regulation and that they likely functioned differently in drought response pathways among these wheat accessions. For instance, we found that a wheat homologue of the Arabidopsis drought-sensitive gene DIF1 (Gao et al., 2017) had a positive association with SR in well-watered samples, but was negatively correlated with SR in the drought stress treatment group (Figure 3f). Based on these results, we chose to focus on the large suite of regulatory factors among the significant TWAS genes that may contribute to drought tolerance in wheat.
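The gene counts in this paragraph can be reconciled with the 1621 SR-associated genes reported in the Summary by inclusion-exclusion over the two conditions. The short check below uses only the numbers stated in the text; the helper name `union_size` is hypothetical.

```python
def union_size(n_a, n_b, n_shared):
    """Size of the union of two sets given their sizes and their overlap."""
    return n_a + n_b - n_shared


# Positively plus negatively associated genes under each condition.
ww_genes = 217 + 275   # well-watered
ds_genes = 905 + 254   # drought stress
shared = 30            # detected under both water regimes

total_twas_genes = union_size(ww_genes, ds_genes, shared)
```

Here `ww_genes` is 492, `ds_genes` is 1159, and `total_twas_genes` is 1621.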

Among the known regulatory genes in the TWAS results, we found a large proportion related to drought response, such as wheat homologues of Arabidopsis ERD15, DIL9-3, GolS2, RD26, LEA18, ERF1, P5CS2, EDL3, CBF1, SAP5, ERF53, RCD1, and NAP, as well as genes related to water transport, such as PIP1;4 and PIP2;8 (Figure 3f; Table S7). Since the phytohormone ABA plays a vital role in environmental stress response, we also noted several significant TWAS genes related to ABA biosynthesis (NHL6, NCED3, ABA2, and AAO2), ABA transport (ABCG25 and ABCG40), ABA core signalling (PYL5, SnRK2.5, OST1, HAI3, AREB3, and ABI5), and a number of ABA-responsive genes (GEA6, HB7, CPK29, ERF4, VND7, MYB112, CIPK23, ZFP7, ETR1, and FBW2). In addition to these typical drought stress response pathways, we also identified genes related to wax biosynthesis (KCS3, KCS12, KAS2, CER9, SHN1, and MYB30) and lignin biosynthetic enzymes (CAD1, CCoAOMT1, MTO3, HCT, and APX1) among the TWAS results (Figure 3f; Table S7), suggesting a potential correlation between drought tolerance and wax or lignin biosynthesis, which was consistent with previous studies (Choi et al., 2023; Lee and Suh, 2022). Interestingly, some genes were negatively correlated with SR among wheat accessions, including genes related to ABA biosynthesis and transport (ABA2, AAO2, ABCG25), water transport (PIP1;4, PIP2;8), and drought or ABA response (GEA6, DIL9-3, GolS2, RD26, LEA18, P5CS2, OST1, SAP5, ERF53, and FBW2) (Figure 3f). These results potentially reflected the balance between stress response and plant development required to prevent over-reaction to environmental stress, which has been reported in other crops (Chiang et al., 1995; Tanaka et al., 2005).

We then investigated whether these SR-associated genes in our TWAS results could facilitate the identification of causal genes in GWAS loci by overlapping their respective gene sets. It is noteworthy that 25 genes identified by TWAS mapped to significantly associated loci detected by GWAS (Figure 3d). One drought tolerance gene previously identified by our group, TaNAC071-A, was located within an SR-associated GWAS locus on chromosome 4A (Mao et al., 2022), and we found that its expression level was indeed significantly positively correlated with SR (Figure 3f). The transcript levels of other candidate genes, such as wheat homologues of Arabidopsis DIL9-3 on chromosome 1B, KCS3 on chromosome 1D, ERF1 on chromosome 3A, NAC053 on chromosome 6A, and ITPK1 on chromosome 7D also shared significant positive correlations with SR (Figure 3f), supporting the likelihood that they may be causal genes for GWAS loci. These results indicated that TWAS could help define causal genes for GWAS loci and were informative of the relationship between phenotype and the transcriptional regulation of various genes involved in wheat response to drought stress.

Genome-wide identification of large-scale eQTLs for drought stress response

To map regulatory variants that may impact gene expression levels or differential responses to drought stress in wheat germplasm, we used the 48 168 951 SNPs (Figure 1a) and 68 695 expressed genes (Figure 2b) for eQTL mapping. After LD-based merging with criteria r² > 0.2 and distance < 100 kb (He et al., 2022), we identified a total of 620 664 and 654 798 eQTLs (P < 2.17 × 10⁻¹⁰) associated with 17 429 and 18 080 eGenes (genes regulated by eQTLs), representing 26.63% and 27.44% of the expressed genes under well-watered and drought stress conditions, respectively (Figure 4a; Table S8). Among these eGenes, 11 725 (67.27%) under well-watered and 12 509 (69.19%) under drought stress conditions had 10 or more eQTLs influencing their expression (Figure S6A). These results revealed an unprecedented range of regulatory variants controlling gene expression in response to drought stress in wheat and suggested that transcriptomic variation in tolerance to drought stress may be more complicated in Triticum aestivum than in other crops, such as rice, maize, or cotton (Liu et al., 2020; Wei et al., 2024; You et al., 2023).
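One common way to merge significant SNPs into independent eQTLs under the stated criteria (r² > 0.2 and distance < 100 kb) is greedy clumping around lead SNPs. The sketch below shows that idea for the SNPs of a single eGene on one chromosome; it is an assumed procedure, not necessarily the one used by the authors or by He et al. (2022), and the LD lookup `r2` is a hypothetical callable supplied by the caller.

```python
def merge_snps(snps, r2, max_dist=100_000, min_r2=0.2):
    """Greedily group significant SNPs into eQTL clusters around lead SNPs.

    snps: list of (position, p_value) for one eGene on one chromosome.
    r2:   callable (pos_a, pos_b) -> LD r-squared (hypothetical helper).
    Returns a list of clusters; the first entry of each cluster is its lead SNP.
    """
    clusters = []
    # Visit SNPs from most to least significant so that leads are the top hits.
    for pos, p in sorted(snps, key=lambda s: s[1]):
        for cluster in clusters:
            lead_pos = cluster[0][0]
            if abs(pos - lead_pos) < max_dist and r2(pos, lead_pos) > min_r2:
                cluster.append((pos, p))
                break
        else:
            # No existing lead is close enough and in LD: start a new eQTL.
            clusters.append([(pos, p)])
    return clusters
```

Two SNPs 49 kb apart in strong LD collapse into one eQTL, while a third SNP 450 kb away forms its own cluster regardless of LD.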

Genome-wide eQTL identification and dynamic changes in response to drought stress. (a) Proportions of static and dynamic local and distant eQTLs across the genome under WW and DS conditions. (b) Combined Manhattan plot showing shared eGenes with significant local eQTL peaks under both WW and DS conditions. (c) Combined Manhattan plot for static local eQTL peaks where the eGenes encode TFs. (d) Combined Manhattan plot showing eGenes with significant local eQTL peaks under only one condition. (e) Combined Manhattan plot for dynamic local eQTL peaks where the eGenes encode TFs. To streamline the combined Manhattan plots, we display only the significant local eQTL peaks for specific eGenes, along with randomly selected markers below the threshold (P = 2.17 × 10−10).

When the start positions of the mapped eGenes were plotted against the positions of the eQTLs, a strong enrichment was observed along the diagonal under well-watered and drought stress conditions (Figure S7), suggesting a tightly coordinated, local regulatory relationship for gene expression, consistent with observations in other crops (Liu et al., 2020; Tan et al., 2022; Wei et al., 2024). Additionally, we observed that distant eQTLs tended to co-localize with target eGenes in syntenic regions of homoeologous chromosomes across the genome. We hypothesized that this co-localization may be due to the presence of shared regulatory elements in homoeologous genes that influence their co-regulation through regulatory feedback loops conserved among homoeologs. We then categorized all eQTLs as either local (cis-) or distant (trans-), based on their proximity to a target eGene, using a distance of 1 Mb upstream or downstream between the eQTL and its target eGene as a cut-off (He et al., 2022). This analysis identified 29 357 (4.73%) local eQTLs and 591 307 (95.27%) distant eQTLs under well-watered conditions, while 31 360 (4.79%) local eQTLs and 623 438 (95.21%) distant eQTLs were detected under drought stress conditions (Figure 4a; Table S8). Subsequent analysis of the distribution of distances between local eQTLs and the transcription start sites (TSS) of their target eGenes indicated that the large majority of local eQTLs were located in or near the corresponding eGene region, supporting the possibility that regulatory elements close to eGenes might play a prominent role in regulating their expression (Figure S8). Comparison of determination coefficients between distant and local eQTLs revealed that local eQTLs tended to have a larger effect on target eGene expression than distant eQTLs under well-watered conditions. However, both local and distant eQTLs exhibited similar effects on target eGene expression under drought stress conditions (Figure S6B). This higher number of distant eQTLs, together with their strong associations with target eGenes, implied that a more complex regulatory network could govern the wheat response to drought, with distant eQTLs playing a prominent role in determining phenotype. In addition, we observed that 11 714 (67.21%) and 12 386 (68.51%) target eGenes were jointly regulated by local and distant eQTLs under well-watered and drought stress conditions, respectively (Figure S6C). These results again illustrated the complexity of eGene regulation, involving interplay between local and distant eQTLs.
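The local/distant rule described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Gene` class, the function name, and the choice to represent each eQTL by a single position and to measure distance to the gene boundaries are all assumptions made for illustration. Only the 1 Mb cut-off comes from the text.

```python
from dataclasses import dataclass

LOCAL_WINDOW_BP = 1_000_000  # 1 Mb cut-off stated in the text


@dataclass(frozen=True)
class Gene:
    """Hypothetical container for an eGene's coordinates."""
    chrom: str
    start: int
    end: int


def classify_eqtl(eqtl_chrom: str, eqtl_pos: int, gene: Gene,
                  window: int = LOCAL_WINDOW_BP) -> str:
    """Return 'local' or 'distant' for one eQTL/eGene pair."""
    # An eQTL on another chromosome is always distant.
    if eqtl_chrom != gene.chrom:
        return "distant"
    # Assumption: distance is measured to the nearest gene boundary,
    # and is 0 when the variant lies inside the gene body.
    if eqtl_pos < gene.start:
        distance = gene.start - eqtl_pos
    elif eqtl_pos > gene.end:
        distance = eqtl_pos - gene.end
    else:
        distance = 0
    # Assumption: the boundary itself (exactly 1 Mb) counts as local.
    return "local" if distance <= window else "distant"


gene = Gene("1D", 5_000_000, 5_003_000)      # made-up coordinates
print(classify_eqtl("1D", 4_200_000, gene))  # within 1 Mb upstream
print(classify_eqtl("1D", 7_500_000, gene))  # more than 1 Mb downstream
print(classify_eqtl("1B", 5_001_000, gene))  # different chromosome
```

Under this rule, an eQTL on a homoeologous chromosome is classified as distant even when it sits in the syntenic position, which matches the co-localization pattern noted for distant eQTLs.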

Depending on the occurrence of eQTLs under different conditions, we classified eQTLs that were consistently detected under both conditions as static eQTLs, while those detected under only one condition were classified as dynamic eQTLs. This analysis yielded 8112 static local eQTLs and 364 032 static distant eQTLs that could be identified in both water regimes; 21 245 dynamic local eQTLs and 227 275 dynamic distant eQTLs were identified under well-watered conditions; and 23 248 dynamic local eQTLs and 259 406 dynamic distant eQTLs were detected under drought stress conditions (Figure 4a). Analysis of local regulatory variants revealed 8112 static local eQTLs that were stable and consistently associated with 7092 eGenes in the transcriptomic data of both the well-watered and drought stress treatment groups. For example, we repeatedly observed highly significant peaks that were uniquely associated with genes involved in ABA biosynthesis and response (HVA22A, CRK29, ABA3, ABA4, AAO2), drought stress response (DIL9-3, RGLG2, DEAR3, SAUR32), plant growth and development (FER, GSK1, ARF8, ARF16, XLG3), transcriptional regulation (Di19, TCP24, RAP2.7, MYB30, ERF109, AKS2, SCL11), and wax biosynthesis (HCD1, ACC2, FAR5) under both water regimes (Figure 4b,c). The results provided strong evidence that local regulatory variants could significantly affect the expression of the genes harbouring them and/or that of adjacent genes. However, greater numbers of dynamic local eQTLs were associated with corresponding eGenes under well-watered (21 245 eQTLs for 11 500 eGenes) or drought stress (23 248 eQTLs for 12 220 eGenes) conditions. Annotations of these target eGenes included ABA signalling (NCED9, EDL3, PYR1, ATAF1, HB-7, EULS3, RHC1A), plant growth and development (ARF8, ARF16, VRN1, FD, NAP, SEU, SLY1), transcriptional regulation (bZIP19, bZIP58, ERF73, LOV1, NAC041, NF-YB3, NF-YC10, WRKY51, WRKY70, ZFP2, MYB101), drought stress response (RD19, SDIR1, XERO1, RGLG2, BPM2), wax biosynthesis (KCS3, KCS11, KCR1), and calcium signalling (CIPK15, CPK32) (Figure 4d,e). Notably, eGenes regulated by the dynamic eQTLs under drought stress were significantly enriched for terms such as “response to water”, “response to abscisic acid”, “response to salt stress”, “response to osmotic stress”, and “response to oxidative stress” (Figure S9). We inferred from these results that local variants might be responsible for the stress responsiveness of these genes among the different wheat accessions.
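The static/dynamic split defined above amounts to set operations on the eQTL-eGene associations detected in each water regime. The following Python sketch shows the idea under a simplifying assumption: each association is identified by an exact (eQTL, eGene) pair shared between conditions, whereas a real analysis may need to match overlapping LD-merged intervals. The identifiers and function name are hypothetical.

```python
def split_static_dynamic(ww_pairs, ds_pairs):
    """Split eQTL-eGene pairs into static and condition-specific dynamic sets.

    ww_pairs / ds_pairs: iterables of (eqtl_id, egene_id) tuples detected
    under well-watered (WW) and drought stress (DS) conditions.
    """
    ww, ds = set(ww_pairs), set(ds_pairs)
    return {
        "static": ww & ds,       # detected under both conditions
        "dynamic_ww": ww - ds,   # detected only under WW
        "dynamic_ds": ds - ww,   # detected only under DS
    }


# Toy input with made-up identifiers.
ww = {("eqtl_001", "geneA"), ("eqtl_002", "geneB"), ("eqtl_003", "geneC")}
ds = {("eqtl_001", "geneA"), ("eqtl_004", "geneC")}

groups = split_static_dynamic(ww, ds)
for name, members in groups.items():
    print(name, sorted(members))
```

In this toy example geneC has a dynamic eQTL in each condition but no static one, the same situation as an eGene that appears in both condition-specific lists.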

Identification of eQTL hotspots and construction of a TF-eGene interaction network

Close examination of the genomic distribution of distant eQTLs revealed that a large proportion (24.22%) were localized in higher-density regions, forming apparent chromosomal hotspots for distant eQTLs (Figure S9). This analysis ultimately uncovered 572 total hotspots, each containing between 99 and 14 722 distant eQTLs, and corresponding to 77–3628 eGene regulatory targets per hotspot (Figure 5a; Table S9). Interestingly, the largest share of these regulatory hotspots was located in the B subgenome (A:B:D = 152:244:152). Comparison of eQTL hotspots between the well-watered and drought stress treatment groups revealed that 437 (76.4%) hotspots were identical or overlapped between groups, while only 135 (23.6%) were exclusive to one condition. Furthermore, the number of unique eQTL hotspots under drought stress conditions was approximately twice that under well-watered conditions (Figure 5a; Table S10). These results indicated that the distribution of distant eQTLs remains relatively stable between stress and non-stress treatments, but that drought stress leads to the formation of substantially more condition-specific distant eQTL hotspots, suggesting a strong, distinct transcriptional response.

Identification of distant eQTL hotspots and construction of a TF-eGene interaction model. (a) The genomic distribution of eQTLs and distant eQTL hotspots. Tracks a and b represent the distribution density of eQTLs along the genome under WW and DS conditions, respectively. Tracks c and d show the locations and distribution of distant eQTL hotspots across the genome, where a more intense red colour indicates a higher number of eGenes and green colour indicates fewer eGenes. The bars indicate the number of hotspots. “Special” refers to the specific hotspots, “Partial” to the partially overlapping hotspots, and “Identical” to the fully overlapping hotspots under the two water regimes. (b) Venn diagram showing the number of eGenes identified by GWAS, TWAS, WGCNA, and distant eQTL hotspots under WW and DS conditions. (c) Statistics of TF families predicted by the XGBoost regression model and the frequencies of eQTL hotspots where these TFs are located. (d) A TF-eGene interaction network based on upstream TFs prioritized by the XGBoost regression model. The colours of the edges indicate whether the TF-eGene regulatory relationship occurs under WW, DS, or both conditions.

Despite uncovering this pattern of dynamic distant eQTL hotspots involved in drought stress response, identifying specific, relevant genes in these hotspots still posed an obstacle to determining their roles in drought stress response. Previous studies suggest that trans-acting TFs in distant eQTL hotspots likely mediate the activation or repression of drought-responsive genes (Liu et al., 2020; Ming et al., 2023; Peleke et al., 2024; Tan et al., 2022). Screening the 6023 TFs previously identified in wheat (Yang et al., 2022) defined a subset of 1063 TFs, spanning more than 20 TF families, located within eQTL hotspots (Figure S11; Table S10). Additionally, we compared our eGenes with candidate genes from GWAS, significantly associated genes from TWAS, and drought-responsive genes from co-expression modules, and found 4085 eGenes that were likely regulated by distant eQTLs in hotspots (Figure 5b) and mainly enriched in GO terms related to stress response pathways. In agreement with our above data, approximately twice as many eGenes were predicted to be regulated by hotspot eQTLs under drought stress conditions as under well-watered conditions, which aligned well with the higher abundance of eQTL hotspots identified in the drought stress group (Figure 5b).

Considering the well-established physiological and molecular mechanisms of plant response to drought stress (Gong et al., 2020; Gupta et al., 2020a, 2020b; Zhang et al., 2022), we focused on the regulation of 97 eGenes involved in a variety of cellular processes, including ABA biosynthesis and signalling, water stress response, oxidative stress response, wax biosynthesis, lignin biosynthesis, root development, water transport, and calcium signalling. As key TFs in eQTL regulatory hotspots are well known to influence plant phenotype by regulating downstream gene expression (Liu et al., 2020; Ming et al., 2023; Tan et al., 2022), we next constructed an XGBoost regression model, based on gradient-boosted decision trees, to identify the key TFs responsible for regulating these 97 stress-related eGenes. This model has been employed in other studies to predict the upstream TFs located in specific distant eQTL hotspots (Tan et al., 2022; Zhao et al., 2023a, 2023b, 2023c). For each eGene, the XGBoost model identified the top three TFs in corresponding distant eQTL hotspots and ranked them as possible upstream regulators. This analysis revealed 160 TFs across >20 TF families, primarily from the C3H, bZIP, AP2/ERF, MYB, C2H2, WRKY, bHLH, and NAC families (Figure 5c; Table S11). Notably, many of these TFs were associated with hotspot 117, suggesting that this hotspot could potentially play a prominent role in regulating drought response in wheat (Figure 5a,c). Furthermore, based on 335 significant associations between these 160 TFs and 97 eGenes, we constructed a drought-responsive regulatory network of TF-eGene interactions, thus providing a comprehensive framework for understanding plant adaptation mechanisms to drought stress (Figure 5d).
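The step from per-eGene TF rankings to a condition-labelled network can be sketched in plain Python. This is only an illustration of the bookkeeping, not the authors' model: it assumes that a feature-importance score for each candidate TF has already been produced by some regression model for each eGene and condition, and all TF names, gene names, and scores below are invented.

```python
def top_tfs(importances, k=3):
    """Return the k TF names with the highest importance score."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [tf for tf, _ in ranked[:k]]


def build_network(scores_ww, scores_ds, k=3):
    """Map each (TF, eGene) edge to 'WW', 'DS', or 'both'.

    scores_ww / scores_ds: {egene: {tf: importance}} for each condition
    (hypothetical, precomputed inputs).
    """
    edges = {}
    for label, scores in (("WW", scores_ww), ("DS", scores_ds)):
        for egene, importances in scores.items():
            for tf in top_tfs(importances, k):
                key = (tf, egene)
                # An edge already seen under the other condition becomes 'both'.
                if key in edges and edges[key] != label:
                    edges[key] = "both"
                else:
                    edges[key] = label
    return edges


# Invented importance scores for one eGene under each condition.
scores_ww = {"geneA": {"TF1": 0.40, "TF2": 0.30, "TF3": 0.20, "TF4": 0.10}}
scores_ds = {"geneA": {"TF1": 0.35, "TF4": 0.30, "TF5": 0.25, "TF2": 0.10}}

network = build_network(scores_ww, scores_ds)
for (tf, egene), condition in sorted(network.items()):
    print(tf, "->", egene, condition)
```

Keeping only the top three TFs per eGene bounds the number of edges at three per eGene per condition, which keeps the resulting network sparse enough to inspect.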

A 369-bp local variant enhances TaKCS3 expression to confer drought tolerance

Based on these numerous results depicting an extensive role in wheat response to drought, we investigated whether local variants also regulated TaKCS3 transcription, and whether its expression correlated positively or negatively with seedling survival rate (SR). We sequenced a ~3.2-kb genomic region that included the promoter, exon, intron, and both the 5′- and 3′-UTRs of TaKCS3 from 179 wheat accessions. Sequence alignments revealed 29 SNPs and one InDel, which collectively could be used to categorize the accessions into two TaKCS3 haplotype groups, designated as Hap1 and Hap2 (Figure 6a; Table S12). Comparison of stress tolerance between haplotypes indicated that SR was significantly lower in Hap1 accessions than in Hap2 accessions (P = 5.3 × 10−6; Figure 6b). Furthermore, TaKCS3 transcript levels were significantly lower in Hap1 plants than in Hap2 plants, regardless of well-watered or drought stress treatment (P < 0.01; Figure 6c–e). In addition, TaKCS3 mRNA expression showed a strong, positive correlation with SR under both well-watered and drought stress conditions (P < 0.01; Figure 6f,g). These cumulative results suggested that local variations in TaKCS3 could modulate its expression, resulting in differential tolerance to drought stress.

A 369-bp local variant in the TaKCS3In-686 allele confers enhanced gene expression and is positively associated with drought tolerance. (a) Haplotype analysis of TaKCS3 genotypes among 200 wheat varieties based on 30 SNPs/indels. (b) Comparison of the SR trait between wheat varieties carrying Hap1 (H1) and Hap2 (H2) genotypes. (c) Comparison of TaKCS3 expression in 200 wheat varieties under WW and DS conditions. (d, e) Comparison of TaKCS3 expression between wheat varieties carrying Hap1 and Hap2 genotypes under WW (d) and DS (e) conditions. (f, g) Correlation analysis between wheat SR and relative expression levels of TaKCS3 under WW (f) and DS (g) conditions. (h) Distribution of cis-regulatory elements within the 369-bp promoter sequence. The probes used for EMSA are indicated at the bottom. (i) A diagram showing the construction of different promoter reporters from the Hap1 and Hap2 genotypes, and the truncated 369-bp sequence. (j) Quantification of relative luciferase activity in tobacco leaves transfected with the different reporter constructs. The empty reporter vector served as the negative control. (k) Results from yeast one-hybrid assays showing that TaTGA4, TaTGA10, TabZIP63, and TaMYBR1 bind to the 369-bp promoter sequence. The pGADT7 vector served as the negative control. (l) EMSA assays showing that TaTGA4, TaTGA10, TabZIP63, and TaMYBR1 directly bind to the TGACG-motif or MYBR cis-elements within the 369-bp promoter sequence. (m) Quantification of relative luciferase activity in tobacco leaves co-transfected with the truncated 369-bp promoter–reporter and different TF effectors. The co-infiltration of an empty effector vector and the reporter construct served as the negative control. Values are means ± SD from at least three independent experiments, and statistical significance was determined by a two-sided t-test (**P < 0.01).

Based on these findings, we next examined whether and how each of these variations in TaKCS3 may affect its expression. Although we found no cis-elements associated with the 26 SNPs in the promoter region, we found that the 369-bp indel (InDel-686) contained multiple cis-elements, including C-box, CAAT-box, TGACG motifs, MYB recognition sites (MYBR), auxin-responsive element (ARE), and low-temperature responsive element (LTR) (Figure 6h). To determine whether the InDel-686 was indeed responsible for increasing transcription of the TaKCS3In-686 allele, we conducted transient expression assays using a LUC (luciferase) fusion reporter driven by the full-length TaKCS3In-686 or TaKCS3Del-686 promoter variants, or the 369-bp insertion alone (Figure 6i). We found that LUC expression was higher in the context of the isolated 369-bp insertion compared with the vector control. Similarly, LUC expression driven by the TaKCS3In-686 promoter was also significantly higher than that driven by the TaKCS3Del-686 variant (Figure 6i,j). These results indicated that the 369-bp insertion in the TaKCS3 promoter likely mediated the enhanced gene expression observed in Hap2 wheat accessions.

Notably, our XGBoost model also identified several TFs in eQTL hotspots, such as wheat homologues of Arabidopsis TGA4, TGA10, bZIP63, and MYBR1, that could potentially influence drought response phenotype via transcriptional modulation of TaKCS3 (Figure 5d; Table S11). Yeast one-hybrid assays suggested that TaTGA4, TaTGA10, TabZIP63, and TaMYBR1 could likely interact with one or more sequence motifs in the 369-bp insertion (Figure 6k). To further validate these interactions, we conducted electrophoretic mobility shift assays (EMSA) with probes for either the TGACG-motif (TGACGTC) or MYBR (CAACCA) sequence and recombinant TFs carrying a GST tag. We found that each protein could indeed bind the 369-bp DNA fragment in vitro, while adding probes that corresponded to their respective recognition sites could reduce or abolish their interaction with the 369-bp insertion sequence through competitive binding (Figure 6l). Additional dual-luciferase assays in N. benthamiana leaves co-expressing TaTGA4, TaTGA10, TabZIP63, or TaMYBR1 along with the 369-bp insertion sequence fused to a LUC reporter showed that co-expression of each TF resulted in significantly higher LUC expression than in leaves expressing the reporter alone (Figure 6m). These results thus demonstrated that the 369-bp variant could act as a local regulator to increase TaKCS3 expression through enhanced binding with drought-responsive TFs in distant eQTL hotspots.

Overexpression of TaKCS3 improves drought tolerance in wheat seedlings

Finally, to investigate the relationship between drought stress phenotype and TaKCS3 transcription levels, transgenic wheat lines overexpressing TaKCS3 (TaKCS3-OE) were constructed in the wheat cultivars Fielder, Xinong511, and Jimai22. We selected three overexpression lines from 15 to 20 transgenic events for each cultivar based on their relatively higher TaKCS3 levels (Figure 7a–c). Comparison of SR among the nine TaKCS3-OE lines and each wild-type (WT) cultivar showed that seedling survival was consistently higher, with less wilting, in the drought stress-treated OE lines compared with their WT counterparts. Seedling fresh weights of the OE lines after drought stress were 25.12–31.43% higher than those of the corresponding WT controls, whereas shoot weights did not differ significantly between TaKCS3-OE and WT lines under well-watered conditions (Figure 7a–f). Similarly, detached leaf assays showed that OE lines lost water at a significantly slower rate than WT control leaves (Figure 7g–i). Infrared thermal imaging of leaf surface temperature, which is affected by transpiration, indicated that drought stress-treated TaKCS3-OE plants had higher leaf surface temperatures than WT controls, suggesting less water loss, while no difference in surface temperature was observed between TaKCS3-OE and WT plants under well-watered conditions (Figure 7j–o). Taken together, these results demonstrated that increasing TaKCS3 expression could improve seedling survival during drought stress, potentially by reducing water loss.

Increasing TaKCS3 expression confers drought tolerance by promoting cuticular wax biosynthesis. (a–c) Statistical analysis of TaKCS3 expression and drought tolerance of OE lines from wheat cultivars Fielder (a), Xinong511 (b), and Jimai22 (c). (d–f) Assessment of drought tolerance of OE lines from Fielder (d), Xinong511 (e), and Jimai22 (f). Photographs were taken under well-watered conditions and after a 3-day period of recovery with full irrigation post-drought treatment. Scale bars, 5 cm. (g–i) Water loss of detached leaves of different transgenic plants (n = 3). (j–o) Infrared thermography (j, l, n) and leaf temperature (k, m, o) of different transgenic plants under WW and DS conditions (n = 10). Scale bars, 5 cm. (p–r) Alkane and fatty acid contents in leaves of different transgenic plants (n = 6). (s) Molecular marker (InDel-686) used for segregating NILs. (t) Relative expression levels of TaKCS3 in wheat cv. Yanfu188 (YF), Ning9940 (NG), and Chinese Spring (CS) under well-watered and drought stress conditions. (u) Relative expression levels of TaKCS3 in NILs carrying In-686 or Del-686 alleles under WW and DS conditions. (v) Fresh weight of NILs under WW and DS conditions. (w) Distributions of the TaKCS3In-686 and TaKCS3Del-686 alleles in hexaploid wheat landraces and modern cultivars. (x) Distribution of TaKCS3In-686 and TaKCS3Del-686 alleles in 10 Chinese ecological zones. Values are means ± SD from at least three independent experiments, and the statistical significance was determined by two-sided t-test (*P < 0.05, **P < 0.01).

KCS is the rate-limiting enzyme in the biosynthesis of cuticular wax components and determines the degree of elongation in a substrate- or tissue-specific manner (Haslam and Kunst, 2013; Xu et al., 2024). Phylogenetic analysis revealed that TaKCS3 clustered with several known functional KCS proteins, such as rice OsKCS10 (Yang and Qin, 2023), maize ZmKCS3 and ZmKCS12 (Xu et al., 2024), and Arabidopsis KCS3, KCS12, and KCS19 (Huang et al., 2023) (Figure S12). Further subcellular localization analysis revealed that TaKCS3 primarily localizes to the endoplasmic reticulum (ER) (Figure S13). This result was in line with previous reports showing that the large majority of enzymes involved in cuticular wax production are associated with the ER membrane (Samuels et al., 2008; Xu et al., 2024). As KCS proteins are known to be required for wax biosynthesis, we conducted gas chromatography–mass spectrometry (GC–MS) analysis comparing cuticular wax levels and composition in the TaKCS3-OE lines and WT plants to investigate the specific contribution of TaKCS3. Although total wax levels did not significantly differ between transgenic OE lines and the corresponding WT plants (Figure S14), these assays revealed that alkane and fatty acid concentrations were indeed higher in the waxy cuticle of TaKCS3-OE plants than in that of WT plants (Figure 7p–r). Further analysis indicated that the levels of many components of particular chain lengths were significantly altered in the OE plants compared with the WT. In particular, the levels of wax monomers, such as C31 and C33 alkanes and C28 fatty acids, were significantly higher in the OE plants than in the WT (Figure S14), which suggests that TaKCS3 might contribute to drought tolerance in wheat by altering cuticular wax composition.

To validate the effects of the TaKCS3In-686 allele in conferring drought tolerance, self-pollination of residual heterozygous lines from the F7 RIL populations (Yanfu188 × Chinese Spring and Ning9940 × Chinese Spring), combined with marker-assisted selection, was used to construct two near-isogenic lines of TaKCS3In-686, NIL-TaKCS3YF and NIL-TaKCS3NG, as well as the NIL-TaKCS3CS near-isogenic control (Figure 7s,t). Relative expression analysis by RT-qPCR confirmed that NIL-TaKCS3YF and NIL-TaKCS3NG plants had higher TaKCS3 transcript levels than the NIL-TaKCS3CS controls across well-watered and drought stress treatment groups (Figure 7u). Comparison of fresh weights showed that NIL-TaKCS3YF and NIL-TaKCS3NG plants had higher fresh weight following DS treatment than NIL-TaKCS3CS (Figure 7v), further supporting a function for the TaKCS3In-686 allele in enhancing drought tolerance. As breeding programmes are known to select for and accumulate such favourable alleles in elite lines (Barrero et al., 2011), we examined TaKCS3In-686 prevalence in landraces and modern varieties. This genotyping screen indicated that 56.3% of landraces and 64.6% of the modern cultivars carried the TaKCS3In-686 allele (Figure 7w). Furthermore, TaKCS3In-686 showed a clearly preferential distribution in lines cultivated in Chinese agro-ecological zones III, IV, VII, VIII, and X, compared with its prevalence in the predominant wheat lines of other zones (Figure 7x). These cumulative results indicated that the TaKCS3In-686 allele was favourable for breeding elite drought-tolerant wheat, and had strong potential for even wider deployment in China.

Discussion

The growth, development, and yield of crop plants are often drastically limited by drought stress (Eckardt et al., 2023; Gupta et al., 2020a, 2020b; Lesk et al., 2016). Defining the genetic mechanisms underlying drought tolerance remains a long-standing challenge for wheat breeders due to the complexities associated with multi-gene regulation. Despite a number of major advances in genome sequencing that have enabled the discovery of numerous genetic variants, relatively few such variants have been linked to drought tolerance (Mao et al., 2023; Yang et al., 2023). While many variants have been proposed to participate in regulating the expression of drought-responsive genes, their regulatory targets remain undetermined. In our current work, we conducted genome-wide screening for natural genetic variants that could mediate changes in gene expression in response to drought stress. Based on genome-wide variants and transcriptomic data under well-watered and drought stress conditions, a systems genetics approach integrating co-expression networks, TWAS, eQTL mapping, and an XGBoost model was used to provide comprehensive insights into the genetic basis of natural variation in drought tolerance in wheat. In particular, incorporation of GWAS with the above analyses led to the identification of a previously unknown drought tolerance gene, TaKCS3, which harbours a 369-bp promoter variant carrying recognition cis-elements for multiple drought-responsive TFs, consequently increasing its expression and improving drought tolerance in wheat. Further analysis showed that TaKCS3 modulates the composition of the waxy cuticle, which results in less water loss and higher drought tolerance in plants with higher gene expression levels. Our genome-wide screen of drought tolerance-related variants will further facilitate an expanded understanding of the sophisticated regulatory mechanisms involved in drought response, and provide a framework for identifying more regulators of drought stress in future wheat studies.

eQTL mapping enables the discovery of genetic variants that can regulate single-gene or genome-wide transcription patterns, which can drive major innovations in crop breeding and genetics. For instance, eQTL mapping was recently shown to provide a number of benefits over other approaches for screening stress-responsive genes involved in drought, heat, and salt tolerance (Liang et al., 2022; Liu et al., 2020; Wei et al., 2024). Such eQTLs that are significantly associated with drought tolerance traits can be used to characterize the molecular genetic networks involved in crop adaptation to drought conditions and can guide the process of selecting and introgressing drought tolerance genes into elite crop lines. Here, we applied eQTL analysis to RNA-seq data from 200 wheat accessions, which were shown to be useful for characterizing quantitative agronomic traits in several previous studies (Mao et al., 2022; Mei et al., 2022; Wu et al., 2021; Yu et al., 2020; Zhao et al., 2023a, 2023b, 2023c). By comparing these drought-stressed plants with counterparts grown under full water availability, we obtained 57 807 DEGs and identified eight co-expression modules that contribute to drought tolerance among wheat accessions (Figure 2; Figure 3a–c). In addition, a total of 1621 DEGs controlling seedling SR were identified by TWAS under well-watered and drought stress conditions, and 30 DEGs showed significant correlations under both water regimes (Figure 3f; Table S7).

We then used these high-density genotyping and bulk RNA-seq data under different water conditions for eQTL analysis to investigate genome-wide changes in drought-responsive eGene expression. Based on the extensive genetic diversity (48 168 951 SNPs) of the wheat genome, a total of 903 318 eQTLs were detected under well-watered and drought stress conditions (Figure 4a; Table S8). Approximately 41.2% of the eQTLs were static and identified under both water regimes (Figure 4a). These results suggested that drought-responsive gene regulation patterns are well conserved and provide critical functions across accessions, and therefore may have value for improving stress tolerance in crops. Remarkably, 23 248 local and 259 406 distant dynamic eQTLs were detected only under drought stress conditions (Figure 4a), capturing a copious number of regulatory variants for drought stress response in wheat seedlings. In addition, we found that only ~8.2% (1421) and ~7.9% (1428) of eGenes were exclusively regulated by just one unique eQTL under well-watered and drought stress conditions, respectively (Figure S6A). By contrast, the majority of eGenes are associated with just one unique eQTL in other cereal crops, such as maize and rice (Liu et al., 2020; Ming et al., 2023; Wei et al., 2024). Furthermore, ~67.2% (well-watered conditions) and ~68.5% (drought stress conditions) of eGenes were co-regulated by both local and distant eQTLs (Figure S6C), hinting that the regulatory networks involved in modulating gene transcription under variable conditions could be highly complex.
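The totals quoted above follow directly from the static and dynamic counts reported in the Results (Figure 4a). The short Python check below reproduces the per-condition totals, the combined total, and the static share, using only numbers stated in the text; the variable names are ours.

```python
# Counts reported in the text (Figure 4a).
static = {"local": 8_112, "distant": 364_032}
dynamic_ww = {"local": 21_245, "distant": 227_275}
dynamic_ds = {"local": 23_248, "distant": 259_406}

n_static = sum(static.values())

# Each condition's total is its static plus its condition-specific eQTLs.
ww_total = n_static + sum(dynamic_ww.values())
ds_total = n_static + sum(dynamic_ds.values())

# Static eQTLs are counted once in the combined total.
union_total = n_static + sum(dynamic_ww.values()) + sum(dynamic_ds.values())
static_share = 100 * n_static / union_total

print(ww_total)                 # 620664
print(ds_total)                 # 654798
print(union_total)              # 903318
print(round(static_share, 1))   # 41.2
```

The combined total of 903 318 is therefore smaller than the sum of the two per-condition totals (620 664 + 654 798), because the 372 144 static eQTLs are shared between conditions.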

We also found that distant eQTL hotspots were often associated with changes in the expression of several downstream targets. Previous studies employing eQTLs in crops found that identifying hotspot regulatory regions was necessary to define the relationship between genes and traits, and also essential for constructing a gene regulatory network (Liu et al., 2020; Ming et al., 2023; Tan et al., 2022). We identified 572 such distant eQTL hotspots that were predicted to regulate 4085 eGenes (Figure 5a,b). Within these distant eQTL hotspots, we could identify 384 TFs with detectable and variable expression levels across the wheat panel (Table S10). TFs are canonical regulators of gene expression that often function as so-called molecular switches at the end of signal transduction pathways to dramatically shift transcriptomic profiles, activating (or suppressing) targets in response to stress stimuli (Hu and Xiong, 2014; Liu et al., 2020). We therefore constructed an XGBoost regression model, using eGenes and TFs to establish a theoretical basis for feature selection in modelling. This strategy can simultaneously consider and evaluate the interactive or coordinated effects of different types of TFs in an organism. To the best of our knowledge, XGBoost modelling has only been used in eQTL studies of cotton and Brassica napus (Tan et al., 2022; Zhao et al., 2023a, 2023b, 2023c), and thus our current report may be the first relatively complete regulatory interaction network generated through eQTL analysis in wheat. Further, the increasing body of whole-genome sequences will help to increase the scope of our understanding of causal variants in eQTLs and eQTL hotspots.

Moreover, the use of RNA-seq with eQTL analyses was previously demonstrated to be an efficient strategy for identifying potential candidate genes of phenotypic GWAS or QTL loci (Ming et al., 2023; Tang et al., 2021; Yuan et al., 2024). Herein, we used the trait of seedling SR with GWAS along with our above-mentioned integrative transcriptomics analyses (i.e., DEGs, TWAS, and eQTLs) to quickly identify a large set of candidate genes potentially associated with a drought-tolerant phenotype (Figure 3d). Among these candidates, we selected TaKCS3 for further validation of its role in regulating wheat tolerance to drought stress. We found that TaKCS3 indeed functions in stress tolerance via biosynthesis of specific cuticular wax components that reduce water loss during drought stress in wheat (Figure 7a–r). Moreover, we found that the Hap2 variant of TaKCS3 is associated with stronger drought tolerance, which could be attributed to a 369-bp local variant in the promoter region, leading to differential TaKCS3 expression and drought stress response between Hap1 and Hap2 (Figure 6a–h). Further analysis revealed that the 369-bp DNA insertion contains multiple cis-elements, which could be bound and activated by several eQTL hotspot-associated TFs, such as TaTGA4, TaTGA10, TabZIP63, and TaMYBR1, thereby enhancing TaKCS3 expression and improving wheat drought tolerance (Figure 6i–m), supporting the effectiveness of our TF-eGene interaction networks. Subsequent functional assays demonstrated that substituting the TaKCS3Del-686 allele with TaKCS3In-686 resulted in higher TaKCS3 expression as well as clearly stronger drought tolerance (Figure 7s–v). Moreover, we found that the TaKCS3In-686 allele is widely distributed among Chinese wheat landraces and modern varieties, across geographic regions, indicating that TaKCS3 and its favourable allele can serve as a direct target for both genetic engineering and selection for the improvement of wheat drought tolerance.

In conclusion, our study applied a combination of multi-omics analyses for genome-wide investigation of the genetic architecture underlying natural variation in drought tolerance of a wheat diversity panel. A large set of genetic variants that regulate gene expression, either constitutively or dynamically in response to drought, was uncovered on a genome-wide scale, providing candidate genes and eQTLs that facilitate a landscape perspective of drought response in wheat. The prioritized candidate genes are valuable targets for further functional study or allele mining to improve wheat drought tolerance.

Methods

Wheat accessions and genome resequencing

Two hundred wheat accessions, including both landraces and cultivars, were selected from a wheat diversity association mapping panel (Mao et al., 2022; Mei et al., 2022). The association panel was germinated, cultivated, and phenotyped as previously described (Mao et al., 2022). Genomic DNA was extracted from each accession using the CTAB method; DNA concentration was measured with a Qubit 4.0 Fluorometer (Invitrogen, USA), and DNA integrity and purity were assessed by agarose gel electrophoresis. After quality evaluation, the genomic DNA was randomly fragmented by sonication, and fragments with an average size of 200–400 bp were selected with magnetic beads. Paired-end libraries were constructed following the manufacturer's standard protocol (MGI, Shenzhen, China). Finally, the libraries were sequenced to a depth of ~10× on the DNBSEQ-T7 platform (MGI Tech, Shenzhen, Guangdong, China) to generate 150 bp paired-end reads. To ensure reliable reads without artificial bias, the paired reads underwent quality checking using the FastQC software (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and low-quality paired reads were removed using fastp v0.23.2 (Chen et al., 2018) with the parameters: -q 10 -u 50 -n 5.

Variant calling, filtering and annotation

The high-quality paired-end reads were mapped to the wheat reference genome IWGSC RefSeq v1.0 (IWGSC, 2018) using BWA v0.7.17 software (Li and Durbin, 2009) with the command ‘mem -M -k 32 -t 4’. Subsequently, the raw SNPs and InDels were obtained using the DNA-seq pipeline of Sentieon (http://www.sentieon.com). SNPs were preliminarily filtered using bcftools v1.15 (Danecek et al., 2021) with the parameter -e ‘MQ < 40.0 || FS > 60.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0 || SOR > 3’. The filtering settings for InDels were ‘QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0’. SNPs that did not meet the following criteria were further excluded: (i) a total read depth within 30% of the mode value of the site depth distribution and a standard deviation of the depth ≤ 12; (ii) biallelic alleles; (iii) P-value for the segregation test ≤0.01; (iv) a mean of linkage disequilibrium (r2) between SNP being tested and 50 adjacent SNPs ≥0.1; and (v) minor allele frequency (MAF) ≥ 0.05 and a maximum missing rate ≤0.1. The SNPs/InDels identified from the population were further annotated according to the IWGSC RefSeq v1.1 annotation using the software ANNOVAR (Wang et al., 2010).
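As an illustration of criterion (v), the following minimal sketch applies the MAF and missing-rate cut-offs to a single biallelic site. The function name `passes_site_filters` and the genotype encoding (alternate-allele counts 0, 1 or 2, with `None` for a missing call) are assumptions made for this example; it is not the pipeline used in the study, which relied on bcftools and further custom criteria.

```python
from typing import Optional, Sequence


def passes_site_filters(
    genotypes: Sequence[Optional[int]],
    min_maf: float = 0.05,
    max_missing: float = 0.1,
) -> bool:
    """Return True if a biallelic site passes the MAF and missing-rate cut-offs.

    genotypes holds one alternate-allele count (0, 1 or 2) per accession,
    or None where the genotype call is missing (hypothetical encoding).
    """
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n == 0 or not called:
        return False
    missing_rate = (n - len(called)) / n
    # Each called diploid genotype contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= min_maf and missing_rate <= max_missing
```

For example, a site with 18 reference homozygotes and 2 alternate homozygotes among 20 fully genotyped accessions has a MAF of 0.1 and is retained.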

Population genomic analysis

The biallelic SNPs with a missing rate ≤0.1 and a MAF ≥0.05 were kept for population genomic analysis. The genetic diversity ( π ) was calculated using VCFtools v0.1.16 (Danecek et al., 2011). Due to the highly repetitive nature of the wheat genome, especially in the intergenic regions, only SNPs located in the genic regions were used to construct the phylogenetic tree and infer population structure. The p-distance was used to construct a neighbour-joining (NJ) phylogenetic tree with 1000 bootstraps using the software MEGA-CC (Kumar et al., 2012), and the consensus tree was visualized by the ggtree v2.2.4 package in R (Yu et al., 2017). Population structure was evaluated using the programme Admixture v1.3.0 (Alexander et al., 2009) and principal components analysis (PCA) was performed using GCTA (Yang et al., 2011) with the entire set of SNPs.

Linkage disequilibrium analysis

To compare the patterns of linkage disequilibrium (LD) among different groups, the genome was divided into 4 Mb sliding windows with a step of 2 Mb. For the genome and intergenic groups, 400 SNPs were randomly selected from each 4 Mb window for calculations, whereas for the exonic group all SNPs were included. We used PopLDdecay v3.42 (Zhang et al., 2019) for these calculations, setting the maximum distance to 30 000 kb (MaxDist 30 000) and the output type to 1 (OutType 1). For LD decay analysis, the 30 Mb LD regions were further divided into 100 bp bins, and the mean r2 value of each bin was calculated. Subsequently, we plotted these values against distance in Mb and fitted a LOESS smoothing curve using second-degree locally weighted regression.
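The binning step of the LD decay analysis can be sketched as follows. This is a simplified illustration, assuming pairwise results are available as (distance in bp, r2) tuples; the function name `mean_r2_by_bin` is hypothetical and the sketch does not reproduce PopLDdecay's own output handling or the LOESS fit.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def mean_r2_by_bin(
    pairs: Iterable[Tuple[int, float]], bin_size: int = 100
) -> List[Tuple[int, float]]:
    """Average r2 over fixed-width distance bins.

    pairs yields (distance_bp, r2) for SNP pairs; the result is a list of
    (bin_start_bp, mean_r2) sorted by distance, ready for plotting.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance_bp, r2 in pairs:
        bin_start = (distance_bp // bin_size) * bin_size
        sums[bin_start] += r2
        counts[bin_start] += 1
    return [(b, sums[b] / counts[b]) for b in sorted(sums)]
```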

Genome-wide association analysis

The filtered SNPs within the population were utilized in the GWAS. An association analysis was conducted employing a univariate linear mixed model from the GEMMA software package (Zhou and Stephens, 2012). Correction for population structure was performed using the relatedness matrix and the top three genotyping principal components. The suggestive threshold for the P-value (2.17 × 10−7) was determined using the Genetic Type 1 Error Calculator (GEC) software (Li et al., 2012). To achieve a balance between false-positives and false-negatives, a moderate threshold of P-value (1 × 10−5) was chosen for identifying significant associations, as guided by previous GWAS studies in wheat (Mao et al., 2022; Mei et al., 2022).

RNA-seq analysis

Plants of 200 wheat accessions were germinated and cultivated as previously described (Mao et al., 2022; Mei et al., 2022). Two duplicated cultivations were performed for each accession. When the seedlings were 3 weeks old, drought treatment was applied to one duplicated cultivation by withholding water, while keeping another duplication for normal growth. After 3 weeks, relative leaf water content (RLWC) was monitored to reach ~60% in the drought stress group, while RLWC was sustained at 90% in the well-watered group. Five-cm leaf sections in the middle of the second leaves were harvested from five plants of each germplasm, pooled, and frozen in liquid nitrogen. The samples were stored at −80 °C prior to RNA extraction.

RNA extraction and sequencing for each of the 200 accessions were performed as previously described (Mei et al., 2022). After discarding low-quality reads and those containing sequencing adapters, a total of 16.18 billion and 16.41 billion paired-end 150 bp reads were obtained from the well-watered and drought stress samples, respectively. To quantify the expression levels of the high-confidence (HC) genes from the RNA-seq data, we employed the Kallisto v0.46.1 software (Bray et al., 2016), which is based on the pseudo-alignment method and has been previously validated for the polyploid wheat genome (IWGSC, 2018). We defined expressed genes as those with expression levels (Transcripts Per Kilobase Million, TPM) >0.1 in at least 20% of the wheat accessions and a minimum twofold expression change between the 5th and 95th percentiles of sorted expression levels.
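The definition of an expressed gene given above can be written as a short filter. The sketch below is one possible reading of that definition: it assumes linear interpolation for the percentiles and treats the twofold criterion as the 95th percentile being at least twice the 5th percentile. The function names and the percentile convention are assumptions, not details confirmed by the text.

```python
from typing import List, Sequence


def percentile(sorted_vals: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted sequence."""
    if not sorted_vals:
        raise ValueError("percentile of empty sequence")
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def is_expressed(
    tpm: List[float],
    min_tpm: float = 0.1,
    min_fraction: float = 0.2,
    min_fold: float = 2.0,
) -> bool:
    """Apply the two criteria: detection rate and expression variability."""
    n = len(tpm)
    if n == 0:
        return False
    detected = sum(1 for v in tpm if v > min_tpm)
    if detected / n < min_fraction:
        return False
    s = sorted(tpm)
    p5, p95 = percentile(s, 5), percentile(s, 95)
    # Multiplicative comparison avoids dividing by a zero 5th percentile.
    return p95 > 0 and p95 >= min_fold * p5
```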

Co-expression modules construction

To identify the co-expression modules for the expressed genes, the WGCNA package (Langfelder and Horvath, 2008) was used to construct the co-expression network. To choose the soft thresholding power based on the criterion of approximate scale-free topology, the function pickSoftThreshold was used, and power 13 was chosen as the lowest power at which the scale-free topology fit index curve flattens out upon reaching a high value. A signed network was constructed using the blockwiseModules function with the following parameters: power = 13, corType = pearson, TOMType = unsigned, minModuleSize = 50, and mergeCutHeight = 0.25. Each gene was then assigned to a module, denoted by a number and a colour, based on its co-expression relationships as determined by the above criteria. Genes that did not fit into these defined modules were assigned to the ‘grey’ module.

Transcriptome-wide association study

To conduct the TWAS analysis, expressed genes under well-watered (65 460 genes) and drought stress (65 884 genes) conditions were analysed using the EMMAX software (Kang et al., 2010) with MLM. A kinship matrix, calculated from randomly selected SNPs across the genome, was incorporated to adjust for population structure. To achieve a higher level of statistical power, we applied a less conservative threshold (P-value <0.01), serving as the significance criterion for the TWAS analysis.

eQTL mapping

For eQTL identification, only 68 695 expressed genes were used. One of the prerequisites for detecting eQTLs through a linear regression model is that the expression levels should follow a normal distribution in each genotype class, which is violated by outliers or non-normality in gene expression estimated from the sequencing reads (Fu et al., 2013). The expression level of each gene under both well-watered and drought stress conditions was normalized to follow a normal distribution using the ‘qqnorm’ function in R. To eliminate the hidden confounding factors in the expression data, the probabilistic estimation of expression residuals (PEER) (Stegle et al., 2012) was further employed for the data and then the residuals were used for studying the genetic effects on expression levels in the population. A total of 48 168 951 SNPs derived from the whole-genome sequencing and the gene expression PEER residuals were used for final eQTL mapping. The associations between SNPs and gene expression PEER residuals under both well-watered and drought stress conditions were carried out using the linear regression model of the MatrixEQTL package v2.3 (Shabalin, 2012), incorporating the top three genotyping principal components as covariates.
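The normalization step can be illustrated with a rank-based normal-score transform. The sketch below mirrors how R's `qqnorm` assigns theoretical quantiles (via the `ppoints` offset rule, with ties broken by input order), but it is a Python approximation written for illustration; the function name `normal_scores` is hypothetical and the authors' exact R call is not shown in the text.

```python
from statistics import NormalDist
from typing import List, Sequence


def normal_scores(values: Sequence[float]) -> List[float]:
    """Replace each value by the standard-normal quantile of its rank."""
    n = len(values)
    # Same offset rule as R's ppoints(): a = 3/8 for n <= 10, otherwise 1/2.
    a = 3 / 8 if n <= 10 else 1 / 2
    # Stable sort, so tied values keep their input order.
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        p = (rank - a) / (n + 1 - 2 * a)
        scores[idx] = std_normal.inv_cdf(p)
    return scores
```

The transform keeps the ranking of accessions for each gene while forcing the values onto a normal scale, which limits the influence of outlying expression estimates on the linear model.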

To reduce false-positive associations between SNP-gene pairs, a highly significant P-value threshold (P < 2.17 × 10−10) calculated by the GEC software (Li et al., 2012) was set. For each gene, significantly associated SNPs were merged based on LD (r2 > 0.2) and distance (d < 100 kb) into genomic intervals. The SNP with the strongest association signal within an interval was defined as an eQTL of the gene. If an eQTL was located within ±1 Mb upstream and downstream of the target gene, it was defined as a local eQTL; otherwise, as a distant eQTL. The explained variance (R2) of each eQTL for gene expression was estimated based on the formula $R^2 = \frac{U_1}{SST} = \frac{b_1^2 / c_{11}}{SST}$, where U1 represents the partial sum of regression squares for the specific eQTL, b1 is the partial regression coefficient indicating the estimated marker effect size, c11 corresponds to the relevant element in the inverse matrix of the coefficient matrix within the normal equation group, and SST denotes the total sum of squares reflecting the sum of squared differences between the actual gene expression level and its mean (https://github.com/bingochenbin/WMG200).
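To make the explained-variance calculation concrete, the sketch below works through the simplest case of one marker plus an intercept. In that case the centred normal equation has the single coefficient Sxx, so c11 = 1/Sxx and U1 = b1^2/c11. This is a simplified illustration under that assumption: the study's model also included principal components as covariates, where c11 would come from the inverse of the full coefficient matrix. The function name `eqtl_r2` is hypothetical.

```python
from typing import Sequence


def eqtl_r2(genotypes: Sequence[float], expression: Sequence[float]) -> float:
    """Explained variance of one marker in a single-marker regression."""
    n = len(genotypes)
    if n == 0 or n != len(expression):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(genotypes) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, expression))
    sst = sum((y - mean_y) ** 2 for y in expression)
    if sxx == 0 or sst == 0:
        raise ValueError("genotype and expression must both vary")
    b1 = sxy / sxx        # partial regression coefficient (marker effect)
    c11 = 1 / sxx         # inverse of the 1x1 coefficient matrix
    u1 = b1 ** 2 / c11    # partial regression sum of squares
    return u1 / sst
```

In this single-marker case the result equals the squared Pearson correlation between genotype and expression.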

Distant eQTL hotspots identification

Distant eQTL hotspots are defined as genomic regions that influence the expression of many downstream genes. To identify potential distant eQTL hotspot regions, a 1 Mb sliding window with 100 kb steps was used to count the number of distant eQTLs along the genome under both well-watered and drought stress conditions. To determine the threshold for declaring significant distant eQTL hotspots, a permutation test was conducted by randomly permuting the distribution of distant eQTLs (Wang et al., 2018). In each permutation, all distant eQTLs were randomly assigned to 1 Mb windows in the genome, and the number of eQTLs in each window was counted. Subsequently, the maximum number of eQTLs within a 1 Mb window was recorded. After 1000 permutations, the threshold for declaration of a significant distant eQTL hotspot was 98 eQTL/Mb (P < 0.01), based on the distribution of the maximum number of eQTLs from the permutations. Any window with a greater number of distant eQTLs than the cut-off was considered a putative distant eQTL hotspot regulatory region. Overlapping or adjacent hotspots that likely corresponded to a single hotspot were combined into one. The distant eQTL hotspots were visualized using circos v.0.69-6 (Krzywinski et al., 2009).
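The permutation procedure can be sketched in a few lines. The example below is a simplified illustration that assumes non-overlapping, equally sized windows indexed from 0 and uniform random assignment of eQTLs to windows; the function name `hotspot_threshold`, its parameters and the seed handling are assumptions, and the sliding 100 kb step used in the study is not modelled.

```python
import random
from collections import Counter


def hotspot_threshold(
    n_eqtls: int,
    n_windows: int,
    n_perm: int = 1000,
    alpha: float = 0.01,
    seed: int = 0,
) -> int:
    """Empirical cut-off on eQTL counts per window from permutation maxima."""
    if n_eqtls <= 0 or n_windows <= 0 or n_perm <= 0:
        raise ValueError("n_eqtls, n_windows and n_perm must be positive")
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        # Scatter every eQTL into a random window and record the fullest one.
        counts = Counter(rng.randrange(n_windows) for _ in range(n_eqtls))
        maxima.append(max(counts.values()))
    maxima.sort()
    # Take the (1 - alpha) empirical quantile of the per-permutation maxima.
    idx = min(n_perm - 1, int((1 - alpha) * n_perm))
    return maxima[idx]
```

Windows whose observed count of distant eQTLs exceeds the returned value would then be flagged as candidate hotspots.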

XGBoost regression model construction

The TPM matrices from the RNA-seq analysis under both well-watered and drought stress conditions, along with a total of 6023 wheat TFs obtained from a previous study (Yang et al., 2022), were used to construct the XGBoost regression models. Gene regulatory networks (GRNs) were built using the Python machine learning library scikit-learn v0.24.1 (Pedregosa et al., 2011) and XGBoost v1.7.5 (Chen and Carlos, 2016). The transformed TPM matrices and the list of putative TFs were used to train the XGBoost regression model for each dataset using the XGBRegressor class, with the parameters set to ‘n_estimators = 1000, max_depth = 3, learning_rate = 0.0001, reg_alpha = 0, reg_lambda = 1’.

Wheat transformation and drought tolerance assay

Seedling leaves of wheat cultivar Chinese Spring were used to generate cDNA that served as template to amplify the TaKCS3 ORF, which was then inserted into the pMWB122 vector to generate the Ubi:TaKCS3 construct. The binary vector harbouring the desired construct was transferred by electroporation into Agrobacterium tumefaciens strain EHA105 and transformed into the wheat cultivars Fielder, Xinong511, and Jimai22 by Agrobacterium-mediated transformation. Transgene-positive plants were identified in the T0, T1 and T2 generations by PCR and Sanger sequencing of the transgene using vector-specific primers (Table S13). Transgenic plants were screened and phenotyped following drought stress as described previously (Mao et al., 2022; Mei et al., 2022). Briefly, T2 transgene-positive and WT plants were planted together in enriched soil (3:1 soil:vermiculite, w/w), and grown in a greenhouse under 16 h of light and 8 h of darkness, with a 16 °C/14 °C temperature cycle, and 60% relative humidity. Drought treatment was applied to soil-grown plants at the three-leaf seedling stage by withholding water. After ~25 days, watering was resumed to allow plants to recover. Survival rates were recorded after 3 days by scoring all plants with green and viable leaves. At least 12 plants of each line were compared in each test and statistical analyses were based on data obtained from four independent experiments. Leaf water loss rate was calculated based on gravimetric water loss in relation to fresh weight from three biological replicates. Leaf temperature was measured with a portable infrared imager (T40, Teledyne FLIR).

TaKCS3 favourable allele exploration and genotyping

The ~3.2 kb genomic region, consisting of the ~1.8 kb promoter, exons, 5′-UTR and 3′-UTR sequences, was amplified and sequenced from 179 wheat accessions. These sequences were aligned using MEGA version 7 (Hall, 2013). Nucleotide polymorphisms, including InDels and SNPs (MAF ≥0.05), were identified and grouped into haplotypes. PCR genotyping was conducted using the 369-bp InDel marker in the promoter region of TaKCS3. The progenies of near-isogenic lines were used to test the drought tolerance-related traits.

Subcellular localization of TaKCS3

For TaKCS3 localization assay, the cauliflower mosaic virus (CaMV) 35S promoter-driven pJIT163:GFP expression vector (Mei et al., 2022), containing the ORF of TaKCS3, was transformed into wheat mesophyll protoplasts using polyethylene glycol (PEG)-mediated transformation (Yoo et al., 2007). The empty vector was separately transformed as a control. After incubation for 18 h under 23 °C in the dark, fluorescence was detected using a confocal microscope (FluoView FV300, Olympus, Japan) and visualized using the image analysis software provided by the manufacturer.

Yeast-one-hybrid assay

For Y1H assays, the coding sequences of TaTGA4, TaTGA10, TabZIP63, and TaMYBR1 were amplified from wheat cultivar Chinese Spring and independently fused with the activation domain of the pGADT7 vector (Clontech, Mountain View, CA, USA) to produce the recombinant effectors. The 369-bp insertion fragment of TaKCS3 was amplified and cloned into the pABAi vector (Promega, Madison, WI, USA) as the reporter. Y1HGold yeast cells were co-transformed with the effector and reporter plasmids, and the transformed cells were grown on SD/-Ura-Leu selection plates with 0.3 mg/L AbA at 30 °C for 4 days. The empty pGADT7 vector served as a negative control.

Electrophoretic mobility shift assay

The coding sequences of TaTGA4, TaTGA10, TabZIP63, and TaMYBR1 from wheat cultivar Chinese Spring were respectively cloned into the pGEX4T-1 vector. The resulting constructs were transformed into E. coli strain Rosetta (Promega). The recombinant proteins were induced when cell cultures reached OD600 = 0.6 using 0.1 mM Isopropyl β-D-Thiogalactoside (IPTG), followed by overnight incubation at 16 °C. E. coli cells were then lysed by ultrasonicator (JY92-IIN) set to 10% power, in 3:2 s on:off intervals for 10 min until bacterial lysates resembled a water-like consistency. The GST fusion proteins were purified using GST-Sefinose resin (Promega), according to the manufacturer's instructions. EMSA was then conducted using the LightShift Chemiluminescent EMSA Kit (Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturer's instructions.

Dual-luciferase transcriptional activity assays

Dual-luciferase transcriptional activity assays were conducted as previously described (Mao et al., 2022). The reporter construct was generated by cloning the 369-bp promoter fragment of TaKCS3 from wheat cultivar Chinese Spring into the pGreenII 0800-LUC vector (Promega); the effector constructs were generated by cloning the coding sequences of TaTGA4, TaTGA10, TabZIP63, and TaMYBR1 into the pGreenII 62-SK vector (Promega), driven by the CaMV 35S promoter. Agrobacterium strain GV3101 was transformed with each respective construct, and the resulting Agrobacterium colonies were used to co-infiltrate 4-week-old N. benthamiana leaves with the reporter and effector plasmids. Luciferase levels were measured using a Dual-Luciferase Reporter Assay System (Promega) following the manufacturer's instructions. Normalized data are presented as the ratio of luminescent signal intensity of the reporter versus the internal control reporter (35Spro:REN) from three independent biological samples.

Cuticular wax extraction and chemical analysis

Five-leaf stage wheat leaves with the same growth and leaf position were harvested and cut into 4 cm segments. The surface area of each sample was calculated using the intelligent leaf area measurement system. The sample leaves were then immersed in 20 mL chloroform (trichloromethane) and extracted for 60 s to obtain the surface wax at room temperature. Subsequently, 50 μL n-tetracosane (C24 alkane) was added to all extracts as an internal standard. Cuticular wax samples were filtered, concentrated, and transferred to gas chromatography (GC) autosampler vials, then dried completely with N2. The residues were derivatized with 50 μL bis-N, N-(trimethylsilyl) trifluoroacetamide (BSTFA, Sigma) and 50 μL pyridine (Fluka) at 70 °C for 1 h. Finally, the BSTFA and pyridine were dried with N2 and the resulting derivatives were dissolved in 700 μL of chloroform.

The composition of each cuticle wax sample was analysed by GC equipped with a Rxi-5ms column attached to a mass spectrometer (GMS-TSQ9000, USA). The column was operated with a helium carrier gas and split injection (5:1) at 280 °C. The oven temperature was programmed for 2 min at 50 °C, then increased by 10 °C/min to 200 °C, where it was held for 5 min at 200 °C. Subsequently, the temperature was further increased by 1.5 °C/min to 320 °C, and held for 15 min at this final temperature. Individual constituents of the wax extracts were identified via mass spectrometry by comparison with standards and known properties of each metabolite reported in the literature. Quantification was based on flame ionization detection peak areas relative to the internal standard n-tetracosane.

Accession numbers

The genotypes of 200 wheat accessions used in this study have been deposited in the Genome Variation Map (https://bigd.big.ac.cn/gvm) under accession number GVM000750, and the RNA-seq reads for 200 wheat accessions under well-watered and drought stress conditions have been deposited in the Genome Sequence Archive (https://bigd.big.ac.cn/gsa) under accession number CRA016347. All the materials in this study are available upon request.

Acknowledgements

We are very grateful to Dr. Xueling Huang and Liru Jian of the State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Northwest A&F University, for assistance with quantitative real-time PCR, genetic transformation, and GC–MS analyses. No conflict of interest was declared. This study was supported by the National Key Research and Development Programme of China (grant no. 2022YFD1200202), the National Natural Science Foundation of China (grant no. 32072002 and 32272044), and the Science Foundation for Distinguished Young Scholars of Shaanxi Province (grant no. 2023-JC-JQ-20).

    Author contributions

    H.M. and Z.K. designed and supervised this study. S.L., F.L., L.D., P.Z., and X.W. cultivated the wheat varieties and collected the leaf samples. B.C. performed the bioinformatics analysis. Y.L., Y.Y., and Q.W. performed the TaKCS3-related experiments. S.Z. and X.Z. provided the materials and helped in the construction of NILs. H.M., B.C., Y.L., Y.Y., and Z.K. drafted and revised the manuscript. The authors read and approved the final manuscript.

    Conflict of interest

    The authors declare no competing interests.

    Data availability statement

    The data that support the findings of this study are openly available in Genome Sequence Archive at https://bigd.big.ac.cn/gsa, reference number CRA016347.
