Shaping polyploid wheat for success: Origins, domestication, and the genetic improvement of agronomic traits
Edited by: Zhizhong Gong, China Agricultural University, China
ABSTRACT
Bread wheat (Triticum aestivum L., AABBDD, 2n = 6x = 42), which accounts for most of the cultivated wheat crop worldwide, is a typical allohexaploid with a genome derived from three diploid wild ancestors. Bread wheat arose and evolved via two sequential allopolyploidization events and was further polished through multiple steps of domestication. Today, cultivated allohexaploid bread wheat has numerous advantageous traits, including adaptive plasticity, favorable yield traits, and extended end-use quality, which have enabled its cultivation well beyond the ranges of its tetraploid and diploid progenitors to become a global staple food crop. In the past decade, rapid advances in wheat genomic research have considerably accelerated our understanding of the bases for the shaping of complex agronomic traits in this polyploid crop. Here, we summarize recent advances in characterizing major genetic factors underlying the origin, evolution, and improvement of polyploid wheats. We end with a brief discussion of the future prospects for the design of gene cloning strategies and modern wheat breeding.
INTRODUCTION
Bread wheat (Triticum aestivum L., AABBDD, 2n = 6x = 42), one of the most important food crops, provides about one-fifth of the calories used by humans around the world (Dubcovsky and Dvorak, 2007). As a globally cultivated allopolyploid crop, bread wheat has been shaped for success by multiple steps of domestication and polyploidization. Understanding the origin and evolution of bread wheat enables us to design new and better strategies for improving this important food crop.
Taxonomically, bread wheat belongs to the Dinkel section in genus Triticum L. of the tribe Triticeae of the family Gramineae. The three genomes of bread wheat (A, B, and D) are derived from three diploid wild ancestors, Triticum urartu (AuAu, 2n = 2x = 14), an unknown member of the Sitopsis section (closely related to Aegilops speltoides, SS, 2n = 2x = 14), and Aegilops tauschii (DD, 2n = 2x = 14). These three genomes were combined through two sequential polyploidization events. The first polyploidization occurred between T. urartu (contributing the A genome) and the Sitopsis species (source of the B genome) about 500,000–150,000 years before present (BP), resulting in the appearance of tetraploid wild emmer wheat (Triticum dicoccoides) belonging to the Emmer wheat section (IWGSC, 2014, 2018). About 8,000–9,000 years BP, a second polyploidization event followed the natural hybridization between the domesticated tetraploid emmer wheats (AABB) and Ae. tauschii (DD), finally giving rise to T. aestivum (Figure 1) (Dvorak et al., 2012; Pont et al., 2020; Yao et al., 2020).
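The chromosome counts quoted for each species follow from simple arithmetic: the notation 2n = 2x = 14 implies a base set of x = 7 chromosomes per genome copy, so the somatic number is 7 times the number of genome copies in the formula. The following is a minimal Python sketch of that bookkeeping. The function names and the convention of counting only uppercase letters (so that superscripted forms such as AuAu are read as two copies) are illustrative assumptions, not part of any established tool.

```python
# Base chromosome number x = 7, implied by 2n = 2x = 14 for the diploid ancestors.
BASE_CHROMOSOME_NUMBER = 7


def ploidy_level(genome_formula: str) -> int:
    """Count genome copies in a formula such as 'AABBDD' or 'AuAu'.

    Assumption for this sketch: each uppercase letter is one genome copy;
    lowercase letters are treated as superscript qualifiers and ignored.
    """
    return sum(1 for ch in genome_formula if ch.isupper())


def somatic_chromosome_number(genome_formula: str) -> int:
    """Return 2n for a genome formula, assuming x = 7 for every genome copy."""
    return ploidy_level(genome_formula) * BASE_CHROMOSOME_NUMBER


for name, formula in [
    ("T. urartu", "AuAu"),
    ("Ae. tauschii", "DD"),
    ("T. dicoccoides", "AABB"),
    ("T. aestivum", "AABBDD"),
]:
    print(name, formula, ploidy_level(formula), somatic_chromosome_number(formula))
```

Under these assumptions the sketch reproduces the counts given in the text: 14 for the diploids, 28 for tetraploid emmer, and 42 for hexaploid bread wheat.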

Figure 1. Evolution and domestication of crop wheats with different ploidy levels
BTR1, Tg, Sog, and 5Aq are the wild-type alleles of four domestication genes; btr1, tg, and sog are recessive mutant alleles; and 5AQ is the dominant domesticated form of 5Aq. The black boxes show materials that have not yet been identified or validated.
Apparently, the diploid, tetraploid, and hexaploid wheat species had all undergone domestication before they were widely cultivated. Domestication of diploid Triticum boeoticum (AbAb, 2n = 2x = 14), a close relative of T. urartu, gave rise to the cultivated einkorn wheat Triticum monococcum (AmAm, 2n = 2x = 14), with non-brittle rachides but hulled grains (Pourkheirandish et al., 2018). A parallel domestication of wild tetraploid emmer wheat resulted in the sequential appearance of hulled domesticated emmer wheat (Triticum dicoccum) and large-seeded, free-threshing durum wheat (Triticum durum; Figure 1) (Dubcovsky and Dvorak, 2007; Avni et al., 2017). Because of its favorable large-seed and free-threshing traits, tetraploid durum wheat had become the major wheat crop by ~3,000 years BP. However, once hexaploid bread wheat appeared, this “newcomer” exhibited broader adaptability and many other favorable agronomic traits, which enabled it to expand further than durum to become the most important wheat crop worldwide. In this review, we focus on the recent advances in understanding the origin, domestication, and distribution of wheat crops, especially the hexaploid bread wheats, around the world. Moreover, we summarize the key genetic factors that shape the complex agronomic and yield traits of polyploid wheats, including stress tolerance (primarily abiotic stress tolerance; disease resistance is not covered because of space limitations), lodging resistance, high-yield traits, and extended end-use quality, which have, to a large extent, ensured the success of polyploid wheat as a global staple food crop.
GENETIC BASES OF WHEAT SPECIES AND SUBSPECIES FORMATION
Genetic diversity, conferred by polyploidization-triggered genome changes, interspecies introgression, and spontaneously occurring mutations, forms the basis for species/subspecies differentiation in the genus Triticum (Feldman and Levy, 2005). Although current genomic sequencing of Triticum populations provides deep insights into genetic variation among different wheat species, enabling a preliminary understanding of wheat species evolution and classification, it remains challenging to pinpoint the major genes that shaped the origins of wheat species/subspecies (IWGSC, 2014, 2018; Avni et al., 2017; Luo et al., 2017; Ling et al., 2018; Hong et al., 2019; Guo et al., 2020; Pont et al., 2020; Yao et al., 2020). To date, several pioneering studies have been undertaken to characterize the genes involved in wheat species/subspecies formation.
VRT-A2 facilitated the origin of T. polonicum and T. petropavlovskyi
Polish wheat (Triticum polonicum, AABB, 2n = 4x = 28) is a unique tetraploid wheat species characterized by elongated outer glumes, that is, the basal, sterile, bract-like organs subtending the wheat spikelets (Watanabe, 1999). This long-glume trait is controlled by a single gene locus, P1 (from Polish wheat). Recently, several independent groups, including our own (Liu et al., 2021), have reported the map-based cloning of the P1 causal gene and revealed that the ectopic expression of a MADS-box transcription factor gene, VRT-A2 (VEGETATIVE TO REPRODUCTIVE TRANSITION 2 in genome A), resulted in the long-glume phenotype in T. polonicum (Adamski et al., 2021; Liu et al., 2021). Further evidence suggests that a sequence rearrangement within the first intron of T. polonicum-derived VRT-A2, consisting of a 560 base-pair (bp) deletion coupled with a 157 bp sequence substitution, facilitates the activation of VRT-A2 in floral organs, resulting in the repression of a large set of MADS-box genes and elongated leafy glumes (Li et al., 2020a; Adamski et al., 2021; Liu et al., 2021). A recent study suggested that the deletion/insertion variation in VRT-A2 intron 1 might also cause intron retention and, therefore, a frame-shift mutation (Chai et al., 2021). More interestingly, a sequence rearrangement identical to that of VRT-A2 has been detected in T. petropavlovskyi (also known as “Daosuimai” or rice-head wheat), a hexaploid wheat subspecies that also has elongated glumes (Liu et al., 2021; Xiao et al., 2021). Together, these studies support two major conclusions: (i) the activation of VRT-A2 facilitated the origin of T. polonicum as a unique species; and (ii) the long-glume trait of T. polonicum was likely inherited by T. petropavlovskyi through a genomic hybridization between T. polonicum and an unknown hexaploid landrace, enabling the formation of T. petropavlovskyi as a special subspecies of T. aestivum (Dixon and Boden, 2021; Xiao et al., 2021; Xu et al., 2021).
Tasg-D1 underlies the origin of T. sphaerococcum
The hexaploid T. sphaerococcum (AABBDD, 2n = 6x = 42), also known as Indian dwarf wheat endemic to India and Pakistan, has been classed as a unique wheat species belonging to the Dinkel section (Table 1). T. sphaerococcum shows typical brassinosteroid (BR)-deficient phenotypes, such as hemispherical grains, a semidwarf stature, straight flag leaves, and compact spikes, due to the pleiotropic effect of a single semi-dominant gene locus, S (Sphaerococcum) (Sears, 1947). The S causal gene, designated as Tasg-D1 (Triticum aestivum semispherical grain 1 in genome D), has been isolated by map-based cloning and, consistent with the phenotypes it confers, encodes serine/threonine protein kinase glycogen synthase kinase 3 (GSK3), a key repressor of BR signaling that shows sequence homology to the Arabidopsis BRASSINOSTEROID-INSENSITIVE 2 (BIN2) protein (Cheng et al., 2020; Gupta et al., 2021). Single amino acid substitutions (Arg284-to-Gly or Glu286-to-Lys) in the highly conserved TREE (Thr283-Arg-Glu-Glu286) domain enhanced the stability of Tasg-D1 protein, leading to compromised BR signal transduction and consequently the species-forming phenotypes of T. sphaerococcum. Intriguingly, previous studies showed that amino acid substitutions in the TREE domain of GSK proteins in Arabidopsis and rice also caused BR-deficient dwarfism and round grains (Li and Nam, 2002; Liu et al., 2014). These findings suggest a highly conserved biological function of GSK kinases in diverse plant species. Haplotype analyses in different wheat accessions further supported the conclusion that T. sphaerococcum might have originated from T. aestivum after hexaploidization, due to the natural point mutation in Tasg-D1. The discovery of Tasg-D1 as a species-forming gene leads to the notion that a single-gene mutation may cause considerable phenotypic changes and give rise to novel taxonomic species in Triticum L. (Cheng et al., 2020).
Section | Genome(s) | Type | Species |
---|---|---|---|
Einkorn | AA | Wild | Triticum urartu Thum. ex Gandil. |
Einkorn | AA | Wild | Triticum boeoticum Boiss. |
Einkorn | AA | Hulled | Triticum monococcum L. |
Emmer | AABB | Wild | Triticum dicoccoides (Koern. ex Aschers. et Graeb.) Schweif. |
Emmer | AABB | Hulled | Triticum dicoccum (Schrank) Schuebl. |
Emmer | AABB | Hulled | Triticum paleocolchicum Menabde |
Emmer | AABB | Hulled | Triticum ispahanicum Heslot |
Emmer | AABB | Naked | Triticum carthlicum Nevski (syn. T. persicum Vav.) |
Emmer | AABB | Naked | Triticum turgidum L. |
Emmer | AABB | Naked | Triticum durum Desf. |
Emmer | AABB | Naked | Triticum turanicum Jakubz. |
Emmer | AABB | Naked | Triticum polonicum L. |
Emmer | AABB | Naked | Triticum aethiopicum Jakubz. |
Dinkel | AABBDD | Hulled | Triticum spelta L. |
Dinkel | AABBDD | Hulled | Triticum macha Dekapr. et Menabde |
Dinkel | AABBDD | Hulled | Triticum vavilovi (Thum.) Jakubz. |
Dinkel | AABBDD | Naked | Triticum compactum Host |
Dinkel | AABBDD | Naked | Triticum sphaerococcum Perciv. |
Dinkel | AABBDD | Naked | Triticum aestivum L. ^a |
Timopheevii | AAGG | Wild | Triticum araraticum Jakubz. |
Timopheevii | AAGG | Hulled | Triticum timopheevii Zhuk. |
Zhukovskyi | AAAAGG | Hulled | Triticum zhukovskyi Menabde et Erizjan |
^a The Chinese endemic wheats, including T. aestivum ssp. yunnanense King, T. petropavlovskyi Udacz. et Migusch., and T. aestivum ssp. tibetanum Shao, are defined as three subspecies belonging to the species T. aestivum.
Genomic reshaping and the origin of Tibetan semi-wild wheat
The Tibetan semi-wild wheat (T. aestivum ssp. tibetanum Shao, AABBDD, 2n = 6x = 42) phenotypically resembles Tibetan local bread wheat landraces but shows unique brittle rachis characteristics and thus has been identified as an independent subspecies of T. aestivum (Shao et al., 1980). De novo assembly of the draft genome of a semi-wild wheat accession, Zang1817, uncovered extensive reshaping of the genome of Tibetan semi-wild wheat compared with that of normal bread wheat Chinese Spring (CS). This genome reshaping included the addition of many Zang1817-specific genomic segments and the loss of more than 380 Mb of CS-specific genomic segments (Guo et al., 2020). Principal component analysis of these data suggested that Tibetan semi-wild wheats share a closer genetic relationship with Tibetan bread wheat landraces than with other wheat accessions. More intriguingly, a genome-wide association study (GWAS) identified two types of genetic variation associated with the brittle-rachis trait in Tibetan semi-wild wheat accessions: the deletion of a 0.8 Mb fragment on chromosome 3D containing the homologs of the domestication genes Brittle rachis 1 (TaBTR1) and TaBTR2, and the insertion of a 161 bp transposable element (TE) in the domestication gene Q (square-head spike) on chromosome 5A (Guo et al., 2020). Indeed, the TE insertion in Q-5A, designated the Qt allele, is known to contribute to the brittle rachis trait (Jiang et al., 2019). Based on these findings, it is proposed that natural variations in genome structure and specific genetic segments, particularly the TaBTR1-D and Q gene loci, caused the de-domestication of Tibetan semi-wild wheat from Tibetan bread wheat landraces (T. aestivum) to form a novel subspecies (Jiang et al., 2019; Guo et al., 2020).
The C (compactum) locus and the origin of T. compactum
The C gene on chromosome 2D controls the compact spike trait in a species of hexaploid wheat known as Triticum compactum (club wheat). Club wheat cultivars are still grown commercially, but their worldwide distribution is limited to certain agro-climatic regions. For example, club wheat was originally favored due to its drought and shattering resistance, stiff straw, early seeding, and competitive yield in the drylands of the Pacific northwest (USA) (Gul and Allan, 1972). Cytogenetic studies located the locus C on chromosome 2D, and subsequent genetic studies localized C to the centromere (Johnson et al., 2008). Because proximal segments of chromosomes are characterized by highly repetitive DNA fragments with reduced levels of recombination, map-based cloning of the C causal gene will be challenging.
FROM WILD TO DOMESTICATED
Cultivated crops were initially domesticated from their wild progenitors by artificial selection of traits that better met human needs and cultivation habits, and these altered morphological traits are usually referred to collectively as the domestication syndrome (Doebley et al., 2007). The domestication of crop wheat most likely began in the region west of Diyarbakir in southeastern Turkey about 10,000–11,000 years BP, marked by the cultivation of domesticated diploid einkorn (T. monococcum) and tetraploid emmer wheat species (Dubcovsky and Dvorak, 2007; Haas et al., 2019). Just as for other crops, wheat domestication was characterized by changes in several major traits that are controlled by a few large-effect genes. These traits include non-brittle rachis, free-threshing habit, inflorescence morphology, seed size, and reduced seed dormancy. Here we mainly focus on the non-brittle rachis and free-threshing habits.
Non-brittle rachis
The wild species of wheat, for instance, the diploid T. urartu and tetraploid wild emmer (Triticum dicoccoides), naturally allow mature grains/spikelets to disperse freely through the disarticulation of the entire spikelet at each spike rachis node. This trait, referred to as “brittle rachis,” helps the wild wheat plants to effectively reproduce themselves in nature (Charmet, 2011). However, brittle rachis is a disadvantage in cultivated wheat species, as it causes grain loss and, thus, low harvest efficiency. Loss of the natural grain-dispersal character was likely the most important step in wheat domestication. About 10,000 years BP, wild emmer underwent a natural mutation event that led to the formation of cultivated emmer wheat with non-brittle rachis. At almost the same time, a similar mutation event occurred in a relative of T. urartu and conditioned the origin of the non-brittle einkorn wheat. Cultivation of the non-brittle forms of einkorn and emmer wheats enabled easier harvest of the grains.
Three loci controlling rachis brittleness have been referred to as Brittle rachis (Br or BTR). BTR-D, also known as Br1, has been mapped to the short arm of chromosome 3D and is responsible for the brittle rachis character in Tibetan semi-wild hexaploid wheat (Watanabe et al., 2003; Jiang et al., 2014). The other two loci, BTR-A (Br2) and BTR-B (Br3), which control the brittle rachis trait in wild emmer wheat, are located on chromosomes 3AS and 3BS, respectively (Nalam et al., 2006; Watanabe et al., 2006). The first breakthrough in BTR gene cloning was made in barley (Pourkheirandish et al., 2015). The brittle rachis trait of wild barley (Hordeum vulgare subsp. spontaneum [C. Koch] Thell.) is conditioned by two genetically and functionally linked genes, BTR1 and BTR2, on chromosome 3H. Deletions in either BTR1 (a 1 bp deletion in its coding region) or BTR2 (an 11 bp deletion in the coding region) converted the brittle rachis trait of wild barley into the non-brittle trait seen in domesticated barley (H. vulgare L. ssp. vulgare) (Pourkheirandish et al., 2015). BTR1 encodes a membrane-bound-like protein, BTR1, that harbors two lipophilic regions, while BTR2 encodes a small soluble protein with no sequence similarity to BTR1 (Pourkheirandish et al., 2015). The identical non-brittle rachis traits triggered by the deletion of either BTR1 or BTR2 suggest that these two genes may be functionally interdependent, yet their genetic interaction remains to be determined.
Intriguingly, the ortholog of BTR1, but not BTR2, likely underwent parallel selection in crop wheat. In cultivated einkorn accessions, btr1-A harbors a single nucleotide polymorphism (SNP) at position 355 (G355-to-A) relative to the wild-type gene in wild einkorn. This SNP led to a non-synonymous amino acid substitution (alanine to threonine at position 119 of BTR1 protein, Ala119-to-Thr) and finally a non-brittle rachis. A different btr1-A mutation, a premature stop codon mutation (Gly97*) caused by a 2 bp deletion (CG at position 291–292 from the start codon) in the BTR1-A coding region, was found in domesticated tetraploid T. durum, T. dicoccum, and hexaploid bread wheat (T. aestivum L.) (Pourkheirandish et al., 2018; Zhao et al., 2019). The different mutations in the BTR1-A gene in diploid einkorn and polyploid wheats indicate the divergent origins of their A genomes (Figure 1). In addition to the mutation in BTR1-A, the B-genome-derived BTR1-B in wheat was also disrupted by a 4 kb insertion at position 539 from the start codon in cultivated emmer, suggesting that simultaneous recessive mutations in BTR1-A and BTR1-B are likely minimally required for the non-brittle rachis morphology in polyploid wheat (Avni et al., 2017; Zhao et al., 2019). To date, the contribution of BTR-D to rachis brittleness in hexaploid wheat remains largely unknown.
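The nucleotide and amino acid coordinates quoted for btr1-A can be cross-checked with simple codon arithmetic. The sketch below assumes that positions are 1-based and counted from the first base of the start codon (the text states "from the start codon" for the deletion; applying the same convention to the SNP is an assumption). The helper name codon_coordinates is hypothetical.

```python
def codon_coordinates(cds_position: int) -> tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, base within codon).

    Both returned values are 1-based; base within codon is 1, 2, or 3.
    """
    if cds_position < 1:
        raise ValueError("coding-sequence positions are 1-based")
    codon_number = (cds_position - 1) // 3 + 1
    base_in_codon = (cds_position - 1) % 3 + 1
    return codon_number, base_in_codon


# Einkorn SNP: nucleotide 355 is the first base of codon 119,
# in line with the reported Ala119-to-Thr substitution.
print(codon_coordinates(355))

# Polyploid-wheat deletion: nucleotides 291 and 292 are the last base of
# codon 97 and the first base of codon 98, so the frame shifts from codon 97.
print(codon_coordinates(291), codon_coordinates(292))
```

Under this convention the 2 bp deletion disrupts codon 97 itself, which is compatible with the premature stop being reported at that residue (Gly97*).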
Another domestication gene, Q, located on the long arm of chromosome 5A, is also involved in the control of the brittle rachis trait. The wild wheat 5Aq allele confers fragile rachides, while its semi-dominant mutant form 5AQ gives rise to the non-brittle rachis trait (Zhang et al., 2011). Q/q encodes an APETALA2 (AP2)-like transcription factor that has sequence similarity with Arabidopsis AP2 (Faris et al., 2003; Simons et al., 2006; Zhang et al., 2011). Comparing the coding sequences of the dominant 5AQ allele and the primitive 5Aq allele revealed two conserved nucleotide differences, that is, a G-to-A variation at position 985 (G985-to-A) and a C-to-T variation at position 1254 (C1254-to-T) of the coding region, which are considered to be the major causes for the enhanced biological effect of Q relative to q. The G985-to-A variation changes the amino acid at position 329 of the Q protein (Val329-to-Ile), which might be related to the enhanced protein dimerization activity of Q relative to q (Simons et al., 2006). The C1254-to-T variation causes no amino acid change but attenuates the miR172-mediated cleavage of Q transcripts, leading to greatly elevated Q transcript levels (Debernardi et al., 2017; Greenwood et al., 2017). Based on genotypic analyses of Q/q alleles in wheat accessions with different ploidy levels, it has been speculated that the domestication event giving rise to the Q allele occurred in polyploid wheat only once, but whether it initially arose in tetraploid or hexaploid wheat is still controversial (Simons et al., 2006). Overall, the mutant forms of btr1 and Q shaped the non-brittle rachis trait of the domestication syndrome in wheat species (Figure 1). However, whether btr1 and Q control the non-brittle rachis trait independently or interdependently is not known at present.
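The two Q/q coding differences can be sanity-checked against the standard genetic code. Position 985 is the first base of codon 329, and every valine codon starts with G (GTN), while the isoleucine codons are ATT, ATC, and ATA. A first-base G-to-A change is therefore the direction that can turn Val into Ile. Position 1254 is the third base of codon 418, where many substitutions are silent. The short Python sketch below illustrates this. It does not use the actual Q sequence, which is not given here, so the specific codons in the gene remain unknown.

```python
# Standard genetic code, restricted to the codons needed for this check.
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}
ILE_CODONS = {"ATT", "ATC", "ATA"}  # ATG encodes Met, not Ile


def codon_and_base(cds_position: int) -> tuple[int, int]:
    """Return (codon number, base within codon), both 1-based."""
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3 + 1


def first_base_g_to_a(codon: str) -> str:
    """Apply a G-to-A substitution at the first base of a codon."""
    if codon[0] != "G":
        raise ValueError("codon does not start with G")
    return "A" + codon[1:]


# Which Val codons become Ile codons after a first-base G-to-A change?
val_to_ile = [c for c in sorted(VAL_CODONS) if first_base_g_to_a(c) in ILE_CODONS]

print(codon_and_base(985))   # first base of codon 329
print(codon_and_base(1254))  # third base of codon 418
print(val_to_ile)
```

Three of the four valine codons yield isoleucine under this change (GTG would give ATG, methionine), so the reported Val329-to-Ile substitution implies that codon 329 in q is GTT, GTC, or GTA.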
Free-threshing grain
A second domestication event was the conversion of hulled wheat (with tough glumes) to a de-hulled (free-threshing) form, and this conversion contributed greatly to the widespread cultivation of crop wheat. The tough glume trait is genetically controlled by numerous minor and major gene loci, including Sog (Soft glume) and Tg (Tenacious glume), on the short arm of the group 2 chromosomes, and 5AQ (Sood et al., 2009). The single recessive gene sog determines the soft glume trait and threshability in diploid einkorn wheat T. sinskajae and has been mapped to the region close to the centromere on 2AS. The semi-dominant Tg, which is thought to confer a non-free-threshing character mainly in tetraploid and hexaploid wheat, has been mapped to the most distal regions on 2BS and 2DS (Taenzler et al., 2002; Jantasuriyarat et al., 2004; Sood et al., 2009). However, the effect of Tg-A1 in the A genomes of T. aestivum or T. urartu has not yet been well defined. Based on the mapping positions, sog from the A genome and Tg from the B and D genomes are probably not orthologs, supporting the speculation that diploid and polyploid (tetraploid and hexaploid) wheat populations experienced independent domestication processes (Sood et al., 2009; Charmet, 2011).
In addition to the recessive mutation in Tg (from Tg to tg), the dominant 5AQ locus also facilitates grain threshability mainly in polyploid wheats. This has been well illustrated by the selection of the naked durum wheat from hulled cultivated tetraploid wheat genotypes (similar to emmer wheat). The dominant mutation from 5Aq to 5AQ conferred this grain threshability conversion (Figure 1) (Oliveira et al., 2012). Intriguingly, the 5Aq allele is associated with not only the hulled grain trait, but also elongated spikes (speltoid), while the 5AQ allele confers soft glumes, de-hulled grains, and subcompact spikes (Debernardi et al., 2017), suggesting a pleiotropic effect of the Q/q gene. Notably, Q has a similar biological effect to q but to a greater degree: five doses of q confer a squareheaded spike, resembling the effect of one copy of Q (Muramatsu, 1963). Moreover, the facile grain threshability conferred by Q is likely repressed by the dominant Tg locus, because a synthetic allohexaploid wheat (AABBDAeDAe, genotype 5AQ; tg-2B; Tg-2DAe) generated by combining the free-threshing tetraploid wheat (AABB, genotype 5AQ; tg-2B) with the non-free-threshing Ae. tauschii (DAeDAe, genotype Tg-2DAe) exhibited the non-free-threshing trait (Kerber and Dyck, 1969). These clues give rise to the notion that in polyploid wheat, the dominant Q locus and the tg-2B; tg-2D genotype are simultaneously required to allow the full expression of the free-threshing character (Figure 1) (Jantasuriyarat et al., 2004). The epistatic effect of Tg on Q illustrates a complicated genetic interaction among different domestication genes and might reflect shared downstream signaling pathways.
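The genetic rule summarized above (Q is required, and any dominant Tg allele masks it) can be written as a small truth-table function. This is a deliberately simplified qualitative sketch: it treats each locus as a single two-state call and ignores the dosage effects of q and the semi-dominance of Tg, both of which the text describes. The function and argument names are hypothetical.

```python
def is_free_threshing(q_allele_5a: str, tg_alleles: dict[str, str]) -> bool:
    """Predict free-threshing from a simplified genotype.

    q_allele_5a: "Q" (domesticated) or "q" (wild type) at the 5A locus.
    tg_alleles:  mapping from chromosome (e.g. "2B", "2D") to "Tg" or "tg".

    Assumption for this sketch: free-threshing requires Q together with the
    recessive tg allele at every Tg locus present in the genome.
    """
    has_q_dominant = q_allele_5a == "Q"
    all_tg_recessive = all(allele == "tg" for allele in tg_alleles.values())
    return has_q_dominant and all_tg_recessive


genotypes = {
    "free-threshing tetraploid (5AQ; tg-2B)": ("Q", {"2B": "tg"}),
    "synthetic hexaploid (5AQ; tg-2B; Tg-2D)": ("Q", {"2B": "tg", "2D": "Tg"}),
    "hexaploid with Q; tg-2B; tg-2D": ("Q", {"2B": "tg", "2D": "tg"}),
    "hulled tetraploid (5Aq; tg-2B)": ("q", {"2B": "tg"}),
}

for label, (q_allele, tg) in genotypes.items():
    print(label, "->", is_free_threshing(q_allele, tg))
```

The synthetic hexaploid case returns False even though it carries Q, which mirrors the epistatic masking by Tg-2D reported by Kerber and Dyck (1969).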
FROM LOCAL TO GLOBAL
Although wheat species with different ploidy levels experienced similar domestication processes, cultivated crop wheats with higher ploidy levels generally exhibit wider environmental adaptability, enabling them to be more extensively cultivated, that is, globally rather than locally. This has been well exemplified by hexaploid bread wheat, which shows broader adaptability to diverse growth environments, enhanced tolerance to biotic and abiotic stresses, and extended potential for use in making different food products relative to domesticated diploid and tetraploid wheat species (Dubcovsky and Dvorak, 2007). The successful expansion of allopolyploid wheats, especially hexaploid T. aestivum, from local to global cultivation largely depends on novel genomic and genetic variations driven by polyploidization. Two specific features of these wheats, that is, an increased flexibility in flowering/heading time and a greatly improved adaptability to harsh growth environments due to the pyramiding of favorable variations, are the major determinants in shaping this success.
Timing of heading
Precise timing of flowering/heading is crucial for wheat productivity in a given environment and wheat adaptability to different agro-climatic environments. Relative to their diploid wild progenitors that grow in relatively narrow and limited regions, polyploid wheats, especially the hexaploid bread wheat, exhibit more flexible flowering/heading habits in response to vernalization (long-term exposure to cold temperature) and photoperiod (day length) conditions.
The vernalization response
Vernalization is a process in which a prolonged exposure to low environmental temperatures (between 0°C and 10°C) stimulates a plant to flower (Flood and Halloran, 1984; Zhang et al., 2019b). Depending on whether vernalization is required for flowering/heading, wheat accessions are divided into winter and spring wheat classes. Winter wheats require vernalization to flower at the appropriate time, while spring wheats do not. As a result, spring wheats can be sown in either autumn or spring, but winter wheats can only be sown in autumn. Several genetic factors, including the major vernalization response genes Vernalization 1 (VRN1, also called WAP1), VRN2, and VRN3, have been documented to underlie the requirement for vernalization. In this scenario, VRN2 is a dominant determinant of the winter growth habit, while VRN1 and VRN3 contribute to the spring growth habit (Figure 2).

Figure 2. Regulation of the flowering transition in winter wheat varieties in response to vernalization and photoperiod
The flowering transition in wheat is regulated by a combination of light, photoperiod, and temperature signals. The core vernalization regulatory loop, composed of Vernalization 1 (VRN1), VRN2, and VRN3, is fine-tuned by the interplay among red light signaling through Phytochrome B (PHYB)/PHYC, circadian and photoperiodic cues mediated by WPCL1 (the wheat ortholog of LUX ARRHYTHMO (LUX)/PHYTOCLOCK 1)–PPD1 (Photoperiod 1)–CO1/2 (CONSTANS 1/2) module, and cold signaling at least partly through the activation of non-coding TaVRN1 alternative splicing (VAS).
VRN1, encoding an AP1-like MADS-box transcription factor, is a central regulatory node modulating the vernalization response and initiating the transition of the shoot apical meristem from vegetative to reproductive growth (Law et al., 1976; Quarrie et al., 1995; Dubcovsky et al., 1998; Iwaki et al., 2002; Danyluk et al., 2003; Trevaskis et al., 2003; Yan et al., 2003; Barrett et al., 2008). In non-vernalized winter wheat varieties, VRN1 is expressed at a very low level; after exposure to cold during the winter, VRN1 is dramatically induced in leaves and at the shoot apex, and its expression levels remain high thereafter to promote heading (Loukoianov et al., 2005; Oliver et al., 2009; Deng et al., 2015). In some wheat lines harboring mutations in the proximal promoter or deletions in the first intron of VRN1, the expression of VRN1 is activated even without vernalization, resulting in a conversion from a winter to a spring growth habit (Yan et al., 2003; Fu et al., 2005). In both barley and wheat crops, vernalization triggers decreased histone 3 lysine 27 trimethylation (H3K27me3) levels and increased H3K4me3 levels in the promoter and coding regions of VRN1 (Oliver et al., 2009; Diallo et al., 2012), suggesting that VRN1 transcription in response to vernalization is tightly associated with the chromatin methylation state. A recent study demonstrated that a long non-coding RNA (lncRNA), VAS (TaVRN1 alternative splicing), is expressed from the first exon and part of the first intron of VRN1 in vernalized winter wheat. This lncRNA is recognized by and associates with the transcription factor RF2b to activate the expression of VRN1 (Figure 2), highlighting a feedback regulatory mechanism by which the intron region determines VRN1 activation (Xu et al., 2021). VRN1 transcription is also fine-tuned by several other proteins, such as the RNA-binding protein GRP2 (Glycine-rich RNA-binding protein 2) and the carbohydrate-binding protein VER2 (Vernalization-related 2). Before vernalization, GRP2 directly binds to the first intron of VRN1 to repress its transcription. During vernalization, GRP2 is modified by O-GlcNAcylation, and this modification promotes tight association of GRP2 with VER2 and its translocation from the nucleus to the cytoplasm; this triggers dissociation of GRP2 from the VRN1 intron region and subsequent VRN1 activation (Xiao et al., 2014). Therefore, sequence variations in the first intron of VRN1 that block the binding of GRP2 will lead to higher expression of VRN1, and thus earlier flowering/heading in winter wheat (Xiao et al., 2014; Kippes et al., 2018).
The VRN2 locus includes two tandemly duplicated genes, ZCCT1 (zinc finger-CCT (CONSTANS, CONSTANS-like, and TOC1) domain) and ZCCT2, which encode proteins that have no clear homologs in Arabidopsis but contain a zinc-finger motif mediating DNA binding and a CCT domain observed in the flowering time-controlling protein CO (Yan et al., 2004; Trevaskis et al., 2007). VRN2 is highly expressed in non-vernalized wheat seedlings to inhibit early flowering through the direct or indirect repression of VRN3 and VRN1. Initial map-based cloning of a spring-wheat-derived VRN2 gene identified single point mutations or deletions in the CCT domain of ZCCT1 and ZCCT2, leading to the functional disruption of the ZCCT proteins (Yan et al., 2004; Distelfeld et al., 2009). These mutations contribute to the spring growth habit. Thus, the CCT domains of ZCCT1 and ZCCT2 are likely essential for their full function as flowering repressors. Further evidence revealed that the CCT domains of ZCCT1 and ZCCT2 are also required for their interactions with HAP (Heme activator protein, also known as Nuclear factor-Y or NF-Y) complex. The association of the ZCCTs with HAPs further competes with CO2 for the regulation of VRN3 (Li et al., 2011). After a prolonged exposure to cold temperatures, VRN2 expression is largely repressed, leading to the activation of VRN3 and VRN1 and subsequent initiation of flowering in early spring. In barley, the downregulation of HvVRN2 expression during vernalization is perfectly correlated with the direct association of the HvVRN1 protein with the promoter region of HvVRN2 (Deng et al., 2015). Nevertheless, whether VRN1 also acts as a direct transcriptional repressor of VRN2 in wheat is debated, as it has been reported that the wheat VRN1 is not mandatory for the downregulation of VRN2 during vernalization, but only helps maintain the low transcript levels of VRN2 after vernalization (Chen and Dubcovsky, 2012). These findings suggest that some uncharacterized genes, rather than VRN1, might be employed for the repression of VRN2 in wheat.
VRN3, a homolog of Arabidopsis FLOWERING LOCUS T (FT), is another key floral activation gene in wheat (Yan et al., 2006). FT in Arabidopsis acts as a flowering signal, that is, florigen, moving from leaves to apices through the phloem to promote flowering (Putterill and Varkonyi-Gasic, 2016). These findings shed light on the possible action of VRN3 protein in wheat. Indeed, VRN3 can directly activate the transcription of meristem identity genes, such as VRN1, possibly through physical interaction with the FD-like transcription factor TaFDL2 (Li and Dubcovsky, 2008).
The extensive genetic interactions among VRN genes lead us to propose a model for the vernalization response in wheat (Figure 2). In this model, VRN2 acts as a repressor of the flowering transition through the direct repression of VRN3 and the indirect repression of VRN1; vernalization-triggered upregulation of VRN1 can release the repressive effect of VRN2 on VRN3, which leads to a further upregulation of VRN1 (a feedback regulatory loop) beyond the threshold required for initiation of flowering (Figure 2). Dominant mutations in VRN1 as well as loss-of-function mutations in VRN2 all lead to vernalization insensitivity and conversion from a winter to a spring growth habit.
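The logic of this feedback loop can be illustrated with a toy Boolean network. The sketch below is only a qualitative illustration under strong assumptions that are not part of the published model: genes are simply on or off, updates are synchronous, inductive long days are assumed throughout, and VRN1 is taken to repress VRN2 directly (a point that the text notes is debated in wheat). The function name and step counts are hypothetical.

```python
def simulate_vrn_loop(cold_steps: int, total_steps: int = 8,
                      vrn1_dominant: bool = False, vrn2_null: bool = False) -> bool:
    """Toy Boolean simulation of the VRN1/VRN2/VRN3 loop.

    cold_steps:    number of initial time steps with a vernalizing cold signal.
    vrn1_dominant: VRN1 allele that is expressed without vernalization.
    vrn2_null:     loss-of-function VRN2 allele.
    Returns True if the network ends with VRN1 and VRN3 both on ("flowering").
    """
    # Non-vernalized starting state: VRN2 high, VRN1 and VRN3 low.
    vrn1, vrn2, vrn3 = vrn1_dominant, not vrn2_null, False
    for step in range(total_steps):
        cold = step < cold_steps
        next_vrn1 = vrn1_dominant or cold or vrn3  # induced by cold or by VRN3
        next_vrn2 = (not vrn2_null) and not vrn1   # repressed by VRN1 (assumed)
        next_vrn3 = not vrn2                       # repressed by VRN2
        vrn1, vrn2, vrn3 = next_vrn1, next_vrn2, next_vrn3
    return vrn1 and vrn3


print("winter wheat, no cold:       ", simulate_vrn_loop(cold_steps=0))
print("winter wheat, prolonged cold:", simulate_vrn_loop(cold_steps=3))
print("dominant VRN1, no cold:      ", simulate_vrn_loop(0, vrn1_dominant=True))
print("vrn2 null, no cold:          ", simulate_vrn_loop(0, vrn2_null=True))
```

In this toy setting the wild-type network stays in the repressed state without cold, whereas prolonged cold, a dominant VRN1 allele, or a VRN2 null allele each switch it to the flowering state, in line with the winter-to-spring conversions described above.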
Photoperiod response
The heading time of wheat is also determined by photoperiod. Wheat is a long-day (LD) plant; most wheat varieties flower earlier in LD (14 h or longer light) conditions, but show a significant delay (60–200 d) in heading in short-day (SD, 10 h or shorter light) conditions. However, some wheat varieties exhibit insensitivity to photoperiod and are characterized by only a slight delay (less than 35 d) in heading under SD relative to LD, which enables them to adapt to a relatively wider range of growth environments (Beales et al., 2007).
The pseudo-response regulator (PRR) gene Photoperiod 1 (PPD1) is the main determinant of photoperiod sensitivity in wheat. Numerous variations in PPD1 homoeoalleles (PPD-A1, PPD-B1, and PPD-D1) have been identified and are associated with photoperiod insensitivity. Compared to the wild-type photoperiod-responsive allele PPD-D1b, the photoperiod-insensitive allele PPD-D1a shows a 2 089 bp deletion upstream of its transcription start site (Beales et al., 2007). Similarly, a 1 085 bp deletion and a 308 bp insertion were separately identified in the promoter regions of PPD-A1 (PPD-A1a allele) and PPD-B1 (PPD-B1a allele) and contribute to photoperiod insensitivity (Wilhelm et al., 2009; Nishida et al., 2013). Notably, these variations are predicted to interrupt a key highly conserved 95 bp cis-regulatory element (Wilhelm et al., 2009; Nishida et al., 2013). This 95 bp region is thought to be important for photoperiod perception and for binding of upstream transcriptional repressors. Deletion of this cis-element largely attenuates the diurnal expression pattern of PPD1 under both SD and LD conditions, resulting in PPD1 upregulation and consequently a photoperiod-insensitive phenotype (Wilhelm et al., 2009). In addition, extra gene copies of PPD-B1 also lead to a photoperiod-insensitive phenotype (Diaz et al., 2012).
The wheat ortholog of the clock gene LUX ARRHYTHMO (LUX)/PHYTOCLOCK 1 (PCL1), WPCL1, encodes a MYB transcription factor that directly binds to the promoter of PPD1 to repress its expression. WPCL1 and PPD1b have opposite diurnal expression patterns: The photoperiod-responsive PPD1b shows peak expression in the daytime and lower expression at night, whereas WPCL1 transcript accumulation peaks at the beginning of the night and declines in the early morning (Mizuno et al., 2012). Deletion or loss of function of WPCL1 in einkorn and bread wheat results in higher levels of PPD1 expression and consequently early flowering under SD conditions (Mizuno et al., 2012, 2016). In addition, the photoreceptor Phytochrome C (PHYC) genetically activates PPD1. As a result, tetraploid wheat plants harboring null mutations in PHYC show completely attenuated expression of PPD1 and delayed flowering under inductive photoperiods (Chen et al., 2014). Further evidence suggests that, in response to red light, PHYC forms active homodimers or heterodimers with PHYB to activate PPD1 expression in the nucleus (Chen et al., 2014). Overall, these findings suggest that PPD1 is a key integrator of the circadian clock and light cues in regulating the photoperiod response in wheat (Figure 1).
The major outputs of the circadian clock and light cues, as mediated by PPD1 and PHYC, are the sequential activation of downstream photoperiod-responsive genes, including the central clock output gene CO and the florigen-coding gene VRN3/FT. In barley, PPD1 tends to promote early flowering through the activation of two CO-like genes, HvCO1 and HvCO2, the products of which then activate HvFT (Turner et al., 2005). The same appears to be true in wheat, since the transcript levels of the wheat CO-like genes CO1 and CO2 (also known as TaHd1) are also affected by PPD1 (Kitagawa et al., 2012; Shaw et al., 2012). PHYC also influences CO expression, which is evidenced by the altered expression profiles of CO1 and CO2 in wheat phyC null mutants in comparison to the wild type (Chen et al., 2014). However, it remains to be answered whether the influences of PPD1 and PHYC on CO expression are due to direct protein-gene associations, indirect genetic interactions, or merely the feedback effect of VRN3/FT (Chen et al., 2014; Song et al., 2015).
Integration of vernalization and photoperiodic flowering
The vernalization genes, such as VRN2 and VRN3, are also transcriptionally responsive to photoperiod, highlighting crosstalk between photoperiod and vernalization signaling in orchestrating wheat flowering (Dubcovsky et al., 2006). A long photoperiod induces the expression of VRN2 before vernalization, which guarantees the repression of early flowering before winter, whereas vernalization antagonizes this VRN2 induction (Dubcovsky et al., 2006; Turner et al., 2013). In addition, the expression of VRN2 is rapidly downregulated under SD conditions, mimicking vernalization-triggered repression of VRN2. As a consequence, the vernalization requirement of photoperiod-sensitive wheat varieties for flowering can be greatly reduced or even be eliminated by exposure to SD for several weeks, followed by a return to LD conditions (termed SD vernalization) (McKinney and Sando, 1935; Evans, 1987; Dubcovsky et al., 2006). This suggests that VRN2 is a regulatory hub regulated by both photoperiod and vernalization signaling pathways.
Intriguingly, the SD-mediated repression of VRN2 is antagonized in functionally dominant PPD-D1a mutant varieties, whereas in phyC loss-of-function mutants, neither PPD1 nor VRN2 is activated (Distelfeld and Dubcovsky, 2010; Turner et al., 2013; Chen et al., 2014). These findings suggest that the transcription profile of VRN2 in response to photoperiod is also conditioned by the “PHYC–PPD1” interaction. Thus, a signaling module, “light signal–PHYC–PPD1–VRN2,” is proposed to explain the light-dependent induction of VRN2. In addition, more recent studies in wheat and barley have confirmed that the circadian clock output genes CO1 and CO2 are also activators of HvVRN2, possibly through direct interactions of CO1 and CO2 with PPD1 (Mulki and Korff, 2016; Shaw et al., 2020). Notably, the repression of VRN2 by cold temperature appears to be independent of the functions of PHYC and PPD1, suggesting that the photoperiod and vernalization cues fine-tune the expression of their common downstream gene VRN2 through independent signaling pathways (Turner et al., 2013; Chen et al., 2014).
The expression pattern of VRN3/FT follows a diurnal rhythm, emphasizing its role in sensing day length. Similar to the “NF-Ys/CO–FT” module in Arabidopsis, an “NF-Ys/CO1/CO2–VRN3” regulation module is also proposed in wheat species. In this module, CO1 and CO2 interact with NF-Y transcription factor family proteins to coordinately promote VRN3 expression for early flowering in photoperiod-sensitive wheat varieties under LD conditions (Li et al., 2011). It is worth noting that CO1/2 accelerate flowering only in the absence of functional PPD1, whereas they act as weak flowering suppressors when PPD1 is present (Shaw et al., 2020). This seemingly self-contradictory role of CO1/2 in the photoperiodic flowering pathway might be due to dynamic interactions among PPD1, CO1/2, and VRN2 (Figure S1).
Overall, the photoperiodic regulatory modules, including the light-responsive “PHYC–PPD1–VRN2/VRN3” and circadian clock-responsive “PHYC–NF-Ys/CO1/CO2–VRN2/VRN3”, deliver their output signals through the transcriptional regulation of vernalization genes, representing an integration of photoperiod and vernalization signals for proper timing of heading in wheat (Figure 2).
Adaptation to environmental constraints
As they spread globally to new agricultural areas, polyploid wheats are subjected to diverse stress-related constraints, for instance extremes of heat and cold, high salinity, drought, soil chemical toxicity, and so on. The relatively high ploidy level and complex genome compositions of polyploid wheats give them greater physiological and ecological plasticity than their diploid progenitors, and thus wide adaptability to harsh growth environments. To understand the molecular bases of the improved adaptability of polyploid wheats, researchers have identified many stress-responsive genes through either “omics” techniques or homology cloning. Among these, genes that confer salt and drought tolerance include the histone acetyltransferase gene GCN5/HAG1 (General control non-repressed protein 5), the salt overly sensitive (SOS) pathway genes SOS1, SOS2/TaCIPK (CBL-interacting protein kinase), SOS3/TaCBL4 (Calcineurin B-like 4), and SOS4, the jasmonic acid (JA) synthesis gene TaAOC1 (allene oxide cyclase 1), and the microRNA (miRNA) gene MIR172 (Ramezani et al., 2013; Zhao et al., 2013; Sun et al., 2015; Cheng et al., 2021; Zheng et al., 2021). Genes that confer drought tolerance include the calreticulin gene TaCRT and reactive oxygen species (ROS) scavenging genes including CATs (encoding catalases), SODs (encoding superoxide dismutases), PODs (encoding peroxidases), and APXs (encoding ascorbate peroxidases) (Jia et al., 2008; Dudziak et al., 2019). Finally, the JA synthesis gene TaOPR3 (12-oxophytodienoate-reductase 3), the heat shock transcription factor genes TaHsfA6f and TaHsfC2a, the E3 ubiquitin ligase gene TaSAP5, and the trehalose-6-phosphate synthase gene TaTPS confer thermotolerance (Xie et al., 2015; Xue et al., 2015; Zhang et al., 2017b; Hu et al., 2018; Tian et al., 2020). 
Based on the annotations of these genes, it is proposed that (i) metabolism of phytohormones including abscisic acid (ABA), JA, ethylene (ET), and auxin (AUX); (ii) ROS and ionic homeostasis; and (iii) cellular accumulation of organic solutes are the main processes that together enhance the adaptability of bread wheat to environmental constraints (Budak et al., 2013; Zhang et al., 2019b; Urbanavičiūtė et al., 2021).
Adaptation to high temperature
The diploid progenitors of wheat were initially adapted to the dry, cool-summer Mediterranean climate and thus are relatively sensitive to high temperatures (Shewry, 2009). In particular, exposure of wheat to high temperatures during its reproductive or grain-filling stage can lead to especially severe damage, characterized by reduced grain number, decreased grain weight, and poor grain quality (Parent et al., 2017). Current knowledge of the heat-response signaling networks established in the model plant Arabidopsis suggests that heat shock transcription factors (Hsfs) and stress-related plant hormones (JA, ABA, and ET) are essential for high temperature tolerance (Li et al., 2018a). Indeed, wheat employs multiple Hsfs- and/or phytohormone-related signaling regulons, such as “Hsfa1b–TaOPR3–JA biosynthesis,” “ABA–TaHsfC2a-B–TaHSP70d (Heat shock protein 70d)/TaGalSyn (Galactinol synthase),” “TaHsfA6f–TaHSPs/TaGAAP (Golgi anti-apoptotic protein)/TaRof1 (encodes a co-chaperone),” and “TaHsfA6e–TaMBF1c-7B (Multiprotein bridging factor 1c)–DREB2A (Dehydration-responsive element-binding protein 2A)/HsfB2A/HsfB2B,” to counteract the adverse effects of high temperature (Xue et al., 2015; Hu et al., 2018; Tian et al., 2020, 2021a). Our recent unpublished data also suggest that TaHsfA1 acts as a master regulator of transcriptional reprograming in response to high-temperature stress in wheat: Loss of HsfA1 not only attenuated the activation of large sets of heat-responsive genes, including Hsf and HSP genes, but also abolished tolerance to heat stress (Figure 3).

A simplified working model illustrating the key components regulating heat responses and heat tolerance in wheat crops
Under normal conditions, the activity of the core heat shock transcription factor HsfA1 is repressed by association with the heat shock proteins HSP70 and HSP90. As a result, the downstream heat-response signaling is blocked. During heat stress, HSP70 and HSP90 dissociate from HsfA1, releasing active HsfA1 to initiate downstream gene expression. The downstream genes include HsfA2, HsfA6e, HSPs, TaOPR3, and MBF1c. The resulting elevated levels of HSP chaperone proteins prevent in vivo protein misfolding and aggregation during heat stress, while TaOPR3 and MBF1c are thought to be involved in the metabolism and/or signaling of stress-related phytohormones, such as jasmonic acid (JA), ethylene (ET), and abscisic acid (ABA). MBF1c can also be recruited into the stress granule (SG) complex by the SG component TaG3BP, where MBF1c helps maintain the translation efficiency of specific messenger RNAs (mRNAs) encoding HSPs and DNA-binding proteins. In addition, a HsfA1-independent signaling pathway has been proposed in the model plant Arabidopsis, although its presence in crop wheats has not been confirmed.
In addition to Hsfs, several other heat-responsive components, including the phosphoenolpyruvate carboxylase kinase-related kinase TaPEPKR2, the cell death suppressor Bax inhibitor-1 (BI-1), ferritin TaFER, trehalose-6-phosphate (T6P), and transcriptional coactivators such as TaMBF1c, are also positive regulators of wheat heat tolerance (Qin et al., 2015; Zang et al., 2017, 2018; Lu et al., 2018). More intriguingly, a recent study suggested that stress granules (SGs), cytoplasmic aggregates induced by heat stress, maintain translation efficiency during heat stress by recruiting TaMBF1c through the SG component RNA-binding Ras-GAP SH3 binding protein TaG3BP (Figure 3) (Tian et al., 2021a). However, whether these components and signaling pathways are controlled by or are independent of Hsfs is not well understood. Indeed, a recent study in Arabidopsis revealed that in a hsfa1 null mutant, some sets of heat-responsive genes were still effectively induced by heat stress, suggesting the existence of an HsfA1-independent signaling pathway that responds to temperature fluctuations (Li et al., 2019).
Adaptation to high salinity
Soil salinity is a major factor limiting the cultivation of wheat. Durum wheat is particularly sensitive to salt stress, while bread wheat is salt tolerant, suggesting that allohexaploidization contributed to salt tolerance. The enhanced salt tolerance of hexaploid bread wheat is determined by Kna1, a major-effect quantitative trait locus (QTL) on chromosome 4DL (Dubcovsky et al., 1996). Further evidence suggests that the wheat High-affinity K+ transporter gene HKT1;5-D (derived from the D genome), which is closely related to the rice gene OsHKT1;5, co-segregates with Kna1 (Byrt et al., 2007). Consistent with its role as a Na+ transporter, HKT1;5-D mediates Na+ exclusion from leaves and discrimination of K+ over Na+ in leaf tissues, resulting in a higher K+-to-Na+ ratio and consequent salt tolerance (Byrt et al., 2007; Munns and Tester, 2008). Interestingly, functional differentiation has been found among the three homoeologs of HKT1;5: HKT1;5-A is deleted and HKT1;5-B is non-functional, leaving HKT1;5-D as the only fully functional homoeolog (Yang et al., 2014). These findings suggest that the strong salt tolerance of bread wheat was acquired during or after allohexaploidization. A cross between durum wheat and the T. monococcum accession C68-101 generated a novel salt-tolerant durum wheat, Line 149 (Munns et al., 2000). Two major gene loci, Nax1 on chromosome 2AL and Nax2 on chromosome 5AL, are responsible for this elevated salt tolerance (Munns et al., 2003; Lindsay et al., 2004). Further genetic work revealed that TmHKT7-A2 is the candidate gene for Nax1, and TmHKT1;5-A is the candidate gene for Nax2 (Huang et al., 2006; Byrt et al., 2007; James et al., 2007), suggesting that HKT genes may play a major role in maintaining ionic homeostasis and enhancing salt tolerance in polyploid wheats.
In addition to the importance of ionic homeostasis, recent studies have revealed crucial roles for ROS in maintaining salt tolerance. Salinity is known to trigger rapid accumulation of ROS, which act as important signaling molecules to initiate downstream stress responses (Qi et al., 2018). TaHAG1, a histone acetyltransferase similar to Arabidopsis HAG1/GCN5, mediates this salt-triggered induction of ROS by promoting the expression of Rboh genes, which encode NADPH (reduced nicotinamide adenine dinucleotide phosphate) oxidases (Zheng et al., 2021). Intriguingly, during salt stress, hexaploid bread wheat has markedly higher ROS levels than tetraploid wheats (Zheng et al., 2021). Notably, excessive ROS can disrupt cellular homeostasis, resulting in a salt-sensitive phenotype (Zhu, 2016). To prevent these adverse effects of ROS, wheat evolved the “miR172–IDS1–APXs/CATs/GPXs” signaling pathway to fine-tune ROS levels and facilitate salt tolerance (Cheng et al., 2021). These preliminary findings support the central role of ROS in modulating the adaptability of polyploid wheat to high salinity.
Tolerance to excess boron and aluminum in soil
High soil boron (B) content, which usually occurs in dry environments with alkaline soils, is also a major constraint limiting the cultivation of wheat (Paull et al., 1991). Several major-effect QTLs, including Bo1–Bo4, have been confirmed to improve wheat tolerance to soil boron toxicity. Two of these loci have been cloned: Bo1 (on chromosome 7BL) in bread and durum wheat and Bo4 (on chromosome 4AL) in bread wheat. Bo1 and Bo4 are nearly identical, suggesting that Bo4 is likely a dispersed duplication of the Bo1 locus (Pallotta et al., 2014). Further sequence analysis revealed that Bot-B5 is the causal gene for Bo1 (and for Bo4) and encodes a previously undescribed boron transporter (Pallotta et al., 2014). Sequence variations in the promoter and the coding regions of Bot-B5 are associated with distinct gene expression patterns and protein activities. Among these alleles, Bot-B5b, Bot-(Tp4A)-B5c, and Bot-B5c confer the highest expression levels of Bot-B5 and full function of the Bot-B5 protein. Accordingly, these alleles are essential for tolerance to high boron. The other alleles all compromise wheat tolerance of boron toxicity; their variations include gene fragmentation and deletion (Bot-B5f and Bot-B5g), transcriptional attenuation by repetitive sequence insertion in the promoter region (Bot-B5a, Bot-B5d, and Bot-B5e), and amino acid substitutions causing decreased protein activity (Bot-B5a) (Pallotta et al., 2014). The divergent alleles of Bot-B5 selected in different growth environments suggest a considerable flexibility in the strategies used by wheat to adapt to agro-geographically diverse regions (Pallotta et al., 2014).
In acid soils, soluble aluminum (Al) cations greatly inhibit wheat root establishment, preventing wheat production (Kochian, 2003). Hexaploid bread wheat exhibits significantly higher Al tolerance than its diploid and tetraploid progenitors, suggesting that allohexaploidization contributed to Al tolerance (Cosic et al., 1994; Garvin et al., 2003). Several major-effect QTLs for Al tolerance have been detected in bread wheat on chromosomes 4DL, 4BL, 3BL, and 2A. Among these, the causal gene of the dominant locus Alt1 on chromosome 4DL has been characterized as the aluminum-activated malate transporter gene TaALMT1 (Sasaki et al., 2004; Cai et al., 2008). Elevated expression of TaALMT1, conferred by the insertion of tandemly repeated elements in its promoter, increases malate efflux from wheat roots, resulting in Al tolerance (Sasaki et al., 2004; Ryan et al., 2010). Analysis of allelic variation in the TaALMT1 promoter suggested that: (i) a basal Al tolerance in the hexaploid wheat progenitor Ae. tauschii (DD) predates the appearance of hexaploid wheat; and (ii) an increased Al tolerance caused by transcriptional activation of TaALMT1 specifically occurred in hexaploid wheat (Ryan et al., 2010). Notably, TaALMT1 is also involved in boron tolerance and in the acidification of an alkaline rhizosphere by promoting exudation of malate and gamma-aminobutyric acid. These multiple roles suggest pleiotropic effects of TaALMT1 in shaping wheat adaptability to diverse soil conditions (Kamran et al., 2020; Long et al., 2020). In addition to TaALMT1, another QTL for Al tolerance, designated Xcec, has been mapped to a small region on chromosome 4BL, where a root citrate transporter gene TaMATE (Multidrug and toxic compound extrusion) is located. In support of TaMATE as its causal gene, Xcec conditions Al tolerance through the modulation of citrate efflux.
Overall, the functional distinctiveness of TaALMT1 and TaMATE provides two independent mechanisms for Al tolerance in crop wheat, that is, malate efflux and citrate efflux. Notably, citrate efflux is restricted to a few genotypes from Brazil, such as Carazinho and Toropi, whereas malate efflux contributes to strong Al tolerance in other genotypes, including Atlas 66 and BH1146, suggesting that these two mechanisms might have been selected and utilized independently during wheat evolution (Tang et al., 2002; Ryan et al., 2009).
The importance of the “hidden half” in polyploid wheats
The “hidden half” of wheat, referring to the root system, plays fundamental roles in ensuring nutrient and water uptake efficiency, and thus the adaptability of wheat to diverse growth environments. High-throughput screening of mutant populations, genetic mapping studies, and gene transcriptome profiling have found multiple mutants with altered root architecture, QTLs for wheat root traits, and candidate genes and signaling pathways relating to root development (An et al., 2006; He et al., 2014; Shorinola et al., 2019; Zhuang et al., 2021). These QTLs/genes tend to be related to plant hormone (BR, AUX, and cytokinin) signaling and ROS homeostasis (He et al., 2014; Zhuang et al., 2021). In particular, a novel ethylene response factor (ERF) transcription factor gene, TaSRL1, has recently been shown to be a negative regulator of root length, indicating a potential relationship between root growth and stress responses. Intriguingly, sequence polymorphisms associated with root length and grain yield were exclusively detected in the A-genome-derived TaSRL1, suggesting a functional asymmetry among the three TaSRL1 homoeologs (He et al., 2014; Zhuang et al., 2021). Several characteristics of the wheat root system, including root length, lateral root number, and root hair length, appear to be shaped by genomic asymmetry in polyploid wheats, and have experienced dramatic phenotypic changes during polyploidization. By comparing root hair length in the synthetic allotetraploid wheat (AASS) and its diploid parents (SS and AA), as well as a synthetic allohexaploid wheat (AABBDD) and its parents (AABB and DD), it has been shown that the root hair lengths of allotetraploid and allohexaploid wheats are nearly identical and significantly longer than those of the three diploid parents (AA, SS, and DD), suggesting that longer root hairs first appeared in allotetraploid progenitors and became fixed in tetraploid and hexaploid wheat species (Han et al., 2016). 
Notably, the A- and/or B(S)-genome-derived root hair developmental genes, including TaRHD6 (Root hair defective 6) and TaRSL4 (TaRHD6-like 4), and the cell-wall organization gene TaEXPB1 (Expansin B1) are more highly expressed in allotetraploid and allohexaploid wheats than in their diploid parents (Han et al., 2016, 2017). In contrast to the contribution of the A genome to root hair growth vigor, the lateral root number of wheat is conditioned mainly by the D genome. Allohexaploid wheats have more lateral roots than the diploid (AA and SS) and allotetraploid (AABB) wheats. Consistent with this, the D-genome-derived lateral root initiation gene, TaLBD16-D, shows the highest transcript levels among the three homoeologs (Wang et al., 2018b). These findings exemplify the genomic asymmetry involved in establishing multiple root traits and the fundamental role of genome combination in conferring wide and flexible adaptability in allohexaploid wheats.
LODGING RESISTANCE FOR HIGH YIELD POTENTIAL
Lodging, the bending or breaking of the basal stem internodes (Berry et al., 2003), reduces photosynthetic potential and dry matter accumulation and leads to severe yield losses and increased agronomic costs. Therefore, improving lodging resistance is vital for wheat to achieve maximum yield potential, especially in irrigated environments with high nitrogen fertilizer supply. Multiple agronomic traits, including plant height, culm strength, and root spread, influence crop lodging behavior. Here we focus on genetic factors influencing plant height and culm strength.
Breeding semi-dwarf stature
The “Green Revolution” of the 1960s led to spectacular increases in cereal crop yields through the introduction of semi-dwarfing traits into cereal crops to increase lodging resistance and achieve maximum yield potential (Hedden, 2003). In wheat, the semi-dwarfing trait was introduced mainly via the Reduced height (Rht) gene loci Rht-B1b (formerly Rht1) and Rht-D1b (Rht2) derived from the Japanese variety “Norin 10” (Peng et al., 1999). Rht-B1b and Rht-D1b, representing semi-dominant mutant alleles of the Rht-B1 and Rht-D1 homoeologs, respectively, can reduce plant height by ~25%, and have been used in more than 70% of commercial wheat cultivars worldwide (Youssefian et al., 1992; Duvick, 1999). To date, more than 25 loci have been designated as Rht loci (McIntosh et al., 2003; Peng et al., 2011; Chen et al., 2015; Mo et al., 2018). Depending on whether the resulting dwarfing phenotype could be restored by the exogenous application of bioactive gibberellin (GA), these Rht genes have been classified into GA-sensitive and GA-insensitive groups. The “Green Revolution” genes Rht-B1b/D1b, as well as their various alleles, including Rht-B1c (Rht3), Rht-B1e (Rht11), Rht-B1p (Rht17), and Rht-D1c (Rht10), are in the GA-insensitive group, while other Rht loci, such as Rht8, Rht9, Rht18, Rht23, and Rht24, are in the GA-sensitive group (Worland et al., 1998; Bazhenov et al., 2015; Chen et al., 2015; Zhao et al., 2018; Tang et al., 2021).
Rht-B1 and Rht-D1 encode DELLA proteins, the key repressors of the GA signaling pathway and plant growth (Peng et al., 1998, 1999; Richards et al., 2001). Although there is limited knowledge available about the Rht-B1/D1 protein functions and their downstream signaling responses in wheat, the functions of DELLA proteins have been extensively analyzed in other plant species (Peng et al., 1998; Velde et al., 2017). The N-terminus of DELLA proteins, which contains DELLA and TVHYNP motifs, is essential for GA signaling; in the presence of bioactive GAs, the N-terminus associates with the GA receptor GIBBERELLIN INSENSITIVE DWARF1 (GID1), triggering degradation of the DELLA proteins (Dill et al., 2001; Velde et al., 2021). In the “Green Revolution” Rht-B1b and Rht-D1b alleles, nucleotide substitutions in the DELLA-motif-coding regions lead to translation re-initiation, producing N-terminally truncated DELLA proteins (DELLAΔN) that are more stable (Peng et al., 1999; Bazhenov et al., 2015; Velde et al., 2021). The degradation-resistant DELLAΔN proteins inhibit GA signaling without affecting GA catabolism, thereby conferring a GA-insensitive semi-dwarfing phenotype (Figure 4). Rht-B1c, another Rht-B1 mutant allele derived from the wheat variety “Tom Thumb” (McVittie et al., 1978), harbors an in-frame 90 bp insertion in the coding region, leading to the generation of an N-terminally disrupted, more stable Rht-B1 protein. The dwarfism caused by Rht-B1c is far more severe than that of Rht-B1b, making it unsuitable for semi-dwarfing breeding (Pearce et al., 2011; Wu et al., 2011). Rht-D1c, derived from the Chinese variety “Ai-bian 1,” also leads to extreme dwarfism, because it consists of multiple copies of the dominant semi-dwarfing allele Rht-D1b (Börner et al., 1996; Pearce et al., 2011).
It is worth noting that although at least five semi-dwarfing alleles in Rht-B1 (Rht-B1b, Rht-B1c, Rht-B1d, Rht-B1e, and Rht-B1f) and three mutant alleles in Rht-D1 (Rht-D1b, Rht-D1c, and Rht-D1d) have been characterized, and some of them have been utilized in wheat breeding, no semi-dwarfing allele of Rht-A1 has been defined, suggesting that Rht-A1 is likely subfunctionalized (Börner et al., 1996; Pearce et al., 2011).

The known Reduced height (Rht) genes and their potential roles and genetic interactions in the regulation of plant height in crop wheat
The gibberellin (GA)-insensitive “Green Revolution” genes, Rht-B1b and Rht-D1b, encode truncated, more stable forms of DELLA proteins, which repress GA signaling and cause semi-dwarfism (“GA signaling”). In contrast, the GA-sensitive dwarfing genes, Rht18/Rht24 and Rht12, encode the GA 2-oxidases TaGA2oxA9 and TaGA2oxA13, respectively, which break down bioactive GA in plants. Higher expression of these GA2ox genes reduces the in vivo levels of bioactive GA, leading to semi-dwarf phenotypes (“GA metabolism”). A natural recessive mutation in a previously undefined nuclear protein-coding gene is responsible for the semi-dwarfing effect of Rht8. Rht8 inhibits cell elongation via genetic repression of BR signaling and GA metabolism. Rht23 is allelic to the APETALA2 (AP2) transcription factor gene 5Dq, the homoeolog of the domestication gene 5AQ. Due to the pleiotropic effects of 5Dq as a transcription factor, the Rht23-triggered semi-dwarfism is associated with increased spike compactness.
Unexpectedly, the “Green Revolution” alleles Rht-B1b and Rht-D1b also lead to unfavorable agronomic traits, such as shorter coleoptiles and reduced seedling vigor, hampering plant stand establishment, especially in dryland regions where deep planting is essential (Richards, 1992; Rebetzke et al., 1999; Ellis et al., 2004). In addition, Rht-B1b and Rht-D1b result in reduced kernel size and loss of grain yield (Zhang et al., 2013; Würschum et al., 2017; Guan et al., 2018). To overcome these pleiotropic effects, the GA-sensitive Rht loci have been used as alternative genetic resources for wheat semi-dwarfing breeding. To date, several GA-sensitive semi-dwarfing genes and their effects in wheat breeding have been analyzed (Figure 4).
Rht8 is one of the most important GA-sensitive Rht loci that have been used globally for wheat semi-dwarfing breeding. Rht8 is a recessive locus that shortens wheat plant height by ~10% without affecting coleoptile elongation or grain yield (Rebetzke et al., 1999). Physiological experiments show that the Rht8-triggered semi-dwarfing trait is not affected by exogenous BR, indicating that Rht8 might be involved in BR signaling (Figure 4) (Gasperini et al., 2012). Previous studies mapped Rht8 to a small region on the short arm of chromosome 2D (Gasperini et al., 2012; Chai et al., 2018). Intriguingly, Rht8 and Rht-B1b individually reduced coleoptile length by about 6.75% and 21.64%, respectively, but caused a severe coleoptile phenotype, reducing length by more than 43%, when combined (Grover et al., 2018). These findings suggest a complex genetic interaction between Rht8 and Rht-B1b and, more interestingly, imply that an Rht-B1b-independent signaling pathway conferred by Rht8 regulates plant height.
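To see why the combined effect points to an interaction rather than simple stacking, the reported single-locus reductions can be compared with two common null expectations for non-interacting loci. The short sketch below is illustrative only: the additive and multiplicative null models are assumptions chosen for the comparison and are not the analysis used by Grover et al. (2018), the function name is hypothetical, and the observed value is entered as the lower bound of "more than 43%".

```python
def expected_combined_reduction(r1, r2):
    """Expected fractional reduction when two loci act without interaction.

    additive:       the two reductions simply sum
    multiplicative: each locus removes its fraction of what the other leaves
    """
    additive = r1 + r2
    multiplicative = 1 - (1 - r1) * (1 - r2)
    return additive, multiplicative


rht8 = 0.0675      # reduction attributed to Rht8 alone
rht_b1b = 0.2164   # reduction attributed to Rht-B1b alone
observed = 0.43    # lower bound of the reported combined reduction

additive, multiplicative = expected_combined_reduction(rht8, rht_b1b)
print(f"additive expectation:       {additive:.2%}")
print(f"multiplicative expectation: {multiplicative:.2%}")
print(f"observed (at least):        {observed:.2%}")
print("exceeds both expectations:", observed > max(additive, multiplicative))
```

Under either null model the expected reduction stays below 30%, so a combined reduction above 43% is larger than independent action would predict, consistent with the synergistic interaction described above.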
Rht18, located on chromosome 6A, is another GA-sensitive semi-dwarfing locus, but is dominant. Rht18 was initially identified in the tetraploid durum wheat cultivar “Icaro” (Ford et al., 2018). Recent work identified GA2oxA9, encoding the GA-inactivating enzyme GA 2-oxidase, as the candidate gene for Rht18. GA2oxA9 metabolizes GA12, the GA1 precursor, into inactive products, resulting in a decreased abundance of bioactive GA1 and a semi-dwarf phenotype (Figure 4) (Ford et al., 2018). In line with the dominant effect of Rht18 in reducing plant height, GA2oxA9 is more highly expressed in “Icaro” than in rht18 accessions. Genetic experiments revealed that Rht18 behaves identically to Rht-D1b in reducing plant height (by ~25%), but has no effect on coleoptile elongation. However, the combination of Rht18 and Rht-D1b leads to the reduction of plant height by ~35%, less than that of the combination of Rht-B1b and Rht-D1b (Tang et al., 2021). Notably, Rht18 and Rht24 map to overlapping regions and are also linked to the same marker as Rht14. These findings have led some to propose that Rht18, Rht24, and Rht14 are allelic (Tian et al., 2017; Würschum et al., 2017; Ford et al., 2018; Mo et al., 2018). Indeed, a recent study confirmed that the transcriptional upregulation of GA2oxA9 explains the effect of Rht24 (Tian et al., 2021b). More intriguingly, the semi-dwarfing Rht24b allele reduces plant height and increases nitrogen use efficiency, but has no yield penalty, illustrating a favorable semi-dwarfing gene (Tian et al., 2021b).
Another semi-dwarfing gene, Rht12 on chromosome 5A, was identified in bread wheat following γ-radiation mutagenesis. Rht12 shares key characteristics with Rht18, such as its dominant nature and GA-sensitive classification (Sutka and Kovács, 1987; Ellis et al., 2004). Based on either map-based cloning or MutChromSeq approaches in independent studies, GA2oxA13 (also called GA2oxA14 in a different report) has been shown to be the candidate gene for Rht12, functionally similar to Rht18/GA2oxA9 (Sun et al., 2019; Buss et al., 2020). Transcriptional activation of GA2oxA13/A14 in Rht12-harboring accessions reduces the level of bioactive GA1, causing semi-dwarfism (Buss et al., 2020). Notably, the four GA-sensitive semi-dwarfing alleles found in tetraploid and hexaploid wheats, Rht18, Rht14, Rht24, and Rht12, are all defined as GA inactivation loci and lead to the depletion of bioactive GAs and thus semi-dwarfism (Figure 4). These are completely different from the rice GA-sensitive “Green Revolution” gene sd1, which represents a loss-of-function mutation in the GA biosynthesis gene GA20ox-2 (Sasaki et al., 2002).
Rht23 is a recently reported GA-sensitive Rht locus that was found in an ethyl methane sulfonate (EMS)-induced mutant line NAUH164 in the “Sumai 3” genetic background (Chen et al., 2015). Rht23 is allelic to 5Dq, the D genome homoeolog of the domestication gene 5AQ (Zhao et al., 2018). Because of the pleiotropic effect of q on both plant height and spike morphology, the Rht23-triggered semi-dwarfism is associated with compact spikes. Further sequence analysis revealed a single-nucleotide mutation in 5Dq, that is, a G-to-A substitution at position 3,147 bp, which potentially attenuates the binding of miR172 to 5Dq transcripts, leading to their excess accumulation (Figure 4). Similarly, different single-nucleotide substitutions in this region of either 5AQ or 5Dq all cause reduced plant height and compact spikes, resembling the effect of Rht23 (Debernardi et al., 2017; Greenwood et al., 2017). Unlike other semi-dwarfing genes relating to GA metabolism or signaling, Rht23 tends to have more complicated genome-wide effects on transcription, including that of genes involved in cell wall biosynthesis, lignin synthesis, and photosynthesis (Zhang et al., 2020).
Breeding strong culms for lodging resistance
Improving lodging resistance by reducing plant height can only go so far, because severe dwarfism leads to inadequate biomass accumulation and, ultimately, lower yield potential (Okuno et al., 2014; Ookawa et al., 2016; Nomura et al., 2021). In addition, neither sd1 in rice nor Rht loci in wheat could completely prevent bending-type stem lodging, especially at high planting densities, because these semi-dwarfing genes reduce culm strength and nitrogen-use efficiency (Li et al., 2018b; Wu et al., 2020). Therefore, the design of novel cereal crop varieties with a strong culm (SCM) phenotype may be a further strategy for enhancing lodging resistance.
The SCM trait is determined by multiple factors, for example, culm wall thickness, culm diameter, stem solidness, and the lignin/cellulose content in the stem wall (Pinthus, 1967; Berry et al., 2003; Tripathi et al., 2003; Kong et al., 2013; Okuno et al., 2014; Xiao et al., 2015; Xiang et al., 2016). Based on forward genetic strategies, several SCM-related QTLs have been identified on chromosomes 4A, 5A, and 5B for culm stiffness; 3B for solid stem; 1A, 1B, and 2D for culm wall thickness; 2D for tensile breakage force; and 1A, 3A, 2D, and 4D for culm diameter, among others (Keller et al., 1999; Cook et al., 2004; Hai et al., 2005; Verma et al., 2005; Okumoto, 2011; Berry and Berry, 2015; Hyles et al., 2017; Nilsen et al., 2017). However, few of these loci have been cloned by map-based cloning, except for the 3B locus for stem solidness and the 1A locus for culm wall thickness (Hyles et al., 2017; Nilsen et al., 2020).
The 3B stem solidness locus, conferring a solid culm with a thick sclerenchyma layer and the SCM trait, has been designated as SSt1 in durum wheat and Qss.msub-3BL in bread wheat (Cook et al., 2004; Nilsen et al., 2017). A putative Dof zinc finger protein coding gene, TdDof, was cloned as the SSt1 candidate gene (Nilsen et al., 2020). Copy number variation in the TdDof gene region leads to the duplication of this gene, and thus its activation, in solid-stemmed wheat lines. Excess accumulation of TdDof transcripts attenuates programmed cell death (PCD) in pith parenchyma cells, predominantly through the transcriptional repression of NAC (NAM/ATAF1/2/CUC2) and cysteine proteinase (CEP) genes (Nilsen et al., 2020). This finding not only demonstrated the existence of a novel “Dof–NAC/CEP–PCD” signaling pathway for the regulation of stem solidness, but also provided a selection strategy for designing solid-stemmed wheat cultivars through transgenic breeding. The tin locus on chromosome 1A, which has pleiotropic effects on tiller number, spike architecture, and grain weight, is also associated with stem thickness (Kebrom et al., 2012). Map-based cloning identified Csl as the candidate gene for the tin locus; this gene encodes a cellulose-synthase-like protein with homology to members of the CslA clade (Hyles et al., 2017). Changing the level of Csl transcripts, and consequently, Csl protein, may alter carbon partitioning in wheat culm tissues, leading to increased lignification and stronger stems, which in turn would improve lodging resistance (Hyles et al., 2017).
The minor effects of most of the other SCM-related QTLs in wheat make them very challenging to characterize. Our current strategy, combining EMS-based mutagenesis with map-based cloning, has enabled the isolation of major-effect SCM-regulating genes. Some of these genes encode key regulators of BR and AUX signaling, and others encode cytoskeleton regulatory factors (unpublished data). More importantly, the SCM candidate genes identified using this strategy, as well as the scm mutant materials obtained, could be readily utilized in wheat breeding and agronomic practices (Figures 5, S2).

Schematic workflow of the forward genetics screen for wheat mutants with the strong culm (SCM) trait
(A) A workflow representing the screen for scm mutants based on phenotypic selection. Wheat cultivars with high yield and/or favorable grain quality but poor culm quality were selected as starting materials (e.g., elite cultivar A). Following ethyl methane sulfonate (EMS) mutagenesis, a screen for scm mutants with improved culm quality (SCM trait) was conducted using field-grown individuals or populations in the M2 to M4 generations based on phenotypes relating to culm strength, grain size, and grain quality. (B) Growth phenotypes of a strong culm (scm1) mutant line showing stronger culms and improved culm quality. Left, morphologies of individual plants of the elite wheat cultivar “Shannong22” (SN22) and its derived scm1 EMS-induced mutant. Right, field-grown populations of SN22 and scm1.
In rice, several SCM QTLs have been isolated by map-based cloning, providing gene resources for SCM design in wheat (Minakuchi et al., 2010; Kashiwagi, 2014; Yano et al., 2015; Nomura et al., 2019; Samadi et al., 2019; Chigira et al., 2020; Nomura et al., 2021). Rice SCM2 is allelic to Aberrant panicle organization 1 (APO1), an F-box protein-coding gene controlling meristem fate identity and panicle structure (Ikeda et al., 2007; Ookawa et al., 2010). Notably, gain-of-function alleles of APO1 conferred not only enhanced culm strength and lodging resistance, but also increased spikelet number, suggesting that APO1 would be a favorable target in engineering lodging-resistant and high-yield crop varieties (Ookawa et al., 2010). Another SCM locus, SCM3, is identical to Fine culm 1 (FC1), the rice ortholog of maize Teosinte branched 1 (TB1), which encodes a TCP transcription factor and is involved in the control of tiller number (Takeda et al., 2003; Yano et al., 2015). Activation of OsTB1/FC1 leads to stronger culms with thicker culm walls, increased spikelet number, and decreased tiller number in rice, and similar effects are also observed in wheat (Yano et al., 2015; Dixon et al., 2018). Increased expression of TB1 in wheat reduces plant height by ~10% without decreasing seedling emergence, largely resembling the effect of the semi-dwarfing locus Rht8, thus providing an alternative to Rht alleles in wheat breeding (Dixon et al., 2020). Considering that both APO1 and TB1 enhance culm strength and promote spikelet number, these SCM genes would be valuable genetic factors to balance the trade-off between lodging resistance and yield potential in wheat breeding, resulting in novel SCM wheat varieties with stronger culms, enhanced lodging resistance, and higher grain yield (Yano et al., 2015; Nomura et al., 2019).
SHAPING YIELD TRAITS: TILLERS, SPIKES, AND GRAINS
In wheat, the final grain yield is determined by three major components: spike number per unit area, grain number per spike, and grain weight (Koppolu and Schnurbusch, 2019). Under selection pressure to increase grain yield, wheat has acquired multiple QTLs, genes, and signaling pathways that guarantee the improvement of these yield traits, as has been shown by a recent GWAS (Pang et al., 2020).
Tillering
Tillering, a form of branching that involves the emergence of lateral shoots from axillary buds on the unelongated basal internode (Wang et al., 2018a), is a major factor determining the productivity of cereal crops. An optimal number of tillers per unit area is essential for achieving the highest yield potential for crop wheat. Extensive studies in rice have characterized multiple genetic factors controlling tillering. For example, MONOCULM 1 (MOC1), encoding a GRAS family nuclear protein that promotes the outgrowth of axillary buds (Li et al., 2003), and MOC3, the rice ortholog of WUSCHEL (WUS) (Lu et al., 2015), are positive regulators of tillering; FC1, together with TAD1 and TE, which separately encode the co-activator and substrate-recognition factor of the anaphase-promoting complex (APC/C) for the degradation of MOC1 (Lin et al., 2012; Xu et al., 2012), are all negative regulators that lead to decreased tiller number. In particular, SPL14 (SQUAMOSA promoter binding protein-like 14), also known as Ideal Plant Architecture 1 (IPA1), is dominant in repressing tillers and shaping ideal plant architecture in rice (Jiao et al., 2010). To date, the wheat orthologs of IPA1 and TB1 have been isolated based on homology cloning, and their contributions to wheat tillering have also been extensively analyzed. The wheat gene SPL17 (TaSPL17) is the ortholog of rice IPA1 and, like IPA1, is targeted and functionally repressed by miRNA156 (miR156). Interestingly, over-accumulation of miR156 triggers functional attenuation of TaSPL17 and another SPL gene, TaSPL3. As a consequence, there was a dramatic increase in tiller number, converting the wheat plants to a bushy architecture (Liu et al., 2017). These findings suggest an essential role for TaSPL17 and TaSPL3 in maintaining a suitable tiller number.
More intriguingly, overexpression of wheat SPL20 and SPL21 significantly promoted grain size without affecting normal tiller formation (Zhang et al., 2017a), suggesting that SPLs are ideal targets for simultaneously manipulating plant architecture and grain size. Tillering is also repressed by TB1, a function genetically validated by characterizing a highly-branched (hb) line derived from a four-parent multiparent advanced generation intercross population. In the hb line, an increased dosage of the TB-D1 gene causes significantly reduced tiller number (Dixon et al., 2018). Notably, in crop wheat, TB1 is also involved in the control of plant height and spike architecture (Dixon et al., 2018, 2020), suggesting a new target for concomitantly engineering spike morphology and plant architecture in wheat.
Spike development
The spike in tribe Triticeae is unbranched, with spikelets attached directly to the inflorescence axis, known as the spike rachis (Koppolu and Schnurbusch, 2019). Normally, a single rachis node bears only one spikelet, and each spikelet includes up to ~12 florets, of which only three to five florets set grains (Sakuma et al., 2019). Therefore, the grain number per spike is determined by two subcomponents: fertile spikelet number per spike (SNS) and fertile florets per spikelet.
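The decomposition described here, together with the three yield components named at the start of this section, can be written as a simple multiplicative relationship. The sketch below is illustrative only: the class and field names are hypothetical, the component values are made-up round numbers rather than measured data, and grain weight is expressed as thousand-grain weight as a convenience.

```python
from dataclasses import dataclass


@dataclass
class YieldComponents:
    """Hypothetical container for the yield components named in the text."""

    spikes_per_m2: float            # spike number per unit area
    spikelets_per_spike: float      # fertile spikelet number per spike (SNS)
    grains_per_spikelet: float      # fertile florets per spikelet
    thousand_grain_weight_g: float  # grain weight, as thousand-grain weight

    def grains_per_spike(self) -> float:
        # Grain number per spike is the product of its two subcomponents
        return self.spikelets_per_spike * self.grains_per_spikelet

    def grain_yield_g_per_m2(self) -> float:
        # Yield = spikes per area x grains per spike x weight of one grain
        single_grain_weight_g = self.thousand_grain_weight_g / 1000.0
        return self.spikes_per_m2 * self.grains_per_spike() * single_grain_weight_g


# Illustrative values only (not taken from any cited study)
example = YieldComponents(
    spikes_per_m2=400,
    spikelets_per_spike=18,
    grains_per_spikelet=3,
    thousand_grain_weight_g=40,
)
print("grains per spike:", example.grains_per_spike())
print("grain yield (g per m2):", round(example.grain_yield_g_per_m2(), 1))
```

Because the components multiply, a gain in one is offset by a proportional loss in another, which is one simple way to picture the trade-offs between grain number and grain size discussed later in this section.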
Several genetic factors have been characterized as positive or negative regulators of SNS. One is APO1, the role of which in inflorescence determination was initially characterized in rice. The rice apo1 mutant shows abnormal floral organ identity, suggesting that APO1 may repress precocious conversion of inflorescence meristems to spikelet meristems, thus acting as a positive regulator of spikelet formation (Ikeda et al., 2007). Recently, independent genetic mapping studies in wheat have revealed a stable QTL on chromosome 7AL that contributes to increased SNS. Further evidence suggested that several natural variations in both the promoter and coding regions of APO-7A might explain the effect of this QTL (Kuzay et al., 2019; Muqaddasi et al., 2019; Voss-Fels et al., 2019).
The maize barren stalk1 (ba1) mutant fails to produce branches in male and female inflorescences (tassels and ears). Genetic cloning of the causal gene for this phenotype enabled the isolation of the Barren stalk 1 (BA1) gene, which encodes a non-canonical basic helix-loop-helix (bHLH) transcription factor (Gallavotti et al., 2004). The function of BA1 in the initiation of spike lateral meristems and the formation of spikelets is highly conserved in bread wheat. More intriguingly, downregulation of wheat SPL3 and SPL17 considerably attenuates the expression of BA1, further leading to a dramatic decrease in SNS, suggesting that SPL3 and SPL17 are two essential activators of BA1 expression (Liu et al., 2017).
Some endemic wheat accessions have more than one spikelet at a single rachis node; this unusual spike morphology is defined as the supernumerary spikelet (SS) trait (Zhang and Yuan, 2014; Dobrovolskaya et al., 2015). SS leads to increased SNS, illustrating the potential of this trait for improving wheat grain number and final yield. Paired spikelet, characterized by the emergence of a second spikelet below and adjacent to the typical spikelet, is the most commonly observed SS phenotype in wheat. Several studies have suggested a role for PPD1 and TB1 in the regulation of paired spikelet formation (Boden et al., 2015; Dixon et al., 2018). A high frequency of paired spikelets is observed exclusively in plants with the photoperiod-sensitive wild-type allele PPD-D1b; no paired spikelets are found in lines harboring the gain-of-function photoperiod-insensitive PPD1 allele, suggesting that PPD1 normally acts as an inhibitor of paired spikelet development (Boden et al., 2015). Further evidence suggested that FT is one of the downstream genes of PPD1 that confer this phenotypic variation. TB1, by contrast, is a positive regulator of paired spikelet formation. An increased dosage of TB-D1 triggers abnormal spike architecture with multiple paired spikelets, mainly through interaction of TB-D1 protein with FT and the resulting transcriptional repression of downstream meristem identity genes (Dixon et al., 2018). These findings illustrate a central role of FT in paired spikelet determination, since both PPD1 and TB1 condition paired spikelet formation through FT.
The other SS phenotypes, such as multirow spikes (MRS) and branched spikes (BS), represent more complex wheat spike architectures (Dobrovolskaya et al., 2015). A recent study identified the wheat ortholog of the rice FRIZZY PANICLE (FZP) gene, WFZP, as a causal gene for the triple-spikelet (TRS, a typical form of MRS) trait observed in the Tibetan wheat variety Zang734. A null mutation in WFZP-A coupled with the deletion of WFZP-D shaped this TRS trait (Du et al., 2021). Parallel studies showed that different mutant alleles of WFZP and different genetic backgrounds could lead to largely divergent SS traits, suggesting that the effect of WFZP on the SS trait is complex and depends on genetic background and/or environmental conditions (Dobrovolskaya et al., 2015; Li et al., 2021).
In contrast to the well-characterized genes for the determination of spikelet numbers in wheat, little is known about the genetic determinants of floret fertility (grain setting) in a given spikelet. A recent study opened up a new avenue for research by cloning a key regulator of floret fertility, Grain number increase 1 (GNI1) (Sakuma et al., 2019). A single amino acid substitution (105 Asn-to-Tyr; N105Y) in the A-genome-derived GNI1 protein, GNI-A1, attenuates its activity and leads to increased grain number per spikelet, explaining the effect of a previously characterized QTL for grain number per spikelet on chromosome 2AL (Sakuma et al., 2019). Intriguingly, GNI1 is an ortholog of barley VRS1 (Vulgare row-type spike 1, also known as HvHOX1), which encodes a homeodomain leucine zipper class I (HD-Zip I) transcription factor. It is worth noting that in barley, VRS1 acts as an inhibitor of lateral spikelet fertility and promotes the “two-rowed” state of the barley spike (Komatsuda et al., 2007), while in wheat, GNI1 is likely involved in the regulation of floret fertility, but not spikelet determination. These functional discrepancies between VRS1 and GNI1 might be attributed to differences in spikelet determinacy between barley and wheat (Koppolu and Schnurbusch, 2019). It has also been reported that photoperiod influences floret fertility in wheat, possibly through PPD1, but the underlying mechanism is not known (Prieto et al., 2018).
Grain size
The grain is the final product of wheat and, as such, directly determines the yield of wheat production. Increased grain size during wheat domestication and polyploidization is one of the most important traits that guaranteed the success of wheat as a major crop worldwide (Dubcovsky and Dvorak, 2007). However, current knowledge about the genetic determinants of wheat grain size is limited because grain development is a highly polygenic trait that is influenced not only by numerous minor-effect QTLs, but also by environmental context (Li and Yang, 2017; Brinton and Uauy, 2019). To date, only a few major-effect genes have been functionally characterized as acting in the determination of wheat grain size.
Grain weight 2 (GW2), encoding a RING-type E3 ubiquitin ligase, was initially characterized in rice to be a negative regulator of grain size. Loss of GW2 confers increased cell numbers in the spikelet hull, resulting in larger grain size and increased grain weight (Song et al., 2007). In wheat, recognition of the effect of GW2 on grain size and weight was facilitated by the detection of a QTL for grain weight and yield located in the centromeric region of chromosome 6A, which includes the GW-A2 gene (Su et al., 2011; Simmonds et al., 2014, 2016; Sukumaran et al., 2018; Zhai et al., 2018). The effects of the GW2 homoeologs (from the A, B, and D genomes) as well as their functional interactions were further validated through clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 gene editing. Simultaneous knockout of two GW2 homoeologs (GW2-B and GW2-D, for example) produces larger grains than the single mutants, suggesting functional redundancy and additive interactions among GW2 homoeologs (Zhang et al., 2018). However, the loss of GW2 also leads to decreased grain number (Sukumaran et al., 2018; Zhai et al., 2018). This may be due to the well-known trade-off between grain size and number. Thus, it is still challenging to balance the trade-offs not only between grain size and number, but also among other complex yield components in wheat breeding. In addition, a recent GWAS revealed QTLs for grain size in different wheat accessions, among which one, qGW7B.1, was found to include TaSus1-7B, a sucrose synthase gene that has undergone significant selection during wheat breeding (Hou et al., 2014; Pang et al., 2020). These findings indicate the crucial role of endosperm starch synthesis in the determination of wheat grain size.
Notably, some species-forming genes, for example VRT-A2 and Tasg-D1, also affect the development of grains, illustrating their pleiotropic effects. Ectopic expression of VRT-A2 in Polish wheat promotes grain elongation without affecting grain width, resulting in increased grain size and weight (Backhaus et al., 2021; Liu et al., 2021; Xiao et al., 2021). In contrast, gain-of-function of Tasg-D1 in Indian dwarf wheat leads to small, round grains by repressing BR signaling, suggesting that Tasg-D1, in contrast to VRT-A2, restricts grain development (Cheng et al., 2020). Despite its negative effect on grain size, Tasg-D1 is still expected to have potential use in high-yield wheat breeding because it can shape a more compact plant architecture with erect leaves and decreased plant height, which is favorable for dense planting.
Based on the screening of a wheat EMS mutant library, a tgw1 mutant line with considerably smaller grains was recently identified. Molecular cloning further confirmed a premature stop mutation in the KAT-2B (Keto-acyl thiolase 2B) gene, explaining the small-grain phenotype of tgw1 (Chen et al., 2020). KAT-2B affects wheat grain size through the antagonistic manipulation of in vivo JA and ABA levels. More importantly, overexpression of KAT-2B led to increased grain weight and boosted yield, implying a potential application of KAT-2B for future high-yield molecular breeding in wheat (Chen et al., 2020).
GRAIN STORAGE PROTEIN CONTENT AND END-USE QUALITY
Wheat quality is defined by its diverse end uses. Wheat flour is used to produce a wide range of products, such as leavened and unleavened breads, steamed breads, noodles, pancakes, cakes, and cookies, each requiring different end-use qualities (Veraverbeke and Delcour, 2002). The end-use value of wheat flour mainly depends on the quantities and types of the seed storage proteins (SSPs).
Grain protein content (GPC) is one of the most important quality traits in wheat cultivar development, determining the nutritional value and end-use quality of wheat flour. GPC is a complex quantitative trait controlled by multiple genomic loci, and many QTLs distributed across all wheat chromosomes have been identified by linkage mapping based on bi-parental populations and by association mapping based on a core collection of germplasm (Quraishi et al., 2017; Nigro et al., 2019). One QTL on the short arm of chromosome 6B from the wild tetraploid wheat T. dicoccoides contributed up to 66% of the GPC variation. This QTL/gene encodes a NAC transcription factor (NAM-B1) that increases nutrient remobilization from leaves to developing grains (Uauy et al., 2006). In addition, GPC is highly dependent on nitrogen fertilization; thus, genes playing key roles in N uptake from the soil, amino acid metabolism, and allocation of N to grain proteins should be considered as good candidates for increasing GPC. Such genes include those encoding high-affinity nitrate transporters (NRT2) and the NIN-like protein involved in the regulation of nitrate assimilation.
In addition to GPC, the properties of SSPs, especially the gluten protein content, are generally regarded as important factors determining the end-use value of wheat flour. Gluten, accounting for approximately 75%–85% of total grain protein, is composed of polymeric glutenins, which provide dough elasticity as well as dough strength, and monomeric gliadins, which provide dough viscosity and extensibility (Payne et al., 1987; Biesiekierski, 2017). Glutenins are subdivided into high-molecular-weight glutenin subunits (HMW-GSs) and low-molecular-weight glutenin subunits (LMW-GSs). Although HMW-GSs account for only approximately 10% of gluten proteins, the variation in their composition explains up to 70% of genetic variation for bread processing quality (Payne et al., 1987; Eagles et al., 2002; Liu et al., 2005). Genetic studies of HMW-GSs have firmly established that the Glu-1 loci that encode them are located on the long arms of the homoeologous chromosomes 1A, 1B, and 1D, where each locus consists of two closely linked genes, encoding a larger x-type subunit with a molecular weight of 80–88 kDa and a smaller y-type subunit with a molecular weight of 67–73 kDa (Galili and Feldman, 1985). Extensive HMW-GS allelic variation has been observed among wheat cultivars and lines. In general, the Glu-1B and Glu-1D loci display more allelic variation than Glu-1A. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis of HMW-GSs revealed three Glu-1A alleles, 11 Glu-1B alleles, and seven Glu-1D alleles. With the sequencing of the wheat genome and resequencing of wheat cultivars, more than 100 allelic variations at the Glu-1 loci have been identified (unpublished data). Over the past two decades, utilization of HMW-GS allelic variations has been an attractive and efficient strategy for improving wheat bread-making quality. The Glu-1D locus contributes more significantly to dough and bread-making properties than the Glu-1B and Glu-1A loci.
In particular, the Glu-1D locus encoding 1Dx5 and 1Dy10 was reported to be associated with better bread-making qualities. Meanwhile, the contributions of each HMW-GS to end-use qualities were ranked in order as follows: 1Dx5 + 1Dy10 > 1Bx17 + 1By18 > 1Ax1 + Null (Li et al., 2020b). Compared to the other HMW-GSs, 1Dx5 contains more cysteine residues, which are key determinants of the formation of glutenin polymers and the resulting rheological parameters of the dough (Lutz et al., 2012; Li et al., 2016). It is worth noting that one allele, 1Bx7OE, in which gene duplication at the Glu-B1 locus mediated by insertion of a retroelement leads to overexpression of the Bx7 subunits, is highly associated with improved dough strength and has been found in many cultivars and landraces (Marchylo et al., 1992; D'Ovidio et al., 1997; Ragupathy et al., 2008).
The expression of SSP genes in the endosperm of developing wheat grains is primarily regulated at the transcriptional level through interactions between cis-acting DNA motifs and trans-acting factors. To date, the conserved cis-regulatory modules and a number of transcription factors involved in SSP gene regulation have been characterized (Figure 6). Storage protein activator (TaSPA), a basic leucine zipper TF, has been shown to bind the GCN4-like motif to activate SSP gene expression (Albani et al., 1997; Ravel et al., 2014), while a SPA heterodimerizing protein (SHP) was found to act as a suppressor of SSP expression (Boudet et al., 2019). Another TaSPA interacting protein, the B3-superfamily TF TaFUSCA3, activates HMW-GS gene expression by binding the RY-box motif (Sun et al., 2017). The DOF-type Prolamin-box binding factor (PBF) binds prolamin-box (P-box) motifs in the promoter regions of α-gliadin and LMW-GS genes in wheat (Dong et al., 2007), leading to promoter hypomethylation (Zhu et al., 2018). Recently, it was found that a molecular module consisting of TaGAMyb, the GCN5-like histone acetyltransferase HAG1, and TaNAC019 activates HMW-GS genes by binding to their promoters and regulating chromatin modification (Guo et al., 2015). TaNAC019-BI was identified as an elite allele for flour processing quality and will be a candidate gene for breeding wheat with improved quality (Gao et al., 2021). It is also interesting that the Glu-1Ay genes in hexaploid wheat are frequently defective, with defects in the coding regions, and likely to produce truncated proteins (Sun et al., 2004; Jiang et al., 2009). Thus, the utilization of fully functional Glu-1Ay genes might be a good strategy in breeding for improved wheat flour quality.
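As a rough illustration of how such cis-regulatory modules can be located in a promoter, the sketch below scans a sequence for the three motif classes mentioned above on both strands. The core sequences used (TGAGTCA for the GCN4-like motif, TGTAAAG for the P-box, CATGCA for the RY-box) are commonly cited consensus cores rather than values given in this text, so they should be treated as assumptions; the promoter string is a made-up toy sequence, and all function names are hypothetical.

```python
# Minimal sketch: exact-match scan for motif cores on both strands.
# Motif cores are assumed consensus sequences, not taken from this article.
MOTIF_CORES = {
    "GCN4-like motif": "TGAGTCA",
    "P-box": "TGTAAAG",
    "RY-box": "CATGCA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(promoter: str) -> dict:
    """Map each motif name to a sorted list of (0-based start, strand) hits."""
    promoter = promoter.upper()
    hits = {}
    for name, core in MOTIF_CORES.items():
        positions = []
        for pattern, strand in ((core, "+"), (reverse_complement(core), "-")):
            start = promoter.find(pattern)
            while start != -1:
                positions.append((start, strand))
                # Step by one so overlapping occurrences are also reported
                start = promoter.find(pattern, start + 1)
        hits[name] = sorted(positions)
    return hits


# Toy promoter fragment (hypothetical, for demonstration only)
toy_promoter = "GGTGTAAAGCCATGAGTCATTCATGCATGCC"
motif_hits = find_motifs(toy_promoter)
for name, positions in motif_hits.items():
    print(name, positions)
```

Real promoter analyses usually rely on position weight matrices and experimentally validated binding sites rather than exact string matches, so this is a starting point for exploration, not a substitute for those methods.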

Transcriptional regulation of the SSP gene through the interactions among cis-acting motifs and trans-acting factors
Multiple protein components directly or indirectly bind to cis-regulatory elements in the promoter region of SSP to fine-tune its expression.
In bread wheat, a higher amount of LMW-GSs relative to HMW-GSs in the grain and certain LMW-GS compositions are associated with dough resistance and extensibility (Dong et al., 2010). Gliadin content is also associated with dough extensibility (Shewry et al., 2002). It was reported that the quantitative effect of individual HMW-GSs (Ragupathy et al., 2008; Li et al., 2015; Cho et al., 2017), as well as higher HMW-GS/LMW-GS and HMW-GS/gliadin ratios, are relevant for dough functionality (Veraverbeke and Delcour, 2002; Rasheed et al., 2014).
FUTURE PERSPECTIVES
Understanding the functions of genes in staple crops will accelerate crop improvement by allowing targeted breeding approaches. In recent years, there has been a dramatic expansion in the resources available to carry out functional genomics in wheat, largely based on improvements in the available reference sequences, large-scale expression data, and diverse mutants. In addition, the pan-genome of wheat also provides high-quality genome sequences for multiple varieties of wheat. The changing landscape of wheat genomics offers unprecedented opportunities for researchers, but also poses new challenges for gene discovery and functional characterization in polyploid wheat (Adamski et al., 2020).
First, hexaploid wheats have large genomes (16 Gbp), which consist mostly (>85%) of repetitive elements. Notably, genome-specific rearrangements and genomic introgression from a range of relatives significantly shaped the genetic diversity present in bread wheat cultivars both historically and in modern elite breeding lines, which appears likely to become one major obstacle for functional genomics analysis in wheat (Przewieslik-Allen et al., 2021). There is an urgent need to develop new, high-throughput, and time-efficient tools and methods for reducing the genome complexity and overcoming the limitations of traditional map-based cloning strategies, such as the need for large segregating populations and the low rate of genetic recombination. Recent advances in high-throughput sequencing, genotyping, and gene mapping pipelines, such as MutRenSeq (mutagenesis with resistance gene enrichment sequencing), AgRenSeq (association genetics with resistance gene enrichment sequencing), MRASeq (multiplex restriction amplicon sequencing), and QTG-Seq (quantitative trait gene sequencing), have enabled the establishment of a next-generation marker platform, along with the rapid cloning of important R genes and some developmental QTLs (Steuernagel et al., 2016; Arora et al., 2019; Zhang et al., 2019a; Bernardo et al., 2020). These advances will facilitate rapid and widespread gene cloning in polyploid wheats.
Second, bread wheats are polyploid species that carry multiple copies of each gene, presenting challenges for genetic manipulation. However, the presence of homoeologous copies for many wheat genes also offers opportunities to fine-tune phenotypes by exploiting a range of interactions between homoeologous genes, including functional redundancy between homoeologs, dosage effects of homoeologs, homoeolog dominance, and specialized interactions of homoeologs. Thus, understanding the relationships among homoeologs will be helpful for improving quantitative traits in polyploid wheat, because breeders can use allelic variation in each homoeolog separately or in combination to modulate the phenotype (Borrill et al., 2015).
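To illustrate the idea of using homoeologs separately or in combination, the sketch below enumerates every combination of variant alleles across the A, B, and D homoeologs of a single gene and scores each under a purely additive model. The per-homoeolog effect sizes, the baseline value, and all names are hypothetical placeholders; real homoeolog interactions may involve dominance or dosage effects that this simple model does not capture.

```python
from itertools import product

HOMOEOLOGS = ("A", "B", "D")

# Hypothetical effect of carrying the variant allele at each homoeolog
# (arbitrary trait units; not measured values)
HYPOTHETICAL_EFFECT = {"A": 1.0, "B": 1.5, "D": 2.0}


def additive_trait_value(variant_copies, baseline=10.0):
    """Trait value under a purely additive model (an assumption)."""
    return baseline + sum(HYPOTHETICAL_EFFECT[h] for h in variant_copies)


def enumerate_combinations():
    """List all 2**3 homoeolog combinations, sorted by predicted trait value."""
    table = []
    for flags in product((False, True), repeat=len(HOMOEOLOGS)):
        variant_copies = tuple(h for h, on in zip(HOMOEOLOGS, flags) if on)
        table.append((variant_copies, additive_trait_value(variant_copies)))
    return sorted(table, key=lambda row: row[1])


combinations = enumerate_combinations()
for copies, value in combinations:
    label = "+".join(copies) if copies else "none"
    print(f"{label:>6}: {value:.1f}")
```

Even with a single gene, three homoeologs yield eight allelic combinations, which gives breeders a graded series of phenotypes rather than a simple on/off switch.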
Third, many agronomic traits in wheat are extremely complex quantitative phenotypes determined by multiple QTLs/genes, such that variation at a single gene locus would not be sufficient to cause substantial phenotypic changes. Although many QTLs for many traits have been detected on almost all chromosomes of wheat, very few have been cloned to date (Nezhadahmadi et al., 2013; Tricker et al., 2018; Sallam et al., 2019). Most of these QTLs are minor-effect loci which are greatly influenced by “genotype–environment” interactions, making it extremely difficult to perform phenotyping assays for further QTL mapping. Further efforts are still needed to identify the causal genes for these QTLs, which could further our understanding of the genetic architecture of complex agronomic traits in wheat and, more importantly, enable us to design modern wheat varieties with large spikes, compact architecture, SCMs, high harvest index, improved stress tolerance, and expanded potential for different end-uses (Figure 7).

Figure 7. Genetic architecture shaping the origin and agronomic improvement of polyploid crop wheats
Selection of natural mutations in the BTR1, Sog, Tg, and q gene loci enabled the domestication of diploid and polyploid wheats and their conversion from wild to cultivated growth habits. Some of the major-effect gene loci also contributed to the formation of wheat species/subspecies, including VRT-A2, sg-D1, Qt, and C, which facilitated the origin of Triticum polonicum/T. petropavlovskyi, T. sphaerococcum, T. aestivum ssp. tibetanum Shao, and T. compactum, respectively. Further selection of vernalization genes (VRN1, VRN2, and VRN3) and photoperiod-responsive genes (PHYC, PPD1, and CO) gave rise to the distinct flowering habits of winter and spring wheats. The stress adaptability of polyploid wheat was also improved by the selection of large sets of stress-responsive gene loci, such as HKT1;5, Bot-B5, TaMATE, and TaRSL4, along with many minor-effect quantitative trait loci that remain to be cloned. The resulting increased flexibility of flowering time and greatly improved stress tolerance (together with enhanced disease resistance) made it possible to cultivate polyploid wheat worldwide. Significantly, the introduction of the semi-dwarfing genes Rht-B1b and Rht-D1b from the 1960s onward resulted in a “Green Revolution” that considerably enhanced the yield potential of polyploid wheat. Other important semi-dwarfing genes that have been used in wheat breeding, such as Rht8 and Rht9, remain to be isolated. The improvement of complex agronomic traits and the design of modern wheat varieties are ongoing, and may be achieved or facilitated by the use of several sets of genetic resources, including GNI1, APO1, and PPD1 for spike development, TB1 and SPLs for tillering, and NAM-B1, GW2, and sg-D1 for favorable grain quality traits. Future directions for breeding ideal wheat varieties are still under debate. However, a possible scheme can be proposed on the basis that modern wheat varieties should have high yield potential, be adapted for whole-process mechanization, be tolerant to biotic and abiotic stresses, and exhibit high grain quality. Thus, an ideal future wheat variety may exhibit large spikes, compact architecture, strong culms, improved photosynthetic efficiency, high harvest index, and high grain quality for different end uses.
ACKNOWLEDGEMENTS
We thank Yongming Chen and Changfeng Yang for their assistance in preparing figures. This work was supported by the National Natural Science Foundation of China (31991214, 91935304, 32072055, and 91935302). We apologize to colleagues whose work could not be cited due to space limitations.
AUTHOR CONTRIBUTIONS
Q.S., Z.N., and J.L. conceived the project and designed the review; Z.N., J.L., and Y.Y. wrote the paper; M.X. and H.P. provided information for preparing the manuscript. All authors reviewed and approved the manuscript.