Identification and expression analysis of Phosphate Transporter 1 (PHT1) genes in the highly phosphorus-use-efficient Hakea prostrata (Proteaceae)
Abstract
Heavy and costly use of phosphorus (P) fertiliser is often needed to achieve high crop yields, but only a small amount of applied P fertiliser is available to most crop plants. Hakea prostrata (Proteaceae) is endemic to the P-impoverished landscape of southwest Australia and has several P-saving traits. We identified 16 members of the Phosphate Transporter 1 (PHT1) gene family (HpPHT1;1-HpPHT1;12d) in a long-read genome assembly of H. prostrata. Based on phylogenetics, sequence structure and expression patterns, we classified HpPHT1;1 as potentially involved in Pi uptake from soil and HpPHT1;8 and HpPHT1;9 as potentially involved in Pi uptake and root-to-shoot translocation. Three genes, HpPHT1;4, HpPHT1;6 and HpPHT1;8, lacked regulatory PHR1-binding sites (P1BS) in the promoter regions. Available expression data for HpPHT1;6 and HpPHT1;8 indicated they are not responsive to changes in P supply, potentially contributing to the high P sensitivity of H. prostrata. We also discovered a Proteaceae-specific clade of closely-spaced PHT1 genes that lacked conserved genetic architecture among genera, indicating an evolutionary hot spot within the genome. Overall, the genome assembly of H. prostrata provides a much-needed foundation for understanding the genetic mechanisms of novel adaptations to low P soils in southwest Australian plants.
1 INTRODUCTION
Phosphorus fertiliser is heavily used in agricultural systems to increase crop yield, but only 20%–30% of applied P fertiliser is taken up by crop plants (López-Arredondo et al., 2014). Plants take up P as inorganic phosphate (Pi), a form of P that is non-renewable, readily immobilised in soil and strongly competed for by soil microbes (Vitousek et al., 2010; Wang et al., 2017). Excess P fertiliser often leaches into the soil or is lost via surface run-off, causing damage to nearby vegetation and waterways (Conley et al., 2009).
The potential to reduce P fertiliser overuse through crop varieties with high P-use-efficiency (PUE) traits has been widely studied. In rice (Oryza sativa), the Phosphorus Starvation Tolerance 1 (PSTOL1) gene in the Kasalath rice variety was discovered to greatly increase tolerance to P-poor conditions (Chin et al., 2011; Gamuyao et al., 2012). However, another area of growing interest is research on the genetic basis for high PUE traits in non-crop plants such as Hakea prostrata (Proteaceae) (Lambers, Finnegan, et al., 2015). H. prostrata is endemic to southwest Australia, a landscape that has become extremely P-impoverished over millions of years (Laliberté et al., 2012; Lambers et al., 2011). H. prostrata has several high PUE traits, supporting fast rates of photosynthesis at a very low leaf P concentration (Lambers, Cawthray, et al., 2012; Lambers, 2022). These traits include a low abundance of ribosomal RNA (rRNA), which is a major pool of organic P in plants (Sulpice et al., 2014), and formation of dense clusters of determinate lateral roots, known as cluster roots, which mobilise Pi in soil through an exudative release of carboxylates and phosphatases (Shane & Lambers, 2005). The high density of rootlets in cluster roots greatly increases the root-to-soil interface for Pi acquisition (Shane & Lambers, 2005). Cluster roots in H. prostrata reach maturity approximately 12–13 days after emergence and release a large burst of the carboxylates malate and citrate before senescing about 21 days after emergence (Lamont, 2003; Shane, Cramer, et al., 2004). Cluster root formation in H. prostrata is suppressed by increased external P supply, supporting the view that they function in accessing soil-sorbed P when soil P availability is low (Shane, Szota, et al., 2004). Unlike most crop species, H. prostrata and most Proteaceae do not form symbioses with arbuscular mycorrhizal fungi for increased P uptake (Lambers, Clode, et al., 2015), relying solely on highly specialised adaptations to survive in low P soils. The study of these novel PUE traits in H. prostrata and other Proteaceae will increase understanding of the genetic and metabolic networks that allow plants to function at extremely low P availabilities, greatly aiding research into translation of similar or other major PUE traits into crop species.
Plants that have evolved in landscapes where Pi is sufficient or in excess can tolerate high P conditions by transporting Pi between cell compartments or by down-regulating P uptake by decreasing the expression of root Pi transporters (de Campos et al., 2013; Shane, Szota, et al., 2004). However, H. prostrata is highly susceptible to P toxicity and continues to take up P at much higher internal Pi concentrations than other plant species (Lambers et al., 2013; Shane, Szota, et al., 2004). The low capacity to down-regulate Pi uptake has been hypothesised to be the result of a more constitutive activity of the Pi uptake system in H. prostrata compared with that in other plant species (Bird et al., 2024), allowing for greater remobilisation of P from senescing leaves (de Campos et al., 2013). H. prostrata has a very high efficiency of P remobilisation, only losing up to 15% of P in senescing mature leaves (Shane et al., 2014), whereas plants typically lose approximately 50% of P in senescing leaves based on global averages of nutrient resorption rates (Reed et al., 2012; Vergutz et al., 2012). A constitutive activity of the Pi uptake system in Proteaceae is likely to be the result of a combination of specific genes for Pi uptake not being repressed or increasing in expression upon exposure to Pi, or inefficient breakdown of Pi uptake transporters in the plasma membrane when down-regulation is triggered (Lambers et al., 2013).
Plants primarily transport Pi from the soil into root cells through the Phosphate Transporter 1 (PHT1) family of transmembrane transporters (Isidra-Arellano et al., 2021). The PHT1 family belongs to the Pi/H+ symporter family, which is part of the Major Facilitator Superfamily (MFS) of proteins (Li et al., 2019). Proteins in the MFS are typically 400–600 amino acids long and contain 12 transmembrane α-helices that situate the protein in the plasma membrane of cells (Pao et al., 1998; Yan, 2015). The 12 transmembrane helices are separated into two groups by a large hydrophilic loop of amino acids between the sixth and seventh transmembrane helices and the N- and C-termini of the protein face into the cytosol of the cell (Liao et al., 2019; Raghothama, 1999). The PHT1 family has mostly been studied in Arabidopsis thaliana and O. sativa, with nine AtPHT1 genes (named AtPHT1;1 to AtPHT1;9) in A. thaliana and 13 OsPHT1 genes (named OsPHT1;1 to OsPHT1;13) in O. sativa (Srivastava et al., 2018; Victor Roch et al., 2019). Several PHT1 genes involved in Pi uptake and expressed in roots have also been studied in other plant species (Nussaume et al., 2011), including Medicago truncatula (Xiao et al., 2006), Zea mays (maize) (Nagy et al., 2006) and Glycine max (soybean) (Fan et al., 2013).
For many of the studied plant PHT1 genes, expression and subsequent Pi uptake are induced by low Pi supply as part of the Pi starvation response (PSR) (Isidra-Arellano et al., 2021; Nussaume et al., 2011). An important aspect of the regulation of PHT1 expression is through the transcription factors (TFs) Phosphate Response1 (PHR1) and its homolog PHR1-like (PHL1) (Bustos et al., 2010; Sun et al., 2016). PHR1 binds to the PHR1 Binding Site (P1BS) in promoters of many Pi-starvation-inducible genes, including most PHT1 genes, leading to increased expression in low P conditions (Franco-Zorrilla et al., 2004; Rubio et al., 2001). Post-translational regulation of PHT1 proteins occurs through the ubiquitin conjugases Phosphate2 (PHO2) and Nitrogen Limitation Adaptation 1 (NLA1), which cause the degradation of PHT1 proteins when Pi supplies increase (Isidra-Arellano et al., 2021). PHT1 gene activity is also regulated through Phosphate Transport Traffic Facilitator 1 (PHF1), which mediates the transport of PHT1 proteins out of the endoplasmic reticulum (Bayle et al., 2011; González et al., 2005).
The expression of 10 identified PHT1 transcripts has previously been studied in a root and leaf transcriptome of H. prostrata (Bird et al., 2024). Four of these HpPHT1 transcripts were highly expressed in white roots, suggesting they are responsible for Pi uptake from the external medium, while two other identified HpPHT1 transcript sequences that were orthologous to AtPHT1;8 and AtPHT1;9 potentially function in P translocation from roots to shoots (Lapis-Gaza et al., 2014; Remy et al., 2012). However, the H. prostrata PHT1 family cannot be fully identified or functionally analysed without examination of a genome assembly for the species. Currently, the only Proteaceae genome assemblies available are for Telopea speciosissima (Chen et al., 2022), Macadamia integrifolia (Nock et al., 2020), M. tetraphylla (Niu et al., 2022) and Protea cynaroides (Chang et al., 2022), which are found in higher P environments than H. prostrata (Gleason et al., 2009; Lambers, Bishop, et al., 2012; Turner & Laliberté, 2015). Here, we present an annotated pseudomolecule-level assembly of the H. prostrata genome. As a foundation for understanding the links between Pi uptake by roots and high P remobilisation in leaves of H. prostrata, we identified members of the PHT1 gene family in H. prostrata by sequence homology and phylogenetic analysis. We hypothesised potential functions based on gene expression patterns in white roots and cluster roots of H. prostrata grown in hydroponics and subjected to different Pi supplies (Bird et al., 2024) and across leaf maturation in plants in their native habitat as well as through assessment of conserved motifs and residues in PHT1 promoter and encoded protein sequences. This analysis will improve the understanding of Pi uptake by a highly P-use-efficient species and contribute to the development of high PUE traits in crop species.
2 MATERIALS AND METHODS
2.1 Genetic resources and genome size estimation
Young soft leaves from an H. prostrata plant in its natural habitat (University of Western Australia Field Station, Shenton Park, Western Australia; −31.948017°, 115.794407°) were collected and snap-frozen in liquid nitrogen (collection date September 2020). High molecular-weight genomic DNA was extracted (DNeasy Plant Mini Kit, Qiagen) and Nanopore libraries were prepared and sequenced (PromethION, Flow Cell r9.4.1, Kinghorn Centre for Clinical Genomics). A total of 8.7 million reads were generated, of which just over 6.8 million reads had a per-base quality score (Q-score) above 7 and a mean Q-score of 9.02. The N50 value was estimated to be 12 090 base pairs (bp). Bases in the Nanopore reads were called in high accuracy mode (Guppy v.4.0.11), resulting in 74.9 Gbp of sequence. After trimming adapters and filtering quality by a minimum score of seven, 61.4 Gbp remained, contained within 6.83 million reads with an estimated 86.4× genome coverage.
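The read N50 and coverage figures above follow from simple arithmetic, which can be sketched as below. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, and the coverage denominator is assumed to be the final assembly size reported in Table 1 (710 650 021 bp), which reproduces the stated 86.4× when divided into 61.4 Gbp.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no lengths given")


def coverage(total_bases, genome_size):
    """Mean sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Toy read lengths (not data from the study).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)

# Assumption: the denominator is the assembly size from Table 1.
nanopore_depth = coverage(61.4e9, 710_650_021)
```

The same `n50` function applies to contig or scaffold lengths, which is how the assembly N50 values reported in the Results are defined.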
For Illumina sequencing, leaves from the H. prostrata plant used for Nanopore sequencing were collected and snap-frozen in liquid nitrogen (collection date 2017). Genomic DNA was extracted (DNeasy Plant Mini Kit, Qiagen) and 150 bp paired-end reads were sequenced by one Illumina XTen run (Garvan Institute) to give 151 Gbp contained within 1.01 billion reads after adapter trimming, corresponding to 214× coverage. After analysis with FastQC v.0.11.9 (Andrews, 2010), Illumina reads were quality filtered using fastp v.0.20.1 (Chen et al., 2018). The last 7 bp of each Illumina read were trimmed because of errors in per base sequence content, resulting in a total of 963 M reads with a total size of 137 Gbp. Illumina reads were used to estimate the genome size of H. prostrata by 21-mer frequency distribution with Jellyfish v.2.2.10 (Marçais & Kingsford, 2011) and findGSE v.0.1.0 (Sun et al., 2018). Genome size was also estimated by flow cytometry of leaf mesophyll cells stained with propidium iodide. Finally, the heterozygosity rate of the H. prostrata genome was calculated by GenomeScope v.1.0 (Vurture et al., 2017).
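The principle behind genome size estimation from a k-mer frequency distribution can be sketched as follows. This is a deliberately simplified illustration and not the method implemented in Jellyfish or findGSE, which count canonical k-mers and fit a model to the whole spectrum. Here the estimate is just the total number of k-mers divided by the depth of the main peak, and `min_depth` is a hypothetical threshold for discarding low-frequency k-mers that mostly arise from sequencing errors.

```python
from collections import Counter


def kmer_histogram(reads, k=21):
    """Map each k-mer depth to the number of distinct k-mers seen at that depth."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return Counter(counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Simplified estimate: total k-mers divided by the depth of the main peak."""
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=lambda depth: usable[depth])
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram (not data from the study): an error peak at depth 1
# and a main peak centred on depth 20.
toy_histogram = {1: 1000, 19: 50, 20: 100, 21: 50}
toy_estimate = estimate_genome_size(toy_histogram)
```

A heterozygous genome produces an additional peak at roughly half the main depth, which is one reason dedicated tools fit the full distribution instead of using a single peak.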
2.2 Chromosome-scale assembly of the H. prostrata genome
Raw Nanopore reads were corrected and assembled de novo with NECAT v.0.0.1 (Chen et al., 2021) following the approach used for the draft genome assembly of the closely related T. speciosissima (Chen et al., 2022). Quality of the H. prostrata assembly was assessed by N50, proximity to the estimated genome size, and estimated gene completeness by BUSCO v.5.1.2 (Manni et al., 2021) using the embryophyta_odb10 database. Following assessment, the assembly was polished using raw Nanopore reads in Medaka v.1.4.3 (https://github.com/nanoporetech/medaka) and Illumina reads in NextPolish v.1.3.1 (Hu et al., 2020) following the approach of Wang et al. (2020). Heterozygous sequences were removed by PurgeHaplotigs v.1.1.1 (Roach et al., 2018) and non-plant contigs were filtered out by Basic Local Alignment Search Tool (BLAST) v.2.12.0 (Altschul et al., 1990) using the National Center for Biotechnology Information (NCBI) nucleotide collection (NT) (Sayers et al., 2022). Any sequence that did not return a sequence from the Embryophyta lineage as the top hit was removed.
The H. prostrata contigs were scaffolded into pseudomolecules by Ragtag v.2.0.1 (Alonge et al., 2021). Alignments of H. prostrata pseudomolecules to T. speciosissima chromosome sequences were examined using LASTZ v.1.02 (Harris, 2007). Quality of the scaffolds was assessed by BUSCO scores, mapping rates of Illumina short-reads by Minimap2 v.2.22-r1101 (Li, 2018) and RNAseq short-reads by HISAT2 v.2.2.1 (Kim et al., 2019). Continuity of the assembly based on long-terminal repeats (LTR) was assessed by LTR_finder v.1.07 (Xu & Wang, 2007), LTR_retriever v.2.9.0 (Ou & Jiang, 2018) and Long Terminal Repeat Assembly index (LAI; Ou et al., 2018).
2.3 Prediction and annotation of genes in the assembled H. prostrata genome
A library of repeats was constructed for the H. prostrata genome with RepeatModeler v.2.0.2a (Flynn et al., 2020) and used to mask repeats in the assembly with RepeatMasker v.4.1.2-p1 and Repbase 2018-1026 (Bao et al., 2015) before gene prediction and annotation. Genes were predicted by GeMoMa v.1.8 (parameters GeMoMa.Score = ReAlign AnnotationFinalizer.r = NO sc = false) (Keilwagen et al., 2019) using gene sequences from A. thaliana (Brassicaceae, GCF_000001735.4), M. integrifolia (Proteaceae, GCF_013358625.1) and Nelumbo nucifera (Nelumbonaceae, GCF_000365185.1), which were available through NCBI RefSeq (O'Leary et al., 2016), and T. speciosissima (Proteaceae), which was available from Chen (2021). Predicted protein sequences were filtered by BLAST (blastp) using the NCBI non-redundant (NR) database (Sayers et al., 2022) and removing sequences that did not return a protein sequence from the Embryophyta lineage as the top hit.
Functional annotations for H. prostrata genes were retrieved by BLAST (blastp) against the NR and Swiss-Prot v.2022_02 (The Uniprot Consortium, 2015) databases and taking the annotation of the top sequence hit. Gene ontology (GO) terms for each gene were obtained by eggNOG-mapper (Cantalapiedra et al., 2021) with annotations transferred only from orthologs in the Viridiplantae taxa group. Transfer RNA genes were predicted by tRNAscan-SE v.2.0.9 (Chan et al., 2021) and other non-coding RNA were predicted using Infernal v.1.1.4 (Nawrocki & Eddy, 2013) with the Rfam v.14.7 (Kalvari et al., 2021) database.
2.4 Gene ontology enrichment of H. prostrata orthogroups
Orthogroups were identified among the predicted H. prostrata protein set by OrthoFinder v.2.5.4 (Emms & Kelly, 2019) with protein sequence sets from A. thaliana, M. integrifolia, N. nucifera and T. speciosissima, and the NCBI RefSeq protein sequence sets for Eucalyptus grandis (Myrtaceae, GCF_016545825.1), Gossypium hirsutum (cotton, Malvaceae, GCF_007990345.1) and Vitis vinifera (grapevine, Vitaceae, GCF_000003745.3). A species tree was generated from single-copy orthologs of these species and species-specific orthogroups were analysed for significantly enriched GO terms using topGO v.2.48.0 (Alexa et al., 2006) with an adjusted p value cut-off of 0.05. GO terms were summarised by the removal of redundant terms with REVIGO (Supek et al., 2011).
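The idea of testing a gene set for an over-represented GO term can be sketched with a one-sided hypergeometric (Fisher-type) test. topGO offers several algorithms and test statistics, and the one used is not specified above, so the classic test below is an assumption made for illustration only. The function name and the toy numbers are hypothetical.

```python
from math import comb


def enrichment_p(total_genes, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn from `total_genes`,
    of which `annotated` carry the GO term (hypergeometric upper tail)."""
    upper = min(annotated, selected)
    numerator = sum(
        comb(annotated, i) * comb(total_genes - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return numerator / comb(total_genes, selected)


# Toy example (not data from the study): 10 genes, 5 carry the term,
# 5 are in the gene set of interest, and all 5 of those carry the term.
toy_p = enrichment_p(total_genes=10, annotated=5, selected=5, overlap=5)
```

In practice one p value is computed per GO term, and the resulting values are then adjusted for multiple testing before applying a cut-off such as 0.05.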
2.5 Identification and analysis of PHT1 genes
PHT1 protein sequences were identified in the predicted protein sequences of H. prostrata and T. speciosissima (GCF_018873765.1), M. integrifolia, N. nucifera, M. truncatula (Fabaceae, GCF_003473485.1), E. grandis, V. vinifera and Z. mays (Poaceae, GCF_902167145.1) through sequence similarity and conserved domain searches. First, the nine PHT1 protein sequences from A. thaliana and the 13 PHT1 protein sequences from rice (O. sativa subsp. japonica, Poaceae) (Supporting Information: Table S1) were used to search for PHT1-like sequences in each predicted protein sequence set by BLAST (blastp) v.2.5.0. An E-value cut-off of 1e−20 was chosen based on the E-values and annotations of resulting hits following the approach outlined in Nestor et al. (2023). Second, candidate PHT1 sequences were confirmed by identifying conserved domains using HMMER (hmmsearch) v.3.3.2 (Eddy, 2011) with a profile generated from the BLAST results.
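The E-value filtering step can be sketched as below. This is an illustration under stated assumptions, not the authors' script: it assumes BLAST results were written in the default 12-column tabular layout (`-outfmt 6`, with the subject ID in column 2 and the E-value in column 11), which is not stated above, and it treats the cut-off as inclusive. The subject IDs in the toy input are hypothetical.

```python
import csv
import io


def filter_hits(tabular_text, max_evalue=1e-20):
    """Return {subject_id: best E-value} for hits at or below the cut-off.

    Assumes the default 12-column BLAST tabular layout (outfmt 6).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        subject, evalue = row[1], float(row[10])
        if evalue <= max_evalue:
            best[subject] = min(evalue, best.get(subject, evalue))
    return best


# Toy input with hypothetical subject IDs: one hit passes, one fails.
toy_blast = (
    "AtPHT1;1\thp_gene_A\t78.5\t520\t100\t2\t1\t520\t3\t522\t4.43e-55\t900\n"
    "AtPHT1;1\thp_gene_B\t30.1\t200\t130\t5\t10\t210\t15\t220\t5.51e-18\t90\n"
)
candidates = filter_hits(toy_blast)
```

Keeping only the best E-value per subject collapses hits from multiple query sequences to a single candidate, which can then be passed on to the conserved domain search.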
We also searched for PHT1-like nucleotide sequences that did not have encoded protein sequences in the predicted protein set of H. prostrata (Nestor et al., 2023). The HpPHT1 protein sequences identified above were used as query sequences in BLAST (tblastn) to search the genome sequence with an E-value cut-off of 1e−5. PHT1-like hits that did not overlap with an identified genomic feature in the H. prostrata GFF file were analysed further to confirm whether in-frame stop codons were true mutations or errors from genome polishing. Using MaSuRCA v.4.0.9 (Zimin et al., 2013) and Minimap2 v.2.17-r941 (Li, 2018), we assembled long-reads and short-read pairs that had a primary alignment within 16 kbp upstream or downstream of PHT1-like hits. Putative PHT1 protein sequences were extracted and translated from the hybrid assemblies based on alignments to the coding DNA sequence (CDS) of other HpPHT1 genes.
After removing duplicates, HpPHT1 protein sequences larger than 300 amino acids (aa) were aligned by MAFFT v.7.505 (Katoh & Standley, 2013). Phylogenetic trees were constructed using RAxML v.8.2.12 (parameters -f a -m PROTGAMMAAUTO -p 12345 -x 12345) (Stamatakis, 2014) and ggtree v.3.4.0 (Yu et al., 2017) with a PHT1 sequence from Amborella trichopoda (Amborellaceae, W1NHD1) as the root sequence. Chromosome positions of PHT1 genes in various plant species were visualised by clinker v.0.0.25 (Gilchrist & Chooi, 2021). A multiple sequence alignment of HpPHT1 and AtPHT1 protein sequences was constructed using Geneious v.2020.1 (Biomatters, https://www.geneious.com) and predicted transmembrane domains were annotated with DeepTMHMM v.1.0.11 (Hallgren et al., 2022). Conserved sequence motifs were predicted using Multiple Em for Motif Elicitation (MEME) (Bailey et al., 2015) and the PHT1 sugar transporter superfamily domain (PF00083) was identified using the NCBI Conserved Domain Database (Lu et al., 2020). The P1BS motif (GNATATNC) that is conserved in promoters of Pi-inducible PHT1 genes (Franco-Zorrilla et al., 2004) was searched within the 2000 bp region upstream of each identified PHT1 gene using Geneious v.2020.1 (Biomatters, https://www.geneious.com) and New PLACE v.30.0 (https://www.dna.affrc.go.jp/PLACE/?action=newplace) (Higo et al., 1999). Ubiquitination sites were searched in the N-terminal regions of HpPHT1 protein sequences using CKSAAP_UbSite (Chen et al., 2011) (http://systbio.cau.edu.cn/cksaap_ubsite/index.php) and AraUbiSite (Chen et al., 2019) (http://systbio.cau.edu.cn/araubisite/index.php). N-terminal regions of HpPHT1 proteins were extracted based on alignment with the N-terminal region of AtPHT1;1 (1–279 aa) designated by Lin et al. (2013). 
Finally, protein structures for PHT1 protein sequences were predicted using AlphaFold2 (Jumper et al., 2021) through ColabFold v.1.3.0 (Mirdita et al., 2022) and visualised with ChimeraX v.1.4 (Pettersen et al., 2021).
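The P1BS search described in this section amounts to scanning each 2000 bp upstream region for the degenerate motif GNATATNC. The sketch below shows one way to do this with a regular expression. It is an illustration only and not the Geneious or New PLACE implementation: the function names are hypothetical, and `upstream_region` assumes a forward-strand gene with a 0-based start coordinate. Because GNATATNC is its own reverse complement, scanning one strand is enough to find sites on both.

```python
import re

# IUPAC "N" is expanded to any of the four bases; the lookahead lets
# overlapping sites be reported as separate hits.
P1BS = re.compile(r"(?=(G[ACGT]ATAT[ACGT]C))")


def find_p1bs(promoter):
    """Return (0-based start, matched site) for every P1BS hit."""
    seq = promoter.upper()
    return [(m.start(), m.group(1)) for m in P1BS.finditer(seq)]


def upstream_region(chrom_seq, cds_start, length=2000):
    """Slice up to `length` bp directly upstream of a forward-strand start codon."""
    return chrom_seq[max(0, cds_start - length):cds_start]


# Toy promoter (not a sequence from the study) containing two sites.
toy_promoter = "TTTGAATATTCAAACCGCATATGCTTT"
toy_hits = find_p1bs(toy_promoter)
```

For a reverse-strand gene, the upstream region would instead be taken downstream of the gene's end coordinate and reverse-complemented before reporting positions relative to the start codon.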
2.6 Expression of PHT1 genes
Replicated RNAseq reads from leaves representing five developmental stages (young to senescing) from plants growing in their natural habitat were available from a previously published H. prostrata transcriptome assembly (Bird et al., 2023) under Sequence Read Archive (SRA) accessions SRR22135859 to SRR22135884. These RNAseq reads were mapped to genes in the H. prostrata genome assembly using HISAT2 v.2.2.1 (Kim et al., 2019) and expression was quantified with featureCounts v.2.0.1 (Liao et al., 2014). Transcript expression was compared pairwise between each leaf development stage through DESeq2 v.1.32.0 (Love et al., 2014). An adjusted p value cut-off of 0.05 was used to determine differentially expressed genes using Trinity (run_DE_analysis.pl) v.2.12.0 (Haas et al., 2013). Plots of gene expression normalised by trimmed mean of M values (TMM) were also visualised using Trinity (run_TMM_normalization_write_FPKM_matrix.pl and matrix_to_gene_plots.pl).
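The adjusted p value cut-off can be illustrated with the Benjamini-Hochberg procedure, which is the default adjustment method in DESeq2. The sketch below is a plain reimplementation for illustration, with a hypothetical function name and toy p values. DESeq2 additionally applies independent filtering before adjustment, so its reported values can differ from this simple calculation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p values (not data from the study).
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_padj = benjamini_hochberg(toy_p)
significant = [p <= 0.05 for p in toy_padj]
```

Genes whose adjusted value falls at or below 0.05 would be called differentially expressed for that pairwise comparison.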
3 RESULTS
3.1 Genome assembly and annotation statistics of H. prostrata
The genome size of H. prostrata was estimated to be 820 Mbp by flow cytometry (Supporting Information: Figure S1), and 809 Mbp based on 21-mer frequency distribution of short-read Illumina sequence reads. Genome heterozygosity was estimated to be approximately 1.24% based on short-reads (Supporting Information: Figure S2).
The polished long-read assembly of the H. prostrata genome contained 1003 contigs covering 711 Mbp with an N50 of 1.18 Mbp. The BUSCO completeness score for this assembly was 96.4% (Supporting Information: Table S2). Scaffolding the H. prostrata genomic contigs produced 11 pseudomolecules corresponding to the 11 chromosome sequences in T. speciosissima (Supporting Information: Figure S3). Large inversions approximately 5–10 Mbp long were identified in H. prostrata relative to T. speciosissima in chromosomes 1, 2, 3, 10 and 11. The pseudomolecules contained 93% of the total genome length of 711 Mbp and ranged in size from 25.4 to 79.2 Mbp (Table 1). The entire scaffolded genome assembly of H. prostrata contained 226 contigs and scaffolded contigs with an N50 of 63.8 Mbp (Table 1). Illumina reads had a mapping rate of 99% to the final assembly, while RNAseq reads had an average mapping rate of 82% (Supporting Information: Table S3).
Statistic | Value |
---|---|
Genome assembly | |
Assembly size (bp) | 710 650 021 |
N50 sequence length (bp) | 63 765 893 |
GC content | 39% |
LTR Assembly Index | 13.8 |
Pseudomolecule scaffolding | |
Number of pseudomolecules | 11 |
Total size of pseudomolecules (bp) | 656 443 986 |
Percentage of assembly within pseudomolecules | 92.9% |
Predicted genes | 51 246 |
Approximately 59.5% of the H. prostrata genome was masked as repetitive sequences before annotation (Supporting Information: Table S4). A total of 61 112 genes were predicted using annotation data from A. thaliana, T. speciosissima, M. integrifolia and N. nucifera. Filtering out sequences that were unlikely to belong to the Embryophyta lineage left 51 246 genes with a BUSCO completeness score of 93.7%. Overall scores for complete single-copy and duplicated BUSCO groups in H. prostrata were similar to those for predicted protein sequences of T. speciosissima and M. integrifolia (Supporting Information: Table S5). All H. prostrata protein sequences were functionally annotated through the NCBI NR database, while 81% of sequences were annotated through the Swiss-Prot database (Supporting Information: Figure S4). Functional annotation resulted in 69.3% of protein sequences being assigned GO terms. Finally, 0.078% of the genome sequence was annotated as encoding tRNA and other non-coding RNA (Supporting Information: Table S6).
3.2 Gene ontology analysis of H. prostrata orthogroups
A total of 94.9% of sequences from H. prostrata, T. speciosissima, M. integrifolia, N. nucifera, A. thaliana, E. grandis, G. hirsutum and V. vinifera were placed in orthogroups (Supporting Information: Table S7). Sequences from H. prostrata were represented in 52.8% of orthogroups, while 5868 genes were grouped in 1427 orthogroups unique to H. prostrata (Supporting Information: Figure S5). Genes in the orthogroups unique to H. prostrata were enriched in several GO terms related to P transport and recycling including transmembrane transport, RNA phosphodiester bond hydrolysis and exonucleolytic catabolism of deadenylated mRNA (Supporting Information: Table S8). Enriched GO terms related to N recycling included proteolysis, cellular response to amino acid stimulus, proteolysis involved in cellular protein catabolic process and leucine biosynthetic process. Another enriched GO term involving nutrient transport was cellular response to sulphate starvation. Enriched GO terms related to root architecture and cluster root formation included auxin homeostasis, response to salicylic acid and root hair initiation. Furthermore, the GO term positive regulation of flavonoid biosynthetic process was enriched, which relates to anthocyanin production.
Genes in orthogroups unique to the Proteaceae H. prostrata, T. speciosissima and M. integrifolia were enriched in GO terms related to nutrient transport including inorganic cation transmembrane transport and magnesium ion transmembrane transport. Other Proteaceae-specific genes were enriched in GO terms potentially related to cluster root formation including cellular response to hormone stimulus, regulation of pH and mitochondrial respiratory chain complex III assembly (Supporting Information: Table S9). The GO term secondary metabolite biosynthetic process was enriched, which likely relates to anthocyanin production. In addition, the GO term defence response to oomycetes was enriched, which likely relates to pathogen defence.
3.3 Identification of PHT1 genes
A total of 16 HpPHT1 genes were identified in H. prostrata. First, 13 HpPHT1 protein sequences were identified in the set of predicted protein sequences for H. prostrata. All sequences identified as PHT1 protein sequences from H. prostrata, T. speciosissima, M. integrifolia, N. nucifera, M. truncatula, E. grandis, V. vinifera and Z. mays with an E-value threshold of 1e−20 had E-values less than 4.43e−55 and were annotated as ‘Inorganic phosphate transporter’ or similar (Supporting Information: Table S10). All sequences immediately below the E-value threshold of 1e−20 had E-values above 5.51e−18 and were annotated as transporters for compounds other than phosphate such as inositol, polyol, carnitine and arabinose. After a tblastn search of the genome assembly for unannotated HpPHT1 genes, 15 potential PHT1 sequences were extracted with 16 kbp flanking either end and these were reassembled using mapped short-reads and long-reads. Based on alignment with identified HpPHT1 genes, eight of these re-assembled contigs were predicted to contain full-length PHT1-like sequences, including five genes without introns and three pseudogenes (Supporting Information: Figure S6). Three of the full-length genes and three of the pseudogenes were positioned in the predicted large (>10 kbp) introns of two previously annotated HpPHT1 genes. These two chimeric HpPHT1 genes were discarded, resulting in a total of 16 identified HpPHT1 genes (Supporting Information: Table S11). All 16 HpPHT1 protein sequences contained the sugar transporter superfamily domain (PF00083) conserved in MFS transporters, including PHT1 sequences. All PHT1 genes except HpPHT1;4, HpPHT1;6 and HpPHT1;8 contained one to three P1BS motifs (GNATATNC) in the 2000 bp promoter region directly upstream of the start codon (Supporting Information: Figure S7). 
In addition, all identified HpPHT1 protein sequences contained one to five predicted ubiquitination sites in the N-terminal region and there was high confidence for at least one ubiquitination site in HpPHT1;5, HpPHT1;10, HpPHT1;12b and HpPHT1;12d (Supporting Information: Table S12).
3.4 Phylogenetic analysis of PHT1 protein sequences
The 16 HpPHT1 protein sequences generally grouped closely with the 15 TsPHT1 and 14 MiPHT1 sequences in a phylogenetic tree of identified PHT1 sequences, resulting in similar numbers of orthologs in each clade (Figure 1). In the clade containing AtPHT1;8 and AtPHT1;9, every species examined had two PHT1 sequences in the clade including H. prostrata (HpPHT1;8 and HpPHT1;9). HpPHT1;7 was also sister to these clades and grouped closely with two E. grandis PHT1 members. However, no Proteaceae sequences were present in the clades containing AtPHT1;1, AtPHT1;2, AtPHT1;3, AtPHT1;5 and AtPHT1;6. While most monocot PHT1 protein sequences were found in a clade that did not contain any dicot sequences (OsPHT1;1 to OsPHT1;13), the clade sister to the monocot-specific clade contained only Proteaceae sequences (HpPHT1;10 to HpPHT1;12d). This Proteaceae-specific clade lacked PHT1 sequences from N. nucifera, also belonging to the Proteales, and contained the HpPHT1 genes identified by hybrid re-assembly during the PHT1 gene identification analysis. Each Proteaceae species had seven PHT1 sequences in this Proteaceae-specific clade.

The Proteaceae-specific clade was examined further through a phylogenetic tree built from Proteaceae species (Figure 2a). Several duplications not conserved among all Proteaceae were present in the HpPHT1;11 and HpPHT1;12 groups of genes. These genes were located on chromosome 11 of H. prostrata and T. speciosissima in a region covering approximately 150 kbp (Figure 2b). Of these genes, HpPHT1;12a,b,d likely arose from a duplication unique to H. prostrata and their clade included only one ortholog from each of T. speciosissima (TsPHT1;8) and M. integrifolia (MiPHT1;9). These three HpPHT1;12 genes were grouped with one HpPHT1 pseudogene (ψHpPHT1;12c) and showed sequence similarity to three TsPHT1 pseudogenes (Figure 2b). Conversely, the clade including HpPHT1;11a-c contained four T. speciosissima protein sequences (TsPHT1;4 to TsPHT1;7) and only one orthologous M. integrifolia sequence (MiPHT1;10). The three HpPHT1;11 genes were grouped with two HpPHT1 pseudogenes (HpPHT1;11d and HpPHT1;11e) and showed sequence similarity to two TsPHT1 pseudogenes (Figure 2b). The 150 kbp region encompassing these genes in H. prostrata and T. speciosissima was inverted relative to the orthologous regions of M. integrifolia and has likely undergone several additional duplications and inversions (Figure 2b). These duplications and inversions can be seen by a self-alignment of the 150 kbp region in H. prostrata (Supporting Information: Figure S8). Nearby genes were not conserved among the three species apart from a downstream phosphotransferase-encoding gene in H. prostrata and T. speciosissima (Figure 2b).

3.5 Structural analysis of predicted H. prostrata PHT1 protein sequences
As AtPHT1;1 is the most functionally studied plant PHT1 sequence, we compared the sequence structure of all HpPHT1 protein sequences to AtPHT1;1 to identify the most similar HpPHT1 sequence. Both HpPHT1;1 and HpPHT1;12a had all 19 sequence motifs found in AtPHT1;1 (Supporting Information: Figure S9). The other HpPHT1 proteins encoded by duplicated genes in the region of chromosome 11 that encoded HpPHT1;12a had at least 17 of the conserved motifs except for HpPHT1;12d, which only had 13 of the motifs. Alignment of the HpPHT1 sequences showed that all except HpPHT1;4 contained the 12 transmembrane domains found in AtPHT1 protein sequences and all MFS transporters. HpPHT1;4 was missing two transmembrane domains near the N-terminus (Figure 3). Eight of the HpPHT1 proteins had all 18 conserved residues considered to be important for Pi transport by PHT1 proteins. Two of the remaining eight HpPHT1 sequences were missing more than three residues. HpPHT1;2 was missing two residues, including D308 and Y312, which were conserved in the TsPHT1 and MiPHT1 sequences in the same clade. Likewise, HpPHT1;4 was missing three residues, including R134, D144 and Y145, which were conserved in orthologous MiPHT1, TsPHT1 and EgPHT1 sequences within the same clade. Finally, HpPHT1;5 was missing residues D35, D38 and E504, but only the missing E504 residue was conserved in the TsPHT1 sequence in the same clade.

Predicted tertiary structures for HpPHT1 protein sequences were generally similar to those predicted for AtPHT1 proteins, with differences mostly localised to the unstructured regions at the N- and C-termini. Both HpPHT1;1 and AtPHT1;1 had similar tertiary structures, including the position of conserved residues and transmembrane domains (Figure 4). However, the proteins differed slightly at the N- and C-termini, which can be seen in an overlaid structure alignment of the two proteins (Supporting Information: Figure S10). HpPHT1;12a and its paralogs HpPHT1;12b and HpPHT1;12d each had a longer unstructured sequence near the C-terminus than either AtPHT1;1 or HpPHT1;1. This long unstructured C-terminal sequence was shared by most other AtPHT1 and HpPHT1 proteins apart from AtPHT1;2 and AtPHT1;3 (Supporting Information: Figure S11). Comparing the remaining HpPHT1 protein structures with AtPHT1 structures revealed that most HpPHT1 proteins had similar transmembrane structures to AtPHT1 proteins, but varied in the unstructured domains on both sides of the cell membrane that connected the transmembrane domains (Supporting Information: Figure S11).

3.6 Expression of H. prostrata PHT1 transcripts across leaf maturation
Nine of the 16 HpPHT1 genes were expressed in leaves of H. prostrata plants growing in their natural habitat (Supporting Information: Figure S12). Three genes, HpPHT1;2, HpPHT1;6 and HpPHT1;9, increased in expression as young leaves developed to mature leaves with HpPHT1;9 having the greatest increase of 143-fold. The only gene that decreased in expression during leaf development was HpPHT1;1, which decreased approximately 90% in expression between stages III and IV of leaf development (Supporting Information: Table S13). Of the six HpPHT1 genes in the region of chromosome 11 that was inverted and contained duplicated HpPHT1 genes, only HpPHT1;12a and HpPHT1;12d were expressed in leaves, although this expression was low and did not vary much over leaf development apart from higher expression in mature leaves (Supporting Information: Figure S12). The remaining three genes, HpPHT1;3, HpPHT1;5 and HpPHT1;8, had a mostly low and constant expression across leaf development.
4 DISCUSSION
4.1 Quality of the H. prostrata genome assembly
In this paper, we present a genome assembly for H. prostrata, the first publicly available genome assembly for a southwest Australian Proteaceae. The H. prostrata genome assembly was constructed using long-read Nanopore technology, which has been incorporated in many recent plant genome assemblies (Miao et al., 2021; Wang et al., 2021), including assemblies for T. speciosissima (Chen et al., 2022) and M. integrifolia (Nock et al., 2020). The H. prostrata assembly had a scaffold N50 of 64 Mbp, which is similar to the T. speciosissima genome assembly (N50 of 69 Mbp) and the M. integrifolia genome assembly (N50 of 34 Mbp). Furthermore, genome assembly continuity was assessed by LTR Assembly Index (LAI) at 13.8, which indicated that the assembly was of reference standard quality (Ou et al., 2018). Identification of conserved BUSCO genes for the Embryophyta lineage in the H. prostrata genome sequence resulted in a BUSCO score of 96.4% complete genes, indicating that the majority of H. prostrata genes were correctly assembled. The high BUSCO score is comparable to the BUSCO complete scores for T. speciosissima (97.8%) and M. integrifolia (95.0%) genome assemblies. Likewise, the BUSCO complete score for proteins predicted from the H. prostrata assembly was 93.7%, similar to BUSCO complete scores for predicted proteins from T. speciosissima (94.4%) and M. integrifolia (96.2%) genome assemblies.
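The scaffold N50 values compared above follow a simple definition: the length of the scaffold at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total assembly length. A minimal Python sketch of that calculation is given below. The function name and the scaffold lengths are hypothetical and for illustration only; they are not the lengths of the H. prostrata scaffolds and this is not the software used in the present study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running total to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths supplied")


# Hypothetical scaffold lengths in Mbp (illustrative values only).
toy_scaffolds = [80, 70, 64, 50, 30, 20, 10]
print(n50(toy_scaffolds))
```

Because N50 depends only on the length distribution, it describes contiguity rather than correctness, which is why it is reported alongside LAI and BUSCO scores.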
Identifying the structure of H. prostrata chromosomes was made possible by scaffolding the H. prostrata contigs based on chromosome sequences of the T. speciosissima assembly (Chen et al., 2022). This method was used due to difficulties in extracting intact nuclei for Hi-C library preparation in the present study, which is a known issue for high molecular-weight plants (Jones et al., 2021; Vaillancourt & Buell, 2019). Pseudomolecule scaffolding using a related species has previously been used to scaffold long-read assemblies for cherry (Prunus fruticosa) (Wöhner et al., 2021), upriver orange mangrove (Bruguiera sexangula) (Pootakham, Naktang, et al., 2022), wild asparagus (Asparagus kiusianus) (Shirasawa et al., 2022) and yellow mangrove (Ceriops zippeliana) (Pootakham, Sonthirod, et al., 2022). The close phylogenetic distance of H. prostrata to T. speciosissima (Sauquet et al., 2009) suggests that the reported H. prostrata pseudomolecules have a high accuracy.
4.2 Enriched gene family functions in Proteaceae plant species
To begin to understand the extent of nutrient-use-efficiency adaptation in Proteaceae, we compared enriched gene functions in species and family-specific gene families of the three Proteaceae species. H. prostrata had many species-specific genes involved in transmembrane transport, which could be involved in carboxylate exudation in cluster roots or may contribute to previously hypothesised adaptations in nutrient uptake strategies in H. prostrata such as an inability to strongly down-regulate Pi uptake (Bird et al., 2024; de Campos et al., 2013; Lambers et al., 2013). Other enriched H. prostrata-specific gene functions were RNA phosphodiester bond hydrolysis and exonucleolytic catabolism of deadenylated mRNA, suggesting that gene families involved in RNA degradation have diverged from those in east Australian Proteaceae, potentially increasing the efficiency of P recycling and contributing to the high P remobilisation efficiency of the species (Shane et al., 2014).
4.3 Overview of PHT1 genes in H. prostrata and other Australian Proteaceae
A total of 16 PHT1 genes were identified in H. prostrata compared with 15 in T. speciosissima and 14 in M. integrifolia, which indicates that there are likely only subtle differences in PHT1 gene number across the Proteaceae. In wheat (Triticum aestivum), a similar number of 16 representative PHT1 genes were identified (Grün et al., 2018). Minimal differences in the number of PHT1 genes occurred among monocots, but there was large variation in the genomic structure and arrangement of PHT1s (Grün et al., 2018), similar to the examined Proteaceae species. In comparison to southwest Australia, soils in east Australia where T. speciosissima and M. integrifolia evolved are not as P impoverished (Gleason et al., 2009; Turner & Laliberté, 2015), although those soils still have relatively lower P concentrations than most global soils (Beckmann, 1967; Gleason et al., 2009). Telopea speciosissima and M. integrifolia are also known to be sensitive to excess Pi similar to southwest Australian Proteaceae, showing reduced growth and leaf chlorosis (Grose, 1989; Zhao et al., 2021). The strong conservation of PHT1 members between Proteaceae species suggests that these species have similar Pi uptake strategies, although further investigation is needed into the expression of TsPHT1 and MiPHT1 genes as well as the identification of PHT1 members in other Proteaceae species.
All AtPHT1 genes are expressed in roots except AtPHT1;6 (Mudge et al., 2002), which is expressed solely in flowers of A. thaliana (Ayadi et al., 2015). Furthermore, all AtPHT1 genes except AtPHT1;6 are repressed by increasing P supply as protection against P toxicity (Lapis-Gaza et al., 2014). Hence, we used previously examined transcript data from roots of hydroponically-grown H. prostrata (Bird et al., 2024) to determine if this Pi repression also occurs for HpPHT1 genes given the low capacity of H. prostrata to down-regulate Pi uptake (Shane, Szota, et al., 2004).
A previous transcriptome study by Bird et al. (2024) identified 10 PHT1 genes in H. prostrata and reported an expression analysis on several PHT1 genes in roots over increasing Pi concentrations. Based on primer sequences used by Bird et al. (2024), seven of these PHT1 genes were matched to genes identified in the present study and the same names were kept: HpPHT1;1, HpPHT1;3, HpPHT1;5, HpPHT1;6, HpPHT1;8, HpPHT1;9 and HpPHT1;10. Three other transcripts identified by Bird et al. (2024) were incomplete at the 5′ end. These three incomplete transcripts were matched to one complete PHT1 gene, HpPHT1;12a (formerly HpPHT1;7) and two incomplete PHT1 pseudogenes, ψHpPHT1;11e (formerly HpPHT1;2) and ψHpPHT1;12c (formerly HpPHT1;4) in the Proteaceae-specific HpPHT1;11 and HpPHT1;12 groups. In Bird et al. (2024), transcripts from HpPHT1;1, HpPHT1;3, HpPHT1;6 and HpPHT1;12a (formerly HpPHT1;7) were the most abundant in white roots of H. prostrata, while transcripts from HpPHT1;6 and HpPHT1;12a were the most abundant in mature cluster roots. Transcript amounts from HpPHT1;6 and HpPHT1;8 were not significantly responsive to Pi in white roots or cluster roots (Bird et al., 2024). HpPHT1;9 was also not significantly responsive to Pi in cluster roots but was somewhat repressed by Pi in white roots (Bird et al., 2024). Similarly, HpPHT1;1 was not responsive to Pi in cluster roots but was repressed by Pi in white roots (Bird et al., 2024). HpPHT1;5 and HpPHT1;10 were not examined by Bird et al. (2024) due to a very low qPCR signal in white roots. Transcripts from ψHpPHT1;11e (formerly HpPHT1;2) and ψHpPHT1;12c (formerly HpPHT1;4) were expressed in white roots and cluster roots and ψHpPHT1;11e was strongly repressed by increased P supply (Bird et al., 2024). However, we deemed both ψHpPHT1;11e and ψHpPHT1;12c to be pseudogenes in the reference genome due to the presence of an in-frame stop codon in each of them. Further analysis of these genes in other H. prostrata individuals will be needed to confirm whether these pseudogenes are conserved species-wide.
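The pseudogene calls above rest on the presence of an in-frame stop codon. As an illustration of what that criterion means, the following Python sketch reads a coding sequence codon by codon from its first base and reports any stop codon that occurs before the final codon. The function name and the toy sequences are hypothetical; the sketch assumes the sequence starts in frame and is not the procedure used to annotate ψHpPHT1;11e or ψHpPHT1;12c.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(cds):
    """Return 0-based codon indices of in-frame stop codons
    that occur before the final codon of a coding sequence."""
    seq = cds.upper()
    # Split into complete codons in frame 0, ignoring trailing bases.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The last codon is expected to be the normal stop, so skip it.
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


# Hypothetical toy sequences (not HpPHT1 sequences).
intact = "ATGGCTTGGTAA"        # ATG GCT TGG TAA
disrupted = "ATGGCTTGATGGTAA"  # ATG GCT TGA TGG TAA
print(premature_stops(intact))
print(premature_stops(disrupted))
```

A stop triplet that appears out of frame is not reported, which is the distinction between an in-frame stop codon and a chance occurrence of TAA, TAG or TGA in the nucleotide sequence.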
4.4 HpPHT1 genes are likely important for Pi transport in H. prostrata
We identified HpPHT1;1 as a likely functional equivalent of AtPHT1;1, which encodes the major Pi uptake transporter in A. thaliana under P-sufficient conditions (Ayadi et al., 2015; Shin et al., 2004). Although the predicted protein sequence encoded by HpPHT1;1 did not cluster with AtPHT1;1 in the phylogenetic tree, the proteins had the same highly conserved sequence motifs, residues and transmembrane domains. Both encoded proteins also had similar predicted structures, with only slight differences in the helices and unstructured sequences situated inside the cell membrane, particularly near the C-terminus. HpPHT1;1 was one of the most highly expressed HpPHT1 transcripts in white roots and mature cluster roots of hydroponically-grown H. prostrata and was marginally repressed by increasing Pi in white roots, but not in mature cluster roots (Bird et al., 2024). The presence of two P1BS motifs in the HpPHT1;1 promoter region suggests that this gene is repressed through PHR1 and is part of PSR, similar to AtPHT1;1 (Franco-Zorrilla et al., 2004). Interestingly, HpPHT1;1 was the only HpPHT1 gene that decreased in expression during leaf development in plants in their natural habitat, decreasing between stages III and IV. This decrease in expression suggests a role for HpPHT1;1 in Pi import into leaves early in development that decreases as leaves mature. AtPHT1;1 is mainly expressed in roots and repressed by high P supply but is also expressed in mature leaves of A. thaliana (Lapis-Gaza et al., 2014; Mudge et al., 2002; Nussaume et al., 2011). Based on these similar expression patterns and the conservation of PHT1 sequence structure, HpPHT1;1 is potentially a functional ortholog of AtPHT1;1 involved in Pi uptake from the soil.
HpPHT1 genes were mostly grouped in clades separated from clades containing AtPHT1 genes, apart from HpPHT1;8 and HpPHT1;9. These two genes were orthologs of AtPHT1;8 and AtPHT1;9, genes that are necessary for Pi translocation from roots to shoots in A. thaliana (Lapis-Gaza et al., 2014). The proteins encoded by these four genes mostly shared conserved residues, transmembrane domains and conserved sequence motifs, including a C-terminal motif absent from any other HpPHT1 or AtPHT1 protein sequence. All examined species except Z. mays had two to four PHT1 orthologs in this clade, suggesting that these genes are mostly conserved among different plant lineages. The two O. sativa orthologs in this clade, OsPHT1;9 and OsPHT1;10, are expressed in the roots and leaves of O. sativa and have roles in root Pi uptake and translocation to the shoot (Wang et al., 2014). Transcripts of both HpPHT1;8 and HpPHT1;9 were found in white roots and cluster roots of H. prostrata (Bird et al., 2024), while transcripts of HpPHT1;9 were found in leaves in the present study, similar to the root and leaf expression patterns of OsPHT1;9 and OsPHT1;10 as well as AtPHT1;8 and AtPHT1;9 (Lapis-Gaza et al., 2014; Mudge et al., 2002; Victor Roch et al., 2019). Notably, HpPHT1;9 had a strong increase in expression throughout leaf development, suggesting it may also be involved in leaf Pi remobilisation as leaves mature. Overall, the expression patterns and close phylogenetic relationship of HpPHT1;8 and HpPHT1;9 to orthologs conserved in most plant lineages suggest that these H. prostrata genes may have a function in Pi uptake and Pi translocation from roots to shoots, but this remains to be directly tested.
The earlier study on HpPHT1;8 and HpPHT1;9 transcripts revealed that these genes were generally unresponsive to Pi supply in white and cluster roots, although HpPHT1;9 did decrease slightly in expression in white roots at P supplies approaching those that cause Pi toxicity (Bird et al., 2024). The lack of responsiveness of HpPHT1;8 to Pi supply may be due to the lack of detectable P1BS motifs in the promoter region of this gene and thus, a lack of transcriptional regulation by PHR1 (Rubio et al., 2001). A lack of P1BS motifs in promoters has previously been discovered for OsPHT1;1, which is also unresponsive to changes in Pi supply (Sun et al., 2012). Conversely, AtPHT1;8 and AtPHT1;9, orthologs of HpPHT1;8 that contain P1BS motifs (Franco-Zorrilla et al., 2004), are strongly repressed in roots of A. thaliana by high P supply (Lapis-Gaza et al., 2014). However, a lack of P1BS motifs in a promoter does not necessarily mean a lack of transcriptional regulation. The O. sativa orthologs, OsPHT1;9 and OsPHT1;10, and the orthologous TaPHT1;10 in T. aestivum are up-regulated by Pi starvation, but OsPHT1;10 and TaPHT1;10 lack P1BS motifs in the promoter region, meaning they are likely regulated by other TFs (Grün et al., 2018; Wang et al., 2014). HpPHT1;9 contained one P1BS motif in its promoter region, potentially explaining the minor repression by Pi in white roots (Bird et al., 2024), but whether HpPHT1;8 and HpPHT1;9 show up-regulation as part of PSR needs to be examined in H. prostrata under P-starved conditions.
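The presence or absence of P1BS motifs discussed above can be illustrated with a simple pattern search. The sketch below assumes the commonly cited P1BS consensus GNATATNC (Rubio et al., 2001), where N is any nucleotide. Because this consensus is its own reverse complement, scanning one strand is sufficient. The function name and the toy promoter sequence are hypothetical, and the promoter length and search tool used in the present study are not assumed here.

```python
import re

# P1BS consensus GNATATNC; the lookahead allows overlapping sites to be found.
P1BS = re.compile(r"(?=(G[ACGT]ATAT[ACGT]C))")


def find_p1bs(promoter):
    """Return (0-based start, matched sequence) for each P1BS-like site."""
    return [(m.start(), m.group(1)) for m in P1BS.finditer(promoter.upper())]


# Hypothetical toy promoter (not an HpPHT1 promoter sequence).
toy_promoter = "ttacGAATATTCggcatGCATATGCaa"
for start, site in find_p1bs(toy_promoter):
    print(start, site)
```

An empty result for a promoter would correspond to "no detectable P1BS motif" under this consensus, although, as noted above, the absence of a P1BS motif does not by itself rule out transcriptional regulation by other factors.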
At the level of post-translational regulation, all HpPHT1 sequences, including HpPHT1;8 and HpPHT1;9, contained at least one predicted ubiquitination site in the N-terminal region. In A. thaliana, ubiquitination sites that are likely located in the N-terminal region (1–279 aa) of AtPHT1;1 can be recognised by NLA for ubiquitination and degradation, providing a mechanism for post-translational regulation (Lin et al., 2013). Hence, a lack of ubiquitination sites in HpPHT1 sequences is not a likely explanation for the low capacity of H. prostrata to down-regulate Pi uptake. Overall, the lack of responsiveness of HpPHT1;8 expression to Pi supply may contribute to the Pi sensitivity displayed by H. prostrata (Bird et al., 2024; Shane, Szota, et al., 2004), but further studies are needed into the function and post-transcriptional regulation of both HpPHT1;8 and HpPHT1;9.
We identified six HpPHT1 genes and three HpPHT1 pseudogenes that arose from duplication and rearrangement around the HpPHT1;11 and HpPHT1;12 groups on chromosome 11. Most of the proteins encoded by the six complete PHT1 genes had all the conserved PHT1 residues and transmembrane domains and so are potentially functional Pi transporters. Of the genes in these two groups, only HpPHT1;12a was identified previously by Bird et al. (2024) and was shown to have high expression in white roots and higher expression in cluster roots, and to be marginally repressed by increased P supply. Furthermore, only HpPHT1;12a and HpPHT1;12d were expressed in leaves of H. prostrata in its native habitat and this expression was mostly limited to mature leaves, suggesting possible involvement in Pi remobilisation as leaves mature. Based on these results, the HpPHT1;12 group of genes are likely to be functional Pi transporters, but further expression analysis is needed for this group and the HpPHT1;11 group to determine their function.
The HpPHT1;11 and HpPHT1;12 groups were part of a Proteaceae-specific clade with the only sister clade being monocot PHT1 members. H. prostrata, T. speciosissima and M. integrifolia each contained seven PHT1 members in this clade, although the genetic architecture of the genes was not conserved based on large differences in the number of PHT1 pseudogenes and inversions and rearrangements of the genes in this region. In H. prostrata, both the HpPHT1;11 and HpPHT1;12 groups were likely the result of tandem duplication, although the HpPHT1;11 group appeared to also have a small inversion after this duplication. We also identified a single ortholog for each group in M. integrifolia and several orthologs in T. speciosissima. The proliferation of these PHT1 genes in H. prostrata and T. speciosissima was within a genomic inversion relative to M. integrifolia, suggesting that this inversion came after the evolutionary split of M. integrifolia in the Proteaceae lineage, but before the split separating Hakea and Telopea. Interestingly, there was only one functional ortholog of the HpPHT1;12 group in T. speciosissima, presenting the HpPHT1;12 group as a potentially unique group of functional PHT1 genes in southwest Australian Proteaceae.
4.5 Other HpPHT1 genes potentially functional for Pi transport
In addition to the HpPHT1 genes already discussed, four other PHT1 genes, HpPHT1;3, HpPHT1;6, HpPHT1;7 and HpPHT1;10, likely encode functional Pi transporters given their conserved PHT1 protein residues and transmembrane domains. HpPHT1;3 and HpPHT1;6 were previously found to have relatively high levels of expression in white roots and mature cluster roots of hydroponically-grown H. prostrata and HpPHT1;3 was repressed by increased P supply (Bird et al., 2024). HpPHT1;6 lacked P1BS motifs in its promoter region, potentially explaining its lack of responsiveness to Pi supply (Bird et al., 2024; Rubio et al., 2001). Notably, HpPHT1;6 strongly increased in expression across leaf maturation, suggesting a role in Pi remobilisation similar to the role we hypothesised for HpPHT1;9. HpPHT1;10 transcripts were not detected in white roots of H. prostrata due to a low qPCR signal (Bird et al., 2024) and were also not expressed in leaves of H. prostrata in its native habitat. HpPHT1;7 was not identified by Bird et al. (2024) and, similarly to HpPHT1;10, had no expression in leaves of H. prostrata in its native habitat. These genes may instead be expressed in flowers or other organs similar to the flower-restricted expression of AtPHT1;6 (Ayadi et al., 2015). HpPHT1;7 was also the only PHT1 gene besides HpPHT1;8 and HpPHT1;9 that had an orthologous gene in E. grandis, a tree species endemic to Australia. The HpPHT1;7 gene did not have orthologs in any species examined besides E. grandis, T. speciosissima and M. integrifolia. The functions of HpPHT1;3, HpPHT1;6, HpPHT1;7 and HpPHT1;10 require further analysis to identify if they are involved in Pi uptake in H. prostrata.
Finally, we hypothesise that HpPHT1;2 and HpPHT1;5 are potentially functional. Both genes were transcribed in leaves of H. prostrata, even though the encoded proteins were missing one to two essential PHT1 residues that were conserved in orthologous TsPHT1 and MiPHT1 protein sequences. HpPHT1;2 transcripts were not identified in H. prostrata roots by Bird et al. (2024), but were highly abundant in leaves and increased in abundance as the leaves developed, suggesting a role in Pi remobilisation as leaves age. On the other hand, HpPHT1;5 had a very low qPCR signal in roots (Bird et al., 2024) and also had negligible expression in leaves of H. prostrata in its native habitat. The last PHT1 gene to be discussed, HpPHT1;4, may not encode a functional protein because it lacks two transmembrane domains near the N-terminus. HpPHT1;4 was also not identified by Bird et al. (2024) and was not expressed in leaves of H. prostrata in its native habitat. The HpPHT1;4 protein sequence lacked three residues that were conserved in orthologous TsPHT1, MiPHT1 and EgPHT1 protein sequences, suggesting that loss of these sequence traits in HpPHT1;4 is unique and that the gene is either non-functional or has largely diverged in function.
4.6 Summary and further research
We have identified 16 PHT1 genes in the highly P-efficient H. prostrata. HpPHT1;1 encodes a protein structurally similar to AtPHT1;1 and is potentially involved in Pi uptake. HpPHT1;8 and HpPHT1;9 were orthologs of AtPHT1;8 and AtPHT1;9 and potentially function in Pi uptake and translocation from roots to shoots. HpPHT1;9 may also be involved in Pi remobilisation as H. prostrata leaves age. A lack of P1BS motifs in the HpPHT1;8 promoter region supports the poor repression of this gene at high P supplies and may contribute to the high P sensitivity of H. prostrata. HpPHT1;11a-c and HpPHT1;12a,b,d are unique duplications of PHT1 genes that seem to be restricted to Proteaceae, suggesting a unique role in Pi transport that requires further investigation. HpPHT1;2, HpPHT1;3, HpPHT1;5, HpPHT1;6, HpPHT1;7 and HpPHT1;10 potentially encode functional Pi transporters with HpPHT1;6 also lacking P1BS motifs and responsiveness to Pi and possibly being involved in Pi remobilisation from ageing leaves. Finally, HpPHT1;4 is likely a non-functional or highly diverged PHT1 gene. To better understand the functions of these genes, further studies are needed to analyse their response to Pi supply in major organs of H. prostrata. These results will increase the knowledge of P-saving strategies in Australian Proteaceae with high PUE traits and support further research into developing highly nutrient-efficient crops to reduce global fertiliser overuse.
ACKNOWLEDGEMENTS
This work was supported by Australian Research Council grants to HL and PMF (DP140100148 and LP0776252) and HL, PMF and MD (DP200101013). High-performance computing was provided by the Pawsey Supercomputing Research Centre with funding from the Australian Government and the Government of Western Australia. Thanks go to Mark Wallace (Kings Park and Botanic Garden, Department of Biodiversity, Conservation and Attractions) for the flow cytometry estimation of genome size. BJN and TB are both assisted by Research Training Fund Fee Offsets. BJN was supported by a University Co-funded Postgraduate Award from The University of Western Australia and TB was supported by a Research Training Fund Stipend Award and a Kwongan Foundation Top-up Award. Open access publishing facilitated by The University of Western Australia, as part of the Wiley - The University of Western Australia agreement via the Council of Australian University Librarians.
SUMMARY STATEMENT
Hakea prostrata (Proteaceae) is a highly phosphorus-use-efficient plant species native to southwest Australia. As a foundation for understanding the genetic basis of its P-saving traits, we identified and measured the expression of genes in the Phosphate Transporter 1 (PHT1) gene family, which are responsible for phosphorus uptake and transport in plants.
Open Research
DATA AVAILABILITY STATEMENT
The H. prostrata long-reads, short-reads and genome assembly used in this study are available through NCBI under the BioProject PRJNA896696. Long-reads were uploaded to the Sequence Read Archive (SRA) with accession SRR25498431 and short-reads with accession SRR25498432. The genome assembly of H. prostrata was uploaded to GenBank with accession number JAVCWE010000000. Annotation and protein sequence data were uploaded to the University of Western Australia Research Repository (doi:10.26182/zncj-e853). Nucleotide and protein sequences of HpPHT1 members were deposited in GenBank with accession numbers OR464054 to OR464069.