Volume 226, Issue 10 pp. 2469-2477
Rapid Communication

Evaluation and identification of reliable reference genes for pharmacogenomics, toxicogenomics, and small RNA expression analysis

Dongliang Chen
Department of Biology, East Carolina University, Greenville, North Carolina

Xiaoping Pan
Department of Biology, East Carolina University, Greenville, North Carolina

Peng Xiao
Department of Mathematics, East Carolina University, Greenville, North Carolina

Mary A. Farwell
Department of Biology, East Carolina University, Greenville, North Carolina

Baohong Zhang (Corresponding Author)
Department of Biology, East Carolina University, Greenville, North Carolina
Correspondence: Department of Biology, East Carolina University, Greenville, NC 27858.
First published: 16 March 2011

Abstract

Pharmacogenomics, toxicogenomics, and small RNA expression analysis are three of the most active research topics in the biological, biomedical, pharmaceutical, and toxicological fields. All of these studies are based on gene expression analysis, which requires reference genes to reduce the variations derived from different amounts of starting materials and different efficiencies of RNA extraction and cDNA synthesis. Thus, accurate normalization to one or several constitutively expressed reference genes is a prerequisite to valid gene expression studies. Although selection of reliable reference genes has been conducted in previous studies in several animals and plants, no research has been focused on pharmacological targets, and very few studies have had a toxicological context. More interestingly, no studies have been performed to identify reference genes for small RNA analysis although small RNA, particularly microRNA (miRNA)-related research is currently one of the fastest-moving topics. In this study, using MCF-7 breast cancer cells as a model, we employed quantitative real-time PCR (qRT-PCR), one of the most reliable methods for gene expression analysis in many research fields, to evaluate and to determine the most reliable reference genes for pharmacogenomics and toxicogenomics studies as well as for small RNA expression analysis. We tested the transcriptional expression of five protein-coding genes as well as five non-coding genes in MCF-7 cells treated with five different pharmaceuticals or toxicants [paclitaxel (PTX), gossypol (GOS), methyl jasmonate (JAS), L-nicotine (NIC), and melamine (mela)] and analyzed the stability of the selected reference genes by four different methods: geNorm, NormFinder, BestKeeper, and the comparative ΔCt method. According to our analysis, a protein-coding gene, hTBCA and four non-coding genes, hRNU44, hRNU48, hU6, and hRNU47, appear to be the most reliable reference genes for the five chemical treatments. 
Similar results were also obtained in dose-response and time-course assays with gossypol (GOS) treatment. Our results demonstrated that traditionally used reference genes, such as 18S rRNA, β-actin, and GAPDH, are not reliable reference genes for pharmacogenomics and toxicogenomics studies. In contrast, hTBCA and small RNAs are more stable during drug treatment, and they are better reference genes for pharmacogenomics and toxicogenomics studies. Before these genes are widely used as reference genes, these results should be corroborated by studies with other human cell lines, additional drug classes, and hormonal treatments. J. Cell. Physiol. 226: 2469–2477, 2011. © 2011 Wiley-Liss, Inc.

Gene expression analysis is rapidly permeating most fields within biology and biomedicine. Whatever technique is employed, such as Northern blotting, microarray analysis, or quantitative real-time PCR (qRT-PCR), all gene expression analyses require reference genes. Reference genes are used to normalize the variation that can arise from different amounts of starting material and different efficiencies of RNA extraction and cDNA synthesis, so that gene expression levels can be compared directly or indirectly between two or among multiple samples. Thus, identification of a reliable reference gene is a critical prerequisite for any gene expression study. The ideal reference gene is expressed constitutively across samples under different conditions. Compared to other genes, the expression of housekeeping genes is thought to be more stable, and they are therefore frequently used as reference genes. However, the expression level of almost all housekeeping genes varies among tissues, during development, and under different conditions. Thus, no single group of reference genes is best in all cells, and reliable reference genes must be identified before gene expression studies are performed.

Due to its high sensitivity and specificity, qRT-PCR is becoming one of the most reliable methods for gene expression analysis. Thus, qRT-PCR has also become the best approach for identifying reliable reference genes. In previous studies, selection of reliable reference genes was conducted in several animal and plant cells (Ahn et al., 2008; Huis et al., 2010), in different human tissues (Galiveti et al., 2010), and in several human cancer cell lines (Ahn et al., 2008; Rho et al., 2010). However, few studies have focused on reference gene selection in a pharmacological or toxicological context, with two exceptions: one concerning the exposure of cetacean fibroblast cells to organochlorines (OCs), polybrominated diphenyl ethers (PBDEs), and 17β-estradiol (Spinsanti et al., 2008), and the other concerning the exposure of the water flea (Daphnia magna) to ibuprofen (Heckmann et al., 2006). Owing to growing interest in the molecular mechanisms of drugs, toxicants, and chemicals, the choice of stable reference genes is of great importance in pharmacogenomics and toxicogenomics studies.

MicroRNAs (miRNAs) are an extensive class of recently identified small regulatory RNAs. Because of the versatile functions of miRNAs in almost all biological and metabolic processes, miRNA-related research has become one of the fastest growing research topics in the biological and biomedical fields, although miRNAs were discovered and recognized only in the past decade. Although the small RNAs RNU48 and U6 have been widely used as reference genes in miRNA expression analysis by qRT-PCR and Northern blotting, no systematic study has been reported on the stability of these reference genes. Recent investigations based on microarray and qRT-PCR analysis indicated that drugs and toxicants significantly alter miRNA expression profiles, with U6 and RNU48 used as reference genes in the majority of studies (Jardim et al., 2009; Zhang and Pan, 2009; Bollati et al., 2010). However, without systematic studies on small RNA reference genes, miRNA-related research is severely constrained.

The purpose of this study is to identify reliable reference genes for pharmacogenomics, toxicogenomics, and small RNA expression analysis. To achieve this goal, we employed qRT-PCR to monitor and evaluate the expression of five protein-coding reference genes and five non-coding reference genes during treatment with five different drugs or toxicants. Some of these ten genes have been used as reference genes in qRT-PCR, microarray, and Northern blotting studies. The five drugs and toxicants are either commonly used for cancer treatment or encountered daily by humans in the environment.

Materials and Methods

Chemicals and reagents

Paclitaxel (PTX) from Taxus brevifolia, methyl jasmonate (JAS), and DMSO were purchased from Sigma-Aldrich (St. Louis, MO); L-nicotine (NIC) and melamine (mela) from Acros Organics were obtained from Fisher Scientific (Pittsburgh, PA); gossypol (GOS) was from MP Biomedicals (Solon, OH). The mirVana PARIS protein and RNA isolation kit was purchased from Ambion (Austin, TX); the TaqMan microRNA Reverse Transcription kit from Applied Biosystems (Foster City, CA); and the Real-Time SYBR Green PCR master mix from SuperArray Bioscience Corp. (Frederick, MD).

Cell culture and treatments

Human MCF-7 breast cancer cells were obtained from ATCC and maintained at 37°C in a humidified incubator with 5% CO2 and 95% air. MCF-7 cells were grown in Roswell Park Memorial Institute (RPMI) 1640 medium (GIBCO, Vienna, VA) supplemented with 10% FBS (PAA Laboratories, Dartmouth, MA), 4 µg/ml human recombinant insulin (GIBCO), and 2 µg/ml gentamicin (Sigma-Aldrich). Cells were trypsinized and 2.5 × 10⁵ cells were plated into Costar 6-well cell culture clusters (Corning, NY). Twenty-four hours after passaging, the cells were exposed to the IC20 concentration of each chemical, which induces apoptosis: 6 nM PTX, 5 µM GOS, 100 µM JAS, 1 mM NIC, and 1 mM mela (data not shown). Because 0.1% DMSO was used to dissolve PTX, GOS, JAS, and NIC, the same amount of DMSO was also added to the mela and vehicle control groups to control for any effect of DMSO. After 48 h of exposure, cells were trypsinized and resuspended in 200 µl of RNAlater (Ambion).

Dose-response and time-course assays were also carried out with GOS treatment. Six conditions were used in the dose-response assay, namely, NC (negative control, MCF-7 cultured without GOS or DMSO), VC (vehicle control, MCF-7 cultured with DMSO but without GOS), IC0 (500 nM, the highest concentration at which MCF-7 showed no growth inhibition), IC10 (2.65 µM), IC20 (4.38 µM), and IC50 (9.37 µM). For the time-course assay, MCF-7 cells were treated with the IC20 concentration of GOS for 12, 24, 36, 48, and 60 h. All cells were trypsinized, resuspended in 200 µl of RNAlater, and stored at −80°C.

Total RNA isolation

Total RNA was extracted from MCF-7 cells with mirVana PARIS kit according to the manufacturer's instructions. RNA quantification was performed with the NanoDrop ND-1000 Micro-Volume UV-Vis Spectrophotometer (NanoDrop Technologies, Wilmington, DE). RNA purity was evaluated by absorbance ratios of 260/280 and 260/230.

Candidate reference genes and primers

In this study, ten candidate genes that have been widely used in previous studies were evaluated to determine the most reliable reference genes. The ten candidates are β-actin (hBAct), glyceraldehyde-3-phosphate dehydrogenase (hGAPDH), succinate dehydrogenase complex, subunit A, flavoprotein (Fp) (hSDHA), tubulin folding cofactor A (hTBCA), tubulin, alpha 1a (hTUBA1A), small nucleolar RNA, C/D box 44 (hRNU44), RNA, U6 small nuclear 1 (hU6), small nucleolar RNA, C/D box 48 (hRNU48), small nucleolar RNA, C/D box 47 (hRNU47), and RNA, 18S ribosomal 1 (h18S). All the primers are listed in Table 2.

Table 2. Primers used for real-time qPCR and their product size, efficiency, and R²
Name | Forward primer | Reverse primer | Product size (bp) | Primer efficiency (%) | R²
hBAct | CTCACCGAGCGCGGCTACAG | GGAGCTGGAAGCAGCCGTGG | 126 | 92 | 0.9997
hGAPDH | CCCGCTTCGCTCTCTGCTCC | GAGCGATGTGGCTCGGCTGG | 77 | 101 | 0.9982
hSDHA | CGACACCGTGAAGGGCTCCG | TCTAGCTCGACCACGGCGGC | 90 | 103 | 0.9997
hTBCA | GCGTCGCCCTCCACGGTTAC | ACCAACCGCTTCACCACGCC | 120 | 95 | 0.9999
hTUBA1A | CGCGAAGCAGCAACCATGCG | GGCATCTGGCCATCGGGCTG | 125 | 99 | 0.9967
hRNU44 | CCTGGATGATGATAAGCAAATG | GTCAGTTAGAGCTAATTAAGACC | 60 | 99 | 0.9181
hU6 | CTGCGCAAGGATGACACGCA | AAAAATATGGAACGCTTCACG | 45 | 102 | 1.0000
hRNU48 | TGATGACCCCAGGTAACTCTGAGTGTG | GGTCAGAGCGCTGCGGTGATG | 58 | 98 | 0.9999
hRNU47 | ACCAATGATGTAATGATTCTGCCA | ACCTCAGAATCAAAATGGAACGG | 75 | 93 | 1.0000
h18S | TTGTACACACCGCCCGTCGC | CTTCTCAGCGCTCCGCCAGG | 102 | 65 | 0.9987

cDNA synthesis and real-time PCR

Total RNA (500 ng) was used for reverse transcription with TaqMan microRNA Reverse Transcription kit. In the reactions of five protein-coding reference genes, 2 µl of polyT was utilized as the reverse transcription primer; while in those of five non-coding reference genes, reverse primers for each gene were pooled and 3 µl of the pooled mixture was used as reverse transcription primer. After reverse transcription, 80 µl of DNase/RNase free water was added into each reverse transcription product to make the template for real-time qPCR.

Real-time qPCR was subsequently carried out on 96-well reaction plates in a 7300 Real-Time PCR System (Applied Biosystems). In a total volume of 20 µl, the reaction mixture included 10 µl of Real-Time SYBR Green PCR master mix, 3 µl of diluted reverse transcription product, 0.5 µl each of forward and reverse primers, and 6 µl of DNase/RNase free water. Each reaction was carried out with two technical replicates.

The reaction mixtures were initially heated at 95°C for 10 min to activate the polymerase, followed by 40 cycles, which consisted of denaturation step at 95°C for 15 sec and a combined annealing/elongation step at 60°C for 60 sec. A melting curve analysis was immediately carried out with temperatures increasing from 60 to 95°C at a 0.1°C interval after the real-time PCR finished.

Analysis of real-time PCR data

qRT-PCR data were analyzed by means of four widely applied algorithms: geNorm (Vandesompele et al., 2002), NormFinder (Andersen et al., 2004), BestKeeper (Pfaffl et al., 2004), and the comparative ΔCt method (Silver et al., 2006).

geNorm, a freely available Excel add-in, was used in this study. It identifies the two most stable reference genes, or a combination of multiple stable genes, by calculating an expression stability measure (M value) for each candidate; normalization factors are then derived from the geometric mean of the selected reference genes (Vandesompele et al., 2002). The M value, defined as the average pair-wise variation of a single candidate with all other candidates, is the indicator of expression stability in geNorm: the lower the M value, the more stable the gene. Stepwise exclusion of the least stable gene (the one with the highest M value) ultimately leaves the two most stable genes, which cannot be ranked further. geNorm also generates a graph of the pair-wise variation between two sequential normalization factors, the second of which contains one additional gene. A large variation means the additional gene has a significant effect and should preferably be included, while a small variation means there is no need to include it. The optimal number of reference genes can thereby be determined.
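The M value calculation described above can be illustrated with a minimal Python sketch. It is not the geNorm add-in itself: the function names and the demonstration data are hypothetical, and the sketch assumes that the input values are relative expression quantities (not raw Ct values) and that pair-wise variation is taken as the sample standard deviation of log2 expression ratios.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def m_values(quantities):
    """Return a geNorm-style stability value M for each gene.

    quantities maps gene name -> list of relative expression values,
    one per sample, in the same sample order for every gene.
    """
    genes = list(quantities)
    pair_sd = {}
    for a, b in combinations(genes, 2):
        # Pair-wise variation: SD of the log2 expression ratios across samples
        log_ratios = [math.log2(x / y) for x, y in zip(quantities[a], quantities[b])]
        pair_sd[frozenset((a, b))] = stdev(log_ratios)
    # M for a gene: mean pair-wise variation against every other gene
    return {
        g: mean([pair_sd[frozenset((g, other))] for other in genes if other != g])
        for g in genes
    }


def rank_by_stepwise_exclusion(quantities):
    """Repeatedly drop the gene with the highest M until two genes remain.

    Returns (excluded, final_pair); excluded lists genes from least stable
    to more stable, and final_pair holds the two genes left unranked.
    """
    remaining = dict(quantities)
    excluded = []
    while len(remaining) > 2:
        m = m_values(remaining)
        worst = max(m, key=m.get)
        excluded.append(worst)
        del remaining[worst]
    return excluded, sorted(remaining)


# Hypothetical relative quantities for three genes in four samples
demo = {
    "geneA": [1.00, 1.10, 0.95, 1.05],
    "geneB": [1.00, 1.12, 0.93, 1.04],
    "geneC": [1.00, 0.60, 1.70, 0.80],
}
excluded, best_pair = rank_by_stepwise_exclusion(demo)
print(excluded, best_pair)
```

In this toy data set geneC fluctuates independently of the other two genes and is excluded first, leaving geneA and geneB as the unranked most stable pair.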

NormFinder is another Excel-based VBA (Visual Basic for Applications) applet aiming at identifying the optimal normalization gene among a panel of candidates according to their expression stability in a given sample set or given experimental designs (Andersen et al., 2004). This algorithm evaluates not only the overall expression variation of the candidate reference genes, but also the variation between subgroups of samples. In addition to data from real-time PCR, NormFinder can also analyze expression data obtained through other quantitative gene expression methods, such as microarray.

BestKeeper is also an Excel-based software tool, which determines the best suited reference genes using pair-wise correlation analysis of candidate reference genes (Pfaffl et al., 2004). BestKeeper estimates correlations of the expression levels of all candidate genes, and highly correlated ones are then combined into an index. Three indicators, standard deviation, coefficient of variation (CV, expressed as a percentage), and power of the candidates, are calculated to help users determine the best reference genes.
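The descriptive statistics and the index correlation mentioned above can be sketched as follows. This is a simplified illustration and not the BestKeeper spreadsheet: the function names and data are hypothetical, the sample standard deviation is used as the deviation estimate (the published tool may use a different estimator), and the index is assumed here to be the per-sample geometric mean of the Ct values of all candidates rather than of a pre-selected, highly correlated subset.

```python
import math
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def bestkeeper_like_summary(ct):
    """Summarize raw Ct values per gene: GM, AM, SD, CV (%) and r vs. index.

    ct maps gene name -> list of Ct values, one per sample.
    """
    genes = list(ct)
    n_samples = len(ct[genes[0]])
    # Assumed index: geometric mean of Ct across all candidate genes, per sample
    index = [
        math.exp(mean([math.log(ct[g][i]) for g in genes]))
        for i in range(n_samples)
    ]
    summary = {}
    for g in genes:
        values = ct[g]
        am = mean(values)
        sd = stdev(values)
        summary[g] = {
            "GM": math.exp(mean([math.log(v) for v in values])),
            "AM": am,
            "SD": sd,
            "CV": 100 * sd / am,
            "r": pearson(values, index),
        }
    return summary


# Hypothetical Ct values for two genes in four samples
demo_ct = {
    "geneA": [20.0, 20.2, 19.9, 20.1],
    "geneB": [15.0, 16.5, 14.0, 15.5],
}
summary = bestkeeper_like_summary(demo_ct)
for gene, stats in summary.items():
    print(gene, {k: round(v, 3) for k, v in stats.items()})
```

A gene with a low SD and CV varies little in its raw Ct values, while the correlation coefficient indicates how closely it follows the combined index.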

The comparative ΔCt method assesses the most stable reference genes by comparing the relative expression of "pairs of genes" within each tissue sample or each treatment (Silver et al., 2006). If the ΔCt value between two candidate genes remains constant across treatment groups, the two genes are either both transcriptionally stable or co-regulated. However, if the ΔCt value fluctuates, as indicated by a higher standard deviation, then at least one of the two candidates is variably transcribed. The stability of a gene is measured by the mean of the standard deviation values derived from comparing that gene with each of the other candidates.
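A minimal Python sketch of this pair-wise procedure is given below. The function name and the Ct values are hypothetical and serve only to illustrate the calculation; the sample standard deviation is assumed.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct):
    """Rank genes by the mean SD of their pair-wise delta-Ct values.

    ct maps gene name -> list of raw Ct values, one per sample.
    Returns (gene, score) pairs sorted from most to least stable.
    """
    genes = list(ct)
    pair_sd = {}
    for a, b in combinations(genes, 2):
        # Delta-Ct of the pair in every sample, then its SD across samples
        deltas = [x - y for x, y in zip(ct[a], ct[b])]
        pair_sd[frozenset((a, b))] = stdev(deltas)
    score = {
        g: mean([pair_sd[frozenset((g, other))] for other in genes if other != g])
        for g in genes
    }
    return sorted(score.items(), key=lambda item: item[1])


# Hypothetical Ct values: geneA and geneB drift together, geneC does not
demo_ct = {
    "geneA": [20.0, 21.0, 22.0],
    "geneB": [15.0, 16.0, 17.0],
    "geneC": [18.0, 16.0, 20.0],
}
ranking = delta_ct_stability(demo_ct)
print(ranking)
```

In this example geneA and geneB shift by the same amount in every sample, so their ΔCt is constant and both score well. This reflects the point made above that a constant ΔCt cannot distinguish stable genes from co-regulated ones.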

All other statistical analyses were performed in the R programming language.

Results

The major goal of this study was to determine a set of reliable reference genes for gene expression analysis by real-time qPCR during drug, toxicant, or chemical treatment. IC20 concentrations of five chemicals (6 nM PTX, 5 µM GOS, 100 µM JAS, 1 mM NIC, and 1 mM mela) were used to treat breast cancer cells. Ten candidate reference genes were investigated, of which five were protein-coding genes (ACTB, GAPDH, SDHA, TBCA, and TUBA1A) and the other five were non-protein-coding genes (SNORD44, RNU6-1, SNORD48, SNORD47, and RN18S1) (Table 1). Among the five protein-coding genes, ACTB and TUBA1A encode cytoskeletal structural proteins, TBCA is involved in tubulin turnover, and GAPDH and SDHA encode two enzymes that function in glucose metabolism. ACTB, GAPDH, and SDHA are frequently used as reference genes and are sometimes termed "gold standards." The 18S ribosomal RNA, RN18S1, has also often been used as an internal control in gene expression analysis (Kim et al., 2003; Ho-Pun-Cheung et al., 2009). Four small non-coding RNAs, the small nucleolar RNAs SNORD44, SNORD47, and SNORD48 and the small nuclear RNA RNU6-1, were also considered because they are suggested as good reference genes for miRNA expression analysis by some commercial products (http://www.sabiosciences.com/). Details of the candidate reference genes and the forward and reverse primers are listed in Tables 1 and 2.

Table 1. Ten candidate reference genes evaluated in this study
Used name | HGNC symbol | Name | Accession number | Location | Gene type | Function
hBAct | ACTB | Actin, beta | NM_001101 | 7p15-p12 | Protein coding | Cytoskeletal structural protein
hGAPDH | GAPDH | Glyceraldehyde-3-phosphate dehydrogenase | NM_002046 | 12p13 | Protein coding | Oxidoreductase in glycolysis and gluconeogenesis
hSDHA | SDHA | Succinate dehydrogenase complex, subunit A, flavoprotein (Fp) | NM_004168 | 5p15 | Protein coding | Involved in the oxidation of succinate
hTBCA | TBCA | Tubulin folding cofactor A | NM_004607 | 5q14.1 | Protein coding | Capturing and stabilizing beta-tubulin intermediates
hTUBA1A | TUBA1A | Tubulin, alpha 1a | NM_006009 | 12q12-q14.3 | Protein coding | Cytoskeletal structural protein
hRNU44 | SNORD44 | Small nucleolar RNA, C/D box 44 | NR_002750 | 1q25.1 | snoRNA | RNA biogenesis
hU6 | RNU6-1 | RNA, U6 small nuclear 1 | NR_004394 | 15q23 | snRNA | Splicing of pre-mRNA
hRNU48 | SNORD48 | Small nucleolar RNA, C/D box 48 | NR_002745 | 6p21.33 | snoRNA | RNA biogenesis
hRNU47 | SNORD47 | Small nucleolar RNA, C/D box 47 | NR_002746 | 1q25.1 | snoRNA | RNA biogenesis
h18S | RN18S1 | RNA, 18S ribosomal 1 | NR_003286 | 22p12 | rRNA | A component of small ribosomal subunit

Primer specificity and efficiency analysis

All primer sets for the candidate reference genes were designed according to general rules of qPCR primer design. The primers are 20–27 nt in length and the expected PCR product sizes range from 45 to 126 bp (base pairs) (Table 2). After real-time qPCR, a dissociation program was run to determine the specificity of each primer set. The melting curve of each gene showed a single peak, indicating that each primer set yields only one PCR product and is thus highly specific (data not shown).

The amplification efficiency of each primer set was calculated using duplicates of a 10-fold dilution series of VC MCF-7 cDNA (50–0.05 ng) as templates. A comparative Ct method was used to determine the primer efficiency. Briefly, the individual Ct values were plotted against the logarithm-transformed template concentration, and the slope of the standard curve was determined by linear regression analysis. Primer efficiency was determined according to the following equation: Efficiency = [10^(−1/slope) − 1] × 100%. All PCR primer sets showed correlation coefficients of R² > 0.92 and primer efficiencies ranging from 92% to 103%, except RN18S1 (Table 2). Although the primer efficiency of RN18S1 was as low as 65%, we still included this gene in further studies.
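As an illustration of the efficiency equation above, the following Python sketch fits the standard curve by ordinary least squares and converts the slope into a percentage efficiency. The function name and the Ct values are hypothetical; the dilution series mirrors the 50–0.05 ng range stated in the text, and the example assumes an ideal reaction in which each 10-fold dilution delays the Ct by log2(10) cycles.

```python
import math


def primer_efficiency(template_ng, ct_values):
    """Fit Ct = slope * log10(template) + intercept by least squares.

    Returns (slope, efficiency_percent, r_squared).
    """
    xs = [math.log10(c) for c in template_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    # Efficiency = [10^(-1/slope) - 1] x 100%
    efficiency = (10 ** (-1 / slope) - 1) * 100
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, efficiency, r_squared


# 10-fold dilution series of template (ng) with idealized, hypothetical Ct values
template = [50, 5, 0.5, 0.05]
ideal_ct = [18 + i * math.log2(10) for i in range(4)]
slope, efficiency, r_squared = primer_efficiency(template, ideal_ct)
print(round(slope, 3), round(efficiency, 1), round(r_squared, 4))
```

A slope close to −3.32 corresponds to an efficiency of about 100%, that is, a doubling of product in every cycle; shallower or steeper slopes give efficiencies above or below 100%, respectively.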

Expression levels of candidate reference genes

The threshold cycle (Ct) value is the amplification cycle number at which a defined threshold fluorescence is reached. To assess the expression levels of all ten candidates in the VC group and the five treatment groups, all Ct values obtained are shown as a boxplot (Fig. 1). All candidate reference genes displayed median Ct values ranging from 12 to 23. hBAct, hRNU44, hU6, hRNU48, hRNU47, and h18S showed relatively high expression, with median Ct values around 15, while hGAPDH, hSDHA, hTBCA, and hTUBA1A were moderately abundant, with median Ct values of about 22. The expression level of each candidate gene also varied among treatment groups, with the IQR (interquartile range) of the Ct values ranging from 0.21 (hTBCA) to 0.95 (hSDHA) and the standard deviation ranging from 0.18 (hTBCA) to 0.55 (hSDHA). It is also of interest that, except for hBAct, protein-coding genes were less abundant and displayed higher Ct values than non-coding genes. Protein-coding genes also seemed to be more variable than non-coding genes in general, although the least variable reference gene, hTBCA, is a protein-coding gene. However, IQR and standard deviation values alone give only a descriptive indication of expression stability, so more robust analyses by four different algorithms are presented in the following sections.

Fig. 1. Real-time PCR cycle threshold (Ct) values for ten candidate reference genes in VC MCF-7 cells and in MCF-7 cells that were treated with paclitaxel, gossypol, methyl jasmonate, L-nicotine, and melamine. The solid lines within the boxes indicate median Ct values and the upper and lower hinges indicate the 75th and 25th percentiles. The whiskers show the largest and smallest Ct values that fall within a distance of 1.5 times the IQR (interquartile range) from the upper and lower hinges. Outliers are shown as small circles. The line in the middle of the figure separates protein-coding genes (left) from non-coding genes (right).

Evaluation of relative expression stability

The Ct values for each candidate reference gene in the different treatment groups were analyzed for relative expression stability using four different normalization methods: geNorm (Vandesompele et al., 2002), NormFinder (Andersen et al., 2004), BestKeeper (Pfaffl et al., 2004), and the comparative ΔCt method (Silver et al., 2006). Of the four methods, NormFinder, BestKeeper, and geNorm are Excel-based applets. NormFinder and geNorm use relative expression values as input data, while BestKeeper and the comparative ΔCt method use raw Ct values.
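The conversion from raw Ct values to relative expression values is not specified in the text. A commonly used transformation, shown here purely as an assumption, scales each gene to the sample with the lowest Ct, so that Q = E^(minCt − Ct), where E is the amplification factor per cycle (2 for 100% efficiency). The function name and the Ct values are hypothetical.

```python
def relative_quantities(ct_values, amplification_factor=2.0):
    """Convert raw Ct values of one gene into relative quantities.

    The sample with the lowest Ct (highest expression) is set to 1.
    amplification_factor is the fold increase per cycle, e.g. 2.0 for
    100% primer efficiency or 1.95 for 95% (1 + efficiency / 100).
    """
    min_ct = min(ct_values)
    return [amplification_factor ** (min_ct - ct) for ct in ct_values]


# Hypothetical Ct values for one gene in three samples
print(relative_quantities([20.0, 21.0, 22.0]))  # [1.0, 0.5, 0.25]
```

With the default factor of 2, a sample that crosses the threshold one cycle later is assigned half the relative quantity.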

geNorm

The geNorm applet can identify the most stable reference genes in a given series of treatment groups, and it can also determine the minimal number of reference genes needed to achieve a reliable normalization result (Vandesompele et al., 2002). By calculating the average pair-wise variation of a particular gene with all other genes, all candidate reference genes are ranked by average expression stability value (M value) from most stable to most variable. Stepwise exclusion of the least stable gene (the one with the highest M value) ultimately leaves the two most stable genes, which cannot be ranked further. In the pooled group, four small RNAs, hRNU44, hRNU48, hRNU47, and hU6, are listed as the most stable reference genes, followed by the protein-coding gene hTBCA, which has the smallest IQR and standard deviation values. hBAct, hGAPDH, and hSDHA, frequently used as reference genes in previous studies, are more variable than the small RNAs and hTBCA in the experimental treatments of this study. Apart from its low primer efficiency, h18S may not be a favorable reference gene for studying drug or chemical treatment according to this analysis, because it appears to be the most variable of all the candidates, both in most individual treatment groups and in all treatment groups as a whole (Fig. 2A and Table 3). The ranking of candidate reference genes is similar but not identical among the individual treatment groups. The PTX treatment group is the most discrepant, because h18S ranks as the 6th most stable reference gene while hTUBA1A is the most variable one in this group (Table 3).

Fig. 2. A: Average expression stability (M) values of remaining reference genes during stepwise exclusion of the least stable reference gene with the highest M value in individual treatment groups and in a group with all treatments as a whole. B: Determination of the minimal number of reference genes. Large variation (Vn/n+1) means the additional gene has a significant effect and therefore should preferably be included.

Table 3. Reference genes ranked in groups treated with individual chemicals and as a whole (most stable first)
Rank | PTX | GOS | JAS | NIC | mela | Whole
1–2 | hRNU44 and hRNU48 | hRNU44 and hRNU47 | hRNU44 and hRNU47 | hRNU44 and hRNU47 | hRNU44 and hRNU47 | hRNU44 and hRNU48
3 | hRNU47 | hRNU48 | hRNU48 | hRNU48 | hRNU48 | hRNU47
4 | hU6 | hU6 | hTBCA | hTBCA | hTBCA | hU6
5 | hTBCA | hTUBA1A | hU6 | hTUBA1A | hU6 | hTBCA
6 | h18S | hTBCA | hTUBA1A | hU6 | hTUBA1A | hGAPDH
7 | hGAPDH | hGAPDH | hGAPDH | hGAPDH | hBAct | hBAct
8 | hBAct | hSDHA | hBAct | hBAct | hGAPDH | hSDHA
9 | hSDHA | hBAct | hSDHA | hSDHA | hSDHA | hTUBA1A
10 | hTUBA1A | h18S | h18S | h18S | h18S | h18S

Additionally, geNorm estimates the minimal number of reference genes required by calculating the pair-wise variation value (V value) among the candidate reference genes (Fig. 2B). The graph generated by geNorm shows the pair-wise variation V between two sequential normalization factors, the second of which contains one additional gene. A large variation means the additional gene has a significant effect and should preferably be included, while a small variation means there is no need to include the additional reference gene. A cutoff value of 0.1 was set in this study, which is more stringent than the default cutoff value of 0.15 in the original report (Vandesompele et al., 2002). Figure 2B shows that the variation between a combination of two and a combination of three reference genes (V2/3) is less than 0.1 in every treatment group and in the pooled group, so there is no need to include a third reference gene. In the GOS, JAS, NIC, and mela treatment groups, a combination of hRNU44 and hRNU47 is sufficient, while in the PTX and pooled groups, a combination of hRNU44 and hRNU48 is sufficient to achieve a satisfactory normalization result.
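The pair-wise variation between sequential normalization factors can be sketched in Python as follows. This is an illustration rather than the geNorm implementation: function names and data are hypothetical, the normalization factor is taken as the geometric mean of the relative quantities of the n most stable genes in each sample, and V is taken as the sample standard deviation of the log2 ratio of the two factors.

```python
import math
from statistics import stdev


def normalization_factors(quantities, genes):
    """Per-sample geometric mean of the relative quantities of the given genes."""
    n_samples = len(quantities[genes[0]])
    return [
        math.exp(sum(math.log(quantities[g][i]) for g in genes) / len(genes))
        for i in range(n_samples)
    ]


def pairwise_variation(quantities, ranked_genes, n):
    """V(n/n+1): variation between factors built from n and n + 1 genes.

    ranked_genes must be ordered from most to least stable.
    """
    nf_n = normalization_factors(quantities, ranked_genes[:n])
    nf_next = normalization_factors(quantities, ranked_genes[: n + 1])
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_next)])


# Hypothetical relative quantities; geneC is far more variable than geneA and geneB
demo_q = {
    "geneA": [1.0, 1.10, 0.90, 1.0],
    "geneB": [1.0, 1.05, 0.95, 1.0],
    "geneC": [1.0, 3.00, 0.30, 1.0],
}
ranked = ["geneA", "geneB", "geneC"]
v_2_3 = pairwise_variation(demo_q, ranked, 2)
print(round(v_2_3, 3), "below 0.1 cutoff" if v_2_3 < 0.1 else "above 0.1 cutoff")
```

The value is then compared with the chosen cutoff (0.1 in this study, 0.15 in the original report) to decide whether the additional gene changes the normalization factor appreciably.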

NormFinder

NormFinder uses a model-based strategy to identify suitable reference genes for normalizing qRT-PCR data. Unlike geNorm, NormFinder assesses the expression stability of each candidate independently. The results of the NormFinder analysis are given in Table 6. hTBCA ranks first in the list of most stable reference genes, followed by four small RNAs, which is consistent with the conclusion drawn above from the IQR and standard deviation values of individual candidates but slightly different from the geNorm result. Consistent with the results of geNorm, hBAct, hGAPDH, and hSDHA were not good reference genes for pharmaceutical and toxicological studies; they ranked only 7th, 6th, and 8th, respectively, according to NormFinder. The ribosomal RNA h18S ranks as the most variable reference gene among the candidates. In summary, hTBCA and hU6 are the best combination of two reference genes according to NormFinder.

Table 6. Ten candidate reference genes ranked by different methods
Ranking | geNorm | NormFinder | BestKeeper | ΔCt method | Overall ranking
1 | hRNU44 and hRNU48 (ranks 1–2, not ranked further) | hTBCA | hTBCA | hTBCA | hTBCA
2 | (as above) | hU6 | hRNU44 | hU6 | hRNU44
3 | hRNU47 | hRNU48 | hRNU48 | hRNU44 | hRNU48
4 | hU6 | hRNU44 | hU6 | hRNU48 | hU6
5 | hTBCA | hRNU47 | hRNU47 | hRNU47 | hRNU47
6 | hGAPDH | hGAPDH | hTUBA1A | hGAPDH | hGAPDH
7 | hBAct | hBAct | hGAPDH | hBAct | hBAct
8 | hSDHA | hSDHA | hBAct | hSDHA | hTUBA1A
9 | hTUBA1A | hTUBA1A | h18S | hTUBA1A | hSDHA
10 | h18S | h18S | hSDHA | h18S | h18S

BestKeeper

Analysis with BestKeeper showed that the coefficient of variation (CV) and standard deviation values of hTBCA were the lowest of all reference genes analyzed, followed by those of the small RNAs, indicating that the expression stability of these candidates was higher than that of the others. However, hU6, hGAPDH, and hTBCA had the highest correlation coefficients, revealing that the expression of these candidates correlates well with each other and with the BestKeeper index. The summarized results from the BestKeeper analysis of the ten candidate reference genes are listed in Table 4.

Table 4. Expression stability evaluated by BestKeeper
Factor | hBAct | hGAPDH | hSDHA | hTBCA | hTUBA1A | hRNU44 | hU6 | hRNU48 | hRNU47 | h18S
N | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24
GM [Ct] | 16.77 | 21.44 | 23.12 | 21.32 | 22.92 | 15.79 | 12.23 | 17.58 | 14.58 | 15.76
AM [Ct] | 16.78 | 21.44 | 23.13 | 21.32 | 22.93 | 15.79 | 12.23 | 17.58 | 14.58 | 15.77
Min [Ct] | 15.95 | 20.80 | 22.26 | 20.98 | 21.86 | 15.42 | 11.90 | 17.15 | 14.09 | 14.74
Max [Ct] | 17.62 | 22.29 | 24.29 | 21.68 | 23.60 | 16.13 | 12.97 | 17.89 | 15.03 | 16.66
SD [±Ct] | 0.44 | 0.35 | 0.48 | 0.15 | 0.35 | 0.18 | 0.21 | 0.19 | 0.22 | 0.44
CV [%Ct] | 2.62 | 1.63 | 2.06 | 0.69 | 1.52 | 1.16 | 1.71 | 1.09 | 1.50 | 2.80
Min [x-fold] | −1.76 | −1.56 | −1.82 | −1.26 | −2.09 | −1.30 | −1.25 | −1.35 | −1.40 | −2.02
Max [x-fold] | 1.81 | 1.81 | 2.25 | 1.28 | 1.60 | 1.27 | 1.68 | 1.23 | 1.36 | 1.87
SD [±x-fold] | 1.36 | 1.27 | 1.39 | 1.11 | 1.27 | 1.14 | 1.16 | 1.14 | 1.16 | 1.36
Coeff. of corr. [R] | 0.55 | 0.67 | 0.53 | 0.60 | −0.21 | −0.21 | 0.69 | −0.20 | 0.24 | 0.28
P-value | 0.005 | 0.001 | 0.008 | 0.002 | 0.333 | 0.314 | 0.001 | 0.352 | 0.255 | 0.194

The comparative ΔCt method

The comparative ΔCt method assesses the most stable reference genes by comparing the relative expression of "pairs of genes" within each tissue sample or each treatment (Silver et al., 2006). No advanced mathematical methodology is required, so the method is well suited for non-specialists who need to determine the best reference genes. First, the ΔCt and the mean ΔCt between every pair of genes are calculated, and then the standard deviation of each set of ΔCt values is determined. ΔCt variability is shown as median (bar), 25–75 percentile (box), range (whiskers), and outliers (small circles) for the VC group and the five treatment groups (Fig. 3). To determine the stability of an individual gene, for example hBAct, the arithmetic mean of all standard deviations involving hBAct is calculated. A higher arithmetic mean indicates less stability, while a lower arithmetic mean indicates more stability (Table 5).
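As a worked check of this averaging step, the short Python snippet below takes the nine pair-wise standard deviations reported in Table 5 for hBAct and for hTBCA and computes their arithmetic means. Only the variable names are introduced here; the numbers are copied from Table 5.

```python
from statistics import mean

# Pair-wise delta-Ct standard deviations from Table 5, in the order
# hGAPDH, hSDHA, hTBCA, hTUBA1A, hRNU44, hU6, hRNU48, hRNU47, h18S
hbact_sds = [0.376, 0.238, 0.403, 0.675, 0.690, 0.508, 0.683, 0.682, 0.820]

# Same for hTBCA, in the order
# hBAct, hGAPDH, hSDHA, hTUBA1A, hRNU44, hU6, hRNU48, hRNU47, h18S
htbca_sds = [0.403, 0.346, 0.455, 0.559, 0.373, 0.263, 0.370, 0.375, 0.520]

hbact_score = round(mean(hbact_sds), 3)
htbca_score = round(mean(htbca_sds), 3)
print(hbact_score, htbca_score)  # 0.564 0.407
```

Both values agree with the "Mean StdDev" entries in Table 5, and the lower score of hTBCA corresponds to its higher stability in the ΔCt ranking.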

Fig. 3. Pair-wise gene expression stability analysis of ten candidate reference genes using the comparative ΔCt method. ΔCt variability is shown as median (bar), 25–75 percentile (box), range (whiskers) and outliers (small circles) for the vehicle control group and five treatment groups.

Table 5. Comparative ΔCt method to determine reference gene stability (two side-by-side panels; Mean StdDev is given on the last row of each gene block)
Sample | Mean ΔCt | StdDev | Mean StdDev | Sample | Mean ΔCt | StdDev | Mean StdDev
hBAct vs. hGAPDH | −4.666 | 0.376 | | hRNU44 vs. hBAct | 0.964 | 0.690 |
hBAct vs. hSDHA | −6.353 | 0.238 | | hRNU44 vs. hGAPDH | 5.630 | 0.577 |
hBAct vs. hTBCA | −4.541 | 0.403 | | hRNU44 vs. hSDHA | 7.317 | 0.723 |
hBAct vs. hTUBA1A | −6.152 | 0.675 | | hRNU44 vs. hTBCA | 5.506 | 0.373 |
hBAct vs. hRNU44 | 0.964 | 0.690 | | hRNU44 vs. hTUBA1A | 7.117 | 0.472 |
hBAct vs. hU6 | 4.547 | 0.508 | | hRNU44 vs. hU6 | 3.583 | 0.346 |
hBAct vs. hRNU48 | −0.808 | 0.683 | | hRNU44 vs. hRNU48 | −1.772 | 0.100 |
hBAct vs. hRNU47 | 2.192 | 0.682 | | hRNU44 vs. hRNU47 | 1.227 | 0.172 |
hBAct vs. h18S | 1.008 | 0.820 | 0.564 | hRNU44 vs. h18S | 0.044 | 0.585 | 0.449
hGAPDH vs. hBAct | −4.666 | 0.376 | | hU6 vs. hBAct | 4.547 | 0.508 |
hGAPDH vs. hSDHA | −1.687 | 0.432 | | hU6 vs. hGAPDH | 9.213 | 0.438 |
hGAPDH vs. hTBCA | 0.125 | 0.346 | | hU6 vs. hSDHA | 10.900 | 0.557 |
hGAPDH vs. hTUBA1A | −1.486 | 0.696 | | hU6 vs. hTBCA | 9.089 | 0.263 |
hGAPDH vs. hRNU44 | 5.630 | 0.577 | | hU6 vs. hTUBA1A | 10.700 | 0.571 |
hGAPDH vs. hU6 | 9.213 | 0.438 | | hU6 vs. hRNU44 | 3.583 | 0.346 |
hGAPDH vs. hRNU48 | 3.858 | 0.557 | | hU6 vs. hRNU48 | −5.355 | 0.355 |
hGAPDH vs. hRNU47 | 6.858 | 0.537 | | hU6 vs. hRNU47 | −2.356 | 0.324 |
hGAPDH vs. h18S | 5.674 | 0.657 | 0.513 | hU6 vs. h18S | −3.539 | 0.572 | 0.437
hSDHA vs. hBAct | −6.353 | 0.238 | | hRNU48 vs. hBAct | −0.808 | 0.683 |
hSDHA vs. hGAPDH | −1.687 | 0.432 | | hRNU48 vs. hGAPDH | 3.858 | 0.557 |
hSDHA vs. hTBCA | 1.811 | 0.455 | | hRNU48 vs. hSDHA | 5.545 | 0.716 |
hSDHA vs. hTUBA1A | 0.200 | 0.756 | | hRNU48 vs. hTBCA | 3.733 | 0.370 |
hSDHA vs. hRNU44 | 7.317 | 0.723 | | hRNU48 vs. hTUBA1A | 5.344 | 0.455 |
hSDHA vs. hU6 | 10.900 | 0.557 | | hRNU48 vs. hRNU44 | −1.772 | 0.100 |
hSDHA vs. hRNU48 | 5.545 | 0.716 | | hRNU48 vs. hU6 | −5.355 | 0.355 |
hSDHA vs. hRNU47 | 8.544 | 0.710 | | hRNU48 vs. hRNU47 | 3.000 | 0.193 |
hSDHA vs. h18S | 7.361 | 0.884 | 0.608 | hRNU48 vs. h18S | 1.816 | 0.607 | 0.449
hTBCA vs. hBAct | −4.541 | 0.403 | | hRNU47 vs. hBAct | 2.192 | 0.682 |
hTBCA vs. hGAPDH | 0.125 | 0.346 | | hRNU47 vs. hGAPDH | 6.858 | 0.537 |
hTBCA vs. hSDHA | 1.811 | 0.455 | | hRNU47 vs. hSDHA | 8.544 | 0.710 |
hTBCA vs. hTUBA1A | −1.611 | 0.559 | | hRNU47 vs. hTBCA | 6.733 | 0.375 |
hTBCA vs. hRNU44 | 5.506 | 0.373 | | hRNU47 vs. hTUBA1A | 8.344 | 0.574 |
hTBCA vs. hU6 | 9.089 | 0.263 | | hRNU47 vs. hRNU44 | 1.227 | 0.172 |
hTBCA vs. hRNU48 | 3.733 | 0.370 | | hRNU47 vs. hU6 | −2.356 | 0.324 |
hTBCA vs. hRNU47 | 6.733 | 0.375 | | hRNU47 vs. hRNU48 | 3.000 | 0.193 |
hTBCA vs. h18S | 5.550 | 0.520 | 0.407 | hRNU47 vs. h18S | −1.183 | 0.557 | 0.458
hTUBA1A vs. hBAct | −6.152 | 0.675 | | h18S vs. hBAct | 1.008 | 0.820 |
hTUBA1A vs. hGAPDH | −1.486 | 0.696 | | h18S vs. hGAPDH | 5.674 | 0.657 |
hTUBA1A vs. hSDHA | 0.200 | 0.756 | | h18S vs. hSDHA | 7.361 | 0.884 |
hTUBA1A vs. hTBCA | −1.611 | 0.559 | | h18S vs. hTBCA | 5.550 | 0.520 |
hTUBA1A vs. hRNU44 | 7.117 | 0.472 | | h18S vs. hTUBA1A | 7.161 | 0.898 |
hTUBA1A vs. hU6 | 10.700 | 0.571 | | h18S vs. hRNU44 | 0.044 | 0.585 |
hTUBA1A vs. hRNU48 | 5.344 | 0.455 | | h18S vs. hU6 | −3.539 | 0.572 |
hTUBA1A vs. hRNU47 | 8.344 | 0.574 | | h18S vs. hRNU48 | 1.816 | 0.607 |
hTUBA1A vs. h18S | 7.161 | 0.898 | 0.629 | h18S vs. hRNU47 | −1.183 | 0.557 | 0.678

The results of the comparative ΔCt method were similar to those of NormFinder and BestKeeper but differed slightly from those of geNorm. hTBCA was the most stably expressed reference gene, followed by four small non-coding RNAs: hU6, hRNU44, hRNU48, and hRNU47. The conventional reference genes hGAPDH, hBAct, and hSDHA took the next three places and are not good choices as reference genes. Ribosomal RNA h18S was the most variable and is therefore an unsuitable reference gene in this experimental setting.

Final ranking of candidate reference genes

Taking into account the rankings from all four algorithms, we obtained an overall ranking of the candidate reference genes (Table 6). The procedure was as follows. First, the rank of each reference gene under each algorithm was listed; for example, hBAct ranked 7th, 7th, 8th, and 7th in geNorm, NormFinder, BestKeeper, and the comparative ΔCt method, respectively. Second, the geometric mean of the four ranks was calculated; for hBAct the geometric mean is 7.24 [(7 × 7 × 8 × 7)^0.25]. Finally, the candidate reference genes were ranked by geometric mean, the gene with the smallest geometric mean being the most stable. The overall ranking, with geometric means in parentheses, is hTBCA (1.50) > hRNU44 (2.21) > hRNU48 (2.45) > hU6 (2.83) > hRNU47 (4.40) > hGAPDH (6.24) > hBAct (7.24) > hTUBA1A (8.13) > hSDHA (8.46) > h18S (9.74) (Table 6). Thus, tubulin folding cofactor A (TBCA) is the most reliable reference gene, followed by four small RNAs (RNU44, RNU48, U6, and RNU47). In contrast, the traditionally used reference genes 18S rRNA, tubulin, and β-actin were not good reference genes for these pharmaceutical and toxicological studies.
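The geometric-mean step can be written compactly. In the notation below, which is introduced here for illustration and is not the authors' own, r_{g,m} denotes the rank of gene g under method m, and the second expression works through the hBAct example from the text:

```latex
\[
\bar{R}_g = \Bigl(\prod_{m=1}^{4} r_{g,m}\Bigr)^{1/4},
\qquad
\bar{R}_{\mathrm{hBAct}} = (7 \times 7 \times 8 \times 7)^{1/4} = 2744^{1/4} \approx 7.24 .
\]
```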

Evaluation of candidate reference genes in dose- and time-response assay

To assess the stability of these reference genes at different treatment concentrations and time points, we carried out dose-response and time-course assays after GOS treatment (Tables 7 and 8). After analyzing the results with all four algorithms, we found that hTBCA and four small RNAs (RNU44, RNU47, RNU48, and U6) were still the best-suited reference genes, although the variation of the reference genes appeared larger in the dose-response assay than in the time-course assay.

Table 7. Ten candidate reference genes ranked by different methods in GOS dose-response assays
| Ranking | geNorm | NormFinder | BestKeeper | ΔCt method | Overall ranking |
| 1 | hRNU44 and hRNU47 | hTBCA | hRNU47 | hTBCA | hTBCA |
| 2 | (shared with rank 1) | hU6 | hRNU44 | hU6 | hRNU47 |
| 3 | hRNU48 | hSDHA | hRNU48 | hRNU48 | hRNU44 |
| 4 | hU6 | hRNU48 | hU6 | hRNU44 | hU6 |
| 5 | h18S | hRNU44 | hTBCA | hSDHA | hRNU48 |
| 6 | hTBCA | hRNU47 | h18S | hRNU47 | hSDHA |
| 7 | hSDHA | hGAPDH | hSDHA | h18S | h18S |
| 8 | hGAPDH | h18S | hGAPDH | hGAPDH | hGAPDH |
| 9 | hTUBA1A | hTUBA1A | hTUBA1A | hTUBA1A | hTUBA1A |
| 10 | hBAct | hBAct | hBAct | hBAct | hBAct |

Table 8. Ten candidate reference genes ranked by different methods in GOS time-course assays
| Ranking | geNorm | NormFinder | BestKeeper | ΔCt method | Overall ranking |
| 1 | hRNU44 and hRNU47 | hTBCA | hRNU44 | hTBCA | hRNU44 |
| 2 | (shared with rank 1) | hSDHA | hU6 | hU6 | hTBCA |
| 3 | hRNU48 | hU6 | hRNU48 | hRNU44 | hU6 |
| 4 | hU6 | h18S | hRNU47 | h18S | hRNU47 |
| 5 | h18S | hRNU44 | h18S | hSDHA | hRNU48 |
| 6 | hTBCA | hGAPDH | hTBCA | hRNU48 | h18S |
| 7 | hSDHA | hRNU48 | hSDHA | hRNU47 | hSDHA |
| 8 | hGAPDH | hRNU47 | hGAPDH | hGAPDH | hGAPDH |
| 9 | hTUBA1A | hTUBA1A | hTUBA1A | hTUBA1A | hTUBA1A |
| 10 | hBAct | hBAct | hBAct | hBAct | hBAct |
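
As a consistency check, the overall ranking column of Table 7 can be recomputed from the four per-method columns using the geometric mean of ranks. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that both genes in the geNorm best pair receive rank 1, which the text does not state.

```python
from math import prod


def ranks_from_order(order):
    """Map each gene to its 1-based position in a best-first list."""
    return {gene: position for position, gene in enumerate(order, start=1)}


# Per-method orderings transcribed from Table 7 (GOS dose-response), best first.
genorm = ranks_from_order(["hRNU44", "hRNU47", "hRNU48", "hU6", "h18S",
                           "hTBCA", "hSDHA", "hGAPDH", "hTUBA1A", "hBAct"])
genorm["hRNU47"] = 1  # ASSUMPTION: geNorm's best pair shares rank 1
normfinder = ranks_from_order(["hTBCA", "hU6", "hSDHA", "hRNU48", "hRNU44",
                               "hRNU47", "hGAPDH", "h18S", "hTUBA1A", "hBAct"])
bestkeeper = ranks_from_order(["hRNU47", "hRNU44", "hRNU48", "hU6", "hTBCA",
                               "h18S", "hSDHA", "hGAPDH", "hTUBA1A", "hBAct"])
delta_ct = ranks_from_order(["hTBCA", "hU6", "hRNU48", "hRNU44", "hSDHA",
                             "hRNU47", "h18S", "hGAPDH", "hTUBA1A", "hBAct"])


def overall_ranking(rank_tables):
    """Return (genes sorted best-first, {gene: geometric mean of ranks})."""
    genes = rank_tables[0].keys()
    n = len(rank_tables)
    score = {g: prod(t[g] for t in rank_tables) ** (1 / n) for g in genes}
    return sorted(score, key=score.get), score


order, score = overall_ranking([genorm, normfinder, bestkeeper, delta_ct])
```

Under the stated tie assumption, `order` matches the overall ranking column of Table 7, starting with hTBCA, hRNU47, and hRNU44. Substituting the Table 8 columns gives the Table 8 overall ranking in the same way.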

Discussion

Gene expression analysis is becoming increasingly important in many fields of biological and biomedical research, with qRT-PCR being the most frequently used method for accurate expression profiling of selected genes. Since its introduction in the 1990s, both qRT-PCR assays and qRT-PCR data analysis have developed rapidly. The benefits of qRT-PCR over traditional methods of gene expression analysis include its sensitivity, accuracy, reproducibility, large dynamic range, and potential for high throughput. However, data obtained from qRT-PCR are open to question without normalization to appropriate reference genes. Since the discovery that frequently chosen reference genes such as GAPDH, β-actin, and 18S rRNA are expressed inconsistently across tissues and experimental settings (Schmittgen and Zakrajsek, 2000; Suzuki et al., 2000; Tricarico et al., 2002; Radonic et al., 2004), research has been carried out to identify stable reference genes in various species (Ahn et al., 2008; Huis et al., 2010) and in different animal and plant tissues (Artico et al., 2010; Galiveti et al., 2010). However, to the best of our knowledge, there is no report on reference gene selection in cells treated with anti-cancer drugs, so this study is the first in-depth analysis. Only a few studies have addressed reference gene selection in a toxicological context (Heckmann et al., 2006; Spinsanti et al., 2008). Although the small RNAs RNU48 and U6 have been widely used as reference genes in miRNA expression analysis and are assumed to be good reference genes, no systematic study of their expression stability has been reported. We therefore conducted this study to investigate the effects of three anti-cancer drugs (PTX, GOS, and JAS) and two toxic compounds (NIC and mela) on the RNA levels of selected protein-coding and non-coding reference genes.

Four algorithms, geNorm, NormFinder, BestKeeper, and the comparative ΔCt method, were used to evaluate the variation of the reference genes. Among these four methods, geNorm can only find the optimal combination of genes, whereas NormFinder can find both the most stable single gene and the best pair of genes. NormFinder also provides additional information about inter- and intra-group variation and chooses a pair of genes with opposite expression biases so that the biases cancel. Unlike the other algorithms, BestKeeper can also analyze up to ten target genes as well as ten housekeeping genes. The comparative ΔCt method is especially well suited to the non-specialist, since it requires no advanced mathematics. In general, the results from NormFinder, BestKeeper, and the comparative ΔCt method are more consistent with each other than with those from geNorm, which may be because geNorm takes into account only the overall expression level variation of candidate reference genes, while the other methods include both expression level variation and overall expression level.

Despite these slight discrepancies, the results from all four methods show that the protein-coding reference gene hTBCA and four non-coding reference genes (hRNU44, hRNU48, hU6, and hRNU47) are better reference genes than the others, including the frequently used GAPDH, β-actin, and 18S rRNA. Although any one of these top five reference genes may be sufficient in some experimental settings, a combination of two reference genes is preferred because it produces more accurate data. For example, hRNU44 and hRNU48 should be used together to normalize the PTX treatment group, whereas hRNU44 and hRNU47 are preferred in the other treatment groups. One common reference gene, ribosomal RNA h18S, deserves further investigation because it is highly variable according to all methods used. However, we cannot rule out the possibility that this high variation results from its low amplification efficiency.
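
To illustrate what normalizing to a pair of reference genes involves, the sketch below computes a fold change with the widely used 2^(−ΔΔCt) calculation, averaging the Ct values of two reference genes. It is not taken from this study: the function name, the Ct values, and the assumption of 100% amplification efficiency for every gene are all hypothetical. Under that efficiency assumption, the arithmetic mean of the reference Ct values corresponds to the geometric mean of the reference quantities.

```python
from statistics import mean


def fold_change(target_treated, refs_treated, target_control, refs_control):
    """Relative expression of a target gene, treated vs. control.

    Each refs_* argument is a list of reference-gene Ct values measured
    in the same sample as the corresponding target Ct.
    ASSUMPTION: 100% amplification efficiency (one cycle = factor of 2).
    """
    d_treated = target_treated - mean(refs_treated)  # dCt, treated sample
    d_control = target_control - mean(refs_control)  # dCt, control sample
    return 2 ** -(d_treated - d_control)             # 2^(-ddCt)


# Hypothetical Ct values; the two references could be, e.g., hRNU44 and hRNU48.
fold = fold_change(
    target_treated=24.0, refs_treated=[20.0, 22.0],
    target_control=26.0, refs_control=[20.0, 22.0],
)
# dCt is 3 in the treated sample and 5 in the control, so ddCt = -2
# and the target is 4-fold higher after treatment.
```

If one of the two references drifts with treatment, averaging halves its effect on ΔCt, which is one reason a stable pair is preferable to a single gene.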

In conclusion, the protein-coding reference gene hTBCA and four non-coding reference genes (hRNU44, hRNU48, hU6, and hRNU47) are the most favorable reference genes under these experimental conditions. Because of their similar origin, hRNU44, hRNU48, hU6, and hRNU47 may be more suitable than the protein-coding gene hTBCA for analyzing the expression of small non-coding RNAs, such as miRNAs and siRNAs. For example, a combination of hRNU44 and hRNU48 gives good normalization for the PTX-treated group without involving too many reference genes, while a combination of hRNU44 and hRNU47 is sufficient for the groups treated with GOS, JAS, NIC, and mela. Investigators engaged in pharmacological and toxicological research in breast cancer cell lines exposed to PTX, GOS, JAS, NIC, and mela can apply our findings without further validation. However, different chemicals may cause different expression patterns of a single gene, and different cell lines may respond differently to the same treatment. Because of the preliminary nature of our research, this study could not cover all human cell lines under all experimental conditions, and our findings should be corroborated by future studies in other human cell lines and with additional drug classes and hormonal treatments. Our present study demonstrates the importance of reference gene selection in pharmacological and toxicological studies, and we strongly recommend a preliminary analysis of the stability of candidate reference genes before conducting gene expression analysis by qRT-PCR under experimental settings different from those described herein.

Acknowledgements

This work was partially supported by ECU New Faculty Research Startup Funds Program and ECU Research/Creative Activity Grant.
