Volume 68, Issue 6 pp. 563-573

Linkage Disequilibrium and Haplotype Architecture for two ABC Transporter Genes (ABCC1 and ABCG2) in Chinese Population: Implications for Pharmacogenomic Association Studies

Haijian Wang

Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, Beijing 100850

Chinese National Human Genome Center at Beijing, Beijing 100176

Laboratory of Systems Biology, Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, P. R. China

Bingtao Hao

Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, Beijing 100850

Chinese National Human Genome Center at Beijing, Beijing 100176

Kaixin Zhou

Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, Beijing 100850

Chinese National Human Genome Center at Beijing, Beijing 100176

Xiaoping Chen

Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, Beijing 100850

Chinese National Human Genome Center at Beijing, Beijing 100176

Songfeng Wu

Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, Beijing 100850

Gangqiao Zhou

Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, Beijing 100850

Chinese National Human Genome Center at Beijing, Beijing 100176

Yunping Zhu

Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, Beijing 100850

Fuchu He (Corresponding Author)

Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, Beijing 100850

Chinese National Human Genome Center at Beijing, Beijing 100176

Laboratory of Systems Biology, Institutes of Biomedical Sciences, Fudan University, Shanghai 200032, P. R. China

Correspondence to Dr. Fuchu He, Laboratory of Systems Biology, Beijing Institute of Radiation Medicine, 27 Taiping Rd., Beijing 100850, P. R. China. Tel: +86-10-68171208; fax: +86-10-68214653; E-mail: [email protected]

First published: 08 December 2004

https://doi.org/10.1046/j.1529-8817.2003.00124.x

Citations: 25

Summary

Information about linkage disequilibrium (LD) patterns and haplotype structures for candidate genes is instructive for the design and analysis of genetic association studies for complex diseases and drug response. ABCC1 and ABCG2 are genes coding for two multidrug resistance (MDR) associated transporters; they are also related to some pathophysiological traits. To characterize the LD profiles of these MDR genes in the Chinese population, we systematically screened 27 unrelated individuals for single nucleotide polymorphisms (SNPs) in the coding and regulatory regions of these genes, and thereby characterized their haplotype structures. Despite marked variations in haplotype diversity, LD pattern and intragenic recombination intensity between the two genes, both loci could be partitioned into several LD blocks, in which a modest number of haplotypes accounted for a high fraction of the sampled chromosomes. We conclude that each locus has its own genomic LD profile, but that they still share a common segmental LD architecture with low haplotype diversity. Our data will benefit genetic association studies of complex traits and drug response possibly related to these genes.

Introduction

Multidrug resistance (MDR, cross-resistance to structurally and mechanistically unrelated drugs) is a complex phenomenon in cancer chemotherapeutics caused by several different cellular mechanisms. Three members of the ATP-binding cassette (ABC) superfamily have been well documented in MDR related systems: ABCB1 (also known as P-gp and MDR1), ABCC1 (or multidrug resistance-associated protein 1, MRP1), and ABCG2 (also known as mitoxantrone-resistance protein, MXR; or breast cancer resistance protein, BCRP) (Gottesman et al. 2002). These MDR associated proteins are often expressed in both cancer cells and many normal tissues (Borst et al. 2000; Maliepaard et al. 2001). ABCC1 transports many exogenous chemotherapy drugs (Borst et al. 2000) as well as endogenous substrates such as glutathione conjugates and organic anions (Jedlitschky et al. 1996) and folate (Assaraf et al. 2003). ABCG2 is a methotrexate and methotrexate polyglutamate transporter (Volk et al. 2003). It also restricts exposure to the dietary carcinogen 2-amino-1-methyl-6-phenylimidazo[4,5-b]pyridine (PhIP), suggesting that interindividual differences in its activity may affect exposure to PhIP and related food carcinogens, with possible implications for cancer susceptibility (van Herwaarden et al. 2003). Though the physiological substrates of ABCG2 are poorly known, its distinct role as a negative regulator of hematopoietic repopulating activity and its specific expression in hematopoietic side-population stem cells have recently been clarified (Zhou et al. 2001), suggesting that ABCG2 may play an important role in normal and malignant hematopoiesis, such as relapsed acute myeloid leukemia (AML) (van Den Heuvel-Eibrink et al. 2002).

Recently, genetic polymorphisms in ABCB1 and their clinical relevance have been intensively studied: one synonymous SNP 3435 C>T is correlated with the expression level and activity of the transporter (Hoffmeyer et al. 2000), chemotherapeutic outcomes (Fellay et al. 2002; Illmer et al. 2002) and disease susceptibility (Siegsmund et al. 2002). It has also been reported that there is strong LD between multiple SNPs at the ABCB1 locus, and common alleles or haplotypes are associated with altered P-gp function (Kim et al. 2001). Though LD and haplotype profiles of ABCB1 have been documented and proved functionally relevant, knowledge of the LD extent and haplotype structure is very limited for pharmacogenomics studies of other multidrug resistance related ABC genes.

In this study, we conducted a systematic screening in 27 unrelated individuals for sequence variations in the coding and regulatory regions of ABCC1 and ABCG2, and further carried out genomic level analysis of their LD pattern and haplotype structure. The knowledge of SNP distribution, LD profile and haplotype structure within these genes will be useful in assessing their correlation or association with MDR and other complex clinical traits.

Materials and Methods

Sample, Sequence Accession and Primer Design

Genomic DNA from 27 unrelated healthy subjects was chosen from the sample collection constructed for the Chinese Human Genome Diversity Project through a coordinated effort of several institutes (Chu et al. 1998). This study was approved by the Ethical Committee of Chinese National Human Genome Center. The sample included 54 chromosomes, providing a 95% confidence level to detect alleles with frequency >5%. Accession numbers used in this study are U91323.1, AC025778.4, AC003026.1 (ABCC1) and AC084732.1 (ABCG2). With a candidate-gene strategy, all exons (except ABCC1 exon 1, because of its high GC content), 5' flanking regions (with lengths of about 1.6 kb and 0.8 kb in ABCC1 and ABCG2, respectively), untranslated regions (UTR) (about 0.6 kb and 1.1 kb), and about 9.7 kb and 5.1 kb of intronic sequence of ABCC1 and ABCG2, respectively, were covered for SNP screening. Primers were designed online using the Primer 3 program (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi). Details regarding the position in genomic sequence, length, product size, and annealing temperature for each primer pair are given in the Appendix table. The same primers were used for both polymerase chain reactions (PCR) and sequencing reactions.
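The sampling claim above can be checked with a short calculation. The sketch below is not part of the original methods: it assumes that the 54 chromosomes are independent draws from the population, and the function name is illustrative only. Under that assumption, an allele at exactly 5% frequency is seen at least once with a probability of roughly 0.94, and the probability is higher for more common alleles.

```python
def detection_probability(allele_freq: float, n_chromosomes: int) -> float:
    """Probability that an allele with the given population frequency
    appears at least once among n independently sampled chromosomes."""
    # Complement of the probability that every sampled chromosome lacks the allele.
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


# 27 diploid subjects correspond to 54 sampled chromosomes.
p_detect = detection_probability(0.05, 54)
print(f"P(detect allele at 5% frequency) = {p_detect:.3f}")
```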

PCR-resequencing Based SNP Discovery

Combined hot start and touchdown PCR was carried out in a 25 μl volume system containing 10 ng of genomic DNA, 0.3 μM of each primer, 200 μM of each dNTP, 1.5 mM MgCl₂, 10×PCR buffer, 5×Q solution, and 1 Unit HotStarTaq™ DNA polymerase (QIAGEN). The thermal cycling conditions in GeneAmp PCR System 9700 Thermocyclers (Perkin-Elmer) were as follows: an initial denaturation at 95°C for 15 min, followed by 13 cycles of denaturation at 94°C for 30 sec, annealing with the temperature stepped down by 0.5°C each cycle, and extension at 72°C for 50 sec. In the following stage of cycling, the denaturation and extension phases were the same as above, with a final extension at 72°C for 5 min. PCR products were purified with MultiScreen™-PCR plates (Millipore) according to the manufacturer's protocol. Bidirectional sequencing was carried out using an ABI PRISM Big Dye Terminator and run on an ABI 3700 DNA Analyzer. Polymorphic sites were identified using the PolyPhred program. SNP genotypes were verified by manual evaluation of the individual sequence traces.

Assessing Nucleotide Diversity

Chi-square test for the significance of deviation of genotype distributions from Hardy-Weinberg equilibrium (HWE) was carried out for each site. Two commonly used estimates of nucleotide variation were calculated, which correct for both sample size and the length of the region surveyed: Pi (π), the direct estimate of per site heterozygosity, that is the average number of nucleotide differences per site between two sequences chosen from a randomly mating population; and Theta (θ), the estimate of population mutation parameter based on the number of polymorphic sites in the sample (Li, 1997).
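For illustration, the two estimators can be sketched as follows. This is a minimal sketch rather than the DnaSP implementation: it assumes that θ is Watterson's estimator and that π is computed for biallelic sites from sample allele frequencies, and the function names are hypothetical. With the SNP counts and screened lengths reported in Table 2 (32 SNPs in 17 006 bp and 13 SNPs in 9 094 bp, 54 chromosomes), the θ function returns per-site values that round to the figures given there.

```python
def watterson_a(n_chromosomes: int) -> float:
    """Normalizing constant a_n = sum_{i=1}^{n-1} 1/i for a sample of n chromosomes."""
    return sum(1.0 / i for i in range(1, n_chromosomes))


def watterson_theta_per_site(n_segregating: int, n_chromosomes: int, length_bp: int) -> float:
    """Theta estimated from the number of segregating sites, per site."""
    return n_segregating / (watterson_a(n_chromosomes) * length_bp)


def pi_per_site(allele_freqs, n_chromosomes: int, length_bp: int) -> float:
    """Average pairwise differences per site for biallelic SNPs,
    with the usual n/(n-1) small-sample correction."""
    correction = n_chromosomes / (n_chromosomes - 1)
    heterozygosity = sum(2.0 * p * (1.0 - p) for p in allele_freqs)
    return correction * heterozygosity / length_bp


theta_abcc1 = watterson_theta_per_site(32, 54, 17006)
theta_abcg2 = watterson_theta_per_site(13, 54, 9094)
print(f"theta ABCC1 = {theta_abcc1 * 1e4:.1f} x 10^-4 per site")
print(f"theta ABCG2 = {theta_abcg2 * 1e4:.1f} x 10^-4 per site")
```

The reciprocal of the per-site θ corresponds to the "one SNP per n bp" figures quoted in the Results.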

Inference of Haplotype, Recombination, and Linkage Disequilibrium

Haplotypes were reconstructed on phase-unknown genotype data using PHASE version 0.21 (Stephens et al. 2001). Based on Bayesian statistics, PHASE infers haplotypes and their frequencies implementing the Markov chain-Monte Carlo (MCMC) algorithm. DnaSP version 3.51 (Rozas et al. 1999) was used to estimate nucleotide and haplotype diversity, recombination, and linkage disequilibrium. The normalized coefficient (Lewontin's coefficient) was used as an LD measure (Devlin et al. 1995). The four-gamete test and the Hudson and Kaplan recombination statistic R were used to test for recombination (Hudson, 1987). Positions for recombination events and the minimum number of recombination events were determined using the recombination module of DnaSP. We carefully conducted haplotype block partitioning according to reported principles and methods (Daly et al. 2001; Johnson et al. 2001). Here, the common haplotypes were defined as those that were observed at least four times (or >7%) in the sample, and the proportion of chromosomes accounted for by common haplotypes was used as a general surrogate of haplotype diversity information for the local region.
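As an illustration of the LD measure named above, the sketch below computes Lewontin's normalized coefficient D′ for one pair of biallelic sites from haplotype and allele frequencies. It is a minimal sketch with hypothetical names, not the DnaSP routine, and it assumes the haplotype frequency has already been estimated (for example from PHASE output).

```python
def lewontin_d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Normalized LD coefficient D' for two biallelic sites.

    p_ab: frequency of the haplotype carrying allele A at site 1 and allele B at site 2
    p_a:  frequency of allele A at site 1
    p_b:  frequency of allele B at site 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    if d_max == 0:
        return 0.0  # at least one site is monomorphic, so D' is undefined; report 0
    return d / d_max


# Allele A only ever occurs together with allele B: complete LD, |D'| = 1.
print(lewontin_d_prime(p_ab=0.3, p_a=0.3, p_b=0.5))
```

|D′| equals 1 whenever at least one of the four possible two-site haplotypes is absent from the sample, which is why complete LD and the four-gamete test give concordant results.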

Results

SNP Detection and Distribution

In each of the 27 subjects, about 17 kb and 9 kb of non-contiguous sequence within the respective loci were sampled for SNP screening. A total of 45 biallelic polymorphisms (32 in ABCC1 and 13 in ABCG2) were identified (Table 1), of which 22 in ABCC1 and 9 in ABCG2 had been deposited in dbSNP prior to our report (Revised August 11, 2003 2:18 PM). Details of SNPs and their genotype for each sample are also provided in the Appendix table. Twelve SNPs were located in coding exons, 24 in introns, 4 in 5' flanking regions, 2 in 5' UTRs, 2 in 3' UTRs and 1 in a 3' flanking region (Table 1). No other variants (deletions, insertions, microsatellites, or triallelic polymorphisms) were surveyed. Of the 12 SNPs within coding regions, 5 result in an amino acid alteration, all of which are predicted to be benign or conservative in functional effect by PolyPhen (http://www.bork.embl-heidelberg.de/PolyPhen/). Using the Chi-square test for deviation from HWE, only SNP19 of ABCC1 in its 5' flanking region did not fit with expectations of genotype distribution, with no heterozygote and 2 mutant homozygotes found in the sample. We assessed the validity of the SNP data by repeating 10% of the genotype assays on the same PCR and bidirectional resequencing platform. The error rate was relatively low (3.6%).
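The HWE test for a single biallelic site can be sketched as below. This is an illustrative calculation, not the authors' code: it assumes all 27 subjects were genotyped at the site, so that the reported pattern for SNP19 (no heterozygotes, 2 minor-allele homozygotes) implies 25 major-allele homozygotes, and it uses a one-degree-of-freedom chi-square test. Function and variable names are hypothetical.

```python
import math


def hwe_chi_square(n_major_hom: int, n_het: int, n_minor_hom: int):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg proportions."""
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1.0 - p
    observed = [n_major_hom, n_het, n_minor_hom]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Assumed genotype counts for SNP19 of ABCC1: 25 / 0 / 2 among 27 subjects.
chi2, p_value = hwe_chi_square(25, 0, 2)
print(f"chi-square = {chi2:.1f}, p = {p_value:.2g}")
```

With expected counts this small the chi-square approximation is rough, so an exact test would be a reasonable alternative in practice.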

Table 1. Position, sequence, and frequency of SNPs within ABCC1 and ABCG2

Locus	SNP No.	SNP position	Nucleotide sequence (major/minor)	Minor allele frequency (%)	dbSNP ID	effect
ABCC1	1	5'FR/−1862	gacccG/Aggcca	44.4
	2	5'FR/−1830	atcctA/Gtctac	1.9
	3	5'FR/−1680	gaggaG/Aaaaag	1.9
	4	5'FR/−471	cggatA/Gctgtc	7.4
	5	E2/218	caaaaC/Tcaaaa	3.7		Thr73Ile
	6	I2/−26	gttgtG/Aggggg	1.9	rs8187842
	7	I3/−66	ctgggT/Cgacaa	37.0	rs4148337
	8	I7/+54	ccactC/Actgtg	9.3	rs903880
	9	I7/+64	ggcctC/Gaatcc	48.1	rs246232
	10	E8/816	cagccG/Agtgaa	1.9		wobble
	11	E8/825	aaggtT/Cgtgta	38.9	rs246221	wobble
	12	E9/1062	gtgaaT/Cgacac	35.2	rs35587	wobble
	13	I9/+8	aggggA/Gcgctg	37.0	rs35588
	14	I12/−37	cactcA/Ggggca	20.4	rs35604
	15	E13/1684	tggccT/Ctgtgc	20.4	rs35605	wobble
	16	I13/+105	ccggtC/Tgggct	20.4	rs35606
	17	I14/+105	ccagcC/Tgcttg	1.9
	18	I15/+627	gctgtA/Gtttta	25.8	rs35628
	19	I15/+669	aatctG/Ttagaa	7.4*	rs4148353
	20	I15/−967	ctttcT/Ggctgt	37.0	rs152029
	21	E16/2007	atcccC/Tgaagg	3.7	rs2301666	wobble
	22	E17/2168	tctccG/Aagaaa	5.6	rs4148356	Arg723Gln
	23	I18/−30	gcactG/Cacgtg	16.7	rs2074087
	24	I22/+62	aattaT/Ctccct	27.8	rs3887893
	25	I22/−43	gtcagC/Ttccct	3.7
	26	E27/3915	gaggaC/Tctgga	1.9		wobble
	27	E28/4002	aagtcG/Atccct	11.1	rs2239330	wobble
	28	I28/−35	tcagcA/Gtgaca	27.8	rs212087
	29	I30/+30	gcacaG/Atggcc	29.6	rs212088
	30	3'UTR/+801	accccC/Gactcc	33.3	rs129081	noncoding
	31	3'UTR/+866	tactgT/Atccca	14.8	rs212090	noncoding
	32	3'FR/+1513	gttctT/Ctaagg	27.8
ABCG2	1	5'UTR/−407	cgcagC/Tgcctc	1.9
	2	5'UTR/−376	ggggaG/Acgctc	1.9
	3	E2/34	tcccaG/Atgtca	20.4	rs2231137	Val12Met
	4	I2/+36	ttttaA/Gtttac	25.9	rs4148152
	5	I3/+10	gtataA/Ggagag	20.4	rs2231138
	6	E5/421	acttaC/Agttct	22.2	rs2231142	Gln141Lys
	7	E7/805	acgggC/Tctgct	3.7		Pro269Ser
	8	I9/−126	agccaT/Gtgagt	7.4
	9	I11/+20	gttctA/Gggaac	31.5	rs2231153
	10	I12/+49	cctatG/Tggtga	16.7	rs2231156
	11	I13/+40	tgtttT/Ctttcc	24.1	rs2231157
	12	I13/−21	tgactC/Tttagt	29.6	rs2231162
	13	I14/−46	ttcttG/Aaaatt	48.1	rs2725267

SNPs in specific regions, i.e. 5'flanking region (5'FR), 5'untranslated region (5'UTR), intron (I), exon (E), 3'UTR, and 3'FR, are presented as region/+(−): for 5'FR and 5'UTR, n nucleotides upstream (−) from the translation initiation site; for 3'UTR and 3'FR, n nt downstream (+) from the third base of stop codon; for coding regions, n corresponds to positions of their cDNA with the first base of start codon set to 1; and for introns, n nt upstream (−) from 3' site or downstream (+) from 5' site of introns. Segregating sites in local sequences are denoted as major/minor allele. *In Chi-square tests for deviation from Hardy-Weinberg equilibrium (HWE), only one (I15/+669 in ABCC1) out of the 45 segregating sites showed statistical significance.

Nucleotide and Haplotype Diversity

We screened 45 SNPs in nearly 26 kb of genomic sequence. The indices of nucleotide and haplotype variation within the two genes are summarized in Table 2. ABCC1 and ABCG2 showed similar patterns of nucleotide diversity: θ, the expected proportion of polymorphic sites were 4.1×10⁻⁴ and 3.1×10⁻⁴ respectively; in the study population one SNP could be observed per 2 439 bp and 3 226 bp in the targeted sequences of ABCC1 and ABCG2, respectively. The mutation parameters (θ) for coding and non-coding regions were almost identical to the overall estimates, but nucleotide diversities (π) for coding regions were relatively lower than those for non-coding regions and overall estimates. When compared with the gene-based averages of sequence variation indices, which were from one systematic SNP screening (313 genes) effort with a similar strategy (Stephens et al. 2001), the two loci showed lower nucleotide variation (when θ but not π was measured). The discrepancy could mainly be accounted for by pronounced intergenic differences in the sequence mutation rate across the genome.

Table 2. Indices of nucleotide and haplotype variation within ABCC1 and ABCG2 in the Chinese population (2N = 54)

Locus	Region	Screened length (bp)	No. SNPs	π ± SD (×10⁻⁴)	θ ± SD (×10⁻⁴)	No. haplotypes	Haplotype diversity (H)*
ABCC1	Overall	17 006	32	5.0 ± 0.2	4.1 ± 1.3	46	0.99
	Coding			3.9 ± 0.3	4.3 ± 1.8
	Non-coding			5.5 ± 0.3	4.1 ± 1.4
ABCG2	Overall	9 094	13	4.0 ± 0.2	3.1 ± 1.2	16	0.89
	Coding			3.5 ± 0.4	3.4 ± 1.9
	Non-coding			4.1 ± 0.2	3.1 ± 1.3

*H, the measure of haplotype diversity, is the expected heterozygosity based on haplotype frequencies: H = 1 − ∑_i q_i², where q_i is the frequency of the i-th haplotype.
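The footnote formula can be written out directly. The sketch below follows the definition as stated in the footnote (without any small-sample correction, which some software applies); the function name and the toy haplotype strings are made up for illustration.

```python
from collections import Counter


def haplotype_diversity(haplotypes) -> float:
    """Expected heterozygosity H = 1 - sum_i q_i^2 over observed haplotype frequencies."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Toy sample of four chromosomes carrying two haplotypes at equal frequency.
print(haplotype_diversity(["AG", "AG", "GT", "GT"]))  # 0.5
```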

When all the SNPs were used to reconstruct haplotypes, an excess of haplotypes was predicted: 46 based on 32 SNPs in ABCC1, and 16 based on 13 SNPs in ABCG2. In order to interrogate recombination and LD with more validity (as the LD estimate D′ can be biased with small sample sizes), haplotype reconstruction was also carried out using only common SNPs (those with minor allele frequency >5% and in HWE). In ABCC1, 39 haplotypes were predicted using 22 common SNPs; in ABCG2, 13 haplotypes were constructed from 10 common SNPs. An assessment of data quality for the common-SNP-based haplotypes was provided by PHASE. A substantial fraction of haplotype calls was unambiguous (66% for ABCC1 and 76% for ABCG2). Among the calls at phase-unknown positions, 44% and 78% were obtained with probability >0.95 for the two loci, respectively.

Recombination

According to the neutral infinite-sites mutation model, intragenic recombination in the two loci could be readily inferred from the above observation of haplotype diversity. We further assessed recombination using the four-gamete test (Fig. 1). In ABCC1 only 53 out of 231 pairs were found to be in complete LD (fewer than four gametes observed in the sample), indicating extensive intragenic recombination throughout the gene. The estimates of the population recombination parameter R (= 4N_er, where N_e is the effective population size and r is the recombination rate per gene per generation) for ABCC1 and ABCG2 were 70.4 and 35.8, respectively. The minimum number of recombination events, R_m, indicated that recombination was detected for 15 pairs of SNPs in ABCC1. In ABCG2, though more than half of the 45 pairs were in complete LD, five recombination events were detected.

Figure 1. Recombination measurements of ABCC1 and ABCG2. Recombination was determined using the four-gamete test with R, potential recombination sites, indicated by ×s. Blackened boxes indicate site pairs having all four possible gametic types, which implies that recombination has occurred between these two sites; blank boxes indicate site pairs having fewer than four gametic types.
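The four-gamete test summarized in Figure 1 can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up haplotypes, assuming phased haplotypes over biallelic sites; it is not the DnaSP recombination module. For reference, 22 common SNPs yield 231 site pairs and 10 yield 45, the totals quoted for ABCC1 and ABCG2.

```python
from itertools import combinations


def has_four_gametes(haplotypes, i: int, j: int) -> bool:
    """True if all four allele combinations at biallelic sites i and j occur in the sample."""
    gametes = {(hap[i], hap[j]) for hap in haplotypes}
    return len(gametes) == 4


def four_gamete_pairs(haplotypes):
    """List the site pairs that show all four gametic types,
    the pattern interpreted as evidence of past recombination."""
    n_sites = len(haplotypes[0])
    return [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if has_four_gametes(haplotypes, i, j)
    ]


# Made-up phased haplotypes over three sites, for illustration only.
toy_haplotypes = ["ACG", "ATG", "GCG", "GTA", "GTG"]
print(four_gamete_pairs(toy_haplotypes))  # [(0, 1)]
```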

Linkage Disequilibrium and Haplotype Structure

LD was measured using the statistic |D′| in a pairwise manner across all common SNPs. Despite the above inference of recombination in ABCC1 and ABCG2, pronounced LD was observed in local regions of both genes. A very irregular picture of LD was observed for ABCC1 (Fig. 2A). For example, in the subgroup SNP8-9-11-12-13 of ABCC1, which spans about 10 kb of sequence, only 4 pairs (SNP8/11, SNP8/12, SNP11/12, and SNP12/13) were in complete LD (|D′| = 1). In contrast, the LD profile of ABCG2 was less complicated, with only one segmental LD subgroup partitioned (Fig. 2B). The haplotype structures of the two ABC genes are shown in Fig. 3. The loci could be largely decomposed into three and two discrete blocks for ABCC1 and ABCG2, respectively. These blocks span from about 10 kb to about 50 kb and contain multiple (five or more) common SNPs. The major attribute of each block is that only a few (3-5) common haplotypes accounted for most (>80%) of the sampled chromosomes, although an excess of haplotypes was predicted in most of the blocks.
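The block summary used here, the share of sampled chromosomes carried by common haplotypes, can be sketched as below. The threshold of at least four observations follows the definition in Materials and Methods; the function name and the haplotype counts are invented for illustration and are not the study data.

```python
from collections import Counter


def common_haplotype_coverage(haplotypes, min_count: int = 4):
    """Return the common haplotypes (observed at least min_count times)
    and the fraction of sampled chromosomes they account for."""
    counts = Counter(haplotypes)
    common = {hap: count for hap, count in counts.items() if count >= min_count}
    coverage = sum(common.values()) / len(haplotypes)
    return common, coverage


# Invented block of 54 chromosomes: three common haplotypes and four singletons.
toy_block = ["AGC"] * 30 + ["GTC"] * 15 + ["ATC"] * 5 + ["GGC", "ATT", "GTT", "AGT"]
common, coverage = common_haplotype_coverage(toy_block)
print(len(common), f"{coverage:.2f}")
```

A block would meet the 80% criterion described in the text when the returned coverage exceeds 0.8.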

Discussion

It is well recognized that inherited differences in drug disposition systems (metabolizing enzymes and transporters) and drug effect systems (sensors, receptors and targets) have great influence on the efficacy and toxicity of medications, and on the risks of some diseases related to xenobiotic exposure (Evans & Relling, 1999). Geneticists and physicians have come to view clinical drug responses as complex traits associated with polygenic determinants. The notable successes of LD-based positional cloning studies for Mendelian disorders, together with the high availability of genome polymorphism data, have sparked a strong interest in LD-based association studies for pinpointing the genes underlying complex diseases and drug response. Thus, there is an urgent need to resolve the allelic architecture of xenobiotic disposition and effect systems underlying clinical phenotypes of drug response or disease predisposition.

The present study provides a comprehensive analysis in the Chinese population of genetic variants of two large ABC transporter genes (ABCC1 and ABCG2) implicated in multidrug resistance. Variations in LD pattern and haplotype diversity were observed between the two loci. However, haplotypes in the two loci could be partitioned into several LD units in which haplotype diversity is low and a few common haplotypes account for most sampled chromosomes. Our data contribute to the growing LD landscape in the human genome, and will facilitate marker selection for pharmacogenomics studies of cancer chemotherapy and susceptibility.

Most of our analyses were based on haplotype data, which were reconstructed in silico from unphased population genotype data. Although PHASE is a well recognized program, it only provides estimates of haplotype phase, which may at times be biased; furthermore, our datasets lack phased genotypic information from pedigrees, and these methodological limitations might to some extent affect the accuracy of the haplotype configurations in our datasets. However, although our sample size was relatively small (2N = 54), there are at least three lines of evidence suggesting that our observations were not severely affected by ascertainment bias due to sampling and genotyping. Firstly, the genotype frequencies at nearly all the SNP sites followed HWE. Secondly, on the basis of the observed numbers of segregating sites, our estimates of θ (per sequence) for the two loci were 6.97 and 2.82 respectively (the estimates in Table 2 were calculated on a per-site basis), which is largely in agreement with the observation that there were 6 and 2 singletons in the analysed regions of ABCC1 and ABCG2, respectively (under the infinite-sites neutral model, the expected number of singletons is simply θ (Fu & Li, 1993)). Thirdly, because rare alleles with frequencies <5% do not have sufficient statistical power for LD detection (Lewontin, 1995; Goddard et al. 2000), haplotype reconstruction for LD and recombination analysis was based only on common SNPs.

One feature of the present data that merits attention is the inference that there was somewhat pronounced variation in LD pattern and recombination between ABCC1 and ABCG2. For ABCC1, many more distinct haplotypes were observed than theoretically expected for the value of θ estimated from the segregating sites. When ten rare SNPs were subtracted from the total of 32, the number of predicted haplotypes only decreased from 46 to 39. These facts, and the observation that the proportion of haplotype calling with sound probability (>0.95) was relatively low (the error rate for PHASE algorithm partially depends on recombination rate), indicated extensive intragenic recombination throughout this locus. Comparison of LD measurements and the four-gamete matrices yielded close concurrence. The plethora of recombination throughout ABCC1 helps to explain the highly irregular pattern of LD in this locus. However, a low population recombination rate and regular LD pattern were observed at ABCG2. These results are consistent with the latest genetic map of the human genome (Kong et al. 2002), in which the deduced local recombination rate of the ABCC1 locus is much higher than that of ABCG2. Empirical data from other candidate-gene-based LD studies also show pronounced intergenic difference in LD pattern (Jeffreys et al. 2000; Bonnen et al. 2002). Clearly for the two loci, the varied sequence lengths surveyed, and therefore the different number of SNPs screened, might partly explain their different LD pattern. However, in simple population genetics models, the key parameter in determining the extent of linkage disequilibrium is the product of the recombination rate and the effective population size, often termed the population recombination rate R. 
Several molecular and population genetic factors may account for most of the heterogeneous LD pattern across genes and genome: sequence constitution dependent mutation rate, recombination rate, population demographic history, natural selection forces, and even genetic drift (Reich et al. 2002; Stumpf & Goldstein, 2003).

Recently, both regional and chromosome-wide studies of linkage disequilibrium and haplotype structure have revealed block-like patterns of LD (Daly et al. 2001; Jeffreys et al. 2001; Gabriel et al. 2002; Phillips et al. 2003). A small fraction of SNPs in each block are statistically sufficient to capture the haplotype information content of that block. Therefore, haplotype based association studies may hold promise for complex trait mapping. Taking into consideration their functional implications in MDR and other pathophysiological traits, and their marked genomic sequence length and complexity, we tentatively dissected the haplotype structure of ABCC1 and ABCG2 invoking the scenario of “chromosome coverage” (Daly et al. 2001). Here, we subjectively defined the common haplotypes as those with a frequency >7% and defined the threshold of their “chromosome coverage” in one block as 80%. With common haplotypes defined in this way, attainable sample sizes in typical association studies could provide sufficient statistical power to detect causal variants with an odds ratio greater than 1.5 (Johnson et al. 2001). The two loci were thereby empirically reduced into three and two haplotype blocks, respectively. The outlined haplotype blocks made up 32% (63 out of 200 kb) and 76% (52 out of 68 kb) of the ABCC1 and ABCG2 genomic sequence, respectively, and spanned about 10 kb to about 50 kb. In each block, a small number of common haplotypes (3 to 5) typically captured >80% of all chromosomes in the sample. This general pattern is largely consistent with the recent high-resolution haplotype analyses based on chromosomes (Phillips et al. 2003) or large-scale autosomal regions (Gabriel et al. 2002). It is also apparent, however, that the haplotype structures of the two genes differ to some extent. The chromosome coverage of the common haplotypes in either block 2 or block 3 of ABCC1 is relatively low compared with that of ABCG2, which agrees with the observation that intragenic recombination in ABCC1 was more extensive and more intensive than that in ABCG2. This pattern of haplotype complexity suggests that a set of markers would need to be more carefully chosen for ABCC1 than for ABCG2 for association studies.

Some inherent limitations of this study must be acknowledged while we discuss the implications of the data. Candidate-gene-based SNP screening usually targets coding and regulatory regions rather than the complete genomic sequence of the loci concerned. This strategy may be methodologically sound to some extent: it has been demonstrated that there are relatively large numbers of SNPs within coding and regulatory regions of candidate genes, some of which might be functional, and that the LD strength and haplotype structure based on such SNPs are very informative in LD-based association studies (Tiret et al. 2002). It is also readily conceivable, however, that the candidate-gene-based LD pattern and haplotype structure could be obscured by such incomplete coverage. ABCG2 serves as an example of the bias that can result from a sampling strategy with partial information. Block 1, with a length of 7 kb, might have extended further if potential common SNPs in intron 1 (about 18 kb) had been covered. Likewise, block 2 (spanning 45 kb) might be further partitioned into subgroups if some markers in the 3' segment of the block had been added. However, in this preliminary stage of a pharmacogenomics study, we have not attempted to catalogue all of the genetic variations in the loci. Instead, we wanted to obtain general information about LD and common haplotypes for these genes in the study population. Supporting our scenario, a systematic survey of haplotype structure reveals that in regions with a low recombination rate, a small number of randomly chosen common markers are sufficient to identify most common haplotypes (Gabriel et al. 2002).

Another limitation of our study is its lack of sampling of other major ethnic populations. Though it has been indicated that both the boundaries of blocks and common haplotypes are shared to a remarkable extent across different populations (Gabriel et al. 2002), theoretical and empirical data also emphasize the effects on LD pattern of recombination rate heterogeneity, demographic population history, and even stochastic effects (Reich et al. 2002; Stumpf & Goldstein, 2003; Wang et al. 2002). We cannot address here the interesting issues mentioned above, especially the extent to which block boundaries are conserved across populations. The implications of our dataset for association studies in other populations should be interpreted very cautiously. Taking the two issues together, we should acknowledge that the LD profile and haplotype block architecture we outlined might be only one of several possible ways to describe the LD pattern, as the position and frequency of markers, the mutation and recombination rates, and the methodology for block definition all together shape the fine allelic landscape in a population.

In summary, using a candidate-gene-based SNP screening strategy, we characterized the linkage disequilibrium and haplotype diversity within two multidrug resistance related genes in the Chinese population. ABCC1 showed much greater complexity than ABCG2 in LD pattern, intragenic recombination intensity and haplotype diversity. Although the LD profiles were complicated by intragenic recombination and other factors, a few blocks of low haplotype diversity could still be identified in the two loci under a somewhat stringent definition. The LD and haplotype landscape for ABCC1 and ABCG2 may be informative for, and benefit, genetic association studies of complex diseases and drug responses that might be related to these multidrug resistance genes.

Acknowledgements

We give special thanks to Prof. Wei Huang at the Chinese National Human Genome Center at Shanghai, Prof. Li Jin at Fudan University, and Prof. Jiujin Xu and Prof. Ruofu Du at the Institute of Genetics & Developmental Biology, CAS, for their contribution to sample collection and distribution. We gratefully acknowledge Prof. Yan Shen and Prof. Zhijian Yao at the Chinese National Human Genome Center at Beijing for their support with the sequencing platform. We thank Xiaojia Dong, Xiumei Zhang, and Fengying He for their excellent bench work. We also thank Dr. Keyue Ding for constructive discussion. This work was supported by a grant-in-aid from the National High Technology Project of China (2001AA224011) and the Shanghai Science and Technology Developing Program (Grant No. 03DZ14024).

References

Assaraf, Y. G., Rothem, L., Hooijberg, J. H., Stark, M., Ifergan, I., Kathmann, I., Dijkmans, B. A., Peters, G. J. & Jansen, G. 2003 Loss of multidrug resistance protein 1 (MRP1) expression and folate efflux activity results in a highly concentrative folate transport in human leukemia cells. J Biol Chem 278, 6680–668. DOI: 10.1074/jbc.M209186200
Bonnen, P. E., Wang, P. J., Kimmel, M., Chakraborty, R. & Nelson, D. L. 2002 Haplotype and linkage disequilibrium architecture for human cancer-associated genes. Genome Res 12, 1846–1853. DOI: 10.1101/gr.483802
Borst, P., Evers, R., Kool, M. & Wijnholds, J. 2000 A family of drug transporters: the multidrug resistance-associated proteins. J Natl Cancer Inst 92, 1295–1302. DOI: 10.1093/jnci/92.16.1295
Chu, J. Y., Huang, W., Kuang, S. Q., Wang, J. M., Xu, J. J., Chu, Z. T., Yang, Z. Q., Lin, K. Q., Li, P., Wu, M., Geng, Z. C., Tan, C. C., Du, R. F. & Jin, L. 1998 Genetic relationship of populations in China. Proc Natl Acad Sci USA 95, 11763–11768. DOI: 10.1073/pnas.95.20.11763
Daly, M. J., Rioux, J. D., Schaffner, S. F., Hudson, T. J. & Lander, E. S. 2001 High-resolution haplotype structure in the human genome. Nat Genet 29, 229–232. DOI: 10.1038/ng1001-229
Devlin, B. & Risch, N. 1995 A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics 29, 311–322. DOI: 10.1006/geno.1995.9003
Evans, W. E. & Relling, M. V. 1999 Pharmacogenomics: translating functional genomics into rational therapeutics. Science 286, 487–491. DOI: 10.1126/science.286.5439.487
Fellay, J., Marzolini, C., Meaden, E. R., Back, D. J., Buclin, T., Chave, J. P., Decosterd, L. A., Furrer, H., Opravil, M., Pantaleo, G., Retelska, D., Ruiz, L., Schinkel, A. H., Vernazza, P., Eap, C. B. & Telenti, A. 2002 Response to antiretroviral treatment in HIV-1-infected individuals with allelic variants of the multidrug resistance transporter 1: a pharmacogenetics study. Lancet 359, 30–36. DOI: 10.1016/S0140-6736(02)07276-8
Fu, Y. & Li, W. H. 1993 New statistical tests of neutrality for DNA samples from a population. Genetics 133, 693–709.
Gabriel, S. B., Schaffner, S. F., Nguyen, H., Moore, J. M., Roy, J., Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M., Liu-Cordero, S. N., Rotimi, C., Adeyemo, A., Cooper, R., Ward, R., Lander, E. S., Daly, M. J. & Altshuler, D. 2002 The structure of haplotype blocks in the human genome. Science 296, 2225–2229. DOI: 10.1126/science.1069424
Goddard, K. A., Hopkins, P. J., Hall, J. M. & Witte, J. S. 2000 Linkage disequilibrium and allele-frequency distributions for 114 single-nucleotide polymorphisms in five populations. Am J Hum Genet 66, 216–234. DOI: 10.1086/302727
Gottesman, M. M., Fojo, T. & Bates, S. E. 2002 Multidrug resistance in cancer: role of ATP-dependent transporters. Nat Rev Cancer 2, 48–58. DOI: 10.1038/nrc706
Hoffmeyer, S., Burk, O., Von Richter, O., Arnold, H. P., Brockmoller, J., Johne, A., Cascorbi, I., Gerloff, T., Roots, I., Eichelbaum, M. & Brinkmann, U. 2000 Functional polymorphisms of the human multidrug-resistance gene: multiple sequence variations and correlation of one allele with P-glycoprotein expression and activity in vivo. Proc Natl Acad Sci USA 97, 3473–3478. DOI: 10.1073/pnas.050585397
Hudson, R. R. 1987 Estimating the recombination parameter of a finite population model without selection. Genet Res 50, 245–250. DOI: 10.1017/S0016672300023776
Illmer, T., Schuler, U. S., Thiede, C., Schwarz, U. I., Kim, R. B., Gotthard, S., Freund, D., Schakel, U., Ehninger, G. & Schaich, M. 2002 MDR1 gene polymorphisms affect therapy outcome in acute myeloid leukemia patients. Cancer Res 62, 4955–4962.
Jedlitschky, G., Leier, I., Buchholz, U., Barnouin, K., Kurz, G. & Keppler, D. 1996 Transport of glutathione, glucuronate, and sulfate conjugates by the MRP gene-encoded conjugate export pump. Cancer Res 56, 988–994.
Jeffreys, A. J., Kauppi, L. & Neumann, R. 2001 Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat Genet 29, 217–222. DOI: 10.1038/ng1001-217
Jeffreys, A. J., Ritchie, A. & Neumann, R. 2000 High resolution analysis of haplotype diversity and meiotic crossover in the human TAP2 recombination hotspot. Hum Mol Genet 9, 725–733. DOI: 10.1093/hmg/9.5.725
Johnson, G. C., Esposito, L., Barratt, B. J., Smith, A. N., Heward, J., Di Genova, G., Ueda, H., Cordell, H. J., Eaves, I. A., Dudbridge, F., Twells, R. C., Payne, F., Hughes, W., Nutland, S., Stevens, H., Carr, P., Tuomilehto-Wolf, E., Tuomilehto, J., Gough, S. C., Clayton, D. G. & Todd, J. A. 2001 Haplotype tagging for the identification of common disease genes. Nat Genet 29, 233–237. DOI: 10.1038/ng1001-233
Kim, R. B., Leake, B. F., Choo, E. F., Dresser, G. K., Kubba, S. V., Schwarz, U. I., Taylor, A., Xie, H. G., McKinsey, J., Zhou, S., Lan, L. B., Schuetz, J. D., Schuetz, E. G. & Wilkinson, G. R. 2001 Identification of functionally variant MDR1 alleles among European Americans and African Americans. Clin Pharmacol Ther 70, 189–199. DOI: 10.1067/mcp.2001.117412
Kong, A., Gudbjartsson, D. F., Sainz, J., Jonsdottir, G. M., Gudjonsson, S. A., Richardsson, B., Sigurdardottir, S., Barnard, J., Hallbeck, B., Masson, G., Shlien, A., Palsson, S. T., Frigge, M. L., Thorgeirsson, T. E., Gulcher, J. R. & Stefansson, K. 2002 A high-resolution recombination map of the human genome. Nat Genet 31, 241–247. DOI: 10.1038/ng917
Lewontin, R. C. 1995 The detection of linkage disequilibrium in molecular sequence data. Genetics 140, 377–388.
Li, W. H. 1997 Molecular evolution. Sunderland, Massachusetts: Sinauer Associates, Inc.
Maliepaard, M., Scheffer, G. L., Faneyte, I. F., Van Gastelen, M. A., Pijnenborg, A. C., Schinkel, A. H., Van De Vijver, M. J., Scheper, R. J. & Schellens, J. H. 2001 Subcellular localization and distribution of the breast cancer resistance protein transporter in normal human tissues. Cancer Res 61, 3458–3464.
Phillips, M. S., Lawrence, R., Sachidanandam, R., Morris, A. P., Balding, D. J., Donaldson, M. A., Studebaker, J. F., Ankener, W. M., Alfisi, S. V., Kuo, F. S., Camisa, A. L., Pazorov, V., Scott, K. E., Carey, B. J., Faith, J., Katari, G., Bhatti, H. A., Cyr, J. M., Derohannessian, V., Elosua, C., Forman, A. M., Grecco, N. M., Hock, C. R., Kuebler, J. M., Lathrop, J. A., Mockler, M. A., Nachtman, E. P., Restine, S. L., Varde, S. A., Hozza, M. J., Gelfand, C. A., Broxholme, J., Abecasis, G. R., Boyce-Jacino, M. T. & Cardon, L. R. 2003 Chromosome-wide distribution of haplotype blocks and the role of recombination hot spots. Nat Genet 33, 382–387. DOI: 10.1038/ng1100
Reich, D. E., Schaffner, S. F., Daly, M. J., McVean, G., Mullikin, J. C., Higgins, J. M., Richter, D. J., Lander, E. S. & Altshuler, D. 2002 Human genome sequence variation and the influence of gene history, mutation and recombination. Nat Genet 32, 135–142. DOI: 10.1038/ng947
Rozas, J. & Rozas, R. 1999 DnaSP version 3: An integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15, 174–175. DOI: 10.1093/bioinformatics/15.2.174
Siegsmund, M., Brinkmann, U., Schaffeler, E., Weirich, G., Schwab, M., Eichelbaum, M., Fritz, P., Burk, O., Decker, J., Alken, P., Rothenpieler, U., Kerb, R., Hoffmeyer, S. & Brauch, H. 2002 Association of the P-Glycoprotein transporter MDR1(C3435T) polymorphism with the susceptibility to renal epithelial tumors. J Am Soc Nephrol 13, 1847–1854. DOI: 10.1097/01.ASN.0000019412.87412.BC
Stephens, J. C., Schneider, J. A., Tanguay, D. A., Choi, J., Acharya, T., Stanley, S. E., Jiang, R., Messer, C. J., Chew, A., Han, J. H., Duan, J., Carr, J. L., Lee, M. S., Koshy, B., Kumar, A. M., Zhang, G., Newell, W. R., Windemuth, A., Xu, C., Kalbfleisch, T. S., Shaner, S. L., Arnold, K., Schulz, V., Drysdale, C. M., Nandabalan, K., Judson, R. S., Ruano, G. & Vovis, G. F. 2001 Haplotype variation and linkage disequilibrium in 313 human genes. Science 293, 489–493. DOI: 10.1126/science.1059431
Stephens, M., Smith, N. J. & Donnelly, P. 2001 A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 68, 978–989. DOI: 10.1086/319501
Stumpf, M. P. & Goldstein, D. B. 2003 Demography, recombination hotspot intensity, and the block structure of linkage disequilibrium. Curr Biol 13, 1–8. DOI: 10.1016/S0960-9822(02)01404-5
Tiret, L., Poirier, O., Nicaud, V., Barbaux, S., Herrmann, S. M., Perret, C., Raoux, S., Francomme, C., Lebard, G., Tregouet, D. & Cambien, F. 2002 Heterogeneity of linkage disequilibrium in human genes has implications for association studies of common diseases. Hum Mol Genet 11, 419–429. DOI: 10.1093/hmg/11.4.419
Wang, N., Akey, J. M., Zhang, K., Chakraborty, R. & Jin, L. 2002 Distribution of recombination crossovers and the origin of haplotype blocks: the interplay of population history, recombination, and mutation. Am J Hum Genet 71, 1227–1234. DOI: 10.1086/344398
Van Den Heuvel-Eibrink, M. M., Wiemer, E. A., Prins, A., Meijerink, J. P., Vossebeld, P. J., Van Der Holt, B., Pieters, R. & Sonneveld, P. 2002 Increased expression of the breast cancer resistance protein (BCRP) in relapsed or refractory acute myeloid leukemia (AML). Leukemia 16, 833–839. DOI: 10.1038/sj.leu.2402496
Van Herwaarden, A. E., Jonker, J. W., Wagenaar, E., Brinkhuis, R. F., Schellens, J. H., Beijnen, J. H. & Schinkel, A. H. 2003 The breast cancer resistance protein (Bcrp1/Abcg2) restricts exposure to the dietary carcinogen 2-amino-1-methyl-6-phenylimidazo[4,5-b]pyridine. Cancer Res 63, 6447–6452.
Volk, E. L. & Schneider, E. 2003 Wild-type breast cancer resistance protein (BCRP/ABCG2) is a methotrexate polyglutamate transporter. Cancer Res 63, 5538–5543.
Zhou, S., Schuetz, J. D., Bunting, K. D., Colapietro, A. M., Sampath, J., Morris, J. J., Lagutina, I., Grosveld, G. C., Osawa, M., Nakauchi, H. & Sorrentino, B. P. 2001 The ABC transporter Bcrp1/ABCG2 is expressed in a wide variety of stem cells and is a molecular determinant of the side-population phenotype. Nat Med 7, 1028–1034. DOI: 10.1038/nm0901-1028

Supplementary Material

List of position in genomic sequence, length, product size, and annealing temperature for each primer pair, details of SNPs and their genotype for each sample.

November 2004
