Volume 188, Issue 3 pp. 326-332
Review Article

The role of genomics in common variable immunodeficiency disorders

A.-K. Kienzler

NIHR Oxford Biomedical Research Centre, Clinical Immunology Group, Oxford, UK

C. E. Hargreaves

NIHR Oxford Biomedical Research Centre, Clinical Immunology Group, Oxford, UK

S. Y. Patel

Corresponding Author

NIHR Oxford Biomedical Research Centre, Clinical Immunology Group, Oxford, UK

Correspondence: Smita Patel, Clinical Immunology Group, NIHR Oxford Biomedical Research Centre, University of Oxford, Level 5, John Radcliffe Hospital, Oxford OX3 9DU, UK. E-mail: [email protected]
First published: 25 February 2017

Summary

The advent of next-generation sequencing (NGS) and ‘omic’ technologies has revolutionized the field of genetics, and its implementation in health care has the potential to realize precision medicine. Primary immunodeficiencies (PID) are a group of rare diseases which have benefited from NGS, with a massive increase in causative genes identified in the past few years. Common variable immunodeficiency disorders (CVID) are a heterogeneous form of PID and the most common form of antibody failure in children and adults. While a monogenic cause of disease has been identified in a small subset of CVID patients, a genomewide association study and whole genome sequencing have found that, in the majority, a polygenic cause is likely. Other NGS technologies such as RNA sequencing and epigenetic studies have contributed further to our understanding of the contribution of altered gene expression in CVID pathogenesis. We believe that to unravel further the complexities of CVID, a multi-omic approach, combining DNA sequencing with gene expression, methylation, proteomic and metabolomics data, will be essential to identify novel disease-associated pathways and therapeutic targets.

OTHER ARTICLES PUBLISHED IN THIS REVIEW SERIES

Clinical challenges in the management of patients with B cell immunodeficiencies. Clinical and Experimental Immunology 2017, 188: 323–5.

When to initiate immunoglobulin replacement therapy (IGRT) in antibody deficiency: a practical approach. Clinical and Experimental Immunology 2017, 188: 333–41.

Progressive multi-focal leucoencephalopathy – driven from rarity to clinical mainstream by iatrogenic immunodeficiency. Clinical and Experimental Immunology 2017, 188: 342–52.

Considerations for dosing immunoglobulin in obese patients. Clinical and Experimental Immunology 2017, 188: 353–62.

Chronic norovirus infection and common variable immunodeficiency. Clinical and Experimental Immunology 2017, 188: 363–70.

Introduction

The molecular structure of DNA was discovered 60 years ago. It then took almost 25 years to develop recombinant DNA technologies which aided the development of an advanced sequencing technique, enabling sequence determination of random DNA stretches from any source. We know this method today as ‘Sanger sequencing’ 1. In the late 1980s the first fully automated sequencing machine was introduced, marking the beginning of the ‘genomics’ era 2. Whereas the sequencing of the first human genome was the world's largest collaborative biological project and took approximately 13 years from launch to completion 3, the 1000 genomes project was completed in 4 years 4. This major advancement was facilitated by the development of next-generation sequencing (NGS) methods in 2010 and progress in computing technology. The announcement of the Precision Medicine Initiative (https://www.whitehouse.gov/precision-medicine) is a big step forward in implementing genomic medicine as part of tailored care of individual patients suffering from common and rare diseases alike.

Genomic technologies

Previously, the search for genetic causes of disease was reliant upon candidate gene studies and genomewide association studies (GWASs) 5. Candidate gene studies require biological and clinical knowledge to select probable disease-causing genes for study, with genes amplified using multiple polymerase chain reactions (PCR) for Sanger sequencing. Such an approach is time-intensive and overlooks pathogenic variants in untested genes. GWASs test for a statistical association between hundreds of thousands of single nucleotide polymorphisms (SNPs) and clinical phenotype, and became a major method for gene discovery in complex diseases. Large sample sizes allowed detection of common variants with small effect sizes; however, associations were often lost after correction for multiple testing or in replication studies.
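The loss of associations after correction for multiple testing can be illustrated with a minimal sketch. It assumes a simple allelic chi-square test on a 2 × 2 table of allele counts and a Bonferroni correction; the allele counts, the number of tests and the function names are hypothetical and are not taken from any of the studies cited here, which may have used different statistical models.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    denom = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
             * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    if denom == 0:
        return 0.0  # a row or column is empty, so no association can be tested
    return n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / denom


def chi_square_p_value(stat):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical SNP: 30/100 risk alleles in cases versus 10/100 in controls.
stat = allelic_chi_square(30, 70, 10, 90)
p = chi_square_p_value(stat)
# Illustrative number of SNPs tested (assumption: 500 000).
threshold = bonferroni_threshold(0.05, 500_000)
print(f"chi2={stat:.2f}, p={p:.2e}, genome-wide threshold={threshold:.1e}")
print("significant after correction:", p < threshold)
```

In this hypothetical case the SNP is nominally significant (p < 0·05) but does not pass the corrected threshold, which is the pattern described above for variants of small effect.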

NGS technologies have revolutionized analysis of the human genome and its impact on health and disease 6. NGS involves the parallel sequencing of hundreds of millions of DNA molecules. Advantages of NGS over traditional methods include the unbiased sequencing of the entire genome or exome at high depth of coverage 7. Rather than analysing only select genes, this minimizes the chances of missing disease-associated variants and increases the confidence in variant calling, when each base is sequenced multiple times. While whole genome sequencing (WGS) will report on the entire human genome, including exons, introns, regulatory regions and intergenic regions, whole exome sequencing (WES) is targeted sequencing of the 1% of the human genome that is protein-coding, but is reported to harbour 85% of disease-causing variants 8. Targeted NGS may also be applied to smaller panels of genes, identified through initial larger screens such as WGS or WES. As well as DNA sequences, NGS can also be used to detect structural variation 9, the methylation status of DNA 10, RNA transcript sequence and abundance 11 and the openness of chromatin 12.
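The relationship between sequencing effort and depth of coverage can be sketched with the idealized formula depth = (number of reads × read length) / target size. The genome size (about 3 × 10^9 bases), the read length and the read counts below are assumptions chosen for illustration; real experiments lose reads to duplicates, off-target capture and uneven coverage, so this is a lower bound on the effort needed rather than a planning tool.

```python
import math

GENOME_BASES = 3_000_000_000        # assumption: approximate human genome size
EXOME_BASES = GENOME_BASES // 100   # protein-coding ~1% of the genome, as stated in the text


def mean_depth(n_reads, read_length, target_bases):
    """Idealized mean depth of coverage over a target region."""
    if target_bases <= 0:
        raise ValueError("target_bases must be positive")
    return n_reads * read_length / target_bases


def reads_for_depth(depth, read_length, target_bases):
    """Minimum number of reads needed to reach a mean depth, ignoring all losses."""
    return math.ceil(depth * target_bases / read_length)


# Hypothetical WGS run: 600 million reads of 150 bases.
print(mean_depth(600_000_000, 150, GENOME_BASES))
# Reads needed for the same mean depth over the exome alone.
print(reads_for_depth(30, 150, EXOME_BASES))
```

Under these assumptions the exome requires roughly one-hundredth of the reads needed for the genome at equal depth, which is why WES can afford deeper coverage per base for the same sequencing effort.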

Genetics in PID diagnostics

Primary immunodeficiencies (PIDs) are a group of rare diseases which has particularly benefited from the advent of NGS. PIDs are defined as inborn errors of immunity, rather than immunodeficiency secondary to immunosuppressive drugs, malnutrition, infection or lymphoid malignancies. PIDs are characterized by partial or complete loss of immune system function. Patients may experience recurrent infections, autoimmunity, allergy and/or malignancy 13. A timely and accurate diagnosis of a PID is essential to prevent the significant morbidity and mortality associated with disease.

In 2011 the International Union of Immunological Societies (IUIS) listed 180 genetically defined single gene inborn errors of immunity discovered over a period of more than 40 years 14. The underlying gene defects were discovered by linkage analysis [e.g. interleukin 2 receptor gamma (IL2RG) in X-linked severe combined immunodeficiency (SCID)] 15, positional cloning [e.g. Bruton tyrosine kinase (BTK) in X-linked agammaglobulinaemia (XLA)] 16, candidate gene analysis [e.g. recombination activating 1 (RAG1) and RAG2 in T cell-negative B cell-negative natural killer cell-positive (T−B−NK+) SCID] 17 and fluorescence in-situ hybridization (FISH) (e.g. 22q11.2 deletions in DiGeorge syndrome) 18. All these discoveries depended upon either large families or well-informed guesses based on the clinical presentation. The fast development of NGS techniques has accelerated the discovery of novel PIDs drastically, as exemplified by more than 300 genetically defined single gene PIDs to date 13. Mutations in GATA-binding protein 2 (GATA2), magnesium transporter 1 (MAGT1) and signal transducer and activator of transcription 1 (STAT1) were the first ones identified in PID patients by WES 19-21. However, not all PIDs are monogenic defects. There are examples of digenic conditions such as familial haemophagocytic lymphohistiocytosis 22, and the most common form of antibody deficiency, common variable immunodeficiency disorder (CVID), is a heterogeneous polygenic disorder in most cases 23 (Fig. 1).

Fig. 1. There are multiple genetic components involved in common variable immunodeficiency disorder (CVID) pathogenesis. Monogenic causes of CVID have been found in approximately 10% of cases, and primarily modify molecules associated with the B cell receptor complex. In most cases there is a polygenic mode of inheritance, whereby variants in multiple genes can contribute to the same or diverse phenotypes. Further genetic complexity may come from transcriptional and epigenetic disturbances.

There is great potential for NGS in PID patient care. It may provide a molecular diagnosis where previously the patient was unclassified, and thus identify therapeutic options, e.g. haematopoietic stem cell transplantation, providing a prognosis or risk-group stratification. It may be relevant to screen the patient's family and offer genetic counselling. Selecting patients and families to put forward for NGS based on their clinical and immune phenotypes remains a pivotal and indispensable step to increase the chances of success. Detailed phenotypical data need to be validated and applied to the analysis steps to correlate disease-causing and disease-associated variants. To this end, clinicians need to ensure that a standardized and robust process is in place to collect and categorize clinical and functional phenotypical data.

Despite the technical and clinical advancements made as a result of NGS, the identification of genetic defects in PIDs is still a major challenge 24. These challenges are of a technical as well as disease-inherent nature. Technical complications may depend upon the genomic platform used. Targeted gene panels and WES detect only exonic mutations. None the less, the discovery rate for disease-causing variants using WES or a targeted PID panel is 25–60% 25, 26. Undiagnosed PID patients might have mutations in the 10% of exons not covered by WES 27 or in intronic, promoter or regulatory regions. The latter types of variations could be detected by WGS; however, the amount of data generated presents a bioinformatics bottleneck, and is currently a major limitation of this method for use in routine PID diagnostics. For an unknown gain in signal by applying WGS instead of WES, the data noise increases 100-fold 28. In addition, it is not always clear that the immune deficiency is due to germline mutations and the possibility of somatic and post-zygotic mutations may complicate analysis, as these variants may be discarded during the analysis phase.

Disease-associated challenges include the fact that mutations in different genes can result in similar phenotypes (locus heterogeneity; Fig. 1), while mutations in different parts of the same gene can present with distinct phenotypes (allelic heterogeneity) 26. The polygenic and heterogeneous nature of CVID presents it as one of the greater challenges. In this review, we will focus on the advances made and challenges we face to unravel CVID with NGS techniques.

CVID

Clinical presentation

CVID comprises a heterogeneous group of PIDs, with low to undetectable antibody levels as the common denominator. All CVID patients suffer from an increased susceptibility to microbial infections but, in addition, approximately two-thirds of the patients develop different complications, including chronic inflammatory disorders (e.g. colitis, granulomas), polyclonal lymphoproliferation, autoimmune syndromes (e.g. cytopenias) and/or malignancies (e.g. leukaemia, lymphoma, colon cancer) 29, 30. These CVID comorbidities reduce the life expectancy and quality of life of the affected patients. Due to the non-specific clinical presentation and absence of a fundamental laboratory definition, the diagnosis is based on exclusion of other conditions associated with hypogammaglobulinaemia. Not only the clinical but also the cellular and genetic phenotypes of CVID are extremely heterogeneous.

Monogenetic defects with CVID-like phenotypes

Molecular defects causing CVID have been discovered mainly in consanguineous families with several members with a CVID-like phenotype. The first to be described were recessively inherited mutations in components affecting B cell activation (CD19, CD20, CD21, CD81), isotype switching and somatic hypermutation (ICOS) 31-36. More recently, autosomal dominant CVID-like conditions have been described. Monogenic, familial CVID accounts for less than 20% of all CVID cases. In most cases, patients present early with severe recurrent life-threatening infections. There is a clear B cell defect as measured by antibody production and vaccine responses.

Challenges of the CVID cohort

A genetic component of CVID was already recognized in 1968 37. Understanding the genetic variability in CVID is critical to developing personalized approaches to treatment, comorbidity monitoring and care of patients with CVID. However, clinical heterogeneity and a predominantly sporadic disease onset hamper the identification of molecular defects in CVID. Mutations identified in familial CVID could not be detected in a cohort of sporadic CVIDs in Oxford (unpublished observations), suggesting two types of CVID: monogenic CVID-like diseases and classical CVID, which is highly probably a complex polygenic disease. Unlike monogenic diseases with clear genotype–phenotype correlations, complex diseases are affected by cumulative effects of polygenic determinants that could even be of incomplete penetrance, gene–gene interactions and regulatory variation in non-coding regions. Thus, molecular approaches thus far have been largely unrewarding in identifying the disease cause in non-familial CVID.

NGS in CVID investigation

GWAS and CVID susceptibility loci

The team of Orange et al. 23 performed the first GWAS study in CVID, genotyping 363 patients with 610 000 SNPs. This study identified a strong association of CVID with disintegrin metalloproteinase (ADAM) genes and confirmed the previously identified association with the major histocompatibility complex (MHC) region 38. ADAM genes encode a family of zinc metalloproteases involved in many immunological processes, e.g. leucocyte adhesion. Although a single causative susceptibility locus for CVID could not be identified, it became clear that the 1000 most significant SNPs were strongly predictive of the CVID phenotype and allowed prediction of CVID by genetic profiling. In addition to CVID-associated SNP regions, Orange et al. identified several disease-associated deleterious duplications and deletions. These structural variants suggest novel genetic causes of CVID; however, none of the variants is as yet confirmed functionally. Many of the identified SNPs and structural variants are unique to individual patients, indicating a great mechanistic diversity, and support the hypothesis that a collection of diverse mechanisms underlies complex phenotypes such as CVID. Furthermore, the observed increase in copy number variation in CVID does not correlate with patient age or incidence of malignancy or other subphenotypes 39.

Two recent studies used the Illumina Immuno BeadChip for detection of previously described SNPs associated with autoimmune and inflammatory diseases 40, 41. Given that autoimmune manifestations are a comorbidity of CVID and the human leucocyte antigen (HLA) locus is associated with CVID susceptibility, Li et al. 40 hypothesized that CVID and autoimmune syndromes have common risk loci. Genotyping well-defined autoimmune risk loci in a cohort of 778 CVID patients identified C-type lectin domain family 16 member A (CLEC16A) at 16p13.13 as a novel risk locus for CVID. CLEC16A is expressed differentially based on the risk allele, with the protective minor allele expressed more highly than wild-type. The importance of CLEC16A for B cell homeostasis was tested by a Clec16 knock-down in mice. Although the Clec16 knock-down mouse phenotype [reduced B cells and elevated immunoglobulin (Ig)M] does not resemble human CVID fully, it confirms the importance of CLEC16A in B cell homeostasis and suggests that CLEC16A, together with other dysregulated pathways, contributes to the B cell phenotype in CVID. Genotyping autoimmune risk loci in CVID revealed an additional six loci [Fc receptor-like A (FCRLA), eomesodermin homologue (EOMES), tumour necrosis factor (TNF)AIP3 interacting protein 1 (TNIP1), TNF alpha-induced protein 3 (TNFAIP3), TNF (ligand) superfamily member 11 (TNFSF11) and protein tyrosine phosphatase, non-receptor type 2 (PTPN2)] that suggest an association with CVID, although further functional proof is needed. It is poorly understood how autoimmunity develops in an immune-deficient individual. CLEC16A, and possibly other autoimmune risk loci, could be genes/proteins linking autoimmunity and immunodeficiency in CVID, although further validation of this result in a larger patient cohort is needed.

Maggadottir et al. 41 genotyped 164 CVID patients with the Immuno BeadChip. They focused their analysis on the detection of rare CVID-associated variants [minor allele frequency (MAF) < 5%] following the hypothesis that CVID is the result of a combination of rare variants in each individual patient, rather than variants common to many patients. In addition to the previously described association of CVID with the MHC locus, the study identified 11 SNPs at the 16p11.2 locus, with the most significant SNPs being in fusion [involved in t(12;16) in malignant liposarcoma] (FUS) and integrin, alpha M (complement component 3 receptor 3 subunit) (ITGAM) (encoding CD11b). ITGAM is a component of complement receptor 3 important in proinflammatory responses. Identification of functional association partners for ITGAM revealed several genes containing SNPs with nominally significant associations with CVID. Furthermore, pathway analysis of the ITGAM network demonstrated an enrichment of multiple B and T cell signalling pathways that might be relevant to CVID pathophysiology. However, the effect of the SNP on mRNA and protein levels has not yet been validated functionally, and a genotype–phenotype correlation for the ITGAM SNP has not yet been established. Interestingly, 80% of the patients with the rare ITGAM variant had low switched memory B cells, which was seen in only 60% of the patients without the ITGAM variant. It is known that CVID patients with low switched memory B cells are at higher risk of autoimmune diseases. Considering that the Immuno BeadChip was designed based on GWAS data on autoimmune and inflammatory conditions, it might not be surprising that in the rather small CVID study cohort no further associations with other comorbidities have been detected.
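The rare-variant filter mentioned above (MAF < 5%) can be sketched as follows. This is an illustration only: the variant identifiers and allele counts are hypothetical, and the sketch computes MAF from allele counts within a single cohort of 164 diploid individuals (328 alleles), whereas a real analysis may take frequencies from a reference population instead.

```python
def minor_allele_frequency(alt_count, total_alleles):
    """Frequency of the less common allele at a biallelic site."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    freq = alt_count / total_alleles
    return min(freq, 1.0 - freq)


def rare_variants(variants, threshold=0.05):
    """Return identifiers of variants whose MAF is below the threshold.

    `variants` is an iterable of (identifier, alt_count, total_alleles) tuples.
    """
    return [
        variant_id
        for variant_id, alt_count, total_alleles in variants
        if minor_allele_frequency(alt_count, total_alleles) < threshold
    ]


# Hypothetical allele counts in 164 diploid patients (328 alleles).
cohort = [
    ("snp_a", 3, 328),    # alternative allele is rare
    ("snp_b", 40, 328),   # common variant, filtered out
    ("snp_c", 320, 328),  # reference allele is the minor allele here
]
print(rare_variants(cohort))
```

Taking the minimum of the allele frequency and its complement matters for sites such as the hypothetical `snp_c`, where the reference allele is the rarer one.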

The Immuno BeadChip assesses only SNPs identified previously in GWASs of autoimmune and inflammatory diseases. Although this is a promising tool to identify common pathways in immune dysregulation symptoms, it is unlikely that it will explain the whole spectrum of CVID. Given the polygenic nature of CVID, it is likely that susceptibility loci will contribute to or modify the phenotypes that are not associated with autoimmune/inflammatory diseases.

CVID transcriptional signature

In-silico analysis of the identified SNPs associated with CVID susceptibility highlighted that CVID SNPs are enriched for enhancer markers, DNase sites, promoter/enhancer histone markers and transcription regulatory motifs, suggesting that CVID is not only the result of variations in protein-coding genes, but also of deregulated transcriptional regulation 41 (Fig. 1).

To understand CVID pathogenesis and the functional consequence of identified regulatory SNPs more clearly, it is important to reveal the transcriptional signature of immune cells of CVID patients. To date, only two studies have analysed the transcriptional signature of CVID. Park et al. 42 compared the transcriptomes of peripheral blood mononuclear cells (PBMCs) from CVID patients with and without inflammatory/autoimmune manifestations and healthy controls. The patient group with inflammatory/autoimmune complications contributed most of the transcriptional difference between CVID and healthy subjects. A more detailed modular analysis to group genes according to their shared expression patterns across health and disease revealed that in patients with inflammatory/autoimmune conditions interferon (IFN)-related modules are over-expressed and transcripts related to B cell, plasma cell and T cell modules are down-regulated. After in-vitro stimulation, CVID PBMCs produced less IFN-γ compared to healthy controls. Despite the differential gene expression between the infections-only and inflammatory CVID groups, IFN-γ expression and secretion were not different between the two patient groups. In contrast, PBMCs from CVID patients with inflammatory conditions produced less IFN-α than PBMCs from patients without these conditions. These results indicate that analysing RNA microarray data alone is not sensitive enough to discriminate clinical phenotypes and that the identification of new biomarkers for CVID depends greatly upon the chosen readout system. Transcriptome analysis helps in identifying differentially expressed genes; however, functional validation is crucial to confirm a correlation between transcriptional signature and phenotype.

Epigenetic component of CVID

In CVID, cellular markers, e.g. CD21low B cells, are not associated with specific genetic defects and vary even in patients with identical genetic defects 43. This phenomenon could result from (i) the polygenic nature of the disease and/or (ii) environmental cofactors that modify disease susceptibility and molecular phenotype (Fig. 1). A recent study showed that epigenetic variation also contributes to the development of CVID. Rodríguez-Cortez et al. 44 reported distinct DNA methylation patterns in monozygotic twins discordant for CVID. B cells from the individual with CVID showed a gain of DNA methylation in genes critical for B cell function, particularly those involved in the B cell receptor (BCR) signalling pathway, such as PIK3CD. The CVID-associated hypermethylation of genes is specific to B cells among immune cells. CVID patients’ B cells showed an impaired ability to demethylate and up-regulate affected genes in transitioning from naive to memory B cells. In healthy B cells, analysis of histone modifications in sequences containing the cytosine–phosphate–guanine (CpG) sites undergoing demethylation showed an increase in the activating histone mark H3K4me3 during naive to memory cell differentiation in B cell-specific genes such as PIK3CD. Thus, gene expression is linked tightly to DNA methylation and histone modifications which are influenced by environmental factors. In a certain genetic background, DNA methylation changes are related to the onset of CVID. It is not yet clear when the changes in methylation occur and whether alterations in DNA methylation are a cause or consequence of CVID. The importance of epigenetic changes as a cause of CVID pathogenesis is emphasized by the finding that differential DNA methylation plays a crucial role during development and activation of human B cells 45, 46. In addition, inhibitors of certain histone-modifying enzymes interfere with plasma cell development 47.
Global reprogramming of the epigenome during the naive to memory B cell transition would allow an adult somatic cell to differentiate into diverse multiple cell types 45. Further studies are needed to link the mainly descriptive studies on changes in DNA methylation to defined molecular mechanisms modifying B cell development in healthy individuals and CVID patients.

Combining genomic technologies

Given the clinically heterogeneous and probable polygenic nature of CVID, we performed the first study of WGS in CVID 48, combined with RNA-seq in selected patients, to gain a comprehensive understanding of the genomic contribution to disease pathogenesis.

This study presented numerous lines of evidence for the polygenic nature of sporadic CVID. There was an average of 9·4 variants per patient, and 84% of variants were shared between two or more patients. We identified variants in 10 CVID-associated genes, 19 PID-associated genes and 24 genes identified previously by GWAS. Seven recurrent CVID-associated variants were identified, five of which were in TACI. We also identified novel variants in CVID-associated genes, including lipopolysaccharide (LPS)-responsive vesicle trafficking, beach and anchor (LRBA). The most notable novel variant was a hemizygous frameshift leading to a premature stop codon in BTK in one patient. Previous screening for known BTK variants in this patient was unsuccessful, and phenotypically the patient fits suspected X-linked agammaglobulinaemia caused by BTK deficiency. This finding demonstrates the power of an unbiased genomic approach in generating clinically relevant information and in providing a molecular diagnosis in PID.

WGS also provides the advantage of allowing analysis of genes linked functionally with disease-associated genes. Through pathway analysis we identified 112 variants, 38 of which were novel, in 101 genes. This analysis strategy increased the power to discover novel disease-associated variants or genes, as well as biological pathways. Pathway analysis is important, as genes function in concert with others to control biological mechanisms. It is possible that combinations of variants in the same or overlapping pathways contribute to the pathogenesis of heterogeneous diseases such as CVID. To explore pathway analysis further, together with the contribution of variation in regulatory regions of the genome such as promoters, 5' and 3' untranslated regions (UTRs) and miRNA binding sites, we sequenced the transcriptome of isolated B cells from three patients using RNA-seq. Importantly, there was significant overlap of pathways identified in WGS and RNA-seq. We identified an enrichment of variants in pathways such as B cell receptor signalling, DNA repair, apoptosis and T cell regulation. Combining transcriptomics with WGS data aids in the validation of variants. Genetic findings must be validated experimentally, and one such way would be to show impaired mRNA expression. We identified variants associated with altered mRNA expression, including PID- and CVID-associated genes.
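One common way to quantify an enrichment of variant-carrying genes in a pathway is a one-sided hypergeometric test. The sketch below is a generic illustration of that test and is not a description of the pathway analysis used in the study; the total number of genes, the pathway size and the number of hits in the pathway are hypothetical, and only the figure of 101 variant-carrying genes is taken from the text.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, hit_genes, hits_in_pathway):
    """One-sided hypergeometric p-value for pathway enrichment.

    Probability of observing at least `hits_in_pathway` pathway members when
    `hit_genes` genes are drawn at random, without replacement, from
    `total_genes` genes of which `pathway_genes` belong to the pathway.
    """
    if not 0 <= hits_in_pathway <= min(pathway_genes, hit_genes):
        raise ValueError("hits_in_pathway is out of range")
    upper = min(pathway_genes, hit_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, hit_genes - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return tail / comb(total_genes, hit_genes)


# Hypothetical example: 20 000 genes in total, a pathway of 150 genes,
# 101 variant-carrying genes, 5 of which fall in the pathway.
p = hypergeom_enrichment_p(20_000, 150, 101, 5)
print(f"enrichment p-value: {p:.2e}")
```

When many pathways are tested, each p-value would additionally need correction for multiple testing before an enrichment is reported.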

These data demonstrate the power of NGS in understanding the genomic architecture of CVID and the strength of combining complementary technologies. It has laid the ground for further transcriptomic studies to stratify clinical phenotypes and interrogate pathways of interest.

Conclusions

In the field of CVID there remains divided opinion regarding the likelihood of finding monogenic aetiology in sporadic CVID cases. Attempts at WGS of individual cohorts have so far not been successful, and will not provide enough power for cohort analysis. Overall, CVID genomics remains a complex challenge and it is our opinion that to understand CVID we need to approach the disease in a different manner. By using NGS, epigenetics, proteomics and possibly metabolomics in combination we may be able to define which pathways are dysregulated and open to therapeutic modification. Hence, we may come several steps closer to improving treatment for patients which, for many years, has remained static. Such an undertaking would require collaboration within the PID research community to increase cohort size and across the scientific and pharmaceutical communities. The future, however, is optimistic.

Disclosure

None to declare.
