Single-cell omics: Overview, analysis, and application in biomedical science
Abstract
Single-cell sequencing methods provide the highest resolution insight into cellular heterogeneity. Owing to their rapid growth and decreasing cost, they are now widely accessible to scientists worldwide. Single-cell technologies enable analysis of a large number of cells, making them powerful tools to characterise rare cell types and refine our understanding of diverse cell states. Moreover, the application of single-cell methods in biomedical sciences helps to unravel mechanisms related to disease pathogenesis and outcome. In this Viewpoint, we briefly describe existing single-cell methods (genomics, transcriptomics, epigenomics, proteomics, and multiomics), comment on available analysis tools, and give examples of method applications in the biomedical field.
1 INTRODUCTION
Human tissues are composed of heterogeneous cell populations that harbour different cell types. With the help of bulk sequencing, where individual cells cannot be distinguished from each other, many valuable insights into tissue dynamics have already been obtained. However, this approach only captures average properties of the population constituents, which often do not accurately portray the state of an individual cell.1 In recent years, the development of single-cell methods has allowed us to examine heterogeneity and cell-to-cell variation from a new perspective by providing a complete and unbiased analysis of each cell. Single-cell genomics was subsequently recognised as “Method of the Year 2013,” and most recently, the combination of methodologies, single-cell multiomics, was awarded “Method of the Year 2019.”2 The detailed technical aspects of the numerous single-cell approaches have already been reviewed elsewhere,3-6 and will, therefore, not be covered here. The aim of this Viewpoint article is to provide a summary of the various single-cell omic approaches, briefly cover analysis methods and present examples of their applications in biomedical sciences.
2 SINGLE-CELL GENOMICS
Single-cell DNA sequencing (scDNA-seq) helps to unmask intercellular variation and heterogeneity at the genomic level. Current scDNA-seq methods allow the study of single nucleotide variants, copy number variations and microsatellite variations.7-9 Cancer biology is one of the research areas where scDNA-seq is widely applied. It can be used to trace the expansion of different clones and reconstruct cell lineages in the mosaic tissue of the tumour.10, 11 In addition, scDNA-seq has the power to characterise rare cell types (e.g., cancer stem cells) that would have been missed in conventional bulk analyses.12 Despite these benefits, numerous challenges still need to be addressed, such as selection bias caused by the whole genome amplification step. As a single cell holds only two copies of genomic DNA with an approximate weight of 6 pg, a wide variety of amplification methods had to be developed to obtain enough material for library preparation. More recently, microfluidic systems have been used to build libraries without preamplification steps by direct tagmentation of single-cell DNA, which achieves more uniform coverage and provides high-resolution single-cell copy-number profiles.13 Other limiting factors include the throughput of cell isolation techniques, the occurrence of allele dropout events, loss of coverage uniformity, and false-positive and false-negative errors.14, 15
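The figure of roughly 6 pg of DNA per cell can be checked with back-of-the-envelope arithmetic. The sketch below assumes a haploid human genome of about 3.1 billion base pairs and an average mass of about 650 g/mol per base pair. Both are common approximations and are not values taken from this article.

```python
# Rough estimate of the DNA mass in one diploid human cell.
# Assumed values (approximations, not taken from the article):
HAPLOID_GENOME_BP = 3.1e9      # base pairs per haploid human genome
AVG_BP_MASS_G_PER_MOL = 650.0  # average molar mass of one base pair
AVOGADRO = 6.022e23            # molecules per mole


def diploid_dna_mass_pg(haploid_bp=HAPLOID_GENOME_BP):
    """Return the approximate mass of two genome copies in picograms."""
    total_bp = 2 * haploid_bp
    grams = total_bp * AVG_BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e12  # grams -> picograms


mass_pg = diploid_dna_mass_pg()
print(f"Approximate DNA mass per diploid cell: {mass_pg:.1f} pg")
```

Under these assumptions the estimate falls between 6 and 7 pg, which illustrates why amplification or highly efficient library construction is needed before sequencing.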
3 SINGLE-CELL TRANSCRIPTOMICS
Single-cell transcriptomics has developed rapidly since 2009, when the first single-cell transcriptome profile was described.16 Cell populations with homogeneous cell surface markers harbour cell-to-cell variations with a considerable impact on cell function.17 This heterogeneity can be resolved using single-cell RNA sequencing (scRNA-seq). Currently available scRNA-seq technologies can be divided into two categories: (1) droplet-based (e.g., Drop-seq,18 inDrop,19 10x Genomics,20 Seq-well21) and (2) plate-based (e.g., STRT-seq,22 Smart-seq 1–3).23-25 All available methods are based on the conversion of RNA into complementary DNA followed by amplification steps to obtain sufficient amounts of DNA for sequencing. Plate-based approaches (e.g., Smart-seq) sequence full-length transcripts, which allows the detection of lowly expressed genes, alternative splicing events and allele-specific expression. Droplet-based methods (e.g., 10x Genomics), on the other hand, can analyse a larger number of single cells. However, since they sequence only the 5′ or 3′ end of the transcript, they cannot detect allele-specific expression or isoforms.26
Single-cell transcriptomics is the most mature among single-cell methods. However, it also has several limitations. The majority of the currently available protocols focus only on the polyadenylated messenger RNA (mRNA) fraction, thus excluding microRNAs and other regulatory RNAs from the analysis. To circumvent that problem, new methods have been developed that capture the entire RNA content of a single cell.27 Another problem stems from tissue processing, which can lead to both a distorted reflection of the composition of the original tissue and altered gene expression patterns.28, 29 To avoid that, new protocols have been developed that aim to minimise gene expression artefacts related to tissue preprocessing.30
4 SINGLE-CELL EPIGENOMICS
Epigenetics is a complex regulatory network that modulates chromatin structures and genome function via chemical modifications of DNA and histone proteins.31, 32 Currently available single-cell methods focus on the investigation of the methylation pattern and chromatin state.
4.1 Methylation profile
The analysis of single-cell DNA methylation typically relies on methylation-sensitive restriction enzymes and bisulfite conversion. The quantitative information of bisulfite sequencing is considered the gold standard for genome-wide methylation analysis, but its application to single cells is hampered by DNA degradation, resulting in high dropout rates.38 Single-cell bisulfite sequencing (scBS-seq) allows the detection of the 5mC methylation status at CpG sites, genomic regions where a cytosine nucleotide is followed by a guanine nucleotide.39 The main limitation of this method is a relatively poor genome coverage of only up to 48% with high sequencing depth in a single cell. Due to the lack of genome-wide coverage, allele-specific differences in methylation are difficult to detect.40 An alternative to scBS-seq is single-cell reduced representation bisulfite sequencing (scRRBSseq).41, 42 Compared with the scBS-seq technique, scRRBSseq covers fewer CpG sites, but it provides better coverage for CpG islands, which are likely to be the most informative elements for DNA methylation.42
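To make the bisulfite readout concrete, the following minimal sketch locates CpG sites in a short reference sequence and estimates per-site methylation from bisulfite-converted reads, in which unmethylated cytosines read as T while methylated cytosines remain C. The sequences, the read representation (forward-strand reads given as a start offset plus sequence, assumed to be already aligned) and the function names are hypothetical. Real pipelines additionally handle both strands, alignment and incomplete conversion.

```python
def find_cpg_sites(reference):
    """Return 0-based positions of every C that is followed by G."""
    return [i for i in range(len(reference) - 1)
            if reference[i] == "C" and reference[i + 1] == "G"]


def methylation_calls(reference, aligned_reads):
    """Count methylated (C) and unmethylated (T) calls at each CpG.

    aligned_reads: list of (start, sequence) tuples on the forward strand,
    already aligned to the reference (a simplifying assumption).
    """
    calls = {pos: [0, 0] for pos in find_cpg_sites(reference)}  # [meth, unmeth]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos in calls:
                if base == "C":
                    calls[pos][0] += 1
                elif base == "T":
                    calls[pos][1] += 1
    return calls


def summarise(calls):
    """Return (fraction of CpGs covered, {position: methylation level})."""
    levels = {pos: m / (m + u) for pos, (m, u) in calls.items() if m + u > 0}
    coverage = len(levels) / len(calls) if calls else 0.0
    return coverage, levels


# Toy data: three CpG sites, of which only two are covered by reads.
reference = "ACGTTCGAACGT"
reads = [(0, "ACGTTTGA"), (4, "TTGAA")]
coverage, levels = summarise(methylation_calls(reference, reads))
print(f"CpG coverage: {coverage:.2f}", levels)
```

In this toy case one CpG site receives no reads at all, mirroring on a small scale the incomplete genome coverage described above.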
4.2 Chromatin state
Single-cell techniques can also assess chromatin accessibility by transposase-accessible chromatin sequencing (ATAC-seq).33 Currently, both plate-based (interrogation of hundreds of cells) and droplet-based (interrogation of tens of thousands of cells) methods for scATAC-seq are available.34, 35 Additionally, DNA-protein interactions in single cells can be investigated via chromatin immunoprecipitation sequencing (ChIP-seq).36 This method identifies binding sites of DNA-associated proteins. Recently, methods for investigating histone modifications at the single-cell level have also become available.37 For example, single-cell Cleavage Under Targets and Tagmentation (scCUT&Tag) allows for profiling diverse chromatin components at single-cell resolution, including histone acetylation.37
5 SINGLE-CELL PROTEOMICS
Proteome analysis at the single-cell level can provide additional information on the state and function of a cell. However, analysing the protein content of a single cell is challenging, mainly due to the lack of methods for protein amplification and the added complexity of secondary and tertiary structures. In recent single-cell studies, cytometry approaches based on fluorescence-activated cell sorting and single-cell mass spectrometry (CyTOF) became available with medium throughput (∼40–50 proteins). CyTOF has been used to analyse surface and intracellular proteins by using antibodies conjugated to rare heavy metal isotopes. This method overcomes the spectral overlap issue characteristic of multicolour flow cytometry.43
6 SINGLE-CELL MULTIOMICS
Ultimately, the combination of the aforementioned single-cell sequencing techniques has attracted attention as it allows for parallel analysis of multiple molecular features within the same cell.4 One class of single-cell multiomic methods includes genome and transcriptome sequencing (G&T-seq), which allows for simultaneous interrogation of DNA and mRNA from a single cell to dissect the effect of genomic variation on gene expression.44 Another class of multiomic methods combines single-cell analyses of the epigenome and transcriptome, which enables scientists to investigate gene regulatory pathways. Accordingly, several approaches have been developed that profile chromatin accessibility and gene expression45, 46 as well as DNA methylation and gene expression.41, 47, 48 In addition, commercial solutions such as the “10x Chromium Single Cell Multiome ATAC + Gene expression” platform are now available. Finally, a third class of multiomic methods enables the analysis of cellular proteins and the transcriptome profile of individual cells.49, 50 One example is the cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq). CITE-seq uses oligonucleotide-labelled antibodies to detect cell surface proteins and measures the transcriptome profile simultaneously by standard scRNA-seq. It thus provides phenotypic information based on cell surface protein levels together with an unbiased transcriptome analysis, yielding joint information about the abundance of a protein of interest and the corresponding mRNA level at single-cell resolution.49
7 SPATIAL TRANSCRIPTOMICS
Current single-cell sequencing methods require tissue dissociation and thus result in loss of spatial contextualisation. The emerging field of spatial transcriptomics addresses this problem by aiming to characterise gene expression profiles while retaining the spatial information of a tissue.51 Spatial transcriptomics enables the visualisation of the mRNA distribution in tissue sections. Existing spatial transcriptomics methods can be roughly divided into (1) fluorescent in situ hybridization methods (e.g., RNAscope,52 MERFISH,53 seqFISH54) and (2) methods based on scRNA-seq (e.g., Slide-seq,55 sci-Space56). Fluorescence-based methods rely on direct labelling of tissue sections with fluorescent probes,52-54 whereas scRNA-seq-based methods apply oligonucleotide or bead-based spatial barcoding.55, 56 Spatial transcriptomics at single-cell resolution is a rapidly developing field that provides a unique opportunity to dissect the tissue architecture together with underlying cellular interactions, and thus offers a better understanding of morphologically complex tissues.57 A commercially available Visium HD solution from 10x Genomics for single-cell spatial transcriptomics is planned to be released in the first half of 2022. Availability of an easy-to-use commercial product will increase the accessibility of spatial transcriptomics and further broaden its applications.
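The spatial barcoding principle used by the sequencing-based methods can be illustrated with a toy example: each capture spot or bead carries a known barcode tied to a tissue coordinate, so sequenced reads can be assigned back to a position. The barcodes, coordinates and gene names below are invented for illustration and do not reflect the data format of any specific platform.

```python
from collections import defaultdict

# Hypothetical lookup of spatial barcodes to (x, y) positions on the slide.
barcode_to_xy = {
    "AAAC": (0, 0),
    "CCGT": (0, 1),
    "GGTA": (1, 0),
}

# Hypothetical sequenced reads as (spatial barcode, gene) pairs.
reads = [
    ("AAAC", "GENE_A"), ("AAAC", "GENE_A"), ("AAAC", "GENE_B"),
    ("CCGT", "GENE_B"),
    ("GGTA", "GENE_A"),
    ("TTTT", "GENE_A"),  # unknown barcode, discarded below
]


def build_spatial_counts(reads, barcode_to_xy):
    """Return {(x, y): {gene: count}} for reads with a known barcode."""
    counts = defaultdict(lambda: defaultdict(int))
    for barcode, gene in reads:
        xy = barcode_to_xy.get(barcode)
        if xy is None:
            continue  # read cannot be placed in the tissue
        counts[xy][gene] += 1
    return {xy: dict(genes) for xy, genes in counts.items()}


spatial_counts = build_spatial_counts(reads, barcode_to_xy)
for xy, genes in sorted(spatial_counts.items()):
    print(xy, genes)
```

The resulting position-indexed count table is the kind of structure from which an expression map over a tissue section can be drawn.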
8 COMPUTATIONAL METHODS FOR SINGLE-CELL ANALYSIS
The rapid growth and decreasing costs of single-cell methods are making them more accessible to scientists around the world.28, 58 The information about cellular phenotypes, developmental dynamics and communication networks is encoded in a complex sequencing dataset. Current multiomics methods can measure multiple features of single cells, including the genome, transcriptome, epigenome and proteome. However, the implementation of computational methods is crucial to extract biological information from the data,59 and the volume and complexity of the data pose unique computational challenges. Single-cell data are characterised by a high level of technical noise and multifactorial variability between cells.60, 61 The widely recognised challenges within single-cell data science include (1) scaling up to higher dimensionalities regarding cells and features, (2) integration of single-cell data across different samples, experiments and modalities, and (3) validation and benchmarking of available analysis tools.14
Analysing single-cell data is especially challenging for first-time users without strong bioinformatic support (Figure 1). Therefore, there is a growing focus on enhancing the user-friendliness and accessibility of available analytical tools.62 Open-source analysis tools with tutorials are available, such as Seurat from Satija's group (in the R programming language)63 or Scanpy from Theis's group (in the Python programming language).64 In addition, Bioconductor (https://www.bioconductor.org/), an open-source software project for bioinformatics, provides sophisticated tools for the analysis of high-throughput genomic data. Nevertheless, the application of those tools still requires basic programming skills. Therefore, commercial providers of single-cell technologies are developing software compatible with their products that allows wet lab scientists without programming skills to interpret single-cell data easily. One example of such software is Loupe Browser from 10x Genomics, a desktop application with a point-and-click user interface that facilitates the visualisation and interpretation of data from different 10x Genomics solutions. This includes the expression profile and/or chromatin accessibility of single cells, which can subsequently be used to identify and characterise cell types. The Satija lab also developed a web application, Azimuth (https://satijalab.org/azimuth/), that uses publicly available data sets to facilitate the interpretation of scRNA-seq data. Azimuth uses the count matrix of gene expression from the single-cell experiment and performs data processing including normalisation and clustering. It performs reference-based mapping, in which the uploaded dataset is compared to a reference dataset with annotated cell clusters to predict cluster identity. Further development of easy-to-use analysis tools will facilitate first interpretation of the data (Table 1).
However, in-depth analysis of complex datasets requires a good understanding of biological processes and advanced bioinformatics skills, which fosters collaboration between dry and wet lab scientists.
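As a rough illustration of two steps mentioned above, normalisation of a count matrix and reference-based annotation, the sketch below uses plain Python. It is not how Azimuth, Seurat or Scanpy are implemented. The scaling factor, the nearest-centroid rule, the three-gene profiles and all count values are simplifying assumptions chosen for readability.

```python
import math


def normalise(counts, scale=10_000):
    """Library-size normalise one cell's counts, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale) for c in counts]


def centroid(profiles):
    """Average normalised profiles gene by gene."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def annotate(cell, reference_centroids):
    """Assign the reference label whose centroid is closest (Euclidean)."""
    return min(reference_centroids,
               key=lambda label: math.dist(cell, reference_centroids[label]))


# Hypothetical annotated reference: raw counts for three genes per cell.
reference = {
    "T cell": [[50, 2, 1], [40, 1, 2]],
    "B cell": [[1, 45, 3], [2, 60, 2]],
}
reference_centroids = {
    label: centroid([normalise(c) for c in cells])
    for label, cells in reference.items()
}

# Hypothetical unannotated query cells, measured on the same three genes.
query_cells = [[30, 1, 1], [2, 35, 2]]
predicted = [annotate(normalise(c), reference_centroids) for c in query_cells]
print(predicted)
```

With these toy values the first query cell is assigned to the T cell reference and the second to the B cell reference. Dedicated tools replace each step with far more robust statistics, but the overall flow of normalising, comparing to an annotated reference and transferring labels is the same idea.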

Name | Short description | Reference |
---|---|---|
Azimuth | Web application for quick interpretation of scRNA-seq data. | https://satijalab.org/azimuth/ |
Cell Ranger | Four pipelines to analyse data from different 10x Genomics solutions. Read alignments, generation of feature-barcode matrices, clustering, gene expression analysis. | 10x Genomics |
CogentAP | Processing of the sequencing data generated by ICELL8 technology including generation of gene matrix for downstream analysis. | Takara Bio |
Cogent NGS Discovery Software | Visualisation of the data generated by CogentAP. | Takara Bio |
EpiScanpy | Preprocessing and analysis of epigenomic data (scATAC-seq and single-cell DNA methylation) including dimensionality reduction, clustering and visualisation. | Danese et al. (2019)65 |
Loupe Browser | Visualisation of the files generated by Cell Ranger pipelines. | 10x Genomics |
MultiMap | Tool for integration of scRNA-seq, scATAC-seq, single-cell DNA methylation and spatial data. | Jain et al. (2021)66 |
Scanpy | Preprocessing and analysis of scRNA-seq data including visualisation, clustering, trajectory inference, simulation of gene regulatory networks. | Wolf et al. (2018)64 |
Seurat | R package for quality control, analysis and exploration of scRNA-seq data. | https://satijalab.org/seurat/ |
Single Cell Interactive Application (SCiAp) | Tools and workflows for scRNA-seq data analysis from Human Cell Atlas and Single Cell Expression Atlas projects. | Moreno et al. (2021)67; https://humancellatlas.usegalaxy.eu/ |
Singular Analysis Toolset Software | Analysis and visualisation of the data from Fluidigm systems. Gene expression profile and mutation patterns. | Fluidigm |
Tapestri Insight | Software for single-cell DNA analysis and visualisation compatible with Tapestri Platform. | Mission Bio |
9 APPLICATION OF SINGLE-CELL TECHNOLOGIES
The rapid growth of single-cell methods is reflected in their increasing application in biology and medicine.68 In general, the use of single-cell sequencing in this context can be roughly grouped into three main themes: (1) developmental studies,69-71 (2) atlasing,72 and (3) precision medicine.73, 74
9.1 Developmental studies
Single-cell approaches capture cells at various developmental stages, namely before, during and after lineage commitment, and are thus well suited to resolve heterogeneity and individual steps of the developmental process. This approach has already been successfully applied to study the development of nematodes,70 mice71 and zebrafish.69 Single-cell methods are well suited to study the development of multicellular organisms as they can provide simultaneous measurement of clonal history and cell identity. In this context, information about gene expression heterogeneity across differentiation pathways brings increased resolution to our understanding of the lineage commitment process.
9.2 Atlasing
High-throughput single-cell methods are currently being applied to create a detailed map of the human organism under the initiative of the Human Cell Atlas (https://www.humancellatlas.org/). The aim is to create a detailed catalogue of all cell types present in the human body. The comparison of the cell types present during homeostasis with those in diseased tissue will help to identify molecular changes driving pathogenesis.72 Additionally, recent advances in mitochondrial DNA (mtDNA) tracing reveal the power of somatic mtDNA variation as natural genetic barcodes in primary human samples.75 Somatic mutations in mtDNA can be used to trace the clonal evolution of malignancies.
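The idea of using somatic mtDNA variants as natural barcodes can be sketched as grouping cells by the variants they share. The cell identifiers and variant labels below are invented, and exact-match grouping is a deliberate simplification. Real analyses must deal with heteroplasmy, dropout and sequencing error.

```python
from collections import defaultdict

# Hypothetical per-cell somatic mtDNA variant calls.
cell_variants = {
    "cell_1": {"m.1000A>G"},
    "cell_2": {"m.1000A>G"},
    "cell_3": {"m.1000A>G", "m.5000C>T"},
    "cell_4": {"m.9000G>A"},
    "cell_5": set(),  # no informative variant detected
}


def group_by_variant_profile(cell_variants):
    """Group cells that carry exactly the same set of variants."""
    clones = defaultdict(list)
    for cell, variants in cell_variants.items():
        if variants:  # cells without any variant carry no barcode
            clones[frozenset(variants)].append(cell)
    return dict(clones)


def is_subclone(child, parent):
    """True if child carries all parent variants plus at least one more."""
    return parent < child  # strict subset test on frozensets


clones = group_by_variant_profile(cell_variants)
for profile, cells in clones.items():
    print(sorted(profile), cells)
```

A profile that contains another profile plus additional variants, as for the hypothetical `cell_3` here, is what one would interpret as a candidate descendant subclone when reconstructing clonal evolution.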
9.3 Precision medicine
Highly innovative single-cell approaches are slowly starting their transition from basic science to clinical applications by enabling detailed characterisation of cell types, states and pathways associated with human diseases. A single-cell multiomic approach combining single-cell transcriptomics and lineage tracing was successfully applied to identify leukaemic and preleukaemic stem cells in acute myeloid leukaemia. It allowed the characterisation of the differentiation block stemming from the presence of leukaemic mutations.74 Another example illustrating the potential clinical application of single-cell methods used a combination of whole exome sequencing and single-cell genotyping to understand the genetic mechanisms driving progression and resistance in myelofibrosis.73 The increased interest in potential clinical applications of single-cell techniques is reflected in collaborative initiatives like LifeTime. LifeTime aims to understand the complex behaviour of human cells during disease progression and analyse their response to therapy, all at single-cell resolution.76 To achieve that, further development and integration of multiomics methods is urgently needed. In the future, the drop in single-cell method costs and the development of robust sample preprocessing pipelines will facilitate their transition to the clinic.
10 CONCLUDING REMARKS
Single-cell methods are transforming our understanding of biological processes by allowing more complex profiling of cells forming an organism. Further technological developments will make it possible to obtain information on the past (genome mutations), present (transcriptome and proteome), and future (chromatin accessibility) state of a cell from a single experimental snapshot.
ACKNOWLEDGEMENTS
This work was supported by the BSIO Graduate Programme (to C.M.S.), Deutsche Krebshilfe, 70113643 (to F.D.), and a Postdoctoral Research Fellowship from the Alexander von Humboldt-Stiftung (to P.M.S.). R.W. is supported by the German Research Foundation (WE 2554/13-1 and WE 2554/15-1).
CONFLICT OF INTERESTS
The authors declare that there is no conflict of interest.