RESEARCH ARTICLE

Differential expression of single-cell RNA-seq data using Tweedie models

Himel Mallick (Corresponding Author)

[email protected]

orcid.org/0000-0003-4956-2429

Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA

Suvo Chatterjee

Epidemiology Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, USA

Shrabanti Chowdhury

Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA

Saptarshi Chatterjee

Department of Statistics, Data and Analytics, Eli Lilly & Company, Indianapolis, Indiana, USA

Ali Rahnavard (Corresponding Author)

[email protected]

Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC, USA

Stephanie C. Hicks (Corresponding Author)

[email protected]

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA

Correspondence

Himel Mallick, Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ 07065, USA.

Email: [email protected]

Ali Rahnavard, Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA.

Email: [email protected]

Stephanie C. Hicks, Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA.

Email: [email protected]

First published: 02 June 2022

https://doi.org/10.1002/sim.9430

Himel Mallick and Suvo Chatterjee contributed equally to this article.

Funding information: Bill and Melinda Gates Foundation, Grant/Award Number: INV-016930; Division of Environmental Biology, Grant/Award Number: DEB-2028280; National Human Genome Research Institute, Grant/Award Number: R00HG009007

Abstract

The performance of computational methods and software for identifying differentially expressed features in single-cell RNA-sequencing (scRNA-seq) data has been shown to be influenced by several factors, including the choice of normalization method and the choice of experimental platform (or library preparation protocol) used to profile gene expression in individual cells. Currently, it is up to the practitioner to choose the most appropriate differential expression (DE) method from the more than 100 DE tools available to date, each relying on its own assumptions to model scRNA-seq expression features. To model the technological variability in cross-platform scRNA-seq data, we propose to use Tweedie generalized linear models, which can flexibly capture the large dynamic range of observed scRNA-seq expression profiles across experimental platforms, a range induced by platform- and gene-specific statistical properties such as heavy tails, sparsity, and gene expression distributions. We also propose a zero-inflated Tweedie model that allows the probability mass at zero to exceed that of a traditional Tweedie distribution, so that zero-inflated scRNA-seq data with excess zero counts can be modeled. Using both synthetic and published plate- and droplet-based scRNA-seq datasets, we perform a systematic benchmark evaluation of more than 10 representative DE methods and demonstrate that our method (Tweedieverse) outperforms state-of-the-art DE approaches across experimental platforms in terms of statistical power and false discovery rate control. Our open-source software (R/Bioconductor package) is available at https://github.com/himelmallick/Tweedieverse.
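
As a rough illustration of the zero-inflation idea described above, the sketch below assumes the standard compound Poisson-gamma parameterization of the Tweedie family (mean mu, dispersion phi, power parameter 1 < p < 2), under which the variance is phi * mu^p and the probability of an exact zero is exp(-mu^(2-p) / (phi * (2-p))). A zero-inflated version is assumed here to be a two-component mixture with an extra point mass at zero of weight pi0. The function names and the symbol pi0 are hypothetical and are not taken from the Tweedieverse package.

```python
import math


def tweedie_variance(mu: float, phi: float, p: float) -> float:
    """Tweedie variance function: Var(Y) = phi * mu**p."""
    return phi * mu ** p


def tweedie_zero_mass(mu: float, phi: float, p: float) -> float:
    """P(Y = 0) for a compound Poisson-gamma Tweedie variable (1 < p < 2).

    The number of gamma summands is Poisson with rate
    lam = mu**(2 - p) / (phi * (2 - p)), so Y is exactly zero
    with probability exp(-lam).
    """
    if not 1.0 < p < 2.0:
        raise ValueError("a point mass at zero requires 1 < p < 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))
    return math.exp(-lam)


def zi_tweedie_zero_mass(mu: float, phi: float, p: float, pi0: float) -> float:
    """P(Y = 0) under an assumed zero-inflated mixture.

    With probability pi0 the observation is a structural zero;
    otherwise it is drawn from the ordinary Tweedie distribution.
    """
    if not 0.0 <= pi0 <= 1.0:
        raise ValueError("pi0 must lie in [0, 1]")
    return pi0 + (1.0 - pi0) * tweedie_zero_mass(mu, phi, p)


if __name__ == "__main__":
    mu, phi, p = 1.0, 1.0, 1.5
    print("Tweedie P(Y=0):     ", round(tweedie_zero_mass(mu, phi, p), 4))
    print("ZI-Tweedie P(Y=0):  ", round(zi_tweedie_zero_mass(mu, phi, p, 0.3), 4))
    print("Tweedie variance:   ", tweedie_variance(mu, phi, p))
```

Under this assumed parameterization, any pi0 > 0 raises the zero probability above what the ordinary Tweedie distribution allows for the same mean and dispersion, which is the sense in which the zero mass can "exceed" that of a traditional Tweedie model.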

Open Research

DATA AVAILABILITY STATEMENT

Previously published data used in this study are cited in the main text as well as in the References section. A detailed data summary is provided in Table S1. Unless otherwise noted, most of the corresponding annotated digital expression matrices are available from the NCBI Gene Expression Omnibus database. In addition, analysis scripts to process and analyse these datasets are available at https://github.com/himelmallick/Tweedie_SingleCell.

REFERENCES

1. Hicks SC, Townes FW, Teng M, Irizarry RA. Missing data and technical variability in single-cell RNA-sequencing experiments. Biostatistics. 2018; 19(4): 562-578. doi:10.1093/biostatistics/kxx053
2. Ding J, Adiconis X, Simmons SK, et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat Biotechnol. 2020; 38(6): 737-746. doi:10.1038/s41587-020-0465-8
3. Chen G, Ning B, Shi T. Single-cell RNA-seq technologies and related computational data analysis. Front Genet. 2019; 10: 317. doi:10.3389/fgene.2019.00317
4. Zheng GX, Terry JM, Belgrader P, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017; 8(1): 1-12. doi:10.1038/ncomms14049
5. Macosko EZ, Basu A, Satija R, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell. 2015; 161(5): 1202-1214. doi:10.1016/j.cell.2015.05.002
6. Islam S, Zeisel A, Joost S, et al. Quantitative single-cell RNA-seq with unique molecular identifiers. Nat Methods. 2014; 11(2): 163-166. doi:10.1038/nmeth.2772
7. Grün D, Kester L, van Oudenaarden A. Validation of noise models for single-cell transcriptomics. Nat Methods. 2014; 11(6): 637-640. doi:10.1038/nmeth.2930
8. Picelli S, Faridani OR, Björklund ÅK, Winberg G, Sagasser S, Sandberg R. Full-length RNA-seq from single cells using smart-seq2. Nat Protoc. 2014; 9(1): 171-181. doi:10.1038/nprot.2014.006
9. Pollen AA, Nowakowski TJ, Shuga J, et al. Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex. Nat Biotechnol. 2014; 32(10): 1053. doi:10.1038/nbt.2967
10. Vieth B, Ziegenhain C, Parekh S, Enard W, Hellmann I. powsimR: power analysis for bulk and single cell RNA-seq experiments. Bioinformatics. 2017; 33(21): 3486-3488. doi:10.1093/bioinformatics/btx435
11. Townes FW, Hicks SC, Aryee MJ, Irizarry RA. Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model. Genome Biol. 2019; 20(1): 1-16. doi:10.1186/s13059-019-1861-6
12. Hafemeister C, Satija R. Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 2019; 20(1): 1-15. doi:10.1186/s13059-019-1874-1
13. Svensson V. Droplet scRNA-seq is not zero-inflated. Nat Biotechnol. 2020; 38(2): 147-150. doi:10.1038/s41587-019-0379-5
14. Cao Y, Kitanovski S, Küppers R, Hoffmann D. UMI or not UMI, that is the question for scRNA-seq zero-inflation. Nat Biotechnol. 2021; 39(2): 158-159. doi:10.1038/s41587-020-00810-6
15. Paulson JN, Stine OC, Bravo HC, Pop M. Differential abundance analysis for microbial marker-gene surveys. Nat Methods. 2013; 10(12): 1200-1202. doi:10.1038/nmeth.2658
16. Korthauer KD, Chu LF, Newton MA, et al. A statistical approach for identifying differential distributions in single-cell RNA-seq experiments. Genome Biol. 2016; 17(1): 222. doi:10.1186/s13059-016-1077-y
17. Soneson C, Robinson MD. Bias, robustness and scalability in single-cell differential expression analysis. Nat Methods. 2018; 15(4): 255. doi:10.1038/nmeth.4612
18. Finak G, McDavid A, Yajima M, et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 2015; 16(1): 1-13. doi:10.1186/s13059-015-0844-5
19. Sekula M, Gaskins J, Datta S. Detection of differentially expressed genes in discrete single-cell RNA sequencing data using a hurdle model with correlated random effects. Biometrics. 2019; 75(4): 1051-1062. doi:10.1111/biom.13074
20. Risso D, Perraudeau F, Gribkova S, Dudoit S, Vert JP. A general and flexible method for signal extraction from single-cell RNA-seq data. Nat Commun. 2018; 9(1): 284. doi:10.1038/s41467-017-02554-5
21. Alessandrì L, Arigoni M, Calogero R. Differential expression analysis in single-cell transcriptomics. Methods Mol Biol. 2019; 1979: 425-432.
22. Hie B, Peters J, Nyquist SK, Shalek AK, Berger B, Bryson BD. Computational methods for single-cell RNA sequencing. Annu Rev Biomed Data Sci. 2020; 3: 339-364. doi:10.1146/annurev-biodatasci-012220-100601
23. Van Buren E, Hu M, Weng C, et al. TWO-SIGMA: a novel two-component single cell model-based association method for single-cell RNA-seq data. Genet Epidemiol. 2021; 45(2): 142-153. doi:10.1002/gepi.22361
24. Miao Z, Deng K, Wang X, Zhang X. DEsingle for detecting three types of differential expression in single-cell RNA-seq data. Bioinformatics. 2018; 34(18): 3223-3224. doi:10.1093/bioinformatics/bty332
25. Hu MC, Pavlicova M, Nunes EV. Zero-inflated and hurdle models of count data with extra zeros: examples from an HIV-risk reduction intervention trial. Am J Drug Alcohol Abuse. 2011; 37(5): 367-375. doi:10.3109/00952990.2011.597280
26. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010; 26(1): 139-140. doi:10.1093/bioinformatics/btp616
27. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014; 15(12): 550. doi:10.1186/s13059-014-0550-8
28. Hawinkel S, Rayner J, Bijnens L, Thas O. Sequence count data are poorly fit by the negative binomial distribution. PLoS One. 2020; 15(4): e0224909. doi:10.1371/journal.pone.0224909
29. Zappia L, Phipson B, Oshlack A. Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database. PLoS Comput Biol. 2018; 14(6): e1006245. doi:10.1371/journal.pcbi.1006245
30. Zhang Y. Likelihood-based and Bayesian methods for Tweedie compound Poisson linear mixed models. Stat Comput. 2013; 23: 743-757. doi:10.1007/s11222-012-9343-7
31. Tweedie MC. An index which distinguishes between some important exponential families, 579. 1984.
32. Jørgensen B. Exponential dispersion models. J Royal Stat Soc Ser B (Methodol). 1987; 49(2): 127-145. doi:10.1111/j.2517-6161.1987.tb01685.x
33. Kurz C. Tweedie distributions for fitting semicontinuous health care utilization cost data. BMC Med Res Methodol. 2017; 17: 171. doi:10.1186/s12874-017-0445-y
34. Sarkar A, Stephens M. Separating measurement and expression models clarifies confusion in single-cell RNA sequencing analysis. Nat Genet. 2021; 53(6): 770-777. doi:10.1038/s41588-021-00873-4
35. Van den Berge K, Perraudeau F, Soneson C, et al. Observation weights unlock bulk RNA-seq tools for zero inflation and single-cell applications. Genome Biol. 2018; 19(1): 24. doi:10.1186/s13059-018-1406-4
36. McCullagh P, Nelder J. Generalized Linear Models. 2nd ed. Boca Raton, FL: Chapman & Hall; 1989. doi:10.1007/978-1-4899-3242-6
37. Cox D, Reid N. Parameter orthogonality and approximate conditional inference. J Royal Stat Soc Ser B. 1987; 49(1): 1-39.
38. Dunn P, Smyth G. Evaluation of Tweedie exponential dispersion model densities by Fourier inversion. Stat Comput. 2007; 18: 73-86. doi:10.1007/s11222-007-9039-6
39. Dunn PK, Smyth GK. Series evaluation of Tweedie exponential dispersion models. Stat Comput. 2005; 15: 267-280. doi:10.1007/s11222-005-4070-y
40. Dunn PK, Smyth GK. Evaluation of Tweedie exponential dispersion models using Fourier inversion. Stat Comput. 2008; 18: 73-86. doi:10.1007/s11222-007-9039-6
41. Bonat WH, Kokonendji CC. Flexible Tweedie regression models for continuous data. J Stat Comput Simul. 2017; 87(11): 2138-2152. doi:10.1080/00949655.2017.1318876
42. Ma Y, Sun S, Shang X, Keller ET, Chen M, Zhou X. Integrative differential expression and gene set enrichment analysis using summary statistics for scRNA-seq studies. Nat Commun. 2020; 11(1): 1-13. doi:10.1038/s41467-020-15600-6
43. Giner G, Smyth GK. statmod: probability calculations for the inverse Gaussian distribution. R J. 2016; 8(1): 339-351. doi:10.32614/RJ-2016-024
44. Amezquita RA, Lun ATL, Becht E, et al. Orchestrating single-cell analysis with bioconductor. Nat Methods. 2020; 17(2): 137-145. doi:10.1038/s41592-019-0654-x
45. Lun AT, Bach K, Marioni JC. Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol. 2016; 17(1): 1-14.
46. Luecken MD, Theis FJ. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol Syst Biol. 2019; 15(6): e8746. doi:10.15252/msb.20188746
47. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann Stat. 2001; 29(4): 1165-1188. doi:10.1214/aos/1013699998
48. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc Ser B (Methodol). 1995; 57(1): 289-300. doi:10.1111/j.2517-6161.1995.tb02031.x
49. Assefa AT, Vandesompele J, Thas O. SPsimSeq: semi-parametric simulation of bulk and single-cell RNA-sequencing data. Bioinformatics. 2020; 36(10): 3276-3278. doi:10.1093/bioinformatics/btaa105
50. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 2008; 9(1): 1-13. doi:10.1186/1471-2105-9-559
51. Cohen J. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates; 2013. doi:10.4324/9780203771587
52. Zappia L, Phipson B, Oshlack A. Splatter: simulation of single-cell RNA sequencing data. Genome Biol. 2017; 18(1): 1-15. doi:10.1186/s13059-017-1305-0
53. Islam S, Kjällquist U, Moliner A, et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res. 2011; 21(7): 1160-1167. doi:10.1101/gr.110882.110
54. Klein AM, Mazutis L, Akartuna I, et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell. 2015; 161(5): 1187-1201. doi:10.1016/j.cell.2015.04.044
55. Trapnell C, Cacchiarelli D, Grimsby J, et al. Pseudo-temporal ordering of individual cells reveals dynamics and regulators of cell fate decisions. Nat Biotechnol. 2014; 32(4): 381. doi:10.1038/nbt.2859
56. Wu Z, Zhang Y, Stitzel ML, Wu H. Two-phase differential expression analysis for single cell RNA-seq. Bioinformatics. 2018; 34(19): 3340-3348. doi:10.1093/bioinformatics/bty329
57. Darmanis S, Sloan SA, Zhang Y, et al. A survey of human brain transcriptome diversity at the single cell level. Proc Natl Acad Sci. 2015; 112(23): 7285-7290. doi:10.1073/pnas.1507125112
58. Petropoulos S, Edsgärd D, Reinius B, et al. Single-cell RNA-seq reveals lineage and X chromosome dynamics in human preimplantation embryos. Cell. 2016; 165(4): 1012-1026. doi:10.1016/j.cell.2016.03.023
59. Snyder BN, Cho Y, Qian Y, Coad JE, Flynn DC, Cunnick JM. AFAP1L1 is a novel adaptor protein of the AFAP family that interacts with cortactin and localizes to invadosomes. Eur J Cell Biol. 2011; 90(5): 376-389. doi:10.1016/j.ejcb.2010.11.016
60. Furu M, Kajita Y, Nagayama S, et al. Identification of AFAP1L1 as a prognostic marker for spindle cell sarcomas. Oncogene. 2011; 30(38): 4015-4025. doi:10.1038/onc.2011.108
61. Beiter RM, Fernández-Castañeda A, Rivet-Noor C, et al. Evidence for oligodendrocyte progenitor cell heterogeneity in the adult mouse brain. bioRxiv; 2020.
62. He X, Cheng R, Benyajati S, Ma JX. PEDF and its roles in physiological and pathological conditions: implication in diabetic and hypoxia-induced angiogenic diseases. Clin Sci. 2015; 128(11): 805-823. doi:10.1042/CS20130463
63. Ek ET, Dass CR, Choong PF. PEDF: a potential molecular therapeutic target with multiple anti-cancer activities. Trends Mol Med. 2006; 12(10): 497-502. doi:10.1016/j.molmed.2006.08.009
64. Aran D, Hu Z, Butte AJ. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017; 18(1): 1-14. doi:10.1186/s13059-017-1349-1
65. Park J, Shrestha R, Qiu C, et al. Single-cell transcriptomics of the mouse kidney reveals potential cellular targets of kidney disease. Science. 2018; 360(6390): 758-763. doi:10.1126/science.aar2131
66. Wang X, Park J, Susztak K, Zhang NR, Li M. Bulk tissue cell type deconvolution with multi-subject single-cell expression reference. Nat Commun. 2019; 10(1): 1-9. doi:10.1038/s41467-019-13800-3
67. Svensson V, Natarajan KN, Ly LH, et al. Power analysis of single-cell RNA-sequencing experiments. Nat Methods. 2017; 14(4): 381-387. doi:10.1038/nmeth.4220
68. Qin F, Luo X, Xiao F, Cai G. SCRIP: an accurate simulator for single-cell RNA sequencing data. Bioinformatics. 2022; 38(5): 1304-1311. doi:10.1093/bioinformatics/btab824
69. Crowell HL, Leonardo SXM, Soneson C, Robinson MD. Built on sand: the shaky foundations of simulating single-cell RNA sequencing data. bioRxiv; 2021.
70. Mallick H, Ma S, Franzosa EA, Vatanen T, Morgan XC, Huttenhower C. Experimental design and quantitative analysis of microbial community multiomics. Genome Biol. 2017; 18(1): 228. doi:10.1186/s13059-017-1359-z
71. Mallick H, Rahnavard A, McIver LJ, et al. Multivariable association discovery in population-scale meta-omics studies. PLoS Comput Biol. 2021; 17(11): e1009442. doi:10.1371/journal.pcbi.1009442
72. Zhang Y, Thompson KN, Huttenhower C, Franzosa EA. Statistical approaches for differential expression analysis in metatranscriptomics. Bioinformatics. 2021; 37(Suppl_1): i34-i41. doi:10.1093/bioinformatics/btab327
73. Sun S, Zhu J, Zhou X. Statistical analysis of spatial expression patterns for spatially resolved transcriptomic studies. Nat Methods. 2020; 17(2): 193-200. doi:10.1038/s41592-019-0701-7
74. Li Q, Zhang M, Xie Y, Xiao G. Bayesian modeling of spatial molecular profiling data via Gaussian process. Bioinformatics. 2021; 37(22): 4129-4136. doi:10.1093/bioinformatics/btab455
75. Clivio O, Lopez R, Regier J, Gayoso A, Jordan MI, Yosef N. Detecting zero-inflated genes in single-cell transcriptomics data. bioRxiv; 2019: 794875.
76. Merkle EC, You D, Preacher KJ. Testing nonnested structural equation models. Psychol Methods. 2016; 21(2): 151. doi:10.1037/met0000038
77. Stephens M. False discovery rates: a new deal. Biostatistics. 2017; 18(2): 275-294.
78. Zhu A, Ibrahim JG, Love MI. Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics. 2019; 35(12): 2084-2092. doi:10.1093/bioinformatics/bty895
79. Zhang M, Liu S, Miao Z, Han F, Gottardo R, Sun W. IDEAS: individual level differential expression analysis for single-cell RNA-seq data. Genome Biol. 2022; 23(1): 1-17. doi:10.1186/s13059-022-02605-1
80. Preisser JS, Das K, Long DL, Divaris K. Marginalized zero-inflated negative binomial regression with application to dental caries. Stat Med. 2016; 35(10): 1722-1735. doi:10.1002/sim.6804
81. Long DL, Preisser JS, Herring AH, Golin CE. A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat Med. 2014; 33(29): 5151-5165. doi:10.1002/sim.6293

Volume 41, Issue 18

15 August 2022

Pages 3492-3510
