Volume 63, Issue 6 pp. 1841-1845
Technical Note

GrigoraSNPs: Optimized Analysis of SNPs for DNA Forensics

Darrell O. Ricke Ph.D.

Corresponding Author

Bioengineering Systems & Technologies, Massachusetts Institute of Technology Lincoln Laboratory, 244 Wood Street, Lexington, MA, 02421-6426

Additional information and reprint requests:

Darrell O. Ricke, Ph.D.

Bioengineering Systems & Technologies

Massachusetts Institute of Technology Lincoln Laboratory

244 Wood Street

Lexington, MA 02421-6426

E-mail: [email protected]

Anna Shcherbina M.Eng.

Bioengineering Systems & Technologies, Massachusetts Institute of Technology Lincoln Laboratory, 244 Wood Street, Lexington, MA, 02421-6426

Adam Michaleas

Bioengineering Systems & Technologies, Massachusetts Institute of Technology Lincoln Laboratory, 244 Wood Street, Lexington, MA, 02421-6426

Philip Fremont-Smith M.S.

Bioengineering Systems & Technologies, Massachusetts Institute of Technology Lincoln Laboratory, 244 Wood Street, Lexington, MA, 02421-6426

First published: 16 April 2018
Citations: 5
This material is based upon work supported under Air Force Contract No. FA8721-05-C-0002 and/or FA8702-15-D-0001.
Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U.S. Air Force.

Abstract

High-throughput sequencing (HTS) of single nucleotide polymorphisms (SNPs) enables additional DNA forensic capabilities not attainable using traditional STR panels. However, including sets of loci selected for mixture analysis, extended kinship, phenotype, biogeographic ancestry prediction, etc., can result in large panels that are difficult to analyze rapidly. GrigoraSNPs was developed to address the allele-calling bottleneck encountered when analyzing SNP panels with more than 5000 loci using HTS. GrigoraSNPs uses MapReduce parallel data processing on multiple computational threads plus a novel locus-identification hashing strategy that leverages target sequence tags. This tool optimizes the SNP calling module of the DNA analysis pipeline, with runtimes that scale linearly with the number of HTS reads. Results are compared with SNP analysis pipelines implemented with SAMtools and GATK. GrigoraSNPs removes a computational bottleneck for processing forensic samples with large HTS SNP panels.
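
The abstract does not give implementation details, so the following is only a minimal sketch of how a MapReduce pass combined with a tag-to-locus hash table might look. Everything named here is an assumption made for illustration: the tag length, the example tag sequences, the locus names, the rule that the SNP base immediately follows the tag, and the helper functions `map_reads`, `reduce_counts`, and `call_alleles`. It is not the GrigoraSNPs code.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical panel: each tag is the sequence immediately upstream of a SNP.
# Tag length, sequences, and locus names are illustrative assumptions.
TAG_LEN = 8
TAGS = {
    "ACGTACGT": "locus_A",
    "TTGACCAG": "locus_B",
}


def map_reads(reads):
    """Map step: count (locus, allele) observations in one chunk of reads."""
    counts = Counter()
    for read in reads:
        # Slide a window along the read; each window costs one hash lookup,
        # so the work per read does not grow with the number of loci.
        for i in range(len(read) - TAG_LEN):
            locus = TAGS.get(read[i:i + TAG_LEN])
            if locus is not None:
                allele = read[i + TAG_LEN]  # assumed: SNP base follows the tag
                counts[(locus, allele)] += 1
                break  # assumed: at most one target locus per read
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-chunk counters into one tally."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def call_alleles(reads, n_workers=4, chunk_size=2):
    """Split reads into chunks, map them on worker threads, then reduce."""
    chunks = [reads[i:i + chunk_size] for i in range(0, len(reads), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_reads, chunks))
    return reduce_counts(partials)


if __name__ == "__main__":
    example_reads = [
        "GGACGTACGTAGG",
        "ACGTACGTACC",
        "CCTTGACCAGGTT",
        "NNNNNNNNNNNN",
        "ACGTACGTC",
    ]
    for key, n in sorted(call_alleles(example_reads).items()):
        print(key, n)
```

Because every read is resolved by constant-time dictionary lookups, total work in this sketch grows in proportion to the number of reads, which is in keeping with the linear scaling the abstract reports. In CPython, threads do not run pure-Python code in parallel, so a process pool would be the closer analogue for CPU-bound work; threads are used here only to mirror the wording of the abstract.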
