Estimating population size using single-nucleotide polymorphism-based pedigree data
Abstract
Reliable population estimates are an important aspect of sustainable wildlife management and conservation but can be difficult to obtain for rare and elusive species. Here, we test a new census method based on pedigree reconstruction recently developed by Creel and Rosenblatt (2013). Using a panel of 96 single-nucleotide polymorphisms (SNPs), we genotyped fecal samples from two Swedish brown bear populations for pedigree reconstruction. Based on 433 genotypes from central Sweden (CS) and 265 from northern Sweden (NS), the population estimates (N = 630 for CS, N = 408 for NS) fell within the 95% CI of the official estimates. The precision and accuracy improved with increasing sampling intensity. Like genetic capture–mark–recapture methods, this method can be applied to data from a single sampling session. Pedigree reconstruction combined with noninvasive genetic sampling may thus augment population estimates, particularly for rare and elusive species for which sampling may be challenging.
Introduction
Estimates of population size and its fluctuations are often fundamental for understanding ecological, behavioral, or genetic processes (Ojaveer et al. 2004; Dochtermann and Peacock 2013; Valderrama et al. 2013) and practically indispensable for management and conservation (Katzner et al. 2011). This includes estimates of both effective and true population size, where the former is usually based on genetic data and models, while the latter typically uses some form of census data, sometimes genetic. For example, such population size and trend estimates help identify particular factors that drive population dynamics and are hence critical for modeling the future of a population under different management scenarios (Lewellen and Vessey 1998). Moreover, estimates of true population size and trend are the basis for adaptive harvest quotas (Wilson and Delahay 2001) as well as for identifying populations under threat of becoming endangered or extinct (Vié et al. 2009). However, reliable estimates are difficult to obtain. This is especially true for rare and elusive species, which are frequently of high conservation concern (Rolland et al. 2011). Large carnivores are no exception to this (Kindberg et al. 2009; Creel and Rosenblatt 2013) as they are generally solitary and cryptic, in addition to occurring at low densities and across large home ranges. For several ecological, economic, and societal reasons, large carnivores in particular receive a disproportionate amount of attention from research, conservation, and management. For example, carnivores may strongly affect the ecosystems they occupy, by changing the behavior of other carnivores and by direct and indirect effects on prey, which can lead to downstream effects on primary production (Creel et al. 2007). In some areas, carnivores pose a threat to humans or come into direct conflict with human husbandry practices, causing economic losses.
For these and other reasons, a range of remote or noninvasive methods has been employed to study carnivores (Jackson et al. 2006; Kojola et al. 2014). An increasingly popular and cost-efficient approach is to use noninvasive genetic sampling to assess the number of individuals in a population (e.g., Mowry et al. 2011; Sugimoto et al. 2012; Stansbury et al. 2014), often with fecal samples collected during other management activities or by citizen volunteers (e.g., Kindberg et al. 2011).
Genetic data may also be used to assess a population's effective population size, an important parameter especially for small populations at risk of inbreeding or genetic drift. The framework of population genetics provides several ways of inferring the effective population size, but the estimators suffer from being slow to respond to recent events, instead showing historic averages (Palsboll et al. 2013). To obtain more contemporary estimates, genetic data can also be used to derive demographic data for calculating the current effective population size (e.g., Creel 2002). For many studies in ecology, however, the actual population size is a more important parameter to know than the underlying effective population size. In conservation, too, much focus has been placed on the effective population size. Yet, as pointed out by Lande (1988), the drivers of extinction are primarily habitat loss and overharvest, not lack of genetic variation. So while it is informative to know the effective population size, once its relationship to the actual population size has been determined, its continuous monitoring may be less important than knowing the actual population size. This is equally true for critically endangered populations on the verge of extinction as for larger populations not acutely threatened. Here, the actual population size is typically what management operates on when setting targets for quotas, dispersal events, or the population size and distribution.
Most statistical methods for estimating population size rely on multiple sampling events. These capture–mark–recapture (CMR) techniques are comprehensively discussed by Krebs (1999) and Sutherland (2006). A distinct disadvantage of classical CMR methods is that physical capture, particularly of large predators, is often impractical, costly, and potentially harmful to both sides (Mowat et al. 1994; Logan et al. 1999; Muñoz-Igualada et al. 2008). In addition, differences in catchability resulting from trap-shy or trap-happy individuals could introduce systematic trapping bias. Such differences in personality traits (Sih and Bell 2008) have been documented for many species, including badgers (Tuyttens et al. 1999), stoats (King et al. 2003), and rabbits (Sunnucks 1998).
Newer methods, such as camera trapping, have largely made classical CMR approaches obsolete in studies of large animals. But many cameras are needed to reach reasonable detection probabilities (and cameras are sometimes removed or destroyed by humans or other animals). Even more problematic is that relatively few species are reliably individually identifiable from photographs. In contrast, an individual's genotype is a unique and permanent mark. Noninvasively collected DNA samples (e.g., from feces or hair) in combination with molecular techniques offer another noninvasive alternative (Kohn and Wayne 1997; Taberlet et al. 1999; Waits and Paetkau 2005; Swenson et al. 2011). In direct genetic census methods, the genotype simply becomes a “molecular tag” (Schwartz et al. 2007) which replaces traditional means of identification like earmarks or leg bands. Genotypes can thus be used as molecular tags in a CMR framework. But genetic data contain more information than just individual genotypes, such as information on pedigree structures in the population. From such information, unsampled individuals could potentially be inferred by their genetic fingerprint and included in the population estimates.
In 2013, Creel and Rosenblatt suggested a new, pedigree-based estimator for total population size. They evaluated the performance of their method through simulations parameterized with demographic data of African lions (Panthera leo) from Zambia. The method, henceforth referred to as the Creel–Rosenblatt estimator (CRE), incorporates the number of sampled individuals (Ns), the number of breeders (Bs), the number of individuals inferred from pedigree reconstruction (Nin), and the estimated number of individuals that neither bred nor were sampled (rendering them invisible to pedigree reconstruction) into the population estimate. As such, it purports to increase the precision of genetically based population estimates.
Like other genetically based CMR methods, the CRE requires only one sampling event (although multiple sampling events are also possible). This makes it a useful extension to the suite of tools available to estimate population sizes under circumstances where repeated sampling is difficult. In addition to being a novel approach for estimating population size, pedigree reconstruction can be used to investigate population structure (Calboli et al. 2008; Pemberton 2008), mating behavior (Pemberton et al. 1992), or dispersal (Norman and Spong 2015). The ideal genetic marker for pedigree reconstruction should provide high genomic resolution and be geared toward providing reliable relatedness estimates (Creel and Rosenblatt 2013). Single-nucleotide polymorphisms (SNPs) have proven to be a powerful tool for studying genetic variation in populations (Brumfield et al. 2003; Morin et al. 2004). Compared with microsatellites, another type of frequently used genetic marker, SNPs offer lower error rates from mistyping and allelic dropout (Morin and McCarthy 2007; Norman et al. 2013). They are also reproducible across laboratories and are cheaper, allowing for higher genomic resolution within a given budget (Anderson and Garza 2006). Because only short intact sequences of DNA, typically 50–70 bp, are required for successful amplification, SNPs are especially suitable when working with degraded DNA, as is usually the case with noninvasively obtained samples (Morin et al. 2004).
Here, we use a panel of 96 SNPs recently developed for studying relatedness in the Scandinavian brown bear (Ursus arctos, Fig. 1) population (Norman et al. 2013). We reconstructed pedigrees based on hunter-collected feces and applied the CRE method to estimate the size of the brown bear populations in the Swedish counties of Dalarna, Gävleborg, and Västerbotten. Existing population estimates for the brown bear in these areas (Kindberg and Swenson 2013, 2015) provide a benchmark for empirically assessing the performance of the CRE outside of a simulation environment, which makes this study system appropriate. For further comparison, we also performed rarefaction analyses to estimate population size. To our knowledge, this is the first time that the estimator has been applied to empirical data: we were unable to find any reference describing the application of this method in the Web of Science™ publication database, with the last search completed on 17 November 2015.

Materials and Methods
Study area and sample collection
The two study areas in central and northern Sweden encompassed the Swedish counties of Dalarna and Gävleborg (ca. 46,300 km²) and Västerbotten (ca. 55,200 km²), respectively. To the west, these areas are delimited by the Scandinavian mountain range and to the east by the Baltic Sea (Fig. 2). The southern border of Dalarna–Gävleborg also demarcates the approximate southern limit of the brown bear distribution in Sweden. Dalarna–Gävleborg is home to an estimated 793 bears, 95% CI [621, 1179] (Kindberg and Swenson 2013). In 2014, the Västerbotten population was estimated to be 362 bears, 95% CI [310, 459] (Kindberg and Swenson 2015). Studies of maternally inherited mitochondrial DNA (mtDNA) have shown that brown bears in Sweden belong to two genetically distinct lineages with approximately 7% differentiation between them (Taberlet and Bouvet 1994). The western lineage, found in south-central Sweden, originated from the Iberian refugium (today's France and Spain) during the last ice age, whereas the eastern lineage, found throughout northern Sweden, can be traced to Karelia in Russia (Taberlet and Bouvet 1994). At present, the two lineages remain largely separated around a well-documented contact zone at the latitude of Östersund in central Sweden (Taberlet et al. 1995). Population monitoring could be especially important in the case of the western mtDNA haplotype (southern population), which is only found in Europe, whereas the eastern haplotype is also prevalent in Asia and North America (Waits et al. 2000; Saarma et al. 2007; Korsten et al. 2009; Hirata et al. 2013).

Fecal samples were collected by volunteers, predominantly moose (Alces alces) hunters, following the protocol of Bellemain et al. (2005) and Kindberg et al. (2011) during the periods of August–October 2012 in Dalarna–Gävleborg and August–December 2014 in Västerbotten.
Volunteers recorded the collection date and coordinates of the sample location and mailed this information together with their samples to the county administrations (for the Dalarna–Gävleborg collection) or, for Västerbotten, directly to the Molecular Ecology Group at the Swedish University of Agricultural Sciences (SLU) in Umeå. Upon arrival, samples were stored in 70% ethanol solution at −20°C as recommended by Frantzen et al. (1998).
Molecular analysis
DNA extraction from the Dalarna–Gävleborg samples was carried out by Bioforsk, Norway (Hagen and Aarnes 2013), following procedures described by Schregel et al. (2012). In Västerbotten, DNA extraction was performed at SLU using a QIAsymphony SP (Qiagen, Hilden, Germany) robot according to the manufacturer's instructions.
SNPs were genotyped on a Fluidigm Biomark™ (Fluidigm Corporation, San Francisco, USA) using the 96 SNP panel developed by Norman et al. (2013). Since its first publication, the panel has undergone slight modifications (e.g., two linked SNPs were substituted with Y-chromosome SNPs) and now consists of 85 autosomal SNPs, four mtDNA SNPs as well as four Y-chromosome and three X-chromosome markers for sex determination (Norman and Spong 2015). Each run included negative controls with water in place of DNA. The genotype clusters assigned by the Biomark software were manually screened, and loci of questionable cluster affiliation were invalidated and removed from subsequent analyses. Species and sex were assigned according to the following criteria:
- bear = mtDNA SNP calls ≥3
- male = Y-chromosome SNP calls ≥3
- female = Y-chromosome SNP calls = 0 and X-chromosome SNP calls ≥2
The above criteria were designed to avoid possible misidentification of poorly amplified male samples as females. In males, Y and X markers occur in equal proportion. The requirement that at least two of three X markers had to amplify for samples to be called female makes it extremely unlikely that such a sample was a male that had not amplified for the Y markers. As we only included samples that had amplified for more than 70 loci, the risk of having none of four Y markers but two of the X amplify for a male is 1 × 10⁻⁴.
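The assignment rules listed above can be sketched as a small function. This is a minimal illustration, not the authors' pipeline: the function name and argument names are hypothetical, and leaving samples with an intermediate number of Y-chromosome calls unassigned is an assumption, since the text does not say how such samples were handled.

```python
from typing import Optional, Tuple


def assign_species_and_sex(
    mtdna_calls: int, y_calls: int, x_calls: int
) -> Tuple[bool, Optional[str]]:
    """Apply the call-count thresholds to one sample.

    Arguments are the numbers of successfully called SNPs among the
    4 mtDNA, 4 Y-chromosome and 3 X-chromosome markers of the panel.
    Returns (is_bear, sex), where sex is "male", "female" or None.
    """
    is_bear = mtdna_calls >= 3
    if y_calls >= 3:
        sex = "male"
    elif y_calls == 0 and x_calls >= 2:
        sex = "female"
    else:
        # Assumption: ambiguous marker patterns are left unassigned.
        sex = None
    return is_bear, sex


print(assign_species_and_sex(4, 0, 3))  # (True, 'female')
print(assign_species_and_sex(4, 1, 3))  # (True, None)
```

Requiring zero Y calls together with at least two X calls is what keeps a poorly amplified male from being scored as a female.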
Pedigree reconstruction
To reconstruct pedigrees, we used FRANz software version 2.0.0 (Riester et al. 2009) which uses Markov Chain Monte Carlo (MCMC) simulation for estimating the statistical confidence of parentage inference. The software requires specifying an approximate maximum number of females and males (Nfmax and Nmmax) to avoid an empty pedigree due to convergence of the Markov Chain to a very high number of individuals (Riester et al. 2009). We used the estimates from rarefaction analysis and the sex ratio present in the genotyped samples to set Nfmax/Nmmax to 538/419 (Dalarna–Gävleborg) and 249/239 (Västerbotten), respectively. Typing error rates were empirically determined to be 1.538 × 10⁻⁴ for Dalarna–Gävleborg and 0.01 for Västerbotten. The error rates from the two areas differ because the samples from Dalarna–Gävleborg were the best available quality extract from each individual successfully genotyped with microsatellites at Bioforsk, whereas the error rate for the Västerbotten samples includes all samples that passed the amplification threshold for SNP genotyping. As microsatellite genotyping requires much higher quality DNA, the error rate of such samples is much lower. The maximum likelihood pedigrees produced by FRANz identify the putative sire and dam of sampled individuals. We further verified the FRANz reconstructed pedigrees by calculating the Lynch–Ritland relatedness coefficient (r) (Lynch and Ritland 1999) for all identified parent–offspring (PO) and full-sibling (FS) pairs using COANCESTRY version 1.0.1.2 (Wang 2011). We chose the Lynch–Ritland relatedness coefficient because it has been found to have the lowest rate of misclassification and lower overall variance compared to other pairwise relatedness estimators (Stone and Björklund 2001; Csillery et al. 2006).
Population estimates
Rarefaction, also referred to as the accumulation-curve method, has traditionally been used to estimate species diversity in an area by plotting the cumulative number of newly recorded species against the total number sampled (Colwell and Coddington 1994). The same underlying logic can be applied for estimating population size by substituting the species count with the number of unique individuals/genotypes. As suggested by Kohn et al. (1999), a curve defined by the equation y = ax/(b+x) was fitted to our data. In this model, y equals the number of unique genotypes, x corresponds to the number of samples (genotyped feces), b is the rate of decline in the slope, and the asymptote a represents the estimated population size (Bellemain et al. 2005). We calculated the parameters a and b through nonlinear iterative regression using the statistical software package JMP Pro version 11.0.0 (SAS Institute). To account for the variance caused by the order in which samples are drawn, we repeated this process 100 times with random permutations of the sample order and used the mean of the resulting asymptotes as the rarefaction population estimate.
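The rarefaction procedure can be illustrated with a short standard-library sketch. It is not the JMP routine used in the study: the function names are hypothetical, and the fitting strategy (a closed-form least-squares value of a for each candidate b, combined with a golden-section search over log b) is a substitute chosen for brevity. The search assumes that the squared error has a single minimum in b.

```python
import math
import random


def accumulation_curve(sample_ids):
    """Cumulative number of unique genotypes after each successive sample."""
    seen, ys = set(), []
    for gid in sample_ids:
        seen.add(gid)
        ys.append(len(seen))
    return ys


def _best_a(xs, ys, b):
    # For fixed b, y = a * f(x) with f = x / (b + x) is linear in a.
    f = [x / (b + x) for x in xs]
    return sum(y * fi for y, fi in zip(ys, f)) / sum(fi * fi for fi in f)


def _sse(xs, ys, b):
    a = _best_a(xs, ys, b)
    return sum((y - a * x / (b + x)) ** 2 for x, y in zip(xs, ys))


def fit_rarefaction(xs, ys, b_lo=1e-3, b_hi=1e7, iters=100):
    """Least-squares fit of y = a*x/(b+x); returns (a, b)."""
    g = (math.sqrt(5) - 1) / 2  # golden-ratio step, about 0.618
    lo, hi = math.log(b_lo), math.log(b_hi)
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = _sse(xs, ys, math.exp(c)), _sse(xs, ys, math.exp(d))
    for _ in range(iters):
        if fc < fd:  # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = _sse(xs, ys, math.exp(c))
        else:  # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = _sse(xs, ys, math.exp(d))
    b = math.exp((lo + hi) / 2)
    return _best_a(xs, ys, b), b


def rarefaction_estimate(sample_ids, n_perm=100, seed=0):
    """Mean asymptote a over n_perm random orderings of the samples."""
    rng = random.Random(seed)
    ids = list(sample_ids)
    xs = list(range(1, len(ids) + 1))
    asymptotes = []
    for _ in range(n_perm):
        rng.shuffle(ids)
        a, _b = fit_rarefaction(xs, accumulation_curve(ids))
        asymptotes.append(a)
    return sum(asymptotes) / len(asymptotes)
```

Here `sample_ids` would hold one genotype identifier per genotyped fecal sample, so repeated identifiers represent re-detections of the same individual.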
For the pedigree-based population estimates, we followed the recommendations by Creel and Rosenblatt (2013) and specified the number of individuals sampled (Ns) as the number of individual genotypes, known breeders (Bs) as those individuals who had progeny in the pedigree and inferred individuals (Nin) as the missing parent in known parent–offspring dyads. However, assuming that each missing parent in the dyads constitutes a new individual would most likely cause an overestimation because brown bear males are known to mate with several females and vice versa (Steyaert et al. 2012). For example, an inferred sire may be the missing father in more than one of the mother–offspring dyads. Therefore, we used the improbable scenario in which the number of inferred individuals (Nin) equals the number of dyads in the pedigree only for approximating an upper bound of the population estimate. For a more realistic estimate that accounts for multiple parentages, we first screened all parent–offspring dyads in the pedigree for individuals with several offspring. If pairwise comparisons of the Lynch–Ritland relatedness suggested full-siblings (r ~ 0.5) among those offspring, we inferred only one new individual (the missing parent) from these dyads. For the remaining cases, we used a different approach where we assumed that the likelihood of sampling each sex was equal: We determined the ratio of all the known individual dams to the known individual sires in the pedigree and then used this ratio to infer the missing counterparts from the individual dams and sires in the pedigree dyads. In this way, the ratio of dams to sires with the inferred individuals included remains the same as it was in the original pedigree.
Another problem pointed out by Creel and Rosenblatt (2013) is that there is no way to ascertain how many of the inferred individuals are actually still alive at the time of the estimate. To account for mortality among inferred individuals, we assumed them to be at the typical breeding age of ~5 years (Swenson et al. 2001) and applied the age-specific annual mortality rates reported in Nilsson (2013): 7.2% for inferred dams and 11.6% for inferred sires.
Finally, we assessed the accuracy of the CRE results by comparing them to the official population estimates (Kindberg and Swenson 2013, 2015) and to the results of rarefaction analysis. Because there is currently no method to assign confidence limits to the CRE population estimates, we estimated an upper and a lower bound. For the lower bound, we simply used the count of sampled genotypes. For the upper bound, we treated Nin as equal to the number of dyads in the pedigree and assumed zero mortality among the inferred individuals.
To test the performance of the CRE at different sampling intensities, we used the data from Västerbotten due to the higher sampling coverage (approximately 73% of the population included in the sample) compared to only 55% in Dalarna–Gävleborg. Varying the sampling intensity from 10% to 60% of the official population estimate, we applied the CRE to ten replicates of samples randomly drawn in correspondence with each sampling intensity level.
Not having 100% of the population included in the sample is a common limitation in studies based on field data, but validation of simulation results using empirical data may still reveal strengths and weaknesses.
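The subsampling design can be sketched as follows. The identifiers are placeholders, the function name is hypothetical, and the 10-percentage-point spacing of the intensity levels is an assumption, because the text gives only the range (10% to 60%) and the number of replicates (ten).

```python
import random


def draw_replicates(genotype_ids, official_n, intensities, n_rep=10, seed=0):
    """Draw n_rep random subsamples per sampling-intensity level.

    Each subsample contains round(intensity * official_n) individuals,
    drawn without replacement from the genotyped individuals.
    """
    rng = random.Random(seed)
    ids = list(genotype_ids)
    design = {}
    for frac in intensities:
        k = round(frac * official_n)
        design[frac] = [rng.sample(ids, k) for _ in range(n_rep)]
    return design


# Placeholder IDs standing in for the 265 Västerbotten genotypes.
vb_ids = [f"VB{i:03d}" for i in range(265)]
levels = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]  # assumed spacing
design = draw_replicates(vb_ids, official_n=362, intensities=levels)

# Coverage relative to the official estimate: 265 / 362, about 0.73.
coverage = len(vb_ids) / 362
```

Because 60% of the official estimate (217 individuals) is still below the 265 genotypes available, every level can be drawn without replacement.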
We assumed that the pedigree would become less complete (contain fewer parent–offspring pairs) the further away the sampling occurred from the core sampling frame. This is because breeding individuals in peripheral areas might have moved beyond the borders of the sampling area and therefore may have been missed during the sample collection. In both areas, Dalarna–Gävleborg and Västerbotten, the only true population border is the Baltic Sea to the east. To the north and the south, bears occur beyond the borders of the sampling areas. Most interesting is the border to the west, formed by the Scandinavian mountain range. Because mountain terrain can be difficult to access and because moose hunting is of lower intensity at higher altitudes, the sampling effort by volunteers was lower there than in other areas. If the pedigrees were to show similar levels of incompleteness along the western border compared with the “open” borders to the north and the south, it could indicate that bears in the mountains were missed in the sampling. If this were the case, it would lead to an underestimation of the population size. If, on the other hand, the mountains form a true border like the Baltic Sea, then the pedigree should be equally complete in both these locations.
From the coordinate data that were provided along with the fecal samples, the median centers of all known locations for an individual were calculated using R (R Development Core Team, 2008). We considered the median to be less biased than the mean because of its lower sensitivity to outliers. Inferring home ranges from the locations of fecal samples is prone to errors but Bellemain et al. (2005) reported that the majority of fecal sites fall inside the home range or within 10 km of it. We determined the center point of the sampling area as the median center of all individual locations using the GIS package ArcMap version 10.2.2 (ESRI, 2014). We then sampled the individuals closest to the center point and the four borders (north, south, east, and west), respectively, at sample sizes of n = 100 for Dalarna–Gävleborg and n = 70 in Västerbotten. The number of samples for Västerbotten had to be lower to avoid overlap because fewer individuals in total were available to sample from. In a second step, we also sampled males and females separately (Dalarna–Gävleborg, n = 50; Västerbotten, n = 30) to investigate whether there are detectable differences between mother–daughter and father–son dyads.
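A minimal sketch of the location summary and the selection of core individuals follows. It assumes planar (projected) coordinates and takes the median center as the coordinate-wise median; GIS packages may instead compute the point that minimizes summed distances, so this is an approximation. Function names and the example coordinates are hypothetical. Selecting individuals nearest to a border would additionally need a distance-to-border function, which is not shown.

```python
import math
from statistics import median


def median_center(points):
    """Coordinate-wise median of a list of (x, y) points."""
    xs, ys = zip(*points)
    return (median(xs), median(ys))


def nearest_individuals(centers, target, n):
    """IDs of the n individuals whose median centers lie closest to target."""
    return sorted(centers, key=lambda ind: math.dist(centers[ind], target))[:n]


# Hypothetical fecal-sample locations per individual (arbitrary units).
locations = {
    "bear_A": [(0, 0), (2, 1), (10, 1)],
    "bear_B": [(5, 5)],
    "bear_C": [(1, 2), (3, 2)],
}
centers = {ind: median_center(pts) for ind, pts in locations.items()}
area_center = median_center(list(centers.values()))
core = nearest_individuals(centers, area_center, n=2)
print(core)  # ['bear_C', 'bear_A']
```

The outlying location (10, 1) of `bear_A` barely moves its median center, which is the robustness argument made above for preferring the median to the mean.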
To test for differences in the completeness of the pedigree, we used Pearson's chi-square test for homogeneity of proportions with the proportions corresponding to the number of parent–offspring pairs in the pedigree per number of sampled individuals. To further test whether there is a spatial effect on parent–offspring pairs in sex-separated pedigrees, we also sampled males and females randomly across the whole sampling area (Dalarna–Gävleborg, n = 50; Västerbotten, n = 30).
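The homogeneity test can be sketched with the standard library alone. The counts below are invented for illustration, and treating the number of parent–offspring pairs in each subsample as a count out of the n sampled individuals is an assumption about how the proportions were formed. The P-value helper uses a closed form that is valid only for an even number of degrees of freedom.

```python
import math


def chi_square_homogeneity(successes, totals):
    """Pearson chi-square for equal proportions across k groups.

    Returns (statistic, degrees of freedom) for the 2 x k table.
    """
    p = sum(successes) / sum(totals)  # pooled proportion
    stat = 0.0
    for s, n in zip(successes, totals):
        for observed, expected in ((s, n * p), (n - s, n * (1 - p))):
            stat += (observed - expected) ** 2 / expected
    return stat, len(totals) - 1


def chi2_sf_even_df(stat, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df % 2:
        raise ValueError("this closed form requires an even df")
    half = stat / 2
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


# Hypothetical PO-pair counts for core, north, south, east and west subsamples.
po_pairs = [40, 35, 30, 38, 33]
sampled = [100] * 5
stat, df = chi_square_homogeneity(po_pairs, sampled)
p_value = chi2_sf_even_df(stat, df)
```

With five areas the test has four degrees of freedom, so the closed form applies; the sex-separated comparison with nine degrees of freedom would need a general chi-square distribution function from a statistics library.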
Results
We successfully genotyped 433 individuals (243 females, 190 males) for Dalarna–Gävleborg and 265 individuals (136 females, 129 males) for Västerbotten. Rarefaction analysis was based on 873 samples from Dalarna–Gävleborg and 677 from Västerbotten. The maximum number of times an individual occurred in the sample was 19 for Dalarna–Gävleborg and 16 for Västerbotten. The asymptotes of the curves fitted to the rarefaction data suggested population sizes of N = 895 (Dalarna–Gävleborg) and N = 484 (Västerbotten).
Table 1 summarizes the results of the FRANz reconstructed pedigrees. The proportions of breeders in both samples (0.37 Dalarna–Gävleborg, 0.42 Västerbotten) are not significantly different (z = −1.42, P = 0.16), and the ratios of dams to sires are also very similar. As shown in Figure 3, the mean pairwise Lynch–Ritland relatedness coefficient (r) for PO dyads and FS pairs did not significantly differ from the expected value of r = 0.5 for first-order relatives (PO: t(292) = 0.33, P = 0.74; FS: t(39) = 1.60, P = 0.12) which corroborates the reconstructed pedigrees.
Parameter | Dalarna–Gävleborg | Västerbotten |
---|---|---|
Ns | 433 | 265 |
Number of triads | 65 | 37 |
Number of dyads | 170 | 123 |
Number with “no parent” | 198 | 105 |
Bs | 159 | 112 |
Ratio of dams to sires | 1.30 | 1.20 |

In Dalarna–Gävleborg, we inferred six sires and four dams directly from full-sibling relationships among the known parent–offspring dyads. An additional 52 sires and 65 dams were inferred using the dam to sire ratio approach. After correcting for mortality, the total number of inferred individuals (Nin) (i.e., those missing from the pedigree) was 115. For Västerbotten, screening the parent–offspring dyads for full-siblings yielded four sires and three dams, whereas the ratio method suggested a further 41 sires and 45 dams resulting in Nin = 85 after mortality correction. Therefore, applying CRE to these numbers (Ns and Bs from Table 1 and Nin) resulted in estimated population sizes of N = 630 for Dalarna–Gävleborg and N = 408 for Västerbotten. Comparison with official bear population estimates shows that both CRE results fall within the 95% CI of official estimates (Fig. 4). Using the genotype count as a measure for minimum population size and number of dyads for the estimate of Nin under the assumption of no mortality, the lower and upper bounds for the CRE correspond to 433 and 728 in Dalarna–Gävleborg and to 265 and 476 in Västerbotten.
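The mortality correction can be retraced from the counts given above and the annual mortality rates stated in the Methods (11.6% for sires, 7.2% for dams). The sketch assumes that each rate was applied once and that survivors were rounded to whole animals separately for each sex; these are assumptions, although they do reproduce the reported Nin values. The function name is hypothetical.

```python
# Annual mortality rates from the Methods (Nilsson 2013).
ANNUAL_MORTALITY = {"sire": 0.116, "dam": 0.072}


def surviving_inferred(n_sires, n_dams):
    """Inferred individuals expected to be alive after one year of mortality.

    Assumption: rates are applied once and each sex is rounded separately.
    """
    sires = round(n_sires * (1 - ANNUAL_MORTALITY["sire"]))
    dams = round(n_dams * (1 - ANNUAL_MORTALITY["dam"]))
    return sires + dams


print(surviving_inferred(6 + 52, 4 + 65))  # Dalarna–Gävleborg: 115
print(surviving_inferred(4 + 41, 3 + 45))  # Västerbotten: 85
```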

Testing for effects of sampling intensity with our empirical data, we found a pattern similar to the one Creel and Rosenblatt (2013) found in their simulations (Fig. 5). At a sampling intensity of 10%, the coefficient of variation (CV) among the replicate CRE estimates was 7% and the percentage difference from the official population size estimate was 157%; at a sampling intensity of 60%, both CV and percentage difference decreased to 3%. This suggests that CRE population estimates increase in both precision and accuracy with increasing sampling intensity.
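The two summary statistics can be computed as in the sketch below. The text does not define the percentage difference; the symmetric form used here (absolute difference divided by the mean of the two values) is an assumption. The replicate estimates are invented for illustration and are not the study's results.

```python
from statistics import mean, stdev


def coefficient_of_variation(estimates):
    """Sample standard deviation as a percentage of the mean."""
    return 100 * stdev(estimates) / mean(estimates)


def percentage_difference(estimate, reference):
    """Symmetric percentage difference (assumed definition)."""
    return 100 * abs(estimate - reference) / ((estimate + reference) / 2)


# Hypothetical replicate CRE estimates at one sampling intensity.
replicates = [340, 352, 361, 347, 355, 349, 358, 344, 351, 353]
cv = coefficient_of_variation(replicates)
diff = percentage_difference(mean(replicates), reference=362)
```

Under this definition the value can exceed 100% when an estimate falls far below the reference, which a difference expressed relative to the reference alone cannot do for an underestimate.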


Our tests for edge effects of sampling boundaries revealed no significant differences in completeness of the pedigree between the core area and four peripheral border areas in Dalarna–Gävleborg, χ²(4) = 7.05, P = 0.134, or Västerbotten, χ²(4) = 1.97, P = 0.74.
When males and females were sampled separately, significantly more mother–daughter than father–son dyads were found in Dalarna–Gävleborg, χ²(9) = 62.79, P < 0.0001 (Fig. 6). When the same number of males and females (n = 50) were sampled randomly across the whole area, there was no significant difference between the proportions of mother–daughter and father–son dyads per sampled individuals, z = −0.521, P = 0.602. In Västerbotten, the difference in proportions between male and female dyads was not significant, χ²(9) = 15.28, P = 0.083. The two-sample z-test for proportions when females and males (n = 30) were sampled randomly across the whole area was also not significant, z = −0.645, P = 0.52.

Discussion
In this study, we applied the recently developed Creel–Rosenblatt estimator (CRE), a pedigree reconstruction-based method, to estimate the size of two fractions of the Swedish brown bear population. SNP genotypes obtained from noninvasively collected fecal samples were used to reconstruct pedigrees from which we were able to infer the presence of additional individuals which otherwise would have remained undetected. Compared to a simple count of detected genotypes, the CRE increased the population estimates by 45% for Dalarna–Gävleborg and 54% for Västerbotten. The availability of reliable population estimates prior to this study provided an excellent opportunity for testing this new census method because it allowed for the verification of the results. Our pedigree reconstruction-based population estimates fell within the confidence limits of the most recent official estimates. This is an indication that the method provides a potential alternative to traditional CMR approaches with the added benefit that it can be employed using data from a single sampling event. While the percentage relative precision (PRP) of the official estimates, a measure which relates a population estimate to its 95% confidence limits (Sutherland 2006), is actually quite good for the study of natural populations (21% in Västerbotten and 35% in Dalarna–Gävleborg), their confidence limits are nevertheless wide. It is therefore difficult to ascertain how close our estimates really are to the true population size. However, the CRE results appear to be further corroborated by the rarefaction results. Simulations by Valière (2002) have shown that the model we used to extrapolate the rarefaction curves has a tendency to overestimate population size if the sampling effort is high. As shown in Figure 4, the rarefaction results indeed exceed both the CMR and CRE estimates and more so in Västerbotten where the sampling intensity was higher. This suggests that the CRE results are actually close to the true figure.
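The PRP values quoted above can be retraced from the official estimates and their 95% confidence limits given in the study area description. The sketch assumes PRP is half the width of the confidence interval expressed as a percentage of the estimate; this definition is an assumption, although it reproduces both quoted values.

```python
def percentage_relative_precision(estimate, ci_low, ci_high):
    """Half the 95% CI width as a percentage of the estimate (assumed form)."""
    return 50 * (ci_high - ci_low) / estimate


print(round(percentage_relative_precision(362, 310, 459)))   # Västerbotten: 21
print(round(percentage_relative_precision(793, 621, 1179)))  # Dalarna–Gävleborg: 35
```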
Consistent with Creel and Rosenblatt (2013), we found that the method works best if the sampling intensity exceeds ~40%. At lower sampling intensities, the estimator tends to severely underestimate the population size. This can be explained by the fact that small samples usually do not contain many parent–offspring pairs, which severely restricts the pedigree reconstruction. At much higher sampling intensities (e.g., >80%), hardly any information is gained over a simple count (see Creel and Rosenblatt 2013). Moreover, the risk of overestimation also increases. If, in the extreme case, 100% of individuals were sampled and no reliable information on mortality was available, the CRE would severely overestimate the size of the population because individuals (albeit dead) would still be inferred from the pedigree (Creel and Rosenblatt 2013). The CRE is therefore best suited for noncyclical species with generational overlap and either low or well-documented mortality rates. The published mortality rates for Swedish brown bears are likely to be accurate because natural mortality of adults is rare in comparison with hunting or traffic accidents, which are closely monitored (Mörner et al. 2005). Gross overestimation due to unknown mortality rates can also be avoided by comparing the CRE estimates to those obtained from rarefaction analysis on the same data because the slope and asymptote of the rarefaction curve provide good approximations of population size and the proportion sampled. Contrary to our expectation, we found no significant differences in the completeness of the pedigree when sampling individuals from the peripheries of the study area compared to the central region. This suggests that many individuals roamed widely with frequent crossings in and out of the study area. Nor did we see a more incomplete pedigree in the west, where mountains and low human population density result in a lower sampling effort, suggesting that a similar proportion of individuals is sampled in this area too. Indeed, all peripheral areas were similar, including the hard border to the east, the Baltic Sea. The finding that more mother–daughter than father–son dyads were found when separately sampling the same number of individuals from both sexes within a specific area is expected, as brown bears show female philopatry (Blanchard and Knight 1991; Støen et al. 2005; Saarma and Kojola 2007).
For well-studied populations that are regularly sampled, the CRE offers no immediate advantage over established CMR methods in terms of estimating population size. However, if sampling occurs over a number of years, the required sampling effort to maintain a desired sampling coverage should be considerably reduced as genotyped individuals accumulate. In simulations by Creel and Rosenblatt (2013), the proportion of the population that had to be sampled typically dropped to ≤20% within 3 years. The CRE could therefore prove to be useful in situations where budgetary or logistic constraints make repeated, large-scale sampling events unrealistic, for example, in remote regions or developing countries.
In their simulations, Creel and Rosenblatt (2013) tracked all individuals throughout the simulated period of 15 years, which means they consistently had accurate information about parent–offspring relationships and mortalities. Based on these data, they were able to infer individuals (as the missing parent in parent–offspring dyads) without error. In pedigrees reconstructed from empirical genetic field data, this inference is less straightforward, especially for species with nonmonogamous mating behavior. Inferring the correct number of missing sires or dams continues to be a major challenge of the CRE method. The additional information provided by the Lynch–Ritland relatedness coefficient (r) helped to improve the resolution of the pedigree by enabling us to detect full-siblings among the parent–offspring dyads, which then allowed for correct inference of the missing sire or dam. Based on the values of r, we suspect that there are several half-siblings that share the invisible parent. Unfortunately, it is not sufficient to infer the missing parent as one individual in these cases because the coefficient only captures the degree of relatedness and not the specific relationship. Half-siblings share on average approximately 25% of their alleles, but the same is true for grandparent–grandoffspring and avuncular relationships (Blouin 2003). Thus, the true relationship between two individuals usually cannot be inferred from their degree of relatedness alone.
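The ambiguity described above can be illustrated with a minimal sketch based on textbook expectations for non-inbred diploid individuals; it is not part of our analysis. The table `IBD` and the helpers `expected_r` and `candidates` are hypothetical names introduced here. Each relationship is described by the probabilities (k0, k1, k2) of sharing 0, 1, or 2 alleles identical by descent at a locus, and the expected relatedness is r = k1/2 + k2.

```python
from fractions import Fraction as F

# Expected identity-by-descent probabilities (k0, k1, k2) for non-inbred
# diploid relatives (textbook values; illustrative table, not study data).
IBD = {
    "parent-offspring":           (F(0),    F(1),    F(0)),
    "full-siblings":              (F(1, 4), F(1, 2), F(1, 4)),
    "half-siblings":              (F(1, 2), F(1, 2), F(0)),
    "grandparent-grandoffspring": (F(1, 2), F(1, 2), F(0)),
    "avuncular":                  (F(1, 2), F(1, 2), F(0)),
    "unrelated":                  (F(1),    F(0),    F(0)),
}


def expected_r(k):
    """Expected relatedness from IBD probabilities: r = k1/2 + k2."""
    _k0, k1, k2 = k
    return k1 / 2 + k2


def candidates(r_target):
    """All relationship categories whose expected r equals r_target."""
    return sorted(name for name, k in IBD.items() if expected_r(k) == r_target)


if __name__ == "__main__":
    # Three different second-degree relationships share the same expected r.
    print(candidates(F(1, 4)))
    # Parent-offspring and full-siblings also share r = 1/2, although their
    # (k0, k1, k2) profiles differ.
    print(candidates(F(1, 2)))
```

Because half-siblings, grandparent–grandoffspring, and avuncular pairs have identical expected IBD profiles, no estimator of pairwise relatedness can separate them without additional information such as age.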
To further improve the inferences from the pedigree, information about the age of the sampled individuals is needed. If genetic relatedness can be combined with age in the analysis, the most probable relationships can be determined far more readily. We recommend keeping track of each genotyped individual from the date it was first recorded. Even if the true age remains unknown, a minimum age can be assigned, and over the course of several sampling periods, individuals can at least be compared on the basis of age relative to one another. This would considerably improve the accuracy of the pedigrees, particularly with regard to the directionality in putative parent–offspring dyads (Kopps et al. 2015), thereby helping to refine the CRE population estimates. Using the ratio of known dams and sires for inference of individuals may not fully reflect reality, but given the restrictions of the data, we have shown that it results in a credible estimate.
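As a rough sketch of how "first seen" records might be used, the following example assigns a minimum age to each genotyped individual and proposes a direction for a putative parent–offspring dyad. This is a heuristic under stated assumptions, not the procedure used in this study: the function names, the individual identifiers, and the threshold `min_gap_years` (here set arbitrarily to 4 years) are all hypothetical. A first-seen date only bounds age from below, so the proposed direction is a suggestion to be checked against other evidence, not proof.

```python
from datetime import date

DAYS_PER_YEAR = 365.25


def minimum_age_years(first_seen, reference):
    """Lower bound on age: time elapsed since the first record."""
    return max(0.0, (reference - first_seen).days / DAYS_PER_YEAR)


def likely_direction(id_a, id_b, first_seen, min_gap_years=4.0):
    """Propose (parent, offspring) for a parent-offspring dyad.

    Heuristic: the individual first recorded at least `min_gap_years`
    earlier is treated as the more likely parent. Returns None when the
    gap is too small to suggest a direction.
    """
    gap = (first_seen[id_b] - first_seen[id_a]).days / DAYS_PER_YEAR
    if gap >= min_gap_years:
        return (id_a, id_b)
    if gap <= -min_gap_years:
        return (id_b, id_a)
    return None


if __name__ == "__main__":
    # Hypothetical first-seen records for three genotyped individuals.
    first_seen = {
        "BB001": date(2001, 9, 1),
        "BB002": date(2009, 9, 1),
        "BB003": date(2010, 9, 1),
    }
    print(minimum_age_years(first_seen["BB001"], date(2011, 9, 1)))
    print(likely_direction("BB002", "BB001", first_seen))
    print(likely_direction("BB002", "BB003", first_seen))
```

Dyads for which the function returns `None` would remain undirected until further sampling sessions widen the gap in minimum ages or other information becomes available.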
Consistent with the simulation results of Creel and Rosenblatt (2013), and using empirical data, we show that accurate estimates of total population size are possible from reconstructed pedigrees. The estimator is limited by the resolution of the pedigree and by potentially unknown mortality rates. It therefore works best in long-lived species with substantial generational overlap and is further helped if “first seen” records are kept to give rough estimates of age. This makes the method particularly appealing for recurring sampling in the same population.
Acknowledgments
We thank the numerous volunteers who participated in the sample collection and our laboratory technician, Helena Königsson, for carrying out the genotyping. Jonas Kindberg from the Scandinavian Brown Bear Research Project provided valuable insights into the official population estimates. We also thank the reviewers for helpful comments that improved the manuscript.
Conflict of Interest
None declared.