The presence of a cytopathologist increases the diagnostic accuracy of endoscopic ultrasound-guided fine needle aspiration cytology for pancreatic adenocarcinoma: a meta-analysis
Correspondence:
S. Hébert-Magee, UAB Department of Pathology, 136 Hospital Support Building, Birmingham, AL 35249-6823, USA
Tel.: +1 205 975 3299; Fax: +1 205 934 7094;
E-mail: [email protected]
Abstract
Objective
A meta-analysis has not previously been performed to evaluate critically the diagnostic accuracy of endoscopic ultrasound-guided fine needle aspiration (EUS-FNA) for pancreatic ductal adenocarcinoma alone and to address factors that affect the variability of accuracy. The aim of this study was to determine whether the presence of a cytopathologist, variability of the reference standard and other sources of heterogeneity significantly impact diagnostic accuracy.
Methods
We conducted a comprehensive search to identify studies in which the pooled sensitivity, specificity, likelihood ratios for a positive or negative test (LR+, LR−) and summary receiver-operating curves (SROC) could be determined for EUS-FNA of the pancreas for ductal adenocarcinoma using clinical follow-up, and/or surgical biopsy or excision as the reference standard.
Results
We included 34 distinct studies (3644 patients) in which EUS-FNA for a solid pancreatic mass was evaluated. The pooled sensitivity and specificity of EUS-FNA for pancreatic ductal adenocarcinoma were 88.6% [95% confidence interval (CI): 87.2–89.9] and 99.3% (95% CI: 98.7–99.7), respectively. The LR+ and LR− were 33.46 (95% CI: 20.76–53.91) and 0.11 (95% CI: 0.08–0.16), respectively. The meta-regression model showed that rapid on-site evaluation (ROSE) (P = 0.001) remained a significant determinant of EUS-FNA accuracy after correcting for study population number and reference standard.
Conclusion
EUS-FNA is an effective modality for diagnosing pancreatic ductal adenocarcinoma in solid pancreatic lesions, with an increased diagnostic accuracy when using on-site cytopathology evaluation.
Introduction
In spite of advances in cancer treatment, ductal adenocarcinoma of the pancreas continues to be a fatal disease in the majority of patients. In 2010, the number of pancreatic cancers in the US was estimated to be slightly over 43 000, with nearly 95% being pancreatic adenocarcinoma.1 A diagnosis of pancreatic carcinoma is associated with a 22% 1-year survival compared with 78.5% for all cancer sites combined.1 Delayed presentation and advanced aggressive disease contribute to the poor prognosis experienced by the vast majority of patients. Survival beyond 1 year is dependent on the size and location of the tumour. Therefore, early detection and diagnosis are essential for the possibility of a more favourable prognosis.
Endoscopic ultrasound-guided fine needle aspiration (EUS-FNA) is a technique that has gained wide acceptance for the procurement of pancreatic tissue for diagnostic and staging purposes. Its greatest advantage is high spatial resolution that allows the detection of very small lesions and vascular invasion, particularly in the pancreatic head and neck, which may not be detected on transverse computed tomography (CT), requiring helical CT with CT angiography.2 In addition, this less invasive procedure is often optimal in the procurement of diagnostic tissue in patients with unresectable tumours. In the past decade or so, EUS-FNA has been increasingly used as a diagnostic tool for pancreatic ductal carcinoma with several institutions reporting varying sensitivities and specificities. Very few studies have examined the diagnostic utility of EUS-FNA of pancreatic ductal carcinoma alone, whereas most retrospective and prospective study designs have collectively looked at all primary pancreatic solid lesions, including lymphomas and pancreatic neuroendocrine neoplasms. The aim of this study was to perform a meta-analysis to assess the overall diagnostic accuracy of EUS-FNA for pancreatic ductal carcinoma. We were interested primarily in identifying the causes of heterogeneity in diagnostic accuracy among different studies, including the presence of on-site cytopathology, the use of different reference standards and understanding the reasons underlying the variability in accuracy of EUS-FNA, if present.
Methods
Literature search
A systematic literature search was performed to identify studies assessing the diagnostic value of EUS-FNA for pancreatic ductal carcinoma. The MEDLINE and SCOPUS databases, from January 1994 to March 2011, were searched with the following keywords: (‘EUS’ OR ‘endoscopic ultrasound’ OR ‘endoscopies’ OR ‘ultrasonic endoscopy(ies)’) AND (‘pancreas OR pancreatic’ OR ‘adenocarcinoma OR pancreatic adenocarcinoma’ OR ‘ductal carcinoma OR pancreatic ductal carcinoma OR ductal carcinoma of the pancreas’) OR (‘solid tumour(s)’ OR ‘solid mass’ OR ‘solid lesion’) AND (‘accuracy’ OR ‘sensitivity OR specificity’ OR ‘true-positive OR true-negative OR false-positive OR false-negative’).
Other databases, such as Embase and the Cochrane library were also checked for relevant articles with similar key terminology.
Study selection
Two investigators (S.H. and I.E.) independently reviewed all literature returned by the search. All abstracts were read to select potentially eligible articles, and the full text of these articles was then retrieved to determine eligibility. Non-English literature returns were excluded. The inclusion and exclusion criteria are shown in Table 1. According to our strict criteria, studies with potentially overlapping study populations were excluded. The year of publication helped here: where the same author had published more than once, we included only the most recent study with the largest patient population and excluded earlier studies, which often covered a smaller subgroup of patients, unless we could determine that the studies were independent.
Inclusion criteria | Exclusion criteria |
---|---|
English-published literature | Articles that published subsets of the same data from previous publications |
Endoscopic ultrasound-guided fine needle aspiration (EUS-FNA) used as diagnostic modality | Studies that did not provide sufficient data to construct a 2 × 2 table |
Pancreatic ductal carcinoma was the diagnosis | Articles that did not include raw data such as case reports, editorials and letters |
Histopathological analysis and/or close clinical and imaging follow-up were used as the reference standard | Studies in which fine needle aspirates (FNAs) were obtained by endoscopic ultrasound (EUS) and other modalities (such as computed tomography) AND in which the EUS-FNA results were not separated out from other modalities |
Sufficient data was present to extrapolate the true-positive (TP), false-positive (FP), true-negative (TN) and false-negative (FN) for pancreatic adenocarcinoma diagnosis | Articles in which data was reported also for non-ductal pancreatic carcinoma and the raw data regarding pancreatic ductal carcinoma could not be ascertained |
Study designs (retrospective or prospective) were judged acceptable or unacceptable according to whether patients were included consecutively, whether test results were interpreted blindly, and the type of reference standard used in the study.
Data collection and extraction
From the selected studies that met the strict inclusion criteria, two independent investigators extracted the following data onto standardized data forms utilizing Microsoft Excel: design, author, year of publication, sample size, clinical context, the selected study's inclusion and exclusion criteria, reference standard, number of passes, presence of cytopathology assessment and the number of true-positive (TP), true-negative (TN), false-positive (FP) and false-negative diagnoses (FN). 2 × 2 tables were constructed using the numbers of TP (ai), FP (bi), FN (ci) and TN (di) diagnoses in the detection of pancreatic ductal carcinoma.
Quality assessment
Quality assessment of diagnostic accuracy studies (QUADAS), an evidence-based quality assessment tool used in systematic reviews of studies of diagnostic accuracy, was used to evaluate the risk of bias and variability of the studies and extract relevant study characteristics.3 The Cochrane handbook for assessing diagnostic accuracy in systematic reviews suggests the use of 11 out of 14 QUADAS items.3 The following eleven QUADAS methodological quality questions were assessed: (1) Was the patient spectrum representative of an actual patient population in clinical practice? (2) Was the reference standard acceptable? (3) Was the period of time between the index test and the reference standard limited, so that it was less likely that the target condition changed/evolved between the two tests? (4) Were all patients or just a subset (partial segment) evaluated with the reference standard? (5) Did all patients receive the same reference standard, regardless of the index test results? (6) Were the results from the index test interpreted without knowledge of the reference standard results? (7) Were the index test results not incorporated into the reference standard? (8) Were the reference standard results interpreted while blinded to the index test results? (9) Were clinical data analogous to what would be available in practice presented? (10) Were uninterpretable results mentioned? (11) Were withdrawals or exclusions explained?
Statistical analysis


The likelihood ratio for a positive test result (LR+) was calculated by dividing the sensitivity by the false-positive error rate, whereas the likelihood ratio for a negative test result (LR−) was calculated by dividing the false-negative error rate by the specificity. The diagnostic odds ratio (DOR) was calculated by dividing LR+ by LR−. In addition, summary receiver-operating curves (SROC), where sensitivity is the y-axis and 1-specificity is the x-axis, were produced. The closeness of the area under the curve (AUC) to 1.0 is a validated depiction of the diagnostic accuracy. The variation in accuracy measures in the individual studies and in the pooled measures was displayed graphically using forest plots, which visually show the amount of variation between the sensitivity and specificity of the individual studies, as well as an estimate of the overall accuracy of all the studies combined.
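The definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than the software used for the analysis: the class and function names are hypothetical, and the 0.5 continuity correction for tables with a zero cell is a common convention assumed here, not a rule stated in the text. The example counts are those reported for Voss et al.12 in Table 3.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one study's 2 x 2 table (TP, FP, FN, TN)."""
    tp: float
    fp: float
    fn: float
    tn: float

    def corrected(self, k: float = 0.5) -> "TwoByTwo":
        # Assumption: add k to every cell only when some cell is zero,
        # so that the ratios below stay defined.
        if min(self.tp, self.fp, self.fn, self.tn) > 0:
            return self
        return TwoByTwo(self.tp + k, self.fp + k, self.fn + k, self.tn + k)


def accuracy_measures(t: TwoByTwo) -> dict:
    """Sensitivity, specificity, LR+, LR- and DOR for a table without zero cells."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    lr_pos = sensitivity / (1 - specificity)   # sensitivity / false-positive rate
    lr_neg = (1 - sensitivity) / specificity   # false-negative rate / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,                # equals (TP * TN) / (FP * FN)
    }


# Counts reported for Voss et al. in Table 3: TP = 48, FP = 1, FN = 11, TN = 30.
voss = accuracy_measures(TwoByTwo(tp=48, fp=1, fn=11, tn=30).corrected())
for name, value in voss.items():
    print(f"{name}: {value:.3f}")
```

Because the DOR reduces to (TP × TN)/(FP × FN), a study with no false positives or no false negatives has an undefined DOR unless some correction is applied, which is why the sketch includes one.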
Heterogeneity was analysed by performing subgroup analysis, comparing different characteristics among the included studies. Studies were evaluated by geographic location (continent of origin), study design (retrospective or prospective), sample size, presence of on-site cytopathologist, reference standard and QUADAS score. QUADAS scores were calculated as follows. Each study was assessed for the specified QUADAS components; a score of +2, +1, or 0 was assigned for each QUADAS criterion that was met, undetermined, or not met, respectively. An additive sum of the assigned points determined a final score for each study. A funnel plot of standard error was created to assess for publication bias. ORs of the included studies were plotted according to the variance of the log OR estimate. The x-axis consisted of the natural logarithm of the diagnostic OR, whereas on the y-axis, we plotted the sample size. Smaller studies (with fewer patients) that have less precision in estimating the underlying OR will scatter widely, with a narrowing among larger studies. In the absence of publication bias, the plot resembles a symmetrical funnel. If there is bias however, the funnel plot will be asymmetrical.
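The QUADAS scoring rule described above (+2, +1 or 0 per criterion, summed over the 11 items) can be written out as a short sketch. The rating labels and function name are illustrative assumptions; only the point values and the number of items come from the text.

```python
# Points per QUADAS criterion, as described in the text.
POINTS = {"met": 2, "undetermined": 1, "not met": 0}
N_ITEMS = 11  # number of QUADAS items assessed


def quadas_score(ratings: list[str]) -> int:
    """Additive QUADAS score for one study from its 11 item ratings."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    return sum(POINTS[r] for r in ratings)


# Hypothetical study: 8 criteria met, 2 undetermined, 1 not met.
example = ["met"] * 8 + ["undetermined"] * 2 + ["not met"]
print(quadas_score(example))
```

Under this rule the maximum possible score is 22, reached when all 11 criteria are met.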
Results
Literature search and selection of studies
Our initial search yielded 683 study titles and abstracts (465 MEDLINE, 218 SCOPUS). Of these, 34 studies met the pre-defined inclusion and exclusion criteria and were selected for meta-analysis. Six hundred and forty-nine were excluded because (1) the articles were duplicates from the other database or non-English publications (n = 412); (2) the aim of the study was not to reveal the diagnostic value of EUS-FNA for identification and characterization of pancreatic carcinoma, particularly ductal adenocarcinoma (n = 180); (3) insufficient data were reported to construct a 2 × 2 table (n = 44); (4) researchers in the articles did not use a proper reference standard or follow-up interval of equal to or greater than 5 months (n = 8); (5) the full-text article could not be obtained because the journal was no longer in publication (n = 3); and (6) statistical analysis could not be performed with the data provided (n = 2). Selection of our studies is illustrated in Figure 1. The full text of all 34 studies was read for complete examination of their contents.4-37
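The study-flow arithmetic in the preceding paragraph can be checked with a short script. The counts are taken directly from the text; the dictionary keys are shorthand labels chosen for this sketch.

```python
# Records identified per database, as reported in the text.
identified = {"MEDLINE": 465, "SCOPUS": 218}

# Exclusions by reason, numbered as in the text.
excluded = {
    "duplicate or non-English": 412,
    "aim not diagnostic value of EUS-FNA": 180,
    "insufficient data for 2 x 2 table": 44,
    "improper reference standard or follow-up": 8,
    "full text unobtainable": 3,
    "statistical analysis not possible": 2,
}

total_identified = sum(identified.values())
total_excluded = sum(excluded.values())
included = total_identified - total_excluded

print(total_identified, total_excluded, included)
```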

The 34 studies included 3644 patients, with 2285 pancreatic adenocarcinomas (Tables 2 and 3). Seventeen studies were noted to be retrospective and 17 prospective. Of the 34 selected, 29 used EUS-FNA alone, two used combined EUS-FNA and EUS TruCut, and the remaining three used EUS-FNA with CT FNA; EUS-FNA with CT FNA and US FNA; and EUS-FNA with contrast harmonic echo (CHE-EUS), respectively. In all studies, the cytopathologists were blinded to the diagnostic findings from previous or concurrent studies and information was limited to clinical presentation and radiographic studies alone.
First author | Year | Location | Study design | No. of patientsa | Length of study (months) | On-site cytopathology for all cases | Reference standard | QUADAS score |
---|---|---|---|---|---|---|---|---|
Cahn4 | 1996 | N. America | Prospective | 50 | 32 | Not Provided | Combined | 18 |
Baron5 | 1997 | N. America | Prospective | 47 | 24 | Yes | Histology | 18 |
Chang6 | 1997 | N. America | Prospective | 44 | 30 | Sometimes | Combined | 18 |
Faigel7 | 1997 | N. America | Prospective | 41 | 15 | Yes | Combined | 18 |
Bentz8 | 1998 | N. America | Prospective | 45 | 15 | Yes | Combined | 18 |
Binmoeller9 | 1998 | Europe | Prospective | 40 | 10 | No | Combined | 18 |
Suits10 | 1999 | N. America | Prospective | 98 | 36 | Yes | Combined | 18 |
Erickson11 | 2000 | N. America | Retrospective | 109 | 48 | Yes | Combined | 16 |
Voss12 | 2000 | Europe | Retrospective | 90 | 36 | No | Combined | 18 |
Brandwein13 | 2001 | N. America | Retrospective | 43 | 36 | Sometimes | Histology | 20 |
Fritscher-Ravens14 | 2002 | Europe | Retrospective | 207 | 24 | Yes | Combined | 18 |
Eloubeidi15 | 2003 | N. America | Prospective | 101 | 11 | Yes | Combined | 19 |
Levy16 | 2003 | N. America | Retrospective | 6 | 2 | No | Histology | 20 |
Raut17 | 2003 | N. America | Retrospective | 233 | 23 | Yes | Combined | 20 |
Varadarajulu18 | 2004 | N. America | Prospective | 3 | 9 | Yes | Combined | 21 |
Varadarajulu19 | 2005 | N. America | Prospective | 282 | 36 | Yes | Combined | 19 |
Ho20 | 2006 | N. America | Retrospective | 11 | 70 | No | Combined | 21 |
Horwhat21 | 2006 | N. America | Prospective | 36 | 60 | Yes | Combined | 15 |
Mitsuhashi22 | 2006 | N. America | Retrospective | 267 | 56 | Yes | Combined | 18 |
Aithal23 | 2007 | Europe | Prospective | 167 | 38 | No | Combined | 18 |
Ardengh24 | 2007 | S. America | Retrospective | 69 | 119 | No | Combined | 18 |
Argawal25 | 2008 | N. America | Retrospective | 30 | 48 | Yes | Combined | 18 |
Fisher26 | 2009 | Australia | Prospective | 93 | 44 | Yes | Combined | 18 |
Giovannini27 | 2009 | Europe | Retrospective | 121 | 4 | Sometimes | Combined | 18 |
Hikichi28 | 2009 | Asia | Retrospective | 75 | 48 | Sometimes | Combined | 18 |
Hwang29 | 2009 | Asia | Retrospective | 139 | 12 | No | Combined | 18 |
Krishna30 | 2009 | N. America | Prospective | 140 | 45 | Yes | Combined | 18 |
Touchefeu31 | 2009 | Europe | Prospective | 90 | 48 | No | Combined | 16 |
Wilson32 | 2009 | Australia | Retrospective | 72 | 46 | Not Provided | Combined | 18 |
Napoleon33 | 2010 | Europe | Prospective | 35 | 12 | Not Provided | Combined | 18 |
Noda34 | 2010 | Asia | Prospective | 19 | 41 | No | Combined | 18 |
Turner35 | 2010 | N. America | Retrospective | 520 | 96 | Sometimes | Combined | 17 |
Zhang36 | 2010 | N. America | Retrospective | 279 | 60 | Yes | Combined | 18 |
Raddaoui37 | 2011 | Asia | Retrospective | 42 | 48 | No | Combined | 18 |
- a Patients with endoscopic ultrasound-guided fine needle aspiration performed on solid pancreatic lesions.
First author | Total | TP | FP | FN | TN | ROSE | Sensitivity | Inadequate, n (%) |
---|---|---|---|---|---|---|---|---|
Levy16 | 6 | 1 | 0 | 1 | 4 | No | 0.50 | 0 (0) |
Binmoeller9 | 40 | 8 | 0 | 4 | 28 | No | 0.67 | 6 (15)b |
Ardengh24 | 69 | 7 | 0 | 4 | 58 | No | 0.64 | 1 (1.4)a |
Ho20 | 11 | 4 | 0 | 0 | 7 | No | 1.00 | 0 (0) |
Touchefeu31 | 90 | 49 | 0 | 22 | 19 | No | 0.69 | 8 (8)a |
Aithal23 | 167 | 51 | 0 | 2 | 114 | No | 0.96 | 0 (0) |
Voss12 | 90 | 48 | 1 | 11 | 30 | No | 0.81 | 9 (9)b |
Hwang29 | 139 | 71 | 0 | 17 | 51 | No | 0.81 | 17 (12) |
Noda34 | 19 | 15 | 0 | 1 | 3 | No | 0.94 | 0 (0) |
Raddaoui37 | 43 | 10 | 0 | 1 | 32 | No | 0.91 | 6 (14)b |
Total no ROSE | 674 | 264 | 1 | 63 | 346 | | 0.81 | |
Cahn4 | 50 | 21 | 0 | 3 | 26 | Not stated | 0.88 | 2 (4)a |
Wilson32 | 72 | 46 | 0 | 2 | 24 | Not stated | 0.96 | 8 (11)b |
Napoleon33 | 35 | 13 | 0 | 5 | 17 | Not stated | 0.72 | 0 (0) |
Total not stated | 157 | 80 | 0 | 10 | 67 | | 0.89 | |
Turner35 | 520 | 340 | 1 | 102 | 77 | Sometimes | 0.77 | 4 (0.9)a |
Brandwein13 | 43 | 22 | 0 | 15 | 6 | Sometimes | 0.60 | 3 (7)a |
Giovannini27 | 121 | 67 | 6 | 5 | 43 | Sometimes | 0.93 | 0 (0) |
Hikichi28 | 75 | 48 | 0 | 2 | 25 | Sometimes | 0.96 | 2 (3)b |
Chang6 | 44 | 30 | 0 | 0 | 14 | Sometimes | 1.00 | 1 (7) |
Total sometimes | 803 | 507 | 7 | 124 | 165 | | 0.80 | |
Horwhat21 | 36 | 21 | 0 | 4 | 11 | Yes | 0.84 | 0 (0) |
Bentz8 | 45 | 22 | 0 | 2 | 21 | Yes | 0.92 | 7 (16)b |
Faigel7 | 40 | 30 | 0 | 2 | 9 | Yes | 0.94 | 2 (4.9)b |
Fritscher-Ravens14 | 207 | 51 | 0 | 16 | 140 | Yes | 0.76 | 7 (3.5)b |
Eloubeidi15 | 101 | 67 | 0 | 4 | 30 | Yes | 0.94 | 2 (2)b |
Fisher26 | 93 | 66 | 0 | 4 | 23 | Yes | 0.94 | 9 (9)b |
Krishna30 | 140 | 81 | 0 | 4 | 55 | Yes | 0.95 | 0 (0) |
Mitsuhashi22 | 267 | 133 | 0 | 7 | 127 | Yes | 0.95 | 14 (5.2) |
Suits10 | 98 | 56 | 0 | 2 | 40 | Yes | 0.97 | 0 (0) |
Raut17 | 233 | 172 | 0 | 6 | 55 | Yes | 0.97 | 16 (7) |
Varadarajulu19 | 282 | 204 | 1 | 6 | 71 | Yes | 0.97 | 4 (1)b |
Erickson11 | 109 | 92 | 0 | 1 | 16 | Yes | 0.99 | 0 (0) |
Zhang36 | 279 | 137 | 0 | 1 | 141 | Yes | 0.99 | 6 (7)b |
Baron5 | 47 | 33 | 0 | 5 | 9 | Yes | 0.87 | 0 (0) |
Varadarajulu18 | 3 | 2 | 0 | 0 | 1 | Yes | 1.00 | 0 (0) |
Argawal25 | 30 | 6 | 0 | 0 | 24 | Yes | 1.00 | 0 (0) |
Total with ROSE | 2010 | 1173 | 1 | 64 | 773 | | 0.95 | |
- TP, true positive; FP, false positive; FN, false negative; TN, true negative; ROSE, rapid on-site evaluation.
- a Study did not clarify if inadequate samples were included.
- b Inadequate samples are not included in the study.
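As a rough cross-check, the group totals in the table above can be pooled by simply summing counts. The sketch below assumes this plain summation (the pooling method actually used is not described in this section) and uses hypothetical names throughout. The results can be compared with the pooled sensitivities and specificities reported in the subgroup analysis.

```python
# (TP, FP, FN, TN) totals per ROSE group, copied from the "Total" rows above.
totals = {
    "No": (264, 1, 63, 346),
    "Not stated": (80, 0, 10, 67),
    "Sometimes": (507, 7, 124, 165),
    "Yes": (1173, 1, 64, 773),
}


def pooled(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Pooled sensitivity and specificity from summed counts."""
    return tp / (tp + fn), tn / (tn + fp)


by_group = {name: pooled(*counts) for name, counts in totals.items()}

# Column-wise sums give the overall 2 x 2 table across all 34 studies.
overall_counts = tuple(sum(col) for col in zip(*totals.values()))
overall = pooled(*overall_counts)

for name, (sens, spec) in by_group.items():
    print(f"{name}: sensitivity {100 * sens:.1f}%, specificity {100 * spec:.1f}%")
print(f"Overall: sensitivity {100 * overall[0]:.1f}%, specificity {100 * overall[1]:.1f}%")
```

Summing TP and FN across the four groups also recovers the 2285 adenocarcinomas stated in the text.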
The reference standard, in the majority, was clinical and/or histological findings and only three studies used histology alone. On-site evaluation by the cytopathologist was present in 16 studies, sometimes in five, was not indicated in three and not present in ten. The time of follow-up for all the 34 studies was longer than 9 months on average, unless diagnosis was confirmed by histology, the patient was lost to follow-up, or expired prior to the stipulated period.
Methodological quality assessment
All the selected studies were of moderate to good quality, meeting seven or more of the scoring criteria of the modified 11-question QUADAS assessment. The mean QUADAS score was 18.2 with a range of 15–21 based upon our scoring system. A representative spectrum of patients was thought to be included in all studies as all studies reported a consecutive patient population. Histology was used as the reference standard in three studies (8.8%); partial verification bias (i.e. a portion of the study group not subjected to the defined reference standard) was present in two studies (5.9%), where positive (malignant) cytology was considered diagnostic and included in the reference standard. The risk of differential verification bias (i.e. verification using different reference standards) could not be avoided in 29 studies (85.3%), which used varying reference standards including clinical follow-up and/or histopathology. One study used repeat EUS-FNA as the reference standard, causing incorporation bias. Uninterpretable or intermediate results and withdrawals, if present, were explained in all studies. In addition, none of the studies clearly stated the interval between the index test (EUS-FNA) and diagnostic confirmation. Publication bias (i.e. the selection of articles) did not impact the diagnostic accuracy of this meta-analysis, as shown by a relatively symmetrical funnel plot (Figure 2).

Estimates of diagnostic accuracy
The sensitivity of EUS-FNA for the diagnosis of pancreatic adenocarcinoma ranged from 0.50 to 1.00 and the specificity ranged from 0.88 to 1.00. The pooled sensitivity for EUS-FNA of the pancreas for pancreatic adenocarcinoma was 88.6% [95% confidence interval (CI): 87.2–89.9] and the pooled specificity was 99.3% (95% CI: 98.7–99.7). The pooled LR+ was 33.46 (95% CI: 20.76–53.91) and the pooled LR− was 0.11 (95% CI: 0.08–0.16). The DOR expresses how much greater the odds of having the disease are for people with a positive test result than for people with a negative test result. The pooled DOR was 383.64 (95% CI: 219.66–684.09), showing a very high accuracy of EUS-FNA in diagnosing pancreatic ductal carcinoma. The SROC curves for overall accuracy and the AUC are shown in Figure 3.

We evaluated heterogeneity by plotting the sensitivity and specificity from each study on a forest plot and noted how well the CIs overlapped. Sensitivity was notably lower in most studies that lacked on-site cytopathology evaluation, whereas only two of 14 studies with a sensitivity of 95% or higher did not have on-site cytopathology. The forest plot for the sensitivity of EUS-FNA is shown in Figure 4. There were only nine false-positive diagnoses, of which six came from a single institutional study; cytopathology was sometimes present for that particular study. Because the other studies had either one or no false positives, a forest plot of specificity is not shown. Although the studies in which rapid on-site evaluation (ROSE) was performed tended to have a lower inadequate (unsatisfactory) rate (mean = 0.03, 95% CI 0.01–1.96), this difference did not reach statistical significance compared with studies without ROSE (mean = 0.05, 95% CI 0.01–1.96). Thirteen studies did not report any inadequate specimens. In four studies, inadequate specimens were included in the assessment of sensitivity and specificity.

Univariate analysis was performed on the different variables considered as possible sources of heterogeneity; the presence of a cytopathologist for ROSE (P = 0.012), studies with over 50 patients (P = 0.013) and the type of reference standard used (P = 0.015) were found to be statistically significant (Table 4). Factors that were not found to have a significant contribution to performance variability included: location, QUADAS score and study design. SROC for the presence of cytopathology is shown in Figure 5. Meta-regression was performed on these three variables (presence of on-site cytopathology, reference standard and size) and only the presence of on-site cytopathology (P = 0.001) was shown to be independently significant after correction (Table 5).
Subgroup | No. of studies | Sensitivity pooled | Specificity pooled | P-value |
---|---|---|---|---|
Study design | ||||
Prospective | 17 | 91.7 (89.6–93.4) | 99.8 (98.9–100) | 0.980 |
Retrospective | 17 | 86.8 (84.9–88.5) | 99.1 (98.2–99.6) | |
Study length | ||||
<36 months | 14 | 89.4 (86.8–91.6) | 98.7 (97.1–99.5) | 0.111 |
≥36 months | 20 | 88.3 (86.6–89.8) | 99.7 (99.0–99.9) | |
Location | ||||
N. America | 20 | 89.9 (88.4–91.3) | 99.7 (99.0–100) | 0.481 |
Europe | 7 | 81.5 (77.1–85.4) | 98.2 (96.4–99.3) | |
Other | 7 | 89.5 (85.4–92.7) | 100 (98.3–100) | |
QUADAS score | ||||
15–18 | 27 | 87.1 (85.5–88.7) | 99.3 (98.7–99.7) | 0.545 |
19 or higher | 7 | 93.7 (91.2–95.6) | 99.4 (96.9–100) | |
No. of patients | ||||
≤50 | 14 | 84.4 (79.2–88.7) | 100 (98.2–100) | 0.013 |
>50 | 20 | 89.1 (87.6–90.4) | 99.2 (98.5–99.6) | |
On-site cytopathology | ||||
Yes | 16 | 94.8 (93.4–96.0) | 99.9 (99.3–100) | 0.012 |
Sometimes | 5 | 80.3 (77.0–83.4) | 95.9 (91.8–98.3) | |
Not provided | 3 | 88.9 (80.5–94.5) | 100 (94.6–100) | |
No | 10 | 80.7 (76.0–84.9) | 99.7 (98.4–100) | |
Reference standard | ||||
Combined | 31 | 89.1 (87.8–90.4) | 99.3 (98.7–99.7) | 0.015 |
Histology Only | 3 | 72.7 (61.4–82.3) | 100 (82.4–100) |
Subgroup | RDOR (95% CI) | P-value |
---|---|---|
Number of patients | 1.00 (1.00–1.01) | 0.1329 |
On-site cytopathology | 5.95 (2.15–16.45) | 0.0012 |
Reference standard | 4.91 (0.62–38.92) | 0.1264 |

Discussion
This review aimed to summarize evidence for the diagnostic accuracy of EUS-FNA in identifying pancreatic ductal carcinoma in patients with solid pancreatic lesions. An important finding is that only three of the studies used histology alone as the gold standard, whereas most studies exhibited differential verification bias, using varying reference standards of clinical, radiographical or histological follow-up. The diagnostic accuracy of most studies was high, mostly because of high specificity. There were only nine false-positive results, mostly coming from a single study, whereas most of the variation in the studies evaluated is related to sensitivity. Although it is generally believed that the presence of a cytopathologist improves adequacy, our analysis shows that the small improvement in adequacy is not statistically significant. This is in contrast to the overall accuracy, which is significantly higher when ROSE is offered. As expected, accuracy is also better in larger studies, reflecting the generally accepted view that large volume brings experience to the endoscopy–cytology team.
Generally, if a patient is deemed radiographically inoperable because of the tumour burden, surgical excision or biopsy is not performed to verify EUS-FNA findings of carcinoma. Likewise, if a patient is found not to have pancreatic adenocarcinoma by EUS-FNA and clinical findings do not suggest the contrary, the patient is often monitored through clinical follow-up and subsequent imaging. The lack of histological tissue leads to variance in the type of reference standard. Our analysis showed that most studies used different reference standards based upon the EUS-FNA and physical findings, which inadvertently leads to bias within the studies. Studies that used clinical follow-up alone, or clinical follow-up and histology, had higher sensitivities and specificities than those that used histology alone. The use of various reference standards not only suggests differential verification bias but may also point to interpretation bias for patients who are monitored clinically.
Notably, all studies took place at a major medical centre or university. Typically most patients undergo EUS-FNA in this setting; however, as EUS-FNA becomes more accepted, smaller institutions and practices have begun to offer this diagnostic tool. This is a concern as the studies selected for this review did not reflect the levels of experience at these institutions. We surmised the earlier studies we reviewed in this analysis may be akin to present studies at less experienced institutions, yielding lower sensitivities and specificities than our pooled data. This may be reflective of less experienced endoscopists and cytopathologists or the lack of availability of ROSE.
It was important for us to discern whether it was the presence of on-site cytopathology that led to higher diagnostic accuracy at these major medical centres or the volume of cases seen at these institutions (inferring more experienced endosonographers performing the aspirates) and how the reference standard used impacted these variables. Our results by meta-regression, which adjusted for the presence of the other variables (either ROSE or case volume), showed that each independently contributed to high diagnostic accuracy. However, when we added the type of reference standard used to the multivariate analysis the case volume (size) was no longer found to be significant. This supports our belief that some bias exists when large numbers of cases are followed clinically. However, if it is a centre with an experienced skilled team that is performing the procedure, having histology as the gold standard may be less important. A previous study that we performed addressed the significance of an experienced team in diagnostic accuracy and validates our concerns.38 However, we recognize that ROSE may not be feasible at many EUS centres because of cost, time and relatively low reimbursement for pathology services.39
In our review of the studies, we wanted to assess the impact of inadequate samples. In the presence of ROSE, inadequate samples are typically not an issue; however, if materials are collected and later sent to the pathology laboratory for evaluation, it must be questioned whether these non-diagnostic or inadequate samples are considered to be negative. Considering inadequate samples as negatives will decrease the sensitivity and may affect the assessment of the impact of ROSE on diagnostic accuracy. It would also affect the diagnostic accuracy overall, as this is based on ‘true’ TNs and FNs. Our analyses of the studies did not show a statistically significant difference in the rate of inadequate samples in the presence or absence of a cytopathologist. Yet, there was a tendency for the rate of inadequate samples to be higher when a cytopathologist was not present. One might speculate as to why ROSE improves accuracy in spite of not significantly improving adequacy. We believe the answer is twofold: inadequacy may result from technical issues in the procurement of tissue, or it may result from limitations in the diagnostic threshold of the pathologist. Perhaps, if sufficiently complete data were provided (avoidance of type one error), adequacy might prove to be significant. Of the 34 studies included, 13 did not report any inadequate specimens. Two of the studies (Aithal et al.23 and Horwhat et al.21) reported technical failure; however, these were not considered ‘inadequate samples’ and were not included in their analyses. There is no uniform criterion to define adequacy (technical versus lesional, versus patient-related issues), which is a limitation to this type of analysis. We could not determine if some pathologists at the participating institutions were reluctant to call samples inadequate or if they used loose criteria.
Some papers did not address whether the inadequate samples were considered to be negative or removed from analysis. Aside from possibly being affected by the presence or absence of a cytopathologist, the rate of inadequate samples could also be affected by the skill of the team or the size of the centre.
The threshold of malignancy in this analysis varies in defined ‘cut-offs’, especially at the intermediate level (atypical and suspicious cases). Some earlier studies such as Baron et al.5 considered atypical as positive, whereas Turner et al.35 considered all atypical and suspicious as negative. We could not extrapolate all the necessary data to reclassify all the studies to the same threshold. Therefore we used the authors' threshold for intermediate cases and recognize this may bias the results.
There were also limitations to studying the influences of certain sources of heterogeneity in this review, particularly the number of EUS-FNA passes and the presence of chronic pancreatitis. Regarding the number of passes, studies typically did not provide sufficient information pertaining to this characteristic. Chronic pancreatitis was mentioned as a small component in one study;14 however, the correlation with pancreatic ductal carcinoma could not be extrapolated. Our analyses primarily focused on differences in diagnostic performance between studies using different reference standards and the presence or absence of ROSE.
This review focused on the diagnostic performance of EUS-FNA in ascertaining pancreatic adenocarcinoma in solid pancreatic lesions. A comprehensive systematic analysis and synthesis of evidence on reliability was beyond the scope of this review. Undeniably, adequate reliability (inter- and intra-observer agreement) is essential for a quality diagnostic test. In the studies reviewed, the EUS-FNA procedure was reported to have been performed by a skilled endoscopist and interpreted by an experienced cytopathologist. As there was no way to standardize the comparative results from the various studies and institutions, this analysis was not able to exclude inter-observer variability in EUS-FNA performance. Overall, however, the reported sensitivities and specificities suggest that EUS-FNA of pancreatic adenocarcinoma is a reliable test, with optimal performance in an ideal clinical setting with a skilled team.
Applicability of findings to clinical practice
EUS-FNA has been shown to detect tumours less than 3 mm, which are not readily recognized by imaging. The results of this study, a systematic review of over two thousand cases of pancreatic adenocarcinoma diagnosed by EUS-FNA at various institutions, confirm that the diagnosis is principally accurate. This is especially relevant when healthcare management and healthcare cost initiatives call for less invasive, cost-efficient modalities that do not diminish the current standard of practice.
Equally important, this analysis shows that although the diagnostic accuracy of EUS-FNA is high, it is not as high as previously purported when histology, the ‘true’ gold standard, is used. This may be the result of overconfidence in the observer, imaging analysis and limited clinical follow-up. It is conceivable that a poorly differentiated tumour is called pancreatic adenocarcinoma by EUS-FNA and the patient dies shortly afterwards, with this diagnosis recorded as the cause of death. If tissue had been procured and had shown poorly differentiated neuroendocrine carcinoma or lymphoma, however, the diagnostic value would change. Moreover, if the lesion was called pancreatic adenocarcinoma but was truly pancreatitis, and the patient died 3 months later from unrelated causes, the case could be misinterpreted as a true positive on clinical follow-up.
We believe the most important point shown by this meta-analysis is that the presence of on-site cytopathology assessment yields higher diagnostic accuracy. We also think this meta-analysis shows that centre size and choice of reference standard are not significant determinants; rather, accuracy lies in the skill of the endosonographer. Our own experience has shown that many of the referrals we received were secondary to the inexperience of the centre where the initial procedure was performed, and that the presence of the cytopathologist in the room at the time of aspiration prevents inadequate samples and reduces the likelihood of sampling error.
Conclusion
EUS-FNA is an effective modality for diagnosing pancreatic adenocarcinoma in solid pancreatic lesions, with a higher diagnostic accuracy than for cystic tumours. More importantly, this analysis shows that the diagnostic performance of this test is not an independent entity, but is heavily reliant upon the experience of the skilled team and, equally, the presence of on-site cytopathology. Although beyond the scope of this analysis, performance is also affected by other factors specific to the patient (size of the lesion, location, etc.) and by auxiliary studies (cell blocks, liquid-based cytology, endoscopic retrograde cholangiopancreatography). In spite of these limitations, EUS-FNA is clearly the best and least invasive modality to accurately diagnose pancreatic ductal carcinoma.