Are we ready to bridge classification systems? A comprehensive review of different reporting systems in thyroid cytology
Esther Diana Rossi and Liron Pantanowitz contributed equally to the paper.
Abstract
The evaluation of thyroid lesions is common in the daily practice of cytology. While the majority of thyroid nodules are benign, in recent decades, there has been increased detection of small and well-differentiated thyroid cancers. Combining ultrasound evaluation with fine-needle aspiration cytology (FNAC) is extremely useful in the management of thyroid nodules. Furthermore, the adoption of specific terminology, introduced by different thyroid reporting systems, has helped effectively communicate thyroid FNAC diagnoses in a clear and understandable way. In 1996, the Papanicolaou Society thyroid cytological classification was introduced. This was followed in 2005 by the first Japanese and then in 2007 by the Bethesda System for Reporting Thyroid Cytopathology, which subsequently underwent two revisions. Other international thyroid terminology classifications include the British, Italian, Australasian and other Japanese cytology systems. This review covers similarities and differences among these cytology classification systems and highlights key points that unify these varied approaches to reporting thyroid FNAC diagnoses.
Graphical Abstract
The paper is a brief evaluation of the different cytology terminology classification systems for thyroid nodules. The review aims to define the similarities and differences among these systems, especially in the field of indeterminate lesions.
1 ONCE UPON A TIME: A BRIEF DESCRIPTION OF THE PUBLICATION SEQUENCE OF THE VARIOUS THYROID TERMINOLOGY CLASSIFICATIONS
Over the last three decades, multiple thyroid cytology reporting systems have been published. The first attempt to establish a thyroid cytologic classification system began in 1996, when a task force was created by the Papanicolaou Society of Cytopathology. Specifically, that system categorized FNAs as benign non-neoplastic lesions, cellular follicular lesions, Hürthle cell neoplasm and malignant.1 Interestingly, the suspicious for malignancy category was not specifically mentioned, although it was implied in the paper by Suen et al.2 In 2005, the Japanese Society of Thyroid Surgeons modified the Papanicolaou system. Thereafter, in 2007, The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) was crafted, followed by British, Italian and then revised versions of a Japanese system.2-12 Since then, as will be discussed in the following sections, each of the reporting systems has undergone revision, with a second and in some cases (e.g. TBSRTC in 2023) a third edition. Among the systems, the newest seems to be the Australasian terminology, which was first published in 2014 and then introduced as a modified version of Bethesda in 2019.13, 14
Of note, all the reporting terminologies have undergone various revisions, and it can be difficult to keep track of all the changes. The Bethesda system is sponsored by the American Society of Cytopathology, and each edition has been published as a book by Springer. On the other hand, some systems are published on society websites and periodically updated (as is the case with the UK RCPath and Australasian systems) rather than appearing as peer-reviewed publications. As an example, the British RCPath system was first published on the RCPath website in 2009; it underwent a second revision in 2016, a third revision in 2023 and, in March 2024, a fourth revision that changed the Thy 3 category to take account of changes reported in the Bethesda system.
2 INTRODUCTION
The detection of thyroid nodules has become increasingly frequent in daily practice.15-17 This change is mostly attributed to the increased detection of nodules with greater use of ultrasound imaging of the thyroid gland, which is now able to identify even minor lesions. Fortunately, the majority of thyroid nodules are benign, with only 5%–10% malignant. Nonetheless, the incidence of thyroid cancer, mostly those that are well-differentiated and specifically papillary thyroid carcinoma (PTC) and its subtypes, has been increasing.15
In a paper by Sajisevi et al., the authors discussed a retrospective series of 1328 patients who underwent thyroid-directed surgery in 16 centers in four countries. The study demonstrated that 51% of thyroid cancers were discovered in patients who had no thyroid-referable symptoms and that these cancers were smaller than symptomatic thyroid cancers.15
As a result, an increasing number of patients with thyroid nodules now require further evaluation. It is well established that fine-needle aspiration cytology (FNAC) is one of the most important tools for evaluating thyroid lesions. FNAC is minimally invasive, can define the nature of most nodules and procures adequate cellular material to support ancillary studies that help resolve indeterminate lesions.16-20
Although FNAC is able to provide a diagnosis for the majority of thyroid nodules, up to 20%–25% will remain indeterminate and be classified as follicular proliferations, including categories III and IV in the Bethesda system, Thy 3a and Thy 3f in the British classification and TIR3A and TIR3B in the Italian system.18-20 Data from the literature indicate that 75% of these indeterminate lesions result in a diagnosis of adenoma and/or benign adenomatous nodules, with only a minority proving malignant. The risk of malignancy (ROM) of the indeterminate categories in the Bethesda system is 22% for AUS and 30% for FN; the corresponding figures for the other classification systems differ only slightly.
However, it is clear that, especially for these indeterminate lesions, rendering a cytology report with a descriptive free-text diagnosis (and hence a non-definitive one) is insufficient to appropriately guide patient management and follow-up, and it may also lead to clinical misinterpretation of the report.21-23
To ensure that an FNAC diagnosis is practical, clearly understandable, robust and easy to interpret, classification systems with standardized reporting were introduced. In these reporting systems, cytomorphology is typically combined with an expected ROM and coupled to clinical and/or surgical patient management.2-13 The purpose of any classification system is to clarify and tailor patient management and to minimize inter- and intra-observer variability.22-24
Since its introduction in 2007, TBSRTC has represented the greatest worldwide effort to codify the cytological diagnosis of thyroid lesions and guide patient management.5-7 Thus far, TBSRTC is the most popular system. The goal of TBSRTC, as for all such classification systems, has been to standardize diagnostic categories linked with expected ROM and specific patient management. TBSRTC has been accepted globally, with translations into seven different languages. Despite numerous publications confirming the excellent inter- and intra-observer agreement of TBSRTC, other similar terminology classifications are in use, including the British, Italian, Japanese and Australasian systems.5-14 Whilst all of these classification systems standardize the cytologic reporting of thyroid interpretations without ambiguous terms, the number of these classification systems indicates a lack of unity among cytopathologists. A comparative analysis of these systems (Table 1), including their revisions, has demonstrated more similarities than differences, making them somewhat interchangeable.
System | ND | B | Indeterminate | SFM | PM | ROM2, 6-10
---|---|---|---|---|---|---
Bethesda system6 | Including cystic only | Different benign entities | AUS; FN | Epithelial; Medullary; Others | Epithelial; Others; Metastases | ND = 13%; B = 4%; AUS = 22%; FN = 30%; SFM = 74%; PM = 97%
British system7-9 | Including cystic only | Different benign entities, including cystic only | Thy3a; Thy3f | Epithelial; Medullary; Others | Epithelial; Others; Metastases | ND = 12%; B = 5%; Thy3a = 25%; Thy3f = 31%; SFM = 79%; PM = 98%
Italian system10 | Including cystic only | Different benign entities | TIR3A; TIR3B | Epithelial; Medullary; Others | Epithelial; Others; Metastases | ND < 10%; B < 3%; TIR3A = 17%; TIR3B = 47%; SFM = 85%; PM = 99%
Japanese system2 | Including cystic only | Different benign entities | Indeterminate significance; FN | Epithelial; Medullary; Others | Epithelial; Others; Metastases | ND = 5.6%; B = 0–2.2%; Indeterminate significance = 13.1%; FN = 11.4%; SFM = 95.8%–98.8%; PM = 99.2%–99.3%
Abbreviations: AUS, atypia of undetermined significance; B, benign; FN, follicular neoplasm; ND, non-diagnostic; PM, positive for malignancy; ROM, risk of malignancy; SFM, suspicious for malignancy; Thy3a, neoplasm possible, atypia/non-diagnostic; Thy3f, neoplasm possible, suggesting follicular neoplasm; TIR3A, low-risk indeterminate lesion; TIR3B, high-risk indeterminate lesion.
3 THE BETHESDA SYSTEM FOR REPORTING THYROID CYTOPATHOLOGY
Since TBSRTC began to be adopted, the cytopathological approach to thyroid lesions has changed significantly owing to the worldwide acceptance of this classification system. On the basis of its worldwide use, TBSRTC was endorsed in 2015 by the American Thyroid Association (ATA) thyroid cancer guidelines as a valid system for reporting thyroid cytology.25 TBSRTC has also been recognized in the 4th and 5th editions of the WHO classification of Endocrine and Neuroendocrine Tumours.26-28 Furthermore, different cytologic societies, such as the International Academy of Cytology (IAC) and the European Federation of Cytology Societies (EFCS), have supported and endorsed the third edition of TBSRTC, confirming the leading role of this classification system worldwide.5-7
In 2023, the 3rd edition of TBSRTC was published. It refined the diagnostic categories, including subdivisions for Atypia of Undetermined Significance (AUS); updated the ROMs, including those for the paediatric population; modified specific management strategies; added a chapter on the ultrasound evaluation of thyroid nodules; and acknowledged the performance of molecular testing.7 The important change made in the AUS category, with a subclassification into nuclear atypia and others,7 reduces differences in the indeterminate categories compared with the other international classification systems.24, 29, 30 This subclassification was based on data showing that AUS with nuclear atypia has a 36%–44% ROM versus only 15% in the other AUS subtypes.
In a recent paper by Bagis et al., the authors analysed 1224 AUS cases, subcategorized into AUS “nuclear” and AUS “other” as proposed in the 3rd edition of TBSRTC.31 The authors demonstrated a higher ROM in FNAs with nuclear overlapping (35.5%), nuclear moulding (56.9%), irregular contours (42.1%), nuclear grooves (74.1%), chromatin clearing (49.4%) and chromatin margination (57.7%), and these features were independent significant predictors of malignancy. They concluded that ROM was significantly higher in the AUS-nuclear subcategory (48.2%) and in the AUS-nuclear and architectural subcategory (38.3%) compared with the group including other subcategories.31
This 2-tiered subclassification of AUS is supported by molecular testing, which further confirmed the distinction between these two subcategories.32-39 Zhu et al. examined the rate and timing of surgical resection in a series of indeterminate nodules with benign/negative molecular testing, together with the risk of false-negative molecular test results.32 Only 12 cases underwent surgery, whilst 83 were followed up with a median surveillance of 26.7 months. They concluded that the majority of these lesions were stable for over 3 years, with a 6% false-negative rate, justifying the importance of continued sonographic surveillance.32 In another paper by Gokozan et al., the authors combined molecular testing with cytomorphology to monitor AUS diagnoses among six cytopathologists in a cytopathology laboratory.34 A total of 588 cases were analysed with molecular testing, showing that genetic alterations, especially RAS mutations and less commonly BRAF mutations, were found in the AUS and FN categories. Furthermore, they found that the rate of neoplasia with genetic alterations in the AUS group varied between 5% and 20% among the different pathologists and did not correlate with the AUS:FN ratio. Both these calculations may allow recognition of over- or under-diagnosis.34
The latest revision also introduced a description of the cytological findings of new entities, in agreement with the latest edition of the WHO Classification of Endocrine and Neuroendocrine Tumours.6, 26-30 This review will not discuss the third edition of TBSRTC and WHO 2022 in detail; however, among the changes in WHO 2022 that might affect cytological interpretation is the introduction of a squamous subtype of anaplastic (undifferentiated) thyroid cancer (ATC).6, 26-30 Furthermore, the diagnosis of the newly defined entity, differentiated high-grade thyroid carcinoma, based on the presence of a necrotic component and increased mitotic figures (≥5 mitoses/2 mm²), could be investigated in cytological samples.7, 40 The diagnosis of noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) can only be made histologically, so this entity is most likely to fall in the indeterminate categories of AUS or FN.41-44
4 THE BRITISH CLASSIFICATION SYSTEM
The British thyroid classification was introduced following a joint effort by the British Thyroid Association (BTA) and the Royal College of Physicians.7-9, 45-48 In 2002, the British Thyroid Association and the Royal College of Physicians of London published a document entitled Guidelines for the Management of Thyroid Cancer in Adults, with a second edition in 2007 and a third edition in 2014. In 2009, the UK Royal College of Pathologists (RCPath) published the first edition of the RCPath Thy terminology. The RCPath document was then further revised in 2016, 2023 and 2024. The latest RCPath guidance, from March 2024, is published on the RCPath website and includes an updated revision of the Thy 3 category, with criteria for Thy 3 that are almost identical to the AUS criteria in TBSRTC: Thy 3a (nuclear atypia) and Thy 3a (non-nuclear atypia).45-48
In the British system too, the most interesting adjustment since its introduction has been the subclassification of indeterminate lesions into Thy 3a and Thy 3f.16 For Thy 3f lesions, the definition was based on samples suggesting a follicular or oncocytic neoplasm, also with marked nuclear atypia.16
5 THE ITALIAN CLASSIFICATION SYSTEM
After TBSRTC and the RCPath systems, the Italian Consensus for the Classification and Reporting of Thyroid Cytology (ICCRTC) provided an alternative 5-tiered classification, including a subclassification of indeterminate lesions into two different subgroups: low-risk (TIR3A) and high-risk (TIR3B).10, 11 Unlike many of the other systems, the TIR3A subcategory included FNAC cases with increased cellularity and microfollicular structures in a background of scant colloid, as well as samples with cytologic and architectural atypia which cannot be further classified.49-51 The concept of focal and mild nuclear atypia is only included in the TIR3B subcategory, which is suggestive of a follicular neoplasm. The restriction of nuclear atypia to TIR3B was the most important difference compared to the other classification systems. While this subclassification of indeterminate lesions is helpful for clinical management, in discriminating lesions that need surgery from those that need only repeat FNA, there is unfortunately no associated ROM.49-51
However, a new revised edition is expected in 2024, with some changes in the diagnostic categories, although the major criteria will be maintained as they were, with incorporation of ancillary techniques, including molecular testing.
6 THE JAPANESE CLASSIFICATION SYSTEM
This classification system for thyroid cytopathology highlights the different approach in practice and management of thyroid lesions between Eastern and Western countries.3, 52-58 In 2005, the Japanese Society of Thyroid Surgeons modified the Papanicolaou classification system.56 In 2019, the Japanese Association of Endocrine Surgery (JAES) and the Japanese Society of Thyroid Pathology (JSTP) proposed a new Japanese Reporting System for Thyroid Aspiration Cytology (JRSTA).3 This system includes seven categories with a subclassification of the indeterminate groups into (a) indeterminate “other” showing <1% ROM, and (b) indeterminate follicular neoplasm with 40%–60% ROM, which in turn are further subclassified into three subcategories: (1) favour benign with <15% ROM; (2) borderline with 15%–30% ROM and (3) favour malignant with 40%–60% ROM.3
As reported by Hirokawa et al., the indeterminate “other” group includes atypical criteria which were detailed in their atlas, as well as mild nuclear atypia of PTC.2 In this regard, this indeterminate category shows significant similarities with TBSRTC and the British classification. The Japanese system introduced the adoption of only a single name to define each category, in order to avoid confusion. The first two editions of TBSRTC maintained two names for some categories (e.g. AUS/FLUS and FN/SFN), whilst the recent third edition of TBSRTC was modified in favour of utilizing only a single diagnostic name. Hirokawa et al. published a comparison between the Japanese and Bethesda systems (referring to the 2nd edition of TBSRTC) and concluded that there were significant differences in ROM and in the detection of PTC among the indeterminate categories exhibiting nuclear atypia.3 Specifically, with TBSRTC the PTC rate was 28%–56% in AUS with nuclear atypia and 27%–68% in FN, whilst with the Japanese system it was 44% in the indeterminate category and only 4% in those cases diagnosed as FN. These discrepancies illustrate the difference between Western and Eastern countries with respect to thyroid cytopathology and patient management.59 In fact, in the Japanese system, not all patients diagnosed with FN undergo surgery, as the final analysis also requires genetic testing.52-58 Of note, as for some of the other regional and local classification systems, many laboratories, especially in Japan, seem to have adopted the Bethesda system.
Furthermore, Kameyama et al. reported very recent changes in the ROM, especially for the indeterminate lesions, indicating that this system has also been revised recently.60
7 ARE WE READY FOR A SINGLE CLASSIFICATION SYSTEM? COMPARATIVE RESULTS
Although all of the aforementioned systems have greater similarities than differences, a single classification system that will be favourably adopted by all countries still needs to be developed. A WHO Head and Neck terminology, covering head and neck, salivary and also thyroid terminology, is expected to be published in 2025; however, at the time of this review article, no specific details or assessments of this system are known.
Should just one of these systems be adopted (e.g. TBSRTC), or is there rather a need to create an entirely new thyroid system unifying the previously published ones? To be helpful, a single unifying cytological system should be easy to understand and prove useful to clinicians in clinical practice. Furthermore, the ideal system should have good intra- and inter-observer reproducibility among the diagnostic categories, so that the diagnosis of AUS, TIR3A or Thy3a is the same, with the same criteria, irrespective of the country.1-12 The changes in the different systems confirm that the revised British and the revised Italian systems are very similar and aligned to the Bethesda system version 3. However, TBSRTC seems to be the most used thyroid classification worldwide, with a high level of satisfaction among cytopathologists and clinicians. Other points to note are that the British, Italian and Japanese systems are mostly restricted to use in their own countries,2, 4-12 but with good acceptance. The Australasian system does not add any differences to the Bethesda system and seems to be a Bethesda-modified terminology. No meta-analysis has been published comparing the results of the Australasian modified version, and the latest revisions of TBSRTC appear most aligned with advances in the field of thyroid pathology in the WHO 2022 Classification of Endocrine and Neuroendocrine Tumours.6
Analysing the diagnostic categories among the different systems, there are more similarities than differences.2-12, 21, 22 For instance, in the non-diagnostic group each system uses criteria that are reproducible.2-12 Also, the criteria for benign lesions are very similar. All of the systems have also introduced indeterminate categories in which thyroid lesions cannot be reliably interpreted as either benign or malignant. However, it is debatable whether the different systems utilize reproducible and similar criteria for diagnosing indeterminate lesions. Poller et al. in 2020 published a systematic review and meta-analysis assessing the ROM in the various categories of the British system.50 They reported a 25% ROM for Thy3a versus a 31% ROM for Thy3f lesions, with only a 6% gap between these two subcategories.61 Similar results were also reported for TBSRTC. In fact, a major meta-analysis comparing the subcategories of indeterminate lesions found a 27% ROM for AUS and a 31% ROM for FN.38 Given that the gap between these two subgroups is only around 4%, subclassification itself was clearly not able to correctly triage cases that need follow-up and/or surgery, leading in some cases to unnecessary surgical procedures. The above discussion and analysis is, of course, now historic, as these studies were conducted using the criteria in use prior to 2023 (Bethesda version 3) and early 2024 (UK RCPath version 4). Data from a meta-analysis by Trimboli et al., adopting the Italian classification system, have shown that the cancer rate was 12% in TIR3A and 44% in TIR3B.59 As mentioned above, the Poller et al. 2020 article now reflects historic UK practice. Using the UK RCPath 2024 revised Thy 3a definitions, the ROM of Thy 3a from 2024 onwards in British practice would be expected to be similar to that reported by Trimboli et al. for TIR 3A and 3B61 and for TBSRTC version 3 AUS (other) vs. AUS (nuclear atypia).
The significant gap as reported in these earlier meta-analyses between these two subcategories is likely ascribed to morphologic criteria, including nuclear atypia only in the TIR3B group, and the other types of atypia present in the low-risk TIR3A category. It is also important to note that not all reported data are based on surgical follow-up. Trimboli et al. discussed the impact of nodule size for the indeterminate categories. Their data showed that larger nodule size was associated with lower prevalence of TIR3B and ROM, but higher ROM in TIR3A. Hence all the different systems underline, as a common point, the combined role of morphology and ancillary techniques, including molecular testing for the indeterminate categories.62-65
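As an illustration only, the ROM gaps quoted in this section can be recomputed with a short Python sketch. The percentages are those cited above from the earlier meta-analyses; the dictionary layout and the function name `rom_gap` are assumptions made for this example and are not part of any reporting system.

```python
# ROM percentages quoted in this section (earlier meta-analyses).
# Keys and structure are illustrative assumptions, not an official schema.
ROM_PERCENT = {
    "British": {"Thy3a": 25, "Thy3f": 31},
    "Bethesda": {"AUS": 27, "FN": 31},
    "Italian": {"TIR3A": 12, "TIR3B": 44},
}


def rom_gap(system: str) -> int:
    """Return the gap, in percentage points, between the higher and
    lower indeterminate subcategory ROM of the given system."""
    values = ROM_PERCENT[system].values()
    return max(values) - min(values)


if __name__ == "__main__":
    for name in ROM_PERCENT:
        print(f"{name}: gap of {rom_gap(name)} percentage points")
```

With these figures the British and Bethesda gaps are 6 and 4 percentage points, whereas the Italian gap is 32, which is the contrast drawn in the text.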
8 CONCLUSION
While a single unifying thyroid classification system that is accepted globally would be ideal, this is unlikely to be realized any time soon. One of the main reasons why this has not yet materialized is likely the excellent revised versions of the different classification systems, which continually narrow any differences as the field advances. It seems that, despite the appeal of a single terminology system, all the currently available systems have similar categories offering excellent results and management, which are unlikely to be improved with a single system. To date, there is general agreement on the different terminology and classification of thyroid nodules, and similarities in the ROM from the different systems are documented. For now, TBSRTC has been endorsed and recognized as the most commonly used classification system for thyroid nodules. However, further work is still necessary to achieve uniformity. The option of a WHO Head and Neck Terminology is in the initial evaluation stage, although at present no details are available.
AUTHOR CONTRIBUTIONS
EDR and LP contributed equally to planning, writing, editing and reviewing the manuscript. EDR and LP planned, conceptualized and designed the study.
FUNDING INFORMATION
No funding for this commentary.
CONFLICT OF INTEREST STATEMENT
No competing interest and conflict of interest.
ETHICS STATEMENT
Not applicable.
Open Research
DATA AVAILABILITY STATEMENT
The data are available in our repository.