Search for medical information for chronic rhinosinusitis through an artificial intelligence ChatBot
Abstract
Objectives
Artificial intelligence is evolving and significantly impacting health care, promising to transform access to medical information. With the rise of medical misinformation and frequent internet searches for health-related advice, there is a growing demand for reliable patient information. This study assesses the effectiveness of ChatGPT in providing information and treatment options for chronic rhinosinusitis (CRS).
Methods
Six inputs were entered into ChatGPT regarding the definition, prevalence, causes, symptoms, treatment options, and postoperative complications of CRS. The International Consensus Statement on Allergy and Rhinology: Rhinosinusitis (ICAR) guidelines served as the gold standard for evaluating the answers. The inputs were grouped into three categories, and Flesch–Kincaid readability metrics, ANOVA, and trend analysis were used to assess the responses.
Results
Although some discrepancies were found, ChatGPT's answers regarding CRS were largely in line with the existing literature. Mean Flesch Reading Ease, Flesch–Kincaid Grade Level, and passive-voice percentage were 40.7, 12.15, and 22.5% for the basic information and prevalence category; 47.5, 11.2, and 11.1% for the causes and symptoms category; 33.05, 13.05, and 22.25% for the treatment and complications category; and 40.42, 12.13, and 18.62% across all categories. ANOVA indicated no statistically significant differences in readability across the categories (p-values: Flesch Reading Ease = 0.385, Flesch–Kincaid Grade Level = 0.555, Passive Sentences = 0.601). Trend analysis revealed that readability varied slightly across categories, with a general increase in complexity.
Conclusion
ChatGPT is a developing tool potentially useful for patients and medical professionals to access medical information. However, caution is advised as its answers may not be fully accurate compared to clinical guidelines or suitable for patients with varying educational backgrounds.
Level of evidence: 4.
1 INTRODUCTION
Artificial intelligence (AI) is a rapidly growing field that has the potential to revolutionize many industries, including health care.1 One of the most flourishing subfields of AI is machine learning (ML), which has gained great recognition since the 1990s.2 ML allows computers to learn by automatically recognizing significant patterns and relations within large amounts of data without the need for explicit programming.3 In recent years, ML algorithms have seen great improvements, allowing their applications to become beneficial across many fields.4 Tasks that would normally require human intelligence, such as understanding natural language, recognizing images, and making decisions, are now being performed by AI.5
With the rise of technology, using the internet to search for health-related information (HRI) has become readily accessible to many people around the globe. However, medical information found online may be inappropriate or even harmful because of unverified content and a lack of strict online regulations.6 Moreover, even when the information is accurate, some resources use language above the lay understanding of the general public, rendering it effectively inaccessible.6
With the exponential growth of online search demand comes a growing need for AI to transform how we access dependable medical information. AI has already been successfully applied in health care in recent years. In neurology, an AI system was developed to restore control of movement in patients with quadriplegia.7 In dermatology, AI-based tools are being used to evaluate the severity of psoriasis8 and to distinguish between onychomycosis and healthy nails.9 In otolaryngology, Powell et al. provided a proof of concept that human phonation can be decoded by AI to help in the diagnosis of voice disorders.10 In ophthalmology, researchers at Google developed and trained a deep convolutional neural network on thousands of retinal fundus images to classify diabetic retinopathy and macular edema in adults with diabetes.11 In primary care, physicians can utilize AI to transcribe their notes, analyze patient discussions, and automatically input the necessary information into EHR systems.12 Nevertheless, AI's ability to provide accurate and comprehensible medical information across medical topics and disorders has yet to be extensively demonstrated.13 Therefore, in this study, we sought to examine whether an AI ChatBot can provide accurate, comprehensive, and understandable information on chronic rhinosinusitis (CRS). CRS was chosen as the prototype for this research for several compelling reasons. First, CRS is one of the most prevalent conditions in otolaryngology, affecting approximately 11% of the population and accounting for 15% of otolaryngologic outpatient consultations. The condition leads to significant morbidity, impacting the quality of life of millions of individuals.14, 15 In the US alone, there are over 30 million physician visits related to CRS annually, a figure that exceeds the number of medical visits for hypertension.16 CRS's high health care burden and clinical complexity, with its wide range of symptoms, causes, and treatments, provide a robust test for AI-generated medical content. By evaluating ChatGPT's performance in providing information on a condition as widespread and multifaceted as CRS, we aim to assess its potential as a reliable tool for medical education and patient information.
2 MATERIALS AND METHODS
2.1 Generating medical information
The following data were generated on April 1, 2024, using ChatGPT, which is accessible through OpenAI's website.
To examine ChatGPT's ability to respond with appropriate medical information, we provided the AI ChatBot with inputs in the form of questions about CRS and recorded the responses. These inputs covered the definition, prevalence, causes, symptoms, treatment options, and postoperative complications of CRS. A total of six unique ChatGPT outputs were examined, corresponding to the six questions posed. The questions were then grouped into three categories to evaluate the ChatBot's knowledge of CRS more fully. ChatGPT's answers were compared against the International Consensus Statement on Allergy and Rhinology: Rhinosinusitis (ICAR) guidelines to evaluate their accuracy.
ChatGPT-3.5 uses an algorithm that is probabilistic in nature; it relies on random sampling to generate a wide variety of responses and may therefore produce different answers to the same query. This investigation included only ChatGPT's initial answer to each query, without regenerating the answers, and no additional clarifications or explanations were requested. All queries were entered into a ChatGPT account owned by the author on a single day, and each query was checked for correct grammar and syntax before submission. Each query was placed into a new dialogue window to eliminate confounding factors and preserve the accuracy and precision of the responses, since ChatGPT-3.5 can adapt its answers based on the details of every prior interaction.
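For readers who wish to reproduce a comparable querying protocol programmatically, a minimal sketch follows. This is an assumption rather than the study's method: the study itself used the ChatGPT-3.5 web interface, the question wordings below are illustrative rather than the exact prompts posed, and the sketch relies on the OpenAI Python client with an API key configured in the environment.

```python
# Sketch of an analogous programmatic protocol (assumption: the study used the
# ChatGPT-3.5 web interface, not the API). Each question is sent in a fresh,
# single-turn conversation so that no earlier exchange can influence the answer,
# and only the first generated response is kept, mirroring the study design.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative question wordings covering the six topics examined in the study.
questions = [
    "What is chronic rhinosinusitis?",
    "How prevalent is chronic rhinosinusitis?",
    "What causes chronic rhinosinusitis?",
    "What are the symptoms of chronic rhinosinusitis?",
    "What are the treatment options for chronic rhinosinusitis?",
    "What are the possible postoperative complications after surgery for chronic rhinosinusitis?",
]

responses = []
for question in questions:
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],  # new single-turn dialogue per query
        n=1,                                                # keep only the first answer
    )
    responses.append(completion.choices[0].message.content)
```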
The study did not require approval from the Rutgers New Jersey Medical School institutional review board (IRB), since it did not involve human participants and no patient-identifying information was used.
2.2 Data analysis
We conducted a thorough linguistic examination to assess the readability and complexity of the AI-generated responses. To accomplish this, we utilized the Flesch Reading Ease and Flesch–Kincaid Grade Level metrics, which are established measures that provide insights into readability and the educational level required to comprehend the material. These metrics have been widely applied in numerous studies to evaluate the accessibility of online content on various conditions, such as ACL injury, glaucoma, and dog bites.17-19 In our study, we first determined the average (mean) and variability (standard deviation) for both the Flesch Reading Ease and Flesch–Kincaid Grade Level indices, as well as for the percentage of passive sentences, to evaluate ChatGPT's overall readability performance. We also analyzed the passive sentence percentage because it can affect the readability and comprehension of the text; passive constructions are generally harder to read and understand, particularly for individuals with lower literacy levels.20

Next, we organized the six questions pertaining to CRS into three distinct categories: basic information and prevalence, causes and symptoms, and treatment and complications. For each category, we computed the mean and standard deviation of the Flesch Reading Ease, Flesch–Kincaid Grade Level, and passive sentence percentage of ChatGPT's responses. These initial steps paved the way for more thorough statistical analyses, including ANOVA and trend analysis, enabling us to investigate whether ChatGPT's readability performance varied across the question categories.

ANOVA is used to compare the means of three or more groups to determine whether there are any statistically significant differences among them; here, it was used to compare the means of the readability metrics across the three categories of questions. The assumptions for ANOVA were verified prior to analysis. Trend analysis was conducted to observe patterns and shifts in the readability metrics across the categories: the mean and standard deviation of each readability metric were plotted to visualize trends in the data, which helped to identify any systematic changes in the readability and complexity of the responses based on the type of information provided. This comprehensive analytical approach allowed us to understand the textual qualities of the AI-generated content in depth and to evaluate its suitability for patient education.
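As an illustration of the readability pipeline described above, the following Python sketch computes the two Flesch metrics for each response and summarizes them by category. It assumes the textstat package, uses short placeholder strings rather than the actual ChatGPT outputs (which are shown in Figures 1-8), and omits the passive-sentence percentage, which standard readability libraries do not report.

```python
# Minimal sketch of the readability computation (assumes the `textstat` package;
# the response strings are placeholders, not the actual ChatGPT outputs).
import textstat
from statistics import mean, stdev

responses = {
    "Q1_definition":    "Chronic rhinosinusitis is sinus inflammation lasting at least twelve weeks.",
    "Q2_prevalence":    "Reported prevalence varies with the population and the diagnostic criteria used.",
    "Q3_causes":        "Multiple factors, including anatomy, allergy, and environment, may contribute.",
    "Q4_symptoms":      "Common symptoms include nasal obstruction, facial pressure, and reduced smell.",
    "Q5_treatments":    "Management may be medical, such as saline irrigation, or surgical.",
    "Q6_complications": "Postoperative complications can include bleeding, infection, and scarring.",
}

# Per-response readability metrics.
ease  = {q: textstat.flesch_reading_ease(text) for q, text in responses.items()}
grade = {q: textstat.flesch_kincaid_grade(text) for q, text in responses.items()}

# Group the six questions into the three study categories and summarize each group.
categories = {
    "basic information and prevalence": ["Q1_definition", "Q2_prevalence"],
    "causes and symptoms":              ["Q3_causes", "Q4_symptoms"],
    "treatment and complications":      ["Q5_treatments", "Q6_complications"],
}
for name, qs in categories.items():
    scores = [ease[q] for q in qs]
    grades = [grade[q] for q in qs]
    print(f"{name}: Flesch Reading Ease {mean(scores):.1f} ± {stdev(scores):.1f}, "
          f"Grade Level {mean(grades):.1f} ± {stdev(grades):.1f}")
```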
3 RESULTS
3.1 Qualitative analysis
Figure 1 illustrates the response from ChatGPT when asked about the definition of CRS, and Figure 2 displays the ChatGPT-provided prevalence data for CRS. Figure 3 details the causes, Figure 4 the symptoms, Figures 5 and 6 the treatment options, and Figures 7 and 8 the postoperative complications of CRS, as described by ChatGPT.
We first report the accuracy of ChatGPT's answers to each of the questions.
For the definition question (Figure 1), ChatGPT defines CRS as it is commonly defined in the medical literature, with a symptom duration of at least 12 weeks required to establish the diagnosis.21 However, when compared with the ICAR guidelines, ChatGPT failed to mention that establishing a diagnosis of CRS requires two of four main symptoms (facial pressure/pain, anosmia/hyposmia, nasal obstruction/blockage, and mucopurulent nasal drainage) to be present for at least 12 weeks, in addition to objective evidence on physical exam (purulence from the paranasal sinuses or osteomeatal complex, polyps, edema, or evidence of inflammation on nasal endoscopy or CT), since symptoms alone have low specificity for CRS diagnosis.21-23 ChatGPT also failed to mention that the presence or absence of polyps further classifies CRS into CRSsNP (CRS without nasal polyps) or CRSwNP (CRS with nasal polyps), an important omission because treatment differs by disease subtype according to ICAR.23
As for the prevalence question (Figure 2), ChatGPT reported a prevalence of 2%–5% for CRS in the US and Europe and stated that the prevalence has been increasing over the past few years. However, ChatGPT's source for this figure is unclear, since many sources report a rate of >10% in the general population of the US and Europe.24, 25 ICAR, on the other hand, reports a prevalence in the range of 2.1% to 13.8% in the US and 6.9% to 27.1% in Europe.23 ICAR attributes the large gap between the lower and upper limits to the fact that the diagnosis of CRS requires objective evidence, which makes a true prevalence difficult to determine.23
When asked about the causes of CRS (Figure 3), ChatGPT's answer aligned with the ICAR guidelines in stating that the exact etiology of CRS involves multiple factors.23, 26-30 The ChatBot then proceeded to mention some of the cited causes, including lifestyle and environmental factors (e.g., occupational hazards).31 ChatGPT also explained how each factor can contribute to or increase the risk of developing CRS (e.g., structural abnormalities and nasal polyps can contribute to developing CRS by blocking the sinuses).32-34 In the literature, some lifestyle and environmental factors have been shown to affect the development of CRS, such as exposure to hair-care products, dust, fumes, cleaning agents, allergens, and even cold, dry, low-elevation environments.35-37 Another study even found a 2.5-fold increased risk of developing CRS with residential proximity (within a 2 km radius) to commercial pesticide application.38 However, according to the ICAR guidelines, the link between environmental or lifestyle factors and CRS is very weakly supported.23 The lifestyle factor most strongly associated with CRS, according to ICAR, is tobacco smoke exposure.23 Although ChatGPT mentioned that smoking is associated with CRS, it did not highlight the importance of this risk factor and listed it only as a potential factor. In other words, in this case ChatGPT's answer is not entirely correct according to ICAR but is correct according to other sources.
When asked about the symptoms of CRS (Figure 4), ChatGPT clearly provided the most common symptoms and even the less common ones according to ICAR.22, 23, 39 At the end, it also correctly provided some caveats: (1) Symptoms may not always be persistent and can also be intermittent in nature. (2) CRS symptoms may fluctuate in severity. (3) Some patients might have mild symptoms, but others may suffer severe symptoms that affect their quality of life. All of the caveats are consistent with ICAR.23, 30, 40
Concerning the treatment options for CRS (Figures 5 and 6), ChatGPT correctly answered that treatment could be medical or surgical.23, 41 The answer also contained a list of medically oriented treatments with accurate corresponding rationales and another list of procedures and surgical techniques.22, 23, 42-45 ChatGPT also delved into accurate details on how each procedure is performed.23, 44, 45 Again, ChatGPT aptly mentioned that the treatment plan may vary depending on each individual's case. However, ChatGPT listed mucolytics as part of the medical management of CRS, whereas ICAR provides no recommendation regarding their use because of insufficient evidence.23 ChatGPT also did not mention that most people with CRS report severe symptoms and a lack of satisfaction with their current treatment options.46
ChatGPT also identified the most common postoperative complications and provided useful information on their management (Figures 7 and 8).47 One of the most significant postoperative complications that ChatGPT identified was infection, which is a typical concern following any surgical operation48 and aligns with ICAR.23 This risk can be reduced, as ChatGPT correctly noted, by adhering to the proper postoperative care guidelines and taking prescribed antibiotics.23, 47-51 Prophylactic antibiotics have been shown in multiple studies, according to ICAR, to significantly reduce the incidence of postoperative infections in individuals undergoing sinus surgery.23, 49 In addition, ChatGPT, consistent with ICAR, mentioned persistent or recurrent symptoms such as anosmia, epistaxis, swelling, bruising, and scarring, and other common postoperative complications that patients may experience after surgery.23, 50, 51 In general, ChatGPT's response emphasizes the value of consulting a surgeon to learn about the risks and possible complications associated with the procedure as well as the anticipated length of recovery. To reduce the risk of complications, patients should also be provided with suitable postoperative care advice.47, 51
3.2 Quantitative analysis
3.2.1 Statistical analysis
We meticulously assessed the Flesch Reading Ease, Flesch–Kincaid Grade Level, and the percentage of passive sentences across three distinct categories: basic information and prevalence, causes and symptoms, and treatment and complications.
3.3 Overall readability metrics
Across all categories, our findings (Table 1) revealed an average Flesch Reading Ease score of 40.42 (SD = 9.43), signifying that the material is challenging for most readers. The Flesch–Kincaid Grade Level averaged 12.13 (SD = 1.45), indicating that the content requires a reading level of at least a high school senior. Passive constructions were employed in 18.62% (SD = 10.85%) of the sentences, suggesting moderate use of the passive voice.
Question | Flesch Reading Ease score | Flesch–Kincaid Grade Level | Passive sentences |
---|---|---|---|
Q1: Definition of chronic rhinosinusitis | 37.7 | 12.6 | 25% |
Q2: Prevalence of chronic rhinosinusitis | 43.7 | 11.7 | 20% |
Q3: Causes of chronic rhinosinusitis | 37.3 | 13 | 22.2% |
Q4: Symptoms of chronic rhinosinusitis | 57.7 | 9.4 | 0% |
Q5: Treatments for chronic rhinosinusitis | 30.9 | 13.4 | 31.2% |
Q6: Postoperative complications | 35.2 | 12.7 | 13.3% |
Mean ± Standard deviation | 40.42 ± 9.43 | 12.13 ± 1.45 | 18.62 ± 10.85% |
3.4 Category-specific readability metrics
Table 2 displays the readability metrics across the three question categories: basic information and prevalence, causes and symptoms, and treatment and complications. Mean Flesch Reading Ease, Flesch–Kincaid Grade Level, and passive-voice percentage were 40.7 ± 4.24, 12.15 ± 0.64, and 22.5% ± 3.54% for the basic information and prevalence category; 47.5 ± 14.42, 11.2 ± 2.55, and 11.1% ± 15.7% for the causes and symptoms category; and 33.05 ± 3.04, 13.05 ± 0.49, and 22.25% ± 12.66% for the treatment and complications category. These results show variation in Flesch Reading Ease and Flesch–Kincaid Grade Level scores, indicating differences in complexity between categories; the treatment and complications category, for example, has the lowest Flesch Reading Ease score and the highest Flesch–Kincaid Grade Level. The passive sentence percentages likewise highlight variations in writing style.
Category | Flesch Reading Ease score (mean ± standard deviation) | Flesch–Kincaid Grade Level (mean ± standard deviation) | Passive sentences (%) (mean ± standard deviation) |
---|---|---|---|
Basic information and prevalence | 40.7 ± 4.24 | 12.15 ± 0.64 | 22.5 ± 3.54% |
Causes and symptoms | 47.5 ± 14.42 | 11.2 ± 2.55 | 11.1 ± 15.7% |
Treatment and complications | 33.05 ± 3.04 | 13.05 ± 0.49 | 22.25 ± 12.66% |
3.5 Statistical significance
Applying ANOVA to these metrics yielded the following p-values: Flesch Reading Ease (p = .385), Flesch–Kincaid Grade Level (p = .555), and Passive Sentences (p = .601), all indicating no statistically significant differences in readability across the categories.
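For reference, these p-values can be approximately reproduced from the per-question values transcribed from Table 1 (two responses per category); a minimal sketch using scipy's f_oneway is shown below.

```python
# One-way ANOVA on each readability metric, using the per-question values from
# Table 1 grouped into the three study categories (two responses per category).
from scipy.stats import f_oneway

table1 = {
    "Flesch Reading Ease":        {"basic": [37.7, 43.7], "causes/symptoms": [37.3, 57.7], "treatment": [30.9, 35.2]},
    "Flesch-Kincaid Grade Level": {"basic": [12.6, 11.7], "causes/symptoms": [13.0, 9.4],  "treatment": [13.4, 12.7]},
    "Passive sentences (%)":      {"basic": [25.0, 20.0], "causes/symptoms": [22.2, 0.0],  "treatment": [31.2, 13.3]},
}

for metric, groups in table1.items():
    f_stat, p_value = f_oneway(*groups.values())
    print(f"{metric}: F = {f_stat:.2f}, p = {p_value:.3f}")
```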
3.6 Trend analysis
A trend analysis with standard deviations was conducted to visualize the readability shifts between categories (Figure 9). The Flesch Reading Ease scores displayed a nominal increase from basic information and prevalence to causes and symptoms but dropped in the treatment and complications category. The Flesch–Kincaid Grade Level dipped slightly for causes and symptoms before rising to its highest value for treatment and complications, with the overall trend pointing toward increasing textual complexity. The standard deviations highlighted the variability within each category, which was particularly pronounced in the causes and symptoms segment for both Flesch Reading Ease and Passive Sentences.
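A plot of this kind can be regenerated from the Table 2 summary values; the sketch below, which assumes matplotlib, plots the category means with standard-deviation error bars and is not the code used to produce Figure 9.

```python
# Sketch of a trend plot: category means with standard-deviation error bars,
# using the summary values reported in Table 2 (assumes matplotlib).
import matplotlib.pyplot as plt

categories = ["Basic info &\nprevalence", "Causes &\nsymptoms", "Treatment &\ncomplications"]
metrics = {
    "Flesch Reading Ease":        ([40.70, 47.50, 33.05], [4.24, 14.42, 3.04]),
    "Flesch-Kincaid Grade Level": ([12.15, 11.20, 13.05], [0.64, 2.55, 0.49]),
    "Passive sentences (%)":      ([22.50, 11.10, 22.25], [3.54, 15.70, 12.66]),
}

fig, ax = plt.subplots()
for label, (means, sds) in metrics.items():
    ax.errorbar(categories, means, yerr=sds, marker="o", capsize=4, label=label)
ax.set_ylabel("Score / percentage")
ax.set_title("Readability trends across question categories")
ax.legend()
plt.tight_layout()
plt.show()
```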
4 DISCUSSION
ChatGPT was able to respond to all questions, from defining CRS to describing the causes of the disease, its symptoms, and postoperative complications, and even detailing the roles that rehabilitation and patient education may play. The sentences in each response closely adhered to appropriate grammatical rules and sentence structure.
As for the statistical analysis, the average Flesch Reading Ease score for all categories combined was moderately low (Table 1). This score aligns with the style typically found in academic or professional documents, suggesting that, while ChatGPT's responses are informative, they may not be entirely accessible to individuals without a higher educational background.
Interestingly, the readability scores did not vary significantly across different content categories, as evidenced by the ANOVA test results. This consistency in complexity and readability is beneficial in one aspect, as it suggests that ChatGPT maintains a uniform level of language complexity regardless of the topic complexity. However, it also implies that the AI does not automatically adjust its language complexity in response to the varying difficulty levels of the subject matter. For instance, one might expect the language around basic information and prevalence to be more accessible than that regarding treatment and complications, which inherently deals with more complex medical procedures and concepts.
Even though ANOVA found no significant differences in the readability scores, it is important to mention that the increasing trend in the Flesch–Kincaid Grade Level across the categories may reflect the intrinsic complexity of the medical information as it progresses from basic definitions to detailed medical procedures and potential complications. However, notably, the causes and symptoms category demonstrated a higher Flesch Reading Ease and a lower Flesch–Kincaid Grade Level compared to the other categories (Table 2). The lower percentage of passive sentences in this category may contribute to its relative readability.
It is important to mention that relying solely on readability metrics to determine if medical material is appropriate for patients has significant drawbacks. Readability metrics like the Flesch–Kincaid assess linguistic simplicity but overlook critical aspects such as health literacy and content accuracy.
Health literacy involves understanding medical terms and concepts, which readability metrics do not measure. Even easy-to-read text can be confusing if it contains medical jargon or complex ideas that are not clearly explained. Furthermore, readability metrics do not ensure the accuracy of the information, which is crucial for patient safety and effective health management.
To create truly patient-appropriate medical material, a comprehensive approach is needed. This approach should combine readability assessments with considerations of health literacy and content accuracy. This means using plain language, explaining medical terms, incorporating visual aids, and having medical professionals review the content for accuracy and relevance.
In the context of patient education, it is also crucial to consider the health literacy of the audience. The National Assessment of Adult Literacy reports that only 12% of U.S. adults have proficient health literacy.52 Given that many adults may struggle with complex health information, the findings of this study suggest that there is a need for further optimization to enhance readability and ensure that the information is comprehensible to all patients, irrespective of their educational background. Future iterations of AI-driven platforms could focus on dynamically adjusting language complexity based on the user's comprehension level, potentially assessed through preliminary questions or interactive dialogue. Furthermore, incorporating visual aids and interactive elements could enhance understanding and engagement, particularly for topics that are inherently complex.53, 54
This study has a few limitations. The data were collected at a single moment in time, which poses a challenge in the rapidly changing domain of AI. Furthermore, the qualitative approach of this study inherently carries the potential for some level of investigator bias. The study also acknowledges the impact of variations in ChatGPT's responses due to differences in how questions are phrased, alongside the restricted range of question sources, as potential areas for further exploration. Future studies could benefit from comparing ChatGPT to other AI models to gain a broader perspective on its effectiveness and limitations in medical and patient education contexts. Future studies could also explore the variance in ChatGPT's responses over multiple instances and with follow-up questions; such work could investigate the consistency and reliability of the AI over repeated interactions, providing a more comprehensive understanding of its performance. Nonetheless, we believe that, despite these limitations, our study offers valuable insights into an information source that is increasingly prevalent in today's digital age.
ChatGPT offers several notable advantages in providing medical information. Its primary strength lies in accessibility; it allows users to obtain medical information quickly and easily, regardless of their location. This can be particularly beneficial for individuals in remote areas or those who face barriers to accessing health care professionals. The speed at which ChatGPT generates responses is another significant advantage, providing instant answers to medical queries.
However, there are significant disadvantages to using ChatGPT for medical information. One major issue is accuracy; the inaccuracies in some of ChatGPT's answers highlight the limitation that ChatGPT's responses are only as reliable as the data it has been trained on. Another disadvantage is the omission of critical details; for example, ChatGPT failed to mention the need for objective evidence in diagnosing CRS, which is crucial for accurate diagnosis and treatment planning. The readability and comprehensibility of ChatGPT's responses also pose a challenge, as the analysis revealed that its outputs are often at a high reading level, making them unsuitable for individuals with limited health literacy. Additionally, ChatGPT's responses lack the nuance and context-specific advice that human health care providers can offer, limiting its ability to tailor information based on individual patient histories or specific circumstances.
To enhance the quality of AI-generated medical information, several methods can be implemented. Integrating AI systems with verified medical databases like PubMed and Medline ensures reliance on current and reliable sources. Regularly updating training data with the latest medical research and guidelines can reduce outdated information. A human-in-the-loop approach, where medical professionals review AI-generated content, can identify and correct discrepancies. Improving the AI's contextual understanding and prioritization of critical clinical information can enhance response relevance and completeness. Increasing transparency about how responses are generated and sourced can also build trust and reliability.
Even though ChatGPT should never be considered a replacement for medical professionals' advice, it can enhance professional medical information by serving as a supplementary resource that provides preliminary information and answers common questions, preparing patients for consultations. It can also explain complex terms in simple language to improve understanding, although medical professionals should review its responses to ensure accuracy. By adjusting responses based on health literacy levels and incorporating visual aids, ChatGPT could make complex information more accessible.
In summary, while ChatGPT presents a promising tool for enhancing access to medical information and can serve as a useful starting point for patient education and general inquiries, it should not replace professional medical advice. Ensuring the accuracy, completeness, and readability of its responses, and providing contextually relevant and individualized information, are critical areas for future development. Also, the current level of language complexity highlights an area for improvement. To fully harness the educational potential of AI in health care, there must be a concerted effort to tailor the readability of content to the diverse needs of patients, ensuring that information is not only accurate but also accessible to those it intends to serve.
5 CONCLUSION
ChatGPT is a rapidly developing tool that may soon become an invaluable asset to the health care system. As of today, this tool may be useful for patients who have difficulty accessing medical information due to geographic or financial constraints. The AI ChatBot has a user-friendly interface and a unique ability to understand a patient's natural language. Nevertheless, the content generated by the ChatBot may be inaccurate, biased, or hard for many patients to understand; therefore, while promising, AI cannot yet be considered a reliable source of medical information, especially for patients with limited health literacy. Moreover, because each patient's case is unique, AI is not yet able to provide precise recommendations to individual patients in the way that human physicians can.
FUNDING INFORMATION
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
CONFLICT OF INTEREST STATEMENT
The authors declare that they have no conflict of interest.