Bridging the conversational gap in epilepsy: Using large language models to reveal insights into patient behavior and concerns from online discussions
Abstract
Objective
This study was undertaken to explore the experiences and concerns of people living with epilepsy by analyzing discussions in an online epilepsy community, using large language models (LLMs) to identify themes, demographic patterns, and associations with emotional distress, substance use, and suicidal ideation.
Methods
We analyzed 56 970 posts and responses to them from 21 906 users on the epilepsy forum (subreddit) of Reddit and 768 504 posts from the same users in other subreddits, between 2010 and 2023. LLMs, validated against human labeling, were used to identify 23 recurring themes, assess demographic differences, and examine cross-posting to depression- and suicide-related subreddits. Hazard ratios (HRs) were calculated to assess the association between specific themes and activity in mental health forums.
Results
Prominent topics included seizure descriptions, medication management, stigma, drug and alcohol use, and emotional well-being. The posts on topics less likely to be discussed in clinical settings had the highest engagement. Younger users focused on stigma and emotional issues, whereas older users discussed medical treatments. Posts about emotional distress (HR = 1.3), postictal state (HR = 1.4), surgical treatment (HR = .7), and work challenges (HR = 1.6) predicted activity in a subreddit associated with suicidal ideation, whereas emotional distress (HR = 1.5), surgical treatment (HR = .6), and stigma (HR = 1.3) predicted activity in the depression subreddit. Substance use discussions showed a temporal pattern of association with seizure descriptions, implying possible opportunities for intervention.
Significance
LLM analysis of online epilepsy communities provides novel insights into patient concerns often overlooked in clinical settings. These findings may improve patient–provider communication, inform personalized interventions, and support the development of patient-reported outcome measures. Additionally, hazard models can help identify at-risk individuals, offering opportunities for early mental health interventions.
Key points
- LLMs are used to analyze Reddit data, revealing unfiltered insights into epilepsy patient concerns such as stigma, treatment, and emotional challenges.
- Artificial intelligence identifies 23 key themes in epilepsy discussions, offering demographic and behavioral insights across a large online community.
- Substance use discussions show cyclic patterns linked to seizures, implying possible opportunities for targeted interventions.
- Emotional issues, postictal state, and work concerns are linked to higher risks of suicide and depression in epilepsy patients.
- Findings may improve patient–provider communication and optimize development of patient-reported outcome measures.
1 INTRODUCTION
Epilepsy is a neurological disorder that extends far beyond the occurrence of seizures, and health care providers in the field should strive to be acquainted with the condition's broad impacts on patients' lives.1-3 Understanding epilepsy from the perspective of those living with it daily can strengthen the patient–doctor relationship and ultimately improve care quality. However, communication barriers and limitations of the health care setting can prevent patients from fully discussing their concerns. As a result, many patients seek consultation outside the health care setting, with online platforms playing an increasingly central role.4-8 Online peer support has been recognized as a key resource for people living with epilepsy (PLWE), helping to reduce feelings of isolation, cope with stigma, and share valuable “experiential knowledge.”8-10
This paper aims to analyze discussions on the epilepsy “subreddit” (r/epilepsy), an online community on Reddit, to reveal issues of concern for PLWE in a more direct and uncensored manner than traditional health care interactions allow. Unlike small-scale qualitative approaches that seek detailed and nuanced insights by examining individual experiences,11-13 our approach offers a broader view by analyzing large volumes of posts and conversations from a multitude of individuals, enabling the identification of themes and issues at a community scale.
Reddit is a popular social media platform and online community, with more than 500 million user accounts,14 and was used by 22% of Americans as of 2023.15 On Reddit, users can initiate and engage in discussions with others based on common interests. It functions as a large network of individual communities, known as “subreddits,” each dedicated to a specific topic. These subreddits are denoted by the prefix “r/” (e.g., r/epilepsy). One of Reddit's central features is anonymity in posting, which can enable a comfortable environment for sharing concerns and seeking information.
Past work has analyzed different health-related communities on Reddit. De Choudhury et al. analyzed associations of posts in a subreddit related to depression with transition to a subreddit of people with suicidal ideation (r/suicidewatch).16 Yom Tov and Hochberg analyzed reported weight loss over time on diet subreddits.17 Similarly, Mazuz and Yom Tov analyzed posting on loneliness subreddits and reported the effect of loneliness on mental health outcomes.18 We follow the approach of the latter two papers to track users over time on Reddit.
Previous studies have examined online conversations related to epilepsy.19-22 He et al. analyzed 355 838 posts from epilepsy support platforms, using text mining methods, and found treatment issues to be discussed predominantly.20 Fazekas et al. analyzed 264 706 conversations from social media platforms, using natural language processing (NLP) and manual qualitative analysis.19 Major themes were identified, and lack of awareness about epilepsy was emphasized. Falcone et al. applied machine learning and NLP to explore 222 000 online conversations, to identify and analyze suicide-related discussions, and found age-related differences in attitude.21
In the current study, we expand on these past works in two important ways. First, we apply advanced large language models (LLMs) for deeper thematic analysis. Second, we explore cross-interactions between epilepsy-focused communities and other online communities, enabling the tracking of people over long time frames in a wide range of human behaviors.
2 MATERIALS AND METHODS
Our data were obtained from the social media website Reddit (www.reddit.com), where users communicate in groups, called subreddits, centered on topics of interest. Specifically, we focused on a subreddit devoted to epilepsy (r/epilepsy).
We extracted all posts to r/epilepsy made between 2010 (at which point the r/epilepsy subreddit was established) and 2023 (inclusive). This resulted in 56 970 posts from 21 906 users. Additionally, we extracted all posts and all comments made by the users who posted on r/epilepsy. We excluded users who made more than 200 posts during this period to prevent skewing the results due to overrepresentation. This resulted in 768 504 posts and 2 225 304 comments.
For each post and comment, we extracted the user identifier, time and date, text of the post, and any available description of the user (which could include medicines taken, age, gender, and other relevant information).
We used ChatGPT4-Turbo to label the posts from r/epilepsy by asking it two questions about each post. The first question asked the model to rate (on a scale of 0–10) whether the person making the post was likely diagnosed with epilepsy, and the second asked it to rate (on a scale of 0–10) whether this person had an epileptic seizure during the past month. The exact prompts to ChatGPT4 are given in Appendix A.
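Because the prompts request a single number between 0 and 10, each reply has to be turned into a numeric score before further analysis. The following is a minimal sketch of one way this could be done; the function name `parse_score` and the handling of unparseable replies are assumptions for illustration and are not taken from the study.

```python
import re
from typing import Optional


def parse_score(reply: str) -> Optional[int]:
    """Return the first standalone integer in 0..10 found in a model reply.

    Hypothetical helper: replies with no such number yield None so that
    they can be reviewed or discarded by the caller.
    """
    match = re.search(r"\b(10|[0-9])\b", reply)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    for reply in ["7", "Score: 10.", "I cannot tell"]:
        print(repr(reply), "->", parse_score(reply))
```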
To estimate the accuracy of these labels, a human labeler read 480 random posts and labeled them according to the same questions, albeit with a binary label (yes or no). The labeler was a PGY-5 neurology resident at Sheba Medical Center with experience treating PLWE at the epilepsy clinic, and any inconclusive cases were discussed with a senior epileptologist at the clinic. We compared the labels returned by ChatGPT4 with the human-generated labels and report the area under the receiver operating characteristic curve (AUC) below.
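The comparison between a 0–10 model score and a binary human label can be illustrated with a short sketch. It computes the AUC through its rank interpretation: the probability that a randomly chosen positive post receives a higher score than a randomly chosen negative post, with ties counted as one half. The function name and the toy data are assumptions for illustration, not the study's code or data.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    human = [1, 1, 0, 0]          # toy binary labels (yes = 1, no = 0)
    model = [9, 6, 7, 2]          # toy 0-10 scores
    print(auc(human, model))
```

The pairwise loop is quadratic in the number of posts, which is acceptable for a validation sample of a few hundred posts.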
A minority of the posts indicated the age and gender of the posting user, either explicitly (“male, 31 years old”) or through colloquial acronyms (e.g., “M31”). We developed a regular expression to capture this information and applied it to all posts that users made (i.e., in all subreddits). If a user reported their age or gender multiple times, we took the average year of birth and the most common reported gender. We report the accuracy of the capture, as compared to a human labeler who reviewed 100 random posts.
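The regular expression itself is not reproduced in the text, so the sketch below uses hypothetical patterns that cover only the two forms mentioned above (an explicit phrase such as “male, 31 years old” and an acronym such as “M31” or “27F”). It also illustrates the per-user aggregation described: the mean year of birth (post year minus reported age) and the most frequently reported gender. All names are assumptions for illustration.

```python
import re
from collections import Counter
from statistics import mean
from typing import Iterable, Optional, Tuple

# Hypothetical patterns; the actual expression used in the study is not given.
LONG = re.compile(r"\b(male|female)\b,?\s+(\d{2})\s+years?\s+old", re.IGNORECASE)
SHORT = re.compile(r"\b(?:([MF])\s?(\d{2})|(\d{2})\s?([MF]))\b")


def extract(text: str) -> Optional[Tuple[str, int]]:
    """Return (gender, age) if a self-description is found, else None."""
    m = LONG.search(text)
    if m:
        return m.group(1)[0].upper(), int(m.group(2))
    m = SHORT.search(text)
    if m:
        gender = m.group(1) or m.group(4)
        age = int(m.group(2) or m.group(3))
        return gender, age
    return None


def summarize(mentions: Iterable[Tuple[int, str, int]]) -> Tuple[float, str]:
    """Aggregate one user's (post_year, gender, age) mentions.

    Returns the mean year of birth and the most common reported gender.
    """
    mentions = list(mentions)
    birth_year = mean(year - age for year, _, age in mentions)
    gender = Counter(g for _, g, _ in mentions).most_common(1)[0][0]
    return birth_year, gender


if __name__ == "__main__":
    print(extract("male, 31 years old"))
    print(extract("27F here, diagnosed last year"))
    print(summarize([(2020, "F", 27), (2022, "F", 29)]))
```

Patterns this narrow favor precision over coverage, so a meaningful share of self-descriptions would go uncaptured.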
Finally, all posts that had a ChatGPT score greater than or equal to 7 for the first question above were classified into their topic using ChatGPT4o. This threshold was chosen as the maximum of Youden's J statistic of the receiver operating characteristic curve for the 100 random posts (see Results and Appendix C, Figures A1 and A2).
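Youden's J statistic is defined as sensitivity + specificity − 1, and the selected threshold is the one that maximizes it. The sketch below shows how such a threshold could be found for integer scores on a 0–10 scale against binary human labels. The function name, the tie-breaking rule (the lowest threshold wins), and the toy data are assumptions for illustration.

```python
from typing import Iterable, Sequence, Tuple


def youden_threshold(
    labels: Sequence[int],
    scores: Sequence[float],
    candidates: Iterable[int] = range(0, 11),
) -> Tuple[int, float]:
    """Return (threshold, J), where a post is positive if score >= threshold."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    best_t, best_j = None, float("-inf")
    for t in candidates:
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        # sensitivity = tp / pos; specificity = 1 - fp / neg
        j = tp / pos - fp / neg
        if j > best_j:           # strict comparison keeps the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


if __name__ == "__main__":
    human = [1, 1, 1, 0, 0, 0]    # toy labels
    model = [9, 8, 7, 6, 3, 2]    # toy 0-10 scores
    print(youden_threshold(human, model))
```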
Preliminary analysis identified 23 common themes for posts in r/epilepsy. These themes were as follows: adjustment to epilepsy, alcohol and drugs (as triggers), cognitive, consultation, diagnostic process, diet, driving, emotion, fatigue, handicap/restrictions, health care costs, health care experiences, other, postictal state, recreational drugs, relationships, seizure burden, seizure descriptions, side effects of treatment, stigma and misconceptions, treatment–medical, treatment–surgical, and work. The full prompt to ChatGPT4o and the full descriptions of the themes are reported in Appendix B.
The accuracy of the identified themes was tested by having one of the authors label the themes of 130 random posts and comparing these labels to the themes returned by ChatGPT4o.
We computed the hazard of posting in forums related to depression (r/depression) and to suicide (r/suicidewatch) as a function of the topics of posts in the epilepsy forum. A second set of hazard models augmented the topics with the age and gender of the poster, focused only on posts where this information was available. Posts by users who did not later post in a depression or suicidality forum were marked as censored. If multiple posts were made to these forums, only the first post was used.
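A hazard model of this kind needs, for every epilepsy-forum post, a follow-up duration and an indicator of whether the event (a first later post in the target forum) was observed. The sketch below shows one way such records could be assembled. The data layout, the function name, and the choice to censor at a fixed end-of-study date are assumptions for illustration; the text does not specify how censoring times were set.

```python
from datetime import datetime
from typing import Dict, List, Set, Tuple


def build_records(
    epilepsy_posts: List[Tuple[str, datetime, Set[str]]],
    target_posts: Dict[str, List[datetime]],
    study_end: datetime,
) -> List[dict]:
    """Build one survival record per epilepsy-forum post.

    epilepsy_posts: (user, timestamp, topics) for each post.
    target_posts: user -> timestamps of posts in the target forum.
    study_end: assumed censoring time for posts with no later event.
    """
    records = []
    for user, when, topics in epilepsy_posts:
        later = [t for t in target_posts.get(user, []) if t > when]
        if later:
            end, event = min(later), 1   # only the first later post counts
        else:
            end, event = study_end, 0    # censored
        records.append(
            {
                "user": user,
                "duration_days": (end - when).days,
                "event": event,
                "topics": sorted(topics),
            }
        )
    return records


if __name__ == "__main__":
    posts = [
        ("a", datetime(2020, 1, 1), {"emotion"}),
        ("b", datetime(2020, 12, 21), {"driving"}),
    ]
    target = {"a": [datetime(2020, 6, 1), datetime(2020, 1, 31)]}
    for row in build_records(posts, target, datetime(2020, 12, 31)):
        print(row)
```

Records in this shape, with topics expanded into indicator columns, could then be passed to a Cox proportional hazards implementation such as `CoxPHFitter` in the lifelines package; that fitting step is not shown here.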
The sentiment expressed in posts was quantified using NLTK's VADER sentiment analyzer, which provides a measure of the positive and negative sentiment expressed in each text.
3 RESULTS
The accuracy of ChatGPT4-Turbo in answering the first question, compared to a human labeler, was AUC = .79 (n = 480; see also Appendix C, Figures A1 and A2). Among posts where the answer to the first question was positive (n = 335), the accuracy in answering the second question was AUC = .77.
Age and gender were identified in 8763 posts (15% of total posts) made by 3710 users (17% of total users). The accuracy of age and gender identification was 99% (n = 100).
Average age of users was 29.1 years (minimum = 13, maximum = 99, SD = 12.9). User gender was 59% female, 41% male.
In the test data (n = 130), the average number of topics identified by the human labeler was 1.3, compared to 4.2 by the LLM. The average percentage of topics identified by the human labeler that were also identified by the LLM was 34%. The average percentage of topics identified by the LLM that were also identified by the human labeler was 18%.
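The two agreement percentages above are per-post overlaps averaged over the test posts: the share of human-assigned topics that the LLM also assigned, and the share of LLM-assigned topics that the human also assigned. A minimal sketch of this calculation follows; the function name, the toy labels, and the decision to skip posts with an empty label set are assumptions for illustration.

```python
from statistics import mean
from typing import List, Set, Tuple


def agreement(human: List[Set[str]], llm: List[Set[str]]) -> Tuple[float, float]:
    """Return (mean share of human topics found by the LLM,
               mean share of LLM topics found by the human)."""
    human_covered = [len(h & m) / len(h) for h, m in zip(human, llm) if h]
    llm_covered = [len(h & m) / len(m) for h, m in zip(human, llm) if m]
    return mean(human_covered), mean(llm_covered)


if __name__ == "__main__":
    human_labels = [{"emotion"}, {"work", "stigma"}]
    llm_labels = [
        {"emotion", "consultation", "fatigue", "cognitive"},
        {"work", "consultation"},
    ]
    print(agreement(human_labels, llm_labels))
```

When one labeler assigns far more topics per post than the other, as reported here (4.2 versus 1.3), the second share is bounded well below 100% even if every human-assigned topic is recovered.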
Table 1 shows the 10 most common websites posted in comments on the epilepsy forum. As the table indicates, seven of the 10 websites are authoritative.
Domain | Count |
---|---|
www.epilepsy.com | 694 |
www.epilepsy.org.uk | 145 |
www.youtube.com | 250 |
www.ncbi.nlm.nih.gov | 178 |
en.wikipedia.org | 58 |
www.amazon.com | 57 |
www.sciencedirect.com | 33 |
www.mayoclinic.org | 33 |
www.drugs.com | 33 |
www.cureepilepsy.org | 28 |
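A domain tally like the one in Table 1 can be produced by extracting URLs from comment text and counting their host names. The sketch below is one possible approach; the URL pattern and function name are assumptions for illustration, and the example comments are invented.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlparse

# Simple URL pattern (an assumption); stops at whitespace and common closing characters.
URL = re.compile(r"https?://[^\s)\]>\"']+")


def top_domains(comments: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Count host names of all URLs in the comments and return the n most common."""
    counts: Counter = Counter()
    for text in comments:
        for url in URL.findall(text):
            host = urlparse(url).netloc.lower()
            if host:
                counts[host] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    sample = [
        "See https://www.epilepsy.com/learn and https://www.epilepsy.com/x",
        "This video helped me (http://www.youtube.com/watch?v=1)",
    ]
    print(top_domains(sample))
```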
Figure 1A shows the frequency of LLM-labeled topics in posts. Demographics were available for 23% of the LLM-labeled posts. Figure 1B shows the ratio of posts by females compared to males, and Figure 1C the distribution of topics by age. In the latter, the likelihood of a post on each topic is computed for each age group and then divided by the average likelihood of that topic across all age groups. As the figure shows, the most common topics for discussion were consultation, seizure descriptions, and side effects of treatment. When stratified by gender, females were more likely to discuss surgical treatments and stigma, whereas males were more likely to discuss alcohol and drugs as triggers.
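The age normalization used for Figure 1C can be sketched as follows: for each topic, compute the fraction of posts in each age group that carry the topic, then divide by the mean of those fractions across age groups, so that values above 1 indicate overrepresentation in a group. The sketch assumes an unweighted mean across groups; the function name, group labels, and toy data are also assumptions for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


def relative_likelihood(
    posts: List[Tuple[str, Set[str]]]
) -> Dict[str, Dict[str, float]]:
    """posts: (age_group, topics) per post. Returns {topic: {age_group: ratio}}."""
    group_totals = Counter(group for group, _ in posts)
    topic_counts: Dict[str, Counter] = defaultdict(Counter)
    for group, topics in posts:
        for topic in topics:
            topic_counts[topic][group] += 1
    result = {}
    for topic, by_group in topic_counts.items():
        # Likelihood of the topic within each age group
        likelihood = {g: by_group[g] / group_totals[g] for g in group_totals}
        # Unweighted average across age groups (an assumption)
        average = sum(likelihood.values()) / len(likelihood)
        result[topic] = {g: p / average for g, p in likelihood.items()}
    return result


if __name__ == "__main__":
    toy = [
        ("<25", {"stigma"}),
        ("<25", {"stigma", "work"}),
        ("25+", {"work"}),
        ("25+", {"stigma"}),
        ("25+", {"work"}),
        ("25+", {"work"}),
    ]
    for topic, ratios in relative_likelihood(toy).items():
        print(topic, ratios)
```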

Figure 2 shows the average number of comments to posts on each topic. As the figure shows, posts on stigma and on recreational drugs receive the most responses in the epilepsy forum.

Figure 3 shows the percentage of posts per topic that express more positive sentiment than negative sentiment. As the figure shows, posts on surgical treatment, diet, and recreational drugs are usually more positive in their sentiment, compared to posts on fatigue, emotion, and postictal state, which are generally more negative in their sentiment.
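The quantity plotted in Figure 3 can be sketched as a simple aggregation. The code below assumes each post already carries sentiment scores in the dictionary format returned by VADER's `polarity_scores` (keys `pos`, `neg`, `neu`, `compound`) and computes, per topic, the percentage of posts whose positive score exceeds the negative score. The function name and toy values are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def share_positive(posts: List[Tuple[Set[str], Dict[str, float]]]) -> Dict[str, float]:
    """posts: (topics, scores) per post. Returns {topic: percent more positive than negative}."""
    totals: Counter = Counter()
    positive: Counter = Counter()
    for topics, scores in posts:
        for topic in topics:
            totals[topic] += 1
            if scores["pos"] > scores["neg"]:
                positive[topic] += 1
    return {topic: 100.0 * positive[topic] / totals[topic] for topic in totals}


if __name__ == "__main__":
    toy = [
        ({"diet"}, {"pos": 0.3, "neg": 0.1, "neu": 0.6, "compound": 0.5}),
        ({"diet", "fatigue"}, {"pos": 0.0, "neg": 0.4, "neu": 0.6, "compound": -0.6}),
    ]
    print(share_positive(toy))
```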

Figure 4 shows the likelihood of posts in subreddits related to alcohol or to recreational drugs (including cannabis, psychedelics, cognitive enhancers, euphoric agents, and hallucinogens), relative to the first post whose topic was recreational drug use and (separately) alcohol and drugs as triggers on the epilepsy forum. As the figures show, there was a decline in posting to drug and alcohol subreddits in the 6 months before people posted about recreational drug use, immediately returning to baseline after the question. In contrast, people were more likely to post about drugs and alcohol before they posted about drugs as triggers, after which they were much less likely to post about it.
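An analysis of this kind aligns each user's posts in other subreddits to the time of an index post on the epilepsy forum. The sketch below counts a user's posts in 30-day bins relative to the index post, where bin −1 covers the 30 days before it and bin 0 the 30 days after. It produces raw counts only; converting them to the likelihoods shown in Figure 4 would need a normalization that is not specified here. The function name, bin width, and window are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable


def monthly_offsets(
    index_time: datetime, other_posts: Iterable[datetime], window: int = 12
) -> Counter:
    """Count posts per 30-day bin relative to the index post.

    Bins run from -window to window - 1; posts outside the window are ignored.
    """
    counts: Counter = Counter()
    for when in other_posts:
        offset = (when - index_time).days // 30   # floor division handles negative days
        if -window <= offset < window:
            counts[offset] += 1
    return counts


if __name__ == "__main__":
    index_post = datetime(2021, 1, 1)
    drug_subreddit_posts = [
        datetime(2020, 12, 25),
        datetime(2021, 1, 10),
        datetime(2021, 3, 15),
    ]
    print(sorted(monthly_offsets(index_post, drug_subreddit_posts).items()))
```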

Table 2 shows the Cox proportional hazard model parameters for future postings in a suicide-related subreddit (SuicideWatch) and in a depression-related subreddit. As the table shows, suicidal ideation (posting to SuicideWatch) is associated with writing on emotional issues (hazard ratio [HR] = 1.28), postictal state (HR = 1.37), and work issues (HR = 1.64) and negatively associated with posting on driving (HR = .61) and surgical treatment (HR = .69). When demographics are added, younger users (HR = .98 for age) and males (HR = 1.09) are more likely to post on suicidal ideation. Additionally, emotional issues (HR = 1.48), (negative) health care experiences (HR = 1.43), and work issues (HR = 1.93) are positively associated with suicidal ideation, whereas surgical treatment (HR = .41) is negatively associated with it. Similar drivers are observed for the likelihood of posting in the depression subreddit.
Future post topic | Posting topic (only topics) | exp(B) | p | Posting topic (with demographics) | exp(B) | p |
---|---|---|---|---|---|---|
SuicideWatch | Driving | .607 | .002 | Emotion | 1.477 | .0004 |
SuicideWatch | Emotion | 1.282 | .001 | Health care experiences | 1.429 | .013 |
SuicideWatch | Postictal state | 1.373 | .004 | Treatment–surgical | .414 | .004 |
SuicideWatch | Treatment–surgical | .694 | .010 | Work | 1.933 | .0003 |
SuicideWatch | Work | 1.637 | .000 | Age | .978 | .0002 |
SuicideWatch |  |  |  | Gender | 1.087 | <10^−6 |
Depression | Consultation | .844 | .000 | Consultation | .880 | .040 |
Depression | Emotion | 1.485 | .000 | Diagnostic process | .756 | .021 |
Depression | Stigma and misconceptions | 1.327 | .003 | Emotion | 1.634 | <10^−6 |
Depression | Treatment–surgical | .614 | .000 | Health care experiences | 1.323 | .002 |
Depression |  |  |  | Relationships | 1.382 | .016 |
Depression |  |  |  | Stigma and misconceptions | 1.487 | .003 |
Depression |  |  |  | Treatment–surgical | .430 | .0005 |
Depression |  |  |  | Age | .986 | .000006 |
Depression |  |  |  | Gender | 1.0448 | .000001 |
Note: Gender is coded as 0 = female, 1 = male.
4 DISCUSSION
In this study, we analyzed hundreds of thousands of Reddit posts to explore the experiences and concerns of people with epilepsy, leveraging the power of big data and artificial intelligence. This approach allowed us to identify the most prominent topics and themes discussed, their prevalence in different demographics, and their associations with behaviors such as substance use, emotional distress, and suicidal ideation.
4.1 Thematic analysis
“Fellow people with epilepsy, share your experiences! I don't know anyone else with epilepsy and would love to hear your stories. I'll go first…”
All quotes are derived from the r/epilepsy subreddit and have been paraphrased while preserving the original meaning and intent.
Our initial analysis aimed to explore the different themes of discourse and their prevalence, essentially asking: “What do people with epilepsy talk about, and what concerns them the most?” Naturally, individual posts often encompassed multiple overlapping and interconnected themes.
The analysis revealed that one of the primary reasons for posting was to seek consultation, reflecting a strong need for advice through informal channels. A central concern for this sort of information gathering is the potential for misinformation, as demonstrated by various studies.22-24 Interestingly, our analysis showed that consultation discussions with the highest levels of engagement (determined by the number of comments per post) concerned topics that are arguably less likely to be addressed in the medical setting, including issues of stigma, drug and alcohol use, and impacts on interpersonal relationships. Concerns that would be more likely to come up in clinical visits, such as medication issues, showed lower engagement. This dynamic suggests that peer-to-peer advice via informal sources such as Reddit can complement, rather than compete with, professional health care advice.
Other prominent themes included medical treatment concerns, the diagnostic process, driving restrictions, seizure burden (including the postictal state), and surgical treatment options. A significant portion of conversations centered on seizure descriptions, often as a means for venting, seeking validation, or questioning symptoms. The popularity of certain topics over others may be attributed to factors such as their broad relevance (i.e., issues that affect most people living with epilepsy) and their emotional resonance.
4.2 Themes by subpopulations
“Just hanging in there… Age: 26; Attitude: Fair, but I'm smiling. It's been over a year since my epilepsy diagnosis. My memory has gotten a bit better, but I still zone out and lose time occasionally. I haven't had a seizure in six months, but there's always the chance of more. Does anyone else feel like they're just… hanging in there?”
The impact of epilepsy can vary significantly based on factors such as age, gender, and life stage.25, 26 Accordingly, our analysis uncovered notable differences in the themes discussed by various demographic groups within the epilepsy community. Older users were predominantly concerned with topics related to medication and treatment, perhaps reflecting a focus on managing the long-term aspects of their condition. In contrast, younger users were more engaged with discussions surrounding stigma, emotional challenges, and cognitive issues, indicating that these areas may be more relevant to those in earlier stages of life or with recent diagnosis.
Gender differences also emerged in our analysis. Male users were more likely to discuss topics related to drugs and alcohol, whereas female users were more involved in conversations about treatments and the stigma associated with epilepsy. This gendered divergence in discussion topics may offer a means of more tailored discussions in the office setting.
4.3 Exploring the theme of drugs and alcohol
“I had three seizures this morning and could use some encouragement. I'd been seizure-free for several months, but I overdid it last night with drinking. Now I'm back to square one, and it's frustrating. I'm in law school, where networking and social events revolve around alcohol. Any advice on how to cope?”
In our analysis, we categorized posts related to drug and alcohol use into two main themes: general discourse about substance use and mentions of these substances specifically as seizure triggers. A notable pattern emerged regarding the timing and progression of these discussions. Posts about general substance use, often focused on the possibility of consumption despite having epilepsy, were typically associated with fewer posts to drug and alcohol subreddits in the 6 months prior to a discussion of substance use, returning to baseline after it. We interpret these findings as people abstaining from drugs and alcohol use until asking about it on the epilepsy forum, after which, possibly due to reassuring responses, they return to use. In contrast, people who posted about having a seizure due to drugs or alcohol appeared to be more likely to use these substances before the seizure and less likely to do so after the event, with substance use discourse reemerging approximately 1 year later. This possibly indicates a period of abstinence (of approximately 1 year) after suffering the consequences of use.
These findings provide valuable insight into the temporal pattern of substance use in PLWE, suggesting a possible cyclical “loop”: discussing and using drugs and alcohol (for approximately 6 months), suffering a seizure, abstinence (for approximately 1 year), and renewed discourse. Part of this pattern may be attributed to responses received online by the community. This uncovers opportunities for targeted intervention and prevention efforts, namely the 6-month period leading up to seizure reports. By recognizing this period of increased vulnerability and curiosity, health care providers, support groups, and online communities can proactively offer guidance aimed at mitigating the risks associated with drug and alcohol use in individuals with epilepsy.
4.4 Information resources
Our analysis identified the websites and resources that users frequently referenced as sources of information, with the majority of these referrals pointing to formal sources such as medical websites, research articles, and professional organizations. This demonstrates the diverse array of external influences that shape the perspectives and knowledge of people with epilepsy, while also emphasizing the reliance on reputable information to inform discussions within the community.
4.5 Hazard models for suicide and depression
“Has anyone felt like they lost something after a seizure? After my recent one, I felt really depressed with some dark thoughts. A couple of days later, I still feel like I'm missing something, like a part of my happiness. Is that normal?”
The risk of depressive disorders is approximately twice as high in individuals with epilepsy compared to the general population,28 and the risk of suicidality is increased 2.60-fold.29 As part of the current analysis, we identified specific themes within the epilepsy subreddit that were significantly associated with posting in depression- and suicide-related subreddits, and by doing so developed hazard models for the identification of themes more likely to be linked to suicide and depression.
Users were more likely to post in the SuicideWatch subreddit (a community for peer support concerning suicidal thoughts) when their epilepsy subreddit posts involved emotional issues, the postictal state, or work-related concerns. Similarly, posts in the Depression subreddit were associated with themes of stigma and misconceptions, reflecting the emotional toll these social challenges impose on individuals with epilepsy. Conversely, users were less likely to post in the SuicideWatch or Depression subreddits after discussing surgical treatment, and less likely to post in SuicideWatch after discussing driving, possibly reflecting a sense of hope due to the potential for positive outcomes through treatment and perhaps reassurance from the often-temporary nature of driving restrictions.
Applying these models may help identify at-risk individuals based on the topics they engage with, both online and interpersonally, offering a tool for early detection and intervention.
4.6 Health care sentiments
“My doctor only seems interested in increasing my meds. I've been seeing this neurologist for years, and all they do is raise the dosage. No tests or anything. Recently, they even suggested I might just be sleepwalking! Is this normal for a neurologist to just up the meds without investigating further?”
Patient satisfaction and positive health care experiences are associated with better health outcomes in epilepsy.30 The current work's unfiltered view of the discourse among people with epilepsy provides valuable insights into sentiments toward health care systems, including patient–doctor relationships, trust, and experiences across various health care settings such as clinics, inpatient video-electroencephalographic monitoring, and emergency departments.
Overall, we found that sentiments related to health care experiences tend to be predominantly negative. However, a more in-depth and rigorous analysis is beyond the scope of the current article and is a potential area for future research.
4.7 Implementations
In addition to raising awareness of patient concerns among health care providers, one way these results may be utilized is by incorporating them into the development of patient-reported outcome measures (PROMs) in epilepsy practice. PROMs assess patients' perceptions of their own health status and quality of life to enhance patient–provider communication and treatment monitoring, and are widely used in clinical practice.31-35 However, their development sometimes lacks sufficient patient input regarding everyday concerns for PLWE.31 Some central topics of concern identified in this study, such as the issue of social stigma, are not represented in widely used PROMs such as QOLIE-31.36, 37
The findings and methodology of the current study offer an additional avenue for determining which measures and outcomes are significant for PLWE and most impactful on their lives, potentially adding to current methods of developing such metrics as PROMs. Additionally, specific questionnaires for identifying PLWE at risk of depression, suicidality, and substance use may be built upon the identified hazard models in this study.
4.8 Limitations
Although this approach offers a rich source of information, it is important to recognize its limitations. First, Reddit's user base does not represent the general population; it tends to be younger, with a higher proportion of male users, and is more likely to include individuals comfortable with technology.38, 39 Therefore, the perspectives shared on Reddit might not fully represent the diversity of experiences among PLWE. Additionally, research on online health communities has shown that only a small subset of users, known as “superusers” or “contributors,” generate the majority of posts, whereas most users primarily observe without directly contributing—a dynamic often described by the “90–9–1 rule,” where 90% are “lurkers,” 9% contribute occasionally, and 1% are highly active (although this notion has not been specifically studied in epilepsy populations).40, 41 This pattern may influence the data, as active participants may differ in characteristics, motivations, and experiences from the broader community. For instance, frequent posters may be those experiencing more severe conditions, seeking peer support, or driven by a desire to help others—characteristics that might not reflect the general population.42, 43
Another limitation to consider is the role of subreddit moderators—volunteers who help maintain order, set community guidelines, and remove noncompliant content. Their activities can influence both the quality and quantity of posts available for analysis, potentially shaping the content in ways that are difficult to quantify.
Although our approach achieved high precision in identifying age and gender (with an accuracy of 99%), the regular expression used may not have captured all possible mentions. Its specificity is therefore high, but its sensitivity is unknown, and some mentions of age and gender may have been missed. Furthermore, posts with identified age and gender constitute only approximately 15% of the total dataset, potentially representing a subgroup more inclined to share this information. Consequently, the generalizability of our findings may be limited, and conclusions should be interpreted with caution.
In addition, there is a discrepancy between the number of topics identified by the language processing model and those recognized by human evaluators. The model's high sensitivity to subtle linguistic variations can lead it to detect multiple subthemes within a single post, whereas a human evaluator might identify only the dominant theme and group related subthemes together based on perceived relevance. This difference may result in the model identifying more topics than are considered significant by human standards. Future refinements to the model could address this by adjusting its thematic sensitivity to better align with human evaluation.
A further limitation to consider is the accuracy of the models used to analyze the text. Although LLMs like ChatGPT4o represent the most advanced technology in the field at the time of writing, accuracies are far from perfect, and they may introduce errors in theme and sentiment identification, and contextual understanding. These inaccuracies can impact the reliability of the findings. Nonetheless, the substantial volume of data helps to mitigate these effects to some extent by averaging out inaccuracies. However, further improvement to the model is needed to enhance precision, particularly for nuanced themes critical for studies of this nature.
5 CONCLUSIONS
Discussions within online communities (in this case, Reddit) can offer invaluable insights into the lives of people with epilepsy, revealing themes that are often not openly addressed in traditional health care settings. Insights from the analysis may help in identifying topics of concern by age and gender groups, recognizing at-risk groups for depression and suicidality, and detecting temporal patterns of drug and alcohol use with possible interventional implications. These findings may also be incorporated into the development of patient-reported outcome measures to help bridge the communication gap between health care providers and patients and improve overall care.
ACKNOWLEDGMENTS
We would like to thank Dr. Sher Mazri for her invaluable guidance and support in the preparation of this article. We also extend our gratitude to the Neurology Department at Sheba Medical Center in Israel for their assistance and resources.
CONFLICT OF INTEREST STATEMENT
None of the authors has any conflict of interest to disclose. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.
APPENDIX A: ChatGPT4 prompts
Question 1: Here is a question posted on a social media group for people with epilepsy, whose family members have epilepsy, or those who are worried they have it. Please answer with a single number, on a scale of 0–10, whether this person is likely to have been diagnosed with epilepsy, where 0 means highly unlikely and 10 means very likely. If it is impossible to know or the post is unrelated to the person posting, please answer 0.
Question 2: Here is a question posted on a social media group for people with epilepsy, whose family members have epilepsy, or those who are worried they have it. Please answer with a single number, on a scale of 0–10, whether this person is reporting having had an epileptic seizure within the past month, where 0 means it is highly unlikely that he is reporting a seizure within the past month and 10 means very likely. If it is impossible to know or the post is unrelated to the person posting, please answer 0.
APPENDIX B: Prompt and description of themes, as provided to ChatGPT4o
- Adjustment to epilepsy: Posts about the mental, emotional, and logistical adjustments to having epilepsy, specifically the transition from a state of being without epilepsy to the state of having epilepsy. Also, the impact of a recent diagnosis of epilepsy on a person's sense of identity and self-esteem, and fears and uncertainties about what the future holds in the face of the condition.
- Alcohol and drugs (as triggers): Posts about drug and alcohol consumption causing seizures.
- Cognitive: Posts about the cognitive effects of epilepsy or its medications, including memory difficulties, concentration, planning, "mental fogging," et cetera. Include posts concerning descriptions of cognitive difficulties, queries about it, and its sequelae.
- Consultation: Posts in which the author wishes to consult with other people with epilepsy about different aspects of epilepsy and its consequences. This should include advice seeking, experience sharing, information requests, requests for resource recommendations, et cetera.
- Diagnostic process: Posts that concern any part of the diagnostic process, or journey, that people with epilepsy undergo (1) as part of the initial diagnosis of epilepsy; or (2) after the establishment of epilepsy diagnosis, such as further diagnostic testing to refine the diagnosis, for example, to characterize the epilepsy into subtypes, to explore surgical options, or to assess treatment response. This theme should NOT include diagnostic testing as part of a presurgical evaluation, because another theme exists for surgical evaluation.
- Diet: Posts concerning dietary changes or weight changes brought on by epilepsy or its medications.
- Driving: Posts concerning driving restrictions, being forbidden to drive due to having epilepsy, license revocation, the process of getting one's license back, and posts about the everyday difficulties of living without a driver's license.
- Emotion: Posts concerning emotional issues and the impact of the different aspects of epilepsy on emotional well-being. This should include but not be limited to depressive symptoms, anxiety, agoraphobia, et cetera.
- Fatigue: Posts about suffering from, and dealing with, chronic fatigue and tiredness as a consequence of epilepsy.
- Handicap/restrictions: Posts that deal with the restrictions that having epilepsy imposes on a person's life, for instance, not being able to scuba dive, not being able to swim unattended, or always having to sleep enough hours. This should not include driving restrictions, as this topic has its own category.
- Health care costs: Posts about the financial aspects of dealing with epilepsy, including costs of medications, diagnostic testing, and health care services such as ambulances and hospitalizations. Posts concerning the difficulties arising from high health care costs, or the need to compromise on older generation drugs due to the high costs of newer drugs, should be included.
- Health care experiences: Posts that concern the experiences of patients with the health care system. This should include experiences with doctors and neurologists, including different sentiments toward them. Examples include sentiments of trust or distrust toward doctors or the medical profession, dissatisfaction with the diagnosis or treatment recommended, and complaints of being mistreated, misdiagnosed, or not believed, or about not receiving enough attention by doctors or medical staff. Positive sentiments should be included as well. Another aspect of health care experiences to be included under this theme is general experiences in hospitals, wards, clinics, ambulances, et cetera. This theme does NOT include issues of health care costs, which is considered a separate theme.
- Other: Posts about the impact of having epilepsy on a person's quality of life, in aspects not fitting the other defined subcategories of quality of life.
- Postictal state: Posts that concern the postictal state, that is, hours following a seizure in which a person may be obtunded, extremely tired, confused, in pain, et cetera. Specifically, posts that depict the suffering from this state.
- Recreational drugs: Posts that concern the need to give up recreational drugs due to having epilepsy. This should include posts that describe the longing to use drugs that were once habitual for the person (e.g., marijuana) and queries about the safety of using these drugs.
- Relationships: Posts about the different difficulties of maintaining relationships with romantic partners or with friends and family in the face of epilepsy. Also, posts about the advantages of these relationships as support. Also, posts about the difficulties of initiating these relationships while dealing with epilepsy.
- Seizure burden: Posts that concern the harmful impact of seizures, especially high frequency of seizures, on a person's quality of life. For instance, the inability to maintain day-to-day functioning due to high frequency of seizures, and the aspect of dreading and anticipating further seizures.
- Seizure descriptions: Posts that describe seizures, prodromes, seizure auras, postictal states, and sequelae of seizures such as physical trauma. May include posts that describe paroxysmal neurological events that are not confirmed as epileptic.
- Side effects of treatment: Posts that concern the side effects of antiseizure medications, specifically the impact of said side effects on a person's quality of life. Examples include dizziness, imbalance, weight gain due to medications, and fatigue due to medications.
- Stigma and misconceptions: Posts about encountering, dealing with, and fearing epilepsy-related social stigma, myths and misconceptions about epilepsy, and discrimination (social, romantic, work-related, etc.).
- Treatment—medical: Posts that concern medical (i.e., pharmacological) treatment of epilepsy. This includes anything to do with antiseizure medications, including efficacy or lack thereof, tolerability, drug interactions, side effects, dosage concerns, drug recommendations, adherence issues, et cetera.
- Treatment—surgical: Posts that concern surgical treatment of epilepsy, any part of the presurgical evaluation, or the postsurgical results and sequelae. This theme should also include questions and consultations regarding surgery, including experiences, efficacy, concerns and hesitations, et cetera. The presurgical evaluation includes, but is not limited to, intracranial electroencephalography, interictal single photon emission computed tomography, positron emission tomography–computed tomography, and magnetoencephalography.
- Work: Posts about the impact of having epilepsy on work performance, posts about issues of workplace environment (e.g., supportive or nonsupportive environments), posts about the difficulty in maintaining a job or acquiring a job while having epilepsy, et cetera.
If more than one topic is relevant, please provide the list of relevant topics in descending order of importance in the post. The answer should be in the form of: topic X, topic Y, et cetera, without explanations. If no topic is relevant, please reply None.
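The prompt requests a comma-separated list of topics in descending order of importance, or "None". The sketch below shows one possible way to turn such a reply into an ordered list of known theme names. The function `parse_themes`, the case-insensitive matching, and the short `allowed` list in the usage example are assumptions for illustration, and in practice `allowed` would hold all theme names defined in this appendix.

```python
from typing import List, Sequence


def parse_themes(reply: str, allowed: Sequence[str]) -> List[str]:
    """Split a comma-separated reply into known themes, keeping the reply's order.

    Hypothetical helper (not from the study). Unknown labels and duplicates
    are dropped; a reply of "None" yields an empty list.
    """
    text = reply.strip()
    if text.lower().rstrip(".") == "none":
        return []
    # Map lowercase names to their canonical spelling for tolerant matching.
    lookup = {name.lower(): name for name in allowed}
    themes: List[str] = []
    for part in text.split(","):
        key = part.strip().rstrip(".").lower()
        name = lookup.get(key)
        if name is not None and name not in themes:
            themes.append(name)
    return themes


if __name__ == "__main__":
    # Illustrative subset of the theme names listed above.
    allowed = ["Driving", "Emotion", "Work", "Stigma and misconceptions"]
    print(parse_themes("Emotion, Driving", allowed))
    print(parse_themes("None", allowed))
```

Because the reply is ordered by importance, the first element of the returned list can be read as the post's primary theme. Splitting on commas is safe here only because none of the theme names in this appendix contains a comma.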
APPENDIX C: Receiver operating characteristic curve for detection of relevant posts
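For readers unfamiliar with how a receiver operating characteristic curve is derived from graded scores, the following sketch computes the curve and its area from 0–10 scores and binary human labels. The scores and labels in the usage example are invented toy values, not study data, and the function names `roc_points` and `auc` are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """Return one ROC point per distinct threshold, from strictest to loosest.

    A post is called positive when its score is >= the threshold.
    Labels are 1 (relevant according to the human rater) or 0 (not relevant).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points: List[Point] = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Point]) -> float:
    """Area under the ROC points, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Toy example only: six posts with made-up model scores and human labels.
    toy_scores = [9, 8, 7, 3, 2, 0]
    toy_labels = [1, 1, 0, 1, 0, 0]
    curve = roc_points(toy_scores, toy_labels)
    print(curve)
    print(round(auc(curve), 3))
```

Each point corresponds to one candidate cutoff on the 0–10 scale, so the curve shows the trade-off between sensitivity and false positive rate that any chosen cutoff implies.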


Open Research
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.