A core set of patient-reported outcome measures to measure quality of life in obesity treatment research
Funding information: The S.Q.O.T. II meeting was funded by Medtronic, Johnson and Johnson, Philips Vital Health, Novo Nordisk, Castor, and Bart Torensma. The S.Q.O.T. III meeting was funded by Medtronic, Johnson and Johnson, Novo Nordisk, Goodlife, and Fitforme. The funders of the meetings were not involved in the selection of the PROs or PROMs. None of the members of the S.Q.O.T. organizing committee (PD, CV, VM, BW, IJ, and RL) and none of the participants of the consensus meetings received payment for their participation. Meeting spaces, audiovisual support, catering, travel expenses, and overnight hotel stays for the meetings were supported by Johnson and Johnson (S.Q.O.T. II) and Medtronic, Johnson and Johnson, Novo Nordisk, Goodlife, and Fitforme (S.Q.O.T. III). Additionally, the moderator costs (CT) were supported by these sponsors. The sponsors played no role in the selection of participants of the meeting, in the selection of the domains and questionnaires included in this meeting, the voting rounds, or the writing of this paper, nor in any other research-related activity. The S.Q.O.T. organizing committee (PD, CV, VM, BW, IJ, and RL) did not receive payment for their work for the S.Q.O.T. initiative.
[Correction added on 4 February 2025, after first online publication: The copyright has been changed.]
Summary
The lack of standardization in patient-reported outcome measures (PROMs) has made measurement and comparison of quality of life (QoL) outcomes in research focused on obesity treatment challenging. This study reports on the results of the second and third global multidisciplinary Standardizing Quality of life measures in Obesity Treatment (S.Q.O.T.) consensus meetings, where a core set of PROMs to measure nine previously selected patient-reported outcomes (PROs) in obesity treatment research was established.
The S.Q.O.T. II online and S.Q.O.T. III face-to-face hybrid consensus meetings were held in October 2021 and May 2022. The meetings were led by an independent moderator specializing in PRO measurement. Nominal group techniques, Delphi exercises, and anonymous voting were used to select the most suitable PROMs by consensus. The meetings were attended by 28 and 27 participants, respectively, including a geographically diverse selection of people living with obesity (PLWO) and experts from various disciplines.
Out of 24 PROs and 16 PROMs identified in the first S.Q.O.T. consensus meeting, the following nine PROs and three PROMs were selected via consensus: BODY-Q (physical function, physical symptoms, psychological function, social function, eating behavior, and body image), IWQOL-Lite (self-esteem), and QOLOS (excess skin). No PROM was selected to measure stigma, as existing PROMs were deemed inadequate.
A core set of PROMs to measure QoL in research focused on obesity treatment has been selected incorporating patients' and experts' opinions. This core set should serve as a minimum to use in obesity research studies and can be combined with clinical parameters.
1 INTRODUCTION
Empirical research has reported a negative association between obesity and quality of life (QoL), where QoL tends to be lower among people living with obesity (PLWO).1-3 Improvement in QoL is a prominent driver motivating PLWO to undergo obesity treatment.4-6 While QoL is a key outcome in obesity treatment,5 it has not been measured consistently in research and is often assessed with generic questionnaires. Hence, the exact effect of obesity treatment on QoL is largely unknown.
To adequately measure QoL, two requirements must be met. First, the patient-reported outcomes (PROs) that are examined (that is, what to measure) should reflect the domains of QoL that matter most to PLWO and healthcare professionals (HCPs).7, 8 Second, it is crucial that patient-reported outcome measures (PROMs) (that is, how to measure) have sufficient evidence for reliability and validity.8-10 The lack of consensus on which PROs to measure has led to measurement of outcomes of questionable relevance, as well as wide variation in outcome reporting,10-12 precluding interpretation and synthesis of existing data. Moreover, a diverse array of PROMs is used, many of which have limited evidence for their reliability and validity.11, 13-15 Consequently, QoL is often not adequately measured in PLWO undergoing obesity treatment, thus hampering evidence-based decision-making.12
To address the absence of a core set of PROMs in current research focused on obesity treatment, the Standardizing Quality of life measures in Obesity Treatment initiative (S.Q.O.T.) was founded.16 It aims to establish consensus-based, standardized sets of PROMs that can be used in obesity treatment (research, clinical practice, and registries) across diverse settings, facilitating comparison and the accumulation of meaningful knowledge in this domain. These core sets were selected using systematic reviews and consensus meetings whose participants included PLWO and HCPs, following the rigorous Core Outcome Measures in Effectiveness Trials (COMET) guideline.17 The importance of such a core set of PROs and PROMs in obesity treatment research lies in its ability to ensure consistent and comprehensive assessment of treatment outcomes, enhance the comparability of studies, improve the quality of evidence generated, and ultimately guide clinical decision-making and policy development. By standardizing outcome measures, researchers can better identify effective interventions, understand patient experiences, and address the multifaceted impacts of obesity on QoL.18
In the first S.Q.O.T. multidisciplinary international meeting, 24 PROs (i.e., what to measure) were identified through multiple prioritization surveys, with consensus achieved on nine PROs.15 These include self-esteem, physical function and symptoms, mental/psychological function, social function, stigma, eating behavior, body image, and excess skin. A systematic review identified 16 PROMs (i.e., how to measure) for assessing these PROs.12, 15 The following five PROMs were selected for potential inclusion in the core set by 35 participants of the first S.Q.O.T. meeting15: BODY-Q,19 Impact of Weight on Quality of Life-Lite (IWQOL-Lite),20, 21 Quality of Life for Obesity Surgery (QOLOS),22 36-Item Short Form Health Survey (SF-36),23 and Obesity-related Problems scale (OP-Scale).24 The current study describes the results of the second and third global multidisciplinary S.Q.O.T. consensus meetings, where one PROM was selected for the assessment of each PRO, and the definitive core set of PROMs for research in obesity treatment was established based on available evidence and consensus by experts and PLWO.
2 MATERIALS AND METHODS
2.1 Organization and ethical approval
The selection of the core set of PROMs for use in obesity treatment research involved two international multidisciplinary consensus meetings. The S.Q.O.T. II online consensus meeting was held on October 27, 2021, and the S.Q.O.T. III hybrid consensus meeting was held on May 2 and 3, 2022, online and in person in Maastricht, The Netherlands. Ethical approval was obtained from the regional institutional review board (Medical research Ethics Committees United, The Netherlands, reference number W21.227). All participants provided written or oral consent. Six authors held a board function for the S.Q.O.T. Initiative and were involved only in organizing the logistics of the consensus meetings (PD, CdV, VM, IJ, RL, BvW).16
2.2 Recruitment of participants
Participants of the consensus meetings included PLWO and HCPs. Recruitment of participants for the consensus meetings was performed in various ways. Participants of the first S.Q.O.T. consensus meeting were invited through email.15 PLWO were identified through patient organizations or patient representative networks. HCPs were identified through professional networks of the organizers. HCPs were asked if they knew PLWO who were eligible to participate in the consensus meeting. The PLWO and HCPs were required to have a proficient level of English language comprehension. HCPs who were invited to participate had expertise in the treatment of obesity and patient-centered outcomes research, clinical trials, registries, quality improvement, and/or healthcare policy. PLWO who were invited to participate were mostly involved in patient representative networks. Those willing to participate in the consensus meetings were asked to register for the meeting through email or the S.Q.O.T. initiative website: https://www.sqotinitiative.com/.
2.3 Consensus meetings
The consensus meetings consisted of small group discussions and large group discussions using nominal group techniques, Delphi exercises, and anonymous voting through VoxVote (an online live voting system that allows anonymous voting through computer or smartphone).25 Both consensus meetings were led by an independent moderator specialized in the development of PROMs and core outcome sets (COS). The meetings started with a general presentation on the objectives of the S.Q.O.T. initiative, outcomes of prior meetings, and a general explanation of reliability and validity to ensure all participants were familiar with the terminology during group discussion and voting. In addition, characteristics (e.g., costs, languages available, and psychometric properties) of all PROMs discussed were presented (Table 1).
PROM | Content | Reliability | Validity | Feasibility |
---|---|---|---|---|
IWQOL-Lite | Adequate | Good | Good | 81 translations; costs |
BODY-Q | Good | Good | Good | 18 translations; no costs; reduction of the items ✓ |
SF-36 | Inadequate | Unknown | Good | 29 translations; costs |
OP-Scale | Inadequate | Unknown | Good | 5 translations; costs unknown |
QOLOS | Good | Good | Unknown | 2 translations; costs unknown |
- Abbreviations: IWQOL-Lite, Impact of Weight on Quality of Life-Lite questionnaire; OP-Scale, Obesity-related Problems scale; QOLOS, Quality of Life for Obesity Surgery questionnaire; SF-36, 36-Item Short Form Health Survey.
2.3.1 S.Q.O.T. II online international consensus meeting
The aim of the S.Q.O.T. II consensus meeting was to select a core set of PROMs (i.e., how to measure) in obesity treatment research. This meeting was preceded by the first S.Q.O.T. consensus meeting, where a standard set of PROs (i.e., what to measure) was selected.15 Prior to the S.Q.O.T. II meeting, a preparatory online survey was sent to all registered participants using Google Forms.26 In this survey, informed consent for the preparatory survey and consensus meeting and demographic information were collected, and participants were asked to indicate which PROM they found most relevant, comprehensive, and comprehensible out of the five PROMs that were selected in the first S.Q.O.T. meeting. The participants were also asked if they would make changes to the PROM that was selected (e.g., remove a PRO or add a PRO from another PROM). Thirty persons filled out the preparatory survey including 18 HCPs (seven surgeons, three psychologists, two endocrinologists, two dietitians, two researchers, and two other physicians specialized in obesity treatment) and 12 PLWO from 11 different countries divided over four continents.
The S.Q.O.T. II consensus meeting took place online through the video conferencing platform Zoom.27 A list in PDF format of available PROMs for each PRO was shared with all participants prior to the meeting, with a general overview of the content, reliability, validity, and feasibility of each PROM (Table 1). Smaller group discussions about each PRO were held in breakout rooms in 5-min intervals, with a maximum of five people per breakout room, to discuss the most suitable PROM (with separate breakout rooms for HCPs and PLWO). Subsequently, anonymous voting was initiated to select one PROM for each PRO. Twenty-eight persons participated in the S.Q.O.T. II meeting, including 10 PLWO and 18 HCPs from 11 different countries divided over four continents (Table 2). Two PLWO who completed the preparatory survey did not participate. The HCPs consisted of seven bariatric surgeons, three psychologists, three dietitians, two endocrinologists, one researcher, one plastic surgeon, and one other physician specialized in obesity treatment.
Country | Healthcare providers, n | People living with obesity, n | Total, n |
---|---|---|---|
United Kingdom | 6 | 4 | 10 |
United States of America | 5 | 2 | 7 |
The Netherlands | 3 | 3 | 6 |
Australia | 4 | 0 | 4 |
Ireland | 2 | 2 | |
Kuwait | 1 | 1 | 2 |
Canada | 1 | 1 | 2 |
Switzerland | 1 | 1 | |
Belgium | 1 | 1 | |
France | 1 | 1 | |
Sweden | 1 | 1 | |
Mexico | 1 | 1 | |
Brazil | 1 | 1 | |
Germany | 1 | 1 | |
Denmark | 1 | 1 | |
Total | 27 | 14 | 41 |
2.3.2 S.Q.O.T. III hybrid international consensus meeting
A combined face-to-face and online consensus meeting was held over 2 days. The aims of the S.Q.O.T. III consensus meeting were (1) to select a PROM for the PRO stigma, (2) to elaborate on discussion points from the S.Q.O.T. II meeting and finalize the core set of PROMs for obesity treatment research, and (3) to discuss implementation and dissemination of the core set. During the S.Q.O.T. III consensus meeting, we also finalized the core set of PROMs for clinical practice. The selection of a PROM for the PRO stigma and selection of a core set of PROMs for obesity treatment research are described in this article. The core set for clinical practice will be published separately. Twenty-seven persons participated in the hybrid S.Q.O.T. III meeting, including nine PLWO and 18 HCPs from 12 different countries divided over five continents (Table 2). The HCPs consisted of six bariatric surgeons, two psychologists, three dietitians, two endocrinologists, three researchers, one plastic surgeon, and one physician specialized in obesity treatment. Seven (26%) participated online. Thirteen (48%) participants also participated in the S.Q.O.T. II consensus meeting.
2.3.3 Stigma
During the first S.Q.O.T. meeting in 2019, stigma was selected as a PRO. However, no PROM regarding stigma was identified through the systematic review on PROMs for obesity treatment research.15 Therefore, a new review was conducted in PubMed on May 21, 2022, focusing only on PROMs to measure stigma in obesity treatment, following the methodology of our previous systematic review and the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) guideline.28 This contrasts with our previous search, where PROMs were reviewed for all PROs. One member of the S.Q.O.T. Initiative team (PD) performed the literature search, resulting in 671 articles (Figure S1). First, articles were screened on title and abstract and removed if no stigma PROM was reported. Second, full-text articles were read to search for stigma PROMs. The PROMs were extracted and reviewed by two members of the team, PD and CdV. A total of three stigma PROMs were extracted and discussed in the consensus meeting, and one additional PROM was brought to our attention by one of the participants who had extensive experience in stigma research. The results are presented in Appendix S1.
2.4 Consensus
A PROM was selected if the majority of participants voted for that PROM. When more than 70% of PLWO voted in favor of or against a specific PRO or PROM, their vote overruled the total number of votes. This process is described in more detail in our previous publication regarding the S.Q.O.T. I consensus meeting.15 Key quotes are included in the Results section to highlight the main findings and provide additional context. The organizers and moderator were not permitted to influence the discussions or voting rounds and functioned only as facilitators during the consensus meetings. HCPs with a conflict of interest (e.g., involvement in the development of a PROM being considered) were not allowed to participate in group voting for that PROM.
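The decision rule can be sketched in code. The following Python snippet is an illustration only, not the procedure implemented in VoxVote: the function name `select_option`, the vote-count dictionaries, and the example counts are hypothetical. It assumes that "majority" means more than half of all votes cast and that a vote against every candidate PROM is recorded as a separate option (here `"none"`).

```python
from collections import Counter

# From the text: more than 70% of PLWO votes overrules the overall count.
PLWO_OVERRULE_THRESHOLD = 0.70


def select_option(hcp_votes, plwo_votes, majority_threshold=0.5):
    """Return the selected option, or None if no option reaches consensus.

    hcp_votes and plwo_votes map an option label to a number of votes.
    majority_threshold is an assumption (strict majority of all votes).
    """
    plwo_total = sum(plwo_votes.values())

    # Step 1: a PLWO share above 70% for one option overrules everything else.
    for option, count in plwo_votes.items():
        if plwo_total and count / plwo_total > PLWO_OVERRULE_THRESHOLD:
            return option

    # Step 2: otherwise, look for a majority among all participants.
    combined = Counter(hcp_votes) + Counter(plwo_votes)
    total = sum(combined.values())
    for option, count in combined.items():
        if total and count / total > majority_threshold:
            return option

    return None  # no consensus; a further discussion and voting round is needed


# Hypothetical counts: 6 of 8 PLWO (75%) choose "none", which overrules the HCPs.
print(select_option({"PROM A": 10, "PROM B": 3, "none": 4},
                    {"PROM A": 2, "none": 6}))
```

In this sketch the PLWO check runs first, so a clear PLWO preference decides the outcome even when most HCPs favor another option.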
3 RESULTS
3.1 How to measure—Selection of the core PROM set
The BODY-Q29 was most frequently chosen in the preparatory survey as the most suitable PROM overall (n = 23, 77%), followed by the IWQOL-Lite30 (n = 7, 23%). More HCPs than PLWO preferred the BODY-Q over the IWQOL-Lite (83% vs. 67%). A core set was subsequently agreed on in the consensus meetings, in which one PROM was selected for each of the previously selected PROs (Table 3).
PROs | First selection of PROMs | Definitive selection of PROMs for research |
---|---|---|
Self-esteem | IWQOL-Lite | IWQOL-Lite |
Physical health | | |
Physical functioning | SF-36, IWQOL-Lite, BODY-Q | BODY-Q |
Physical symptoms | BODY-Q | BODY-Q |
Mental/Psychological health | BODY-Q | BODY-Q |
Social health | OP-Scale, IWQOL-Lite, BODY-Q | BODY-Q |
Stigma | - | - |
Eating | | |
Eating behavior | BODY-Q | BODY-Q |
Eating-related distress | BODY-Q | - |
Body image | BODY-Q, QOLOS | BODY-Q |
Excess Skin | BODY-Q, QOLOS | QOLOS |
- Abbreviations: IWQOL-Lite, Impact of Weight on Quality of Life-Lite questionnaire; OP-Scale, Obesity-related Problems scale; QOLOS, Quality of Life for Obesity Surgery questionnaire; SF-36, 36-Item Short Form Health Survey.
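For readers who want to apply the definitive column of Table 3 in a study database, the selection can be written down as a simple lookup. The sketch below is illustrative only: the dictionary name `CORE_SET` and the helper `proms_required` are hypothetical, and `None` marks the PROs for which no PROM was selected.

```python
# Definitive PROM per PRO, transcribed from Table 3.
# None means that no PROM was selected for that PRO.
CORE_SET = {
    "Self-esteem": "IWQOL-Lite",
    "Physical functioning": "BODY-Q",
    "Physical symptoms": "BODY-Q",
    "Mental/psychological health": "BODY-Q",
    "Social health": "BODY-Q",
    "Stigma": None,
    "Eating behavior": "BODY-Q",
    "Eating-related distress": None,
    "Body image": "BODY-Q",
    "Excess skin": "QOLOS",
}


def proms_required(pros):
    """Return the sorted list of questionnaires needed to cover the given PROs."""
    return sorted({CORE_SET[p] for p in pros if CORE_SET[p] is not None})


# Questionnaires needed to cover every PRO in the table.
print(proms_required(CORE_SET))
```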
3.1.1 Self-esteem
The IWQOL-Lite (self-esteem subscale)30 was selected as the most suitable PROM for self-esteem (20/27 votes [74%]; HCP 81%, PLWO 56%). However, the wording of this measure, which includes the attribution “because of my weight,” was disliked by most participants, because more things than weight alone influence self-esteem. Some PLWO preferred “because of my obesity”; however, the HCPs indicated that “because of my obesity” would not allow people with a lower BMI to reflect on their self-esteem, and the empirical literature has generally found that PLWO dislike the term “obesity,” as it is experienced as stigmatizing. Another point of discussion was the use of questions with a negative valence in the IWQOL-Lite (self-esteem subscale; e.g., because of my weight, I do not like myself), while the BODY-Q29 generally makes use of positive wording (e.g., I feel happy). Most PLWO indicated that they preferred positive wording, as it made them reflect less often on negative experiences. Nonetheless, the IWQOL-Lite (self-esteem subscale) was selected for this PRO.
“Obesity has been recognized as a chronic disease by the World Health Organization. It does not matter what BMI I will have; obesity is a disease I will spend the rest of my life living with. Therefore, a scale such as the IWQOL-Lite (self-esteem subscale) should not be centered around ‘weight’ only.” (Quote from one PLWO participant)
3.1.2 Physical health
The PRO physical health consists of two subdomains: physical functioning and physical symptoms. For physical functioning, the BODY-Q (physical function subscale)29 was selected (16/27 votes [59%]; HCP 57%, PLWO 60%). For physical symptoms, the BODY-Q (physical symptoms subscale) was selected (24/28 votes [86%]; HCP 88%, PLWO 80%). In the group discussions, both subdomains of physical health were deemed important by HCPs and PLWO. The PLWO especially liked the BODY-Q physical symptoms questions because they cover skin rash, infection, and perspiration issues. HCPs preferred the BODY-Q physical functioning over the IWQOL-Lite physical functioning30 because they considered the phrase “because of my weight” misleading or inappropriate (see previous comments for self-esteem).
“There are more things than weight alone that might influence physical health such as excess abdominal fat, excess skin and skin rash. Therefore, we prefer the BODY-Q because it makes use of the phrase ‘because of my body’.” (Quote from one HCP participant)
3.1.3 Mental/psychological health
The BODY-Q (psychological function subscale)29 was selected to measure psychological health (17/26 votes [65%]; HCP 75%, PLWO 50%). Several PLWO questioned whether the BODY-Q psychological health was specific enough for obesity treatment research. A point of discussion was the recall period of only 1 week: “thinking of the past week.” Whereas PLWO considered 1 week to be quite short as their QoL differs week by week, the HCPs indicated that 1 week was an adequate recall period as you can remember how you felt last week, but not much longer than that. It was decided that “1 week” was an optimal recall period.
“Looking at my body can undeniably influence my psychological well-being for the duration of that day. I firmly believe that there is a significant interrelation between one's mental health and physical state. From my perspective, the BODY-Q (psychological function subscale) is fundamental as a post-bariatric patient”. (Quote from one PLWO participant)
3.1.4 Social health
First, the IWQOL-Lite (public distress subscale)30 was selected to measure social health (12/26 votes [46%]; HCP 38%, PLWO 56%). PLWO found the BODY-Q29 too general, while the IWQOL-Lite public distress scale was perceived as more specific. However, in further discussions, both HCPs and PLWO concluded that the PRO social health was not adequately covered by the IWQOL-Lite (public distress subscale). Therefore, the moderator decided on a second voting round. The BODY-Q (social function subscale) was then selected as the most suitable PROM for measuring the broader concept of social health (17/24 votes [71%]; HCP 62%, PLWO 100%).
“Social function should cover items on friendships, relationships, social functioning and social participation” & “The impact of obesity on close relationships is missing in every PROM, but the questionnaire that covers most aspects of social health is the BODY-Q (social function subscale).” (Quotes from multiple HCP participants)
3.1.5 Eating
The PRO eating consisted of two subdomains: eating behavior and eating-related distress. The BODY-Q (eating behavior subscale)29 was selected to measure eating behavior (21/22 votes [96%]; HCP 91%, PLWO 100%). This PROM was preferred by almost all participants and included most items that HCPs considered important. For eating-related distress, no PROM was selected (7/23 votes [30%] in favor of the BODY-Q; HCP 50%, PLWO 10%). The BODY-Q (eating-related distress subscale) was disliked by all PLWO, as the questions were considered to be stigmatizing.
“The BODY-Q eating-related distress subscale is stigmatizing, judgmental and assuming that you are only dealing with one stage of obesity.” (Quote from one PLWO participant)
3.1.6 Body image
The BODY-Q (body image subscale)29 was selected to measure body image (22/27 votes [82%]; HCP 81%; PLWO 90%). In the group discussion, PLWO decided that they preferred the BODY-Q body image, because it incorporates positive questions rather than negative questions. Most HCPs also preferred the BODY-Q.
“‘My body is not perfect, but I like it’ is very realistic and, in my opinion, well suited to be used in research.” (Quote from one HCP participant)
3.1.7 Excess skin
First, the BODY-Q (excess skin subscale)29 was selected to measure excess skin (12/20 votes [60%]; HCP 80%; PLWO 44%). Voting revealed a considerable difference between the preference of PLWO and HCPs, as the majority (56%) of PLWO voted for the QOLOS excess skin22 (vs. 20% of HCPs). After group discussion, three HCPs indicated that they wanted to change their vote because the PLWO convinced them that the QOLOS (excess skin subscale) provides a broader overview of excess skin, including the negative physical consequences of excess skin (rather than as an issue of esthetics alone). A re-voting round was held: the QOLOS (excess skin subscale) was then selected to measure excess skin (13/17 votes [77%]; HCP 63%; PLWO 88%).
“The BODY-Q (excess skin subscale) covers the esthetics whereas the QOLOS (excess skin subscale) zooms into what is really important: pain, infections, how excess skin makes you feel. The BODY-Q (excess skin subscale) only covers the abdomen while the QOLOS (excess skin subscale) covers the whole body. However, we dislike the term hanging skin and therefore, both questionnaires are imperfect.” (Quote from one PLWO participant)
3.1.8 Stigma
Participants concluded that two things should be measured regarding the PRO stigma: first, how one's weight makes one feel (self- or internalized stigma), and second, how this feeling impacts one's behavior and social experiences. The Weight Self-Stigma Questionnaire (WSSQ)31 consists of two subscales and was therefore split into two questionnaires for the voting round: WSSQ part one and WSSQ part two. Four options were available for voting: (1) WSSQ part one, (2) WSSQ part two, (3) Stigmatizing Situations Inventory Brief version (SSI-B),32 and (4) none of the available PROMs. The WSSQ part two (12/27 votes [44.4%]; HCP 59%; PLWO 25%) and no questionnaire (12/27 votes [44.4%]; HCP 24%; PLWO 75%) were selected most frequently. Ultimately, no PROM was selected because more than 70% of PLWO voted for this option. Participants disliked the WSSQ and SSI-B because these PROMs cannot be used longitudinally (not all items can potentially change over time). Furthermore, PLWO disliked the wording and phrasing. Some HCPs with expertise in stigma suggested another questionnaire, namely, the Weight Bias Internalization Scale, Modified Version (WBIS-M).33 This questionnaire was initially not selected in the literature review because it was not validated in PLWO undergoing obesity treatment. In the group discussion, the WBIS-M was deemed inadequate by both PLWO and HCPs, primarily because it measures only internalized weight stigma and not general experiences of stigma.
“In my opinion, self-internalized stigma and the impact of stigma are most important to measure. Once, while walking on the streets, I noticed somebody was really staring at me, which my friend pointed out. I was not particularly concerned about the stranger's perception of me, it only reflected how I thought about myself. At that point in time, I hated myself because of my weight. It is crucial for a stigma questionnaire to be capable of capturing this sentiment and its evolution over time.” (Quote from one PLWO participant)
4 DISCUSSION
The purpose of this study was to select a core set of PROMs to measure QoL in obesity treatment research. Based on the results of the S.Q.O.T. consensus meetings, the following PROM subscales were selected to measure nine previously selected PROs: BODY-Q (physical functioning, physical symptoms, psychological function, social function, eating behavior, body image), IWQOL-Lite (self-esteem), and QOLOS (excess skin). For stigma, no PROM was selected, as existing PROMs were considered inadequate by the participants. A geographically diverse selection of PLWO and HCPs from various disciplines contributed to the selection of this set, and the perspectives of PLWO were integrated in every phase of the project.
To the best of our knowledge, the core set presented here is the first (multidisciplinary) standard set of PROMs to be used in research evaluating obesity treatment. The current research builds upon two previously established COS for bariatric and metabolic surgery (BARIACT) and weight management interventions (STAR-LITE), which both included QoL in their COS.5, 6 We are aware that the International Consortium for Health Outcomes Measurement (ICHOM) has been working on a COS for obesity treatment over the past year and a half. This was completed very recently and will most likely be published soon. We trust that the working group within ICHOM has been sufficiently aware of the work previously done within S.Q.O.T., and that our results have been taken into account in the development process of the ICHOM obesity COS.
A wide variety of generic and obesity-specific PROMs are currently being used,11, 12 many of which lack adequate validation evidence and, therefore, cannot be said to adequately measure QoL.12, 34 This heterogeneity hampers the usefulness of PROMs to inform value-based healthcare interventions and hinders comparative effectiveness research. High-quality systematic reviews and meta-analyses are required to evaluate QoL outcomes after obesity treatments and inform decision-making. However, to date, these studies have not been conducted due to the use of inadequate PROMs and lack of standardization.11, 12, 35-37 To adequately evaluate the effect of obesity treatment on QoL, standardized PROMs with the highest evidence for reliability and validity must be routinely used in research. The S.Q.O.T. initiative has established a core set of PROs and PROMs to measure QoL in obesity treatment research, including only high-quality PROMs and incorporating opinions of PLWO and HCPs. This core set should serve as a minimum to measure and compare QoL in obesity treatment research and can be combined with clinical outcomes. To improve obesity care, it is essential that this set is used in every future obesity trial. Implementation of this core set will improve the quality of clinical trials, allow for comparison of outcomes that matter most to PLWO, and stimulate comparative effectiveness research with high-quality systematic reviews to promote value-based healthcare.
Importantly, the core set emphasizes outcomes that matter most to patients. Differences in priorities of PLWO and HCPs were already apparent after the first S.Q.O.T. consensus meeting, where PLWO, in contrast to HCPs, valued the need for more specific PROs (body image, self-esteem, excess skin).15 During the S.Q.O.T. II and III meetings, PLWO and HCPs differed most in their opinion on the PROM that should be selected for excess skin. The PLWO convinced the HCPs in the group discussion to change their vote in favor of the QOLOS questionnaire, since this PROM captures what PLWO thought was most important, namely, the daily consequences of excess skin (rather than esthetics alone). This highlights the importance of incorporating patients' views.
To measure the PROs selected in the first S.Q.O.T. consensus meeting,15 three PROMs were selected: the BODY-Q, IWQOL-Lite, and QOLOS. The inclusion of these PROMs in the core set aligns with recent recommendations for the use of PROMs in bariatric surgery, in which measurement properties of the available PROMs were evaluated.12 Critical in the development of PROMs is involving patient perspectives and following all necessary steps to obtain adequate measurement properties.28 Of these measurement properties, content validity, which refers to relevance, comprehensiveness, and comprehensibility, is considered the most important.10, 38 The BODY-Q, IWQOL-Lite, and QOLOS were all developed involving patient perspectives and showed sufficient content validity in previous research. Hence, the core set only contains PROMs with evidence for sufficient reliability and validity for use in obesity treatment research.
We acknowledge that the core set requires further work in the future. The literature search for stigma identified several PROMs with questionable validation evidence. An adequate PROM for stigma should be able to capture differences in scores over time, as weight loss has a significant impact on both internalized stigma and its consequences.39 Future studies should be carried out to modify existing stigma PROMs, or to develop a new measure for obesity stigma (i.e., beyond weight stigma), as none of the PROMs were deemed adequate by the HCPs and PLWO. The QOLOS has rarely been used in research and is available in only two languages. Therefore, this PROM requires further validation in large studies and should be translated into multiple languages to promote its widespread use, or a new PROM on excess skin should be developed. In contrast, the IWQOL-Lite and BODY-Q are available in 81 and 19 languages, respectively.29, 30 The IWQOL-Lite makes use of negative questions, which might impose a greater mental burden than positive questions; this should be examined in future research. This PROM also makes use of the phrase “because of my weight”. Participants in our consensus meetings proposed using “because of my body” instead, as it allows individuals with a lower BMI to reflect on their QoL. However, this requires new validation work by the developers of the IWQOL-Lite. Furthermore, the IWQOL-Lite has associated costs for use in research and clinical care, whereas the BODY-Q and QOLOS are free of charge. HCPs considered this a notable drawback of the IWQOL-Lite, and it may hamper widespread use, especially as healthcare and research budgets are already limited.
Finally, the BODY-Q makes use of Rasch Measurement Theory (RMT) analyses, potentially offering a contribution to its validity, while the IWQOL-Lite and QOLOS make use of Classical Test Theory (CTT).40 The selection of only one PROM in a core set would be most convenient, and the use of different PROMs in a core set will lead to practical challenges. Yet, it is critical that patient-centered outcomes are adequately measured, and this core set is considered most suitable to measure and compare QoL in obesity treatment research. Additionally, the panel acknowledges that employing this set demands significant time and resources. To facilitate implementation, we have therefore added a manual that provides an overview of the core set (Appendix S2). This manual includes information regarding the available languages, costs, and contact persons for the core PROMs necessary in research.
Strengths of this study include the large and geographically diverse selection of PLWO and HCPs from different disciplines, increasing generalizability, and the use of an independent moderator specialized in the development of PROMs to guide group discussions. The moderator facilitated structured discussions and ensured that all viewpoints were considered, maintaining focus on the predefined objectives of the initiative. This standardized approach helped to prevent bias by guiding participants through methodologies like nominal group technique and Delphi exercises. Potential limitations should be considered. First, three HCPs had a conflict of interest as they were involved in the development of one of the questionnaires. They were asked not to vote on specific PROMs to exclude bias, as it might be expected that they would vote for the PROM they were involved in. Second, it was our goal to have an equal distribution of PLWO and HCPs in the consensus meeting. Fewer PLWO compared to HCPs participated in the third meeting, potentially affecting the representation of PLWO opinions. Nevertheless, nearly all PLWO participants possessed significant experience as patient advocates and were affiliated with (inter)national patient representative networks. Furthermore, the independent moderator actively engaged the PLWO to discuss their opinions. Third, the PLWO were mostly part of patient representative groups and were expected to have a sufficient understanding of English, which may not reflect the broader population of individuals living with obesity.41 Finally, due to the duration of the S.Q.O.T. consensus meetings and different time zones for each country, not every online participant was able to participate until the end. Consequently, fewer participants were involved in the group discussions and voting toward the end of the consensus meetings.
5 CONCLUSION
A core set of PROMs to measure QoL in obesity treatment research has been established, including the BODY-Q (physical functioning, physical symptoms, psychological function, social function, eating behavior, body image), IWQOL-Lite (self-esteem), and QOLOS (excess skin). No PROM was deemed adequate for measuring the PRO stigma; therefore, this PRO cannot be assessed yet. To adequately assess patient-centered outcomes, allow for comparative effectiveness research, and improve the value of research data, it is essential that this set is used in every future obesity trial. Next steps include the selection of a core set for clinical practice and registries. In addition, consensus meetings will be held regularly to improve the current set.
ACKNOWLEDGMENTS
Ronette L. Kolotkin (involved in the development of the IWQOL-Lite) participated during the S.Q.O.T. II consensus meeting. We thank the sponsors of the S.Q.O.T. consensus meetings for their financial support.
CONFLICTS OF INTEREST STATEMENT
Claire E.E. de Vries, Lotte Poulsen, Maarten M. Hoogbergen, and Ronald S.L. Liem were involved in the development of the additional eating scales of the BODY-Q questionnaire. Of those, Claire E.E. de Vries and Ronald S.L. Liem are members of the organizing committee and were not permitted to participate in, nor influence, the S.Q.O.T. consensus meetings discussions or voting rounds. Lotte Poulsen (participant S.Q.O.T. III) and Maarten M. Hoogbergen (participant S.Q.O.T. II) were not permitted to participate in the voting rounds where the BODY-Q questionnaire was available for a specific PRO domain. None of the aforementioned persons received financial compensation regarding the development of the BODY-Q questionnaire or any other BODY-Q related activity.
Ronette L. Kolotkin (involved in the development of the IWQOL-Lite) participated during the S.Q.O.T. II consensus meeting. Likewise, she was not permitted to participate in the voting rounds where the IWQOL-Lite questionnaire was available for a specific PRO domain. Ronette L. Kolotkin was involved in the development of the IWQOL-Lite questionnaire and receives royalties from Duke University for use of this PROM.
Within the last 3 years, Stuart W. Flint (SWF) reports research grants from National Institute for Health Research, the Office of Health Improvement & Disparities, Public Health England, Doncaster Council, West Yorkshire Combined Authority, Johnson and Johnson, Novo Nordisk and the University of Leeds, personal fees from the Royal College of General Practitioners, institutional fees from Public Health England, and support for attendance at meetings from UK Parliament, Novo Nordisk, Johnson and Johnson and Safefood. SWF also reports unpaid roles with Obesity UK. Bruno Halpern serves on the advisory boards of Novo Nordisk, Lilly, and AstraZeneca. Additionally, Bruno Halpern receives lecture fees from Novo Nordisk, AstraZeneca, Merck, and Abbott Nutrition. Furthermore, Bruno Halpern conducts clinical trials in collaboration with Novo Nordisk, Lilly, and Boehringer Ingelheim. John B. Dixon is a consultant and is on the advisory boards for Nestle Health Science and Reshape Lifesciences. John B. Dixon is on the advisory boards of, and receives speaker fees from, Novo Nordisk and Lilly.