Volume 120, Issue 9 pp. 1863-1875

METHODS AND TECHNIQUES

Open Access

Identifying relevant intersections in relation to motivation and attempt to stop smoking by using a combination of methods to develop robust predictive models and resampling techniques: A cross-sectional study of the German population

Sabina Ulbricht (Corresponding Author)

orcid.org/0000-0001-7617-0266

Department SHIP-KEF, Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany

Correspondence

Sabina Ulbricht, Department SHIP-KEF, Institute for Community Medicine, University Medicine Greifswald, Walther-Rathenau-Str. 48, 17475, Greifswald, Germany.

Email: [email protected]

Contribution: Conceptualization (equal), Formal analysis (supporting), Methodology (supporting), Writing - original draft (lead), Writing - review & editing (equal)

Adrian Richter

Department SHIP-KEF, Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany

Epidemiology and Health Services Research, German Rheumatology Research Centre Berlin, Berlin, Germany

Contribution: Conceptualization (equal), Formal analysis (lead), Methodology (lead), Writing - original draft (equal), Writing - review & editing (lead)

Daniel Kotz

orcid.org/0000-0002-9454-023X

Institute of General Practice, Addiction Research and Clinical Epidemiology Unit, Centre for Health and Society, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany

Department of Behavioural Science and Health, University College London, London, UK

Contribution: Conceptualization (equal), Funding acquisition (lead), Methodology (supporting), Writing - original draft (supporting), Writing - review & editing (supporting)

Sabrina Kastaun

orcid.org/0000-0002-5590-1135

Institute of General Practice, Patient-Physician Communication Research Unit, Centre for Health and Society, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany

Contribution: Conceptualization (equal), Methodology (supporting), Supervision (lead), Writing - original draft (supporting), Writing - review & editing (supporting)

First published: 21 March 2025

https://doi.org/10.1111/add.70045

Shared first-authorship: Sabina Ulbricht and Adrian Richter contributed equally.

Abstract

Aims

To illustrate robust intersections of co-occurring factors for two predictors of smoking cessation, motivation to stop smoking (MTSS) and past-year quit attempts (QA), using methods for developing robust predictive models: bootstrap resampling, scoring rules to evaluate predictive accuracy, and spline functions.

Design, Setting and Participants

Cross-sectional data from the German Study on Tobacco Use (DEBRA). Past-year smokers (≥18 years, n = 13 245) from 22 survey waves (2016–2020) were included. The sample (mean age 46.8 years, 46.7% women) was randomly divided into learning (70%) and validation data (30%). Less than 20% in both data sets had tried to stop smoking within the preceding 12 months.

Measurements

Multinomial regression (for MTSS) and logistic regression (for QA) were used to evaluate whether age, sex, education, monthly net household income per person and the region of residence form intersections with relevant differences in the two outcomes.

Findings

MTSS, compared with the absence of MTSS, was associated with middle [95% confidence interval (CI) = 1.02–1.39] and high education (95% CI = 1.37–1.98). Regarding MTSS, the highest probabilities were observed in participants aged 30 to 50 years from lower and middle (30–40 years) income groups. Regarding QA, the probability of at least one past-year QA was highest in females aged between 20 and 40 years and independent of educational level. Similar probabilities in males were seen only among those from the highest educated group. The predictive accuracy of the results was reduced by 3.1% for MTSS and 3.4% for QA when comparing learning with validation data.

Conclusions

This German study provides compelling evidence linking highest motivation to stop smoking to those aged 30 to 50 years with lower or middle household income. Regardless of educational level, females' probabilities of reporting at least one past-year quit attempt appear to be highest in those aged 20 to 40 years. These findings highlight the need for adopting an intersectional approach when studying predictors of smoking cessation.

INTRODUCTION

Tobacco use causes 8 million premature deaths worldwide each year, predominantly from respiratory diseases, cardiovascular diseases and cancer [1]. Although tobacco use has declined over the past decades in high-income countries, including Germany [2], socio-economic inequalities in smoking prevalence have persisted or increased over time [3-7].

Smoking prevalence is determined by uptake of smoking and smoking cessation. Socio-demographic factors that are known to affect both determinants include sex and education [8, 9]. Motivational factors, including the intention or desire to stop smoking, have been identified as important predictors of quit attempts [10] and, as a consequence, of smoking cessation.

Although the majority of studies considered attributes such as age, sex or education as mutually independent factors, a more recent study [11] has examined the potentially synergistic effects of such characteristics on smoking cessation using the intersectionality framework [12, 13]. In line with this framework, smoking cessation outcomes may be shaped by the interaction of several attributes of a person. This interplay can result in different levels of motivation to stop smoking (MTSS) and may also manifest itself in different probabilities of quit attempts. For example, a younger female who lives in an urban region and has an above-average level of education may be more motivated to stop smoking than an older female who lives in a rural region and has a lower educational level [12, 13]. Recognising these multiple intersecting dimensions is important to understand how age and sex interact with other attributes (e.g. region of residence) in order ‘to improve the understanding of how multiple factors relate to health’ [11]. Unfortunately, the approach proposed in that study [11] is known to generate spurious effects because of categorisation of continuous covariates (e.g. age) [14] and because of multiple testing in overly complex regression models [15]. As a consequence, some of the intersections may appear to differ with regard to the outcome although the result is unlikely to be reproducible. Similar findings regarding the susceptibility to spurious effects were made for the application of multilevel models in the context of intersectionality research [16].

Generalized additive models (GAM) [17] can be used to allow for flexible modelling and the inclusion of interactions while avoiding categorisation of continuous covariates. The built-in algorithm of GAMs selects the optimal complexity of functional forms. However, there is no inherent mechanism to control for the complexity in terms of the number of interaction terms. To enhance robustness and generalisability of results in studies investigating complex questions of intersectionality we propose a combination of established methods for developing predictive models [18].

This study uses a three-step process to illustrate robust intersections of co-occurring factors (age, sex, education, monthly net household income and the region of participants' residence) for two smoking cessation outcomes: MTSS (among current smokers) and past-year quit attempts (QA, among past-year smokers, including current smokers and ex-smokers who had stopped smoking within the preceding 12 months). First, we divided the large corpus of the German Study on Tobacco Use (DEBRA: ‘Deutsche Befragung zum Rauchverhalten’) into learning and validation data. Second, using bootstrap resampling, we evaluated different regression models comprising interactions between socio-demographic factors and varying functional forms of continuous covariates to examine their association with both outcome measures. Third, the best model for each outcome was examined for robustness in the validation data.

MATERIAL AND METHODS

Data source

The ongoing DEBRA study is a representative, nationwide household survey on the use of tobacco and nicotine products (https://www.debra-study.info/). The protocol of the general study design has been published [19]. The study is registered in the German Clinical Trials Register (registration numbers DRKS00011322, DRKS00017157 and DRKS00028054). Data from computer-assisted face-to-face household interviews were collected every other month from individuals aged ≥14 years between June/July 2016 and April/July 2023. This article presents data from 22 survey waves (2016–2020) with a total of 13 245 past-year smokers, including 12 784 current smokers and 461 ex-smokers who had stopped smoking within the preceding 12 months. A detailed study protocol was pre-registered on the Open Science Framework (https://osf.io/ub3w7).

MEASURES

Outcome variables

The first outcome examined with regard to intersections was the level of MTSS, measured among current smokers using the German version of the Motivation to Stop Scale (MTSS) [20, 21]. MTSS is rated on a 7-level scale, ranging from the lowest level 1 (I do not want to stop smoking) to the highest level 7 (I really want to stop smoking and intend to in the next month).

The second outcome considered was past-year QA. Past-year smokers were asked: ‘How many serious attempts to stop smoking have you made in the last 12 months? By serious attempt I mean you decided that you would try to make sure you never smoked again. Please include any attempt that you are currently making and please include any successful attempt made within the last year’.

Intersectionality measures

This study examines possible intersections formed by the covariates age, sex, educational level, monthly net household income and region of residence. Age and monthly net household income are treated as continuous variables in the models. Monthly net household income was calculated per person and month for each household (for more details, see https://osf.io/387fg/) and coded from 0 (€0 income) to 7 (>€7000), according to an equalisation technique of the Organisation for Economic Cooperation and Development [22]. Sex (female or male), educational level (low = junior high school equivalent or no graduation, as reference, vs. middle = secondary school equivalent or high = high school equivalent or advanced technical college equivalent) and participant's region of residence (metropolitan or urban vs. rural as reference) were evaluated as categorical variables. In line with a previous German cohort study [23], participant's region of residence was assessed using the administrative municipality district size, which was collapsed into three categories: >500 000 = metropolitan area, 20 000 to 500 000 = urban area and <20 000 = rural area.
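
The three-category coding of region of residence can be written down as a small function. This is only a sketch in Python under stated assumptions: the function name `classify_region` and the use of a plain integer for the municipality district size are illustrative and are not taken from the study code, which was written in R.

```python
def classify_region(municipality_size: int) -> str:
    """Map the administrative municipality district size to a region category.

    Thresholds follow the text: >500 000 = metropolitan,
    20 000 to 500 000 = urban, <20 000 = rural (reference category).
    The function name is hypothetical.
    """
    if municipality_size < 0:
        raise ValueError("municipality size must be non-negative")
    if municipality_size > 500_000:
        return "metropolitan"
    if municipality_size >= 20_000:
        return "urban"
    return "rural"


# Boundary values fall into the urban category, as stated in the text.
for size in (5_000, 20_000, 500_000, 1_200_000):
    print(size, classify_region(size))
```

Writing the thresholds explicitly makes the handling of the boundary values (20 000 and 500 000) visible, which a verbal description can leave ambiguous.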

Potential confounding measures

Strength of urges to smoke was assessed by using the second item of the German version of the Strength of Urges to Smoke Scale [24]. The item consists of a simple rating of strength of urges on a normal smoking day ranging from 0 (none) to 5 (extremely strong) [25].

Statistical analysis

Initial data analysis

We first examined longitudinal changes across survey waves from 2016 to 2023 with regard to all intersectionality measures (please see above), the prevalence of smoking, MTSS and the number of QAs (Figures S1–S8). The data indicated a considerable increase in the prevalence of smokers during the coronavirus disease 2019 pandemic [26, 27] (Figure S6). Therefore, we suspected an influence of the pandemic on the study outcomes, which has been described in the study protocol. Subsequently, the final sample was restricted to those surveyed between 2016 and 2020 (waves 1–22; n = 13 245 past-year smokers). We deviated from our assumptions in the protocol by including the covariate time in all models, because considerable time trends were found in earlier waves of the DEBRA study with regard to the outcome of at least one QA (Figure S7).

Analysis of intersectionality

Statistically, the explorative investigation of intersectionality entails the consideration of multiple interaction terms in regression models, which markedly increases model complexity. It is well established that overly complex models are susceptible to overfitting the data [28] (i.e. an overly complex model also fits the noise in the data). With regard to intersectionality research, there is a significant risk of false-positive results and overfitting if multiple interaction terms are examined simultaneously [29]. Resampling methods, such as cross-validation or the bootstrap, are computationally demanding but can assist in avoiding overfitting. In essence, resampling enables the identification of a model that may not be optimal for a single data set but is, nevertheless, the most accurate on average across multiple randomly generated data representations.

Categorization of continuous covariates, which has been discouraged for decades [28, 30, 31], increases model complexity further. The perceived benefits of categorization are, among others, easier model interpretation and handling of non-linearity. However, it has been shown that this approach generates biased estimates and is statistically inferior to the use of spline functions in the case of non-linearity. To overcome interpretability issues when using spline functions, several best-practice approaches are available [32].

The approach adopted in this study is based on a recent simulation study that has shown superiority of resampling-based methods over heuristic approaches when examining multiple intersections [29]. The algorithm of this approach is described below (please see: Model training and validation, Figure 1).

FIGURE 1. Scheme of model training and validation algorithm.

Regression models

MTSS

For the first outcome, application of ordinal regression models was planned in the study protocol. Initial analysis revealed violations of the proportional odds assumption [33], and therefore, multinomial regression models were chosen (Table S1). The seven categories of the MTSS-scale were collapsed into three categories. The first category (absence of motivation) represents individuals without any intention to quit smoking (MTSS-scale level 1 ‘I don't want to stop smoking’). The second category (unspecific motivation) covers those with moderate desire, but no intention to quit smoking (MTSS-scale levels 2 and 3, e.g. ‘I think I should stop smoking but don't really want to’) [21]. Individuals who reported a strong desire but no intention to quit, or a moderate or strong desire and an intention to quit (MTSS-scale levels 4, 5, 6 and 7, e.g. ‘I really want to stop smoking and intend to in the next month’), were assigned to category three (motivation). The violation of the proportional odds assumption justifies collapsing categories only in part; the specific grouping was chosen on the basis of content-related decisions and is not methodologically supported. As most variable selection approaches are not designed for multinomial regression models [34], best subset selection was chosen [35] (please see: Model training and validation).
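
The collapsing of the seven MTSS levels into three categories can be sketched as follows. The function name `collapse_mtss` is hypothetical; the counts used in the usage example are the MTSS frequencies reported for current smokers in the learning data in Table 1.

```python
def collapse_mtss(level: int) -> str:
    """Collapse the 7-level MTSS scale into the three analysis categories.

    Level 1 -> absence of motivation, levels 2-3 -> unspecific motivation,
    levels 4-7 -> motivation. The function name is illustrative only.
    """
    if level not in range(1, 8):
        raise ValueError("MTSS level must be an integer from 1 to 7")
    if level == 1:
        return "absence of motivation"
    if level <= 3:
        return "unspecific motivation"
    return "motivation"


# MTSS frequencies of current smokers in the learning data (Table 1).
learning_counts = {1: 4221, 2: 2347, 3: 961, 4: 307, 5: 786, 6: 100, 7: 86}

collapsed = {}
for level, count in learning_counts.items():
    category = collapse_mtss(level)
    collapsed[category] = collapsed.get(category, 0) + count

print(collapsed)
```

Applied to the Table 1 counts, this yields 4221 participants without motivation, 3308 with unspecific motivation and 1279 with motivation, which shows how sparse the upper levels of the original scale are.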

QA

Our second outcome, the number of QAs, was recoded into a dichotomous variable ‘past-year quit attempt’ (yes, ≥1 attempt vs. none). Multiple logistic regression models were used to describe the odds of at least one QA within the last 12 months. We followed a two-step approach for the choice of covariates: the functional form and the interaction terms [36-38] were chosen using backward elimination (BE) in combination with the Bayesian information criterion (BIC) [39] (please see: Model training and validation).
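
To make the recoding and the selection criterion concrete, the following sketch shows the dichotomisation of the number of QAs and a single backward elimination decision based on the BIC, defined as k·ln(n) − 2·ln(L). All function names are hypothetical, and the log-likelihoods and parameter counts in the usage example are invented for illustration; they are not results of this study.

```python
import math


def has_quit_attempt(n_attempts: int) -> bool:
    """Dichotomise the number of past-year quit attempts (>=1 vs. none)."""
    return n_attempts >= 1


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def drop_term(full, reduced, n_obs: int) -> bool:
    """One backward elimination decision.

    `full` and `reduced` are (log_likelihood, n_params) pairs. The term is
    dropped when the reduced model has a BIC no larger than the full model.
    """
    return bic(reduced[0], reduced[1], n_obs) <= bic(full[0], full[1], n_obs)


# Hypothetical fits: removing a 2-parameter interaction costs 2 log-likelihood units.
n_obs = 9253  # size of the learning data for the QA outcome (Table 1)
full_model = (-4300.0, 14)     # invented values
reduced_model = (-4302.0, 12)  # invented values
print(drop_term(full_model, reduced_model, n_obs))
```

With n = 9253, each parameter carries a penalty of ln(9253) ≈ 9.13, so a term must improve the log-likelihood substantially before the BIC retains it. This is why BIC-based elimination tends to favour sparse models.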

Model training and validation

Step 1: Data from the 22 survey waves comprised 13 245 past-year smokers, including 12 784 current smokers and 461 ex-smokers who had stopped smoking within the preceding 12 months. The data were randomly divided into learning data (70%) and validation data (30%).
Step 2: Within the learning data, repeated bootstrap samples were used to create inner learning and validation data. Different models were then fitted to the inner learning data and predictions of outcomes were computed for the inner validation data. The best model was selected according to the best predictive accuracy, namely the highest polytomous discrimination index (PDI) [40-42] for the first outcome (MTSS) and the lowest Brier score [43] for the second outcome (QA). As a final step, models comprising an interaction term were tested against the same model without the interaction term using likelihood ratio tests.
Step 3: The predictive accuracy of the final model was evaluated in independent validation data (Figure 1).
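
The three steps above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the study's R implementation: the data are simulated, the two candidate "models" are simple event rates (overall vs. per group) rather than regression models, the out-of-bag observations of each bootstrap sample are assumed to serve as inner validation data, and all function names are hypothetical. Only the Brier score (for a binary outcome such as QA) is shown; the PDI and the likelihood ratio test are omitted.

```python
import random


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def split_learning_validation(rows, learning_fraction=0.7, seed=1):
    """Step 1: random split into learning and validation data."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(learning_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def bootstrap_split(rows, rng):
    """Bootstrap sample = inner learning data; out-of-bag rows = inner validation."""
    n = len(rows)
    drawn = [rng.randrange(n) for _ in range(n)]
    in_bag = set(drawn)
    return [rows[i] for i in drawn], [rows[i] for i in range(n) if i not in in_bag]


def fit_null(rows):
    """Toy candidate 1: predict the overall event rate for everyone."""
    rate = sum(y for _, y in rows) / len(rows)
    return lambda group: rate


def fit_by_group(rows):
    """Toy candidate 2: predict the event rate observed within each group."""
    overall = sum(y for _, y in rows) / len(rows)
    counts = {}
    for group, y in rows:
        events, total = counts.get(group, (0, 0))
        counts[group] = (events + y, total + 1)
    rates = {g: e / t for g, (e, t) in counts.items()}
    return lambda group: rates.get(group, overall)


def select_model(rows, candidates, n_boot=200, seed=2):
    """Step 2: choose the candidate with the lowest mean Brier score."""
    rng = random.Random(seed)
    scores = {name: [] for name in candidates}
    for _ in range(n_boot):
        inner_learning, inner_validation = bootstrap_split(rows, rng)
        if not inner_validation:
            continue
        outcomes = [y for _, y in inner_validation]
        for name, fit in candidates.items():
            predict = fit(inner_learning)
            probs = [predict(g) for g, _ in inner_validation]
            scores[name].append(brier_score(probs, outcomes))
    mean_scores = {name: sum(v) / len(v) for name, v in scores.items()}
    return min(mean_scores, key=mean_scores.get), mean_scores


# Simulated data (not DEBRA data): two groups with event rates 0.10 and 0.30.
data_rng = random.Random(0)
rows = []
for _ in range(2000):
    group = data_rng.choice(["a", "b"])
    rate = 0.10 if group == "a" else 0.30
    rows.append((group, 1 if data_rng.random() < rate else 0))

learning, validation = split_learning_validation(rows)
candidates = {"null": fit_null, "by_group": fit_by_group}
best, mean_scores = select_model(learning, candidates)

# Step 3: refit the selected candidate on all learning data, score on validation data.
final_predict = candidates[best](learning)
validation_brier = brier_score(
    [final_predict(g) for g, _ in validation], [y for _, y in validation]
)
print(best, mean_scores, validation_brier)
```

The key design point is that no candidate is ever scored on observations it was fitted to: selection uses the inner validation data, and the final accuracy is reported on data that played no role in selection.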

Missing data

The overall amount of missing data in the DEBRA study is very low and did not exceed 8% (Table 1). Nevertheless, missingness was found to be selective in subgroups of the data; for example, males with higher education more frequently declined to provide information on smoking habits (Figure S9). We, therefore, decided to apply multiple imputation [44] using the R package mice [45]. The imputation model comprised all intersectionality measures and confounders of the analysis models. To avoid model optimism regarding the prediction accuracy, imputations were done separately for learning data (m = 10 imputations) and validation data (m = 20 imputations). Given the overall low number of missing values, we considered 10 imputations in the learning data to be sufficient and in alignment with recommendations [46].

TABLE 1. Characteristics of the study sample.^a^,^b

	Learning data		Validation data
	Past-year smokers	Current smokers	Past-year smokers	Current smokers
n	9253	8944	3992	3840
Age (y)
Mean (SD)	46.8 (16.7)	46.9 (16.7)	46.8 (16.9)	46.9 (16.9)
Median [Min, Max]	48.0 [18.0, 95.0]	48.0 [18.0, 95.0]	48.0 [18.0, 99.0]	48.0 [18.0, 99.0]
Sex (n, %)
Male	4941 (53.4)	4778 (53.4)	2115 (53.0)	2034 (53.0)
Female	4312 (46.6)	4166 (46.6)	1877 (47.0)	1806 (47.0)
Education (n, %)
Low	3114 (33.7)	3040 (34.0)	1286 (32.2)	1252 (32.6)
Middle	3623 (39.2)	3499 (39.1)	1641 (41.1)	1583 (41.2)
High	2391 (25.8)	2286 (25.6)	1011 (25.3)	953 (24.8)
Missing	125 (1.4)	119 (1.3)	54 (1.4)	52 (1.4)
Region of residence (n, %)
Rural	3323 (35.9)	3202 (35.8)	1456 (36.5)	1405 (36.6)
Urban	3995 (43.2)	3869 (43.3)	1753 (43.9)	1676 (43.6)
Metropolitan	1935 (20.9)	1873 (20.9)	783 (19.6)	759 (19.8)
Net household income^c
Mean (SD)	1.47 (0.818)	1.47 (0.819)	1.47 (0.790)	1.47 (0.788)
Median [Min, Max]	1.39 [0, 7.00]	1.39 [0, 7.00]	1.43 [0, 7.00]	1.43 [0, 7.00]
Missing (n, %)	2 (0.0)	1 (0.0)	1 (0.0)	0 (0)
Cigarettes per day^d
Mean (SD)	13.8 (8.27)	13.8 (8.23)	14.0 (8.01)	14.0 (8.01)
Median [Min, Max]	14.0 [1.00, 70.0]	14.0 [1.00, 70.0]	14.0 [1.00, 80.0]	14.0 [1.00, 80.0]
Missing (n, %)	601 (6.5)	518 (5.8)	249 (6.2)	213 (5.5)
Heaviness of smoking index^d
Mean (SD)	4.07 (1.54)	4.07 (1.54)	4.09 (1.52)	4.09 (1.52)
Median [Min, Max]	4.00 [2.00, 8.00]	4.00 [2.00, 8.00]	4.00 [2.00, 8.00]	4.00 [2.00, 8.00]
Missing (n, %)	961 (10.4)	652 (7.3)	430 (10.8)	278 (7.2)
Urges to smoke (n, %)
None	668 (7.2)	477 (5.3)	305 (7.6)	210 (5.5)
Light	2046 (22.1)	1991 (22.3)	863 (21.6)	839 (21.8)
Medium strong	3708 (40.1)	3664 (41.0)	1615 (40.5)	1596 (41.6)
Strong	2109 (22.8)	2100 (23.5)	886 (22.2)	879 (22.9)
Very strong	519 (5.6)	517 (5.8)	235 (5.9)	232 (6.0)
Extremely strong	92 (1.0)	92 (1.0)	42 (1.1)	42 (1.1)
Missing	111 (1.2)	103 (1.2)	46 (1.2)	42 (1.1)
Quit attempt (≥1 QA, last 12 months)^d (n, %)
No	7141 (77.2)	7141 (79.8)	3067 (76.8)	3067 (79.9)
Yes	1689 (18.3)	1380 (15.4)	778 (19.5)	626 (16.3)
Missing	423 (4.6)	423 (4.7)	147 (3.7)	147 (3.8)
MTSS^e
1	4221 (45.6)	4221 (47.2)	1889 (47.3)	1889 (49.2)
2	2347 (25.4)	2347 (26.2)	979 (24.5)	979 (25.5)
3	961 (10.4)	961 (10.7)	348 (8.7)	348 (9.1)
4	307 (3.3)	307 (3.4)	145 (3.6)	145 (3.8)
5	786 (8.5)	786 (8.8)	312 (7.8)	312 (8.1)
6	100 (1.1)	100 (1.1)	55 (1.4)	55 (1.4)
7	86 (0.9)	86 (1.0)	46 (1.2)	46 (1.2)
Missing	445 (4.8)	136 (1.5)	218 (5.5)	66 (1.7)

Abbreviations: Max, maximum; Min, minimum; MTSS, motivation to stop smoking; QA, past-year quit attempts.
^a The study sample comprised 13 245 past-year smokers, including 12 784 current smokers and 461 recent ex-smokers (i.e. individuals who had stopped smoking within the preceding 12 months).
^b The data sets of past-year smokers and current smokers were split into learning data (70%) and validation data (30%).
^c Net household income was standardised into a value range of 0 = lowest to 7 = highest, according to an equalisation technique of the Organisation for Economic Cooperation and Development.
^d For the second outcome ‘attempts to quit smoking’ recent ex-smokers are included. Therefore, learning data (validation data) for the second outcome comprised additional n = 309 (n = 152) participants.
^e Not assessed among recent ex-smokers.

Final estimates of coefficients were obtained after fitting the best models to the learning data containing multiple imputations of missing data. Pooled estimates are presented as odds ratios (OR) with respective 95% CI. Marginal means [47] were also calculated according to Rubin's rules and illustrated using the R package ggeffects [48]. Finally, model estimates were used to predict both outcomes in the validation data.
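
Pooling according to Rubin's rules combines the m imputation-specific estimates into one estimate whose variance is the sum of the average within-imputation variance and the between-imputation variance inflated by (1 + 1/m). The sketch below pools on the log-odds scale and back-transforms to an OR. It is an illustration only: function names are hypothetical, the input values are invented, and the confidence interval uses a normal quantile as a simplification (the usual procedure uses a t-distribution with adjusted degrees of freedom).

```python
import math


def pool_rubin(estimates, variances):
    """Pool m estimates and their variances according to Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    q_bar = sum(estimates) / m                                  # pooled estimate
    within = sum(variances) / m                                 # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


def pooled_odds_ratio(log_odds_ratios, standard_errors, z=1.96):
    """Pool log-ORs, then back-transform to an OR with an approximate 95% CI.

    Simplification: a normal quantile replaces the t-quantile.
    """
    estimate, variance = pool_rubin(
        log_odds_ratios, [se ** 2 for se in standard_errors]
    )
    se = math.sqrt(variance)
    return math.exp(estimate), math.exp(estimate - z * se), math.exp(estimate + z * se)


# Invented log-ORs and standard errors from m = 3 imputed data sets.
print(pooled_odds_ratio([0.22, 0.24, 0.23], [0.09, 0.10, 0.09]))
```

Pooling on the log-odds scale rather than on the OR scale keeps the estimates approximately normally distributed, which is what the rules assume.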

Computations

We used parallelized R code (R version 4.3.1 [49]; RStudio version 2023.06.1) supported by the R packages foreach and doParallel [50] and a high performance cluster computer of the University of Greifswald [51]. All code used is publicly available in an open GitLab repository [52].

RESULTS

Characteristics of participants

The sample of past-year smokers (n = 13 245) was composed in the learning data of 4312 women (46.6%) and 4941 men (53.4%) with a mean age of 46.8 years (SD = 16.7) (Table 1). The validation data included 1877 females (47.0%) and 2115 males (53.0%) with a mean age of 46.8 years (SD = 16.9). Among them, 45.6% (learning data) and 47.3% (validation data) reported the absence of MTSS. Less than 20% in both data sets had tried to stop smoking within the preceding 12 months (learning data = 18.3%, validation data = 19.5%).

First outcome: MTSS

Table 2 shows that an unspecific motivation, compared with the absence of MTSS, was associated with female sex (OR = 1.18; 95% CI = 1.08–1.30), middle versus low education (OR = 1.13; 95% CI = 1.01–1.26), high versus low education (OR = 1.33; 95% CI = 1.16–1.52) and participant's residence in an urban versus rural area (OR = 1.12; 95% CI = 1.01–1.24). Further, MTSS, compared with the absence of motivation, was associated with both middle (OR = 1.19; 95% CI = 1.02–1.39) and high education (OR = 1.65; 95% CI = 1.37–1.98), and with participant's residence in an urban versus rural area (OR = 1.17; 95% CI = 1.01–1.35). Because the model includes an interaction of age and net household income, the association of each of these variables with MTSS depends on the value of the other. Therefore, the respective main effects of age and monthly net household income are not interpreted [28].

TABLE 2. Multinomial regression analyses of the MTSS.^a

Term	MTSS: unspecific^a		MTSS: higher^a
Term	OR	95% CI	OR	95% CI
(Intercept)	0.84	[0.47; 1.5]	0.65	[0.31; 1.38]
Strength of urges to smoke
None	0.64	[0.51; 0.8]	0.63	[0.47; 0.86]
Light	0.75	[0.67; 0.85]	0.85	[0.72; 1]
Medium strong	1.00	Ref.	1.00	Ref.
Strong	0.86	[0.76; 0.97]	0.73	[0.62; 0.86]
Very strong	0.72	[0.59; 0.88]	0.43	[0.31; 0.61]
Extremely strong	0.52	[0.31; 0.85]	0.80	[0.44; 1.46]
Time^b
2016	1.00	Ref.	1.00	Ref.
2017 vs. 2016	1.26	[1.11; 1.44]	0.84	[0.71; 0.99]
2018 vs. 2016	1.20	[1.04; 1.38]	0.70	[0.58; 0.83]
2019 vs. 2016	1.15	[1.01; 1.30]	0.59	[0.50; 0.70]
2020 vs. 2016	1.25	[1.04; 1.50]	0.54	[0.42; 0.70]
Age^b
20	1.00	Ref.	1.00	Ref.
30 vs. 20	1.00	[0.81; 1.24]	2.04	[1.50; 2.78]
40 vs. 20	1.00	[0.80; 1.25]	1.88	[1.35; 2.63]
50 vs. 20	1.04	[0.85; 1.28]	1.55	[1.15; 2.09]
60 vs. 20	1.10	[0.89; 1.36]	1.87	[1.36; 2.58]
70 vs. 20	1.01	[0.81; 1.27]	1.32	[0.94; 1.85]
80 vs. 20	0.74	[0.55; 1.00]	0.50	[0.30; 0.84]
Sex
Male	1.00	Ref.	1.00	Ref.
Female	1.18	[1.08; 1.3]	1.13	[1; 1.29]
Education
Low	1.00	Ref.	1.00	Ref.
Middle	1.13	[1.01; 1.26]	1.19	[1.02; 1.39]
High	1.33	[1.16; 1.52]	1.65	[1.37; 1.98]
Region
Rural	1.00	Ref.	1.00	Ref.
Urban	1.12	[1.01; 1.24]	1.17	[1.01; 1.35]
Metropolitan	0.92	[0.81; 1.05]	0.91	[0.76; 1.09]
Household income^b^,^c^,^d
[0–1]	1.00	Ref.	1.00	Ref.
[1–2] vs. [0–1]	1.10	[0.88; 1.38]	0.74	[0.55; 0.99]
[2–3] vs. [0–1]	1.00	[0.76; 1.31]	0.59	[0.41; 0.85]
[3–4] vs. [0–1]	0.79	[0.52; 1.18]	0.49	[0.27; 0.90]

Abbreviation: MTSS, motivation to stop smoking
^a Reference category: MTSS: absence of motivation.
^b Continuous covariates were modelled using restricted cubic splines. Respective model estimates were evaluated at specified values of the covariate and contrasted to a reference category; estimates and confidence intervals were calculated using Rubin's rules.
^c Net household income was standardised into a value range of 0 = lowest to 7 = highest, according to an equalisation technique of the Organisation for Economic Cooperation and Development [22].
^d Evaluation of the interaction effect is omitted in the table but explicitly shown in Figure 2.

Estimated marginal means for the probabilities of absence of motivation and of MTSS, stratified by monthly net household income, varied non-linearly with smokers' age and differed between high and low monthly net income groups (Figure 2). MTSS was more likely in participants between 30 and 40 years from lower income groups than in their counterparts with higher monthly net household income. As outlined in Figure 2, the highest probabilities of reporting MTSS in participants between 40 and 50 years were seen among those from the lowest income groups. No other intersection with regard to MTSS was detected.

Second outcome: QA

In the model that included QA as dependent variable, higher monthly net household income was associated with lower odds of reporting a past-year QA (OR = 0.83; 95% CI = 0.77–0.89) (Table 3).

TABLE 3. Pooled coefficients for reporting a QA.^a

Term	OR	95% CI
(Intercept)	0.48	[0.37; 0.62]
Strength of urges to smoke
None	2.54	[2.11; 3.05]
Light	0.90	[0.78; 1.04]
Medium strong	1.00	Ref.
Strong	1.06	[0.92; 1.23]
Very strong	0.80	[0.61; 1.05]
Extremely strong	1.47	[0.89; 2.43]
Time	0.77	[0.73; 0.81]
Age	0.95	[0.92; 0.98]
Sex
Male	1.00	Ref.
Female	1.26	[1.04; 1.52]
Education
Low	1.00	Ref.
Middle	1.10	[0.91; 1.32]
High	1.42	[1.16; 1.73]
Region
Rural	1.00	Ref.
Urban	1.02	[0.90; 1.15]
Metropolitan	0.99	[0.85; 1.15]
Net household income^b	0.83	[0.77; 0.89]
Interaction of sex:education^c
Female:middle	0.94	[0.73; 1.22]
Female:high	0.67	[0.51; 0.89]

Abbreviation: QA, quit attempt.
^a At least one quit attempt vs. no quit attempt
^b Net household income was standardised into a value range of 0 = lowest to 7 = highest, according to an equalisation technique of the Organisation for Economic Cooperation and Development [22].
^c Evaluation of the interaction effect is omitted in the table but explicitly shown in Figure 3.

As outlined in Figure 3, the highest probabilities of reporting a past-year QA were seen in females between 20 and 40 years and across all educational levels. Similar probabilities in males were observed only among those with high education. No other variables of our data set formed a relevant intersection with regard to QA.
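
How the sex-by-education interaction in Table 3 translates into this pattern can be checked with simple arithmetic: on the odds scale, main effects and the interaction term multiply. The sketch below combines the pooled point estimates from Table 3 into ORs relative to males with low education. It ignores the uncertainty of the estimates (a confidence interval for a combined OR would require the covariance of the coefficients, which is not reported here), and the function name is hypothetical.

```python
# Pooled point estimates from Table 3 (reference: male, low education).
OR_FEMALE = 1.26
OR_EDUCATION = {"low": 1.00, "middle": 1.10, "high": 1.42}
OR_INTERACTION = {"low": 1.00, "middle": 0.94, "high": 0.67}  # female:education


def odds_ratio_vs_reference(sex: str, education: str) -> float:
    """OR of >=1 past-year QA relative to males with low education.

    Other covariates are held fixed; point estimates only.
    """
    value = OR_EDUCATION[education]
    if sex == "female":
        value *= OR_FEMALE * OR_INTERACTION[education]
    return value


for sex in ("male", "female"):
    for education in ("low", "middle", "high"):
        print(sex, education, round(odds_ratio_vs_reference(sex, education), 2))
```

The combined ORs for females are about 1.26, 1.30 and 1.20 for low, middle and high education, that is, nearly constant across educational levels, whereas for males they rise from 1.00 and 1.10 to 1.42. This matches the description that males reach probabilities similar to those of females only in the highest educated group.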

The predictive accuracy of the results was reduced by 3.1% for MTSS and 3.4% for QA when comparing learning with validation data.

DISCUSSION

The present study investigated intersections of co-occurring demographic and socio-economic factors for two main predictors of smoking cessation: MTSS and QA. The study yielded three principal findings. First, for both outcomes, the optimal regression model comprised one interaction term.

Two age ranges were identified in which a higher proportion of smokers reported MTSS. Higher probabilities were observed in participants between 30 and 50 years from lower and middle (30–40 years) income groups. These insights are valuable for more targeted societal and individualized prevention programs. It is possible that MTSS is driven by limited financial resources in middle-aged people with lower or middle net household income because of several financial demands at this stage of life (e.g. starting a career and/or a family, educational support for children). It is mainly people from lower socio-economic groups who report behavioural steps toward stopping smoking in response to increased taxes on tobacco products [53, 54]. This finding could also be used in the adaptation of behavioural smoking cessation programs by combining these life challenges with health-related aspects to foster MTSS. Regardless of educational level, probabilities of reporting at least one past-year QA were highest in females between 20 and 40 years. Similar probabilities in males were seen only among those with high education. This finding extends the existing debate on sex differences in smoking cessation. Although women and men do not seem to differ in terms of the likelihood of desire to quit, or the number of quit attempts [55], women seem to be less successful in quitting smoking compared to men [55], and they experience more difficulties maintaining long-term abstinence than their male counterparts [56]. Quitting among female smokers may differ from quitting among male smokers, also because of life transitions such as becoming pregnant. There is a need for longitudinal studies to determine how consumption patterns (e.g. combustible cigarettes, other tobacco and nicotine products), contextual factors (e.g. smoking status of the partner, voluntary smoking bans at home) or psychosocial factors (e.g. stress, wellbeing) are associated with quitting in females and males. Identifying relevant aspects in the prediction of QA may facilitate targeted smoking cessation programs that adapt treatments by considering both gender and education.

Second, advanced statistical methods such as resampling techniques were used to ensure the robustness of the results [18]. A recent simulation study showed that, for example, BE according to the BIC, combined with resampling, reduces false-positive results and reliably identifies relevant differences between intersections [29]. The mechanism that protects against overfitting and improves the generalisability of results is twofold. On the one hand, resampling and training the models on thousands of replicates of the data prevents a regression model from being optimal for only one specific data set. On the other hand, the predictive accuracy of all regression models is evaluated in data not used for model training. The combination of both yields a model with the highest possible complexity that is optimal on average over several thousand replications and variants of the data. This model may or may not contain interaction terms. If it does not, there is no indication that intersections with true differences in the outcome are present.
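The twofold safeguard described above can be illustrated with a minimal sketch. It is not the study's R code: it assumes simulated data, an ordinary linear model instead of the ordinal and logistic models used in this study, and a single BIC comparison between a main-effects model and a model with one interaction term rather than a full BE procedure. All names (`design`, `fit`, `bic`, `inclusion_freq`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (assumption): the outcome depends on x1 and x2 only,
# so there is no true interaction between the two covariates.
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)


def design(a, b, interaction):
    """Design matrix with intercept, main effects and optional a*b term."""
    cols = [np.ones_like(a), a, b]
    if interaction:
        cols.append(a * b)
    return np.column_stack(cols)


def fit(X, target):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta


def bic(X, target, beta):
    """Gaussian BIC up to an additive constant: n*log(RSS/n) + k*log(n)."""
    n_obs, k = X.shape
    rss = np.sum((target - X @ beta) ** 2)
    return n_obs * np.log(rss / n_obs) + k * np.log(n_obs)


n_rep = 200          # number of bootstrap replicates (illustrative)
selected = 0         # replicates in which BIC prefers the interaction model
oob_mse = {False: [], True: []}

for _ in range(n_rep):
    idx = rng.integers(0, n, size=n)        # rows drawn with replacement
    oob = np.setdiff1d(np.arange(n), idx)   # rows not used for training
    bics = {}
    for inter in (False, True):
        X_train = design(x1[idx], x2[idx], inter)
        beta = fit(X_train, y[idx])
        bics[inter] = bic(X_train, y[idx], beta)
        # Predictive accuracy is assessed only on data not used for training.
        X_oob = design(x1[oob], x2[oob], inter)
        oob_mse[inter].append(np.mean((y[oob] - X_oob @ beta) ** 2))
    if bics[True] < bics[False]:
        selected += 1

inclusion_freq = selected / n_rep
print(f"interaction preferred by BIC in {inclusion_freq:.1%} of replicates")
print(f"mean out-of-bag MSE without interaction: {np.mean(oob_mse[False]):.3f}")
print(f"mean out-of-bag MSE with interaction:    {np.mean(oob_mse[True]):.3f}")
```

Under this logic, an interaction term that is preferred in only a small share of replicates, or that does not lower the out-of-bag error, would not be carried into the final model.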

Third, functional forms for continuous covariates such as splines were used to avoid arbitrary categorisation of continuous covariates. To date, heuristic approaches of multi-variable stratification to form intersections of discrete and categorised continuous covariates are widely used [11, 57]. Applying this approach to the present study, the stratification variable would comprise 162 levels (2 × 3 × 3 × 3 × 3): sex (female, male), age (18–39, 40–49 and >50), educational level (low, middle and high), monthly net household income per person (low, middle and high) and region of residence (rural, urban and metropolitan). This approach appears to be pragmatic, but it has two major limitations. First, in real-world data it is impossible to distinguish true from spurious differences between intersections [58]. In simulated data, however, it has been shown [29] that multi-variable stratification produced a number of spurious findings equal to or higher than the number of intersections with true differences. Second, the approach categorises continuous covariates, which contradicts established recommendations [28, 30] and, with regard to intersectionality, further inflates spurious findings [14]. Nevertheless, ignoring interactions between covariates altogether, which is common in regression analysis, may not represent the optimal alternative, as it could also result in bias [59].
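The count of 162 levels follows directly from crossing the category sets listed above. A short sketch makes the arithmetic explicit; the category labels are taken from the text, whereas the variable names are illustrative assumptions.

```python
from itertools import product

# Category sets as listed in the text; dictionary keys are illustrative names.
levels = {
    "sex": ["female", "male"],
    "age": ["18-39", "40-49", ">50"],
    "education": ["low", "middle", "high"],
    "income": ["low", "middle", "high"],
    "region": ["rural", "urban", "metropolitan"],
}

# Each combination of one level per covariate is one intersection (stratum).
strata = list(product(*levels.values()))
n_strata = len(strata)

print(n_strata)   # 162
print(strata[0])  # ('female', '18-39', 'low', 'low', 'rural')
```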

Strength and limitations

The methods applied in this study represent standards for the development of prediction models [18]. Applied to our intersectionality research questions, they yielded sparse and robust models, while the prediction accuracy in data not used for model training was only slightly reduced (3.1% for MTSS, 3.4% for QA). As demonstrated in a recent simulation study [29], the approach is particularly reliable in larger data collections (n > 2000). Nevertheless, alternative frameworks that are capable of modelling non-linear relationships while penalising complexity, such as generalized additive models [17, 60], are also appropriate for examining these research questions. The DEBRA study provides a large representative data corpus, including repeated survey waves over a period of 4 years and highly standardised data acquisition processes. From this data collection, 9253 past-year smokers (current smokers: 8944) contributed to the training data and 3992 (3840) to the validation data, which represents another strength of this study. A study protocol was registered before conducting the study, which enhances transparency, particularly in relation to required modifications, as it is difficult to anticipate all data peculiarities.

Despite these strengths, some limitations should be noted. First, all data were based on self-report and are, therefore, susceptible to recall bias. Second, there is no information about the survey response. Therefore, we were not able to adjust for potential wave-specific non-response. Third, the survey sample is limited to German-speaking individuals, which reduces the opportunity for migrant participation. This selection may result in lower generalisability of the findings to the whole population in Germany. Fourth, the approach of this study to identify relevant intersections differs from approaches that describe all possible intersections [11, 57], which prevents a comparison of the results. Nevertheless, the final likelihood ratio test showed the relevance of the interaction terms for model enhancement, thereby supporting the existence of pertinent intersections. Overall, however, the predictive performance of the covariates considered is low (i.e. it seems more probable that other characteristics, such as marital status, desire to have children or health events, have a more pronounced influence on both outcomes). Fifth, collapsing categories of the MTSS scale is only partially methodologically justified and represents a clear constraint of this study. This decision was made because of the violation of the proportional odds assumption and to avoid the construction of an overly complex multinomial model. However, this approach is not entirely methodologically sound, because there is only limited methodological guidance on the fusion of categories except for one penalized approach [61].

In sum, identifying subgroups among current smokers that might be overlooked when focusing only on single demographic (e.g. age, sex) or socio-economic dimensions (e.g. education, income) could help to generate new ideas for addressing them through societal and individualized prevention programs. There is a need for consensus recommendations on exploring data to identify relevant instead of all intersections and to avoid non-reproducible and spurious findings. The approach used in this study can serve as a foundation for future studies exploring intersectionality in other addictive behaviours (e.g. alcohol use), with the long-term aim of reducing disparities in consumption patterns and fostering health equity.

AUTHOR CONTRIBUTIONS

Sabina Ulbricht: Conceptualization (equal); formal analysis (supporting); methodology (supporting); writing—original draft (lead); writing—review and editing (equal). Adrian Richter: Conceptualization (equal); formal analysis (lead); methodology (lead); writing—original draft (equal); writing—review and editing (lead). Daniel Kotz: Conceptualization (equal); funding acquisition (lead); methodology (supporting); writing—original draft (supporting); writing—review and editing (supporting). Sabrina Kastaun: Conceptualization (equal); methodology (supporting); supervision (lead); writing—original draft (supporting); writing—review and editing (supporting).

ACKNOWLEDGEMENTS

We extend our appreciation to the market research institute Oracle Life Science for their contribution to the data collection, with special acknowledgement to Constanze Cholmakow-Bodechtel and Marvin Krämer. Open Access funding enabled and organized by Projekt DEAL.

DECLARATION OF INTERESTS

None.

CLINICAL TRIAL REGISTRATION

German Clinical Trials Register (numbers DRKS00011322, DRKS00017157 and DRKS00028054).

Open Research

DATA AVAILABILITY STATEMENT

The data underlying this study are third-party data and are available to researchers on reasonable request. All proposals requesting data access will need to specify how it is planned to use the data, and all proposals will need approval of the DEBRA study team before the data release. The R code used in this study is publicly available in an open GitLab repository (https://gitlab.com/Adrian_HGW/debra_intersectionality).

Supporting Information

REFERENCES

1 World Health Organization (WHO). WHO discussion paper for the regional expert consultations: development of an implementation roadmap 2023–2030 for the global action plan for the prevention and control of NCDs 2023–2030. [Internet]. 2021. Available from: who.int/publications/m/item/implementation-roadmap-2023-2030-for-the-who-global-action-plan-for-the-prevention-and-control-of-ncds-2023-2030
2 Schaller K, Kahnert SGL, Mons U, Ouédraogo N. Tabakatlas Deutschland 2020. Heidelberg: Deutsches Krebsforschungszentrum; 2020.
3 Ding D, Do A, Schmidt HM, Bauman AE. A widening gap? Changes in multiple lifestyle risk behaviours by socioeconomic status in New South Wales, Australia, 2002–2012. PLoS ONE. 2015; 10(8):e0135338. https://doi.org/10.1371/journal.pone.0135338
4 Ernstsen L, Strand BH, Nilsen SM, Espnes GA, Krokstad S. Trends in absolute and relative educational inequalities in four modifiable ischaemic heart disease risk factors: repeated cross-sectional surveys from the Nord-Trøndelag health study (HUNT) 1984–2008. BMC Public Health. 2012; 12:266. https://doi.org/10.1186/1471-2458-12-266
5 Scholes S, Bajekal M, Love H, Hawkins N, Raine R, O'Flaherty M, et al. Persistent socioeconomic inequalities in cardiovascular risk factors in England over 1994–2008: a time-trend analysis of repeated cross-sectional data. BMC Public Health. 2012; 12(1):129. https://doi.org/10.1186/1471-2458-12-129
6 Nguyen-Grozavu FT, Pierce JP, Sakuma K-LK, Leas EC, McMenamin SB, Kealey S, et al. Widening disparities in cigarette smoking by race/ethnicity across education level in the United States. Prev Med. 2020; 139:106220. https://doi.org/10.1016/j.ypmed.2020.106220
7 Hoebel J, Kuntz B, Kroll LE, Finger JD, Zeiher J, Lange C, et al. Trends in absolute and relative educational inequalities in adult smoking since the early 2000s: the case of Germany. Nicotine Tob Res. 2018; 20(3): 295–302. https://doi.org/10.1093/ntr/ntx087
8 Pampel F, Legleye S, Goffette C, Piontek D, Kraus L, Khlat M. Cohort changes in educational disparities in smoking: France, Germany and the United States. Soc Sci Med. 2015; 127: 41–50. https://doi.org/10.1016/j.socscimed.2014.06.033
9 Nagelhout GE, de Korte-Boer D, Kunst AE, van der Meer RM, de Vries H, van Gelder BM, et al. Trends in socioeconomic inequalities in smoking prevalence, consumption, initiation, and cessation between 2001 and 2008 in the Netherlands. Findings from a national population survey. BMC Public Health. 2012; 12:303. https://doi.org/10.1186/1471-2458-12-303
10 Vangeli E, Stapleton J, Smit ES, Borland R, West R. Predictors of attempts to stop smoking and their success in adult general population samples: a systematic review. Addiction. 2011; 106(12): 2110–2121. https://doi.org/10.1111/j.1360-0443.2011.03565.x
11 Potter LN, Lam CY, Cinciripini PM, Wetter DW. Intersectionality and smoking cessation: exploring various approaches for understanding health inequities. Nicotine Tob Res. 2021; 23(1): 115–123. https://doi.org/10.1093/ntr/ntaa052
12 Brown TH, Richardson LJ, Hargrove TW, Thomas CS. Using multiple-hierarchy stratification and life course approaches to understand health inequalities: the intersecting consequences of race, gender, SES, and age. J Health Soc Behav. 2016; 57(2): 200–222. https://doi.org/10.1177/0022146516645165
13 Ailshire JA, House JS. The unequal burden of weight gain: an intersectional approach to understanding social disparities in BMI trajectories from 1986 to 2001/2002. Soc Forces. 2011; 90(2): 397–423. https://doi.org/10.1093/sf/sor001
14 Thoresen M. Spurious interaction as a result of categorization. BMC Med Res Methodol. 2019; 19(1): 28. https://doi.org/10.1186/s12874-019-0667-2
15 Anderson DR, Burnham KP, Gould WR, Cherry S. Concerns about finding effects that are actually spurious. Wildl Soc Bull (1973–2006). 2001; 29(1): 311–316.
16 Bell A, Holman D, Jones K. Using shrinkage in multilevel models to understand intersectionality. Methodology. 2019; 15(2): 88–96. https://doi.org/10.1027/1614-2241/a000167
17 Wood SN. Fast stable direct fitting and smoothness selection for generalized additive models. J R Stat Soc Series B Stat Methodology. 2008; 70(3): 495–518. https://doi.org/10.1111/j.1467-9868.2007.00646.x
18 Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction. Springer Science & Business Media; 2009. https://doi.org/10.1007/978-0-387-84858-7
19 Kastaun S, Brown J, Brose LS, Ratschen E, Raupach T, Nowak D, et al. Study protocol of the German study on tobacco use (DEBRA): a national household survey of smoking behaviour and cessation. BMC Public Health. 2017; 17(1): 378. https://doi.org/10.1186/s12889-017-4328-2
20 Kotz D, Brown J, West R. Predictive validity of the motivation to stop scale (MTSS): a single-item measure of motivation to stop smoking. Drug Alcohol Depend. 2013; 128(1–2): 15–19. https://doi.org/10.1016/j.drugalcdep.2012.07.012
21 Pashutina Y, Kastaun S, Ratschen E, Shahab L, Kotz D. Externe Validierung einer single-item Skala zur Erfassung der Motivation zum Rauchstopp. SUCHT. 2021; 67(4): 171–180. https://doi.org/10.1024/0939-5911/a000719
22 Organisation for Economic Co-operation and Development (OECD). OECD framework for statistics on the distribution of household income, consumption and wealth. 2013.
23 Völzke H, Neuhauser H, Moebus S, Baumert J, Berger K, Stang A, et al. Urban-rural disparities in smoking behaviour in Germany. BMC Public Health. 2006; 6(1):146. https://doi.org/10.1186/1471-2458-6-146
24 Fidler JA, Shahab L, West R. Strength of urges to smoke as a measure of severity of cigarette dependence: comparison with the Fagerström test for nicotine dependence and its components. Addiction. 2011; 106(3): 631–638. https://doi.org/10.1111/j.1360-0443.2010.03226.x
25 Kozlowski LT, Porter CQ, Orleans CT, Pope MA, Heatherton T. Predicting smoking cessation with self-reported measures of nicotine dependence: FTQ, FTND, and HSI. Drug Alcohol Depend. 1994; 34(3): 211–216. https://doi.org/10.1016/0376-8716(94)90158-9
26 Klosterhalfen S, Kotz D, Kastaun S. Did smoking, alcohol consumption, and physical activity change during the COVID-19 restrictions in Germany in spring 2020? SUCHT. 2022; 68(3): 129–138. https://doi.org/10.1024/0939-5911/a000765
27 Starker A, Kuhnert R, Hoebel J, Richter A. Rauchverhalten und Passivrauchbelastung Erwachsener – Ergebnisse aus GEDA 2019/2020-EHIS. J Health Monit. 2022;(3): 7–22.
28 Harrell FE Jr. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. 2nd ed. Cham Heidelberg New York Dordrecht London: Springer; 2015. https://doi.org/10.1007/978-3-319-19425-7
29 Richter A, Ulbricht S, Brockhaus S. Categorization of continuous covariates and complex regression models – friends or foes in intersectionality research. J Clin Epidemiol. 2024; 171:111368. https://doi.org/10.1016/j.jclinepi.2024.111368
30 Altman DG. Categorising continuous variables. Br J Cancer. 1991; 64(5): 975. https://doi.org/10.1038/bjc.1991.441
31 Buettner P, Garbe C, Guggenmoos-Holzmann I. Problems in defining cutoff points of continuous prognostic factors: example of tumor thickness in primary cutaneous melanoma. J Clin Epidemiol. 1997; 50(11): 1201–1210. https://doi.org/10.1016/S0895-4356(97)00155-8
32 Shepherd BE, Rebeiro PF, Caribbean, Central and South America network for HIV epidemiology. Brief report: assessing and interpreting the association between continuous covariates and outcomes in observational studies of HIV using splines. JAIDS J Acquir Immune Defic Syndr. 2017; 74(3): e60–e63. https://doi.org/10.1097/QAI.0000000000001221
33 Brant R. Assessing proportionality in the proportional odds model for ordinal logistic regression. Biometrics. 1990; 46(4): 1171–1178. https://doi.org/10.2307/2532457
34 Tutz G, Pößnecker W, Uhlmann L. Variable selection in general multinomial logit models. Comput Stat Data Anal. 2015; 82: 207–222. https://doi.org/10.1016/j.csda.2014.09.009
35 Hastie T, Tibshirani R, Tibshirani R. Best subset, forward stepwise or lasso? Analysis and recommendations based on extensive comparisons. Stat Sci. 2020; 35(4): 579–592. https://doi.org/10.1214/19-STS733
36 Royston P, Sauerbrei W. Stability of multivariable fractional polynomial models with selection of variables and transformations: a bootstrap investigation. Stat Med. 2003; 22(4): 639–659. https://doi.org/10.1002/sim.1310
37 Buchholz A, Holländer N, Sauerbrei W. On properties of predictors derived with a two-step bootstrap model averaging approach—a simulation study in the linear regression model. Comput Stat Data Anal. 2008; 52(5): 2778–2793. https://doi.org/10.1016/j.csda.2007.10.007
38 Heinze G, Wallisch C, Dunkler D. Variable selection – a review and recommendations for the practicing statistician. Biom J. 2018; 60(3): 431–449. https://doi.org/10.1002/bimj.201700067
39 Burnham KP, Anderson DR. Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods Res. 2004; 33(2): 261–304. https://doi.org/10.1177/0049124104268644
40 Dover DC, Islam S, Westerhout CM, Moore LE, Kaul P, Savu A. Computing the polytomous discrimination index. Stat Med. 2021; 40(16): 3667–3681. https://doi.org/10.1002/sim.8991
41 Van Calster B, Van Belle V, Vergouwe Y, Timmerman D, Van Huffel S, Steyerberg EW. Extending the c-statistic to nominal polytomous outcomes: the polytomous discrimination index. Stat Med. 2012; 31(23): 2610–2626. https://doi.org/10.1002/sim.5321
42 Gao M, Li J. Mcca: multi-category classification accuracy. Comprehensive R Archive Network (CRAN); 2019.
43 Rufibach K. Use of brier score to assess binary predictions. J Clin Epidemiol. 2010; 63(8): 938–939. https://doi.org/10.1016/j.jclinepi.2009.11.009
44 Rubin DB. Multiple imputation after 18+ years. J Am Stat Assoc. 1996; 91(434): 473–489. https://doi.org/10.1080/01621459.1996.10476908
45 van Buuren S, Groothuis-Oudshoorn K. Mice: multivariate imputation by chained equations in R. J Stat Softw. 2011; 45(3): 1–67. https://doi.org/10.18637/jss.v045.i03
46 White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. 2011; 30(4): 377–399. https://doi.org/10.1002/sim.4067
47 Searle SR, Speed FM, Milliken GA. Population marginal means in the linear model: an alternative to least squares means. Am Stat. 1980; 34(4): 216–221. https://doi.org/10.1080/00031305.1980.10483031
48 Lüdecke D. Ggeffects: tidy data frames of marginal effects from regression models. J Open Source Softw. 2018; 3(26):772. https://doi.org/10.21105/joss.00772
49 R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2023.
50 Weston S, Microsoft Corporation. DoParallel: foreach parallel adaptor for the 'parallel' package. R package version 1.0.17. 2022.
51 University of Greifswald. High performance computing. [Available from: https://rz.uni-greifswald.de/en/services/general/miscellaneous/high-performance-computing/]
52 Richter A. DEBRA: Intersectionality 2023. [Available from: https://gitlab.com/Adrian_HGW/debra_intersectionality]
53 Bader P, Boisclair D, Ferrence R. Effects of tobacco taxation and pricing on smoking behavior in high-risk populations: a knowledge synthesis. Int J Environ Res Public Health. 2011; 8(11): 4118–4139. https://doi.org/10.3390/ijerph8114118
54 Wilson LM, Avila Tang E, Chander G, Hutton HE, Odelola OA, Elf JL, et al. Impact of tobacco control interventions on smoking initiation, cessation, and prevalence: a systematic review. J Environ Public Health. 2012; 2012(1):961724. https://doi.org/10.1155/2012/961724
55 Smith PH, Kasza KA, Hyland A, Fong GT, Borland R, Brady K, et al. Gender differences in medication use and cigarette smoking cessation: results from the international tobacco control four country survey. Nicotine Tob Res. 2015; 17(4): 463–472. https://doi.org/10.1093/ntr/ntu212
56 Smith PH, Bessette AJ, Weinberger AH, Sheffer CE, McKee SA. Sex/gender differences in smoking cessation: a review. Prev Med. 2016; 92: 135–140. https://doi.org/10.1016/j.ypmed.2016.07.013
57 Zavala-Arciniega L, Meza R, Hirschtick JL, Fleischer NL. Disparities in cigarette, E-cigarette, cigar, and smokeless tobacco use at the intersection of multiple social identities in the US adult population. Results from the tobacco use supplement to the current population survey 2018–2019 survey. Nicotine Tob Res. 2023; 25(5): 908–917. https://doi.org/10.1093/ntr/ntac261
58 Boulesteix A-L, Groenwold RH, Abrahamowicz M, Binder H, Briel M, Hornung R, et al. Introduction to statistical simulations in health research. BMJ Open. 2020; 10(12):e039921. https://doi.org/10.1136/bmjopen-2020-039921
59 Vatcheva KP, Lee M, McCormick JB, Rahbar MH. The effect of ignoring statistical interactions in regression analyses conducted in epidemiologic studies: an example with survival analysis using cox proportional hazards regression model. Epidemiology (Sunnyvale). 2015; 6(1):216. https://doi.org/10.4172/2161-1165.1000216
60 Wood SN. Stable and efficient multiple smoothing parameter estimation for generalized additive models. J Am Stat Assoc. 2004; 99(467): 673–686. https://doi.org/10.1198/016214504000000980
61 Tutz G, Gertheiss J. Regularized regression for categorical data. Stat Model. 2016; 16(3): 161–200. https://doi.org/10.1177/1471082X16642560

Volume 120, Issue 9, September 2025, Pages 1863-1875
