Volume 48, Issue 6 pp. 859-866
SYSTEMATIC REVIEW
Open Access

Study design and primary outcome in randomized controlled trials in periodontology. A systematic review

Maryam Alshamsi

Periodontology Unit, Centre for Host-Microbiome Interactions, Dental Institute, King's College London, London, UK

Jaimini Mehta

Periodontology Unit, Centre for Host-Microbiome Interactions, Dental Institute, King's College London, London, UK

Luigi Nibali

Corresponding Author

Periodontology Unit, Centre for Host-Microbiome Interactions, Dental Institute, King's College London, London, UK

Correspondence

Luigi Nibali, King's College London Dental Institute, Guy’s Hospital, Great Maze Pond, London, UK.

Email: [email protected]

First published: 11 February 2021

Maryam Al-Shamsi and Jaimini Mehta have contributed equally to this study

Funding information

No specific funding was obtained for this study.

Abstract

Aim

The aim of this review is to assess study design and risk of bias related to primary outcome in recently published randomized controlled trials (RCTs) in periodontology.

Method

An electronic (Medline, EMBASE and Cochrane library) and a manual search were completed to detect RCTs in humans, with an outcome in the field of periodontology and published in English from January 2018 up to March 2020.

Results

Data extraction of 318 publications meeting the inclusion criteria was performed by two reviewers. Most studies adopted a parallel-group superiority design in a university setting. Overall, 54% of papers reported the primary outcome and relative sample size calculation, while only 37% also included reproducibility estimates relative to the primary outcome. Papers published in journals with higher impact factors had better compliance with primary outcome reporting and lower overall risk of bias scores.

Conclusion

Improvements in the quality of RCTs in periodontology are still needed. The importance of defining a clinically relevant study primary outcome and building the study around it needs to be emphasized. Furthermore, RCTs in periodontology could consider, when appropriate, some of the study design options which facilitate application of the principles of personalized medicine.

Clinical relevance

Scientific rationale for the study: Clinical advances in periodontology need to be supported by robust research studies. We aimed to go beyond simple assessment of risk of bias, to systematically appraise the reporting of primary outcomes and the details of study design.

Principal findings: Only around half of randomized controlled trials (RCTs) in periodontology published in the last 2 years clearly report the primary outcome and associated sample size calculation. The overall risk of bias is inversely associated with the journal's impact factor.

Practical implications: RCTs in periodontology, irrespective of the journal in which they are published, should report the primary outcome, associated sample size calculation and reproducibility measures.

1 INTRODUCTION

Randomized controlled trials (RCTs) comparing the safety, efficacy and effectiveness of different treatment modalities have long been used to determine the gold standard intervention for different phases and types of periodontitis. The CONSORT statement, updated in 2010 (Schulz et al. 2010), was introduced to improve the methodology of clinical trials and reduce risk of bias, and it has been adopted as a submission requirement by most international peer-reviewed dental journals. Although the quality of RCTs has clearly improved over the last couple of decades, not every methodological requirement is consistently met. While much effort has been dedicated to improving the methodology of randomization, perhaps less attention has been paid to another very important methodological aspect of RCTs: the definition of the primary outcome. The primary outcome should be clearly identified and reported, bearing in mind what is clinically relevant (Meher & Alfirevic, 2014). Clear “a priori” identification of a primary outcome avoids the risk of emphasizing secondary outcomes or switching outcomes. It also adds robustness to the study, as other aspects of the study can be built around it. For example, the study sample size calculation should be performed based on the primary outcome, to avoid the risk of carrying out underpowered studies or wasting resources by recruiting too large a sample (Andrade, 2015). It is also important that measures of reproducibility of the primary outcome are reported, to show accuracy and ensure that observed differences between groups are not due to measurement error.

Most RCTs in periodontology are designed as parallel-group superiority studies, and only a limited breadth of study design options is represented. In other words, non-inferiority, crossover and factorial study designs, which may sometimes be appropriate to answer specific clinically relevant questions, may be under-used in periodontology. The same probably applies to novel RCT designs now employed in medicine, which facilitate application of the principles of personalized medicine, such as SMART design (Lavori & Dawson, 2004), n-of-1 trials or randomized studies with adaptive design (Antoniou et al. 2016). The aim of this review is to assess primary outcome and study design related to primary outcome in recently published RCTs in periodontology. These two aspects are closely correlated, as they should both stem from a clear definition of the research question (“what do we want to test and how?”). Aspects of risk of bias were also assessed, as is routinely done in systematic reviews, to explore any potential association between the main study findings (primary outcome and study design aspects) and overall risk of bias.

2 MATERIALS AND METHODS

A systematic review protocol was written in the planning stages, and the PRISMA statement (Moher et al., 2009) was followed both in the planning and reporting of the review (checklist attached as Supporting Information 1).

2.1 Focused questions

  • What study designs are employed in human RCTs in periodontology?
  • How well is the primary outcome defined in recent human RCTs in periodontology?
  • Is the sample size of recent human RCTs in periodontology based on the primary outcome?
  • Are reproducibility estimates for the primary outcome carried out in recent human RCTs in periodontology?

2.2 Eligibility criteria

In brief, the PICOS method was the following:
  • (P) Participants: Patients included in RCTs in the field of periodontology
  • (I) Interventions: Any type of periodontal intervention
  • (C) Comparisons: Different types/timings of periodontal intervention
  • (O) Outcomes: Definition of study design, description of primary outcome, sample size calculation and reproducibility estimates
  • (S) Studies: Randomized controlled trials (RCTs) in periodontology

The inclusion criteria for studies in this systematic review were as follows: (i) randomized controlled trials; (ii) with outcomes in the field of periodontology (clinical, radiographic, patient-reported or other). Exclusion criteria were as follows: (i) animal studies, (ii) laboratory studies, (iii) studies focusing on implants or peri-implant diseases and (iv) systematic reviews and meta-analysis of RCTs.

2.3 Information sources and search

The search included papers published from January 2018 to 31st March 2020 in Medline, EMBASE and Cochrane library and was complemented by a manual search on Journal of Dental Research, Journal of Clinical Periodontology, Journal of Periodontology and Journal of Periodontal Research and by a search on Open Grey.

Keywords used for the search were as follows: “periodontal” OR “periodontology” AND “randomized controlled trial” OR “randomised controlled trial” OR “randomised clinical trial” OR “randomized clinical trial” OR “RCT”.
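
The keyword string above does not show how the Boolean operators were grouped. As a minimal sketch, assuming the intended reading is (topic terms) AND (trial design terms), the query could be assembled as follows. The grouping and the helper names `or_group` and `build_query` are assumptions made for illustration, not the documented search strategy, and each database would need its own syntax.

```python
# Terms taken from the keyword list in the text.
TOPIC_TERMS = ["periodontal", "periodontology"]
DESIGN_TERMS = [
    "randomized controlled trial",
    "randomised controlled trial",
    "randomised clinical trial",
    "randomized clinical trial",
    "RCT",
]


def or_group(terms):
    """Join quoted terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(topic_terms, design_terms):
    """Hypothetical helper: AND the topic group with the design group."""
    return or_group(topic_terms) + " AND " + or_group(design_terms)


query = build_query(TOPIC_TERMS, DESIGN_TERMS)
print(query)
```

Making the parentheses explicit removes any dependence on a database's default operator precedence.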

2.4 Study characteristics

Study selection was conducted in duplicate by two independent reviewers (M.A., J.M.), who assessed all papers identified in the search to define those suitable according to the inclusion criteria. In case of doubts or disagreements between reviewers for study inclusion, the decision about study eligibility was made by trying to reach a consensus between the two reviewers or by consulting a third reviewer or arbitrator (author L.N.). Data extraction was carried out upon study inclusion. Owing to the very large number of papers to screen, studies were divided between the two reviewers and a subset of studies were assessed in duplicate for calibration purposes. The agreement value between reviewers was calculated in the subset of papers for which data were extracted by both reviewers.
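
The text does not state which agreement statistic was used. As an illustration only, the sketch below computes two common choices for a duplicate-assessed subset: raw percentage agreement and Cohen's kappa, which corrects for agreement expected by chance. The function names and the example decisions are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two reviewers gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected) / (1 - expected)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of the two reviewers' marginal proportions,
    # summed over categories (Counter returns 0 for unseen categories).
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical inclusion decisions for four papers.
reviewer_1 = ["include", "include", "exclude", "exclude"]
reviewer_2 = ["include", "include", "exclude", "include"]
print(percent_agreement(reviewer_1, reviewer_2), cohen_kappa(reviewer_1, reviewer_2))
```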

This systematic review focused specifically on trial design and methodology. A standardized data extraction spreadsheet was used, where data from eligible studies were recorded. In particular, the specific items below were recorded:
  1. Study design:
    • Superiority/non-inferiority/unclear
    • Explanatory/pragmatic/unclear
    • Phase I/II/III/IV
    • Parallel/crossover
    • Open/single-, double-, triple-blinded or unclear
    • Factorial design or not.
  2. Primary outcome:
    • Described/unclear
    • Description of method for primary outcome measure reproducibility provided or not
    • Description of sample size calculation for primary outcome provided or not.
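
The primary outcome items above can be pictured as one row per paper in the extraction spreadsheet. The following is a minimal sketch under assumed field names (`ExtractionRecord`, `fully_compliant` and `compliance_by_band` are hypothetical and not taken from the review's actual spreadsheet); it shows how "complete reporting" could be derived from the three primary outcome items and summarized by the impact factor bands used in this review.

```python
from dataclasses import dataclass


@dataclass
class ExtractionRecord:
    """One extracted paper; field names are illustrative assumptions."""
    paper_id: str
    primary_outcome_described: bool
    sample_size_calculation: bool
    reproducibility_reported: bool
    journal_if: float

    def fully_compliant(self):
        """True only if all three primary outcome items were reported."""
        return (self.primary_outcome_described
                and self.sample_size_calculation
                and self.reproducibility_reported)


def if_band(journal_if):
    """Map an impact factor to the bands IF < 1, IF 1-4, IF > 4."""
    if journal_if < 1:
        return "IF < 1"
    if journal_if <= 4:
        return "IF 1-4"
    return "IF > 4"


def compliance_by_band(records):
    """Proportion of fully compliant papers within each impact factor band."""
    totals, compliant = {}, {}
    for record in records:
        band = if_band(record.journal_if)
        totals[band] = totals.get(band, 0) + 1
        compliant[band] = compliant.get(band, 0) + int(record.fully_compliant())
    return {band: compliant[band] / totals[band] for band in totals}


# Hypothetical records, not data from the review.
records = [
    ExtractionRecord("paper-a", True, True, True, 0.5),
    ExtractionRecord("paper-b", True, False, True, 0.8),
    ExtractionRecord("paper-c", True, True, True, 5.2),
]
print(compliance_by_band(records))
```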

2.5 Risk of bias analysis

The quality of the included studies was assessed using the Cochrane Collaboration's Tool for assessing risk of bias for RCTs. The checklist described by Higgins and co-workers (Higgins et al. 2011) was used.

2.6 Summary measures and planned method of analysis

Studies were analysed descriptively, but no meta-analysis for quantitative data synthesis was possible due to the nature of this review. Statistical analysis (Chi-square with p value for statistical significance <0.05) was carried out to test associations between both complete reporting of primary outcome and overall risk of bias with journal impact factor (IF), based on 2019 IF (accessed on website www.bioxbio.com on 1st October 2020).
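
As an illustration of the test named above, the sketch below computes a Pearson Chi-square statistic for a contingency table of counts (rows: impact factor bands; columns: compliant or not). For a 3 x 2 table the test has 2 degrees of freedom, for which the upper-tail p value has the closed form exp(-x/2), so no statistics library is needed. The function names and the counts are hypothetical; the review does not describe the software used.

```python
import math


def chi_square_independence(table):
    """Pearson Chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail p value for a Chi-square variable with exactly 2 dof."""
    return math.exp(-statistic / 2.0)


# Hypothetical counts: [compliant, non-compliant] per impact factor band.
example_table = [[10, 20], [20, 10], [15, 15]]
statistic, dof = chi_square_independence(example_table)
p_value = p_value_two_dof(statistic)
print(statistic, dof, p_value, p_value < 0.05)
```

For tables with other degrees of freedom, the closed form does not apply and a general Chi-square distribution function would be needed.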

3 RESULTS

The study flow chart is presented in Figure 1. A total of 642 papers were identified by the electronic search, and an additional 6 were identified through manual search, for a combined total of 648. Following screening, 318 papers were found to be suitable. Reasons for exclusion were no periodontal outcomes (n = 155), focusing on implants (n = 134), laboratory-based (n = 4), on animal models (n = 2), not in English (n = 8), not RCT (n = 6), duplicate reports (n = 7), protocol only (n = 1), secondary analysis of previously published study (n = 6), abstract only (n = 6) and full text not found (n = 1). Inter-reviewer agreement for study inclusion was 0.93. Agreement on 92.4% of the extracted data was achieved between examiners in the 28 papers undergoing duplicate data extraction.

FIGURE 1. Flow chart of study inclusion

The 318 included papers reported studies carried out in 46 different countries and published between 2018 and 2020. The total number of patients included ranged from 7 (Rajendran et al. 2018) to 1877 (Ramsay et al. 2018). A total of 62 papers (19.5%) were published in journals with IF < 1, 213 papers (66.9%) in journals with IF 1–4 and 43 papers (13.5%) in journals with IF > 4.

3.1 Study design

Table 1 reports data relative to study design, including whether and which study design was clearly reported in the paper, and how study design was interpreted by the reviewers (if not reported). Overall, 19 studies were described as pilot/feasibility, seven papers reported whether the trial was a superiority or non-inferiority trial and three papers reported the trial “phase.” One paper reported whether the study was pragmatic or explanatory (Ramsay et al. 2018). Eighty-six papers described the study as parallel-group and 13 as crossover, while the remaining papers did not report this aspect. A total of 70 studies were described as “split-mouth.” In terms of blinding, 240 studies reported details. Out of these, 111 were described as “single-blind,” 107 as “double-blind” and 22 as “triple-blind.”

TABLE 1. Breakdown of details of study design as reported in the papers and as identified by the reviewers

Aspect             | Described in the publication | n   | Identified by the reviewers | n
Pilot/feasibility  | Pilot/feasibility            | 19  | Pilot/feasibility           | 16
Phase              | II                           | 2   | II                          | 2
                   | III                          | 1   | III                         | 311
                   | Not reported                 | 315 | Not identifiable            | 5
Parallel/crossover | Parallel                     | 86  | Parallel                    | 273
                   | Crossover                    | 13  | Crossover                   | 14
                   | Not reported                 | 219 | Not identifiable            | 1
Split-mouth or not | Split-mouth                  | 70  | Split-mouth                 | 70
                   | Not split-mouth              | 248 | Not split-mouth             | 248

When the papers were critically reviewed to assess study design, 16 studies were considered “pilot,” 1 was considered “feasibility,” three were considered “non-inferiority” and eight were considered to have a factorial design. A handful of studies reported other specific study characteristics, such as clustered randomization (Ramsay et al. 2018; Al Bardaweel & Dashash, 2018) or Mendelian randomization (Czesnikiewicz-Guzik et al. 2019). “Blinding” included the examiner (n = 213), the therapist (n = 49), the patient (n = 110) and the statistician (n = 29).

3.2 Primary outcome

Out of 318 papers, 212 reported what the primary outcome of the study was, while 106 did not. The primary outcome was a clinical parameter (n = 173), a laboratory variable (n = 31) or a patient-reported outcome (n = 8).

3.3 Sample size calculation

Overall, 195 papers reported a sample size calculation, while 123 did not. Twenty papers reported a sample size calculation despite not defining the primary outcome of the study. Overall, 170 papers (54%) reported primary outcome and relative sample size calculation.

3.4 Reproducibility estimates

A total of 154 papers reported reproducibility estimates, while 163 did not. Interestingly, some papers that did not report the primary outcome nevertheless calculated reproducibility estimates (n = 29).

Overall, only 117 papers (37.3% of the total included) clearly reported the primary outcome, relative reproducibility estimates and relative sample size calculation. A strong association was detected between papers reporting all aspects of primary outcome (definition of primary outcome, relative sample size and reproducibility estimate) and journal IF. More specifically, 15.5% of papers in journals with IF < 1, 39.0% of papers in journals with IF 1–4 and 58.1% of papers in journals with IF > 4 complied with the primary outcome reporting outlined above (Chi-square p < 0.001; Figure 2).

FIGURE 2. Compliance with outcome reporting for studies divided by journal impact factor (IF)

3.5 Overall risk of bias

The overall risk of bias score (see Supporting Information 2) revealed that 73 papers (23.0%) were judged to have overall low risk of bias (all domains at low risk of bias). A total of 121 papers (38.1%) were judged to have “some concerns” (unclear bias in at least one item, but no domain at high risk of bias), while 123 (39.0%) were judged at high risk of bias (at least one item at high risk of bias; Higgins & Thomas, 2020).
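
The overall judgement rule described above (low only if every domain is low, high if any domain is high, otherwise "some concerns") can be sketched as a small function. The function name and the example judgements are hypothetical; the domain labels follow the Cochrane tool's usual headings and are included only for illustration.

```python
LOW, UNCLEAR, HIGH = "low", "unclear", "high"


def overall_risk_of_bias(domain_judgements):
    """Collapse per-domain judgements into one overall judgement.

    domain_judgements maps a domain name to "low", "unclear" or "high".
    """
    judgements = list(domain_judgements.values())
    if not judgements:
        raise ValueError("at least one domain judgement is required")
    for judgement in judgements:
        if judgement not in (LOW, UNCLEAR, HIGH):
            raise ValueError(f"unknown judgement: {judgement!r}")
    if HIGH in judgements:
        return "high"           # at least one item at high risk of bias
    if UNCLEAR in judgements:
        return "some concerns"  # unclear in at least one item, none high
    return "low"                # all domains at low risk of bias


# Hypothetical paper: allocation concealment not described clearly.
example_paper = {
    "random sequence generation": LOW,
    "allocation concealment": UNCLEAR,
    "blinding of participants and personnel": LOW,
    "blinding of outcome assessment": LOW,
    "incomplete outcome data": LOW,
    "selective reporting": LOW,
    "other bias": LOW,
}
print(overall_risk_of_bias(example_paper))
```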

The parameter least likely to be scored as “low risk of bias” (41.3% of papers) was allocation concealment, while the most likely to be scored “low risk of bias” was “other bias” (98.4%). A higher number of papers with “high risk of bias” (53.6%) were detected among papers published in journals with IF < 1, compared with papers in journals with IF 1–4 (37.5%) and in journals with IF > 4 (28.8%; Chi-square p = 0.022; Figure 3).

FIGURE 3. Risk of Bias (RoB) of studies divided by journal impact factor (IF)

4 DISCUSSION

Randomized controlled trials published in the field of periodontology in the last 2 years were reviewed to assess details of study design and items related to the study primary outcome. The main focus of the present study was on primary outcome and study design, as they are closely correlated. Other aspects of risk of bias were also assessed, as is routinely done in systematic reviews, to explore any potential association between the main study findings (primary outcome and study design aspects) and overall risk of bias. The main finding of this review is that just over half of the papers (54%) reported the study primary outcome and the sample size calculation related to that outcome. This is broadly in line with the finding that 61% of RCTs of pre-term birth interventions had a pre-specified primary outcome underpinned by a sample size calculation (Meher & Alfirevic, 2014). The problem extends beyond original studies: the authors of a publication covering 283 Cochrane reviews reported that more than half did not include data on the primary outcome (Kirkham et al. 2010).

Furthermore, just over a third of the papers included in the present review clearly reported what the primary outcome of the study was, calculated the sample size based on the primary outcome and presented reproducibility estimates for the primary outcome. Registering the study in a clinical trial register before the first participant is recruited is important to reduce publication and outcome reporting biases, as well as to provide a public record of basic study results in a standardized format (Zarin & Keselman, 2007). Interestingly, many authors of the “primary outcome non-compliant” studies had reported the primary outcome at the time of clinicaltrials.gov registration, but subsequently failed to report it in the publication. These findings should be interpreted in the context of increasing awareness of the importance of selecting, out of all possible outcomes of the study, the primary outcome. This a priori choice avoids cherry-picking significant results (“switched outcomes”), thus reducing misinterpretation of results and protecting against the type I error that can arise when significant associations are found for secondary outcomes. Switching reporting of outcomes can also lead to “spin,” which is a misleading emphasis placed on the study results with the aim of presenting positive findings (Heneghan et al., 2017). Presenting a clear primary outcome also allows the calculation of the required sample size, based on the main research question (selected out of several possible questions to be tested; Andrade, 2015). The correct sample size avoids wasting resources by recruiting too few patients (not enough to show a difference between groups) or too many patients (over and above the number sufficient to show a difference). A systematic review of RCTs published in dentistry between 1955 and 2013 judged the sample size calculation as adequate in only 17.6% of the reviewed trials, although an improvement was observed in more recent trials (Saltaji et al., 2017), which is more in agreement with the present study.
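
To make the link between primary outcome and sample size concrete, the sketch below applies the standard normal-approximation formula for comparing the means of a continuous primary outcome between two equal parallel groups with a two-sided test: n per group = 2 (z_(1-alpha/2) + z_(power))^2 (SD / difference)^2. The function name and the input values (a 1 mm difference with a 1.5 mm standard deviation) are hypothetical and chosen only for illustration; they are not taken from any reviewed trial.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants per group needed to detect a mean difference `delta`.

    Normal approximation, two-sided test, equal group sizes, common SD.
    """
    if delta <= 0 or sd <= 0:
        raise ValueError("delta and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)  # round up to a whole participant


# Hypothetical primary outcome: difference in attachment level change (mm).
n_per_group = sample_size_per_group(delta=1.0, sd=1.5)
print(n_per_group)
```

Because the required number scales with (SD / difference) squared, halving the difference to be detected roughly quadruples the sample, which is why the calculation has to be tied to one pre-specified primary outcome. Any allowance for dropouts would be added on top of this figure.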

So, how should we interpret the large number of studies with no defined primary outcome? It has been suggested that they should be viewed with caution, as they may emphasize results not in line with the actual objectives of the study (Andrade, 2015). The interpretation is of course different for exploratory studies, which have a recognized inherent higher risk of false positive findings.

The great majority (82%) of reported primary outcomes were clinical, the most common being centred on clinical attachment loss (CAL) (43 papers), followed by probing pocket depths (PPD), bleeding on probing and Gingival Index. Only a small proportion of studies (4%) had patient-reported primary outcomes (usually measured on a visual analogue scale), while the remainder of primary outcomes (15%) were laboratory-based, such as HbA1c and microbial biomarkers. Therefore, the proportion of studies focusing mainly on patient-reported outcomes in the periodontal literature is still relatively small, and researchers tend to concentrate their attention on clinical parameters. It has been stressed that clinical trials need to select outcomes that have real importance in clinical settings, thus making the findings of the study translational (Heneghan et al., 2017). For practical reasons, this often involves the choice of surrogate outcomes, such as CAL and PPD. In this case, one must question whether surrogate outcomes really correlate with important long-term disease outcomes. Furthermore, primary outcomes should not be subjective, should use validated scales and, of paramount importance, should be relevant to patients and decision makers. Hence the recent effort to increase patient and public involvement in research, to ensure that studies are conceived and designed with greater input from end users (Heneghan et al., 2017). For example, in studies on rheumatoid arthritis, the choice of primary outcome was recently informed by a discrete choice experiment used to assess affected people's preferences (Stamuli et al. 2017). Using composite end points (Trombelli et al. 2020) is another common strategy, although selection of the correct combination of clinically relevant outcomes is not straightforward.

Another important finding of this review is that the great majority of RCTs were phase III superiority parallel-group studies with some form of blinding reported (usually examiner-blinding and often patient-blinding). A few studies had a factorial design (i.e. when two or more experimental interventions are not only evaluated separately, but also in combination and against a control), although it was not always reported as such. Approximately 4% of studies employed a crossover design, which allows each patient to receive different treatments during different time periods. A fairly sizeable proportion (22%) of the reviewed RCTs in periodontology used a split-mouth design, which offers the clear advantage of requiring smaller sample sizes, since this design removes inter-subject variability from the estimated treatment effect (Zhu et al. 2017), while at the same time introducing a potential “carry-over effect” to the contralateral side (Pozos-Guillén et al., 2017). An example of an innovative study design was provided by Ramsay and co-workers, who conducted a multicentre, pragmatic split-plot, randomized open trial with a cluster factorial design. In this study, each practice was randomized to provide routine or personalized oral hygiene advice. Within each practice, participants were then randomized to different frequencies of periodontal instrumentation (Ramsay et al. 2018). In the era of personalized medicine, it is striking that studies in periodontology in the last 2 years have not ventured towards innovative study designs such as SMART design (Lavori & Dawson, 2004) and adaptive design (Antoniou et al. 2016). In contrast with traditional RCTs, SMARTs offer the possibility of studying treatment sequences (Moodie et al. 2016). Studies with adaptive design break the traditional “rigidity” of RCTs, opening up the possibility of adapting the study to initial results, for example by augmenting or reducing a certain intervention or modifying its frequency. Although these types of studies are becoming increasingly popular in medicine, particularly for cancer (Wang et al. 2012; Kidwell, 2014), their application to periodontology is still limited (Xu et al. 2020). We can speculate that this is due to investigators not being fully aware of these alternative study designs, to difficulties in obtaining approvals and funding with unconventional study designs or, alternatively, to authors deeming that adaptive or SMART design studies may not be appropriate in periodontal research.

Based on the reviewers' judgement using the Cochrane Collaboration's Tool, 39% of papers had at least one item at high risk of bias, while 23% were judged to have overall low risk of bias. This highlights the fact that certain aspects of study design, particularly allocation concealment, should be better reported in RCTs in periodontology. Aspects related to randomization, allocation concealment and blinding were assessed in RCTs in periodontology published in 1996–1998 in a systematic review by Montenegro and co-workers (Montenegro et al. 2002). They concluded that quality of RCTs in periodontology frequently did not meet recommended standards. A follow-up systematic review by the same group (Leow et al. 2016) showed that dramatic improvements had occurred over 14 years in the aspects of randomization, allocation concealment and masking. In agreement with the present study, the aspect with the least adherence to required reporting standards was allocation concealment (Leow et al. 2016). Publications in journals with higher IF had lower overall RoB scores and better compliance with “primary outcome” reporting, compared with papers published in lower IF journals. This association may be testament to the stricter review process and scrutiny required in higher IF journals in periodontology but is controversial in other fields (Macleod et al. 2015; Cramer et al. 2015; Saginur et al. 2020). In fact, it has been suggested that journal IF is a poor measure of research quality (Tressoldi et al. 2013).

The novelty of this study is the aim to assess study design and aspects related to primary outcome of recent RCTs published in periodontology, extending beyond a simple assessment of risk of bias. Limitations are the assessment of papers published in a relatively short timeframe and the exclusion of studies covering other aspects related to periodontology such as implants.

5 CONCLUSION

In summary, this study clearly shows that improvements in the quality of RCTs in periodontology are still needed. The importance of defining a clinically relevant study primary outcome and “building” the study around it needs to be emphasized. All stakeholders, particularly patients, should be involved in the choice of relevant primary outcomes. This could be a positive step towards the reduction of bias. Editors and reviewers of periodontal scientific journals, especially those with lower IF, should give sufficient attention to experimental design as well as to the novelty of findings and should put in place measures to improve the reporting of primary outcomes and reduce bias (Macleod et al. 2015). Adherence to the CONSORT statement (http://www.consort-statement.org/consort-2010; and to extensions to be used for particular designs of RCTs), as well as to Good Clinical Practice and Declaration of Helsinki principles (World Medical Association Declaration of Helsinki, 2013), should be enforced by all journals. Periodontal researchers could also consider being a bit more adventurous and attempting other study designs, which could be more useful for answering clinical questions and more relevant in the era of personalized medicine.

CONFLICT OF INTEREST

The authors declare that they have no conflicts of interest.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.
