Predicting maximal oxygen uptake from the 6 min walk test in patients with heart failure
Abstract
Aims
A cardiopulmonary exercise (CPX) test is considered the gold standard in evaluating maximal oxygen uptake. This study aimed to evaluate the predictive validity of equations provided by Burr et al., Ross et al., Adedoyin et al., and Cahalin et al. in predicting peak VO2 from 6 min walk test (6MWT) distance in patients with heart failure (HF).
Methods and Results
New York Heart Association Class I–III HF patients performed a maximal effort CPX test and two 6MWTs. Correlations between CPX VO2 peak and the predicted VO2 peak, coefficient of determination (R2), and mean absolute percentage error (MAPE) scores were calculated. P-values were set at 0.05. A total of 106 participants aged 62.5 ± 11.5 years completed the tests. The mean VO2 peak from CPX testing was 16.4 ± 3.9 mL/kg/min, and the mean 6MWT distance was 419.2 ± 93.0 m. The predicted mean VO2 peak (mL/kg/min) by Burr et al., Ross et al., Adedoyin et al., and Cahalin et al. was 22.8 ± 8.8, 14.6 ± 2.1, 8.30 ± 1.4, and 16.6 ± 2.8. A significant correlation was observed between the CPX test VO2 peak and predicted values. The mean difference (0.1 mL/kg/min), R2 (0.97), and MAPE (0.14) values suggest that the Cahalin et al. equation provided the best predictive validity.
Conclusions
The equation provided by Cahalin et al. is simple and has a strong predictive validity, and researchers may use the equation to predict mean VO2 peak in patients with HF. Based on our observation, equations to predict individual maximal oxygen uptake should be used cautiously.
Introduction
Maximal oxygen uptake (VO2 max) is considered the gold standard for measuring aerobic exercise capacity.1 Clinically, peak oxygen consumption (VO2 peak) is often used as a surrogate for VO2 max for patient populations. VO2 peak is determined by cardiopulmonary exercise (CPX) testing, typically performed on a treadmill or bicycle ergometer, with incremental increases in exercise workloads to maximal effort, generally limited by symptoms or fatigue. Results from CPX testing are used to assess exercise tolerance, develop exercise prescriptions, evaluate treatment efficacy, and investigate exercise-induced adaptations of the oxygen transport and utilization system.1 Maximal effort CPX testing is generally well tolerated by patients with cardiovascular diseases.2 In patients with heart failure (HF), the landmark HF-ACTION study and other meta-analyses reported no adverse effects of CPX testing in HF with both preserved and reduced ejection fraction.3-5 However, with greater than minimal risk involved,6 it is recommended that maximal effort CPX testing in patients with clinically stable HF be performed in a laboratory setting supervised by trained medical personnel.7 Because the test is expensive and the required infrastructure and qualified personnel may not be readily available, alternate forms of testing to measure functional capacity are often used. For patients with HF, the sub-maximal 6 min walk test (6MWT), with established reliability and validity, is commonly recommended as an alternate choice.8, 9
The 6MWT is cost-effective, easy to administer, and well tolerated.10 It is used as an outcome measure to determine activity of daily living11 and quality of life outcomes12-14 and in clinical practice to evaluate treatment response and exercise capacity, predict frailty, and evaluate mortality and 30 day re-hospitalization rate in patients with HF.15-17 With 6MWT distance correlating with VO2 values,18 its ability to accurately predict physiologic change in VO2 peak in patients with HF can have practical utility.
Over the years, various equations to predict VO2 max or VO2 peak from a 6MWT have been proposed. In this study, we evaluated four prediction equations developed by Burr et al.,19 Ross et al.,20 Adedoyin et al.,21 and Cahalin et al.22 and assessed their ability to accurately predict VO2 peak from the 6MWT in patients with HF. The equations, as described in the Methods section, differ from each other in how they were formulated, the population they were tested on, and the number of variables required to predict VO2 peak. These equations were specifically selected for their simplicity and ease of use, whereby VO2 peak is estimated from easily assessable information such as age, sex, weight, resting heart rate (HR), and 6MWT distance. Cahalin et al.22 provided several prediction equations from their study of patients with HF. We excluded the equations that required values of forced vital capacity, forced expiratory volume, cardiac index, rate pressure product, pulmonary artery pressure, or left ventricular ejection fraction for computing peak VO2, as these values may not be readily available or would require additional testing. Additionally, treatment guidelines for the management of HF, including the use of beta-blockers and other drugs that impact cardiac function, have undergone modifications in the past 15 years.23-30 Consequently, because each of the study equations was developed before more recent treatment modifications came into effect, there is a need to re-evaluate their ability to accurately predict peak VO2 in stable patients with HF. Therefore, the purpose of the study was to evaluate and compare the predictive validity of equations provided by Burr et al.,19 Ross et al.,20 Adedoyin et al.,21 and Cahalin et al.22 in predicting peak VO2 from 6MWT distance in patients with stable HF.
Methods
This secondary analysis utilizes baseline measures of peak VO2 from CPX testing and 6MWT data from the National Institutes of Health-funded longitudinal study titled Heart failure Exercise And Resistance Training (HEART) Camp (R01-HL112979). The two-site HEART Camp study was a facility-based randomized controlled intervention designed to improve adherence to the recommended 150 min/week of aerobic exercise in patients with HF. The investigation conforms with the principles outlined in the Declaration of Helsinki and was approved by the Institutional Review Board at the University of Nebraska Medical Center (Chairperson: Dr. Bruce Gordon; IRB# 608-11-FB, approved 14 December 2011). Participants signed informed consent before participation. The primary study protocol and its results have been published.31, 32 For this study, we use data from the Lincoln (NE) site.
Subjects
A total of 119 participants diagnosed with HF with New York Heart Association (NYHA) scale I–III were recruited for the study. Of these, 12 participants dropped out after enrolment or did not complete the second 6MWT, and one was excluded because of a 6MWT distance >600 m (described below), leaving a sample size of 106. Power analysis was performed using G*Power.33 A post-hoc test for the correlation between VO2 and 6MWT distance, using a sample size of 106, α of 0.05, and a moderate effect size of 0.30, gave a power of 0.90. Participants were recruited from the Bryan Heart Clinic in Lincoln, Nebraska. Inclusion criteria for participation were (i) diagnosis of HF (Stage C chronic HF confirmed by echocardiography and clinical evaluation); (ii) 19 years of age or older; (iii) able to speak and read English; (iv) telephone access in-home; and (v) stable pharmacologic therapy per guidelines for past 30 days. Exclusion criteria included (i) clinical evidence of decompensated HF; (ii) unstable angina pectoris; (iii) myocardial infarction, coronary artery bypass surgery, or biventricular pacemaker <6 weeks prior; (iv) orthopaedic or neuromuscular disorders preventing participation in aerobic exercise and strength/resistance training; (v) participation in aerobic exercise three times per week during the past 8 weeks; (vi) cardiopulmonary stress test results that precluded safe exercise training; (vii) plans to move >50 miles from the exercise site within the next year; (viii) peak oxygen consumption >21 mL/kg/min in women and >24 mL/kg/min in men; and (ix) planned or current pregnancy. Additionally, per recommendations by Ross et al.,20 6MWT distances >600 m were not considered for analysis, because their prediction equation underestimated VO2 peak in participants walking >600 m in the 6MWT. In our sample, one person walked a distance of 605 m. We excluded this value from the analysis.
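The post-hoc power figure above can be approximated by hand. The following is a minimal sketch that assumes a two-sided test of zero correlation and uses the Fisher z approximation rather than the exact routine in G*Power; the function name `correlation_power` is hypothetical, and the tail setting is an assumption because the text does not state it. The result should land near, but not necessarily exactly on, the reported 0.90.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate two-sided power to detect a correlation r with n subjects.

    Uses the Fisher z transformation: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3). This is an approximation, not the
    exact bivariate-normal computation.
    """
    norm = NormalDist()
    z_effect = atanh(r) * sqrt(n - 3)        # standardized effect
    z_crit = norm.inv_cdf(1 - alpha / 2)     # two-sided critical value
    # Probability of rejecting in either tail under the alternative.
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


# Values taken from the text: n = 106, effect size 0.30, alpha = 0.05.
power = correlation_power(r=0.30, n=106)
print(f"approximate power: {power:.2f}")
```

A one-sided test or an exact distribution would give a somewhat different value, which may account for small differences from the reported figure.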
Experimental procedure
The study was divided into two parts: (i) conducting a CPX test to evaluate VO2 peak and (ii) participants completing two 6MWTs to record distance covered in 6 min. These measurements were done during baseline testing for the HEART Camp study. Data were collected between 2013 and 2015. The values from the 6MWT were used to predict the maximal oxygen uptake using the equations in Table 1.
- CPX test: On Day 1 of the study, participants performed a maximal effort CPX test using a 10-stage ramp protocol to determine VO2 peak. The test was conducted by an exercise specialist and supervised by a nurse practitioner/physician assistant/physician. The treadmill speed was started at 1 mile/h with 0% incline. Both the speed (max 3.3 miles/h) and incline (max 15%) were increased every 2 min until voluntary exhaustion or until HR, respiration, and/or physical appearance indicated peak effort.29 For patients who were not limited by symptoms, the respiratory exchange ratio was also assessed to determine maximal effort during the test. Rating of perceived exertion (RPE) was recorded at each stage of the test. Heart rhythm was monitored throughout the test.
- 6MWT: Two 6MWTs were performed by each participant. The first test was performed on the day of the CPX test. Participants were given at least 3 h of rest after the CPX test before performing the 6MWT. Two baseline tests were performed in a 30 m long hallway as per the instructions provided by the American Thoracic Society.10 The second 6MWT was performed 7–10 days after the first 6MWT as per the protocol of the HEART Camp study.31 Participants vocalized their RPE using the Borg (6–20) scale at the end of the test.34, 35 For the purpose of consistency, the best score of the two walks was used for analysis, as it is a better representation of the participant's functional capacity than the mean score of the two testing results of the 6MWT.
- Description of the prediction equations:
Author | Equation | Population/data source |
---|---|---|
Burr et al. (2011)19 | VO2 max = 70.161 + 0.023 * 6MWT (m) − 0.276 * weight (kg) − 6.79 * sex (M = 0; F = 1) − 0.193 * resting HR (b.p.m.) − 0.191 * age (years) | Tested on healthy middle-aged individuals |
Ross et al. (2010)20 | VO2 peak = 4.948 + 0.023 * mean 6MWT distance (m) | Data from the literature involving studies on HF and chronic obstructive pulmonary disease patients |
Adedoyin et al. (2010)21 | VO2 = 0.0105 * distance (m) + 0.0238 * age (years) − 0.03085 * weight (kg) + 5.598 | NYHA Class II–III HF patients |
Cahalin et al. (1996)22 | VO2 peak = 0.03 * distance (m) + 3.98 | Advanced symptomatic HF patients |
- CPX, cardiopulmonary exercise; F, female; HR, heart rate; M, male; NYHA, New York Heart Association; VO2, oxygen uptake.
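The four equations in Table 1 can be written as small functions. This is an illustrative sketch rather than the authors' analysis code: the function names and the example patient are hypothetical, and the weight term in the Adedoyin et al. equation is taken as negative, an assumption that reproduces the reported predicted mean of roughly 8.3 mL/kg/min at the sample means.

```python
def vo2_burr(distance_m, weight_kg, is_female, resting_hr_bpm, age_years):
    """Burr et al. (2011): VO2 max in mL/kg/min; sex coded M = 0, F = 1."""
    sex = 1 if is_female else 0
    return (70.161 + 0.023 * distance_m - 0.276 * weight_kg
            - 6.79 * sex - 0.193 * resting_hr_bpm - 0.191 * age_years)


def vo2_ross(distance_m):
    """Ross et al. (2010): VO2 peak in mL/kg/min from 6MWT distance."""
    return 4.948 + 0.023 * distance_m


def vo2_adedoyin(distance_m, age_years, weight_kg):
    """Adedoyin et al. (2010); the minus sign on weight is an assumption."""
    return (0.0105 * distance_m + 0.0238 * age_years
            - 0.03085 * weight_kg + 5.598)


def vo2_cahalin(distance_m):
    """Cahalin et al. (1996): VO2 peak in mL/kg/min from 6MWT distance."""
    return 0.03 * distance_m + 3.98


# Hypothetical patient, for illustration only (not study data).
patient = {"distance_m": 400.0, "weight_kg": 90.0, "is_female": False,
           "resting_hr_bpm": 70.0, "age_years": 60.0}

predictions = {
    "Burr": vo2_burr(**patient),
    "Ross": vo2_ross(patient["distance_m"]),
    "Adedoyin": vo2_adedoyin(patient["distance_m"], patient["age_years"],
                             patient["weight_kg"]),
    "Cahalin": vo2_cahalin(patient["distance_m"]),
}
for name, value in predictions.items():
    print(f"{name}: {value:.1f} mL/kg/min")
```

For the same walk distance the four equations can differ by well over 10 mL/kg/min, which is why the choice of equation matters for a given population.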
Data analysis
Participants with atrial fibrillation were considered for analysis if their HRs were controlled for the past year with stable dosing of beta-blockers. Pearson's correlations between the VO2 peak achieved at maximal effort on the CPX test and the calculated VO2 peak from the four prediction equations were analysed to test for predictive validity. The mean differences and error estimates were calculated for the VO2 peak from the CPX testing and the predicted VO2 peak from the equations. If no mean difference was found between the two values, the predictive validity of the equation for patients belonging to different NYHA functional classes was tested. Bland–Altman plots were created to compare the standardized VO2 measures. To evaluate the accuracy of the predictive equations against the observed values from CPX testing, the coefficient of determination (R2), which describes the amount of variance explained in the VO2 peak from CPX testing, and the mean absolute percentage error (MAPE) were calculated. A lower MAPE score indicates better predictive accuracy. IBM-SPSS 25 and R 3.6.2 were used to analyse the data. The significance level was set at 0.05.
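The accuracy measures named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SPSS or R code used in the study. MAPE is expressed as a fraction to match the scale of the reported scores. The text does not say which R2 definition was used, so the conventional 1 − SS_res/SS_tot shown here need not reproduce the reported values. The Bland–Altman difference is taken as predicted minus observed, an assumed convention. The data are made up.

```python
from statistics import mean, stdev


def mape(observed, predicted):
    """Mean absolute percentage error, returned as a fraction (0.14 = 14%)."""
    return mean(abs(o - p) / abs(o) for o, p in zip(observed, predicted))


def r_squared(observed, predicted):
    """Conventional R2 = 1 - SS_res / SS_tot (assumed definition)."""
    o_bar = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - o_bar) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def bland_altman(observed, predicted):
    """Return (bias, lower limit, upper limit) of agreement.

    Differences are predicted minus observed; limits are bias +/- 1.96 SD.
    """
    diffs = [p - o for o, p in zip(observed, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up VO2 peak values in mL/kg/min, for illustration only.
observed = [12.0, 15.0, 18.0, 20.0]
predicted = [13.0, 14.0, 17.0, 22.0]

print(f"MAPE: {mape(observed, predicted):.3f}")
print(f"R2:   {r_squared(observed, predicted):.3f}")
bias, lower, upper = bland_altman(observed, predicted)
print(f"Bland-Altman bias {bias:.2f}, limits {lower:.2f} to {upper:.2f}")
```

A small bias with wide limits of agreement is the pattern that signals good prediction of the group mean but poor prediction for individuals.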
Data availability
Details of the associated primary study are available at https://clinicaltrials.gov/ (study identifier NCT01658670).
Results
With 12 participants dropping out of the study before completing the second 6MWT, and one participant walking >600 m during the 6MWT, we were left with a final sample size of 106 participants. No significant difference was observed in the baseline characteristics between the participants who dropped out and the sample included for analysis. The test–retest reliability of the two 6MWTs was 0.94. The mean age of the participants in the study was 62.4 ± 11.4 years. Of the 106 participants, 65 were men and 41 were women. The clinical characteristics included ejection fraction (40.4 ± 10.6%), months since HF diagnosis (61.1 ± 64.9), ischaemic cardiomyopathy (33%), and HF with reduced ejection fraction (87.7%). The sample consisted of four NYHA Functional Class I, 72 NYHA Functional Class II, and 30 NYHA Functional Class III patients. Descriptive statistics including age, height, weight, HR, and 6MWT distance are described in Table 2.
Statistic | Age (years) | Height (cm) | Weight (kg) | Resting HR (b.p.m.) | 6MWT (m) |
---|---|---|---|---|---|
Mean | 62.4 | 172.4 | 102.95 | 72.68 | 419.2 |
SD | 11.4 | 10.1 | 25.68 | 12.67 | 93.0 |
Range | 25–84 | 150–172.4 | 56.4–206.4 | 41–109 | 141–600 |
The results (mean ± standard deviation; range) of VO2 peak from CPX testing and estimated VO2 peak (mL/kg/min) from the four prediction equations are CPX (16.5 ± 3.9; 7.0–26); Burr et al.19 (22.8 ± 8.8; −10 to 41); Ross et al.20 (14.6 ± 2.1; 8.2–18.8); Adedoyin et al.21 (8.03 ± 1.4; 3.6–11.6); and Cahalin et al.22 (16.6 ± 2.8; 8.2–22.1). Mean difference between VO2 peak from CPX testing and predicted VO2 peak was as follows: Burr et al.19 = 6.3, Ross et al.20 = −1.9, Adedoyin et al.21 = −8.47, and Cahalin et al.22 = 0.10. The Burr et al.19 (r = 0.48, P < 0.0001; R2 = 0.90) and Adedoyin et al.21 (r = 0.61; P < 0.0001; R2 = 0.96) equations showed a moderate correlation, while the Ross et al.20 (r = 0.75, P < 0.0001; R2 = 0.98) and Cahalin et al.22 (r = 0.75, P < 0.0001; R2 = 0.98) equations strongly correlated with the mean VO2 peak values from CPX testing. MAPE scores for Burr et al.,19 Ross et al.,20 Adedoyin et al.,21 and Cahalin et al.22 were 0.56, 0.15, 0.48, and 0.14, respectively. Figure 1 reflects the ordered CPX VO2 peak values vs. predicted individual VO2 peak values from the four equations. Figures 2 and 3 show the association between CPX VO2 peak values and predicted VO2 peak from the three equations using Bland–Altman and scatter plots.
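As a quick consistency check on the reported means: the Ross et al. and Cahalin et al. equations are linear in distance alone, so the mean of the individual predictions equals the equation evaluated at the mean 6MWT distance (419.2 m). The worked arithmetic below also suggests that the mean differences are expressed as predicted minus CPX; that sign convention is inferred from the numbers, not stated in the text.

$$
\begin{aligned}
\text{Ross et al.:}\quad & 4.948 + 0.023 \times 419.2 = 14.59 \approx 14.6 \\
\text{Cahalin et al.:}\quad & 0.03 \times 419.2 + 3.98 = 16.56 \approx 16.6 \\
\text{Mean differences:}\quad & 22.8 - 16.5 = 6.3, \qquad 14.6 - 16.5 = -1.9, \qquad 16.6 - 16.5 = 0.1
\end{aligned}
$$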



We did not find a significant difference (P = 0.58) between CPX VO2 peak and calculated predicted VO2 peak with the Cahalin et al.22 equation, while significant mean differences (P < 0.001) were observed using the other equations. Therefore, the Cahalin et al.22 equation was further analysed to predict VO2 by NYHA functional class. The predicted values were compared with the CPX values (Table 3). For this analysis, the four NYHA Class I participants were included with the NYHA Class II participants. Mean difference, standard error, and significant correlation (r) categorized by NYHA classes included Class I and II combined (−0.22, 0.27; r = 0.63) and Class III (−0.86; 0.3; r = 0.67). No significant differences were observed within the NYHA classes.
NYHA | Method | Mean | Median | SD | Mean difference (standard error) | Correlation (r) |
---|---|---|---|---|---|---|
Class I and II (n = 76) | CPX | 17.5 | 17.7 | 3.5 | 0.22 (0.27) | 0.63* |
Class I and II (n = 76) | Cahalin equation | 17.3 | 17.4 | 2.5 | | |
Class III (n = 30) | CPX | 13.8 | 14.4 | 3.5 | −0.86 (0.3) | 0.67* |
Class III (n = 30) | Cahalin equation | 14.48 | 14.63 | 2.7 | | |
- * A significant correlation between actual CPX testing VO2 peak and predicted VO2 peak values at P = 0.05.
Discussion
The results from our study show that while the four equations were able to explain ≥90% of the variance present in the peak VO2 values from CPX testing, there was a difference in their predictive ability as measured by MAPE. Mean differences show that the predicted VO2 peak using the four equations either grossly overestimated (Burr et al.19), grossly underestimated (Adedoyin et al.21), slightly underestimated (Ross et al.20), or closely predicted (Cahalin et al.22) the peak VO2 as observed from CPX testing. The Bland–Altman plots (Figure 2) and scatter plots (Figure 3) show that the individual scores predicted by each of the equations are spread out, which resulted in a moderate-to-strong correlation between the CPX and predicted peak VO2 values. The equations with a larger mean difference between predicted and observed values showed a moderate correlation, while the equations with small differences in mean values showed a stronger correlation. When categorized by NYHA functional classes, the predicted values using the Cahalin et al.22 equation showed a significant positive correlation with a small mean difference of ≤1. Ross et al.20 have stated that generalized equations may be useful for accurately estimating mean peak VO2 values from mean 6MWT distance scores, but they may not be accurate in making predictions for individual patients. From the spread seen in the scatter plots in our study, the opinion expressed by Ross et al.20 is justified and can be extended to the other prediction equations as well. Overall, when comparing the four equations using the predicted mean peak VO2 and the MAPE scores, we found that the predictive ability of the Cahalin et al.22 equation (0.1 mL/kg/min; MAPE = 0.14) was superior to that of the Burr et al.19 (6.8 mL/kg/min; MAPE = 0.54), Ross et al.20 (1.9 mL/kg/min; MAPE = 0.16), and Adedoyin et al.21 (−8.47 mL/kg/min; MAPE = 0.48) equations.
It is reported that the use of sub-maximal exercise tests to predict VO2 max may underestimate the actual VO2 max.36-38 The 6MWT is generally considered a sub-maximal exercise test where participants do not reach their maximal exercise capacity and are also allowed to rest, if needed, during testing.10 As such, it can be assumed that predictions of VO2 peak values using the 6MWT distance will slightly underestimate the peak VO2 values from CPX testing for a participant. This phenomenon was observed with the equation provided by Ross et al.20 While the Burr et al.19 equation grossly overestimated the mean VO2 values from CPX testing, the results of the Adedoyin et al.21 equation, which grossly underestimated the observed mean peak VO2, were surprising. Because the equation was formulated by testing a sample of stable NYHA Class II and III HF patients, we expected the Adedoyin et al.21 equation to be the most accurate in predicting peak VO2 in our study. The gross inaccuracy may be explained by the difference in the samples between the two studies. Whereas our sample consisted of participants who were mostly Caucasian, the sample in the Adedoyin et al.21 study, done in Nigeria, although not described by the authors, most likely consisted predominantly of Black participants. Another important factor may be a difference in the medication management of HF. The use of beta-blockers, angiotensin-converting enzyme inhibitors/angiotensin receptor blockers, and diuretics is part of the standard clinical therapy that our participants were provided as part of their usual care. Such therapy has been shown to improve functional capacity in patients with HF.24, 29 This information, along with the observed VO2 peak from CPX testing, is not reported in the Adedoyin et al.21 study. Our study also excluded participants with peak oxygen consumption >21 mL/kg/min in women and >24 mL/kg/min in men, which may contribute to the differences in peak VO2 values from CPX testing.
The differences found in the mean peak VO2 values using the predicted equations may be explained by the participant demographics used to formulate these equations and the fact that the treatment of HF has undergone significant changes in the past 20 years.
The population that Ross et al.20 used to formulate their prediction equation included a diverse group of clinical patients that included HF patients. The equation proposed by Cahalin et al.22 was based on the testing of advanced, symptomatic HF patients, while our sample was composed of stable HF patients. Burr et al.19 developed their prediction equation based on their results from testing of healthy adults with ages ranging between 28 and 60 years. This may explain the gross overestimation in mean VO2 peak values for our clinical population when using the Burr et al.19 equation. The sample sizes in the four studies were different as well. Burr et al.19 had a sample size of 44; Ross et al.20 mostly used raw data from previously published research and had a sample size of 1083 that included 673 HF patients; Adedoyin et al.21 had a sample of 65 (30 men and 35 women); and Cahalin et al.22 had a sample size of 45. Our sample was similar to that of the Burr et al.,19 Ross et al.,20 and Cahalin et al.22 studies where the majority of participants were Caucasian. The sample size and demographic are crucial components in developing prediction equations to accurately represent the population and to avoid errors.
Considering that HF patients, depending on the extent of their disease, can be severely compromised in their ability to exercise, it may be argued that predicted peak VO2 values should be as accurate as possible or that, for safety purposes, it may be better to underestimate peak VO2 by a small margin than risk grossly overestimating it. While it may be argued that HF patients have safely performed high-intensity exercise and the risk associated with overestimation of VO2 peak may be exaggerated, such programmes have been conducted in supervised settings, mostly with NYHA Class I–III HF patients, and may not be appropriate for non-supervised performance in the community setting.39-42 Also, the HF-ACTION study found moderate-intensity exercise to be safe for patients with HF, and the current HF exercise guidelines from various organizations reflect the same.8, 23, 43, 44 As such, the equation provided by Burr et al.,19 developed from testing conducted on healthy adults, may not be appropriate for predicting peak VO2 for patients with HF to prescribe exercise. It may also be argued that the RPE from the 6MWT may in itself be adequate for developing exercise prescriptions for this population, as monitoring improvement in functional capacity from pre-exercise to post-exercise training by use of a 6MWT is a common outcome measure. Although this may be true, the validity of these prediction equations needs to be established for accurate estimation of VO2 peak, as they may also be used for research purposes. The strength of the study is that it takes into account four different equations developed by testing of diverse populations. Variations seen in our study in the predictive ability of the four equations should not be taken as a testament to their value or utility. Our sample consisted of stable HF patients who were mostly Caucasian, and our findings suggest the use of population-specific VO2 prediction equations.
We agree with Ross et al.20 that caution is warranted when using these predictions to estimate individual peak VO2. We found the Cahalin et al.22 equation to be the most accurate in predicting mean peak VO2 scores in patients with HF. The simplicity of the equation also justifies its use for practical purposes.
Conclusions
The use of a population-specific prediction equation to predict mean VO2 peak from mean 6MWT distance in patients with HF may be a viable alternative to a CPX VO2 peak exercise test when constraints of cost and available personnel cannot be overcome. However, with the possibility of grossly underestimating or overestimating exercise capacity, caution is advised when using these equations to predict peak VO2 at an individual level. Based on our findings, for research purposes where the mean peak VO2 needs to be estimated, the Cahalin et al.22 equation may be used when the study sample is similar to that of Cahalin et al.22 and our study. Future studies should investigate the development of prediction equations that are more accurate in predicting peak VO2 at the individual level.
Conflict of Interest
None declared.
Funding
The primary HEART Camp study was supported by NHLBI of the National Institutes of Health (award number R01HL112979). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.