Standardization of Variables and Collinearity Diagnostic in Ridge Regression
José García
Department of Economics and Business, Almería University, Almería, 04120 Spain
Román Salmerón
Department of Quantitative Methods for Economics and Business, Granada University, Granada, 18071 Spain
Catalina García
Department of Quantitative Methods for Economics and Business, Granada University, Granada, 18071 Spain
María del Mar López Martín
Department of Quantitative Methods for Economics and Business, Granada University, Granada, 18071 Spain
Summary
Ridge estimation (RE) is an alternative method to ordinary least squares (OLS) when a collinearity problem exists in a linear regression model. The variance inflation factor (VIF) is applied to test whether the problem exists in the original model and is also needed after applying the ridge estimator, to check whether the chosen value of the parameter k has mitigated the collinearity problem. This paper shows that working with the original (unstandardized) data in the ridge estimate leads to non-monotone VIF values. García et al. (2015) showed some problems with the traditional VIF used in RE. We propose an augmented VIF, VIFR(j,k), associated with RE, which is obtained by standardizing the data before augmenting the model. The VIFR(j,k) coincides with the VIF associated with the OLS estimator when k = 0. The augmented VIF has the very desirable properties of being continuous, monotone in the ridge parameter k and higher than one.
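The Summary describes obtaining VIFR(j,k) by standardizing the data and then augmenting the model, so that the diagnostic reduces to the ordinary VIF at k = 0. The sketch below is a minimal illustration of that idea, not the authors' code: the function name ridge_vif, the zero-mean/unit-variance standardization, and the auxiliary-regression definition of the VIF on the augmented matrix [X_s; sqrt(k) I_p] are assumptions made here for illustration; the exact definition of VIFR(j,k) is the one given in the paper.

```python
import numpy as np

def ridge_vif(X, k):
    """Illustrative VIF-type value for each regressor, computed from the
    standardized design matrix augmented with sqrt(k) * I_p (assumed form)."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    X_s = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize each column
    X_a = np.vstack([X_s, np.sqrt(k) * np.eye(p)])      # augment the model
    vifs = []
    for j in range(p):
        y_j = X_a[:, j]
        Z_j = np.delete(X_a, j, axis=1)
        # auxiliary regression of the j-th augmented column on the remaining ones
        coef, *_ = np.linalg.lstsq(Z_j, y_j, rcond=None)
        resid = y_j - Z_j @ coef
        r2 = 1.0 - (resid @ resid) / (y_j @ y_j)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Illustrative use on simulated collinear data: at k = 0 the values match the
# usual OLS VIFs; as k grows they shrink towards 1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
X = np.column_stack([x1, x1 + 0.1 * rng.normal(size=50), rng.normal(size=50)])
print(ridge_vif(X, k=0.0))
print(ridge_vif(X, k=0.1))
```

Because the appended rows are all zero when k = 0, this construction reproduces the usual OLS VIF at k = 0, which is the behaviour the Summary requires of VIFR(j,k).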
References
- Anderson T. W. (1985). An Introduction to Multivariate Statistical Analysis. London: John Wiley & Sons.
- Belsley D. A. (1982). Assessing the presence of harmful collinearity and other forms of weak data through a test for signal-to-noise. J. Econometrics, 20, 211–253.
- Belsley D. A., Kuh E. & Welsch R. E. (1980). Regression Diagnostics. New York: John Wiley & Sons.
- Curto J. D. & Pinto J. C. (2011). The corrected VIF (CVIF). J. Appl. Stat., 38(7), 1499–1507.
- Farrar D. E. & Glauber R. R. (1967). Multicollinearity in regression analysis: The problem revisited. Rev. Econ. Stat., 49(1), 92–107.
- Fox J. & Monette G. (1992). Generalized collinearity diagnostics. J. Amer. Statist. Assoc., 87, 178–183.
- García C., García J., López M. D. M. & Salmerón R. (2015). Collinearity: Revisiting the variance inflation factor in ridge regression. J. Appl. Stat., 42(3), 648–661.
- Greene W. H. (1993). Econometric Analysis, 2nd ed. New York: Macmillan.
- Gunst R. F. (1984). Toward a balanced assessment of collinearity diagnostics. Amer. Statist., 38(2), 79–82.
- Gunst R. F. & Mason R. L. (1977). Advantages of examining multicollinearities in regression analysis. Biometrics, 33(1), 249–260.
- Hadi A. S. (2011). Ridge and Surrogate Ridge Regressions. International Encyclopedia of Statistical Science. New York, NY: Springer.
- Himmelblau D. M. (1970). Process Analysis by Statistical Methods. New York: John Wiley & Sons.
- Hoerl A. E. & Kennard R. W. (1970a). Ridge regression: Applications to nonorthogonal problems. Technometrics, 12, 69–82.
- Hoerl A. E. & Kennard R. W. (1970b). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12, 55–67.
- Irfan M., Javed M. & Ali Raza M. (2013). Comparison of shrinkage regression methods for remedy of multicollinearity problem. Middle East J. Sci. Res., 14(4), 570–579.
- Jamal N. & Rind M. Q. (2007). Ridge regression: A tool to forecast wheat area and production. Pak. J. Stat. Oper. Res., 3(2), 125–134.
- Jensen D. R. & Ramirez D. E. (2008). Anomalies in the foundations of ridge regression. Int. Stat. Rev., 76(1), 89–105.
- Jensen D. R. & Ramirez D. E. (2010a). Anomalies in ridge regression: Addendum. Int. Stat. Rev., 78, 215–217.
- Jensen D. R. & Ramirez D. E. (2010b). Surrogate models in ill-conditioned systems. J. Statist. Plann. Inference, 140, 2069–2077.
- Jensen D. R. & Ramirez D. E. (2013). Revision: Variance inflation in regression. Adv. Decis. Sci., 2013, 1–15.
- Kapat P. & Goel P. K. (2010). Anomalies in the foundations of ridge regression: Some clarifications. Int. Stat. Rev., 78(2), 209–215.
- King G. (1986). How not to lie with statistics: Avoiding common mistakes in quantitative political science. Am. J. Polit. Sci., 30(3), 666–687.
- Kunugi T., Tamura T. & Naito T. (1961). New acetylene process uses hydrogen dilution. Chem. Eng. Prog., 57, 43–49.
- Marquardt D. W. (1963). An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Ind. Appl. Math., 11(2), 431–441.
- Marquardt D. W. (1970). Generalized inverses, ridge regression, biased linear estimation and nonlinear estimation. Technometrics, 12(3), 591–612.
- Marquardt D. W. (1980). A critique of some ridge regression methods: Comment. J. Amer. Statist. Assoc., 75(369), 87–91.
- Marquardt D. W. & Snee R. D. (1975). Ridge regression in practice. Amer. Statist., 29(1), 3–20.
- McDonald G. C. (2010). Tracing ridge regression coefficients. Wiley Interdiscip. Rev. Comput. Stat., 2, 695–703.
- Montgomery D. C. & Peck E. A. (1982). Introduction to Linear Regression Analysis. New York: John Wiley & Sons.
- Myers R. H. (1990). Classical and Modern Regression with Applications, 2nd ed. Boston: PWS-Kent.
- O'Brien R. M. (2007). A caution regarding rules of thumb for variance inflation factors. Qual. Quant., 41, 673–690.
- R Development Core Team. (2015). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available at: http://www.R-project.org/. Accessed 26 February 2015.
- Salmerón R., García C., López M. D. M. & García J. (2013). A note about the variance inflation factor and the ridge regression. In 2nd International Conference of Informatics and Management Sciences, pp. 197–199. Slovak Republic.
- Sardy S. (2008). On the practice of rescaling covariates. Int. Stat. Rev., 76(2), 285–297.
- Silvey S. D. (1969). Multicollinearity and imprecise estimation. J. R. Stat. Soc. Series B Stat. Methodol., 31(3), 539–552.
- Smith G. & Campbell F. (1980). A critique of some ridge regression methods. J. Amer. Statist. Assoc., 75(369), 74–81.
- Stewart G. W. (1987). Collinearity and least squares regression. Statist. Sci., 2(1), 68–84.
- Theil H. (1971). Principles of Econometrics. New York: Wiley.
- Velleman P. F. & Welsch R. E. (1981). Efficient computing of regression diagnostics. Amer. Statist., 35(4), 234–242.
- Vinod H. D. & Ullah A. (1981). Recent Advances in Regression Methods. New York: M. Dekker.
- Willan A. R. & Watts D. G. (1978). Meaningful multicollinearity measures. Technometrics, 20(4), 407–412.
- Zhang J. & Ibrahim M. (2005). A simulation study on SPSS ridge regression and ordinary least squares regression procedures for multicollinearity data. J. Appl. Stat., 32(6), 571–588.