A Comparison of Two Robust Estimation Methods for Business Surveys
Corresponding Author
Robert Graham Clark
National Institute for Applied Statistics Research, University of Wollongong, Wollongong, 2522 NSW, Australia
Philip Kokic
National Institute for Applied Statistics Research, University of Wollongong, Wollongong, 2522 NSW, Australia
Paul A. Smith
Southampton Statistical Sciences Research Institute (S3RI), University of Southampton, Southampton, SO17 1BJ UK
Summary
Two alternative robust estimation methods often employed by National Statistical Institutes in business surveys are two-sided M-estimation and one-sided Winsorisation, which can be regarded as an approximate implementation of one-sided M-estimation. We review these methods and evaluate their performance in a simulation of a repeated rotating business survey based on data from the Retail Sales Inquiry conducted by the UK Office for National Statistics. One-sided and two-sided M-estimation are found to have very similar performance, with a slight edge for the former for positive variables. Both methods considerably improve both level and movement estimators. Approaches for setting tuning parameters are evaluated for both methods, and this is a more important issue than the difference between the two approaches. M-estimation works best when tuning parameters are estimated using historical data but is serviceable even when only live data is available. Confidence interval coverage is much improved by the use of a bootstrap percentile confidence interval.
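As a rough illustration of two ideas named in the summary, the sketch below applies the simplest form of one-sided Winsorisation (values above a cutoff are replaced by the cutoff) and then computes a bootstrap percentile confidence interval for the resulting estimate of a total. It is not the authors' implementation. The function names, the sample values, the single design weight, the fixed cutoff and the simple with-replacement resampling are all assumptions made for illustration; in particular, the resampling ignores stratification, rotation and finite population corrections, and the cutoff is not tuned from data.

```python
import random


def winsorise_one_sided(values, cutoff):
    """Replace every value above `cutoff` by the cutoff itself."""
    return [min(v, cutoff) for v in values]


def winsorised_total(values, weight, cutoff):
    """Expansion estimate of a total after one-sided Winsorisation.

    Assumes every sampled unit carries the same design weight.
    """
    return weight * sum(winsorise_one_sided(values, cutoff))


def bootstrap_percentile_ci(values, estimator, n_boot=2000, level=0.95, seed=0):
    """Percentile interval from simple with-replacement resampling.

    Assumption: units are resampled as if independent and identically
    distributed, which ignores the survey design.
    """
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        estimator(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical sample with one very large value; not data from the paper.
sample = [12.0, 15.0, 9.0, 11.0, 14.0, 10.0, 13.0, 250.0]
WEIGHT = 5.0   # assumed common design weight
CUTOFF = 40.0  # assumed fixed cutoff, chosen by hand for illustration


def estimator(ys):
    return winsorised_total(ys, WEIGHT, CUTOFF)


point = estimator(sample)
interval = bootstrap_percentile_ci(sample, estimator)
```

The cutoff is held fixed inside each bootstrap replicate here to keep the sketch short; re-estimating the tuning parameter within each replicate would be a different design choice.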
Supporting Information
Filename | Description |
---|---|
insr12177-sup-0001-supplementary.txt (plain text document, 40.6 KB) | Supporting Information |