Objective Housing Sales and Rent Prices in Representative Household Surveys: Implications for Wealth, Inequality, Housing Market, and Affordability Statistics
Abstract
Many economic analyses require hypothetical but realistic sales and rent prices for properties representative of the housing stock and reflecting current market conditions. To achieve this, we replace subjectively reported prices in a representative household survey in Luxembourg with objectified hedonic imputations informed by observable market data. Thus, we propose a powerful tool for assessing the health and affordability of housing markets, compiling housing-related statistics and simulating hypothetical scenarios. This approach also enables us to test for the reliability of survey responses. When switching to objectified values, we detect shifts in the wealth distribution, large regional variation in market indicators, and striking affordability concerns: only 18 percent of Luxembourg's renters could theoretically afford to purchase their inhabited dwellings given current market conditions. Further, participants' tendency to mis-estimate market values strongly correlates with tenure length and type, dwelling type, income, and wealth.
1 Introduction
At any given point in time, the vast majority of dwellings forming a country's housing stock are neither on sale nor available for rent. For all these dwellings not having recently undergone a market matching process, their current market value is thus not revealed.1 Moreover, any housing unit that is part of the residential housing stock cannot simultaneously be active on the rent and sales market.2 Thus, even if properties were re-occupied regularly, the existence of different forms of tenure implies that either the rent or the sales price necessarily remains unknown at any point in time. This simple and natural fact, however, limits the compilation of housing statistics that are representative of the entire housing stock and, at the same time, reflect current market conditions, although such statistics are urgently needed to compare like with like and to test economic theories.
As a practical consequence, many housing-related statistics thus either focus on the (usually non-representative) subset of dwellings on the market or rely on hypothetical market prices, often estimated by homeowners. Regarding rents, surveys usually collect the current amount of rent paid, yet do not ask for an estimate of current market rents. This may be a consequence of the common interpretation of rent payments as consumption spending rather than as the recurrent payoffs of an asset. Accordingly, owner-occupied housing as a component of the CPI is, if included, usually treated as an investment and thus evaluated at market prices, whereas rents are not evaluated at market rates; instead, realised rent payments are simply tracked via regular surveys (see Hill et al., 2023). While this challenge of missing current market prices also exists for other asset classes, the enormous amount of private wealth tied up in real estate (according to Syz, 2008, roughly one third of global total wealth in 2008) and the sizeable share of monthly income spent on rents3 require particular attention to housing.
For Norway, statistical imputation models fed by market data are applied to linked tax and administrative data for valuing current housing wealth on a household level (Fagereng et al., 2020). This yields estimates for the market value of the Norwegian residential housing stock over two decades. Yet, the richness of linked administrative data available in Norway is rare. Other countries thus need to rely on alternative data sources. One such example is given by Molloy & Nielsen (2018) and Gallin et al. (2021), who match census housing counts in the USA to an automated valuation model to impute the current value of the owner-occupied housing stock. However, this match does not deliver information on socioeconomic characteristics of the current inhabitants.
In this article, we thus propose an alternative strategy to overcome the mentioned shortcomings and provide a framework that is applicable even in the absence of linked administrative data, which is currently not yet available in many countries. This article should thus also be understood as step-by-step guidelines for the creation of such data.
We compile such data by individually matching dwellings described as main residences in a population-representative household survey (the Luxembourgish part of the 2018 Household Finance and Consumption Survey, LU-HFCS)4 to realized market sales and rent prices resulting from hedonic valuation models estimated on these market data. Specifically, we have added linking questions to the survey that yield attributes that are also available for market data collected by the Luxembourg Housing Observatory.5 We then estimate hedonic models on these market data using all linkable characteristics (i.e., information collected from survey participants, the interviewer, and further matched geographical information) and, following an ample set of internal and external validation steps, use these models to predict current market sales and rent prices for all properties appearing in the survey. This yields hypothetical yet objective and timely close-to-market sales and rent prices for every survey observation.6
A sophisticated survey weighting scheme (see Subsection 2.2 for details) allows us to gross up results to population totals and thus to compile statistics describing the owner-occupied and rented residential housing stock as well as further housing-related macroeconomic statistics.
We first perform a micro-level assessment of the reliability of current market sales and rent prices reported by survey participants: We characterize the type of deviations and assess how they are linked with characteristics of the surveyed household.
We find that, on average, imputed prices are slightly higher than those reported by homeowners. However, renters strongly underreport the sales price potential of their currently rented unit. These findings may appear insightful and call for caution should statistical institutes choose to use survey data (including estimated current market sales and rent prices) more widely in the future as substitutes for observed market data when compiling, for example, the respective components in the Consumer Price Index or National Accounts (see Hill et al., 2023, for further references).
These deviations observable for single units translate into substantial aggregate increases and, even more importantly, changes in the distribution: median net worth of owner-occupiers increases by almost EUR 50,000; however, in the lowest-income quintile the increase amounts to EUR 170,302 and in the second-highest income quintile to EUR 20,448, while the top 20 percent even see a small decrease. Similarly, large changes are also observed along the wealth distribution. Because the LU-HFCS is part of the European Household Finance and Consumption Survey, we are able to perform some indicative simulations for other participating countries. Not surprisingly, countries with large shares of owner-occupiers and expensive housing stocks experience substantial “corrections” in net wealth measures.
Our imputation strategy also allows us to assess the secondary housing market in terms of affordability by running a counterfactual study. The results suggest that current market conditions would make it almost impossible for a large fraction of renters in Luxembourg to purchase the unit they currently rent using their own assets and income. For 50 percent, such a purchase would require financial resources equal to 12 total annual net household incomes (excluding any transaction and financing costs). Only 15 percent of all renting households could theoretically afford the purchase and would also economically benefit from doing so. Roughly 17.5 percent would have to pay less interest on a hypothetical mortgage than their current rent. These households thus could greatly benefit from homeownership but do not have sufficient funds to bear up-front costs. Taking into account future inheritance or gifts does not substantially change this conclusion, as only one-fifth of those households not having enough own resources expect to receive such intergenerational support.
Similar to the simulations we perform here, the established toolkit linking survey observations to market data enables policymakers and researchers to micro-simulate effects of (planned) policies (e.g., targeted to support either homeownership or renting), stress-test households' portfolios in hypothetical scenarios, or simply bring housing statistics forward or backward in time. For the latter, the housing stock described in the survey would be kept constant (as the housing stock is fixed in the short run anyway), but sales and rent prices could flexibly evolve in line with changing market conditions by adjusting the time window the market data are retrieved from. By doing that, Luxembourg's current set of population-representative housing statistics would be vastly extended.
Survey data have been widely used before for similar purposes. For instance, Garner & Verbrugge (2009) make use of the ample information within US household surveys to compile macroeconomic housing statistics. Other studies use surveys to compile wealth statistics, including housing wealth components (see, e.g., EG-LMM, 2020). However, these studies are limited to self-reported current house prices and do not use any micro-level market data, except for plausibility checks of resulting totals. Thus, with this study we aim to inspire further research along these lines that would likely lead to more comprehensive and reliable statistics.
The micro-match with market data seems important, as estimates of wealth components collected in surveys are known to be prone to several sources of imprecision (see, for instance, Vermeulen, 2018). After reviewing studies focusing on housing market prices reported by survey participants themselves, Agarwal (2007) concludes: “there is general agreement […] that homeowners significantly misestimate their house value.” He reports substantial average absolute mis-estimation (mainly focusing on the USA) ranging between 14 percent and 25 percent (see Kish & Lansing, 1954; Kain & Quigley, 1972; Goodman Jr. & Ittner, 1992; Kiel & Zabel, 1999; Agarwal, 2007; Benítez-Silva et al., 2015; Waltl & Lepinteur, 2023, for detailed results). However, price changes measured by subjective and objective house price indices seem to follow similar dynamics (see Kiel & Zabel, 1997; Mathä et al., 2017). Waltl & Lepinteur (2023) find that housing market trends are well tracked when conducting systematic convergent validity tests on pooled estimates of all participants in national surveys in the USA and Europe. In contrast, the level of estimates appears to be systematically biased. All these findings suggest that it would be important to correct price estimates provided by survey participants when focusing on absolute numbers, and our proposed strategy could be helpful in this regard.
The remainder of this article is structured as follows: Section 2 presents the conceptual framework applied and data sources used, Section 3 describes our imputation strategy, and Section 4 demonstrates impacts on wealth, inequality, housing market, and affordability statistics. Finally, Section 5 concludes. A comprehensive appendix provides additional details and the attached online appendix includes supplemental materials.
2 Conceptual Framework and Data Sources
2.1 Conceptual Framework
Our imputation procedure consists of several steps, each involving one or more data sources. Figure 1 shows the structure of this process. The main data (shown as rectangles in Figure 1) are the LU-HFCS (see Subsection 2.2 for details) and advertisements collected by the Housing Observatory (see Subsection 2.3 for details). Auxiliary data supporting this match (shown as ellipses in Figure 1) are added at various stages. Details about these are provided in subsection A.1 in the appendix. As a matching vehicle between the two main data sources, we estimate imputation models on market data separately for rent and sales prices (shown as clouds in Figure 1). The models regress prices on price-determining characteristics in a hedonic fashion (Rosen, 1974). These models are subsequently used for predicting sales and rent prices for all survey observations. By that, we avoid a one-to-one match between the probabilistic survey sample and the deterministic sample of properties for rent or sales that would otherwise require applying further statistical techniques to avoid biased estimators (see Chen et al., 2020, for details).

Figure 1. Structure of the imputation process.
Notes: Models are shown as clouds, and connections are indicated by arrows. Main datasets appear as rectangles and auxiliary data as ellipses. Validation and harmonization procedures between data sources are shown as double arrows.
For matching, we make use of dwelling information (physical and locational details) available for both survey and market data. Most information in the LU-HFCS is provided by survey respondents, yet the HFCS interviewer conducting the interview at a household's main residence additionally assesses the overall status of the dwelling's structure and surrounding area. In addition to the locational data, this assessment provides further objective information.
The EU-wide harmonized HFCS questionnaire does not ask for an ample set of housing characteristics. To enable a match, we have added several linking questions eliciting hedonic characteristics including location identifiers to the third wave of the LU-HFCS questionnaire (shown as an attachment to the LU-HFCS in Figure 1). The specific links between survey questions and characteristics observable for market data are summarized in a correspondence table in the appendix (Table A.1). To accommodate differences in recording and coding style, we harmonize definitions (shown as a double arrow in Figure 1).
Of course, no imputation model yields perfectly accurate prices, which, however, is less of a concern when targeting aggregate measures as long as no systematic bias is introduced. As Gallin et al. (2018) put it, “The noise component of these estimation errors is a much less fundamental concern for us than any bias component. If each property-level AVM [automated valuation model] estimate is unbiased but noisy, their sum (or average) will also be unbiased. Of course, a better unbiased AVM, in the sense of having a lower mean squared error of prediction across the relevant properties, would still be preferable to a worse (noisier) unbiased AVM, as it would yield a less noisy aggregate.” In this spirit we perform a wide range of statistical tests to make our model fit for purpose. These tests are documented in Subsection 3.1 and may be used as guidelines for future research.
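The distinction between noise and bias in the quoted passage can be illustrated with a small simulation. The following is a minimal sketch on synthetic property values; the function name, the value range, and the error magnitudes are illustrative assumptions and are not taken from our data or models.

```python
import random
import statistics


def simulate_aggregate_error(n_units, noise_sd, bias, n_draws=200, seed=0):
    """Mean relative error of the aggregate value across simulation draws.

    Each (synthetic) true value is multiplied by (1 + bias + noise), where
    noise is zero-mean Gaussian with standard deviation noise_sd.
    """
    rng = random.Random(seed)
    true_values = [rng.uniform(300_000, 1_200_000) for _ in range(n_units)]
    true_total = sum(true_values)
    rel_errors = []
    for _ in range(n_draws):
        est_total = sum(
            v * (1 + bias + rng.gauss(0, noise_sd)) for v in true_values
        )
        rel_errors.append(est_total / true_total - 1)
    return statistics.fmean(rel_errors)


# A noisy but unbiased model versus a precise but biased one (toy settings).
unbiased_noisy = simulate_aggregate_error(n_units=1_000, noise_sd=0.20, bias=0.0)
biased_precise = simulate_aggregate_error(n_units=1_000, noise_sd=0.02, bias=0.05)
print(f"unbiased but noisy: {unbiased_noisy:+.4f}")
print(f"biased but precise: {biased_precise:+.4f}")
```

In this toy setting the property-level noise largely averages out in the total, whereas the bias carries over to the aggregate one-to-one.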
We next present the main data used and perform some validation steps before Section 3 carries out the actual matching between them.
2.2 The LU-HFCS
The LU-HFCS is part of the Pan-European HFCS initiative conducting ex-ante harmonized wealth surveys across Europe. In the third wave of the LU-HFCS conducted in 2018, 1,616 households participated (see Chen, Mathä, et al., 2020). Responses are population-weighted.7 Table 1 reports summary statistics.
| | Surface [m²] | Plot Size (a) | Tenure Length [years] | Construction Year | No. of Bedrooms | Monthly Rent (b) [EUR] | Current Value (b) [EUR] |
|---|---|---|---|---|---|---|---|
| All tenure statuses | | | | | | | |
| Median | 125.00 | 500.00 | 10.00 | 1985 | 3.00 | 1,500 | 600,000 |
| Mean | 139.48 | 1,028.99 | 15.32 | 1974 | 3.39 | 1,696 | 642,660 |
| Std. dev. | 80.93 | 7,663.39 | 14.93 | 40.37 | 1.79 | 885 | 457,772 |
| Owner-occupiers | | | | | | | |
| Median | 146.80 | 500.00 | 15.00 | 1984 | 4.00 | 1,800 | 652,000 |
| Mean | 163.54 | 1,104.31 | 18.91 | 1974 | 3.89 | 1,996 | 747,151 |
| Std. dev. | 81.54 | 8,218.12 | 15.81 | 40.19 | 1.73 | 849 | 432,997 |
| Renters | | | | | | | |
| Median | 80.00 | 345.20 | 4.00 | 1985 | 2.00 | 950 | 350,000 |
| Mean | 85.93 | 531.42 | 7.33 | 1974 | 2.28 | 1,031 | 410,048 |
| Std. dev. | 46.60 | 551.83 | 8.34 | 40.80 | 1.37 | 531 | 424,992 |
- Notes: The table reports summary statistics related to the household main residence (HMR). Measures are derived respecting survey weights.
- (a) The plot size is available only for houses and not for apartments.
- (b) Prices and rents are reported by survey participants.
- Source: LU-HFCS, third wave.
The HFCS routinely asks owner-occupiers to estimate their home's current market price. Renters, in contrast, report the currently paid monthly rent. While the former reflects subjective beliefs, the latter could be affected by reporting errors but not subjectivity. It is important to note that survey participants are, in general, not experts when it comes to real estate markets. Although people interested in buying or selling real estate are likely to gather extra information by observing the market and seeking expert advice from real estate professionals, such costly endeavors are very unlikely to be undertaken when preparing for a survey interview. Thus, responses arguably reflect subjective beliefs and potentially also wishes.
Only the third wave of the LU-HFCS contains questions about hypothetical prices, as well as hedonic (physical and locational) characteristics of the main residence. Regarding the former, owner-occupiers are asked for a hypothetical monthly rent they believe they could realistically charge to a new tenant, and all respondents regardless of their own tenure status are asked to estimate a hypothetical current market sales price (see subsection C.2). Regarding the latter, several physical characteristics have been retrieved during the survey interview or added by the interviewer directly. As interviews are usually conducted at the interviewees' homes and the interviewer thus observes the described dwelling, reporting errors for standard hedonic characteristics are expected to be rather minor.
The following physical characteristics are retrieved as part of the survey interview: surface, type of residence, plot size, year of construction, number of bedrooms, and the energy class. The survey interviewer provides an additional assessment of the structure and neighborhood. Further, we record the postcode. Luxembourg has a strikingly detailed postcode system: while small municipalities share one code, in Luxembourg City (almost) every street has a separate code. Longer streets are even split into several codes. Overall, there are 4,022 regular postcodes in a country of 2,586 km² and roughly 600,000 residents.8
This additional information allows us to impute realistic market values retrieved from a hedonic valuation model for all dwellings described in the survey. We do so for both owner-occupiers (69 percent) and renters or households using their residence free of charge (31 percent).
Some personal characteristics of the interviewee that we use for compiling, for instance, affordability measures refer, if not stated differently, to the financially most knowledgeable person in the household, who acts as the main interview partner (reference person). The LU-HFCS dataset is multiply imputed for all variables.9 Reported point estimates are the average of the weighted point estimates across five implicates. The variance of our estimators stems from a bootstrap using 1,000 replicate weights adjusted for the between- and within-imputation variance of the five multiply imputed implicates.
The estimation uncertainty of our hedonic valuation models to impute objective current market prices is indirectly taken into account via bandwidths used for classifying accurate versus over- or underreported values. Any ambiguities regarding the exact location of dwellings are reflected via induced heterogeneity across implicates (see Appendix A.3 for a detailed description).
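The combination of implicates and replicate weights described above can be sketched as follows. This is a minimal illustration that assumes the standard combination rule for multiply imputed data (Rubin's rules) and a plain sample variance across bootstrap replicates; the function name and the toy numbers are hypothetical, and the exact adjustment used for the LU-HFCS may differ in detail.

```python
import statistics


def combine_implicates(point_estimates, replicate_estimates):
    """Combine M implicates into one estimate and its total variance.

    point_estimates: one weighted estimate per implicate.
    replicate_estimates: for each implicate, the same estimate recomputed
        with every bootstrap replicate weight.
    """
    m = len(point_estimates)
    # Reported point estimate: average across implicates.
    estimate = statistics.fmean(point_estimates)
    # Within-imputation variance: average bootstrap variance per implicate.
    within = statistics.fmean(
        [statistics.variance(reps) for reps in replicate_estimates]
    )
    # Between-imputation variance: spread of the implicate point estimates.
    between = statistics.variance(point_estimates)
    # Assumption: Rubin's rule for the total variance.
    total_variance = within + (1 + 1 / m) * between
    return estimate, total_variance


# Toy numbers: five implicates, three replicates each (1,000 in the survey).
points = [100.0, 102.0, 98.0, 101.0, 99.0]
replicates = [[p - 2.0, p, p + 2.0] for p in points]
estimate, total_variance = combine_implicates(points, replicates)
print(estimate, total_variance)
```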
2.3 Market Data
We use a comprehensive data pool consisting of advertised dwellings for sale and rent in Luxembourg. The data are maintained by the public research institute LISER, which serves as the Housing Observatory. Table 2 reports basic summary statistics. We estimate our imputation models on advertised sales prices net of taxes and fees, and advertised rent prices net of utilities and other charges. As predictors, we use all characteristics available for both survey and market data. We select all adverts posted during the LU-HFCS fieldwork period; for validation purposes, we also consider all notary deeds recorded for dwelling transactions for owner-occupation purposes during the same period (see Subsection 2.4 for a discussion of this choice).
| | Surface [m²] | Plot Size | Construction Year | No. of Bedrooms | Monthly Rent [EUR] | Sales Price [EUR] |
|---|---|---|---|---|---|---|
| All Dwelling Types | | | | | | |
| Median | 97.00 | 0.00 | 1968 | 2.00 | | |
| Mean | 118.50 | 8.05 | 1979 | 2.48 | | |
| Std. dev. | 74.58 | 485.07 | 30.02 | 1.36 | | |
| Dwellings for Sale | | | | | | |
| Median | 113.58 | 0.00 | 1968 | 3.00 | | 618,977 |
| Mean | 133.58 | 11.77 | 1979 | 2.81 | | 702,564 |
| Std. dev. | 77.19 | 597.22 | 32.05 | 1.35 | | 397,664 |
| Dwellings for Rent | | | | | | |
| Median | 79.51 | 0.00 | 1968 | 2.00 | 1,500 | |
| Mean | 89.33 | 0.86 | 1979 | 1.85 | 1,684 | |
| Std. dev. | 59.18 | 15.23 | 25.65 | 1.11 | 858 | |
- Notes: The advertisements refer to the period January 1 to December 31, 2018. Overall there are 7,860 rent and 14,759 sales advertisements. For both categories, the overwhelming majority of properties do not come with any outside plots, which explains the median values of zero for this variable.
- Source: Housing Observatory.
Although adverts provide many more dwelling characteristics than notary deeds, their information content is still limited. Yet, the single most important piece of information is available: location. Thus, we can make extensive use of geographic data as explained later. We have selected the municipal (commune) level (the finest administrative granularity available in Luxembourg) as our main unit of analysis due to the opportunities it offers to proxy local phenomena and policies, but also to connect to other datasets available at this scale. Nonetheless, Luxembourg's two main urban agglomerations, Luxembourg City and Esch-sur-Alzette, are significantly larger population-wise than any other municipality.10 Given this disparity in population distributions, we break down the larger urban areas into smaller but administratively relevant neighborhood units. Precisely, Luxembourg City is divided into its constituent neighborhoods, and Esch-sur-Alzette is separated from its largest (and truly distinct) suburb Belval. For all other locations we preserve the municipal level of analysis.
Location ultimately enters the hedonic models in one of three different ways: dummies, a distance measure, or a non-parametric spline. The first option estimates a separate shadow price for each locational entity. The second option follows Glaesener & Caruso (2015) in calculating the approximate travel distance from each dwelling to Luxembourg City. Finally, we estimate a locational spline11 as a Markov Random Field Smooth (see Wood, 2006) that links each geographical entity that is part of the composite geography to all of its neighbors, thus measuring potential spatial lags.
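To give an intuition for the third option, the following minimal sketch builds the penalty matrix that a Markov random field smooth commonly uses, starting from a list of neighboring areas. The construction (number of neighbors on the diagonal, minus one for each pair of neighbors) is a standard one and is stated here as an assumption about the general technique rather than a description of our implementation; the three-area geography and all names are hypothetical.

```python
def mrf_penalty(neighbors):
    """Build a neighbor-based penalty matrix for area-level coefficients.

    neighbors maps each area to the set of areas it borders. The quadratic
    form beta' S beta then equals the sum of squared differences between
    the coefficients of neighboring areas.
    """
    areas = sorted(neighbors)
    index = {area: i for i, area in enumerate(areas)}
    size = len(areas)
    penalty = [[0.0] * size for _ in range(size)]
    for area, adjacent in neighbors.items():
        i = index[area]
        penalty[i][i] = float(len(adjacent))
        for other in adjacent:
            penalty[i][index[other]] = -1.0
    return areas, penalty


def roughness(beta, penalty):
    """Evaluate the quadratic form beta' S beta."""
    n = len(beta)
    return sum(beta[i] * penalty[i][j] * beta[j] for i in range(n) for j in range(n))


# Hypothetical toy geography: three areas in a row, A borders B, B borders C.
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
areas, S = mrf_penalty(neighbors)
location_effects = [0.0, 1.0, 3.0]
print(roughness(location_effects, S))
```

Large differences between the location effects of adjacent areas increase the penalty, so that estimated effects are smoothed across neighbors rather than estimated independently as with dummies.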
2.4 Validating Market Data
Our imputations do not strictly follow the concept of market prices, as the imputation models are informed by advertised sales and rent prices. In principle, the results reflect supply-side sentiments keeping any demand-side effects constant. While this appears to be of minor concern for rents, as there is usually only little or even no bargaining (see Hill & Syed, 2016), it is per se not clear whether such data are valid to measure movements in the sales market. Notary deeds would report the final prices achieved in the course of the bargaining processes and thus may appear as the preferred source.
However, there are serious drawbacks to using these data for our imputations for Luxembourg. Below, we provide five arguments why advertisements indeed appear to be the most suitable data source in our case. The first argument refers to overall market conditions. In general, the tighter a market, the more power the supply side has. Thus, in hot markets potential buyers are less likely to achieve (substantial) price reductions during price negotiations. The pre-pandemic housing market in Luxembourg can be classified as a sellers' market,12 as demand had heavily exceeded supply. Accordingly, final prices below advertised ones were uncommon.
The second argument refers to timing. Notary deeds reflect the market with a delay of several months. This is a consequence of the lengthy gap between price agreement and final recording of the transaction. Recording with notaries only happens at the very end of the purchase process (which often also requires a successful mortgage application). The length of this process is not recorded and likely varies substantially across transactions. In contrast, relying on advertised prices means a timely snapshot of a precisely specified action—namely the first time of advertisement—and thus reflects current supply-side sentiments (see also Anenberg & Laufer, 2017, using advertisements to construct a timely house price index).
Strictly speaking, advertisements are observed before price setting, whereas the contract is signed after price setting. Thus, neither advertisements nor notary deeds perfectly allocate prices to the crucial moment of price setting. Moreover, in times with little room for bargaining and rapidly growing prices (real estate prices of existing and new dwellings grew on average by 9.2 percent between 2017 Q4 and 2018 Q4), accurate timing seems to be even more relevant for guiding an accurate match between data sources to avoid introducing an unrealistic time lag in measuring current market conditions. This choice is also supported by Lyons (2019), who shows that list prices lead sales prices by several months due to the lengthy sales process and delays in recording transactions with notaries. Also, Lyons (2019) finds only a very small difference between advertised and final sales prices in the very hot Irish housing market before the Global Financial Crisis.
As survey participants are tasked to estimate the price they expect to receive if they sell their homes on the day of the survey interview, the time of advertised prices seems to better fit the survey question.
The third issue concerns the type of transactions. Notaries record all changes in ownership of land and real estate; thus deeds also contain transactions of property that had never been advertised because the trading partners are relatives or friends potentially benefiting from favorable conditions. Thus, the recorded prices sometimes do not describe true market prices either, while the survey explicitly asks for such a price.
Fourth, Luxembourgish notary deeds do not always refer to complete bundles of transacted goods; that is, a unit sold may be spread over several lines in the notary deeds due to the merging of previously separated units at some point in the past or additional acquisition of garages, other types of storage facilities, or outside space.
Finally, the amount of physical and locational information available in the notary deeds data is very limited, thus hindering the estimation of a trustworthy imputation model.
Thus, we rely on advertisements as main data source.13
To understand the quantitative impact of switching between notary deeds and advertisements, we pool all adverts appearing during our study period and all deeds recorded at notaries in the same period.
We perform two tests: First, we check whether relying on adverts would mean a change in the structure of market data and, second, we measure maximal price differences between advertised and finally recorded prices by comparing prices per square meter between the two sources.
To check the first, the two prices are related to each other via (unpaired) Pearson correlation coefficients. The result reassuringly documents almost perfect co-movement of prices reported in the two sources.
Concerning the second, we assess prices and dwelling surface (in square meters). Both the price and surface distributions are, as a whole, shifted downwards in notary deeds data as compared to advertisements. Magnitudes are reported in Table 3. While the shifts are significant in statistical terms,14 the differences are minor in economic terms: the median price per square meter recorded by notary deeds is merely EUR 560 (or roughly 10 percent) lower.
| | Adverts | Notary Deeds |
|---|---|---|
| Price per square meter [EUR/m²] | | |
| Q1 | 4,774.81 | 4,227.05 |
| Median | 5,748.91 | 5,188.47 |
| Mean | 6,462.81 | 5,550.55 |
| Q3 | 7,506.32 | 6,475.64 |
| IQR | 2,731.51 | 2,248.59 |
| Std. dev. | 2,364.86 | 2,024.04 |
| Dwelling surface [m²] | | |
| Q1 | 71.00 | 70.00 |
| Median | 87.16 | 84.66 |
| Mean | 91.65 | 88.46 |
| Q3 | 106.00 | 101.35 |
| IQR | 35.00 | 31.35 |
| Std. dev. | 31.63 | 29.67 |
| Number of observations | 8,381 | 4,737 |
- Notes: The table reports summary statistics for advertisements and notary deeds for characteristics reported in both sources. The data refer to flats advertised or sold (registered as for own use) in Luxembourg in 2018. Q1 and Q3 abbreviate the first and third quartiles; IQR is the interquartile range.
- Source: Luxembourg's Housing Observatory.
Overall, the price differential appears to be a minor concern here as compared to the other drawbacks of notary deeds discussed earlier.
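The size of the price gap can be verified directly from the medians in Table 3. The short calculation below only uses the two published medians; the variable names are illustrative.

```python
# Medians of the price per square meter, copied from Table 3.
median_adverts = 5748.91
median_deeds = 5188.47

gap_eur = median_adverts - median_deeds
gap_share_of_adverts = gap_eur / median_adverts

print(f"gap: EUR {gap_eur:.2f} per square meter")
print(f"relative to adverts: {gap_share_of_adverts:.1%}")
```

The gap amounts to roughly EUR 560 per square meter, or slightly less than one tenth of the advertised median.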
2.5 Comparing Market and Survey Data
Table 1 describes dwelling characteristics as reported in the HFCS and Table 2 reports summary statistics for advertised units. HFCS statistics are population-representative, and thus a comparison reveals differences between the current residential housing stock and the mix of dwellings currently on the market. One key insight from this comparison is that the mean and median age of main residences of renters and owner-occupiers are almost identical.
Looking at differences by tenure status, we find overall similar deviations: dwellings for sale or already owner-occupied are on average substantially larger than those rented.
While rented properties in the two sets are comparable in size, properties owned/sold appearing in the pool of advertised properties are noticeably smaller (fewer bedrooms per property and smaller plot sizes) and are less likely to come with additional private outside space (e.g., a garden) than existing residences of HFCS respondents.
3 Imputing Market Prices
3.1 Imputation Strategy
In the following, we sketch out our model selection procedure. Appendix A provides comprehensive methodological details.
Our imputation model follows a standard hedonic pricing equation, and thus we regress sales or rent prices on physical and locational characteristics. Acknowledging the skewness in the price distributions, we follow common practice here and use a log-transformation.
Precisely, we estimate separate models for sales and rent data and regress logged sales or logged monthly rent prices observed during the survey's fieldwork period for dwellings on a set of corresponding characteristics.
As an alternative, a quantile regression specification would also yield median-unbiased predictions. However, accounting for location effects via a Markov Random Field Smooth as done here is, to the best of our knowledge, not yet developed for quantile regression models. As predictions from a quantile regression specification are very similar to the linear regression equivalent, we thus report them only as a robustness check in Appendix A.4.
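The log-linear hedonic specification can be illustrated with a minimal sketch on synthetic adverts. All numbers, coefficient values, and variable names below are hypothetical, the set of regressors (log surface, a house dummy, a capital-city dummy) is a strong simplification of our specification, and the small least-squares solver merely stands in for standard regression software.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(surface, is_house, in_capital):
    """Regressors: intercept, log surface, dwelling-type and location dummy."""
    return [1.0, math.log(surface), float(is_house), float(in_capital)]


# Synthetic adverts generated from hypothetical coefficients.
rng = random.Random(1)
true_beta = [8.5, 0.9, 0.15, 0.35]
rows, log_prices = [], []
for _ in range(500):
    row = design_row(rng.uniform(40, 250), rng.random() < 0.4, rng.random() < 0.3)
    rows.append(row)
    noise = rng.gauss(0, 0.1)
    log_prices.append(sum(b * x for b, x in zip(true_beta, row)) + noise)

beta_hat = fit_ols(rows, log_prices)


def predict_price(surface, is_house, in_capital):
    """Plug-in prediction: exponentiate the fitted log price."""
    row = design_row(surface, is_house, in_capital)
    return math.exp(sum(b * x for b, x in zip(beta_hat, row)))


# Imputed price for a survey dwelling that is not on the market.
print(round(predict_price(125.0, True, False)))
```

Exponentiating the fitted log price without a further correction targets the median rather than the mean of the price distribution when errors are symmetric on the log scale, which is why predictions from the log-linear and the median regression are comparable.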
We finally assess five specifications summarized in Table 4 by step-wise varying sets of explanatory variables. This yields one preferred main model and four alternatives. We sketch out the selection procedure we have applied in the following and refer to supporting details in the appendix (including full estimation results in Table A.2) whenever appropriate. In particular, we consider three alternative ways to account for location effects (dummies, splines, and distances) and test whether the inclusion of intuitively meaningful interaction effects is supported by the data (dwelling type surface, as well as construction period energy class).
Model component | Main Model | Alt.1 | Alt.2 | Alt.3 | Alt.4 |
---|---|---|---|---|---|
Physical characteristics | ✓ | ✓ | ✓ | ✓ | ✓ |
Interactions | ✓ | ✓ | ✓ | ✘ | ✘ |
Locality dummies | ✓ | ✘ | ✘ | ✓ | ✘ |
Distance to capital | ✘ | ✓ | ✘ | ✘ | ✘ |
Linked neighbor spline | ✘ | ✘ | ✓ | ✘ | ✘ |
- Notes: Physical characteristics comprise all matchable characteristics specified in Table A.1 except location. Interactions between dwelling type and surface as well as between construction period and energy class are included in three of the models. Locality dummies are dummy variables for the composite geography obtained by combining municipalities with the neighborhood level for the two largest urban areas in the country. The distance to the capital is retrieved from Glaesener & Caruso (2015). The linked neighbor spline employs smooth terms linking neighboring entities of the composite geography.

Notes: The figure depicts observed and imputed market data for (a) sales and (b) rent prices using the models (S.Main) and (R.Main).
Source: Authors' calculations based on data from the Housing Observatory, on advertisements available between January 1 and December 31, 2018.
Model | Adjusted R² | AIC | BIC |
---|---|---|---|
Sales models | |||
S.Main | 0.824 | 4,962.48 | 3,783.79 |
S.Alt.1 | 0.703 | 2,282.41 | 2,509.08 |
S.Alt.2 | 0.819 | 4,703.96 | 4,039.10 |
S.Alt.3 | 0.809 | 3,820.68 | 2,725.11 |
S.Alt.4 | 0.507 | 9,414.07 | 9,550.08 |
Rent models | |||
R.Main | 0.744 | 1,219.10 | 164.11 |
R.Alt.1 | 0.679 | 307.96 | 514.81 |
R.Alt.2 | 0.728 | 886.18 | 624.54 |
R.Alt.3 | 0.742 | 1,153.16 | 174.02 |
R.Alt.4 | 0.526 | 3,160.34 | 3,284.46 |
- Notes: The table reports goodness-of-fit measures for all hedonic models considered. AIC means the Akaike information criterion and BIC the Bayesian information criterion.
- Source: Authors' calculations based on data from the Housing Observatory, on advertisements available between January 1 and December 31, 2018.
We assess the models' predictive power following an out-of-sample (OoS) and a within-sample (WS) procedure, using mean absolute errors (MAEs) and mean relative errors (MREs). For the first, we re-estimate the hedonic models leaving out 20 percent (i.e., 4,284) of all observations based on geographically stratified random sampling and use the resulting reduced models to predict sales prices and rents via plug-in estimators. We compare the prediction errors for these 4,284 units under the full and the restricted model, respectively, and assess the ratio (for MAEs) and the difference (for MREs) between the two. A ratio of MAEs close to 1 and a difference of MREs close to 0 suggest that the respective model also works well for bundles of characteristics not necessarily found in the original estimation sample. This is important as we use the models to predict prices for dwellings that are by definition off the market and thus not used to train the model.
Table 6 reports the results: OoS and WS MAEs are comparable in size, reflected in ratios close to 1. By construction, OoS errors are expected to be larger than WS errors. This is indeed the case: the ratio of OoS to WS errors is consistently larger than one, but reassuringly very close to it.
Model | OoS MAE (MedAE) | OoS MRE (MedRE) | WS MAE (MedAE) | WS MRE (MedRE) | Ratio of absolute errors | Difference of relative errors |
---|---|---|---|---|---|---|
S.Main | 109,688.7 | 0.1483 | 108,038.6 | 0.1465 | 1.0152 | 0.0018 |
(61,288.5) | (0.1063) | (61,256.5) | (0.1055) | (1.0005) | (0.0008) | |
S.Alt.1 | 147,367.9 | 0.2062 | 146,576.8 | 0.2057 | 1.0053 | 0.0004 |
S.Alt.2 | 116,955.8 | 0.1596 | 109,910.5 | 0.1490 | 1.0641 | 0.0105 |
S.Alt.3 | 119,256.4 | 0.1587 | 117,503.1 | 0.1569 | 1.0149 | 0.0018 |
S.Alt.4 | 195,999.4 | 0.2726 | 195,315.8 | 0.2723 | 1.0034 | 0.0002 |
R.Main | 261.65 | 0.1503 | 252.39 | 0.1441 | 1.0368 | 0.0061 |
(161.00) | (0.1102) | (154.50) | (0.1083) | (1.0420) | (0.0019) | |
R.Alt.1 | 305.64 | 0.1757 | 298.09 | 0.1710 | 1.0253 | 0.0047 |
R.Alt.2 | 272.68 | 0.1570 | 265.95 | 0.1528 | 1.0253 | 0.0042 |
R.Alt.3 | 262.38 | 0.1513 | 255.41 | 0.1457 | 1.0272 | 0.0055 |
R.Alt.4 | 380.33 | 0.2232 | 376.34 | 0.2198 | 1.0106 | 0.0033 |
- Notes: Out-of-sample (OoS) and within-sample (WS) mean absolute errors (MAEs) and mean relative errors (MREs) are computed for units for rent and sale following formula (2). The smaller the MRE and MAE, the better the predictive power. We relate the two measures to each other by computing ratios MAE(OoS)/MAE(WS) for absolute measures and differences MRE(OoS) − MRE(WS) for relative measures, reported in the last two columns. For the main models, we also report in parentheses the median absolute errors (MedAE) and median relative errors (MedRE) to give a clearer picture of the shape of the error distributions.
- Source: Authors' calculations based on data from the Housing Observatory, on advertisements available between January 1 and December 31, 2018.
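The comparison in Table 6 can be sketched as follows. Since formula (2) is not reproduced in this section, the snippet assumes that the relative error of a unit is the absolute prediction error divided by the observed price; the function names and the toy numbers are hypothetical.

```python
def prediction_errors(observed, predicted):
    """Return (MAE, MRE): mean absolute and mean relative prediction errors."""
    abs_err = [abs(p - o) for o, p in zip(observed, predicted)]
    mae = sum(abs_err) / len(abs_err)
    # Assumption: relative error = absolute error / observed price
    mre = sum(e / o for e, o in zip(abs_err, observed)) / len(abs_err)
    return mae, mre

def compare_oos_ws(observed, pred_oos, pred_ws):
    """Ratio MAE(OoS)/MAE(WS) and difference MRE(OoS) - MRE(WS)."""
    mae_oos, mre_oos = prediction_errors(observed, pred_oos)
    mae_ws, mre_ws = prediction_errors(observed, pred_ws)
    return mae_oos / mae_ws, mre_oos - mre_ws

# Hypothetical hold-out units: observed prices and two sets of predictions
observed = [400_000, 650_000, 900_000]
pred_ws = [440_000, 600_000, 990_000]   # model estimated on all units
pred_oos = [444_000, 595_000, 999_000]  # model estimated without these units

ratio, difference = compare_oos_ws(observed, pred_oos, pred_ws)
```

In this toy example every out-of-sample error is 10 percent larger than its within-sample counterpart, so the ratio is 1.1; the ratios in Table 6 are far closer to one.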
The mean relative prediction error for dwellings excluded from the estimation sample amounts to roughly 15 percent. Hence, we must expect an error margin of this size when imputing sales and rent prices for survey observations. Therefore, we use this margin as a lower bound to identify over- or under-reported values as detailed in Subsection 3.2.
Finally, we examine residuals in detail to detect potential biases. On average, our models predict slightly lower sales and slightly higher rent prices than observed ones; however, the median residual is almost zero (the main model yields an average residual of 0.00259 for sales prices and of 0.00265 for rent prices). Thus, the models are on average not completely free of bias, but its expected size is very small and hardly varies across model specifications.
Yet, the models may also be affected by bias in other dimensions. As distributional effects will be studied further on, we check for bias along proxies for the income and wealth distributions, namely the observed sales and rent price distributions. As housing assets are usually households' most important single wealth component (see, for instance, Causa et al., 2019), plotting residuals along the sales price distribution comes close to a plot along the wealth distribution. Similarly, rent payments correlate with income, as monthly rent payments are, generally speaking, restricted by monthly disposable income. For the main models, Figure 3 plots residuals along the observed sales and rent price distributions. Absolute unbiasedness would mean that residuals cluster perfectly around the horizontal lines.

Notes: The figure plots observed market data for (a) sales and (b) rent prices against associated residuals retrieved from models (S.Main) and (R.Main).
Source: Authors' calculations based on data from the Housing Observatory, on advertisements available between January 1 and December 31, 2018.
Throughout large parts of the distributions this is indeed the case. Only at the very bottom of the distributions do residuals deviate more. Thus, our predictions are most accurate outside the lowest price segment.
Furthermore, we assess residuals by the most important price-determining characteristic: location. This checks whether imputations systematically deviate in certain areas of the country. Figure 4 shows residuals per canton retrieved from the main models (S.Main) and (R.Main).

Notes: The figure depicts residuals clustered by canton retrieved from the (a) S.Main and (b) R.Main models.
The width of interquartile ranges is quite similar across cantons and consistently overlaps zero. This holds true even when assessing the much more fine-grained split by our composite geography roughly corresponding to municipalities as reported in Figure C.1 in Appendix C.
We perform two additional robustness checks: first, we estimate the main specifications as quantile regression models and, second, we account for potential price changes during the period of observation by including time dummies. Both checks lead to the conclusion that such alternatives introduce neither relevant changes nor improvements. Details are provided in Appendix A.4.
3.2 The Magnitude, Direction, and Source of Deviations
Every owner-occupier interviewed provides a hypothetical sales and a hypothetical rent price. We assess the differences between subjectively reported and imputed current values for owner-occupiers. We also compare the deviations observed for sales prices and rent prices.
We look first at this relationship without any further controls: we find a strong positive Pearson correlation between the two types of deviations, which Figure 5 visualizes. This means that a smaller deviation of reported sales prices goes hand in hand with a smaller deviation of reported rent prices. Survey respondents thus tend to be simultaneously good (or bad) at estimating hypothetical sales and rent prices, revealing a certain measurable “degree of ability” in performing such tasks in general. This also provides a measure of the credibility of the data provided by each survey participant.

Notes: The figure compares the deviation between reported and imputed sales prices to the deviation between reported and imputed rent prices for each owner-occupied dwelling in the survey. The dashed line corresponds to a fitted linear regression model.
Next, we characterize survey participants driving deviations, making use of the large amount of additional information the survey provides. We therefore estimate multinomial logistic regressions describing the type of deviation via several sets of thematically grouped explanatory variables.
As response, we classify observations by the direction of deviation between reported and imputed prices. We label them as over-reported (under-reported) if a reported price is higher (lower) than the imputed counterpart by at least a given percentage threshold. We consider deviations smaller than this threshold as accurate and use this outcome as the reference category. As thresholds, we select 10 and 15 percent.
These thresholds are in line with the width chosen by Molloy & Nielsen (2018) and are large enough to cover prediction uncertainty, that is, the relative OoS errors measured via formula (2) and reported in the second column of Table 6. As we impute market prices and rents for dwellings in the survey, we need to choose a window wide enough to cover imputation uncertainty. For this, the OoS error is relevant, as dwellings appearing in the survey are usually not on the market because they are main residences at the moment of the interview. The total median OoS relative error amounts to roughly 10 percent for both sales and rent prices. The median OoS relative errors separately computed for positive and negative deviations are very similar. Precisely, we find median positive OoS errors for sales (rent) prices of 0.111 (0.114) and negative ones of 0.102 (0.105). We can thus assume a symmetric interval but need to select a threshold of at least 10 percent. To limit the probability that the deviations we interpret as reporting errors in fact solely reflect hedonic estimation uncertainty, we repeat the analysis for the even more conservative choice of 15 percent. For the same reason, we refrain from using exact EUR amounts of deviations as response variables and rely on broad classifications distinguishing only between “accurate” estimates and positive and negative deviations.
Classification | 10% threshold: observed of expected share | Difference | 15% threshold: observed of expected share | Difference |
---|---|---|---|---|
Over-reporting | 26.7 of 40% | (−13.35pp) | 22.4 of 35% | (−12.65pp) |
Accurate | 24.4 of 20% | (+4.37pp) | 35.4 of 30% | (+5.42pp) |
Under-reporting | 49.0 of 40% | (+8.99pp) | 42.2 of 35% | (+7.23pp) |
- Notes: Over-reporting (under-reporting) means that the reported value exceeds (undercuts) the imputed one by at least the respective threshold (10 or 15 percent). We call a reported price accurate whenever it deviates from the imputed one by less than this threshold. The frequency is expressed as a percentage share (%) in comparison to the theoretically expected share when assuming evenly distributed deviations. In parentheses, we report the difference between observed and theoretical shares in percentage points (pp).
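The classification rule can be illustrated with a short sketch. The function names and the example prices are hypothetical; the deviation is measured relative to the imputed value, and a deviation of exactly the threshold counts as mis-reported, following the "at least" wording above.

```python
from collections import Counter

def classify_deviation(reported, imputed, threshold=0.10):
    """Label a reported price relative to its imputed counterpart."""
    deviation = (reported - imputed) / imputed  # relative deviation
    if deviation >= threshold:
        return "over-reported"
    if deviation <= -threshold:
        return "under-reported"
    return "accurate"

def class_shares(pairs, threshold):
    """Percentage share of each class among (reported, imputed) pairs."""
    counts = Counter(classify_deviation(r, i, threshold) for r, i in pairs)
    return {label: 100.0 * n / len(pairs) for label, n in counts.items()}

# Hypothetical (reported, imputed) sales prices in EUR
pairs = [(600_000, 500_000), (520_000, 500_000),
         (400_000, 500_000), (560_000, 500_000)]

shares_10 = class_shares(pairs, 0.10)  # 10 percent threshold
shares_15 = class_shares(pairs, 0.15)  # more conservative 15 percent threshold
```

Moving from the 10 to the 15 percent threshold reclassifies the dwelling reported 12 percent above its imputed value from over-reported to accurate, which mirrors the larger "accurate" share at the wider threshold in Table 7.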
We assess the odds of the deviation exceeding the respective threshold with respect to two groups of predictors. The first set pools characteristics of the entire household inhabiting a certain dwelling, including the survey reference person. The second set comprises physical and locational characteristics of the dwelling itself.
A summary of the key results is presented in Table 8. Full results are printed in Table C.4 in the appendix. We find greater deviations (both for under- and overreporting) for homes acquired by the household long ago, suggesting that it becomes more difficult for owners to track and mentally adjust overall changes in house prices to a specific dwelling over an extended period of time. Some long-term owners with no intention to sell their home may also simply consider this mentally challenging task as not being worth the effort. Furthermore, underreporting is more likely for larger premises. This holds true when measuring living space in square meters or using the type of HMR (houses versus apartments).
Response: mis-reported by | Over-reporting, 10% | Over-reporting, 15% | Under-reporting, 10% | Under-reporting, 15% |
---|---|---|---|---|
Intercept | 1.032 | 0.521 | 0.980 | 0.447** |
(0.642) | (0.323) | (0.310) | (0.147) | |
Housing status | ||||
Owner | (ref. cat.) | |||
Renter | 2.571** | 2.975** | 3.275*** | 4.434*** |
(1.091) | (1.291) | (0.878) | (1.203) | |
Education level | ||||
Low | (ref. cat.) | |||
Medium | 0.931 | 0.884 | 0.796 | 0.714* |
(0.231) | (0.202) | (0.168) | (0.138) | |
High | 1.031 | 1.002 | 1.000 | 0.868 |
(0.288) | (0.246) | (0.259) | (0.187) | |
Net income | ||||
Q1 | (ref. cat.) | |||
Q2 | 0.630 | 0.722 | 0.714 | 0.767 |
(0.240) | (0.343) | (0.229) | (0.265) | |
Q3 | 0.436 | 0.382 | 0.465** | 0.413** |
(0.206) | (0.203) | (0.155) | (0.137) | |
Q4 | 0.424** | 0.455** | 0.321*** | 0.330*** |
(0.150) | (0.167) | (0.096) | (0.098) | |
Q5 | 0.518 | 0.516 | 0.333*** | 0.311*** |
(0.210) | (0.250) | (0.120) | (0.115) | |
Net wealth | ||||
Q1 | (ref. cat.) | |||
Q2 | 1.769 | 1.448 | 1.733* | 1.569 |
(0.845) | (0.627) | (0.570) | (0.481) | |
Q3 | 1.290 | 1.031 | 0.993 | 0.914 |
(0.611) | (0.459) | (0.344) | (0.269) | |
Q4 | 2.040 | 1.644 | 0.576 | 0.510** |
(1.012) | (0.705) | (0.193) | (0.163) | |
Q5 | 4.570*** | 4.133*** | 0.521* | 0.501* |
(2.287) | (1.942) | (0.201) | (0.181) | |
Type of HMR | ||||
House | (ref. cat.) | |||
Apartment | 0.774 | 0.615** | 2.781*** | 2.558*** |
(0.201) | (0.149) | (0.583) | (0.421) | |
Surface of HMR | 0.998 | 0.998* | 1.003** | 1.002*** |
(0.002) | (0.001) | (0.001) | (0.001) | |
Years since acquisition [log] | 1.170 | 1.261* | 1.234** | 1.342*** |
(0.146) | (0.162) | (0.107) | (0.119) | |
Interviewer rating: exterior conditions | ||||
mid-range, modest, or low-income | (ref. cat.) | |||
luxury or upscale | 1.247 | 1.482** | 0.539*** | 0.588*** |
(0.202) | (0.253) | (0.082) | (0.091) | |
Interviewer rating: interior conditions | ||||
good, fair, poor, or not seen | (ref. cat.) | |||
excellent | 0.824 | 0.869 | 0.475*** | 0.481*** |
(0.195) | (0.211) | (0.096) | (0.075) | |
Canton | ||||
Countryside | (ref. cat.) | |||
Luxembourg City | 0.524*** | 0.458*** | 1.970*** | 1.802*** |
(0.124) | (0.109) | (0.388) | (0.322) | |
Esch-sur-Alzette | 0.640** | 0.668** | 0.665** | 0.620*** |
(0.121) | (0.118) | (0.112) | (0.100) |
- Notes: The models regress the type of deviation between reported and imputed prices on dwelling and household characteristics. Models (1)–(5) refer to distinct specifications. We distinguish two response variables describing the deviation between imputed and observed current values of the HMR. Overreporting and underreporting, respectively, mean that the reported value diverges by at least 10 percent or 15 percent from the imputed one. Q1 to Q5 denote first to fifth quintiles. Models are estimated on the full LU-HFCS data (1,616 observations). Significance is indicated using standard notation: * p-value < 0.1; ** p-value < 0.05; *** p-value < 0.01.
- Source: LU-HFCS, third wave and authors' calculations.
Some assessments by survey interviewers add valuable additional information:16 Underreporting tends to be less likely when the interviewer reports a positive impression of interior or exterior conditions (i.e., dwellings rated as luxury or upscale from the exterior). This suggests that these ratings provide useful information complementing the core survey content.
Overreporting is less common in the urban agglomerations Luxembourg City and Esch-sur-Alzette than in smaller municipalities and rural areas. However, underreporting is more common in Luxembourg City than in the rest of the country, but less common in Esch-sur-Alzette.
The tenure status, renting versus owner-occupying, turns out to be a prime predictor: Renters tend to be much more likely to mis-estimate the market value of their home and less likely to report prices close to the ones suggested by the hedonic model. Households at the top of the net wealth distribution are more likely to overreport and less likely to underreport relative to imputed values. Likewise, higher household income is associated with a lower likelihood to mis-estimate.
In terms of education, the point estimates mostly suggest a lower likelihood to mis-report for persons with medium or high formal educational attainment. However, the estimated coefficient is only significant for underreporting at the 15 percent threshold and a medium education level.
4 Implications for Wealth, Inequality, Housing Market, and Affordability Statistics
4.1 Net Wealth, Residential Housing Wealth, and Its Distribution
Switching from reported to imputed values means an adjustment of the measured value of the current residential housing stock and net wealth at current market prices. These computations may be particularly helpful for filling gaps in Luxembourg's canon of official statistics. STATEC conducts an annual survey to update statistics on the current housing stock. A questionnaire is sent to every registered owner of newly constructed dwellings. However, only construction costs are collected, which are usually far from market sales prices.17 In addition, in the non-financial accounts dwellings (ESA, 2010, code: AN.111) and land underlying buildings and structures (ESA, 2010, code: AN.2111) are currently not distinguishable between households and non-profit institutions serving households (NPISHs). There are no reliable estimates of the current market value of the residential housing stock owned by private households (see also EG-LMM, 2020).
Table 9 reports owner-occupiers' residential housing wealth and net wealth according to reported and imputed market values for different quintiles of the income and wealth distribution. We focus here on distributional indicators revealing the impact on different income and wealth groups using objectively imputed rather than subjectively reported values.
Indicator | Share observed [%] | Share imputed [%] | Share difference [pp] | Median observed [EUR] | Median imputed [EUR] | Median difference [EUR] | Significance | |
---|---|---|---|---|---|---|---|---|
Complete sample (, ) | ||||||||
Residential housing wealth | ||||||||
Total | 100 | 100 | – | 600,000 | 654,104 | 54,104 | *** | |
(14,277) | (8,129) | |||||||
Net wealth | ||||||||
Total | 100 | 100 | – | 498,454 | 563,334 | 64,880 | ||
(23,399) | (20,047) | |||||||
Net wealth breakdowns by | ||||||||
Net income—Q1 | 8.8 | 9.6 | 0.9 | 74,210 | 79,292 | 5,082 | ||
Net income—Q2 | 12.2 | 12.7 | 0.5 | 272,193 | 389,896 | 117,703 | ||
Net income—Q3 | 14.0 | 14.4 | 0.5 | 473,780 | 549,618 | 75,838 | ||
Net income—Q4 | 20.9 | 20.9 | 0.0 | 707,394 | 746,932 | 39,538 | ||
Net income—Q5 | 44.2 | 42.3 | −1.9 | 1,040,763 | 1,013,534 | −27,229 | ||
Net wealth—Q1 | 0.2 | 0.2 | 0.0 | 7,060 | 7,736 | 676 | ||
Net wealth—Q2 | 3.9 | 5.5 | 1.6 | 157,138 | 203,639 | 46,501 | ||
Net wealth—Q3 | 11.1 | 12.6 | 1.5 | 498,751 | 558,022 | 59,271 | *** | |
Net wealth—Q4 | 19.0 | 19.4 | 0.4 | 839,520 | 852,966 | 13,446 | ||
Net wealth—Q5 | 65.8 | 62.3 | −3.5 | 1,858,008 | 1,840,418 | −17,590 | ** | |
Owner-occupiers (, ) | ||||||||
Residential housing wealth | ||||||||
Total | 100 | 100 | – | 652,000 | 696,994 | 44,994 | *** | |
(18,730) | (11,731) | |||||||
Net wealth | ||||||||
Total | 100 | 100 | – | 732,360 | 783,653 | 51,293 | *** | |
(19,718) | (17,050) | |||||||
Net wealth breakdowns by | ||||||||
Net income—Q1 | 8.9 | 9.9 | 0.9 | 513,081 | 683,383 | 170,302 | *** | |
Net income—Q2 | 11.7 | 12.2 | 0.5 | 573,937 | 646,734 | 72,797 | *** | |
Net income—Q3 | 14.0 | 14.5 | 0.5 | 636,916 | 702,271 | 65,355 | *** | |
Net income—Q4 | 20.8 | 20.9 | 0.0 | 784,495 | 804,944 | 20,448 | ** | |
Net income—Q5 | 44.6 | 42.6 | −2.0 | 1,121,400 | 1,111,741 | −9,659 | ||
Net wealth—Q1 | 0.0 | 0.0 | 0.0 | 2,240 | 56,989 | 54,749 | * | |
Net wealth—Q2 | 2.9 | 4.6 | 1.7 | 226,010 | 326,988 | 100,977 | *** | |
Net wealth—Q3 | 11.1 | 12.6 | 1.5 | 508,324 | 570,284 | 61,959 | *** | |
Net wealth—Q4 | 19.7 | 20.1 | 0.4 | 844,560 | 858,481 | 13,921 | * | |
Net wealth—Q5 | 66.4 | 62.7 | −3.7 | 1,849,368 | 1,823,905 | −25,463 | *** |
- Notes: Residential housing wealth is the total current value of households' main residences (neglecting partial ownership when owners may not all belong to the same household). The panel headings report the number of survey observations and the sum of survey weights, that is, the number of households in Luxembourg. Net wealth takes partial ownership of residential housing wealth into account. Q1–Q5 are quintiles. Standard deviations are reported in parentheses. Test statistics use quantile regressions taking into account replicate weights and the multiply imputed nature of the LU-HFCS. Significance is coded using standard notation: * p-value < 0.1; ** p-value < 0.05; *** p-value < 0.01. See Table C.2 for totals in EUR.
- Source: LU-HFCS, third wave and authors' calculations based on data from the Housing Observatory, on advertisements available between January 1 and December 31, 2018.
Overall, our imputation increases average total residential housing wealth—and thus also total net wealth—jointly held by Luxembourg households. The increase is substantial in magnitude and also significant in statistical terms: imputed market values increase total median net worth of owner-occupiers by roughly EUR 50,000.
We observe the largest increases among households in the second net wealth quintile. For the median owner-occupiers in this quintile, net wealth increases by around EUR 100,000. Using reported values, the lowest quintile of households held no wealth at the median; that is, the current value of their liabilities roughly matches the current value of their assets. With imputed prices, this number turns positive.
According to the imputed values, households in the top quintile (both in terms of income and wealth) are less prosperous than they report in surveys.
In the transition from observed to imputed values for owner-occupiers (Table 9, lower panel), we observe stable or slightly higher shares of net wealth held by the lowest quintile of the net wealth or net income distribution, and decreasing shares of the top quintile. This suggests a decrease in net wealth inequality along the wealth and income distribution. Changes in the median for the lowest and top quintile confirm this pattern. The Gini coefficient for the net wealth of owner-occupiers falls from 53.69 (observed) to 50.48 (imputed). At the same time, however, the increase in property values is widening the wealth gap between owner-occupiers (who are generally richer) and tenants (who are generally poorer). The impact of imputed values on the net wealth distribution can be seen in the upper panel of Table 9, which includes both owner-occupiers and renters. The pattern of net wealth along the wealth and income distribution can be confirmed, even if it is somewhat less pronounced: the Gini coefficient of net wealth decreases from 65.17 (observed) to 63.06 (imputed) for the complete sample. The use of imputed house prices therefore has an equalizing effect not only on the net wealth of owners but also on the entire net wealth distribution.
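For readers who want to reproduce this type of inequality comparison, a weighted Gini coefficient can be computed from the Lorenz curve as sketched below. This is a simplified illustration: the function name and the toy data are assumptions, the result is on a 0 to 1 scale (the figures above are multiplied by 100), and the replicate weights and multiple imputations of the LU-HFCS are ignored.

```python
def weighted_gini(values, weights):
    """Gini coefficient from the weighted Lorenz curve (trapezoid rule).

    Assumes a positive weighted total of `values`.
    """
    pairs = sorted(zip(values, weights))
    total_weight = sum(w for _, w in pairs)
    total_value = sum(v * w for v, w in pairs)
    cumulative = 0.0
    area = 0.0  # twice the area under the Lorenz curve
    for value, weight in pairs:
        previous = cumulative
        cumulative += value * weight
        area += (weight / total_weight) * (previous + cumulative) / total_value
    return 1.0 - area

# Hypothetical net wealth (EUR) with survey weights
wealth_reported = [10_000, 150_000, 500_000, 850_000, 1_900_000]
wealth_imputed = [60_000, 200_000, 560_000, 860_000, 1_850_000]
survey_weights = [1.0, 1.0, 1.0, 1.0, 1.0]

gini_reported = weighted_gini(wealth_reported, survey_weights)
gini_imputed = weighted_gini(wealth_imputed, survey_weights)
```

In the toy data, wealth rises at the bottom and falls slightly at the top after imputation, so the Gini coefficient decreases, which is the direction of the effect described above.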
Table 10 analyzes the changes in the net wealth distribution when switching from reported to imputed prices. If the change had no effect, all observations would stay in the same quintile; put differently, there would only be zeros off the diagonal of the transition matrix in Table 10. While low-wealth households indeed largely remain in the lowest wealth quintile, a non-negligible share moves out of the second (4 of 20 percentage points), third (8 of 20), fourth (7 of 20), and top (3 of 20) quintiles. Although switches predominantly occur between neighboring quintiles, a small fraction also moves to more distant groups. Overall, these changes in total net wealth lead to a small decrease in the measured wealth inequality along the income and wealth distribution as the distributions are squeezed.
Reported (rows) vs. imputed (columns) | Q1 | Q2 | Q3 | Q4 | Q5 | Total |
---|---|---|---|---|---|---|
Q1 | 19.65 | 0.42 | 0.00 | 0.00 | 0.00 | 20.07 |
Q2 | 0.41 | 16.00 | 2.97 | 0.58 | 0.08 | 20.04 |
Q3 | 0.00 | 3.41 | 12.44 | 3.86 | 0.23 | 19.94 |
Q4 | 0.00 | 0.11 | 4.32 | 12.84 | 2.75 | 20.02 |
Q5 | 0.00 | 0.02 | 0.29 | 2.72 | 16.91 | 19.94 |
Total | 20.06 | 19.97 | 20.01 | 20.00 | 19.96 | 100.00 |
- Notes: For the complete sample (owners and renters), the table reports the percentage of observations changing quintile in the net wealth distribution due to the imputation. Figure C.4 presents these results graphically.
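A transition matrix of this kind can be built by assigning each household to a quintile under both valuations and cross-tabulating the results. The sketch below is unweighted and uses hypothetical data and function names; the figures in Table 10 additionally reflect survey weights.

```python
def quintile_of(values):
    """Quintile (1 to 5) of each value by rank; unweighted for simplicity."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    quintile = [0] * n
    for rank, i in enumerate(order):
        quintile[i] = 5 * rank // n + 1
    return quintile

def transition_matrix(reported, imputed):
    """5x5 matrix of percentage shares: rows = reported, columns = imputed."""
    q_reported, q_imputed = quintile_of(reported), quintile_of(imputed)
    n = len(reported)
    matrix = [[0.0] * 5 for _ in range(5)]
    for a, b in zip(q_reported, q_imputed):
        matrix[a - 1][b - 1] += 100.0 / n
    return matrix

# Hypothetical net wealth of ten households (thousand EUR)
reported = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
imputed = list(reported)
imputed[1] = 35  # one household is revalued upward and overtakes its neighbor

matrix = transition_matrix(reported, imputed)
stayers = sum(matrix[k][k] for k in range(5))  # share remaining in its quintile
```

Because quintiles are defined by rank, one household moving up necessarily pushes another one down, which is why off-diagonal entries in Table 10 appear on both sides of the diagonal.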
These quite significant changes indicate that an objective evaluation does not simply shift the entire distribution but reveals differential effects along several dimensions. This can lead to changes in wealth decompositions, for example, distributional national accounts (see EG-LMM, 2020; Waltl & Chakraborty, 2022; Waltl, 2022). The impact is likely larger for countries with a significant share of owner-occupiers. For purely illustrative purposes, applying the shifts along the wealth and income distribution identified in Luxembourg to other countries participating in the HFCS provides an indication of how important reporting issues could be.
For this hypothetical analysis of other countries, we take country-specific total residential housing wealth and owner-occupation rate, and apply Luxembourg's measured mis-reporting rates per wealth and income18 quintile to owner-occupiers in other countries. To obtain a full wealth measure, we then subtract reported residential housing wealth from total wealth and plug in the adjusted equivalent. Appendix B describes the compilation steps applied.
Results are compiled by first adjusting quintile-specific wealth for owner-occupiers. The proportion of owner-occupiers varies widely across countries in the European HFCS (ranging from 43.9 percent in Germany to 88.8 percent in Slovakia in 2018). Indeed, the relation between owner-occupation rate and the change in net wealth is statistically significant as reported in Table 11.
Indicator | Germany | France | Luxembourg | Italy | Slovakia |
---|---|---|---|---|---|
Owner-occupation rate [%] | 43.9 | 57.9 | 69.0 | 68.5 | 88.8 |
Change in net wealth [%] | 3.28 | 3.49 | 3.56 | 3.89 | 4.20 |
Pearson correlation | 93.76% (p-value: 0.0185) |
- Notes: The table reports changes in measured total net wealth for Luxembourg and four other European countries, ordered by increasing change in net wealth, when applying the mis-reporting shares found for Luxembourg. The last row reports the Pearson correlation between changes in measured total net wealth and the owner-occupation rate across countries. The p-value reports the result of a Pearson's product-moment correlation test; the null hypothesis of zero correlation is rejected at the 5 percent level. See Appendix B for computational details.
- Source: HFCS, third wave.
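The correlation reported in Table 11 can be approximately reproduced from the rounded values printed in the table. The sketch below computes the Pearson correlation and the associated t statistic with n − 2 degrees of freedom; the variable and function names are illustrative, and small differences from the reported figure are to be expected because the inputs are rounded.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Values as printed in Table 11 (Germany, France, Luxembourg, Italy, Slovakia)
owner_rate = [43.9, 57.9, 69.0, 68.5, 88.8]
wealth_change = [3.28, 3.49, 3.56, 3.89, 4.20]

r = pearson(owner_rate, wealth_change)
n = len(owner_rate)
# Test statistic of the correlation test; compared against a
# Student t distribution with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
```

With only five countries the test has three degrees of freedom, so the result is best read as illustrative, in line with the hypothetical nature of the exercise.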
Therefore, we select four HFCS countries that represent different realities in this regard. This includes the countries with the lowest (Germany) and the highest (Slovakia) proportion of owner-occupiers, as well as two with intermediate shares (France and Italy). We expect changes to be larger for countries with a larger share of owner-occupiers.
Table B.2 in the appendix reports full results. In general, the adjustment factors computed for Luxembourg lead to a small decrease in inequality and a large increase in overall wealth. As reported in Table 11, total net wealth is most affected in Slovakia, which also had the largest share of owner-occupiers. At the other extreme, total net wealth is least affected in Germany, which had the lowest share of owner-occupiers. This finding suggests that wealth distributions may be poorly measured in countries with large shares of owner-occupiers or large shares of residential housing in overall wealth. Given the very large amounts at play even small shifts—whenever found to be systematic—imply large aggregate impacts.
4.2 Housing Market Indicators
Common housing market measures include the price-to-rent (PR) ratio, price-to-income (PI) ratio, and the rent-to-income (RI) ratio. PI and RI ratios are widely used affordability measures: A high RI ratio implies households spend a large share of monthly earnings on rent payments. A high PI ratio suggests low affordability. The PR ratio is considered a measure of investment potential and the sustainability of the housing market: a high ratio suggests that sales prices are high compared to potential returns, that is, rents. This implies high expected future house prices according to the user cost model for durable goods (Hicks, 1975). For residential housing, Himmelberg et al. (2005) show that in equilibrium the cost of buying and using a housing unit for a period (the sales price $P$ multiplied by the per-dollar user cost $u$) should equal the total rent $R$ for the same period: $P \cdot u = R$, where $u$ includes the interest rate, maintenance and average transaction costs, the depreciation rate for housing, a risk premium for owning as opposed to renting, and expected capital gains over the period. In the case of Luxembourg, mortgage payments are tax deductible (see Subsection 4.3), which is also accounted for in the user cost formula. This framework implies that in periods of high PR ratios, investing in properties is less attractive due to a limited earning potential.
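To make the equilibrium condition concrete: if the sales price times the per-dollar user cost equals the annual rent, the implied PR ratio is the inverse of the user cost. The sketch below uses a deliberately simplified additive user cost with hypothetical component values; it assumes that expected capital gains enter with a negative sign and omits the tax treatment of mortgage payments.

```python
def user_cost(interest, maintenance, depreciation, risk_premium, expected_gain):
    """Simplified annual per-euro user cost (illustrative, taxes omitted).

    Expected capital gains lower the cost of owning.
    """
    return interest + maintenance + depreciation + risk_premium - expected_gain

def implied_price_to_rent(u):
    """Price * u = annual rent implies price / annual rent = 1 / u."""
    return 1.0 / u

# Hypothetical annual rates
u = user_cost(interest=0.02, maintenance=0.01, depreciation=0.01,
              risk_premium=0.02, expected_gain=0.03)
pr_ratio = implied_price_to_rent(u)

# Higher expected capital gains lower the user cost and raise the implied ratio
u_optimistic = user_cost(0.02, 0.01, 0.01, 0.02, expected_gain=0.04)
pr_optimistic = implied_price_to_rent(u_optimistic)
```

A user cost of 3 percent corresponds to a PR ratio of roughly 33, and raising expected capital gains by one percentage point pushes the implied ratio to 50, which illustrates how sensitive the equilibrium ratio is to expectations.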
Hill & Syed (2016) observe that the PR ratio is frequently monitored over time because a sales and a rent price index are the only inputs needed. However, direct measurement of the PR ratio is rare because it is hard to simultaneously observe price and rent for a given dwelling. Our double-imputed dataset allows us to calculate both sales price and rent for each dwelling and match them to the income of the current inhabitant.
There are two main challenges for measuring PR ratios: commonly used measures either fail to be representative of the entire stock of houses or compare prices of very heterogeneous properties (see Bracke, 2015; Hill & Syed, 2016; Waltl, 2018, for different procedures to compute quality-adjusted ratios when relying directly on market data). The first issue arises when relying on transaction data only, as this usually represents just a tiny share of the housing stock and is unlikely to be representative of the housing stock as a whole. The second issue is particularly important for obtaining unbiased ratios, as the equilibrium condition in the user cost model implicitly assumes that the sales price and the rent refer to properties of equivalent quality, as noted by Hill & Syed (2016).
A third issue arises when income enters as an additional dimension to compile RI or PI ratios. For market data, matching information on household income is hardly ever available. It is thus cumbersome to compare incomes to prices or rents beyond comparing the medians (see also Gan & Hill, 2009). In our case, the household income is available from the survey.
The double-imputed data overcome all three types of shortcomings and allow us to compute unbiased population-representative ratios by aggregating individual ratios. Thus, no quality adjustment à la Hill & Syed (2016) is needed to ensure a comparison of like with like, and aggregation issues are circumvented.
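Aggregating individual ratios can be sketched as follows: each household receives its own PR, PI, and RI ratio from the imputed sales price, the imputed rent, and its surveyed income, and the population figure is a weighted median of these ratios. The records, the field names, and the use of annual rent in the PR ratio are assumptions made for illustration.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value

# Hypothetical households: imputed price and rent, surveyed income, survey weight
households = [
    {"price": 600_000, "monthly_rent": 1_500, "annual_income": 60_000, "weight": 1.0},
    {"price": 900_000, "monthly_rent": 2_500, "annual_income": 75_000, "weight": 3.0},
    {"price": 400_000, "monthly_rent": 1_250, "annual_income": 40_000, "weight": 1.0},
]
weights = [h["weight"] for h in households]

# Dwelling-level ratios: price and rent always refer to the same dwelling
pr = [h["price"] / (12 * h["monthly_rent"]) for h in households]
pi = [h["price"] / h["annual_income"] for h in households]
ri = [12 * h["monthly_rent"] / h["annual_income"] for h in households]

median_pr = weighted_median(pr, weights)
median_pi = weighted_median(pi, weights)
median_ri = weighted_median(ri, weights)
```

Because both prices refer to the same dwelling and income to its current inhabitant, the median is taken over like-with-like ratios rather than formed as a ratio of two separately computed medians.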
Figure C.3 shows the results. Median PR ratios are about 32, which is considered as high in international comparisons19 and associated with low sustainability. Owner-occupiers have higher PR ratios over large parts of the distribution. These differences are largely statistically significant, as indicated by the non-overlapping confidence intervals.
In line with our argumentation, the OECD writes that aggregate statistics “provide only a general indication of the extent to which housing is (un)affordable for a (median) household, they are ill suited to support policy makers in targeting housing supports to different groups” (OECD, 2021, Box 1.1). Our micro-data yield more disaggregated indicators revealing a substantial degree of variation within Luxembourg (see Figure 6): differences appear across regions and across tenure types. In particular, the canton that includes the capital, Luxembourg City, displays very high ratios, indicating pronounced affordability concerns.

Notes: The figures depict median PR, PI, and RI ratios by canton relying on imputed prices and rents (renters and homeowners). Cantons with a low number of survey respondents are merged with their neighbors. Corresponding values are reported in Table C.1 in Appendix C.1.
Similar results are obtained when comparing house prices to income. For 80 percent of the population the PI ratio varies between 6.4 and 31.3. The median of 12 indicates that for 50 percent of the population acquiring their home at current market prices would require financial resources equal to 12 total annual household incomes—excluding transaction costs and interest. The PI ratio is higher for renters than for owner-occupiers reflecting lower incomes among renters (see Table C.3 in Appendix C.4).
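The aggregation of household-level ratios into population-representative medians can be illustrated with a minimal sketch. It assumes that the PR ratio relates the imputed price to twelve times the imputed monthly rent and that RI relates annual rent to annual household income; the function names, the weighted-median rule, and the three example households are hypothetical and are not taken from the LU-HFCS.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative survey weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(w for _, w in pairs) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("empty input")


def household_ratios(price, monthly_rent, annual_income):
    """PR, PI and RI for one dwelling (assumed definitions, see lead-in)."""
    annual_rent = 12.0 * monthly_rent
    return {
        "PR": price / annual_rent,
        "PI": price / annual_income,
        "RI": annual_rent / annual_income,
    }


# Hypothetical households:
# (imputed price, imputed monthly rent, annual income, survey weight)
households = [
    (600_000, 1_500, 50_000, 1.0),
    (800_000, 2_000, 80_000, 2.0),
    (450_000, 1_250, 30_000, 1.0),
]

ratios = [household_ratios(p, r, y) for p, r, y, _ in households]
weights = [w for *_, w in households]

median_pr = weighted_median([x["PR"] for x in ratios], weights)
median_pi = weighted_median([x["PI"] for x in ratios], weights)
print(median_pr, median_pi)
```

Because each ratio is formed at the dwelling level before aggregation, price, rent, and income always refer to the same household, which is what makes a separate quality adjustment unnecessary.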
4.3 Housing Affordability Indicators
Linneman & Wachter (1989) show how borrowing constraints reduce the likelihood of home purchase. Low earnings reduce the repayment capacity (income effect), and insufficient savings limit a down-payment required to secure a mortgage (wealth effect). Gan & Hill (2009) operationalize income effects as repayment affordability (RA) and wealth effects as purchase affordability (PA). Both criteria have to be fulfilled for a home purchase, that is, PA and RA must hold jointly.
PA is defined as a household's ability to finance the purchase. This criterion is fulfilled if the household either owns enough funds right away or is eligible to borrow sufficiently. RA, however, captures the ability to bear the monthly financial burden imposed on a household repaying its mortgage debt. A successful purchase takes place if both criteria are simultaneously fulfilled.
To assess affordability, we restrict the analysis to current renters, that is, potential future owners. We assume the dwelling they rent matches their current needs. This assumption allows us to address the question: Could renters buy their main residence?
Table 12 summarizes Luxembourg-specific calibration parameters. In addition, we adjust for policies affecting truly incurred costs, as Luxembourg lowers the entry barriers for first-time owners via several favorable tax treatments (see LISER, 2022). These include most prominently the Bëllegen Akt, which provides first-time buyers (whom we identify with renters in the HFCS) with a tax credit on certain purchase-related administrative fees and taxes up to a total of EUR 20,000 for a single buyer and EUR 40,000 for couples (see again Table 12). In addition, interest payments on mortgages are tax-deductible (see also Girshina et al., 2021; Kaempff, 2018). Further subsidies are granted depending on income and family situation. We do not take these into account, as one can only apply for them after the acquisition (thus not directly affecting PA) and they constitute one-time payments (thus not affecting RA).20
Description | Rate/Value | Details
---|---|---
Transaction costs components | In % of the purchase price |
Registration tax | 6% | ✓
Transcript tax | 1% | ✓
Surtax | 3% | ✓; Luxembourg City only
Notary fees | 1.5% | ✘
Total transaction costs | 11.5% | Within Luxembourg City
Total transaction costs | 8.5% | Outside of Luxembourg City
Tax exempt | Household-specific | Granted tax exempt in % of the purchase price
Average lending rate | 1.74% | Can be deducted in the income tax declaration
Maximum maturity for mortgages | 30 years |
Maximum age for repaying mortgages | 66 years |
Notes: The table summarizes key calibration variables for computing purchase and repayment affordability measures. The reported rates are as of 2018. The table specifies which taxes are eligible for the tax credit Bëllegen Akt (✓/✘).
To assess PA, we calculate the initial loan-to-value (LTV) ratio needed to finance the house purchase, that is, the amount of mortgage taken out by a household divided by the (imputed) current market value of its main residence.
An LTV ratio less than or equal to 100 percent implies that net liquid assets must cover at least transaction costs. Around 0.7 percent of all renters have sufficient net liquid assets to finance the purchase without a mortgage. The required external financing is estimated to be EUR 562,000 for the average renter and EUR 508,000 for the median renter.
Transaction costs are rather high in Luxembourg.23 Costs for the buyer include the registration tax (6 percent of the property price), the transcript tax (1 percent), notary fees (around 1.5 percent), and an additional surtax for Luxembourg City (3 percent). This means minimum transaction costs of 11.5 percent in Luxembourg City and 8.5 percent in the rest of the country. Real estate agents' commissions are not considered here as they are typically payable by the seller.
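The arithmetic behind these totals, together with the Bëllegen Akt tax credit, can be sketched as follows. The rates are those reported in Table 12; the function name, the way the credit cap is passed in, and the example purchase price are hypothetical, and the sketch simply assumes that the credit offsets the eligible taxes one-for-one up to the cap.

```python
def transaction_costs(price, in_luxembourg_city, tax_credit_cap=0.0):
    """Return (gross, net) buyer transaction costs in EUR.

    Rates follow Table 12. The credit cap (e.g. 20_000 for a single buyer,
    40_000 for a couple) is applied only to the taxes marked as eligible;
    notary fees are not creditable.
    """
    registration = 0.06 * price
    transcript = 0.01 * price
    surtax = 0.03 * price if in_luxembourg_city else 0.0
    notary = 0.015 * price

    creditable = registration + transcript + surtax
    credit = min(creditable, tax_credit_cap)

    gross = creditable + notary
    return gross, gross - credit


# Hypothetical dwelling priced at EUR 500,000
gross_city, net_city = transaction_costs(500_000, True, tax_credit_cap=20_000)
gross_rest, net_rest = transaction_costs(500_000, False, tax_credit_cap=40_000)
print(gross_city / 500_000, gross_rest / 500_000)
```

In this illustration the gross costs correspond to 11.5 percent of the price inside Luxembourg City and 8.5 percent elsewhere, while the net amount depends on the household-specific credit.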
For 2018, we find that 69.4 percent (3.0 pp)24 of all renting households do not jointly fulfill purchase affordability criteria (6) and (7), meaning they do not have sufficient net liquid assets for the required transaction costs. Only 5.7 percent (1.4 pp) of all renting households could achieve an initial LTV of 80 percent or less when buying their main residence at market prices.
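A minimal sketch of the purchase affordability check is given below. It assumes that a household puts all of its net liquid assets towards the purchase and borrows the remainder, so that the mortgage equals price plus transaction costs minus net liquid assets; this financing rule, the function names, and the example figures are illustrative assumptions rather than the exact specification of criteria (6) and (7).

```python
def initial_ltv(price, transaction_costs, net_liquid_assets):
    """Initial loan-to-value ratio when all net liquid assets are used up front."""
    mortgage = max(price + transaction_costs - net_liquid_assets, 0.0)
    return mortgage / price


def purchase_affordable(price, transaction_costs, net_liquid_assets, max_ltv=1.0):
    """True if the required mortgage does not exceed max_ltv times the price."""
    return initial_ltv(price, transaction_costs, net_liquid_assets) <= max_ltv


# Hypothetical renter: dwelling worth EUR 500,000, transaction costs EUR 42,500
print(purchase_affordable(500_000, 42_500, 50_000))               # LTV cap of 100 percent
print(purchase_affordable(500_000, 42_500, 50_000, max_ltv=0.8))  # stricter cap of 80 percent
print(purchase_affordable(500_000, 42_500, 20_000))               # assets below transaction costs
```

Under this rule, an LTV of at most 100 percent is equivalent to net liquid assets covering at least the transaction costs, which mirrors the reasoning in the text.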
Mortgage interest payments are tax deductible:26 in the first 5 years, the deductible amount is limited to EUR 2,000 per year and household member (reference person, spouse or registered partner, and their children) and decreases step-wise for subsequent years. Taxable income is not directly reported in the HFCS, so we need to estimate the amount saved relying on a strong assumption: we multiply the marginal tax rate of 39 percent (+7 percent of this amount as an additional contribution to the employment fund) by the amount of tax-deductible interest payments.27 Furthermore, we subtract the amount of taxes saved via the tax deductibility of mortgage interest payments.
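The estimated tax saving can be illustrated with a short sketch covering the first five years only, since the step-wise reduction for later years is not specified here. The function and parameter names are hypothetical; the 39 percent marginal rate, the 7 percent employment-fund contribution, and the EUR 2,000 cap per household member are taken from the text.

```python
def annual_tax_saving(interest_paid, household_members,
                      marginal_rate=0.39, fund_surcharge=0.07,
                      cap_per_member=2_000.0):
    """Estimated yearly tax saving from deductible mortgage interest.

    Applies the cap valid during the first five years of the mortgage.
    The effective rate is the marginal rate plus the employment-fund
    contribution levied on that amount.
    """
    deductible = min(interest_paid, cap_per_member * household_members)
    effective_rate = marginal_rate * (1.0 + fund_surcharge)
    return effective_rate * deductible


# Hypothetical three-person household paying EUR 8,700 interest in a year:
# only EUR 6,000 is deductible under the cap.
saving_family = annual_tax_saving(8_700, household_members=3)

# Hypothetical single household paying EUR 1,500 interest: fully deductible.
saving_single = annual_tax_saving(1_500, household_members=1)
print(saving_family, saving_single)
```

The effective rate of roughly 41.7 percent applies uniformly in this sketch, which reflects the strong assumption that every household faces the same marginal tax rate.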
As a final ingredient for the financial margin, we need to estimate basic living costs based on survey questions.29 The estimated mean monthly basic living costs range from EUR 814 in the lowest net income quintile to EUR 1,909 in the highest quintile.
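The definition of basic living costs given in footnote 29 can be written out as a small function. The sketch assumes that all inputs are monthly amounts in EUR and that the square root equivalence scale divides net household income by the square root of household size; the function name and the example values are hypothetical.

```python
import math


def basic_living_costs(food_at_home, food_outside, utilities,
                       net_income, household_size):
    """Monthly basic living costs following the components in footnote 29.

    Food at home, half of food outside the home, utilities, and 10 percent
    of net equivalized income as a proxy for expenditures that are not
    explicitly covered (e.g., clothing or mobility).
    """
    equivalized_income = net_income / math.sqrt(household_size)
    return (food_at_home
            + 0.5 * food_outside
            + utilities
            + 0.10 * equivalized_income)


# Hypothetical four-person household with EUR 4,000 monthly net income
costs = basic_living_costs(food_at_home=400, food_outside=200,
                           utilities=250, net_income=4_000,
                           household_size=4)
print(costs)
```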
We find a positive financial margin for 37 percent (2.7pp) of all renting households. This means that they have sufficient income to cover all recurring expenses including a hypothetical mortgage financing the acquisition of their current main residence. Only 18.1 percent (2.2pp) fulfill both criteria (PA and RA).30
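How a financial margin of this kind might be computed is sketched below. The sketch assumes a standard fixed-rate annuity mortgage with monthly payments, a maturity capped by both the 30-year limit and the repayment age of 66 from Table 12, and a margin defined as net income minus basic living costs minus the mortgage payment plus the monthly tax saving. These modelling choices and all names are assumptions made for illustration, not the exact specification used for the reported results.

```python
def max_maturity_years(age, max_years=30, max_age=66):
    """Longest admissible mortgage maturity for a borrower of the given age."""
    return max(min(max_years, max_age - age), 0)


def monthly_annuity(principal, annual_rate, years):
    """Constant monthly payment of a fully amortizing fixed-rate loan."""
    n_payments = 12 * years
    if n_payments <= 0:
        raise ValueError("maturity must be positive")
    monthly_rate = annual_rate / 12.0
    if monthly_rate == 0.0:
        return principal / n_payments
    return principal * monthly_rate / (1.0 - (1.0 + monthly_rate) ** (-n_payments))


def financial_margin(net_income, living_costs, mortgage_payment,
                     monthly_tax_saving=0.0):
    """Monthly income left after living costs and the net mortgage burden."""
    return net_income - living_costs - mortgage_payment + monthly_tax_saving


# Hypothetical renter aged 45 borrowing EUR 500,000 at the 2018 average rate
years = max_maturity_years(45)
payment = monthly_annuity(500_000, 0.0174, years)
margin = financial_margin(net_income=5_000, living_costs=950,
                          mortgage_payment=payment, monthly_tax_saving=200)
repayment_affordable = margin > 0
print(years, round(payment, 2), repayment_affordable)
```

Using the age of the reference person to cap the maturity follows the calibration described in footnote 30; substituting the younger or older partner's age would shift the maturity and hence the payment.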
Table C.6 in the appendix shows different categories of renting households that fulfill both the purchase and the repayment affordability criteria, or either criterion in isolation. The likelihood of simultaneously fulfilling both decreases with age. It is also lower for households resident in the canton Luxembourg compared to the rest of the country. The likelihood of simultaneously fulfilling both affordability criteria increases with net income and net liquid assets. Single households and large families rarely fulfill both criteria.
Furthermore, for 73.1 percent (2.7pp), monthly interest payments (taking into account the tax deductibility of interest payments) would be lower than their reported current monthly rent paid.
To conclude, only 15.3 percent (2.0pp) of all renting households meet both affordability criteria to purchase their main residence and would face lower mortgage payments than their current rent. For the vast majority, however, this purchase is not feasible given current market conditions. Only roughly one-fifth of them, namely 19.7 percent (2.5pp), expect to receive substantial inheritances or gifts in the future, which may help realize such a purchase.
A more nuanced analysis shows that in total 18.9 percent (2.2pp) of all renting households would earn enough income to fulfill RA yet they lack sufficient resources to meet PA. This share decreases to 17.5 percent (2.2pp) if we require that they would have to pay less interest than their current rent. These households thus would greatly benefit from home-ownership but are hampered by lacking wealth to overcome the down-payment constraint (PA).
To assess the robustness of these results with regard to the uncertainty stemming from our imputation models, we apply a global discount equal to 10 percent on imputed market prices for renters' homes. The 10 percent discount is informed by the observed difference between median prices of notary deeds and median prices of advertised prices reported in Table 3.31 Under this assumption, 36.2 percent of all renting households would fulfill the purchase affordability criteria, 40.9 percent would fulfill the repayment affordability criteria, and 21.6 percent would fulfill both criteria jointly. The differences to our main results are rather small: in particular, the last value is just 3.5pp above the corresponding value in our reference scenario. We thus conclude that even if our computations were overstating market conditions, the overall conclusions regarding severe affordability concerns remain unchanged.
5 Conclusions
Surveys routinely ask owner-occupiers to estimate the current market value of their home and ask renters to report their monthly rent. The reliability of these market valuations depends on owners' knowledge of the local housing market and their ability to apply it to their own home. Reported rent, by contrast, is usually far from current market conditions because many tenants are on long-term rent contracts that are rarely updated in Europe's heavily regulated rental markets. Thus, survey data are a sub-optimal source to assess housing markets and compile housing statistics of all kinds.
We propose a feasible approach and clear guidelines on how to incorporate market data into such surveys. For this purpose, we elicit several additional dwelling characteristics in the LU-HFCS, which we match to market data via hedonic models to obtain more objective current market sales and rent values for the entire residential housing stock in Luxembourg.
Not surprisingly, we find large deviations between imputed and reported rents in our technical analysis, as the survey asks for current rents paid and existing rental contracts tend to be slow in adjusting when conditions tighten in the market. However, even for sales prices, where the survey aims to elicit current market values, switching to imputed prices leads to significant changes: an increase in total wealth, yet a small decrease in measured wealth inequality due to two opposing adjustments. The value of homes was adjusted downwards for the most affluent households (both in terms of income and wealth) but was adjusted upwards for the remaining 80 percent.
Our strategy also enables us to impute individual price-to-rent ratios, price-to-income ratios, and rent-to-income ratios for the entire population of Luxembourg. We can assess the Luxembourg housing market in terms of affordability. The results suggest that the real estate price levels made it impossible for the majority of renters to purchase the home they occupy given their financial situation and housing market conditions in 2018.
In this article we demonstrate some potential uses for objective sales and rent prices reflecting current market conditions within a multi-purpose representative survey: a policy tool for micro-simulation purposes as well as a source for population-representative housing market statistics. Other uses extend to policy evaluation targeted to support either home-ownership or renting, stress-testing households' portfolios in hypothetical scenarios, or bringing housing statistics forward or backward in time by adjusting the time window from which market data are retrieved. We thus believe that the proposed strategy carries the potential to greatly extend the scope of housing-related statistics.
References
- 1 Although rents are repeatedly paid, they usually do not reflect current economic conditions since existing contracts are rarely fully adjusted to market prices. This is particularly true for rent controlled markets—the norm in Europe (see Kholodilin, 2020). Rent control usually limits adjustments of existing contracts to changes in a country's consumer price index or average interest rates as for example reported by the Euro Interbank Offered Rate (EURIBOR). Rental-equivalent methods relying on paid rents are thus challenged in a variety of applications (see, for instance, de Haan & Diewert, 2013; Hill et al., 2023).
- 2 Recently acquired homes bought-to-let represent a small and likely specific sub-segment not quite representative of the housing stock as a whole. For instance, Bracke (2015) documents differences in prices as well as dwelling characteristics in housing sales in London depending on the transaction's purpose.
- 3 See the OECD Affordable Housing Database https://www.oecd.org/els/family/HC1-2-Housing-costs-over-income.pdf, last accessed on November 17, 2023.
- 4 The HFCS is the single most important source to compile harmonized wealth-related statistics across Europe. The network is coordinated by the ECB. See https://www.ecb.europa.eu/pub/economic-research/research-networks/html/researcher_hfcn.en.html, last accessed on September 3, 2021.
- 5 See https://observatoire.liser.lu, last accessed on November 30, 2021.
- 6 Similar imputations are occasionally performed to fill up missing values or gain counterfactual rents for owner-occupied dwellings in the United States' Panel Study of Income Dynamics (PSID), the Household, Income and Labour Dynamics in Australia (HILDA) Survey, and the German Socio-Economic Panel (SOEP) as highlighted in Alexeev (2020).
- 7 The construction of the final population weights of the LU-HFCS follows the principles outlined by the Household Finance and Consumption Network (HFCN, 2020, sect. 5). The precise weighting procedure applied for the LU-HFCS is described in Chen et al. (2020, sect. 5.8) and Girshina et al. (2017, sect. 2.6.4). With the best available sampling frame in Luxembourg, the Social Security Register, we still miss around 10 percent of the population including households exclusively consisting of international civil servants and (standard for household surveys) collective households such as, for instance, homes for the elderly.
- 8 See the Luxembourg Data Platform for population sizes per postcode: https://data.public.lu/en/datasets/population-par-code-postal-population-per-postal-code (last accessed on January 11, 2021).
- 9 The imputation procedure applied to the LU-HFCS dataset is described in Chen et al. (2020, sect. 5.8) and Girshina et al. (2017, sect. 2.6.3).
- 10 The capital Luxembourg City has a population of nearly five times that of the second-largest urban area around Esch-sur-Alzette, which in turn is 40 percent larger than the third-largest town. For further information, see the latest population per municipality figures provided by STATEC: https://statistiques.public.lu/stat/TableViewer/tableView.aspx, last accessed on January 11, 2021.
- 11 Modeling locational effects as a regression spline means that a non-parametrically estimated price function defined on the location of properties is added as an explanatory variable to the regression model. The rationale for using such locational splines is similar to including region- or postcode fixed effects yet preserves the neighborhood structure. This implies that prices observed in postcodes close to each other can influence each other. Similar locational splines have been used before in hedonic housing market models; see, for instance, Waltl (2016); Hill & Scholz (2018); Kholodilin et al. (2021).
- 12 The BCL Financial Stability Review reports that the supply of housing in Luxembourg adjusted only a little between 2011 and 2021 although the population grew by on average roughly 2 percent each year (see Chart 1.7, p. 22 in https://www.bcl.lu/fr/publications/revue_stabilite/RSF-2022/228812_BCL_RSF_2022_01_chap1.pdf, last accessed May 10, 2023). According to their data covering the period 2011 and 2019, just over 3,200 new dwellings were built each year, while the number of households increased by 5,300 annually. Statec estimates an error-correction model of real estate prices in Luxembourg over the past 40 years and finds that the structural surplus in housing demand is one of the main drivers of house price growth in Luxembourg (see STATEC, étude 7.2, https://statistiques.public.lu/en/publications/series/note-conjoncture/2021/note-conjoncture-01-21.html, last accessed May 10, 2023).
- 13 Yet, we recommend that future studies should ideally use individually linked advertised and final sales prices for such an exercise. In this case, the later recorded final price (should it have changed between advertising and contract signing) can be traced back in time to identify an appropriate price-setting.
- 14 According to a one-sided Mood's median test, the values reported in advertisements are statistically significantly higher at the 5 percent level. Furthermore, a Mann–Whitney U test is significant at the 1 percent level indicating that the two populations are nonidentical level-wise.
- 15 For ease of notation, we leave out the superscripts differentiating between the two equations for sales and rent prices from here onward. The models ultimately are identical in their structure and solely differ in the type and number of observations used for estimation.
- 16 The following interviewer ratings did not enter the final specification as coefficients were not significant. These are understanding questions, reliability of income and wealth information, ability to express amounts in EUR, ease in responding, ability to express himself/herself, and three ratings of the dwelling, namely the outward appearance, the comparison to the neighborhood, and the rating of surrounding buildings.
- 17 For details, see http://www.statistiques.public.lu/stat/TableViewer/document.aspx?ReportId=13442/IF_Language=eng/MainTheme=4/FldrName=4/RFPath=35 (last accessed October 21, 2021).
- 18 The harmonized HFCS core questionnaire only asks for gross household income. For Luxembourg, the Pearson correlation coefficient between gross and net household income is .
- 19 See OECD housing price indicators, https://doi.org/10.1787/54a3bf57-en (last accessed September 8, 2021).
- 20 See https://guichet.public.lu/en/citoyens/logement/acquisition/aides-capital/prime-construction-acquisition.html (last accessed November 8, 2021) for details.
- 21 See http://data.legilux.public.lu/file/eli-etat-leg-rcsf-2020-12-03-a969-jo-fr-pdf.pdf (last accessed November 9, 2021).
- 22 Net liquid assets typically exclude other non-mortgage loans and the value of non-self-employment private businesses.
- 23 See https://www.globalpropertyguide.com/Europe/Luxembourg/Buying-Guide (last accessed November 9, 2021).
- 24 Standard errors are reported in parentheses.
- 25 Source: ECB Statistical Data Warehouse, https://sdw.ecb.europa.eu/quickview.do?SERIES_KEY=124.MIR.M.LU.B.A2C.A.R.A.2250.EUR.N (last accessed November 9, 2021). Our model does not simulate the impact of the “Subvention et bonification d'intérêt” (see LISER, 2022), which reduces the mortgage rate for home purchases. Eligibility and size depend on income and family situation. At the same time, low income requires a risk premium on the mortgage rate granted by credit institutions. We argue that the effects of the “Subvention et bonification d'intérêt” and the risk premium on mortgage rates are mutually offsetting and therefore exclude both factors from our simulation.
- 26 See https://guichet.public.lu/en/citoyens/logement/acquisition/aides-indirectes/declarer-residence-principale-secondaire.html (last accessed November 8, 2021).
- 27 This assumption is less strong than it may seem at first sight. In tax class 1 (single taxpayer), a marginal tax rate of 39 percent is charged between EUR 46,700 and EUR 100,750 of taxable income. This constitutes the most common marginal tax bracket. See an analysis of the Luxembourg Conseil Economique et Social, https://ces.public.lu/dam-assets/fr/avis/prix-salaires/2015-fiscalite.pdf (last accessed November 8, 2021).
- 28 A law issued in 2019 (http://data.legilux.public.lu/eli/etat/leg/loi/2019/12/04/a811/jo) allows the Commission de Surveillance du Secteur Financier to set an upper limit to loan maturities between 20 and 35 years. Yet, LU-HFCS data show that 99 percent of home mortgages have an initial maturity of maximum 30 years. Bank statistics on the outstanding stock of mortgages to the household sector (https://www.bcl.lu/fr/statistiques/series_statistiques_luxembourg/11_etablissements_credit/11_07_Tableau.xlsx, table 11.07, version 04/30/2021) report an average share of 7.4 percent of mortgages with an initial maturity of more than 30 years in 2018. The difference between bank statistics and HFCS results is likely due to fundamentally different weighting concepts. The HFCS attributes weights to households to gain a population-representative total with regard to socioeconomic characteristics while bank statistics are weighted by outstanding amounts. In spite of all these differences in concepts, both sources support assuming a maximum maturity of 30 years (hyperlinks last accessed November 8, 2021).
- 29 We define monthly costs as the sum of food consumption at home, 50 percent of food consumption outside home, utilities, and 10 percent of a household's net equivalized income (square root equivalence scale, see Atkinson et al., 1995; OECD, 2015) to approximate expenditures not explicitly covered (e.g., clothing or mobility). These components are available in the LU-HFCS and thus reflect realized spending behavior of the household concerned.
- 30 These results are calibrated using the age of the reference person to determine the maximum maturity for a specific household (average 45.7 years). As it is common for couples to jointly purchase homes, both partners' ages may be relevant in practice. Thus, we estimate an upper and lower bound for the share of households fulfilling RA and fulfilling both PA and RA, respectively. We re-estimate results using the couple's minimum (average 44.2 years) and maximum (average 46.5 years) age, yielding an interval of plausible results: for RA (34.8 percent; 39.5 percent) and for PA and RA jointly (17.34 percent; 18.75 percent). By construction, the resulting intervals overlap the point estimates and are rather narrow. Thus, we proceed with the age of the reference person as main result.
- 31 We would argue that a discount of 10 percent from our imputed market prices provides a lower bound for true market prices of currently rented homes. As laid out in Subsection 2.4, realized transaction prices are most likely very close to advertised ones in the Luxembourgish sellers' market.