Using statistical models to better invest in residential Real Estate
Why Haystock can help you screen for better apartments
TLDR / summary:
I show that using a statistical model to determine whether a unit is attractive can help achieve better investment returns. More specifically, I build a simple investment strategy: I invest in the units my model flags as attractive in a given year (say 2015) and compare their performance (until today) with that of a portfolio of all the other units.
The units flagged by the model are outperforming; the outperformance increases with the level of attractiveness.
The model used for this analysis is the one Haystock uses to screen for apartments. (Haystock also provides all the analytics you should have in front of you before investing in or selling an apartment.)
NB: you should not invest solely based on a model. At Haystock we strongly believe that a model should serve as a screening guide to help unearth opportunities. Each of those should then be analyzed in much more depth (using the rest of our toolkit, for example).
Methodology
We build a statistical model to determine whether a unit is cheap or expensive. (details around the methodology here).
We go back in time (say to 2015) and use our model (calibrated only with data prior to 2015, to avoid look-ahead bias) to pick attractive units. For example, a unit can be deemed attractive if its market price (in 2015) is below the price predicted by the model by a given margin, say $100/sq ft. Let’s call this portfolio the Investment Portfolio.
All the other units which have not been selected constitute the Control Portfolio.
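The split between the two portfolios can be sketched in a few lines of Python. This is only an illustration, not Haystock's actual code: the `Unit` class, the `split_portfolios` function, the sample prices and the choice to treat a discount exactly equal to the threshold as attractive are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    unit_id: str
    market_ppsf: float     # observed market price in the base year, $/sq ft
    predicted_ppsf: float  # price predicted by the model, $/sq ft


def split_portfolios(units, threshold=100.0):
    """Return (investment, control) lists of units.

    A unit goes into the Investment Portfolio when its market price is
    at least `threshold` $/sq ft below the model price (assumption: >=).
    Every other unit goes into the Control Portfolio.
    """
    investment, control = [], []
    for unit in units:
        discount = unit.predicted_ppsf - unit.market_ppsf
        if discount >= threshold:
            investment.append(unit)
        else:
            control.append(unit)
    return investment, control


# Made-up numbers, for illustration only.
units = [
    Unit("A", market_ppsf=1000.0, predicted_ppsf=1150.0),  # 150 below the model
    Unit("B", market_ppsf=1200.0, predicted_ppsf=1250.0),  # 50 below the model
    Unit("C", market_ppsf=1400.0, predicted_ppsf=1300.0),  # 100 above the model
]
investment, control = split_portfolios(units, threshold=100.0)
print([u.unit_id for u in investment], [u.unit_id for u in control])
```

With a threshold of $100/sq ft, only unit A ends up in the Investment Portfolio; B and C fall into the Control Portfolio.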
Every year thereafter we look for observed data points where units within our Investment Portfolio have been sold. Not all of the units within our Investment Portfolio will be sold every year. We only compute returns when an actual sale happens.
Say that in 2019 only 10 units within our Investment Portfolio were sold in the market. We define the price appreciation of our portfolio as the median price appreciation of those 10 units (expressed in price per sq ft to compare apples to apples). For the Control Portfolio (which contains a lot more units), we compute price appreciation using the median price of the units sold within the portfolio in a given year: in 2019 we would take the median price per sq ft of units sold in 2019 and compare it to that of 2015. This gives a good proxy of how the standard unit in the portfolio has evolved across the years.
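To make the two definitions concrete, here is a minimal sketch of both calculations. The function names and the sample prices are hypothetical; the only thing taken from the methodology is the use of the median, per unit for the Investment Portfolio and per year for the Control Portfolio.

```python
from statistics import median


def investment_appreciation(resales):
    """Median per-unit price appreciation.

    `resales` is a list of (purchase_ppsf, sale_ppsf) pairs, one pair for
    each unit of the Investment Portfolio that sold during the year.
    """
    return median([sale / buy - 1.0 for buy, sale in resales])


def control_appreciation(base_year_ppsf, sale_year_ppsf):
    """Appreciation of the median $/sq ft between the base year and the sale year."""
    return median(sale_year_ppsf) / median(base_year_ppsf) - 1.0


# Made-up prices in $/sq ft, for illustration only.
resales_2019 = [(1000.0, 1100.0), (1200.0, 1260.0), (900.0, 1080.0)]
inv_appreciation = investment_appreciation(resales_2019)

control_2015 = [1000.0, 1100.0, 1200.0]
control_2019 = [1150.0, 1210.0, 1300.0]
ctl_appreciation = control_appreciation(control_2015, control_2019)

print(round(inv_appreciation, 4), round(ctl_appreciation, 4))
```

Note the asymmetry: the first function follows the same units from purchase to sale, while the second compares two different sets of sold units, which is why it is only a proxy for the standard unit.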
For this exercise, I define total_return as the sum of the return from price appreciation (defined above) and the cumulative net rental income (how much you’d get from renting the unit you purchased, say in 2015, up until a given year, after paying taxes and charges). This total_return can be thought of as your return on investment in real estate (not taking mortgage effects and tax rebates into account in this simplistic analysis).
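As a rough sketch, total_return can be computed as below. One assumption here is mine: to add rental income to a percentage return, I express the cumulative net rent as a fraction of the purchase price. The function signature and the numbers are illustrative only.

```python
def total_return(purchase_price, sale_price, net_rent_by_year):
    """Price appreciation plus cumulative net rental income.

    `net_rent_by_year` holds the rent collected each year after taxes and
    charges. Assumption: rent is divided by the purchase price so that it
    can be summed with the price return.
    """
    price_return = sale_price / purchase_price - 1.0
    rental_return = sum(net_rent_by_year) / purchase_price
    return price_return + rental_return


# Made-up example: bought for $1,000,000 in 2015, sold for $1,200,000 in 2019,
# with $30,000 of net rent in each of the four years in between.
example_return = total_return(1_000_000.0, 1_200_000.0, [30_000.0] * 4)
print(round(example_return, 4))
```

In this example 20% comes from price appreciation and 12% from net rent, for a total return of about 32%.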
Results
Looking at a single building: 15 William Street in the Financial District

Estimating the return of an Investment Portfolio that only has 14 units is going to be choppy (very few observed data points).
Therefore we need a broader Investment Portfolio to reduce the noise. One simple idea is to look for buildings that are similar to 15 William Street. I do so with a Luxury Score that takes all the amenities of a building into account and assigns a ranking. Each amenity has a score, and some are worth more than others: for example, having a pool generates more points than having central AC.
In my example I now get a list of 10 buildings to which I can apply my methodology:
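One way such a score could work is sketched below. The point values, the amenity names and the "closest score" notion of similarity are assumptions of mine for illustration; the actual Luxury Score may be built differently.

```python
# Hypothetical point values: a pool is worth more than central AC.
AMENITY_POINTS = {"pool": 5, "doorman": 3, "gym": 2, "central_ac": 1}


def luxury_score(amenities):
    """Sum the points of every amenity a building has (unknown ones count 0)."""
    return sum(AMENITY_POINTS.get(amenity, 0) for amenity in amenities)


def most_similar(reference, buildings, n):
    """Return the n buildings whose score is closest to the reference building's."""
    ref_score = luxury_score(buildings[reference])
    others = [name for name in buildings if name != reference]
    others.sort(key=lambda name: abs(luxury_score(buildings[name]) - ref_score))
    return others[:n]


# Placeholder buildings, not real data.
buildings = {
    "ref": {"pool", "doorman", "gym"},                # score 10
    "b1": {"pool", "doorman"},                        # score 8
    "b2": {"central_ac"},                             # score 1
    "b3": {"pool", "doorman", "gym", "central_ac"},   # score 11
}
peers = most_similar("ref", buildings, n=2)
print(peers)
```

Here the two closest peers of the reference building are b3 and then b1, while b2 is too far away in score to be included.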
[’15 William Street’, ’20 Pine — The Collection’, ’40 Broad Street’, ’75 Wall Street’, ‘Cipriani Club Residences at 55 Wall’, ‘District’, ‘Downtown Athletic Club Building’, ‘Downtown by Philippe Starck’, ‘Greenwich Club’, ‘W Downtown Hotel & Residences’]
Let’s now do the same exercise but across all years (since 2012) and for different levels of attractiveness:
a positive attractiveness number, e.g. +100, means we only select units that had a market price $100/sq ft cheaper than what our model predicted
a negative attractiveness number, e.g. -200, means we only select units that had a market price $200/sq ft higher than what our model predicted (we would expect those units to perform badly)
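The signed levels can be read as a simple filter on the gap between the model price and the market price. The sketch below is one possible reading, under my own assumptions: the function name, the sample gaps, and the treatment of each level as a one-sided cutoff ("at least this cheap" or "at least this expensive").

```python
def select_units(discounts, level):
    """Pick units for a signed attractiveness level.

    `discounts` maps unit id -> predicted $/sq ft minus market $/sq ft,
    so a positive value means the unit is cheaper than the model price.
    Assumption: a positive level keeps units at least that cheap, a
    negative level keeps units at least that expensive.
    """
    if level >= 0:
        return sorted(u for u, d in discounts.items() if d >= level)
    return sorted(u for u, d in discounts.items() if d <= level)


# Made-up gaps in $/sq ft, for illustration only.
discounts = {"A": 150.0, "B": 50.0, "C": -100.0, "D": -250.0}
selected = {level: select_units(discounts, level) for level in (-200, -100, 100)}
print(selected)
```

Running the backtest once per level then gives one Investment Portfolio, and one total return, for each attractiveness number.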
We focus on the total return as of 2023.

Let’s compare with the Control Portfolio:

Let’s look at the outperformance: we take the return of the Investment Portfolio minus the return of the Control Portfolio:

Conclusion
This article was meant to show how using a model can be useful in practice. By flagging units that are attractive we can filter for opportunities which can then be further analyzed.
Haystock does just that: it provides an interactive map where every single active listing is flagged based on its attractiveness.
To recap, I don’t think one should use a model alone (units can have idiosyncrasies that justify a deviation from the model), but overall the units flagged by the model outperformed the rest of the market, as shown above.
I would love to get your feedback on what I am building at Haystock and hear how the tool is (hopefully!) improving your investment process.