London Property Finder, applied machine learning
Role: Applied machine learning, self-directed research. Timeline: 2026.
An interactive tool that scores live London property listings against a stacked machine-learning ensemble, built to test whether publicly available data alone can identify mispriced homes.
The question
Most house-price models predict a number and stop. The question here is sharper: using only publicly available data, can a model flag which current listings look mispriced relative to comparable sold properties, and can the resulting tool be interrogated by the user rather than trusted blindly?
Approach
- Training data. 2.18 million Land Registry transactions spanning 1995 to 2026, deflated to a common reference period using the House Price Index.
- Model. A stacked ensemble of per-property-type, Optuna-tuned XGBoost base learners feeding a quantile-loss meta-learner, producing an 80% prediction interval for every property rather than a bare point estimate.
- Tool. A single-page interface scoring roughly 5,300 live London listings, with sold-nearby comparables drawn from the Land Registry, filter and map views, multi-key sorting and CSV / XLSX export.
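The deflation and interval ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: the index values, the function names `deflate_price`, `pinball_loss` and `flag_listing`, the use of a 10% / 90% quantile pair, and the rule of flagging a listing whose asking price falls outside the interval are all hypothetical.

```python
# Illustrative sketch only: index values, names and the flagging rule are
# assumptions made for this example, not taken from the project.


def deflate_price(price, hpi_at_sale, hpi_reference):
    """Restate a historical sale price at the reference period's price level."""
    return price * hpi_reference / hpi_at_sale


def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for one prediction at quantile q (0 < q < 1)."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return q * diff if diff >= 0 else (q - 1) * diff


def flag_listing(asking, lower, upper):
    """Compare an asking price with a predicted [lower, upper] interval."""
    if asking < lower:
        return "below interval"
    if asking > upper:
        return "above interval"
    return "within interval"


# A hypothetical sale at 200,000 when the index stood at 80,
# restated for a reference period where the index stands at 120.
adjusted = deflate_price(200_000, hpi_at_sale=80.0, hpi_reference=120.0)

# At q = 0.9, missing low by 50,000 costs more than missing high by 50,000.
under = pinball_loss(y_true=500_000, y_pred=450_000, q=0.9)
over = pinball_loss(y_true=500_000, y_pred=550_000, q=0.9)

# Hypothetical interval bounds for one listing.
status = flag_listing(asking=610_000, lower=480_000, upper=590_000)
```

Training one model at q = 0.1 and another at q = 0.9 would give an interval with a nominal coverage of 80%. Whether the meta-learner here uses exactly that pair is an assumption.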
Result
- A gap mean absolute error (MAE) of £38,832, a 32% reduction relative to the baseline.
- 68.3% of predictions within ±10% of the asking price.
- An IAAO coefficient of dispersion of 8.6, in the “excellent” band, with a price-related differential near 1.0 indicating no systematic vertical inequity.
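For readers unfamiliar with these metrics, the sketch below computes each one on made-up numbers. The coefficient of dispersion and price-related differential follow the standard IAAO ratio-study definitions, where the ratio is predicted value divided by a reference price. The function names and the four data points are hypothetical, and which price the project uses as the reference for each metric is left as an assumption.

```python
from statistics import mean, median

# Illustrative sketch only: function names and data are hypothetical.


def mean_absolute_error(predicted, reference):
    """Average absolute difference between prediction and reference price."""
    return mean(abs(p - r) for p, r in zip(predicted, reference))


def share_within(predicted, reference, tolerance=0.10):
    """Fraction of predictions within +/- tolerance of the reference price."""
    hits = sum(abs(p - r) <= tolerance * r for p, r in zip(predicted, reference))
    return hits / len(predicted)


def coefficient_of_dispersion(predicted, reference):
    """IAAO COD: mean absolute deviation of ratios from the median ratio,
    expressed as a percentage of the median ratio."""
    ratios = [p / r for p, r in zip(predicted, reference)]
    med = median(ratios)
    return 100 * mean(abs(x - med) for x in ratios) / med


def price_related_differential(predicted, reference):
    """IAAO PRD: mean ratio divided by the value-weighted mean ratio.
    Values well above 1 suggest expensive homes are under-valued
    relative to cheaper ones; values well below 1 suggest the reverse."""
    ratios = [p / r for p, r in zip(predicted, reference)]
    weighted_mean_ratio = sum(predicted) / sum(reference)
    return mean(ratios) / weighted_mean_ratio


# Made-up prices in thousands of pounds.
predicted = [95, 210, 300, 460]
reference = [100, 200, 300, 400]

mae = mean_absolute_error(predicted, reference)
within_10 = share_within(predicted, reference)
cod = coefficient_of_dispersion(predicted, reference)
prd = price_related_differential(predicted, reference)
```

A lower COD means the prediction-to-price ratios cluster more tightly around their median, which is why it is read as a uniformity measure rather than an accuracy measure.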
Stack: Python, XGBoost, Optuna, quantile regression, geospatial analysis and a stacked meta-learner architecture.
