Predicting Real Estate Prices in New York State: A Machine Learning Approach
In this project, I developed a machine learning model to predict transaction prices for real estate properties in New York State. Leveraging regression models and extensive data analysis, the goal was to achieve a model with a mean absolute error (MAE) below the project’s win condition of $70,000.
The project involved comprehensive data cleaning, exploratory data analysis, and feature engineering to ensure high-quality input data for the models. Various regression algorithms, including Lasso, Ridge, Elastic Net, Random Forest, and Gradient Boosting Trees, were trained and evaluated for their prediction performance.
Through careful feature engineering and preprocessing techniques, such as standardization and outlier handling, the models demonstrated impressive performance in predicting real estate prices. The Random Forest model emerged as the top-performing model, achieving the highest R2 score and the lowest MAE among the evaluated models.
The project’s outcomes have significant implications for the real estate industry, enabling more accurate and data-driven pricing strategies. The model can assist real estate investment firms in optimizing pricing decisions, assessing risks, and reducing costs associated with manual appraisals. The next steps involve deploying the winning model, continuous monitoring, incorporating external data sources, evaluating model robustness, and collaborating with stakeholders for further improvements.
To explore the project in detail, please visit the ML-real-estate-prediction GitHub repository.