When we decide to buy or rent real estate (an apartment, room, house, etc.), one of the most important search criteria is the price. The price depends mostly on the property's characteristics, such as location, year of construction, number of rooms, area, and central heating.
However, two properties with the same characteristics can be sold at two totally different prices, and there are deeper reasons for that difference. The seller's or buyer's urgency to complete the deal, the market context, the real estate agency managing the deal, and the time of year all contribute to these differences.
Thus, it can be particularly challenging to determine the real selling price of a given property. Analyzing the listing prices on real estate websites can give us an incorrect idea of the true value of a place, especially because listing prices tend to overestimate the realistic value for selling purposes. This may lead us to buy or rent a place for far more than the realistic price.
As such, we will explore an approach to determine the real selling price of a place, by taking into consideration different aspects considered relevant when making an offer.
Investment in real estate can mean purchasing or renting a house, an apartment, or a room, for either private or commercial use. Here we will assume the scenario of purchasing an apartment for private use, although the same considerations apply in all these contexts.
Data
Besides the property characteristics, other factors may influence the selling price. Therefore, we should look at other types of indicators and data when making an evaluation, namely:
Demographics and geo-spatial data: a recent boost in the population of a given neighborhood may indicate higher demand for that area, and therefore a higher price. As such, population growth in the neighborhood may be an indicator of the property price. Also, the neighborhood's infrastructure, such as the number and variety of stores, malls, public transportation, and schools, may help adjust the price.
Unstructured data: the pictures, descriptions and opinions given in the form of images and text can help us capture the condition of the property, which can impact its price.
Market behavior: the current market conditions also have an impact on selling prices. The demand for similar houses and the number of them on sale affect the house value, following the law of supply and demand. Thus, the average price of similar houses bought recently can be a good indication of the price. Additionally, a property that has been on the market much longer than the average for similar properties may raise a red flag with regard to its current price.
Economic indicators: there are also economic factors that influence house pricing. For example, an increase in the employment rate or in wage growth can lead to an increase in the property listing price. Also, changes in the interest rate or in financial incentives can make buying a property on credit more or less attractive.
Selling aspects: urgency in the selling process, payment conditions, and the expertise of the selling agency are other aspects that may move the selling price away from the property's real value.
In terms of the data available, we can assume we know the apartment characteristics (e.g., number of rooms, location, area, energy efficiency) and some indicators, such as the number of nearby amenities, the employment rate, the average price of similar houses, the urgency of the sale, and the images and texts describing the apartments.
Furthermore, in terms of pricing, we will assume that we know the listing price of all apartments, and the selling price of some apartments (e.g., the selling price of deals made by a single real estate agency). This information can be structured as follows:
Just for clarification, the selling price is the value at which a given property is effectively sold, while the listing price is the price at which the place was first listed on the market.
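To make the data assumptions concrete, here is a minimal sketch of one possible record layout. The field names and the toy values are hypothetical and only for illustration; the one property that matters is that the listing price is always present while the selling price is optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listing:
    # Hypothetical schema: field names are illustrative assumptions.
    apartment_id: str
    n_rooms: int
    area_m2: float
    listing_price: float                   # known for every apartment
    selling_price: Optional[float] = None  # known only for some deals


# Toy records (made-up values, not real market data).
listings = [
    Listing("a1", 2, 70.0, 200_000.0, 190_000.0),
    Listing("a2", 3, 95.0, 310_000.0),
    Listing("a3", 1, 45.0, 150_000.0, 150_000.0),
]

# Split into the small labeled set and the larger unlabeled set.
labeled = [item for item in listings if item.selling_price is not None]
unlabeled = [item for item in listings if item.selling_price is None]
```

Keeping the unlabeled records in the same structure makes it straightforward to feed both sets to the approaches discussed next.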
Price Prediction
The selling price prediction has several challenges, namely the following two:
The real selling price is often missing in our dataset, as it is not always available
The listing price may help us predict the selling price, although the distribution of the ratio between listing price and selling price is not given
Semi-supervised approach
As the selling price is only available for a small set of samples, a fully supervised approach is not suitable.
A first option could be a semi-supervised approach whose goal is to predict the selling price from the few labeled samples, as follows:
F(apt features) -> selling price
Where apt features includes all the aspects previously described (demographics and geo-spatial data, market behavior, economic indicators, etc.) in addition to the apartment characteristics. The text and image data could be encoded so that they can be used in a tabular format.
There are different semi-supervised techniques we could explore (transductive, inductive, wrapper methods, etc) for modeling.
However, this approach would be biased towards the agency from which we gathered the real selling prices. Furthermore, we would not be explicitly taking advantage of the listing price, which is available for every apartment and can be used as a weak label.
Semi-supervised + weakly supervised approach
As such, another approach is to treat the listing price as a weak label and use it to predict the selling price. To make a direct mapping, we would need to determine the distribution of the difference between the real selling price and the listing price.
Thus, we can combine both semi-supervised learning and weakly supervised learning, in order to:
Adapt our approach to the fact that we have few labeled samples (semi-supervised approach)
Use a noisy and weak label, the listing price, as a starting point to compute the real selling price (weakly supervised approach)
To achieve that, we will customize a loss function that can help us solve this task, taking these challenges into consideration.
Generically, we can model our problem as follows:
F(apt features, listing price) -> selling price
Again, the apt features would consist of all the aspects mentioned before and not only the apartment characteristics.
Distribution of the ratio between selling price and listing price
We will determine the relationship between the listing price and selling price by calculating the distribution of the ratio between them.
A possible example of the price ratio distribution could be:
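As a rough illustration of how such a distribution could be estimated from the labeled samples, the sketch below builds a normalized histogram of price ratios. It assumes the ratio is defined as selling price divided by listing price; the function name, the bin edges, and the toy prices are hypothetical.

```python
def price_ratio_histogram(listing_prices, selling_prices, edges):
    """Normalized histogram of selling/listing ratios over the bins in `edges`."""
    counts = [0] * (len(edges) - 1)
    for listing, selling in zip(listing_prices, selling_prices):
        ratio = selling / listing
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # Bins are half-open [lo, hi); the last bin also includes its upper edge.
            if edges[i] <= ratio < edges[i + 1] or (is_last and ratio == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no ratios fall inside the given bin edges")
    return [c / total for c in counts]


# Toy values, made up for illustration only.
listing = [200_000, 300_000, 150_000, 400_000]
selling = [180_000, 294_000, 150_000, 340_000]
edges = [0.8, 0.9, 1.0, 1.1]

hist = price_ratio_histogram(listing, selling, edges)
print(hist)  # [0.25, 0.5, 0.25]
```

In this toy example half of the deals closed between 90% and 100% of the listing price. With real data, the bin edges would need to be chosen to cover the observed range of ratios.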
Loss function/Optimization
The loss function will be customized to compare the price ratio distribution obtained from the model predictions with the real price ratio distribution (computed from the known selling prices), combined with an evaluation of the selling price predictions.
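To achieve this, we can use the Kullback-Leibler Divergence, which quantifies the difference between probability distributions using the following formula:
$$D_{KL}(p \,\|\, q) = \sum_{i} p(i) \log \frac{p(i)}{q(i)}$$
Where p and q correspond to the two probability distributions to be compared, and the sum runs over the values (here, the price ratio bins) on which they are defined.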
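For two discrete distributions defined over the same bins, the Kullback-Leibler Divergence can be computed as in the sketch below. The `eps` floor that protects against empty bins in q is an assumption of this example, not part of the definition.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    # Terms with p_i == 0 contribute nothing; q_i is floored at eps
    # (an assumption of this sketch) to avoid division by zero.
    return sum(
        p_i * math.log(p_i / max(q_i, eps))
        for p_i, q_i in zip(p, q)
        if p_i > 0
    )


p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_divergence(p, p))  # 0.0 for identical distributions
print(kl_divergence(p, q))  # positive when the distributions differ
```

Note that the divergence is not symmetric: swapping p and q generally gives a different value, so the order of the arguments matters when it is used inside a loss.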
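For evaluating the selling price predictions we can use the Mean Absolute Error (MAE):
$$\mathrm{MAE}(x, y) = \frac{1}{n} \sum_{i=1}^{n} |x_i - y_i|$$
Where x represents the selling price predictions, y represents the real selling prices, and n is the number of samples being compared.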
Thus, combining the two terms, our loss function would be:
$$\mathcal{L} = D_{KL}(r_p \,\|\, r_g) + \mathrm{MAE}(\text{selling\_price}_{\text{predicted}}, \text{selling\_price}_{\text{real}})$$
Where r_p refers to the price ratio distribution obtained from the selling price predictions of the model and r_g refers to the real price ratio distribution, computed from the samples for which we know the real selling price. selling_price_predicted represents the selling prices predicted by the model and selling_price_real represents the real selling prices.
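The sketch below shows one way the combined loss could be evaluated for a batch of predictions. It rests on several assumptions that the post does not specify: the ratio is taken as selling price over listing price, both distributions are histograms over the same hypothetical bin edges, the two terms are simply added, and `kl_weight` is an illustrative knob for balancing them. Because it relies on histogram counts, it only evaluates the loss; training with gradients would need a differentiable estimate of the distributions.

```python
import math


def ratio_histogram(ratios, edges):
    """Normalized histogram; out-of-range ratios fall into the first or last bin."""
    counts = [0] * (len(edges) - 1)
    for r in ratios:
        # Bin index = number of inner edges that r has reached.
        counts[sum(r >= e for e in edges[1:-1])] += 1
    return [c / len(ratios) for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q); q is floored at eps (an assumption) to avoid division by zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def custom_loss(predicted, listing, real, edges, kl_weight=1.0):
    """KL(r_p || r_g) + MAE, where `real` holds None for unknown selling prices."""
    labeled = [i for i, y in enumerate(real) if y is not None]
    if not labeled:
        raise ValueError("at least one real selling price is required")
    # r_p: ratio distribution from model predictions on all apartments.
    r_p = ratio_histogram([p / l for p, l in zip(predicted, listing)], edges)
    # r_g: ratio distribution from the apartments with a known selling price.
    r_g = ratio_histogram([real[i] / listing[i] for i in labeled], edges)
    # MAE is computed only where the real selling price is known.
    mae = sum(abs(predicted[i] - real[i]) for i in labeled) / len(labeled)
    return kl_weight * kl_divergence(r_p, r_g) + mae


# Toy values, made up for illustration only.
listing = [200_000, 300_000, 150_000, 400_000]
real = [180_000, None, 150_000, 340_000]
predicted = [180_000, 294_000, 150_000, 340_000]
edges = [0.8, 0.9, 1.0, 1.1]

loss = custom_loss(predicted, listing, real, edges)
```

In this toy case the MAE term is zero, so the loss comes entirely from the mismatch between the two ratio distributions. Since the MAE is expressed in currency units while the divergence is dimensionless, in practice one of the terms would likely need rescaling, which is what `kl_weight` stands in for here.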
Conclusion
Purchasing a property can have a significant impact on our financial life. Therefore, we should put in extra effort to get the best deal in terms of value and quality versus price.
This post discusses an approach for determining the correct selling price, based on the different factors considered relevant. Many aspects influence a property's value, and even more determine its selling price. Thus, we started with an overview of the different aspects that may influence the selling price, including market behavior, demographics and geo-spatial data, unstructured data (reviews, pictures and descriptions), and economic indicators.
Based on the data that is normally available online, we described an approach that combines both weakly supervised and semi-supervised learning, together with a customized loss function that focuses on learning the real price ratio distribution, i.e., the ratio between the listing price and selling price.
This can be a realistic approach for predicting the real selling price. As usual, if you have any comments or ideas about Automated Valuation Models for Real Estate, make sure to reach out to us!