Extracting Property Quality and Condition from Real Estate Agents’ Comments

Quality and condition are two key components used to determine the value of a residential property. A newly renovated home with a gourmet kitchen and an updated bathroom will be much more expensive than a similar home with only above-average quality and condition. Unfortunately, data on these two factors is hard to come by, which is why most automated valuation models (AVMs) assume average quality and condition. As a result, properties at either end of the quality and condition spectrum are often very difficult to value.

To obtain property quality and condition, two questions need to be answered. The first is how to rate and standardize quality and condition. Fannie Mae and Freddie Mac (the government-sponsored enterprises, or GSEs) have established specific definitions for quality and condition and require appraisers to rate them on a standardized scale from C1/Q1 (best) to C6/Q6 (worst). For detailed definitions of C1-C6 and Q1-Q6, please refer to the Fannie Mae and Freddie Mac Uniform Appraisal Dataset Specification.

The second question is where to obtain quality and condition information beyond the appraisal data. Outside of the GSEs, which receive almost all appraisal data, few sources may offer good coverage of property quality and condition. However, advances in machine learning make it possible to extract this information from nontraditional data sources such as the text and images in Multiple Listing Service (MLS) data.

Below is part of a real estate agent’s comment for a property that was sold recently and appraised with a C5 rating.

“Great opportunity to own a large home on a large lot with tons of possibilities!”

Per the standardized condition ratings, a C5 property needs significant repairs, and it appears that the selling agent hinted at this in the comments when the property was listed for sale.

CoreLogic has developed a model that leverages various machine learning techniques and its rich appraisal and MLS data assets to extract property quality and condition information from real estate agents’ comments. Figure 1 uses a word cloud to illustrate the words most used by real estate agents when listing properties for sale that were subsequently sold and rated as C6s.

Figure 1: Frequent Comments for C6-Rated Properties 

Under the GSE standardized condition ratings, a property rated C6 has substantial damage and requires substantial repairs and rehabilitation. It is no surprise to see key words such as “TLC”, “needs repair”, “fixer upper”, “fantastic opportunity” and “rehab”. Counting the appearances of such words is a simple way to get quality and condition indicators. Beyond simple counts, a more sophisticated machine learning model can derive meaning from the text itself.
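
The simple counting approach can be sketched in a few lines of Python. This is only an illustration, not the CoreLogic model: the phrase list is seeded from the key words quoted above, and the names `CONDITION_PHRASES`, `count_condition_phrases` and `condition_flag_score` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical phrase list, taken from the key words quoted in the text.
# A real list would be derived from labeled appraisal and MLS data.
CONDITION_PHRASES = [
    "tlc",
    "needs repair",
    "fixer upper",
    "fantastic opportunity",
    "rehab",
]


def count_condition_phrases(comment, phrases=CONDITION_PHRASES):
    """Count whole-phrase occurrences of each phrase in an agent's comment."""
    # Lowercase and reduce punctuation and hyphens to single spaces,
    # so that "Fixer-Upper!" is matched by the phrase "fixer upper".
    normalized = " ".join(re.findall(r"[a-z0-9]+", comment.lower()))
    counts = Counter()
    for phrase in phrases:
        # Word boundaries keep "rehab" from matching "rehabilitation".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        hits = len(re.findall(pattern, normalized))
        if hits:
            counts[phrase] = hits
    return counts


def condition_flag_score(comment):
    """Total number of flagged phrases; a crude condition indicator."""
    return sum(count_condition_phrases(comment).values())


if __name__ == "__main__":
    sample = "Fixer-upper with fantastic opportunity! Needs repair and some TLC."
    print(count_condition_phrases(sample))
    print(condition_flag_score(sample))
```

Under this phrase list, the C5 comment quoted earlier (“Great opportunity to own a large home on a large lot with tons of possibilities!”) contains none of the flagged phrases and would score zero. Hints of that kind are what a model that derives meaning from the text, rather than counting fixed phrases, would be expected to pick up.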

This new property condition algorithm is just one of the innovations driving our Total Home ValueX AVM.
