About a year ago, I published a blog post on the impact that listing comments have on days on market: Public Listing Comments Can Have an Impact on Days on Markets. I soon realized there was confusion about the analysis, as I received inquiries asking why it was “fenc backyard” instead of “fenced backyard,” why it was “vault ceil” rather than “vaulted ceiling,” and why the “in” was missing from “walk in closet,” along with other phrasing anomalies. Although it was not too difficult to guess the intended meaning, it was annoying to see word fragments instead of whole words. I went back to the analysis and discovered the culprits: removing stop words and stemming during text processing.
Stop words like “will,” “above,” “a” and “the” are common in everyday language but carry little meaning on their own. In text analysis, they are therefore often removed to sharpen the focus on more important words, which is how “walk in closet” lost its “in.” In a short listing comment, however, very few words are truly meaningless. Stemming, by contrast, is the process of reducing inflected words to their base form. For instance, “looked” and “looking” can both be stemmed to the single word “look.” This explains how “vault ceil” came from “vaulted ceiling.” When text-mining listing comments, this step is probably unnecessary, since nearly every word in a listing comment carries meaning. Once I had determined what went wrong in last year’s analysis, I narrowed the list of stop words and removed the stemming step completely. I then repeated the analysis with updated data covering more than half a million single-family transactions that closed across the country in the first half of 2017. This time, the analysis revealed something new.
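To make the two preprocessing steps concrete, here is a minimal Python sketch. It is not the pipeline used in the original analysis: the stop-word list and the suffix-stripping `toy_stem` function are hypothetical stand-ins for a real stop-word list and a real stemmer, chosen only to reproduce the kind of fragments readers asked about.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "the", "in", "will", "above", "not"}


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def toy_stem(word):
    """Crude stand-in for a real stemmer: strip a trailing 'ing' or 'ed'."""
    for suffix in ("ing", "ed"):
        # Only strip when at least three letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text, stop_words=STOP_WORDS, stem=True):
    """Drop stop words, then optionally stem what is left."""
    tokens = [t for t in tokenize(text) if t not in stop_words]
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    return " ".join(tokens)


print(preprocess("fenced backyard"))   # fenc backyard
print(preprocess("vaulted ceiling"))   # vault ceil
print(preprocess("walk in closet"))    # walk closet

# With an empty stop-word list and no stemming, the phrase survives intact.
print(preprocess("walk in closet", stop_words=set(), stem=False))  # walk in closet
```

Passing a narrower `stop_words` set and `stem=False` mirrors, in spirit, the change described above: fewer words are discarded and the remaining words are left whole.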
Figure 1 is a word cloud that illustrates the top word pairs associated with fewer days on market for a listed property. In this figure, a larger font size represents a greater impact on days on market. One phrase that jumps out is “will last,” which should read “will not last”: the word “not” was still on the stop-word list, so it was stripped from the text. This is another example of why so few words in a listing comment are without meaning. We also see the related pairs “won t” and “t last,” which are the two consecutive word pairs produced when “won’t last” is split at the apostrophe into “won t last.”
Why would putting “won’t last” in the listing comments lead to a quicker sale? To answer that question, let’s look at the fundamentals of property valuation. Property valuation is a professional opinion that rests on three key factors: market realities, property features and the potential of the property. When realtors set listing prices and when buyers look for homes, all three factors are reflected in the value they assign. Let us consider two scenarios.
Scenario 1: A house with great features is located in a desirable neighborhood. The local school is excellent and the house is close to amenities. The only downside is that, at any given time, very few of these properties are listed on the market. Once such a property becomes available, the listing agent knows it will be gone soon and puts “won’t last” in the comment, which reflects the market reality of quick sales in the neighborhood.
Scenario 2: A realtor lists a property at a very competitive price because its physical condition is below the market average. Once it is renovated, however, it could become the next buyer’s dream home. Again, the realtor puts “won’t last” in the listing comment, this time to signal the potential of the property.
Thus, “won’t last” is not a simple word game, but a reflection of a favorable market reality or of the property’s potential, either of which could lead to a quicker sale.
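The odd pairs noted in Figure 1 (“will last,” “won t” and “t last”) are easy to reproduce. The sketch below is an illustration only, assuming a simple tokenizer that splits on anything that is not a letter; the `bigrams` helper and the sample comments are hypothetical and are not taken from the analysis itself.

```python
import re


def bigrams(text, stop_words=frozenset()):
    """Return consecutive word pairs after dropping stop words."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in stop_words]
    return list(zip(tokens, tokens[1:]))


# The apostrophe is not a letter, so "won't" splits into "won" and "t".
print(bigrams("Won't last!"))
# [('won', 't'), ('t', 'last')]

# With "not" treated as a stop word, "will not last" collapses to "will last".
print(bigrams("This home will not last", stop_words={"not"}))
# [('this', 'home'), ('home', 'will'), ('will', 'last')]
```

Keeping “not” out of the stop-word set preserves the pairs “will not” and “not last,” which is the reading a human would expect.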
From the word cloud, we can also see phrases like “pride ownership” and “award winning,” both of which suggest that the potential of the property is great. However, if listing agents start to take advantage of “won’t last” by putting it in every single listing comment, it will eventually stop having any significant impact on days on market.