Zillow: Addressing the Online Real-Estate Market

You may know Zillow for enabling your “real estate daydreams” or allowing you to snoop on the price of your friend or boss’s home. But where does it get this data, and how does it use that data to create value? Read on to find out.

I learned from a young age that, whatever the group, the adults around me reliably ended up talking about the price of local real estate at any social gathering. My parents explained that people’s sense of well-being was strongly tied to the value of their home: it was like their stock portfolio (but often bigger!). At the time, though, there was no good way to know the value of your home apart from community chatter.

Enter Zillow. Zillow is a third-party aggregator of real-estate data that creates products for both buyers and sellers of homes and rentals. You’ve probably used Zillow if you were recently in the market for a new place to live. In this post, rather than emphasizing the business model, I’ll focus on how Zillow acquires and leverages data. I’ll close with what Zillow’s data strategy implies for the organization’s best path forward.

The Zillow Landing Page makes it straightforward for anyone to find their next home.

Aggregating Data Assets

The heart of Zillow’s strategy is to maintain an up-to-date, wide-coverage, home-level data set on which it builds other data assets. Interestingly, Zillow does not invest heavily in developing its own raw data. Rather, it relies on two sources: real-estate transaction data and user-generated data.

First, Zillow expends significant effort to integrate data from many sources of real-estate transactions. The most common source is the Multiple Listing Service (MLS), a database system developed by the National Association of Realtors (source). This dataset comprises 580 regional databases that are developed by local organizations to aid realtors in selling homes (source). Zillow also gathers data from non-MLS auctions and from public events like foreclosures.

Second, Zillow allows users to post their own information. This includes “for sale by owner” listings for housing sales (source). User-generated data is particularly valuable for rentals, where Zillow serves as a large platform for coordinating landlords and renters (and where transactions are much more common) (source).

The heterogeneity of the datasets that form Zillow’s data foundation provides a source of competitive advantage: it’s hard to mimic the coverage of the data. However, it’s also a source of fragility, because it makes Zillow dependent on its relationships with its data providers. For example, in 2016 Zillow was forced to go out and build relationships with individual MLSs and county records offices when its bulk data provider decided not to renew its agreement with Zillow (source).
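To make the aggregation idea concrete, here is a minimal sketch of how listings from several feeds might be combined into one home-level table. It is an illustration, not Zillow’s actual pipeline: the HomeRecord type, the normalize_address helper, the SOURCE_PRIORITY trust order, and the sample listings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class HomeRecord:
    address: str
    price: Optional[int]
    source: str  # hypothetical feed label: "mls", "public_record", or "user"


# Assumed trust order: a lower number wins when two feeds describe the same home.
SOURCE_PRIORITY = {"mls": 0, "public_record": 1, "user": 2}


def normalize_address(address: str) -> str:
    """Lowercase, drop commas, and collapse whitespace so the same home matches across feeds."""
    return " ".join(address.lower().replace(",", " ").split())


def merge_feeds(records: Iterable[HomeRecord]) -> Dict[str, HomeRecord]:
    """Keep one record per normalized address, preferring the most trusted source."""
    merged: Dict[str, HomeRecord] = {}
    for rec in records:
        key = normalize_address(rec.address)
        current = merged.get(key)
        if current is None or SOURCE_PRIORITY[rec.source] < SOURCE_PRIORITY[current.source]:
            merged[key] = rec
    return merged


# Made-up sample data: the first two rows describe the same home.
records = [
    HomeRecord("12 Oak St, Springfield", 300000, "user"),
    HomeRecord("12  oak st Springfield", 310000, "mls"),
    HomeRecord("9 Elm Ave", 1500, "user"),
]
merged = merge_feeds(records)
```

Even in this toy version, the fragility is visible: remove the "mls" feed and the home on Oak St falls back to whatever a user happened to post.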

The Zillow website advertises listings in different areas to users.

Creating Value but Struggling to Capture It

Zillow then generates data assets on top of this foundation. The most famous example is the Zestimate, a per-house prediction of the market price. Zestimates have been foundational to Zillow’s growth strategy; interestingly, they were intended to be controversial (rather than accurate) to increase word-of-mouth chatter about the website (source). Zestimates are moderately accurate, but they are widely popular because of how completely they cover the housing stock.
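One common way to put a number on “moderately accurate” is the median absolute percentage error between estimates and eventual sale prices. The sketch below shows the arithmetic only; the function name and the three sample homes are made up for illustration and say nothing about Zillow’s real error rates.

```python
from statistics import median


def median_abs_pct_error(estimates, sale_prices):
    """Median of |estimate - sale price| / sale price across homes."""
    errors = [abs(e - s) / s for e, s in zip(estimates, sale_prices)]
    return median(errors)


# Hypothetical estimates and the prices the same homes later sold for.
estimates = [310_000, 195_000, 520_000]
sale_prices = [300_000, 200_000, 500_000]

error = median_abs_pct_error(estimates, sale_prices)
print(f"Median error: {error:.1%}")
```

The median is used rather than the mean so that a handful of wildly wrong estimates does not dominate the headline figure, though those outliers are exactly the ones homeowners tend to notice.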

Zillow then capitalizes on its voluminous traffic to create value for real-estate sellers — both real estate brokers and financial institutions. For brokers, Zillow provides Premier Agent / Broker programs, which allow agents to track leads and market their services on the platform (source). For financial institutions, Zillow aggregates many providers into a single marketplace for home buyers to shop their mortgage options (source).

The organization faces two primary challenges related to its data strategy. The first is reliability: because Zillow’s information is so public, the company is held accountable for the effects of inaccuracies. For example, a group of homeowners in Illinois sued Zillow over Zestimate accuracy in 2017 (source). The second is Zillow’s own ability to exploit the data effectively. For example, Zillow tried to enter the iBuyer market, in which the organization bought homes for the purpose of reselling them, in competition with Redfin. The organization could not make the model work and closed Zillow Offers in 2021 after taking large losses (source).

In summary, Zillow has built a powerful platform that coordinates a large segment of the real-estate market via a powerful data set. However, it has struggled to capture value from that data by building other products and services. If it can do so, its data assets provide a competitive moat that should protect the company moving forward, and it will truly be home free.


Student comments on Zillow: Addressing the Online Real-Estate Market

  1. Thanks for the really interesting post, Daniel. I’m really curious as to why Zillow’s home buying didn’t work out, particularly compared to other companies like OpenDoor that are arguably executing more successfully on iBuying at scale.

    Did Zillow miss on execution in some way, or does their data not actually provide a competitive advantage in iBuying? It really highlights how important it is for companies to understand what their data is actually communicating, where that data can be applied in strategically useful ways, and perhaps most importantly, where it cannot.

  2. Thank you for the post, Daniel. While reading it, I was wondering how Zillow has tried to diversify its data sources to reduce its dependence on the MLSs it currently uses. As you mentioned in the post, creating a unique mix of data sources can represent a significant competitive advantage.

  3. (This comment is from Kevin Lam) • Daniel – Zillow: Thank you for this insightful post Daniel! The importance of data integrity is certainly exemplified by Zillow’s lawsuits and growing pains. I wonder how Zillow might be able to better verify information and whether or not there is an opportunity to tie their dataset into other pre-existing datasets like what HouseSigma does (https://housesigma.com/web/en/).

  4. Thank you so much for sharing, Daniel! Zillow has advanced its technology considerably and uses it to relate to its audience. The Zillow Offers model drove many of these destructive decisions by corporate leadership, but such models are very difficult to govern. As you mentioned, Zillow struggled to capture value from its data through products and services. I wonder how the company can test its AI models to evaluate whether they are producing good predictions when we cannot understand why a model makes the predictions it does. The opaqueness between machine and people matters.

  5. Hi Daniel, thanks for this interesting post. I’m guilty of the millennial trope of Zillow obsession, and I find that one of the most interesting features (if not a prominent one) is that Zillow is able to value homes that aren’t listed on its site. While it doesn’t seem terribly complex to do this using their vast stores of data, I now have a better sense of how this is done.
