Machine Learning at Yelp

Machine learning has been integral to Yelp's business model over the last several years and can be leveraged to help improve their declining stock price.

Yelp’s website is a crowd-sourced local business review site. Its business model relies on relevant reviews (on a scale of 1-5 stars), which generate advertising revenue.1 The content’s searchability is very important for businesses: an HBS study found that each “star” in a Yelp rating affected the business owner’s sales by 5-9%.2 Machine learning has been integral to Yelp’s business model over the last several years and should be leveraged to help improve its declining stock price.6


Megatrend of Machine Learning and Process Improvement  

Machine learning is very important and useful to Yelp, both on the consumer side – finding relevant businesses through reviews and encouraging useful reviews – and on the advertising side – displaying relevant ads to users – as most of their revenue is generated through advertising.

Yelp’s foray into machine learning began in 2015 with deep learning-powered image analysis, which identifies the color, texture and shape of objects in user-submitted photographs with 83% accuracy and uses those identifying traits to sort them into categories. To do this, Yelp developed deep convolutional neural networks that recognize broad classes (such as food, drinks, menu and interior) and sort reviewers’ photographs accordingly before they are displayed to users (see example in Exhibit 1). Subsequently, Yelp expanded its machine learning to a custom ads platform whereby advertisers can opt to have a “two-step” AI system recommend photos and review content to use in banner ads targeting users. This machine learning system increased the rate at which people click on ads by at least 15%.1


Yelp’s Strategy in the Short and Medium Term

In 2018, Yelp introduced Yelp Collections, which uses a combination of machine learning, algorithmic sorting and manual curation to highlight top businesses in a particular area (see Exhibit 2).7 Yelp is comparing the effectiveness of the three methods of Collection curation (machine learning, algorithm, human curation) and assessing the potential impact on user interaction and experience. Additionally, weekly recommendations (entitled “Recommended for You”) are informed entirely by machine learning. Yelp’s AI engine bases the specific recommendations for each user on which businesses the user has viewed and on the reviews Yelp has received in the previous week. These are compared to the Collections formed through human-curated roundups and algorithmically generated lists. Lastly, Yelp’s algorithms automatically publish “top 10 list” collections for cities, determined by a composite of star ratings and volume of ratings for each respective business. Yelp should use the success of these lists (measured by metrics such as click rate and new customer acquisition) in its pitches to advertisers and continue to develop and refine them.1
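
The sources cited here do not describe how Yelp actually combines star ratings and rating volume for its “top 10” lists. As a rough illustration only, one common way to build such a composite is a weighted (Bayesian-style) average that pulls businesses with few ratings toward a prior mean. The names Business, composite_score and top_n, and the values of prior_mean and prior_weight, are assumptions made for this sketch, not Yelp’s.

```python
from dataclasses import dataclass


@dataclass
class Business:
    name: str
    avg_stars: float   # mean rating on the 1-5 star scale
    num_ratings: int   # volume of ratings received


def composite_score(b, prior_mean=3.5, prior_weight=50):
    """Blend a business's own average with an assumed prior mean.

    With few ratings the score stays near prior_mean; with many
    ratings it approaches the business's own average.
    """
    total = b.num_ratings + prior_weight
    return (b.avg_stars * b.num_ratings + prior_mean * prior_weight) / total


def top_n(businesses, n=10):
    """Return the n businesses with the highest composite score."""
    return sorted(businesses, key=composite_score, reverse=True)[:n]
```

Under these assumed settings, a business with a perfect 5.0 average from only two ratings would rank below one with a 4.5 average from 400 ratings, which is the kind of trade-off a composite of rating and volume is meant to capture.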

In the short term, Yelp should further refine its machine learning to optimize content delivery to users. One of Yelp’s current machine learning projects is creating a “Popular Dish” list on each restaurant’s Yelp page based on customer reviews. The “Popular Dish” idea is a step in the right direction. An HBS study demonstrated that Yelp customers do not use all of the available information in each review and about each business: they are more responsive to quality changes that are more visible, and they respond more strongly when a rating contains more information.3 Given the growing amounts of data, commonly referred to as the data deluge, it is important to have the framework and infrastructure to present the data to users and filter out the less helpful data.4
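
How Yelp builds its “Popular Dish” list is not described in the sources cited here. Purely as a sketch of the underlying idea, the simplest possible version would count how many reviews mention each item on a known menu. The function name popular_dishes and the assumption that a clean list of menu items is available are both hypothetical, and a production system would need far more robust text matching.

```python
import re
from collections import Counter


def popular_dishes(reviews, menu_items, top_k=3):
    """Count the reviews that mention each menu item (case-insensitive).

    Each review counts at most once per dish, so a single long review
    cannot dominate the ranking.
    """
    counts = Counter()
    for review in reviews:
        text = review.lower()
        for dish in menu_items:
            # Whole-phrase match, so "rice" does not match "price".
            pattern = r"\b" + re.escape(dish.lower()) + r"\b"
            if re.search(pattern, text):
                counts[dish] += 1
    return counts.most_common(top_k)
```

Surfacing a short ranked list like this, rather than the full text of every review, is in line with the finding above that users respond to the most visible information.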

In the medium term, Yelp should use machine learning to ensure the validity and integrity of its reviews and to display higher-quality, thorough reviews more prominently. Furthermore, it should take steps to identify and remove fake reviews, as these can negatively (or unfairly positively) impact a business. It is difficult and time-consuming to confirm a fake review, but ensuring the integrity of each review is critical to Yelp’s business model and is an area where machine learning should be further developed. Ensuring high-quality content is of the utmost importance because Yelp relies on advertising revenue.2


Future Directions

Over the past few weeks, Yelp stock has decreased nearly 30%, with the company blaming internal issues that led to a paucity of advertiser acquisition. A potential initial step would be to leverage machine learning to screen for authentic reviews, as review authenticity is often cited as a reason for declining user engagement.6

Further questions that can be considered with regard to Yelp’s use of machine learning and additional areas of use are:

How can Yelp continue to leverage machine learning to improve its advertising revenue and attract new advertisers?

Given the data deluge, how will Yelp continue to improve existing algorithms and accelerate the development of other algorithms?


Exhibit 1: Sample classification system of pictures using machine learning 5

Exhibit 2: Yelp Collections interface 7

(Word Count 767)



1 Kyle Wiggers. VentureBeat. May 24, 2018. Accessed 11/12/18.

2 Tom Gara (September 24, 2013). “Fake Reviews Are Everywhere. How Can We Catch Them?”. Wall Street Journal. Accessed 11/12/18.

3 Michael Luca. Reviews, Reputation, and Revenue: the Case of Working Paper 12-016.

4 Katrina Lake. Stitch Fix’s CEO on Selling Personal Style to the Mass Market. Idea Watch: How I Did It. Harvard Business Review May-June 2018.

5 “How We use Deep Learning to Classify Business Photos at Yelp.” Oct 19, 2015. Accessed 11/12/18.

6 Market Watch. Yelp’s stock plunge exposes a fragile business model yet again. Nov 10, 2018. Accessed 11/12/18.

7 Hilary Grigonis. Yelp now uses AI to deliver personalized recommendations with Collections. Digital Trends. Accessed 11/12/18.




Student comments on Machine Learning at Yelp

  1. Deep diving into a business defined by its user-friendly algorithms, the key to the next steps is to maintain a user-friendly focus, which is why any “sponsored” results should be clearly identified as such. This is not to say that they shouldn’t take money from advertisers to boost revenues and have these results show up higher, but in order to maintain consumer confidence, they should be fully transparent about what they are doing. A great example to follow would be Google: the mega giant has already helped establish parameters for how search results should be posted, so there’s no need to deviate from this model and, on the flip side, a huge risk in doing so.

  2. Thanks for your essay, very interesting! Two things that struck me were the data deluge and the balance between advertiser/consumer interests. On data deluge, it’s not surprising to me that people tend to focus only on one important quality aspect rather than long lists of information. When I use Yelp, I often only look at the popular dish feature or the proximity relative to the highest “star” ratings. Rarely do I deep dive into this and I think Yelp should continue to focus on these types of visually appealing attributes. Yelp also needs to balance generating advertising revenue versus maintaining consumer confidence. Establishing clear parameters could help, but they could also establish short-term partnerships with restaurants and see how consumers rank them (assuming it drives more demand). If the restaurant sees high ratings, then a long-term partnership could form but if not then Yelp could potentially end the partnership to maintain consumer trust. Lastly, I do agree with your point on investing internally to monitor fake reviews. While these reviews may not have the same detrimental effect that we have seen on Facebook and Twitter recently, they still can erode the brand long-term if not addressed.
