ML and Chill: Machine learning at Netflix
Netflix uses machine learning based on implicit data in nearly every part of the user experience — but what risks does this approach create?
Hollywood studios have struggled with a core problem for over a century: How can an organization scale a creative endeavor? Movies and television shows require fresh ideas to be successful, yet studios must have processes in place to produce multiple hits. Further complicating this challenge, consumer preferences have shifted from movie theaters to streaming services, so studios must present their content on a user interface that retains monthly subscribers. As Wall Street Journal columnist Elizabeth Winkler writes, “The definition of success has shifted from how many eyeballs a channel can grab on a given night to how effectively a piece of content helps retain subscribers month after month.”[1] Los Gatos-based Netflix, Inc. has taken a distinctive approach: it tackles both problems with machine learning.
In the short term, Netflix uses machine learning to optimize both its frontend interface and its backend operations, creating a better experience for its users. Netflix’s user interface is essential for capturing a viewer’s attention; internally, teams at Netflix refer to the first ten seconds a user spends on the homepage as the “moment of truth,” because the user will quickly decide whether Netflix has any shows they want to watch.[2] To increase the likelihood that a user will watch a show, Netflix keeps several different previews for each title and uses machine learning to match a preview to a user’s demonstrated preferences. For example, the preview for Good Will Hunting may change depending on whether a user prefers romance, in which case it will feature a photo of costars Matt Damon and Minnie Driver, or comedy, in which case it will feature comedian Robin Williams.[3]
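To make the preview-matching idea concrete, here is a minimal sketch in Python. It is not Netflix’s system: the `Artwork` class, the `pick_artwork` function, the image IDs, and the affinity numbers are all hypothetical, and the hand-written "highest affinity wins" rule only stands in for whatever a learned model would do.

```python
from dataclasses import dataclass


@dataclass
class Artwork:
    image_id: str  # hypothetical identifier for one preview image
    genre: str     # the genre this image emphasizes


def pick_artwork(candidates, genre_affinity):
    """Return the preview whose genre the user has shown the most interest in.

    genre_affinity maps a genre name to a score derived from viewing
    history (assumed here; a real system would learn it from data).
    Genres missing from the mapping count as 0.0.
    """
    return max(candidates, key=lambda art: genre_affinity.get(art.genre, 0.0))


# Two hypothetical previews for the same title.
candidates = [
    Artwork("damon_driver", "romance"),
    Artwork("robin_williams", "comedy"),
]

romance_fan = {"romance": 0.7, "comedy": 0.2}
comedy_fan = {"romance": 0.1, "comedy": 0.8}

print(pick_artwork(candidates, romance_fan).image_id)
print(pick_artwork(candidates, comedy_fan).image_id)
```

The same title yields a different image for each viewer, which is the behavior described above for Good Will Hunting.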
Machine learning can also improve a company’s backend operations, and Netflix has applied the technology to improve its streaming quality. With over 100 million users across the globe and over 1,000 different devices streaming its content, Netflix needs to provide an array of streaming solutions if it wants to retain its customers.[4] For example, the network bandwidth available to a mobile user in India will differ significantly from that available to a smart TV user in the United States. The engineering team at Netflix employs machine learning to predict the network needs of a device, which allows the company to optimize its server load while avoiding video delays.[5] Both of these applications of machine learning, frontend customization and backend optimization, directly improve the user experience, which in turn reduces customer churn.
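As a rough illustration of predicting a device’s network needs, the sketch below forecasts throughput with an exponentially weighted moving average and then picks a video bitrate that fits the forecast. This is a simple stand-in technique, not the model Netflix describes; the function names, the smoothing factor, the safety margin, and the bitrate ladder are all assumptions chosen for the example.

```python
def predict_throughput(samples_kbps, alpha=0.3):
    """Forecast the next throughput value from recent measurements.

    Uses an exponentially weighted moving average, so recent samples
    count more than old ones. Requires at least one sample.
    """
    estimate = samples_kbps[0]
    for sample in samples_kbps[1:]:
        estimate = alpha * sample + (1 - alpha) * estimate
    return estimate


def choose_bitrate(predicted_kbps, ladder_kbps, safety=0.8):
    """Pick the highest bitrate that fits within a safety margin.

    Falls back to the lowest rung when even that exceeds the budget,
    so playback can still start on a very slow connection.
    """
    budget = predicted_kbps * safety
    affordable = [rate for rate in ladder_kbps if rate <= budget]
    return max(affordable) if affordable else min(ladder_kbps)


# Hypothetical bitrate ladder and measurements, in kilobits per second.
ladder = [235, 750, 1750, 3000]
mobile_samples = [900, 1100, 1000, 950]

forecast = predict_throughput(mobile_samples)
print(choose_bitrate(forecast, ladder))
```

The safety margin trades picture quality for fewer stalls: streaming slightly below the forecast leaves headroom when the connection dips.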
In the long term, Netflix uses machine learning to determine which content to produce. The company will be releasing over 700 original shows and movies this year, and many of the greenlight decisions are influenced by the company’s machine learning algorithm.[6] Netflix’s data about its content comes from two key sources. The first is a group of “taggers” who categorize content with attributes like “gritty drama.”[7] The second is Netflix’s user base, which supplies both explicit and implicit data. Explicit data come from stated user preferences and include thumbs up/down ratings; implicit data come from user actions and include the clickthrough and completion rates for each piece of content.[8] By harnessing this data, Netflix has created an unmatched recommendation engine. Over 80 percent of viewership on the platform comes from algorithm suggestions, and Netflix then uses the same data to make forward-looking decisions about which content to produce next.[9]
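The two implicit signals named above, clickthrough rate and completion rate, can be computed from a simple event log. The sketch below assumes a hypothetical log format of (title, event) pairs with three event kinds; the format, the function name `implicit_metrics`, and the sample data are illustrative only and do not reflect Netflix’s actual pipeline.

```python
from collections import defaultdict


def implicit_metrics(events):
    """Compute clickthrough and completion rates per title.

    events is an iterable of (title, kind) pairs, where kind is one of
    "impression" (title shown), "play" (user started it), or
    "complete" (user finished it). Any other kind raises KeyError.
    """
    counts = defaultdict(lambda: {"impression": 0, "play": 0, "complete": 0})
    for title, kind in events:
        counts[title][kind] += 1

    metrics = {}
    for title, c in counts.items():
        # Guard against division by zero for titles never shown or played.
        ctr = c["play"] / c["impression"] if c["impression"] else 0.0
        completion = c["complete"] / c["play"] if c["play"] else 0.0
        metrics[title] = {"clickthrough_rate": ctr, "completion_rate": completion}
    return metrics


# Hypothetical log: the title was shown four times, played twice, finished once.
log = (
    [("Show A", "impression")] * 4
    + [("Show A", "play")] * 2
    + [("Show A", "complete")]
)
print(implicit_metrics(log)["Show A"])
```

Neither number requires the user to state a preference, which is what distinguishes implicit data from explicit signals such as thumbs up/down ratings.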
While Netflix has been very successful with its application of machine learning, an over-reliance on implicit data could still lead to negative outcomes. YouTube offers an example: optimizing for implicit user engagement led its algorithm to consistently steer users towards “divisive, misleading or false content.”[10] While users were more likely to watch YouTube’s extremist content for longer, this short-term optimization risked generating long-term user mistrust and government regulation. Netflix has decreased its use of explicit data by removing all user reviews,[11] and I believe this choice creates risks similar to those YouTube faced. If Netflix wants to avoid YouTube’s over-optimization mistake, it will need to keep using explicit data.
So far, the Netflix algorithm has avoided getting stuck in this kind of local optimum by not over-relying on any one type of content, and it continues to pick new shows and movies successfully. Yet as the studio grows, it will sign bigger and bigger stars, and it may need to consider the human impact of its reliance on machine learning. When a long-term business deal collides with a machine learning algorithm, how can Netflix know when to trust its algorithm instead of its business development team? The company is grappling with this question today,[12] and if recent events are any guide, Netflix’s decision will influence the entire entertainment industry.
(794 words)
[1] Winkler, Elizabeth. 2018. “Why No One Can Catch Netflix.” The Wall Street Journal. Dow Jones & Company. August 26. https://www.wsj.com/articles/why-no-one-can-catch-netflix-1535205600.
[2] Ramachandran, Shalini, and Joe Flint. 2018. “At Netflix, Who Wins When It’s Hollywood vs. the Algorithm?” The Wall Street Journal. Dow Jones & Company. November 10. https://www.wsj.com/articles/at-netflix-who-wins-when-its-hollywood-vs-the-algorithm-1541826015.
[3] Chandrashekar, Ashok, Fernando Amat, Justin Basilico, and Tony Jebara. 2017. “Artwork Personalization at Netflix.” The Netflix Tech Blog. Netflix. December 7. Accessed November 13, 2018. https://medium.com/netflix-techblog/artwork-personalization-c589f074ad76.
[4] Ekanadham, Chaitanya. 2018. “Using Machine Learning to Improve Streaming Quality at Netflix.” The Netflix Technology Blog. Netflix. March 22. https://medium.com/netflix-techblog/using-machine-learning-to-improve-streaming-quality-at-netflix-9651263ef09f.
[5] Ibid.
[6] Ramachandran and Flint.
[7] Plummer, Libby. 2017. “This Is How Netflix’s Top-Secret Recommendation System Works.” WIRED. WIRED UK. August 21. https://www.wired.co.uk/article/how-do-netflixs-algorithms-work-machine-learning-helps-to-predict-what-viewers-will-like.
[8] Ibid.
[9] Ibid.
[10] Nicas, Jack. 2018. “How YouTube Drives People to the Internet’s Darkest Corners.” The Wall Street Journal. Dow Jones & Company. February 7. https://www.wsj.com/articles/how-youtube-drives-viewers-to-the-internets-darkest-corners-1518020478.
[11] Spangler, Todd. 2018. “Netflix Has Deleted All User Reviews From Its Website.” Variety. Variety. August 19. https://variety.com/2018/digital/news/netflix-deletes-all-user-reviews-1202908904/.
[12] Ramachandran and Flint.
Great write-up! I can see how Netflix faces some of the risks you mentioned that YouTube encountered, but one thing that helps Netflix is that its library of content is curated and entirely under its control, while YouTube’s content was user-generated and so beyond its control. So even if the algorithms go astray a bit, presumably the content they direct a user to is still something that Netflix thought was worth licensing.
Regarding their use of data in creating original content, one thing that Netflix has reportedly been good at is giving creative control and flexibility [1] to the production partners it signs deals with. So while they seem to be data-driven in deciding what kinds of shows need to be created, they aren’t too prescriptive once they greenlight a production, which has helped them maintain a good relationship with creative talent.
[1] “Netflix gives creative control and production flexibility. Producers also aren’t saddled with the pressure for their shows to perform under traditional television ratings metrics,” said Peter Csathy, founder of media consulting firm Creatv Media. “Given those advantages, it’s hard to say no to Netflix.” via https://digiday.com/media/netflixs-deal-terms-pose-a-conundrum-for-tv-studios/
Interesting post! I definitely think that leveraging data to develop content can be a powerful tool. However, do you think that this will limit creativity? In the Ideo case, we saw that some of the most creative ideas come from a blank-piece-of-paper approach. There is a risk that the large amount of data may result in overly strict content development guidelines. Ideally, Netflix needs to strike a balance between data-driven and non-data-driven content development.
Thank you for the post! If Netflix’s machine learning algorithms use past data on consumer behavior to promote related content on the site, as well as to produce new, related content, does an issue arise where the same people are always watching the same content? Ultimately, does this result in a lack of diversity in (1) the types of content an individual is viewing and (2) the types of content that Netflix is incentivized to create? I’m curious about the extent to which human judgement (i.e. the judgement to produce diverse, original, and even provocative content) is allowed to override data via machine learning to drive Netflix’s content strategy.
Great post! Using different previews for different viewers creates the risk that the nature of a show is misrepresented. For example, a preview can misrepresent how diverse the cast is, as was recently reported in the press. This risks alienating viewers and damaging Netflix’s reputation. How can Netflix balance offering useful, targeted previews whilst ensuring that viewers do not feel manipulated or pigeonholed?
Nice post! Very interesting to read about the extent to which Netflix today is built on machine learning and data analytics. One fear, or perhaps only question, I have is how Netflix views creativity in light of machine-driven content creation. I am a little concerned that machine learning may capture consistent themes and over time drive the creation of content that will no longer be viewed as fresh, but as a repackaging of thematic elements known to work. We’ve seen some of this in Hollywood in particular, with large studios relying more on sequels and remakes than ever before. Do you think Netflix’s analytics can benefit creativity, or do they limit it?