The tension between people and data at Netflix
Netflix is one of the largest media companies in the world, having grown from a small DVD rental company into a subscription-based streaming platform and media behemoth. In 2017, Netflix reported having over 117 million subscribers in more than 190 countries. Every day, Netflix’s subscribers stream more than 140 million hours of content[1] from the company’s extensive library, which includes both licensed and original content. Moreover, because it is a digital service, Netflix holds a large amount of data about its users and their behavior. This data has enabled the company to use machine learning to produce shows that customers want to watch. However, in the TV and filmmaking industry, many projects are based on the artistic visions of creatives and on relations between the different components of the ecosystem. Therefore, Netflix’s dependency on data is creating a tension between people and data, as the company strives to dominate the global media scene.
Machine learning is the practice of having computers analyze data and learn from it using statistical probabilities[2]. It is changing the media scene, as content producers use the predictive power of computing to cluster their consumers and predict their engagement with content. Netflix adopted machine learning early in its journey to develop original content: its algorithms learn from the plethora of users’ viewing behaviors that the company gathers, from how users pick a show to watch to when they pause, rewind, stop, or binge-watch[3]. Combining user behavior data with viewing preferences, Netflix proved it could engineer the perfect show: the company achieved an 80% success rate for original shows, almost double the success rate of traditionally produced TV shows[4].
With the great success of ‘House of Cards,’ one of Netflix’s earliest original shows, the company expanded its dependency on machine learning to produce original content. The company’s investment in original content increased significantly, from almost 24 shows produced in 2015 to 700 shows in 2018, with an expected budget of $8 billion[5]. However, the rise of original content production gradually turned Netflix into a Hollywood powerhouse, strengthening its position in an ecosystem based on relations. This presented a challenge to the company: on one hand it rose to prominence by being data driven, while on the other hand it has become a leader in an industry based on people and relations. As a result, tension between the Los Angeles based content team and the Silicon Valley based technology team became more prevalent. For example, in 2017, after Netflix released the show ‘GLOW’ about women’s wrestling in the 1980s, the data team concluded that the show’s performance was not meeting expectations and therefore did not merit renewal for a second season. However, the content team argued that relations with the creators of the show were important for Netflix’s position and future projects, and subsequently won the argument[6]. A similar argument about data, relations, and artistic independence happened within the company regarding the show ‘Lady Dynamite,’[7] and as the company expands its original content and strengthens its position as a production empire, this tension will only grow.
To address this issue in the short term, Netflix could reach out more to the different components of the ecosystem to raise awareness of the role of data in the content creation and selection process. In the medium term, Netflix could start explicitly integrating quantitative benchmarks into its production contracts, becoming more transparent with creators and letting them know the metrics that will qualify their shows or films for renewal or further investment. In addition, to address the tension between data and creativity in the short run, I recommend that Netflix divide its original content budget into two segments: data driven content and people driven content. The data driven content is a continuation of the content highly shaped by machines to optimally serve the needs of customers. The people driven content could be a smaller part of the budget, dedicated to creative visionaries who will take risks and produce fresh content, with a higher tolerance for failure. Yet the question that still stands is: how can a data driven company successfully develop a strong position in a relations based industry? Could Netflix utilize data and machine learning while at the same time giving creatives the freedom to produce groundbreaking shows?
[1] Netflix, Inc., 2017 Annual Report, p. 1, [https://s22.q4cdn.com/959853165/files/doc_financials/annual_reports/0001065280-18-000069.pdf], accessed November 2018.
[2] Chris Meserole, “What is Machine Learning?” The Brookings Institution, October 4, 2018, [https://www.brookings.edu/research/what-is-machine-learning/], accessed November 2018.
[3] Bernard Marr, “Netflix Used Big Data To Identify The Movies That Are Too Scary To Finish,” Forbes, April 18, 2018, [https://www.forbes.com/sites/bernardmarr/2018/04/18/netflix-used-big-data-to-identify-the-movies-that-are-too-scary-to-finish/#42d62ca93990], accessed November 2018.
[4] Orcan Intelligence, “How Netflix Uses Big Data,” Medium (blog), January 12, 2018, [https://medium.com/swlh/how-netflix-uses-big-data-20b5419c1edf], accessed November 2018.
[5] Todd Spangler, “Netflix Eying Total of About 700 Original Series in 2018,” Variety, February 2018, [https://variety.com/2018/digital/news/netflix-700-original-series-2018-1202711940/].
[6], [7] Shalini Ramachandran and Joe Flint, “At Netflix, Who Wins When It’s Hollywood vs. The Algorithm?” The Wall Street Journal, November 10, 2018, [https://www.wsj.com/articles/at-netflix-who-wins-when-its-hollywood-vs-the-algorithm-1541826015], accessed November 2018.
Very interesting article, and interesting questions to boot. I think the question of whether there is space for creatives to pursue untested show types in the greater environment of data-driven decision-making has no definitive answer. However, as of yet, machine learning is a largely reactive innovation. The kinds of algorithms that Netflix and others employ look at past data and attempt to project conclusions from that data onto options that already exist. For example, Netflix can look at the data and say, “because you liked show X, you should also like show Y,” but what Netflix can’t yet do is say that “because you liked show X, you might also like a show Z that does not currently exist.” That’s where the creatives still fit, in my opinion. Creatives can push boundaries and create content that users don’t yet know that they will like, and until machine learning can crack that code, both creatives and data-driven processes can co-exist.
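To make the “because you liked show X” point concrete, here is a minimal sketch of a co-viewing recommender. It is not Netflix’s actual system; the viewing histories, the show titles, and the `recommend` function are all hypothetical and exist only for illustration. It ranks shows by how often they were watched by the same people, so it can only ever suggest titles that already appear in the data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical viewing histories (user -> shows watched); invented for illustration.
histories = {
    "ana": {"Show X", "Show Y"},
    "ben": {"Show X", "Show Y", "Show W"},
    "cal": {"Show X", "Show W"},
    "dee": {"Show Y"},
    "eve": {"Show X", "Show Y"},
}


def co_watch_counts(histories):
    """Count, for every pair of shows, how many users watched both."""
    counts = defaultdict(lambda: defaultdict(int))
    for shows in histories.values():
        for a, b in combinations(sorted(shows), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(show, histories, top_n=3):
    """Return the shows most often co-watched with `show`, best first."""
    neighbours = co_watch_counts(histories).get(show, {})
    # Sort by descending co-watch count, then by title for a stable order.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


print(recommend("Show X", histories))  # shows co-watched with Show X
print(recommend("Show Z", histories))  # Show Z has no history, so nothing comes back
```

A show that has never been watched, like the hypothetical “Show Z,” has no co-viewing history, so the function returns an empty list for it. That gap is the one the creatives fill.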
I agree with you that in a creative industry like the one Netflix is in, tensions between creativity and data-driven decision making need to be managed well. However, I disagree with the proposed short-term solution of creating a separate data-driven team and creativity-based team. I would argue that it will increase the distance between the two teams even more. While I do believe there are tensions between data and creativity, I would also argue that they can be used together in a very meaningful manner. For example, creative content is not just based on a visionary defining the market; it is based on an educated understanding of what the market has liked in the past and on predicting how that knowledge can be used to shape the market’s demands in the future. Visionaries are already using data to predict and shape the market. I would argue that data and creativity can in fact be married to create even better content, and that there is a need for those two teams to come closer together, not move farther apart. What AI can do for Netflix is predict what consumers will like based on past data. However, what it can’t do is accurately predict what consumers themselves don’t know about their preferences. This is where creativity can support data, making educated guesses about what consumers may like beyond what is already being predicted by AI. After all, the success of media powerhouses is based not just on predicting and making the right movies but also on shaping the market by coming out with revolutionary ideas not evident in the data.
Interesting article. I’d like to respond to several of your recommendations:
1. Integrating quant benchmarks in production contracts: the inherent difficulty in deciding what to renew vs. cancel is that once you set a content budget, every subsequent decision is made at the opportunity cost of another potential show or movie. Say a given show hits its pre-set “target” but 20 other shows dramatically exceed theirs – if Netflix only has money for 15 renewals, what should it do? You can’t really set targets in isolation.
2. Dividing content budgets: all decisions at Netflix are, as Ted Sarandos stated in Variety last year, a combination of data and intuition. Data on previous show performance can give you a sense for the overall market size for an idea/concept, but execution through showrunners, writers, directors, and actors is critical to the outcome. Judgement is necessary.
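The opportunity-cost problem in point 1 can be sketched with a few lines of arithmetic. Everything below is hypothetical: the shows, targets, results, costs, and the simple “performance per unit of cost” ranking are assumptions made for illustration, not a description of how Netflix decides renewals.

```python
from dataclasses import dataclass


@dataclass
class Show:
    title: str
    target: int  # pre-set benchmark, in hypothetical units (e.g. viewing hours)
    actual: int  # observed performance, same units
    cost: int    # cost of renewing, in hypothetical budget units


# Invented numbers: every show meets its own target.
shows = [
    Show("A", target=10, actual=10, cost=1),
    Show("B", target=10, actual=18, cost=1),
    Show("C", target=10, actual=15, cost=1),
]


def hit_target(shows):
    """Shows that qualify when each target is judged in isolation."""
    return [s.title for s in shows if s.actual >= s.target]


def renew_within_budget(shows, budget):
    """Greedy pick: best performance per unit of cost first, until the money runs out."""
    ranked = sorted(shows, key=lambda s: s.actual / s.cost, reverse=True)
    renewed, remaining = [], budget
    for s in ranked:
        if s.cost <= remaining:
            renewed.append(s.title)
            remaining -= s.cost
    return renewed


print(hit_target(shows))              # all three shows meet their targets
print(renew_within_budget(shows, 2))  # but the budget only covers two of them
```

With a budget of 2, show A meets its contractual benchmark and still loses out to B and C. That is the sense in which a target set in isolation cannot guarantee a renewal.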
The concept of engineering the perfect show is fascinating! Your recommendation to break future original content into a data driven bucket and a people driven bucket is something I completely agree with. I love the idea of using data to shape the design of a show. However, creativity and inspiration in the world of TV shows and movies often come from things that have never been done before. This is where data cannot advise Netflix.
Fantastic post, and as a fan of GLOW, I am happy that the relationship element won out.
I believe Netflix should institute several quantitative benchmarks while still operating within the relationship-driven bounds of Hollywood. The Company should establish baseline requirements for renewal that provide it with an easy out even in the face of in-demand showrunning talent while having a “zone of flexibility” that allows it to renew some shows but not others that have similar performance metrics.
On another note, I would be interested to get the author’s perspective on how Netflix’s performance metrics should compare to those of traditional television networks. One of my favorite shows growing up, Chuck, was repeatedly saved from cancellation by fan intervention (often at Subway restaurants [a major show sponsor]). However, the value that these passionate fans delivered to NBC in the form of a Subway sponsorship paled in comparison to the ad revenue from a show with better viewership. With little background, I would imagine that a small but passionate fan base delivers much higher value to Netflix than it does to a major network. I wonder how superfans of a specific Netflix show would know how to campaign for their program’s renewal.
You pose a very poignant question that more smart people need to debate. I believe the tension between data and relationships is something that transcends any one business, something that is truly permeating throughout our society. This tension is something I think about a lot, something that is certainly front of mind as I progress through my job search.
To answer your question explicitly, I believe reserving certain white space for your creative visionaries is beyond essential for the livelihood of Netflix and the media industry. Consumer preferences change, and the relationship between consumers demanding content and consumers responding to content is still not entirely clear. I believe that the greatest films, shows, and other entertainment manifest in formats and plots that are truly different. I do not see data behind these examples. I agree that data can play a role in producing content that is good enough to generate financial returns, but in a world where data is informing all content, the only way to truly differentiate may be to rely on good old creative minds.
Machines cannot make decisions in a vacuum, and I think one major limitation of applying machine learning results at Netflix thus far has been the human element. The output of ML algorithms can be hard for a human to understand, especially since their very usage is predicated on analyzing volumes of data that would be impossible for a person to comprehend. This hits at a major issue of ML, which is that at times, even the scientists who built it are unable to explain the output.¹ However, that doesn’t mean that the data is wrong, or unusable.
In an artistic environment such as Netflix, it can be daunting to be told that a passion project is unfeasible because a formula determined it unworthy, without being given a refined explanation as to why. Rather, it should be a human’s job to take the result from the algorithm and contextualize it in a way that appeals to the person who is receiving the decision. For example, in the case of ‘GLOW’, if the goal is to maintain a creative relationship with the team, then instead of reporting that the data deems the show unsuccessful, the news can be framed as ‘the current show is not reaching the right users…would you be interested in working on a show in [another data-approved genre] so that we can showcase your abilities to a larger audience?’. Ultimately, machine learning cannot stand on its own just yet, and its usage has to be carefully managed by someone who is aware of how to properly utilize its insights.
1. Knight, W. (2017, April 11). The Dark Secret at the Heart of AI, MIT Technology Review. Retrieved from https://www.technologyreview.com/s/604087/the-dark-secret-at-the-heart-of-ai/