How can ML help Fox predict box office performance?

Fox is introducing machine learning to try to estimate box office performance and make business decisions on its movies. Can an algorithm understand something as subjective as whether moviegoers enjoy a film?

20th Century Fox is one of the “Big Six” major American film studios, together with Paramount, Warner, Universal, Columbia and Disney. Founded in 1935, it held a market share of almost 13% of gross revenue as of 2017, behind only Disney, Warner and Universal (1).

The competitive landscape of the film industry is changing rapidly with the growth of other distribution channels. Platforms like Netflix are not only a window for thousands of hours of entertainment, but have also started to develop original content with increasing budgets (2). This, on top of an increase in piracy levels, is shrinking the margin for error for traditional movie studios (3).

Machine learning can help predict which projects will do better at the box office, which has implications for the distribution and advertising budget allocated, and could eventually help decide which projects to fund or discard. Investing in movies carries huge risks, especially for new projects and for stories that the public has not seen or heard of yet.

Understanding how consumers will react to a story has proven to be a difficult task over the years, and in the past it relied on the intuition of experienced industry executives. It is critical to understand the market and how it is segmented. 20th Century Fox has made progress on this by partnering with Google’s Advanced Solutions Lab to create Merlin Video. Estimating how users will react to a story based on a script is hard because many of the emotional and appealing characteristics of the movie will not be reflected in it. Therefore, Merlin uses the movie’s trailer to predict how potential moviegoers will react. Since the trailer is one of the main advertising tools used by studios to generate awareness and curiosity, it can be considered a good proxy for estimating box office performance.

How does Merlin work?

Source: Google Cloud

Merlin is powered by Cloud Machine Learning technology. Since this is a managed service, all resource provisioning and maintenance are automated, so the day-to-day operations can be handled exclusively by data scientists, without requiring intervention from other business units (4).

The team began by analyzing a large publicly available dataset of YouTube videos. The idea is for Merlin to learn a collection of filters that capture a particular kind of object sequence that can then be suggestive of specific actions. For example, a filter could learn that intermittent sequences of a car and a person could mean that the scene consists of someone driving aggressively on the street being chased.
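
To make the idea of a temporal filter concrete, here is a minimal sketch in plain Python. It is not Merlin's implementation: the per-segment label sets, the hand-written `chase_filter` pattern, and the function `filter_response` are all hypothetical, and a real network would learn its filter weights from data rather than have them written by hand.

```python
from typing import List, Set


def filter_response(frames: List[Set[str]], pattern: List[str]) -> List[float]:
    """Slide `pattern` over `frames` and return the match fraction per window.

    frames:  one set of detected labels per time step (assumed input format).
    pattern: the label the filter expects at each step of the window.
    """
    width = len(pattern)
    responses = []
    for start in range(len(frames) - width + 1):
        window = frames[start:start + width]
        # Count the steps where the expected label was detected.
        hits = sum(1 for labels, expected in zip(window, pattern) if expected in labels)
        responses.append(hits / width)
    return responses


# Hypothetical labels detected in six consecutive trailer segments.
trailer = [
    {"tree", "forest"},
    {"car", "street"},
    {"person", "facial hair"},
    {"car", "street"},
    {"person"},
    {"tree"},
]

# Hand-written stand-in for a learned filter: alternating car and person shots.
chase_filter = ["car", "person", "car", "person"]

scores = filter_response(trailer, chase_filter)
# Index of the window where the filter responds most strongly.
best_start = max(range(len(scores)), key=scores.__getitem__)
```

A high response in some window suggests that the alternating car and person pattern occurs there, which is the kind of signal a downstream model could associate with a chase scene.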

Since a huge variety of object sequences can appear in a movie trailer, the approach is to identify those that may be most predictive of box office performance based on customer transaction data (5).

After analyzing a movie trailer, Merlin returns the labels most associated with it. These could be things like tree, man, facial hair, vehicle, forest, etc. Once identified, these labels are compared to labels previously generated for other movie trailers, as well as to those movies’ attendance records (6).
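
One simple way to picture the label comparison is cosine similarity between label counts. The sketch below is an illustration under assumptions, not the method the sources describe: the trailer names, the label counts, and the choice of cosine similarity are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two label-count vectors."""
    dot = sum(a[label] * b[label] for label in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty label set is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical label counts for previously analyzed trailers.
past_trailers = {
    "past_trailer_a": Counter({"forest": 5, "man": 3, "facial hair": 2}),
    "past_trailer_b": Counter({"vehicle": 6, "man": 2}),
}

# Hypothetical label counts for the new trailer.
new_trailer = Counter({"tree": 4, "forest": 4, "man": 2})

# Rank past trailers from most to least similar to the new one.
ranked = sorted(
    past_trailers,
    key=lambda name: cosine_similarity(new_trailer, past_trailers[name]),
    reverse=True,
)
```

Under this assumption, the attendance records of the top-ranked past trailers would give a first rough indication of who might turn up for the new film.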

To account for the temporal positions of these labels in the trailer (they might appear at different moments), the team developed a custom neural network. This can give information about the movie type, plot, roles of the main characters, etc. When combined with historical customer data, the output can be used to create predictions of customer behavior (5).
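
As a rough illustration of combining trailer features with customer history, the sketch below scores a trailer against a customer profile with a dot product and squashes the result with a logistic function. This is a deliberately simplified stand-in, not the network from reference (5): the three-dimensional feature vectors, their meaning, and the function names are assumptions made for the example.

```python
from math import exp
from typing import List


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def predict_attendance(user_history: List[List[float]],
                       trailer_vec: List[float],
                       bias: float = 0.0) -> float:
    """Return a score in (0, 1) for how well a trailer matches a customer.

    user_history: feature vectors of movies the customer attended (assumed).
    trailer_vec:  feature vector derived from the new trailer (assumed).
    """
    user_vec = mean_vector(user_history)
    score = sum(u * t for u, t in zip(user_vec, trailer_vec)) + bias
    return 1.0 / (1.0 + exp(-score))  # logistic function


# Hypothetical features: [action, drama, comedy].
history = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
action_trailer = [2.0, 0.0, 0.0]
drama_trailer = [0.0, 2.0, 0.0]

p_action = predict_attendance(history, action_trailer)
p_drama = predict_attendance(history, drama_trailer)
```

For a customer whose history leans toward action, the action trailer receives the higher score. In a learned model, both the customer and trailer vectors would come from training rather than being set by hand.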

This result can have a significant impact on the studio’s marketing strategy. The studio can better allocate its advertising budget given the predicted viewership intent. And, since the data is more granular, it can divide the actual box office performance into segments to see which of its predictions came true and which did not.

What’s next?

The process described above can make predictions on viewership interest and box office performance based on movie trailers. However, this entails having already filmed the movie in its entirety, by which point most of the budget has been spent. In 2007, the average movie budget for a big studio was 65 million dollars (7).

Efforts should be made to use this technology earlier in the production process. This could mean trying to estimate intent based on the script itself, in spite of the difficulties mentioned previously. If this were possible, it could help make decisions on how much budget to allocate to production or which movies to finance. The technology could also be applied during the shooting of the movie, using early rough cuts as a proxy that could lead to changes in script and directing during the making of the film.

Could we find a way to estimate a potential output for a film based only on the script, without first having the creative process of a director and actors?

Can machines and algorithms correctly predict something so subjective and creative as art?


(Word Count: 795)

  1. The Numbers, “Market Share for Each Distributor in 2017” Accessed November 2018
  2. Natalie Welters, “Netflix $8 Billion Content Budget to Fund 700 TV Shows and Movies” Accessed November 2018
  3. Revulytics, “Top 20 countries for Software Piracy and Misuse 2017” Accessed November 2018
  4. Google, “How 20th Century Fox uses ML to predict a movie audience” Accessed November 2018
  5. Hsieh, Campo, Taliyan, Nickens and Panya, “Convolutional Collaborative Filter Network for Video Based Recommendation System” Accessed November 2018
  6. Motherboard, “Fox is using Google’s Machine Learning to Predict what movies you’ll Like” Accessed November 2018
  7. Investopedia, “Why movies cost so much to make” Accessed November 2018



Student comments on How can ML help Fox predict box office performance?

  1. Miguel —

    I really enjoyed your article on how machine learning is helping the movie industry, specifically Fox, predict movie success via box office sales. You pose an interesting question on how to incorporate script elements into the machine learning process. I wonder if capturing the tone through recordings of the script read by the potential actors could help with understanding how consumers would respond to the movie. Another concern I have, related to your second question on predicting creative content, is how actor preference or genre of movie is incorporated into the machine learning algorithm. Many consumers have very specific tastes and preferences, so I wonder how movie producers can assess their willingness to invest in specific actors based on the predicted success of the movie. I would think that certain, more famous actors would yield higher box office sales; however, perhaps a more detailed data point could be whether that actor won any recent awards. Thanks!

  2. Great read Michael. I feel like focusing on measuring the quality of the trailer isn’t exactly representative of how a movie is actually going to turn out, either in quality or in box office performance. I worry that by overindexing on the trailer, Fox might develop really top notch trailers that don’t lead to substantially higher box office performance or, at worst, might mislead people about the overall quality of the movie. I do think that there are ways to identify how elements of a script or actors and viewer reaction are correlated; Netflix’s development of House of Cards based off user search history and preferences is a great example of this. But I worry that Fox lacks the data that makes the Netflix approach compelling, which might limit or skew the final recommendations in a way that doesn’t bear fruit.

  3. Thank you for sharing, Miguel! This is a very interesting read. I agree with your point that the machine learning technology currently applied is used somewhat too late in the process, as the movie has already been produced and millions of dollars have been spent as sunk cost. It would be interesting to see how companies like Fox can apply machine learning before the production process and how they can balance the use of technology with creativity in the movie industry. My other concern was the use of trailers, which constitute a very tiny portion of the whole movie, as an indicator of the movie’s success. I wonder how accurately the technology can predict box office performance based on this minute set of data. Another point worth noting is how applicable machine learning is for movies made for niche segments (which would not usually be box office hits, as they cater to a very specific and narrow audience). In addition, if machine learning were to ultimately inform film studios and producers on what scripts to write and types of movies to make, then where would the role of filmmakers’ creativity and passion be in this space?

  4. Very interesting, Miguel. To me, a great next step for Fox would be to apply this technology to entire movies, to see what moments should be included in a trailer. By feeding an entire movie through Merlin, Fox could potentially build better advertising campaigns. Perhaps they could better choose which moments to include, both from an audio and visual perspective, and build more effective trailers using that information.

    To answer your second question, whether machines can predict something so subjective as art, I think that there are a lot of patterns in movies that Fox could take advantage of to understand whether or not a certain movie could potentially be popular. For example, superhero and action movies often follow predictable plot lines; perhaps in these genres a natural language processor that reads scripts could be more effective than in other genres at predicting the success of certain scripts. However, because movies are art, I ultimately believe an algorithm like this would have to be used in conjunction with human readers to provide another data point. This data point could help readers reconsider scripts that they did not initially identify with, or it could help readers make go/no-go decisions about whether or not to make films that they are not completely sure about.

  5. Great post, Miguel! It is very interesting to see how Fox is taking advantage of machine learning by partnering with Google. In a competitive environment where Netflix and Amazon have huge access to data due to their entirely online distribution, the traditional film studios are increasingly at risk. Therefore, machine learning can be an important tool for traditional studios to increase their efficiency and remain competitive. To your question, I believe that algorithms can deal with the subjectivity of movies, a form of art, by assessing human reactions to it, something that online players such as Netflix and Amazon have done so far.

  6. Machine learning could indeed be very useful in predicting what viewers would like to see and what would resonate most with them, consequently driving revenue for the major content producers. However, one question I have is how much it actually contributes to the development of cinematographic art. By tailoring content to viewers’ preferences, how much is left for the director to create, express his/her own opinion, and share his/her voice? Would this eventually lead to the end of such great directors as Fellini, Tarkovsky, etc.?
