Baseball: Getting to Today
Baseball has a long and storied past in America. Over the past 40 years, data and machine learning have fundamentally changed the game as teams used them to perfect the “product.” Machine learning ruined baseball. Now, it has an opportunity to save it. This paper analyzes how machine learning affects baseball’s product development: developing players and winning games.
Professional baseball began in the mid-1850s, and the National League – a prominent baseball fixture even today – was born in 1876. For over a century and a half, fans have been captivated by the sport. Baseball has a propensity for wild, crazy, and unpredictable turns. As baseball legend Yogi Berra said, “it ain’t over ’til it’s over.”
The future of baseball shifted in 1980, when “sabermetrics,” defined as “the search for objective knowledge about baseball,” began to take hold. For almost 100 years, managers had led based on intuition and small amounts of data. Sabermetrics promised an edge. Player A might be the best hitter, but if 80% of his ground balls go toward first base, why not adjust your defense? Player B might be a top pitcher, but toward the end of the game, why not bring in three different pitchers to face three different batters, each tailored to the situation? The game began to shift, as shown below.
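The Player A example can be made concrete with a minimal sketch. It assumes each ground ball is logged with the side of the field it was hit toward; the function names, the 75% threshold, and the sample data are all hypothetical and exist only to illustrate the kind of rule a team might apply.

```python
from collections import Counter


def ground_ball_shares(directions):
    """Return the fraction of ground balls hit toward each side of the field."""
    counts = Counter(directions)
    total = sum(counts.values())
    return {side: n / total for side, n in counts.items()}


def suggest_shift(directions, threshold=0.75):
    """Return the side to shift toward, or None if no side reaches the threshold.

    The threshold is an illustrative assumption, not a value used by any team.
    """
    if not directions:
        return None
    shares = ground_ball_shares(directions)
    side, share = max(shares.items(), key=lambda kv: kv[1])
    return side if share >= threshold else None


# Hypothetical sample: 8 of Player A's 10 ground balls go toward first base.
player_a = ["first"] * 8 + ["third"] * 2
print(suggest_shift(player_a))
```

A real model would weigh far more than one number (pitch type, count, runners on base), but even this simple rule shows why a manager with data positions fielders differently from one relying on intuition.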
So what? As the data got better, more teams used it. Coaches made more and more changes to players throughout the game. Games grew incrementally longer, and star hitters saw their ability to produce runs diminish, as shown below. Fans get to spend more time at the ballpark to watch less action – thrilling!
The numbers support this. In 2016, the Chicago Cubs won arguably the best championship game in history, breaking a 108-year drought, the longest in any American sport. That game was the most-watched baseball game in 25 years, with 40+ million viewers, 70% higher than the prior year. How does that compare? Even in a down year, the NFL brings in 100+ million Super Bowl viewers.
Data ruined baseball. Now, we will discuss how data can save baseball.
What’s Next for Baseball?
First, there are baseball’s current plans. The trend towards longer games and more data-based decisions is not going anywhere. Baseball will continue to leverage machine learning to produce a better “product” – a winning team. Of course, teams cannot ignore data simply to make games faster. The algorithms will continue to get smarter with more data, and new sensors and data collection methods will allow more analysis.
Longer-term, baseball plans to use data to analyze more types of information. Companies are investing in technology that would allow algorithms to monitor players’ physical health and even mental health. Imagine knowing a player might not perform because of his or her brain activity or heart rate, not just the opposing pitcher for the day.
As the algorithms get better, games will get longer. Baseball will implement more and more controls to shorten games: pitch clocks, automated strike zones, and perhaps fewer pitching changes. Still, these will remain reactive and not buck the trends.
What else can MLB do? There are several opportunities. First, if data is such a big opportunity for coaches, why not for fans? Today, fans interact with the output of machine learning in a boring way: basic stats are broadcast. Why not make this data a product for customers, too? Imagine a young fan seeing all the pitching data on her iPhone at the ballpark – she votes to keep the starting pitcher in place and sees what other fans selected. The coach decides to keep the pitcher in, and he gives up a home run. The fan felt the impact and interacted with the data. This is essentially an HBS case every single play. Data does not need to be behind the scenes.
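The fan vote described above could be tallied very simply. The sketch below is a hypothetical illustration, assuming each fan submits one of a few named options from a phone; the function name, the option labels, and the sample votes are invented for the example.

```python
from collections import Counter


def tally_votes(votes):
    """Return each option's share of the fan vote as a percentage."""
    total = len(votes)
    if total == 0:
        return {}
    counts = Counter(votes)
    return {option: round(100 * n / total, 1) for option, n in counts.items()}


# Hypothetical ballpark poll: should the starting pitcher stay in?
votes = ["keep"] * 3 + ["pull"] * 1
shares = tally_votes(votes)
for option, pct in shares.items():
    print(f"{option}: {pct}%")
```

Showing these shares next to the coach’s actual decision is what would let the fan compare her call with the crowd’s and with the outcome on the field.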
Second, baseball should use data to make better games and schedules. Our world is dramatically different than it was in the early 1900s, yet teams still play the same in-division games for most of the year. Data can better inform not just how teams play, but whom teams play. Baseball should loosen the structure between leagues and allow data to inform what games would draw the most fans for at least some part of the season.
There is a fundamental question raised here: what is baseball’s product? Is it entertainment? Is it winning? Is it the entertainment of winning? That is, if a game is more fun but your team loses, do you go home happier? Data does choose a side – it goes for winning. How do we strike the balance? And, how involved do we keep human decision-makers along the way?
- Baseball Game Length Visual Analysis.
- Roth, David. Baseball no longer a supergiant but it is still the most American of sports. The Guardian.
- Kurkjian, Tim. Putting the future in focus: The blueprint for baseball in 20 years. ESPN.
- Keri, Jonah and Paine, Neil. How Bullpens Took Over Modern Baseball. FiveThirtyEight.
- Society for American Baseball Research.