Spotify – Listening in on Users, and Learning as a Result

How Spotify’s data strategies make music to our ears.

Spotify at a Glance

               Spotify, a digital music streaming app that gives millions of daily users access to millions of songs, managed to do what many music services before it couldn’t: tap into consumer data to give listeners the songs, recommendations, and playlists they didn’t know they needed. But how does Spotify do this? Spotify is an incredibly data-driven business, where almost every decision is informed or driven by data. As Spotify continues to accrue data (more on that later), it iteratively improves the algorithms and machine learning processes it uses to analyze both music and user behavior, improving the user experience and its business.

Where does Spotify get its data (and why is that important)?

               When it comes to Discover Weekly, one of the platform’s leading features, which gives each user a personalized list of newly recommended songs, Spotify draws data primarily from three sources: songs, the news, and of course users.

               Each song, or sound file, needs to be classified so that it can be recommended based on the characteristics it shares with other songs. In particular, convolutional neural networks (CNNs) are applied to the audio. CNNs are neural networks (machine learning algorithms loosely modeled on the human brain and nervous system) that use a mathematical operation called convolution, in place of general matrix multiplication, in at least one of their layers. A convolution slides a small filter across the input to pick out local patterns. In this way, Spotify can identify songs with shared features, such as quick rhythms or acoustic instruments.
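
To make the idea of convolution concrete, here is a minimal sketch in Python. It is not Spotify’s model: the tiny signal, the two-value filter, and the function name conv1d are all made up for illustration, and a real CNN learns many filters over spectrograms rather than using one hand-written filter. It only shows how sliding a small filter over a signal produces a "feature" that lights up where a pattern occurs.

```python
import numpy as np

def conv1d(signal, kernel):
    """Slide `kernel` over `signal` (no padding) and take a dot product at each step.

    This is the sliding-window operation used inside CNN layers
    (strictly a cross-correlation, which is what most deep learning
    libraries call "convolution").
    """
    steps = len(signal) - len(kernel) + 1
    return np.array([np.dot(signal[i:i + len(kernel)], kernel) for i in range(steps)])

# Hypothetical toy "audio" signal: silence, a short burst, silence again.
signal = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])

# Hypothetical hand-written filter that responds to changes in loudness.
edge_kernel = np.array([-1.0, 1.0])

features = conv1d(signal, edge_kernel)
print(features)  # positive where the burst starts, negative where it ends
```

In this toy case the output is positive where the sound begins and negative where it ends, so the filter acts as a crude onset detector. Stacking many learned filters is what would let a network pick up on features like rhythm or instrumentation.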

               Spotify also uses the news and other publications (such as blog posts, metadata, and discussions) to identify what music is being discussed and compared. Spotify does this through natural language processing (NLP). NLP is a field of machine learning in which the computer finds patterns in human language and, in Spotify’s case, relations and classifications between subjects. Tracking which music or artists are being talked about, as well as how often and how strongly, allows Spotify to map the current discussion and trends in the music industry. These maps act as another layer of relation between pieces of music and allow specific terms and subjects to be weighted.
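
As a rough illustration of weighting terms by how often they appear alongside an artist, consider the sketch below. Everything in it is an assumption made for the example: the artist names are fictional, the texts are invented, the stop word list is tiny, and real NLP pipelines are far more sophisticated. It simply counts which words show up in texts that mention an artist and turns those counts into weights.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop word list.
STOPWORDS = {"a", "the", "from", "and", "of"}

def term_weights(documents, artists):
    """For each artist, weight the words that co-occur with the artist's name.

    A weight is the word's share of all (non-stop-word) words seen in
    documents that mention that artist.
    """
    counts = defaultdict(Counter)
    for doc in documents:
        words = re.findall(r"[a-z']+", doc.lower())
        for artist in artists:
            if artist in words:
                counts[artist].update(
                    w for w in words if w not in artists and w not in STOPWORDS
                )
    weights = {}
    for artist, counter in counts.items():
        total = sum(counter.values())
        weights[artist] = {word: n / total for word, n in counter.items()}
    return weights

# Invented texts about fictional artists.
documents = [
    "Aurora delivers dreamy synth pop",
    "Dreamy vocals from Aurora again",
    "Kestrel plays loud garage rock",
]
weights = term_weights(documents, {"aurora", "kestrel"})
top_term = max(weights["aurora"], key=weights["aurora"].get)
print(top_term)
```

Here the word "dreamy" appears in both texts about the first artist, so it ends up with the highest weight for that artist. Artists whose top terms overlap could then be treated as related.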

               Finally, Spotify also tracks what its users, the ultimate consumers of its music, do. Everything from clicks and time spent on pages to favorited artists and more is tracked. For Discover Weekly, however, Spotify pays particular attention to user-created playlists. This is because users effectively label datasets for Spotify by grouping songs together, creating naturally labeled categories even when the songs have no obvious relation. Spotify can then use a technique called collaborative filtering: it compares one user’s less complete preferences with the more complete playlists of several similar users, finds a song that the first user is “missing,” and fills it in as a recommendation.
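
The "missing song" idea can be sketched in a few lines of Python. This is a simplified, assumed version of collaborative filtering, not Spotify’s implementation: the user names, song names, and the recommend function are hypothetical, and similarity is measured only by how many songs two users share.

```python
from collections import Counter

def recommend(playlists, target, top_n=1):
    """Suggest songs the target user lacks, favoring songs from similar users.

    Each other user "votes" for the songs the target is missing, and the
    vote is weighted by how many songs that user shares with the target.
    """
    target_songs = set(playlists[target])
    scores = Counter()
    for user, songs in playlists.items():
        if user == target:
            continue
        overlap = len(target_songs & set(songs))
        if overlap == 0:
            continue  # no shared taste, so this user's songs are ignored
        for song in set(songs) - target_songs:
            scores[song] += overlap
    return [song for song, _ in scores.most_common(top_n)]

# Hypothetical playlists.
playlists = {
    "ana": ["song_a", "song_b", "song_c"],
    "ben": ["song_a", "song_b", "song_c", "song_d"],
    "cy": ["song_a", "song_c"],
    "dee": ["song_x", "song_y"],
    "eve": ["song_a", "song_b"],
}

print(recommend(playlists, "eve"))  # → ['song_c']
```

The user "eve" shares songs with three other users, and all three also have "song_c", so it becomes the top recommendation. The user "dee" shares nothing with "eve", so those songs are never suggested.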

               All three of these main data sources come together to inform Discover Weekly, as well as other product decisions and business strategies. The example above highlights just one of Spotify’s personalized recommendation products, and doesn’t touch on its insights into forecasting music trends or its powerful targeted ads. The more data Spotify has, the better its recommendations and insights become, which attracts more users to a better product, and the cycle repeats.

Criticism and Issues Facing Spotify

               With all that said, Spotify’s data is not risk free, as some of its history shows. First, Spotify has had at least a few cybersecurity breaches, with user data ending up leaked or sold through hackers or malware. Users have also consistently criticized Spotify’s privacy and data policies, whether for data collection perceived as too aggressive, unclear usage terms, or invasive data requests. These complaints have been alleviated somewhat by the implementation of the GDPR (General Data Protection Regulation) in the European Union, which is of particular importance to Spotify given that it is headquartered in Sweden, an EU member country. Since the implementation of the GDPR, Spotify has received GDPR-related complaints and is subject to ongoing investigation for potential GDPR violations around its use of user data. The continued use of user data therefore represents both a significant business risk and a significant asset to Spotify.

A Hopeful Future?

               In the near future, Spotify will likely continue to face challenges to its business model, based on how it compensates artists and generates revenue. A few high-profile artists, such as Taylor Swift, have removed themselves from the platform entirely, a move that was unthinkable to most industry experts before it happened. As such, Spotify will, and must, continue to develop and deploy revenue-generating projects. Some of these could include mood-detecting software (and associated advertisements, insights, or music recommendations), a continued push into long-format audio like podcasts and potentially books, and services that help creators better advertise their content and understand their listeners in ways they previously could not.

