Duo: Your New Data Driven Language Teacher

Duolingo uses data and analytics to drive personalized language learning…and it works

Duolingo is a gamified digital language learning program with over 500 million registered users learning more than 32 languages.[1] The company started with the mission to “make education free and accessible to everyone in the world.”[2] To achieve this mission and to scale extensively, Duolingo has leveraged the data generated by its digital platform and invested in artificial intelligence (AI) and machine learning (ML) to deliver an exceptional learning experience, personalized to each user.

Figure 1: Duolingo’s app with personalized lessons


Every day, Duolingo generates hundreds of millions of data points from its 42 million monthly active users.[3] The data is used to train and improve AI algorithms that humanize the learning experience and keep users engaged and returning to the app. It’s like having a private language tutor in the form of Duolingo’s mascot, a green owl named Duo.

Figure 2: Duolingo’s mascot “Duo”


People learning a language differ widely in their goals, prior knowledge, ability, interests, available time and learning preferences, so Duolingo’s strategy of collecting data and processing it with AI helps the app differentiate itself from competitors. Duolingo has detailed information on each user, which lets it tailor lessons to individual needs based on the user’s previous lessons and mistakes, so that users progress continuously. In fact, Duolingo built its AI model on spaced repetition, which tracks how many times a user has seen a word and estimates how long it will take the user to forget it. The spaced repetition system, which began in 2013 as Duolingo’s first AI project, predicts which words a user has forgotten and reintroduces those words into the user’s lessons.[4] These seamless yet personalized lesson recommendations help users master new languages faster and more efficiently.
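
To make the idea concrete, here is a minimal sketch of spaced repetition in Python. It assumes an exponential forgetting curve in which every correct recall doubles a word’s memory half-life. The `WordRecord` type, the doubling rule and the 0.5 review threshold are illustrative assumptions, not Duolingo’s actual model or parameters.

```python
from dataclasses import dataclass


@dataclass
class WordRecord:
    word: str
    times_correct: int          # how many times the user recalled the word
    days_since_practice: float  # time elapsed since the word was last seen


def recall_probability(record: WordRecord, base_half_life_days: float = 1.0) -> float:
    """Estimate recall with an exponential forgetting curve (assumed form)."""
    # Assumption: each correct recall doubles the memory half-life.
    half_life = base_half_life_days * 2 ** record.times_correct
    return 2 ** (-record.days_since_practice / half_life)


def words_to_review(records, threshold=0.5):
    """Return words whose predicted recall is below the threshold, weakest first."""
    scored = sorted((recall_probability(r), r.word) for r in records)
    return [word for prob, word in scored if prob < threshold]


# Hypothetical practice history for one learner.
history = [
    WordRecord("gato", times_correct=3, days_since_practice=2.0),
    WordRecord("perro", times_correct=0, days_since_practice=3.0),
    WordRecord("manzana", times_correct=1, days_since_practice=4.0),
]
print(words_to_review(history))  # ['perro', 'manzana']
```

In this sketch, “gato” has been recalled often enough that it is still likely remembered after two days, while the other two words have decayed below the threshold and would be put back into the next lesson.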


Duolingo also uses a computer-adaptive placement test: new users who sign up for a course can take a quick five-minute quiz, and the test suggests the most suitable starting point in the course based on their proficiency.[5] By leveraging this data and its AI models, Duolingo can quickly capture the interest of new users and ensure that they are sufficiently challenged and excited by the course they have signed up for.
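
An adaptive test of this kind can be sketched as a search over difficulty levels: each answer narrows the range of levels where the user might belong. The version below is a deliberately simple binary-search procedure. The ten-level scale, the five-question limit and the `answer_correctly` callback are hypothetical, and production adaptive tests usually rely on richer statistical models than this.

```python
def placement_test(answer_correctly, num_levels=10, num_questions=5):
    """Estimate a starting level by halving the plausible range after each answer.

    answer_correctly(level) -> bool is a hypothetical callback that asks one
    question at the given difficulty and reports whether the user got it right.
    """
    low, high = 0, num_levels - 1      # levels still considered plausible
    for _ in range(num_questions):
        if low >= high:
            break                      # the range has collapsed to one level
        level = (low + high + 1) // 2  # probe the middle of the remaining range
        if answer_correctly(level):
            low = level                # user can handle this level or higher
        else:
            high = level - 1           # user belongs below this level
    return low


# Simulated learner who can answer questions up to level 6.
print(placement_test(lambda level: level <= 6))  # 6
```

Because every question is chosen from the learner’s previous answers, a handful of questions is enough to place someone on a ten-level scale, which is the appeal of adaptive testing over a fixed quiz.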


Birdbrain, Duolingo’s machine learning model, complements the app’s personalized learning system by predicting how hard a specific lesson will be for a user. Based on that prediction, which is trained on the half a billion lessons completed on Duolingo each day, users’ lessons are calibrated and made harder or easier according to their success. The blame algorithm tries to understand why users make mistakes. Smart tips, another feature based on machine learning, gives immediate tips based on the algorithm’s prediction of the root cause of a mistake.[6]
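
The sources cited here do not describe how Birdbrain works internally, so the following is only an illustration of the general idea of difficulty prediction. It assumes a simple logistic model in which the chance of a correct answer depends on the gap between a learner’s ability and an exercise’s difficulty, with both estimates nudged after every answer. The function names, the learning rate and the 80% target success rate are all assumptions.

```python
import math


def predict_correct(ability: float, difficulty: float) -> float:
    """Probability that the learner answers correctly (logistic model, assumed)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def update(ability: float, difficulty: float, correct: bool, rate: float = 0.1):
    """Nudge both estimates toward the observed outcome."""
    error = (1.0 if correct else 0.0) - predict_correct(ability, difficulty)
    return ability + rate * error, difficulty - rate * error


def pick_exercise(ability: float, difficulties: dict, target: float = 0.8) -> str:
    """Choose the exercise whose predicted success rate is closest to the target."""
    return min(
        difficulties,
        key=lambda name: abs(predict_correct(ability, difficulties[name]) - target),
    )


# Hypothetical exercises with made-up difficulty scores.
exercises = {"easy": -2.0, "medium": -1.4, "hard": 1.0}
print(pick_exercise(0.0, exercises))  # medium
```

A surprising correct answer raises the learner’s estimated ability and lowers the exercise’s estimated difficulty, and a surprising mistake does the opposite, which is one way lessons could be made harder or easier in response to a user’s success.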


Beyond the AI and ML that improve lesson quality, one of Duolingo’s best features is its app notifications, which use AI to prompt users to practice at the times when they are most likely to respond. By leveraging data to optimize notification timing, Duolingo has increased retention of new users by 2% in the period from one day to one week after download.[7]
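
As a rough sketch of how notification timing could be learned from data, the snippet below tracks how often a user opens notifications sent at each hour of the day and picks the hour with the best smoothed response rate. The `NotificationTimer` class and its smoothing rule are assumptions made for illustration and are not a description of Duolingo’s system.

```python
from collections import defaultdict


class NotificationTimer:
    """Track, per hour of day, how often a user opens a notification."""

    def __init__(self):
        self.sent = defaultdict(int)
        self.opened = defaultdict(int)

    def record(self, hour: int, was_opened: bool) -> None:
        self.sent[hour] += 1
        if was_opened:
            self.opened[hour] += 1

    def response_rate(self, hour: int) -> float:
        # Laplace smoothing so that hours with little data are not over-trusted.
        return (self.opened[hour] + 1) / (self.sent[hour] + 2)

    def best_hour(self) -> int:
        return max(range(24), key=self.response_rate)


# Hypothetical log: morning reminders are mostly ignored, evening ones are opened.
timer = NotificationTimer()
for was_opened in (True, False, False, False):
    timer.record(8, was_opened)
for was_opened in (True, True, True, False):
    timer.record(20, was_opened)
print(timer.best_hour())  # 20
```

A real system would also need to keep trying other hours occasionally, since a user’s habits can change over time.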


Figure 3: Duolingo’s AI backed notifications

Implementing AI and ML has not been an easy path for Duolingo. In particular, the company struggled to find talent that could advance its data analytics efforts while also understanding the psychology and cognitive nature of language learning. However, Duolingo’s investments in AI and ML have enabled the company to capture value by increasing retention rates through content and incentives that cater to each individual. This higher retention rate translates into increased revenue through conversions from free to paid users and more targeted in-app advertisements. The success of AI and ML at Duolingo is underscored by the company’s 2020 revenues of $180 million, a 13x increase from 2017.[8] As Duolingo continues to establish itself as a valuable data-driven learning platform, talent acquisition should become less of a challenge, as more data scientists are excited by the opportunity to work for a $1.5 billion educational technology company.[9]

Figure 4: Duolingo’s growth in revenue as a result of AI and ML models [10]


Moving forward, it will be crucial for Duolingo to keep focusing on recommending the most effective order in which users should learn a language. There are also many opportunities for Duolingo to further leverage AI and ML by offering users conversations and live learning experiences with bots, which would better simulate an immersive language learning experience. With years of data available, Duolingo has created a competitive advantage and is positioned to continue differentiating itself and expanding its AI and ML capabilities to provide a superior language-learning experience.



[1] https://www.businessofapps.com/data/duolingo-statistics/

[2] https://www.forbes.com/sites/bernardmarr/2020/10/16/the-amazing-ways-duolingo-is-using-artificial-intelligence-to-deliver-free-language-learning/?sh=6be1ca745511

[3] https://www.businessofapps.com/data/duolingo-statistics/

[4] https://venturebeat.com/2020/08/18/how-duolingo-uses-ai-in-every-part-of-its-app/

[5] https://www.wired.com/brandlab/2018/12/ai-helps-duolingo-personalize-language-learning/

[6] https://venturebeat.com/2020/08/18/how-duolingo-uses-ai-in-every-part-of-its-app/

[7] https://venturebeat.com/2020/08/18/how-duolingo-uses-ai-in-every-part-of-its-app/

[8] https://www.businessofapps.com/data/duolingo-statistics/

[9] https://www.businessofapps.com/data/duolingo-statistics/

[10] https://www.businessofapps.com/data/duolingo-statistics/





Student comments on Duo: Your New Data Driven Language Teacher

  1. Great article, Tiffany! I’m impressed by all the ways Duolingo incorporates data analytics in its product to optimize the user experience and maximize retention. I guess one thing that would be interesting to understand is how Duolingo strikes a balance between prioritizing progress (by making lessons harder, more challenging) and focusing on retention (by making sure the user doesn’t give up). At the end of the day, it’s not in Duolingo’s best interest to make users multilingual too fast! I agree that the solution to keep users (once they are comfortable speaking a language) would be to have more AI-driven ‘practice’ features, that simulate human conversations.
