Duolingo, one of the leading free language learning services, utilizes machine learning across thousands of exercises to help users learn and retain new languages.
One of the most important aspects of Duolingo’s product influenced by machine learning is measuring how easy it is to retain each word. Simply put, some words are harder to remember than others, and Duolingo uses machine learning to quantify that difference. It looks across every exercise ever completed and determines how difficult a word is to remember based on how well students fared the next time they saw it. With enough data, it predicts how quickly a student will forget the word and, from that, when to test the student on it again. Each time the student gets the word right or wrong, Duolingo recalibrates its prediction, drawing on how similar students learned the word over time, and adjusts how soon the word will reappear.1
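The scheduling loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Duolingo’s actual formula: it assumes an exponential forgetting curve with a per-word half-life, and the names `WordMemory`, `recall_probability`, `days_until_review`, and `update`, along with the doubling and halving factors, are hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class WordMemory:
    # Hypothetical per-word state: days until recall probability drops to 50%.
    half_life_days: float = 1.0


def recall_probability(memory: WordMemory, days_since_review: float) -> float:
    """Assumed forgetting curve: recall halves every `half_life_days`."""
    return 2.0 ** (-days_since_review / memory.half_life_days)


def days_until_review(memory: WordMemory, target_recall: float = 0.5) -> float:
    """Solve 2^(-t / h) = target_recall for t to decide when to retest."""
    return -memory.half_life_days * math.log2(target_recall)


def update(memory: WordMemory, was_correct: bool,
           growth: float = 2.0, shrink: float = 0.5) -> WordMemory:
    """Illustrative update rule: lengthen the half-life after a correct
    answer, shorten it after a miss. The factors are assumptions."""
    factor = growth if was_correct else shrink
    return WordMemory(half_life_days=memory.half_life_days * factor)


if __name__ == "__main__":
    memory = WordMemory(half_life_days=2.0)
    print(recall_probability(memory, days_since_review=2.0))
    memory = update(memory, was_correct=True)
    print(days_until_review(memory))
```

A production system would presumably learn the half-life from many students’ review histories rather than apply fixed multipliers; the fixed factors here only make the recalibration step concrete.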
This approach is important because everyone learns languages differently. Duolingo uses many methods, from matching exercises to rote translation, to help students learn. By studying how well students retain words based on different methods of learning, it can also optimize the algorithms to teach certain words using certain activities (or a clever combination of activities). Even the order in which concepts are taught can be optimized – should one learn adverbs or past tense verbs first?2
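As a rough sketch of how retention might be compared across exercise types, the Python below tallies recall rates per word and activity and picks the activity with the highest rate. The log format, the function name `best_exercise_per_word`, and the sample data are all assumptions made for illustration; nothing here reflects Duolingo’s real data or models.

```python
from collections import defaultdict


def best_exercise_per_word(review_log):
    """Return {word: (exercise, recall_rate)} for the best-performing exercise.

    `review_log` is a hypothetical list of (word, exercise, recalled) tuples,
    where `recalled` records whether the student got the word right the next
    time it appeared after practicing it with that exercise type.
    """
    # (word, exercise) -> [times recalled, times seen]
    totals = defaultdict(lambda: [0, 0])
    for word, exercise, recalled in review_log:
        stats = totals[(word, exercise)]
        stats[0] += int(recalled)
        stats[1] += 1

    best = {}
    for (word, exercise), (correct, seen) in totals.items():
        rate = correct / seen
        if word not in best or rate > best[word][1]:
            best[word] = (exercise, rate)
    return best


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    log = [
        ("gato", "matching", True),
        ("gato", "matching", True),
        ("gato", "translation", False),
        ("gato", "translation", True),
    ]
    print(best_exercise_per_word(log))
```

With only a handful of observations per word a comparison like this is noisy, so a real analysis would need a minimum sample size before preferring one activity over another.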
Duolingo has also developed an English proficiency test that measures a test taker’s English abilities. To build the test, Duolingo analyzed the Common European Framework of Reference (CEFR), a widely used international standard for measuring language proficiency. It processed tens of thousands of passages corresponding to different language levels and used machine learning techniques to determine which words or concepts were most important to test at each level.3
Looking forward, Duolingo is focused on driving more users to the platform. Machine learning techniques work best when they have a robust data set to work from. While some language pairs, like English to Spanish, are used frequently, other pairs have fewer users, which makes it more difficult to train the algorithms. Data from one pair also does not transfer cleanly to another: what might be an easy cognate for an English speaker learning Spanish might be a totally different-looking and more challenging word when the speaker’s native language is Hungarian.
Duolingo also wants to improve its ability to help students reach more advanced language proficiency. To address this, it is developing the Stories feature, which lets students review long-form passages with more complex words and answer questions about what they’ve heard or read. Unlike the traditional Duolingo approach, Stories offers a fixed set of stories and topics that are not algorithmically generated. This highlights the limitation of Duolingo’s approach at advanced levels: few users make it to the end of the lessons, so there is less data to track for more advanced words and topics. As the volume of data thins and as the language (and how people acquire it) becomes more complex, Duolingo has had to tweak the algorithm-based, machine learning-driven approach that made its core offering so successful.4
Long term, Duolingo is exploring how to take its insights from language learning and apply them to other subjects, from physics to reading. It’s extremely difficult for many people across the world to access a high-quality education. While services like Duolingo don’t address the structural issues that prevent that access, Duolingo could offer a similar machine learning-driven approach to help those without resources learn more than just language.5
There is still opportunity to improve Duolingo’s approach. Duolingo focuses its machine learning on what users put into the app, that is, how well users recall a word when it comes up on the screen. But as any language learner knows, stepping up to a counter and ordering something in a new language is very different than remembering flashcards. For users who are trying to use their new language as they learn it, Duolingo can feel disconnected from this real-life experience. Yet real life provides an incredibly rich data set for Duolingo to analyze, and finding a way to integrate it into its algorithms would be a huge next step. Whether by allowing users to input words they forgot each day or even by capturing and analyzing their spoken conversations, the platform could better track how users are actually behaving and provide even better recommendations about what to learn or improve next.
One critical question is how improved translation services will change the role and necessity of language learning. As Google Translate and similar services get better at accurately translating longer and more complex passages, will people still feel compelled to learn new languages? Do Google Translate and similar apps represent a competitive threat to Duolingo?