Lessons from Duolingo’s Effort to Support Free Language Learning from Crowdsourcing
@duolingo’s crowdsourced text translation platform allows users to learn new languages online for free
Duolingo has become the most popular way to learn languages online. With over 100 million users, it is a free, technology-driven language education platform that includes a website, an app, a crowdsourced text translation platform, and a language proficiency assessment center.
A 2013 Forbes article describes the company succinctly: “Duolingo famously marries two objectives: the app uses Rosetta stone-style interfaces and quizzes to teach users a new language, for free. As part of the learning process, many users will translate English-language articles into their native tongues and for that, Duolingo makes money” (1).
Indeed, from the beginning, Duolingo’s founder Luis von Ahn (also the founder of reCAPTCHA, a very clever crowdsourcing trick unto itself) has been committed to providing free language education to the world. Unlike the non-profit Khan Academy, and despite receiving significant VC support from the beginning, von Ahn planned to fund the platform’s free education by building a sustainable business (2). He believed that model would come in the form of students translating real-world documents as a way to practice foreign language skills. Duolingo could then sell those translations. The business model is based on a mutually beneficial arrangement wherein students receive high-quality free language education and businesses receive translation services powered by the students.
Duolingo found early traction with their business model. The 2013 announcement of partnerships with CNN and Buzzfeed, two highly regarded news organizations, seemed to legitimize von Ahn’s vision of creating a sustainable business model through crowdsourcing (2). According to Forbes, Duolingo charged CNN and Buzzfeed for each word translated through its platform (1).
As with any crowdsourcing model, scale is essential to Duolingo. With more active students, Duolingo is better able to deliver high-quality translations in volume within hours. Duolingo uses an algorithm to combine the multiple efforts of students translating each phrase to produce a crowdsourced translation as accurate as those of a professional translator. Each article is translated by 30 to 40 people (1). For example, as part of a Duolingo lesson, students can read and translate trending Buzzfeed articles. Once nearly 50 students have translated the article, Duolingo combines their work into one synthesized translation and provides it to Buzzfeed for a fee.
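The source does not describe how Duolingo's combining algorithm actually works. As a rough illustration of the general idea only, the sketch below uses a simple consensus rule: from a set of student submissions for one sentence, it picks the one that agrees most, on average, with all the others. The function names `similarity` and `pick_consensus` and the word-overlap measure are assumptions made for this example, not Duolingo's method.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Word-level similarity between two sentences, in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def pick_consensus(translations: list[str]) -> str:
    """Return the candidate that agrees most, on average, with all the others.

    Hypothetical consensus rule for illustration; not Duolingo's algorithm.
    """
    if not translations:
        raise ValueError("need at least one translation")
    if len(translations) == 1:
        return translations[0]

    def mean_agreement(i: int) -> float:
        # Average similarity of candidate i to every other submission.
        others = [t for j, t in enumerate(translations) if j != i]
        return sum(similarity(translations[i], o) for o in others) / len(others)

    best = max(range(len(translations)), key=mean_agreement)
    return translations[best]


if __name__ == "__main__":
    # Made-up student submissions for a single sentence.
    submissions = [
        "the cat sits on the mat",
        "the cat sits on the mat",
        "a cat sat on a rug",
        "the cat sit on mat",
    ]
    print(pick_consensus(submissions))
```

A rule like this rewards agreement rather than correctness, so it only works when most of the crowd is roughly right, which is one reason scale matters so much to the model.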
At first glance, the platform seemed well positioned to exploit the fundamental benefits of crowdsourcing: It appears to match the cheap supply of a crowd with demand that might otherwise turn to more centralized and expensive options. By taking a closer look, however, and with the benefit of time, it’s clear Duolingo’s approach to crowdsourcing was challenged by some important issues. These challenges are instructive to a broader crowdsourcing analysis.
- Quality control: Translation requires quality control. Someone has to check the quality and accuracy of each translated article before it appears, for example, on CNN. The question, then, is whether the quality checker should sit on the Duolingo side or the publisher’s side. As a publisher, I’d be concerned about the quality of Duolingo’s output and would be inclined to have someone check it. This in turn diminishes the value of the service that Duolingo provides. Duolingo attacked the problem by synthesizing the translations of many people, but whether crowdsourcing tools can deliver at a high enough quality remains a central challenge to crowdsourcing.
- Timeliness: When publishing the news, there is a need to match supply and demand in a timely manner. News publishers in particular work on tight timelines to publish breaking news. What if there are not enough users translating an article to ensure a high-quality and timely translation? The crowd is extremely powerful, but it can also be unwieldy and difficult to organize. Where the output of a crowd needs to be generated quickly, it steepens the challenge of corralling the crowd’s intelligence. In contrast, the content of other successful crowdsourcing platforms like Yelp and Wikipedia is more “evergreen” and therefore can constantly be revised and updated by other users. The evergreen nature of this content is better suited than a breaking news story for crowdsourcing because many news stories lose relevance and value within a short time of being published.
- Incentivization and transparency: Finally, Duolingo’s model lacks certain incentives that can motivate “workers”. Users studying a new language are apt to have very different levels of engagement in terms of quantity and quality. Is the desire to learn a language motivating enough to ensure high-quality and consistent participation? Moreover, Duolingo did not exploit a rating system for top translators, nor did it share fees with translators. The company’s system is generally opaque. This leaves users unaware of how their translations are being used. To fully motivate a crowd, quality incentives and transparency around compensation structures – however defined – are crucial.
As a result, I’d argue that Duolingo has not fully exploited its crowdsourcing opportunity. If we compare Duolingo to other successful crowdsourcing platforms like Yelp and Wikipedia, the fundamental difference seems to be around timing and alignment of interests. Yelp and Wikipedia users are motivated to review a place or share knowledge about a topic that interests them. In contrast, Duolingo’s partnership with breaking news organizations requires instant translations of content that might not be well suited to the vocabulary a student is learning at any given time.
In June 2015, Duolingo raised an additional $45 million in an investment round led by Google Capital. Previous investors such as Union Square Ventures, NEA, Kleiner Perkins Caufield & Byers, Ashton Kutcher and Tim Ferriss also participated in the round. Duolingo’s total funding to date is $83.3 million, and the company says its valuation is now around $470 million (4).
But what is next for Duolingo? Since the 2013 announcement, it does not appear that the firm has signed up any big-name publishers, and there is little data about the success of the Buzzfeed and CNN partnerships. These publisher partnerships were not mentioned in the press release for the most recent round of fundraising. Instead, Duolingo seems to be more focused on monetizing the certification opportunity. This past September, Duolingo and Uber announced a new partnership called UberEnglish Program, which uses the Duolingo Test Center English Certificate to verify the English skills of Uber drivers without interviewing drivers individually. The Duolingo Certificate Exam costs $20 and takes 20 minutes on a user’s mobile phone or desktop (5).
Student comments on Lessons from Duolingo’s Effort to Support Free Language Learning from Crowdsourcing
Great post! It’s interesting that Duolingo plans to generate revenue from the efforts of its community of translators without providing any direct incentives to them. One idea for Duolingo would be to adopt a TopCoder model and run translation competitions, offering the community a chance to win prizes and building on Duolingo’s strong gamification culture. It would also attract more skilled translators and ensure a certain level of quality for the company hosting the competition. From the perspective of CNN, Buzzfeed and other companies that use Duolingo’s translation services, I would be concerned about privacy issues for content that is not yet meant to be shared publicly. How does Duolingo respond to concerns about sensitive content and ensure that companies maintain privacy?
This is fantastic! I have been following Duolingo and think it’s an incredibly innovative company. Your observation on the BuzzFeed and CNN partnerships is spot-on: there haven’t been many reviews or any new business. The Uber partnership is really interesting because it seems Duolingo is going beyond the original intended goal of being a translation service to businesses – perhaps another case in point that the translation model is not generating enough revenue for them. On the point of quality control, I remember seeing a TED Talk in which von Ahn said that the output of aggregating student translations actually rivals the quality of a professional translator. Your view on incentivizing translators through a rating system or a certificate is also very relevant to the firm. Great insights!