23andMe: Healing the World through Crowdsourcing

Biotech firm 23andMe used crowdsourcing to collect a vast body of data and fuel a first-of-its-kind depression study

What is 23andMe?

23andMe is a California-based biotech company that provides direct-to-consumer genome tests. The company sells those analyses to customers in the form of two products: an ancestry service, and a health service. The ancestry service “reports on your ancestry composition, Haplogroups, Neanderthal ancestry, your DNA family and provides a DNA relative finder tool to enable you to connect with people who share DNA with you” (1). The health service tells the customer about their “carrier status, wellness, and traits” (1).

The process is very simple. Once you purchase the product online, 23andMe sends you a kit, which you use to provide a saliva sample and mail it back to their labs. A few weeks later, you’ll have access to your results. 23andMe affirms that it follows high quality standards and uses FDA-approved, scientifically and clinically valid processes (2).

Through August of last year, 23andMe had sold over a million kits (3). At the time of purchase, 23andMe also asks every customer for permission to use their DNA (as well as their answers to survey questions about their health) in further research. That request matters, because more than half of the customers in its database have opted in (3). That’s high-quality data from over half a million people around the world, ready to be analyzed.


Linking genetic differences to common diseases:

One way to analyze such data is through genome-wide association studies, which are examinations that help identify genetic differences between people with a disease and a control group of healthy people. Such studies have yielded important advances in research on various diseases (e.g. diabetes) (3). However, many of those studies fall short for one simple reason: the lack of a large enough sample, or in other words, a lack of data.
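To make the sample-size point concrete, here is a minimal sketch of one common way a single genetic variant can be tested for association: a chi-square test on allele counts in cases versus controls. This is an illustration only, not the method used in the study discussed in this post; the function name and all counts are hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table.

    Rows are cases and controls; columns are counts of the alternate
    and reference allele. All inputs here are hypothetical counts.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele frequency were the same in both groups
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Same allele frequencies (60% vs. 40%), but ten times as many samples
small_chi2, small_p = allelic_chi_square(60, 40, 40, 60)
large_chi2, large_p = allelic_chi_square(600, 400, 400, 600)
print(small_chi2, small_p)
print(large_chi2, large_p)
```

With the allele frequencies held fixed, the test statistic grows in proportion to the number of samples, so the same underlying difference yields a much smaller p-value in the larger group. That is why a bigger pool of participants can surface associations that a smaller study would miss.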

Enter 23andMe.


Crowdsourcing data leading to scientific breakthroughs:

Last summer, Pfizer, in partnership with 23andMe, conducted the largest ever genome-wide association study into the genetic causes of depression. For the first time, the study successfully detected “15 regions of human genome linked to a higher risk of struggling with serious depression” (3).

23andMe was able to provide a sample of 141,000 people diagnosed with depression, and another 337,000 to be used as a control group (3). For context, the next-largest depression study ever conducted included only a tenth as many participants (3).


Reasons for success, and potential pitfalls of similar crowdsourcing efforts:

There are a few factors working in 23andMe’s favor, allowing the company to collect such a vast body of data:

  • Offering entertainment value for the masses: the vast majority of 23andMe’s customers purchase the product for its unique entertainment value (mainly for discovering one’s ethnic background). And they pay a hefty sum for it: $199 for the ancestry and health services. That’s clever, because in most cases, people participating in scientific research and providing data are the ones getting compensated, not the other way around. 23andMe has succeeded in attracting (lots of) customers seeking entertainment, using their data for research…while making money at the same time.
  • Building a reputable brand: 23andMe is operating in a heavily regulated field, which can be a curse, but also a blessing. Having the FDA’s stamp of approval and partnering with companies like Pfizer are huge boosts to the credibility of their brand.
  • Operating in a competition-light space: another benefit of operating in a heavily regulated (and highly technical) field is the added barriers to entry for competitors. With few alternatives available for such a unique product, customers are willing to pay $199, and (more importantly for 23andMe) volunteer their answers and their data.

So, will this be a never-failing formula for gathering massive amounts of data for similar genetic research? That’s unlikely, according to Stanford psychiatric and gene researcher Douglas Levinson (3). The success of this method needed 141,000 customers who self-identified as sufferers of depression. However, for other ailments that are less common and more socially stigmatized, similar efforts might fail due to people’s unwillingness to volunteer (3).


  1. https://www.23andme.com/
  2. https://www.23andme.com/howitworks/
  3. https://www.technologyreview.com/s/602052/23andme-pulls-off-massive-crowdsourced-depression-study/



Student comments on 23andMe: Healing the World through Crowdsourcing

  1. Great post, Ali. I love this business and am hoping they’ll be successful. One of the biggest risks I see for them (or any imitators) is actually around privacy. It’s not inconceivable to me that access to large-scale genetic data could quickly put you in legal and ethical gray areas. For example, do you want marketers of specific drugs knowing that you have a genetic proclivity to a disease that they treat? Or do you want employers to know that you have a predisposition to specific types or levels of intelligence, or a particular risk of someday collecting disability? The Spiderman quote “with great power comes great responsibility” comes to mind.

  2. Ali – thanks for the great post! I’ve actually been toying with the idea of ordering a kit and completing the test myself. Will be sure to “opt in” on the research when I do so.

    I have a question regarding the research. You said that the success of this method needed 141,000 customers to self-identify as sufferers of depression. Does that mean that 23andMe’s research is based on self-identification of a disease? If so, my concern about 23andMe’s future ability to conduct research is different from Douglas Levinson’s. My concern is that either people will have misdiagnosed themselves (i.e. they are not clinically depressed but believe themselves to be, without confirmation from a doctor) OR that they will have labeled themselves as not suffering from depression when indeed they clinically are.

    Did you find anything in your research that clarified how 23andMe ensures its research isn’t exposed to errors from mistaken self-diagnosis? Thanks for clarifying!

  3. Thanks for your post, Ali! 23andMe announced in 2016 that they have decided not to focus on next-gen sequencing as a path forward for more holistic sequencing of genes. How do you think this will impact credibility with the scientific community? https://www.buzzfeed.com/stephaniemlee/23andme-anne-wojcicki-next-generation-sequencing?utm_term=.hibE7YgEM6#.eiRALe0AgP

    In terms of regulation, the FDA has raised questions about 23andMe in the past, and the company was actually shut down by the FDA at one point. While the FDA approves 23andMe’s current product, how will future iterations of the product be affected by potential FDA regulation?

    I’d also like to point out that genomics start-ups were actually the highest funded digital health category of 2016 at $410M. How do you think this will impact competition in the space for 23andMe, especially from companies that are focused on platform-based genomics testing (Helix) and others focused on next-gen sequencing (ColorGenomics, Arivale, Genospace)?

  4. Hi Ali, thanks for your post! How does 23andMe work to ensure that self-identified disease is actually clinically-diagnosed disease? I echo Megan’s concerns that there is a chance for ill-defined data to creep into studies and potentially obfuscate findings. Furthermore, is there a broad sampling error created when all the data comes from individuals who opt-in to a $199 genetic test?

  5. Ali, great post. I am intrigued by 23andMe’s strategy, especially as it pertains to the growing momentum behind precision medicine. Increasingly, we believe that genomic data will enable patients to be treated as individuals, with their diseases (especially cancer) perceived as unique and related more to genomics than initial emergence. I see 23andMe’s partnership with Pfizer as interesting, but only a small part of their potential to contribute to the advancement of science and medical treatment.

    However, it does seem that there are some issues around data security – especially when dealing with something as personal as one’s genome/DNA. How does 23andMe get consent from customers to use their data for large-scale studies? And do you think there might be obstacles to growing this beyond their current market penetration, as people become more suspicious of how their data is being used to generate value for 23andMe (when they paid out of pocket to give that data to 23andMe, rather than the other way around)?

  6. Thanks for the post.
    Do you think that 23andMe should consider broader expansion internationally, both in terms of sample genetic data and customer base? I think one of the disadvantageous aspects of 23andMe is that the results tend to be more comprehensive for Caucasian people in the US. As an Asian, for example, it may not even be worth it to purchase an ancestry report from 23andMe due to the lack of data on non-Caucasians in the data set. I think this article does a good job of pointing out some of the problems of 23andMe concerning lack of diversity in its data sets: https://qz.com/765879/23andme-has-a-race-problem-when-it-comes-to-ancestry-reports-for-non-whites/

  7. Thanks for the post, Ali! I’m curious about your thoughts on the health services testing side of the business. Medical professionals are hesitant about serious health advice being delivered over the internet/not in person. They feel that there is a certain and more delicate way in which humans should be advised on their ongoing and future health. The 23andMe health test reveals predisposition to some pretty serious diseases — and this information is probably best delivered by a doctor, in-person. Do you think 23andMe has a role in ensuring medical diagnoses are delivered in a sound way (i.e., in person and from a medical professional that is trained in these interpersonal dynamics)?

  8. Great post! I especially like the points you made in the last section of this article. I wonder if existing customers can get updated reports as the database grows. Presumably, the bigger this database is, the more accurate 23andMe’s reports will be. So, to some extent, the early customers paid to benefit the latecomers.
