Crowd-sourcing the Secret of Life: 23andMe and Open Innovation

How direct-to-consumer genotype company 23andMe uses open innovation to drive genetic research

The future of health through crowd-sourcing

The 1997 film Gattaca explores a world in which social class and upward mobility are determined by the quality of one’s genetic code. While the film implies a far-off dystopian future, genetic sequencing companies like 23andMe are edging us closer to that reality.

23andMe provides direct-to-consumer genome tests. Customers receive a kit in the mail to provide the Company with a saliva sample. Customers are then provided ancestry information as well as health information about their genetic predisposition to certain diseases, carrier status, and other genetically-influenced wellness traits [1].

While ancestry and wellness kits are helpful and entertaining, 23andMe’s core competency is crowd-sourcing and aggregating genomic data. In fact, many liken the Company to “the Google of personalized healthcare” and there are many instances of the Company allowing business partners (such as pharmaceutical companies) access to its database [2]. With a dataset of 5 million participants, 23andMe’s success is driven by its ability to source content through open innovation [3].

Expanding the business model

Management clearly recognizes the value in its crowd-sourcing model and recently launched a study with Pfizer to find genetic correlations among those suffering from depression. The Company solicited data from customers through surveys – asking them about diseases they’ve been diagnosed with and other key medical data. As one of the only organizations with enough data to yield a statistically significant result, the Company was able to cross-reference over 140,000 instances of self-reported depression and compare the genotypes of those individuals. As a result, Pfizer and 23andMe were able to publish a study linking 15 regions of the human genome to a higher likelihood of depression [4].

Following the success of its depression study, the Company has doubled down on open-sourced genetic diagnostics and introduced a conditions page as part of its service package. Focusing first on 18 conditions, the website allows consumers to rank various treatments based on personal experience of efficacy. The goal of the program is to identify why and how certain treatments impact patients differently based on genotype [5]. In February 2018, the Company announced it was launching a study with 100,000 participants to study the efficacy of various diets based on genetic makeup [6]. There have even been reports of 23andMe’s data being used in law enforcement matters, such as identifying victims or alleged criminals through DNA tests [7].

Proponents of the company laud its “democratization” of healthcare research and disruption of the “wall of the white coat” [8]. Much of this praise is deserved – following the lead of companies like Facebook and Instagram, the Company has leveraged society’s increasing willingness to waive privacy rights for targeted content and insights. On a macro-level, the Company has spearheaded genetic research and humanity’s collective understanding of genetic disorders at a rapid pace.

Concerns with the product

While management is committed to their open-innovation techniques to aggregate data and identify health trends, this business model is not without its concerns. The Company is walking a tightrope of privacy and commerciality – while participating in research studies is completely voluntary, customers will never know the various ways in which their personal data could be used and manipulated. The risk of cyber-attack or accidental release of highly confidential material is real, with one such leak and subsequent federal inquiry occurring earlier this year [9].

Beyond privacy concerns are academic concerns about the quality of the research. Self-administered tests and surveys open the door for errors and bias. Scientists have found wide variability in results across genetic databases, calling into question the accuracy of the entire process [10]. Beyond error risk, many have criticized open-source medical data as irresponsible. For example, after the Company launched its condition page for self-reported efficacy of treatments, many psychologists noted the risk of such information discouraging patients from using treatments they would otherwise be well-suited for [11].

There is no doubt that 23andMe has leveraged crowd-sourcing in a way that has enabled cutting-edge medical research. The Company itself is leading the charge on additional uses of open innovation, a megatrend becoming increasingly relevant in today’s economy. However, there are elements of the Company vision that are concerning and management will have to answer these basic questions – how should we protect individual privacy, what is the bar for good research, and how can this type of data be used in tandem with existing resources in a responsible manner?


[1] Our Health + Ancestry DNA Service. (n.d.). Retrieved from 23andMe:

[2] Grothaus, M. (2015, January 5th). How 23andMe Is Monetizing Your DNA. Retrieved from Fast Company:

[3] About Us – 23andMe Media Center. (2018). Retrieved from 23andMe:

[4] Regalado, A. (2016, August 1st). 23andMe Pulls Off Massive Crowdsourced Depression Study. Retrieved from MIT Technology Review:

[5] Cohen, J. K. (2018, April 22). 23andMe wants to help users ‘crowdsource’ treatment information for 18 conditions. Retrieved from Becker’s Hospital Review:

[6] Raphael, R. (2018, February 27). To Get Bigger, 23andMe Is Watching A Hundred Thousand People Diet. Retrieved from Fast Company:

[7] Brown, K. V. (2018, January 14). DNA Detectives Are Searching for Killers in Your Family Tree. Retrieved from Bloomberg:

[8] Ross, B. (2018, January 15). 23andMe CEO Goes Beyond “Wall Of A White Coat”. Retrieved from Bio IT World:

[9] Baram, M. (2018, June 5). The FTC is investigating DNA firms like 23andMe and Ancestry over privacy. Retrieved from Fast Company:

[10] Sample, I. (2011, May 30). Genetics tests flawed and inaccurate, say Dutch scientists. Retrieved from The Guardian:

[11] Brodwin, E. (2018, April 30). 23andMe launches depression and ADHD “condition pages” that worry experts. Retrieved from Business Insider:


Student comments on Crowd-sourcing the Secret of Life: 23andMe and Open Innovation

  1. It’s fascinating to speculate how the face of healthcare will change in the next 5-10 years as consumers drive more of the decision-making and demand more personalized care.
    At the same time, I agree with your concern that projects like 23andMe’s genetic data mining could trigger patient privacy scandals in the future. I believe the answer to avoid such an issue is by (1) using transparency and upfront disclosure with customers about how their data will be used; and (2) sharing that value / those insights with users of the product (e.g., new medication that may be particularly effective for their genotype).

  2. Extremely interesting topic. From a consumer perspective, I remember feeling similar privacy concerns after purchasing 23andMe. It will be interesting to see how this industry becomes regulated, and whether access to customer genomes will remain proprietary to 23andMe or whether customers should be concerned about the company selling data to other institutions. For example, would this information ever be subject to use in criminal investigations? I personally feel the consumer benefit still outweighs these types of concerns, but I also understand the skeptics who question whether this company was founded with the genuine intention to help people understand their genetics or whether it is part of a larger scheme to crowdsource genetics for its own purposes.

  3. If I were a researcher, I would be very skeptical about the research results published by 23andMe, because they are solely dependent on customers’ self-reports, which can be biased and full of errors even when customers have the best intentions. The fact that research based on these unreliable inputs is offered to other customers as a reference on the efficacy of treatments is even more disturbing – what if you are conveying the wrong message and discouraging patients from seeking proper treatment? One potential way to mitigate this risk is to work with hospitals to cross-check the self-reported information, or to give customers more detailed instructions or templates for reporting, though this would increase the burden on customers and make the product less attractive. Overall I definitely see a trade-off between accuracy and efficiency.

  4. As a consumer, one additional concern that jumps out at me is the potential use of an individual’s genetic information in a discriminatory fashion. For example, insurance companies may in the future ask consumers to report outcomes of genetic testing in order to assign health-care premiums. I find it alarming that 23andMe data has already been used in law enforcement matters and that the company allows pharmaceutical partners to access its database. I am also interested to learn more about what 23andMe is doing on the cybersecurity front and what measures it has in place to protect its customers’ privacy.

  5. Your essay correctly points out that privacy in this context is important AND that individuals increasingly waive their right to privacy. The latter point is especially interesting to me because it seems as though people could actually end up “opting in” to the dystopian future the film Gattaca depicts. Your piece made me think about the possibility that we could reach that point not through some sort of data breach but through the choices of individual people.

  6. Interesting topic! I hadn’t thought about 23andMe being a use of crowd-sourcing before. I was particularly interested in the correlations being run between certain genomes and self-declared illnesses. Do you worry about the biases associated with self-reporting medical conditions and how that may skew the data being examined? Depression in particular is an illness that I imagine frequently goes undiagnosed or that people choose to report incorrectly.

  8. Fascinating topic. I completely agree with your concerns around consumer privacy – this is an issue companies will likely never overcome. We’ve seen recent data breaches such as the one at Equifax, which resulted in the release of millions of consumers’ credit histories and personal information. An event of similar scale for 23andMe would likely be impossible to recover from. Another concern I had when reading your piece was around data error and the risk involved in identifying criminal suspects. The company could be susceptible to both Type I and Type II errors, which would create a serious ethical dilemma over whether 23andMe data should be allowed to assist in criminal cases. All in all, I think there are countless opportunities and benefits from 23andMe’s open innovation strategy, especially within the healthcare sector, as you mentioned in your writing.

  9. This is a particularly interesting topic in the context of health insurance and long-term care insurance. As people gain more information about their genetic sequencing from tests like 23andMe, they will be more informed about the risks they are facing in the future. This will better inform their decisions about what type of health and long-term care insurance to purchase, since they are more aware of their risk profile. While this will be beneficial to consumers, it also runs the risk of increasing premiums for insurance since, presumably, riskier consumers will be more frequently requesting insurance, while less risky consumers will be less likely to purchase insurance.

    Additionally, as consumers are more informed, there will be information available that may influence an insurance company’s decision to allow someone to purchase insurance. For example, a woman purchasing long-term care insurance knew she had inherited a gene which increased her likelihood of developing Alzheimer’s in the future, but she did not disclose that information to her insurance provider, which made her take multiple memory tests prior to allowing her to purchase insurance [1]. In the future, insurance companies may require individuals to use genetic testing like 23andMe to determine their risk profile in order to evaluate what their insurance premium should be. Hopefully, this does not happen, since pre-existing conditions should not influence an insurer’s decision to insure someone and/or how much to charge them.

    [1] Kolata, Gina. “New Gene Tests Pose a Threat to Insurers.” The New York Times, The New York Times, 12 May 2017,

  10. Great topic. I agree that it is important to question the quality of 23andMe’s research studies. I would be particularly concerned about selection bias, as individuals choosing to participate in 23andMe may be more homogenous in genetic composition than an entire population. Thus, the generalizability of the studies to other individuals remains low at this time. I would agree that the results must be interpreted responsibly, and question whether the results of these studies are actionable in today’s world. The medical conditions studied in 23andMe result from an interplay of genes and the environment not yet well understood, and so it remains unclear whether individuals are harmed or benefit by making lifestyle changes once aware of their genetic predisposition to certain conditions.

  11. Thanks for sharing your interesting take on an increasingly popular topic. While I applaud 23andMe for being a first-mover in this space, I share many of the same concerns that you do. I’ve actually used their kit for both ancestry and health. The ancestry results are much easier to believe and likely more accurate than the health data provided. Even as someone who has learned about genetics in medical school and has a developing understanding of medicine more generally, I find it very challenging to know what to do with the health data provided. I can only imagine how much more difficult it would be for someone without a medical background. The entire field of genetics is still very specialized with a small number of experts who really understand how to use this rich genomic data and doctors in general aren’t adequately trained to offer evidence-based advice on these subjects. Because of these issues, 23andMe has an ethical obligation to clearly communicate the limitations of its data, which could potentially undermine its business strategy. Finally, the accuracy of 23andMe’s health data will continue to be a serious question in my mind, as will any studies supported by pharmaceutical companies like Pfizer who stand to gain from identifying more people at-risk or suffering from conditions like depression (Pfizer produces Zoloft, a common depression treatment). Nonetheless, I see 23andMe as continuing to be an important player in this space. My hope is that they do so in a responsible manner.

  12. Unfortunately, as we move to the ‘future of health through crowd-sourcing’ and a greater trend for participatory medicine, I fear that many of the issues encountered in traditional research models are again rearing their heads.

    In addition to the ethical and privacy issues discussed, a key question for me is how this model can ensure it has a demographically representative data sample. Even in traditional cancer research, we know that despite the importance of diversity, trial participants are more likely to be male, white and younger than the demographic distribution of cancer sufferers would suggest (1).

    The model of 23andMe, with a relatively high price for its recommended direct to consumer genetic test ($199 at the time of writing (2)), seems less likely to facilitate equal representation. Moreover, we know that awareness of direct to consumer genetic testing is lower amongst populations with lower incomes and lower numeracy skills (3).

    Given the likely unrepresentative sample of 23andMe’s genomic sample, and a business model that seems unlikely to facilitate equal participation, I would be concerned by the inequity inherent in seeking innovation and research advances using primarily their data.

    1. Murthy VH, Krumholz HM, Gross CP. Participation in Cancer Clinical Trials: Race-, Sex-, and Age-Based Disparities. JAMA. 2004 Jun 9;291(22):2720–6.
    2. 23andMe. DNA Genetic Testing & Analysis – 23andMe [Internet]. [cited 2018 Nov 15]. Available from:
    3. Agurs-Collins T, Ferrer R, Ottenbacher A, Waters EA, O’Connell ME, Hamilton JG. Public Awareness of Direct-to-Consumer Genetic Tests: Findings from the 2013 U.S. Health Information National Trends Survey. J Cancer Educ. 2015 Dec 1;30(4):799–807.

  13. Great topic that raises a whole host of issues around both privacy and accuracy. While privacy is of course a great concern, especially given recent hacks into databases such as Equifax, I personally have greater concern over accuracy and the potential to lead to incorrect treatment of given conditions based on what is a relatively cursory sequencing of patient DNA. I think there are 2 key potential sources of this inaccuracy. The first is the fact that patients are self-reporting results (as mentioned in the article and several of the comments above), which can not only be affected by cognitive biases, but also by simple lack of medical knowledge given the lack of provider involvement. Secondly, many of the correlation studies being performed fail to account for non-genetic factors that can be the more relevant drivers of a given condition, such as diet, lifestyle, etc.

    Lastly, as the B2C market for genetic testing continues to grow, I wonder when the scale will be such that the company is in direct competition with more technically sophisticated diagnostics such as BRCA testing to identify genetic mutations linked to breast cancer (provided by companies such as Myriad Genetics) or companion diagnostics associated with the treatment of certain cancer mutations. Given the clinical data supporting companies such as the above, I question whether 23andMe will hit a brick wall with respect to its sequencing capabilities, and therefore the validation of treatment suggestions derived from its database. As a result, I have a tough time seeing the company ever being much more than simply a consumer product.

  14. Gattaca (spoiler alert!) ends on an uplifting note, though the world it depicts is certainly dystopian. As inequality of every stripe grows in this country, what do you think are the critical steps we can take to avoid a Gattaca-like future? I wonder if we would in aggregate be better off *not* knowing our relative advantages, disadvantages, and position vs. our fellow man – if we’d treat everyone better if we knew less about each other. If 23andMe started saying that your DNA or epigenetics shows you have the potential to be x% smart, y% conscientious, etc., I think we’d be better off as a society without that information. But it may be hard to stop.

  15. I think that 23andMe is an incredible company, using crowd-sourcing and open innovation in healthcare. I personally have used the platform to learn more about my genetic history and have found the findings fascinating. However, I do have concerns about the risks of data security. It was not until after using 23andMe that I became aware that law enforcement and the federal government can pressure the company to share customers’ DNA [1]. Going forward, I think it is critical for 23andMe to implement more transparency around how customers’ data can be used. There should be clear disclosure to consumers, before they use the 23andMe platform, of its policies on data security and privacy.


  16. This is a very unique take on crowdsourcing. I have been personally reluctant to use 23andMe for the exact privacy issues you highlight above.

    One thing I find particularly concerning about this type of innovation is the influence it may have on an individual without any doctor intervention. For example, certain deficiencies or predispositions that may be highlighted by the results of this test can drive individuals to change their behavior in a way that is even more harmful. I believe in the value of using the results of these tests alongside the consultation of a doctor as I believe professional human judgement should always be a component of medical/health decisions. By completely removing a medical professional from the chain, is 23andMe actually empowering uneducated consumers with too much information?

    I also find the potential downsides of misreporting to potentially be too high in this case. Humans have a tendency to misreport their own fitness, digestive and mental health activities and if this data is then fed into algorithms or studies that are used for future medical advice, we could be fueling a faulty system. As they say, “bad data in, bad data out.”

  17. Challenging privacy constraints is always a worthy conversation. Society wants to have access to enhanced products and sometimes this can best be achieved through people’s willingness to share personal data, which can then be used to create a database that is analyzed. However, whenever receiving crowd-sourced information, it is crucial to verify the reliability of the source because results are heavily dependent on the information provided. 23andMe should have an authentication process to ensure the reliability of its results, because incorrect generalizations can have detrimental effects on one’s health or on the measures one takes after being wrongly notified of a predisposition to a certain ailment. Ensuring that the data submitted to 23andMe is kept confidential is a difficult issue. There have been numerous cyber attacks on corporations that have led to leaked data, and I believe that if this information gets into the wrong hands, people can be preyed upon because of their health vulnerabilities.

  18. At the risk of echoing all the sentiments above, I would just reiterate that this is indeed a topic that needs to be addressed over the next few years as we do not have a clear answer to it just yet. If anything, I have so many questions. Beyond the potential concerns for the insurance industry and/or finding potential partners, where would this leave us when it comes to choosing babies? With the rise of IVF and embryo screening for traits parents find ‘desirable’, what will our future world even look like? Beauty will be in the eye of the parent. As more genes associated with the likelihood of disease are uncovered, the possibility of a truly preventive medicine is within the grasp of many parents. But with that possibility come risks. How well will any one test deliver on its promise of a healthy child? Will parents feel obligated to use genetic testing without adequately understanding its benefits? What kinds of genetic tests will parents want? Indeed, recent findings suggest that an increasing number of parents using IVF are choosing embryos according to sex, and it’s possible to imagine them one day choosing embryos based on other nonmedical traits, such as hair color, height, or IQ. Would such choices reflect the less desirable aspects of our human nature?

  19. You pose a very tough question. In today’s day and age, we need to trust private companies to handle our data security. When it comes to DNA, the risk is even greater than with other personal data. If companies like 23andMe allow us to create anonymized accounts, it will certainly help, but as these platforms grow, the companies can become even greater targets. As the platforms grow, data will become more aggregated and anonymized, but the early adopters need to trust the companies using their data.

  20. As a recent 23andMe participant, I find it worrisome to learn about all that the Company is likely doing with my personal data. However, the potential for 23andMe’s partnerships with healthcare and pharmaceutical firms like Pfizer is quite compelling. If we can leverage the digitization of genetic information to better match individuals to clinical treatments, then the argument can be made that the benefits of reducing our privacy outweigh the risks outlined in the post. Another area where this dilemma may manifest itself is in online advertising, where the more that is known about a user’s profile, the more relevant the advertising they are exposed to becomes. With the digitization of most industries well underway, the abundance of available data will only grow, amplifying cybersecurity risk in most markets. While this presents a concern, I do not believe that we should allow it to derail our efforts to innovate.
