
The Myth of Machine Unlearning: The Complexities of AI Data Removal

In an era where artificial intelligence (AI) increasingly shapes our digital landscape, the concept of “machine unlearning” has emerged as a potential solution to various challenges in AI governance. Recent research reveals, however, that the promise of machine unlearning is more complicated than initially thought. The study’s first authors are A. Feder Cooper, Faculty Associate at The Berkman Klein Center for Internet & Society at Harvard University; Christopher A. Choquette-Choo, Research Scientist at Google DeepMind; Miranda Bogen, Director of the Center for Democracy & Technology (CDT) AI Governance Lab; Matthew Jagielski, Research Scientist at Google DeepMind; Katja Filippova, Research Scientist at Google DeepMind; and Ken Ziyu Liu, PhD Student at Stanford University. They are joined by Seth Neel, Assistant Professor at Harvard Business School (HBS) and Principal Investigator of the Trustworthy AI Lab at HBS’ Digital Data Design (D^3) Institute, and 28 colleagues (see the Meet the Authors section for details). This article explores the key insights from their paper, “Machine Unlearning Doesn’t Do What You Think: Lessons for Generative AI Policy, Research, and Practice,” shedding light on the limitations and misconceptions surrounding machine unlearning in generative AI systems.

Key Insight: The Complexity of Information Removal

“Deleting information from an ML model is not well-defined. First, information cannot be deleted from an ML model in the same way that it can from a database.” [1]

The authors highlight that removing information from an AI model is fundamentally different from deleting data from a traditional database. Training entangles information within the model’s parameters in complex ways, making targeted removal challenging: there is no straightforward way to excise a specific individual’s data from a model trained on thousands of examples. The problem is exacerbated in generative AI models, which produce information-rich outputs based on vast amounts of training data. The sketch below contrasts the two operations.
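To make the contrast concrete, here is a minimal sketch in Python (using sqlite3 and scikit-learn purely for illustration; neither appears in the paper). Deleting a database row is a single targeted operation, while the only exact way to “delete” a training example’s influence from a model is to retrain from scratch on the remaining data, which is infeasible at generative-AI scale.

```python
import sqlite3
from sklearn.linear_model import LogisticRegression

# Deleting a record from a database is a well-defined, targeted operation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, feature REAL, label INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, 0.2, 0), (2, 0.9, 1), (3, 0.4, 0), (4, 0.8, 1)],
)
conn.execute("DELETE FROM users WHERE id = 2")  # the row is simply gone

# A trained model has no analogous operation: user 2's data is diffused
# across every learned parameter. The only *exact* deletion is retraining
# from scratch on the remaining ("retain") data, which is prohibitively
# expensive for large generative models.
rows = conn.execute("SELECT feature, label FROM users").fetchall()
X = [[feature] for feature, _ in rows]
y = [label for _, label in rows]
retrained = LogisticRegression().fit(X, y)  # full retrain on the retain set
```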

Key Insight: Misalignment Between Technical Methods and Policy Goals

“Unfortunately, the fit between unlearning and policy is not so straightforward in practice. Machine unlearning is a set of technical methods and here, as always, there are critical gaps—gaps that are too often overlooked—between what technical methods do and what policy aims to achieve.” [2]

The researchers emphasize a significant disconnect between the technical capabilities of machine unlearning and the expectations set by policymakers. While unlearning methods may address specific technical challenges, they often fall short of broader policy objectives. For example, using unlearning to comply with privacy provisions such as the “right to be forgotten” in the EU’s General Data Protection Regulation (GDPR) may not fully satisfy the law’s intent, because the methods cannot guarantee complete removal of an individual’s influence on the model.

Key Insight: The Challenge of Output Suppression

“Since all of these methods focus on suppressing outputs, their success is most often evaluated by examining how they affect the types of generations that are produced in some downstream task.” [3]

The research team points out that many unlearning methods aim to suppress certain outputs rather than truly remove information from the model. Success is therefore typically evaluated by testing the model’s responses to specific prompts. However, such evaluation may not capture the full extent of what the model still encodes: a model can be trained to avoid generating certain content even though the underlying information remains in its parameters. The sketch below illustrates why prompt-based evaluation measures suppression rather than removal.
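The following minimal sketch shows what this style of evaluation looks like in practice. Everything here is a hypothetical illustration, not the paper’s benchmark: the probe prompts, the string-matching check, and the toy “model” are all assumptions.

```python
# A minimal sketch of how suppression-style unlearning is commonly
# evaluated: probe the model with prompts and string-match the
# generations against the content to be "forgotten".

FORGET_TARGETS = ["secret project codename"]  # hypothetical forget set

def suppression_rate(generate, prompts):
    """Fraction of probe prompts whose outputs avoid every target.

    `generate` is any prompt -> text callable (a hypothetical stand-in
    for a real generation API).
    """
    clean = sum(
        1 for p in prompts
        if not any(t.lower() in generate(p).lower() for t in FORGET_TARGETS)
    )
    return clean / len(prompts)

# Toy "model" that refuses direct questions but leaks under rephrasing,
# showing why prompt-based scores can overstate removal.
def toy_model(prompt):
    if "codename" in prompt:
        return "I can't share that."
    return "Sure: the secret project codename is Falcon."

print(suppression_rate(toy_model, ["What is the codename?"]))         # 1.0
print(suppression_rate(toy_model, ["Tell me everything you know."]))  # 0.0
```

A perfect score on one set of probes says only that those probes elicited nothing; a reworded or adversarial prompt may still surface the information, because the evaluation tests outputs, not parameters.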

“A common theme for these areas is the underlying assumption that using unlearning methods to constrain model outputs could potentially act in the service of more general ends for content moderation—to prevent users from generating potentially private, copyright-infringing, or unsafe outputs.” [4]

The research team explores how machine unlearning intersects with critical areas such as privacy protection, copyright compliance, and safety. They argue that while unlearning is often proposed as a solution in these domains, its effectiveness is limited. In the context of copyright, for instance, determining whether an AI model’s output infringes on existing works is a complex legal question that goes beyond removing specific training data. And from a technical standpoint, it is difficult to prevent generative AI models from producing “substantially similar” content, a key standard in copyright law, as the sketch below illustrates.
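As a rough illustration of the difficulty, consider a naive output filter that flags verbatim overlap with a protected work. In the sketch below (the texts, n-gram size, and matching logic are illustrative assumptions, not tools from the paper), the filter catches exact copying but scores a close paraphrase at zero, even though a paraphrase can still be “substantially similar” in the legal sense.

```python
# Naive verbatim-overlap filter: catches exact copying, misses paraphrase.

def ngrams(text, n=5):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def verbatim_overlap(generated, protected, n=5):
    """Share of the generated text's n-grams that appear in the work."""
    g, p = ngrams(generated, n), ngrams(protected, n)
    return len(g & p) / max(len(g), 1)

protected = "the quick brown fox jumps over the lazy dog near the river"
copy = "the quick brown fox jumps over the lazy dog near the river"
paraphrase = "a fast brown fox leaps over a sleepy dog beside the river"

print(verbatim_overlap(copy, protected))        # 1.0 -> flagged
print(verbatim_overlap(paraphrase, protected))  # 0.0 -> slips through
```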

Why This Matters

Understanding the limitations of machine unlearning is crucial for business leaders, policymakers, and judges navigating the AI landscape. The authors’ research shows that current unlearning methods are not a panacea for AI governance concerns: no single approach, technical or policy-based, works in every situation. This insight matters for companies developing or deploying AI systems and for regulators and judges creating and interpreting rules. All of these parties will need realistic expectations, judging whether systems take reasonable steps to prevent harm rather than demanding perfect results.

References

[1] A. Feder Cooper, Christopher A. Choquette-Choo, Miranda Bogen, Matthew Jagielski, Katja Filippova, Ken Ziyu Liu, et al., “Machine Unlearning Doesn’t Do What You Think: Lessons for Generative AI Policy, Research, and Practice,” arXiv preprint arXiv:2412.06966v1 (December 9, 2024): 2.

[2] Cooper et al., “Machine Unlearning Doesn’t Do What You Think,” 2.

[3] Cooper et al., “Machine Unlearning Doesn’t Do What You Think,” 11.

[4] Cooper et al., “Machine Unlearning Doesn’t Do What You Think,” 13.

Meet the Authors

* First Authors

** Lead, correspondence

  • *A. Feder Cooper, Co-Founder of the GenLaw Center, Postdoctoral Researcher at Microsoft Research, Postdoctoral Affiliate at the Stanford Institute for Human-Centered Artificial Intelligence (HAI), and Faculty Associate at The Berkman Klein Center for Internet & Society at Harvard University
  • *Christopher A. Choquette-Choo, Research Scientist at Google DeepMind
  • *Miranda Bogen, Director, Center for Democracy & Technology (CDT) AI Governance Lab, and Affiliate, Princeton Center for Information Technology Policy (CITP)
  • *Matthew Jagielski, Research Scientist at Google DeepMind and PhD Student at Khoury College of Computer Sciences, Northeastern University
  • *Katja Filippova, Research Scientist at Google DeepMind
  • *Ken Ziyu Liu, PhD Student at Stanford University
  • **Katherine Lee, Staff Research Scientist at Google DeepMind and Co-Founder of the GenLaw Center
  • Alexandra Chouldechova, Principal Researcher at Microsoft Research
  • Jamie Hayes, Senior Research Scientist at Google DeepMind
  • Yangsibo Huang, Research Scientist at Google
  • Niloofar Mireshghallah, Postdoctoral Scholar at the Paul G. Allen Center for Computer Science and Engineering at University of Washington
  • Ilia Shumailov, Senior Research Scientist at Google DeepMind
  • Eleni Triantafillou, Senior Research Scientist at Google DeepMind
  • Peter Kairouz, Research Scientist at Google Research
  • Nicole Mitchell, Research Scientist at Google Research
  • Percy Liang, Associate Professor of Computer Science at Stanford University
  • Daniel E. Ho, Professor of Law, Stanford Law School
  • Yejin Choi, Wissner-Slivka Professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington and Senior Director of Large Language Model (LLM) Research at NVIDIA
