Academic medicine is currently experiencing a subtle uneasiness that has nothing to do with hospital budgets or insurance codes. It has to do with memory, and more specifically with the kind of memory that machines are not supposed to have.
Together with partners at the Broad Institute, MIT researchers have begun to pull at a thread that much of the AI healthcare sector would presumably prefer to leave alone. Pull on it, and an unsettling possibility emerges: the foundation models that hospitals are rushing to deploy may quietly remember the individuals whose records trained them.
| Field | Details |
|---|---|
| Study Title | Privacy Risks in Foundation Models Trained on Electronic Health Records |
| Lead Researcher | Sana Tonekaboni, Postdoctoral Researcher |
| Senior Author | Marzyeh Ghassemi, Associate Professor |
| Institution | MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) |
| Affiliated Center | Abdul Latif Jameel Clinic for Machine Learning in Health |
| Presented At | NeurIPS 2025 (Conference on Neural Information Processing Systems) |
| Core Concern | Memorization of patient records by AI foundation models |
| Data Source Studied | De-identified Electronic Health Records (EHRs) |
| Reported Breaches (24 months) | 747 incidents affecting 500+ individuals each, per U.S. Department of Health and Human Services |
| Funding Partners | Schmidt Center, NSF, Gordon and Betty Moore Foundation, Google Research |
| Policy Implication | Standardized, context-specific privacy testing before model deployment |
Sana Tonekaboni, a postdoctoral researcher at the Eric and Wendy Schmidt Center, led the work, which was presented at NeurIPS 2025; Associate Professor Marzyeh Ghassemi served as the study's senior author. Their finding, that AI models trained on de-identified electronic health records can memorize and subsequently reveal patient-specific information, seems technical at first, but its implications reach well beyond technology. It touches a duty that predates computing: the Hippocratic Oath, written long before neural networks were imagined, asks doctors to keep what they hear confidential. That pledge, it turns out, did not account for algorithms.
The foundation models that hospitals are licensing from large AI firms are meant to generalize: extract trends from millions of records, forecast sepsis, identify uncommon illnesses, recommend dosages. That's the marketing, and it's usually true.

However, at some point during training, a model can stop generalizing and begin recalling, retrieving nearly exact lab values for a specific patient when prompted appropriately. This may happen more often than the industry would like to admit, and the MIT team developed a set of practical tests to measure precisely how much information an attacker would need to pull it off.
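The study's own test code is not reproduced here, but a toy sketch can make the idea concrete. The sketch below probes a model for memorization by prompting it with a partial record and checking whether it reproduces a held-out lab value; the `complete()` function, the prompt format, and all patient fields are hypothetical stand-ins, not the MIT team's actual protocol.

```python
# A minimal sketch of a record-extraction probe, assuming a generic
# text-completion interface to the model under audit. All names here
# (complete, LabResult, the prompt format) are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class LabResult:
    test: str
    value: float
    date: str


def complete(prefix: str) -> str:
    """Stand-in for the foundation model's completion call.

    A real audit would query the deployed model; this stub returns a
    canned continuation so the sketch runs end to end.
    """
    return "hemoglobin=7.2 date=2021-03-14"


def extraction_hit(known_prefix: str, true_record: LabResult,
                   tolerance: float = 0.1) -> bool:
    """Return True if the model reproduces the held-out lab value from a
    partial prompt, which suggests memorization rather than generalization."""
    output = complete(known_prefix)
    if true_record.test not in output:
        return False
    try:
        predicted = float(output.split(f"{true_record.test}=")[1].split()[0])
    except (IndexError, ValueError):
        return False
    return abs(predicted - true_record.value) <= tolerance


# Audit loop: vary how much the simulated attacker already knows,
# to estimate the minimum prior knowledge needed for extraction.
record = LabResult("hemoglobin", 7.2, "2021-03-14")
prefixes = [
    "patient: age=67 zip=02139",                     # weak attacker
    "patient: age=67 zip=02139 dx=aplastic_anemia",  # stronger attacker
]
for p in prefixes:
    print(p, "->", "extracted" if extraction_hit(p, record) else "safe")
```

Sweeping over progressively richer prefixes, as the loop does, is one way to operationalize the study's question of how much information an attacker needs before the model gives up the rest.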
Depending on how you read it, the answer is either comforting or alarming. Ghassemi raised a pointed question: if an attacker already knew the dates and results of twelve of your lab tests, why would they bother targeting the model? What they wanted would already be theirs. The real danger arises when tiny details, such as an age, a zip code, or an uncommon condition, are enough to pull an entire record out of the model's hidden weights. The most vulnerable patients are those with rare illnesses: a rare case essentially signs its own name, while a common case blends into the crowd.
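Why rare cases "sign their own name" can be shown with a few lines of counting. This toy example, which uses invented rows rather than any real dataset, tallies how many patients share each combination of quasi-identifiers; a group of size one means that combination alone pinpoints a single person.

```python
# Illustrative only: equivalence-class sizes over quasi-identifiers.
# A class of size 1 (k=1) means the combination uniquely identifies
# one patient, even with names and IDs stripped.

from collections import Counter

# Hypothetical de-identified rows: (age band, zip prefix, diagnosis)
rows = [
    ("60-70", "021", "type 2 diabetes"),
    ("60-70", "021", "type 2 diabetes"),
    ("60-70", "021", "type 2 diabetes"),
    ("60-70", "021", "hypertension"),
    ("60-70", "021", "hypertension"),
    ("60-70", "021", "erdheim-chester disease"),  # rare condition
]

class_sizes = Counter(rows)
for quasi_id, k in class_sizes.items():
    risk = "UNIQUE, re-identifiable" if k == 1 else f"k={k}"
    print(quasi_id, "->", risk)
```

The common diagnoses hide inside groups of two or three, while the rare one sits alone at k=1, which is exactly the asymmetry the researchers flag.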
Walking through the team's logic, the impression is that healthcare AI has rarely been assessed through the lens of privacy; it has been judged almost entirely on accuracy. De-identification has long been treated as a solved problem, and speed-to-deployment has been the dominant pressure. The MIT study argues that this assumption is out of date and that privacy assessment must be context-specific. Revealing someone's gender is one thing; disclosing an HIV status, a drug use history, or a previous pregnancy termination is a different order of harm, and testing procedures have not caught up.
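One way to read "context-specific" is that the same leakage number should be judged against different thresholds depending on how harmful the disclosure is. The sketch below is a guess at what such a policy could look like in code; the attribute tiers, thresholds, and accuracy figures are all invented for illustration and are not the framework the MIT team proposes.

```python
# Hedged sketch: a privacy audit where the pass/fail bar depends on
# the sensitivity of the attribute being inferred. All thresholds and
# numbers below are illustrative assumptions.

SENSITIVITY_THRESHOLDS = {
    "gender": 0.90,        # lower-harm attribute: tolerate more leakage
    "hiv_status": 0.52,    # high-harm: anything above chance fails
    "substance_use": 0.55,
}


def audit(attribute: str, attacker_accuracy: float, baseline: float) -> str:
    """Flag an attribute when a simulated attacker's inference accuracy
    against the model exceeds its context-specific threshold; `baseline`
    is the chance/population accuracy used to report the attacker's lift."""
    limit = SENSITIVITY_THRESHOLDS.get(attribute, 0.60)
    lift = attacker_accuracy - baseline
    verdict = "FAIL" if attacker_accuracy > limit else "pass"
    return f"{attribute}: acc={attacker_accuracy:.2f} (lift {lift:+.2f}) -> {verdict}"


print(audit("gender", 0.88, 0.50))      # tolerated despite large lift
print(audit("hiv_status", 0.58, 0.50))  # flagged despite small lift
```

The point of the design is that an HIV-status inference barely above chance fails the audit while a much more accurate gender inference passes, mirroring the article's claim that not all disclosures are the same form of harm.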
It's difficult to ignore the larger pattern as you watch this develop. Tech companies move quickly while regulators move slowly, and hospitals sit in the middle, desperate for tools that relieve overstretched staff but legally bound to a confidentiality duty that predates the internet. In the last 24 months alone, the Department of Health and Human Services has recorded 747 health data breaches affecting 500 or more individuals each, most of them tied to hacking incidents. Introducing memorization-prone foundation models into that environment without thorough testing feels, frankly, like inviting a new class of breach into the system.
The MIT group is not advocating for a stop. They are advocating for standards: testing frameworks developed collaboratively with medical professionals, privacy specialists, and legal scholars rather than bolted on after the fact. Whether the industry pays attention is another matter. The warning arrives at what may be the last good moment, just before the technology becomes too ingrained for an honest audit. Whether anyone slows down long enough to act remains to be seen.
