New framework for auditing machine unlearning

Researchers have introduced Regularized f-Divergence Kernel Tests, a new framework presented at AISTATS 2026 that is designed to improve how machine unlearning is audited. The work focuses on checking whether an AI system has truly forgotten specific training data without requiring retraining from scratch.

Machine unlearning removes the influence of specific training data from AI systems. It is relevant for regulatory compliance, including GDPR’s “Right to be Forgotten”, as well as for AI safety and model quality. The source notes that as models process larger and more sensitive datasets, verifying unlearning has become a strict requirement: developers must be able to show mathematically that privacy is preserved.

That verification is difficult because auditors often cannot access the model’s internal workings or the original training data. Instead, they have to test the system by querying it and analyzing output samples. One common approach is two-sample testing, which checks whether two sets of observations come from different underlying distributions.

In practice, auditors may compare outputs from a model that never saw a specific record with outputs from a model that supposedly “forgot” it. If the two sets of outputs differ by more than a defined statistical threshold, the unlearning is judged to have failed. But the source says these methods become harder to apply as models grow larger and more complex: they lose statistical power and require many samples, which makes real-world testing computationally expensive.
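To make the comparison concrete, here is a minimal sketch of a generic kernel two-sample test: a maximum mean discrepancy (MMD) statistic with a permutation p-value. This is a standard textbook technique, not the Regularized f-Divergence Kernel Test from the paper. The function names, the Gaussian kernel, the median-heuristic bandwidth, and the synthetic "model outputs" are all assumptions made for illustration.

```python
import numpy as np


def squared_distances(points):
    """Pairwise squared Euclidean distances between the rows of `points`."""
    norms = np.sum(points**2, axis=1)
    sq = norms[:, None] + norms[None, :] - 2.0 * points @ points.T
    return np.maximum(sq, 0.0)  # clip tiny negatives caused by rounding


def mmd_squared(gram, idx_a, idx_b):
    """Biased (V-statistic) estimate of squared MMD from a pooled Gram matrix."""
    k_aa = gram[np.ix_(idx_a, idx_a)].mean()
    k_bb = gram[np.ix_(idx_b, idx_b)].mean()
    k_ab = gram[np.ix_(idx_a, idx_b)].mean()
    return k_aa + k_bb - 2.0 * k_ab


def mmd_permutation_test(retrained_out, unlearned_out, num_permutations=200, seed=0):
    """Return (statistic, p_value) for H0: both samples share one distribution.

    Both inputs are 2-D arrays of shape (num_samples, num_features).
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([retrained_out, unlearned_out])
    n_a, n_total = len(retrained_out), len(pooled)

    sq = squared_distances(pooled)
    # Assumption: Gaussian kernel with the median heuristic for the bandwidth.
    bandwidth_sq = np.median(sq[np.triu_indices(n_total, k=1)])
    if bandwidth_sq == 0.0:
        bandwidth_sq = 1.0  # fallback when all pooled points coincide
    gram = np.exp(-sq / (2.0 * bandwidth_sq))

    indices = np.arange(n_total)
    observed = mmd_squared(gram, indices[:n_a], indices[n_a:])

    # Shuffle the sample labels to simulate the statistic under H0.
    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(n_total)
        if mmd_squared(gram, perm[:n_a], perm[n_a:]) >= observed:
            exceed += 1

    # The +1 correction keeps the p-value valid at any sample size.
    p_value = (1 + exceed) / (1 + num_permutations)
    return observed, p_value


# Synthetic stand-ins for model outputs (hypothetical data, not real audits).
demo_rng = np.random.default_rng(1)
reference_outputs = demo_rng.normal(0.0, 1.0, size=(100, 2))
suspect_outputs = demo_rng.normal(1.5, 1.0, size=(100, 2))
statistic, p_value = mmd_permutation_test(reference_outputs, suspect_outputs)
print(f"MMD^2 = {statistic:.4f}, p-value = {p_value:.4f}")
```

An auditor using a test of this kind would reject the claim of successful unlearning when the p-value falls below the chosen significance level. The permutation loop is also where the cost grows: larger samples mean a larger Gram matrix and slower resampling.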

The new framework is intended to make auditing of ML models more sensitive, flexible, and accurate. The source says the tests control the false positive rate at any sample size, and that the false negative rate converges to zero as the number of available samples increases.
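In standard hypothesis-testing notation, the two guarantees described above can be written as follows. The symbols are assumptions for illustration and are not taken from the source: H_0 is the null hypothesis that both sets of outputs come from the same distribution, H_1 is the alternative that they differ, \alpha is the chosen significance level, and n is the number of samples.

```latex
\Pr_{H_0}\bigl(\text{reject } H_0\bigr) \le \alpha \quad \text{for every sample size } n,
\qquad
\Pr_{H_1}\bigl(\text{fail to reject } H_0\bigr) \to 0 \quad \text{as } n \to \infty .
```

The first statement bounds false positives without relying on large-sample approximations. The second says the test eventually detects any real difference once it has enough data.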

Source: research.google.
