Microsoft’s Copilot AI assistant has accidentally revealed information from over 20,000 private GitHub repositories. These repositories belong to major companies like Google, Intel, and even Microsoft itself.
Originally, these repositories were public, but developers made them private after noticing they contained sensitive information, such as authentication credentials. Despite the change, the data remains accessible through Copilot.
AI security firm Lasso uncovered the issue in late 2024. After finding in January that Copilot could still access these now-private repositories, Lasso set out to assess the full scope of the problem.
Zombie Repositories
Lasso researchers Ophir Dror and Bar Lanyado noted, “We were shocked to see how easily this data could be accessed. Even if a repository was public for only a short time, tools like Copilot can index it.” They focused on identifying “zombie repositories”: repositories that were once public but are now private.
After finding that Copilot was even exposing one of Lasso’s own private repositories, the researchers traced the issue back to Bing’s caching system. The search engine had indexed the repositories while they were public and did not remove the cached entries when they were later made private. Since Copilot uses Bing for searches, this outdated information remained accessible.
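Lasso has not published its exact tooling, but the basic check behind hunting for zombie repositories is easy to sketch. The snippet below is a minimal, hypothetical illustration, not Lasso’s code: it takes repository names recovered from indexed search results (the placeholder list and the `is_zombie` helper are assumptions for this example) and asks GitHub’s public API whether each one is still visible. An unauthenticated 404 suggests the repository has since been made private or deleted, even though a cached copy may still exist elsewhere.

```python
import requests

# Placeholder candidates; in practice these would come from repository URLs
# found in indexed or cached search results.
CANDIDATE_REPOS = [
    "example-org/example-repo",
    "another-org/another-repo",
]

def is_zombie(full_name: str) -> bool:
    """Return True if the repository is no longer publicly visible on GitHub.

    GitHub's REST API answers 404 to unauthenticated callers for private or
    deleted repositories, so a 404 here marks a likely "zombie" repository.
    """
    resp = requests.get(f"https://api.github.com/repos/{full_name}", timeout=10)
    return resp.status_code == 404

for repo in CANDIDATE_REPOS:
    status = "zombie (now private or deleted)" if is_zombie(repo) else "still public"
    print(f"{repo}: {status}")
```

A check like this only tells you a repository has disappeared from public view; the exposure Lasso describes comes from the cached copies that search-backed tools such as Copilot can still draw on afterward.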
After Lasso reported the issue in November, Microsoft took steps to correct it and confirmed that private data was no longer being served from Bing’s cache. However, Lasso made another startling discovery: Copilot still provided access to a repository that had been removed from GitHub as a result of a lawsuit. The repository allegedly contained tools designed to bypass safety measures in Microsoft’s AI offerings. Even though it was no longer on GitHub, its contents remained available through Copilot.