Microsoft’s AI researchers inadvertently exposed dozens of terabytes of sensitive data, including private keys and passwords, when they published open-source training data on GitHub.
GitHub users were instructed to download the models from an Azure Storage URL, but the security firm Wiz discovered that this URL provided access to the entire storage account, resulting in the exposure of private data. The exposure amounted to 38 terabytes of sensitive information, including backups of two Microsoft employees’ personal computers, passwords for Microsoft services, keys, and over 30,000 internal Microsoft Teams messages from hundreds of Microsoft employees.
The URL was misconfigured and allowed “full control,” which meant that anyone who knew where to look could delete, replace, and insert malicious content into the stored data. Microsoft is said to have fixed the issue after being informed by Wiz, and no customer data is reported to have been exposed. The company has now increased monitoring of what it uploads to GitHub to detect the exposure of passwords and other secrets.
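A pre-publication check for the kind of misconfiguration described above, written as an added example (not Microsoft's or Wiz's code). It assumes the shared URL carried an Azure shared access signature in its query string, which the article does not state, and it flags account-wide scope, permissions beyond read, and long lifetimes; the account name, paths, signature and the seven-day limit are constructed placeholders.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# SAS "sp" letters that let the holder change or remove data.
MUTATING = {"w": "write", "d": "delete", "a": "add", "c": "create",
            "u": "update", "x": "delete version", "y": "permanent delete"}


def parse_expiry(value):
    """Parse the SAS 'se' field (ISO 8601, UTC) into an aware datetime."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def audit_sas_url(url, now=None, max_lifetime=timedelta(days=7)):
    """Return {finding_code: explanation} for a storage URL about to be published."""
    now = now or datetime.now(timezone.utc)
    query = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    findings = {}
    if "sig" not in query:
        return findings  # no shared access signature, nothing to audit here
    if "srt" in query:  # srt/ss appear only on account-level signatures
        findings["ACCOUNT_SCOPE"] = "signature covers the storage account, not one container or blob"
    granted = [name for letter, name in MUTATING.items() if letter in query.get("sp", "")]
    if granted:
        findings["MUTATING"] = "download link also grants: " + ", ".join(granted)
    if "se" not in query:
        findings["NO_EXPIRY_IN_URL"] = "expiry not in URL; check the stored access policy"
    elif parse_expiry(query["se"]) - now > max_lifetime:
        findings["LONG_LIVED"] = "valid until " + query["se"]
    return findings


# Constructed sample URLs; account name, paths and signature are placeholders.
BROAD = ("https://exampleacct.blob.core.windows.net/models?sv=2021-08-06&ss=b"
         "&srt=sco&sp=rwdlacu&se=2040-01-01T00:00:00Z&sig=CONSTRUCTED")
NARROW = ("https://exampleacct.blob.core.windows.net/models/model.ckpt"
          "?sv=2021-08-06&sr=b&sp=r&se=2024-01-03T00:00:00Z&sig=CONSTRUCTED")

if __name__ == "__main__":
    fixed_now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for label, url in (("broad", BROAD), ("narrow", NARROW)):
        print(label, audit_sas_url(url, now=fixed_now))
```

Run against every storage link in a repository before it is pushed, a check like this fails the broad URL on all three counts and passes the read-only, single-blob, short-lived one.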