
Microsoft AI researchers unintentionally exposed terabytes of internal sensitive data


Microsoft AI researchers unintentionally exposed tens of terabytes of sensitive data, including private keys and passwords, while publishing a storage bucket of open source training data on GitHub.

In research shared with TechCrunch, cloud security startup Wiz said it discovered a GitHub repository belonging to Microsoft’s AI research division as a part of its ongoing work into the accidental exposure of cloud-hosted data.

Readers of the GitHub repository, which provided open source code and AI models for image recognition, were instructed to download the models from an Azure Storage URL. However, Wiz found that this URL was configured to grant permissions on the entire storage account, exposing additional private data by mistake.

This data included 38 terabytes of sensitive information, including the private backups of two Microsoft employees’ personal computers. The data also contained other sensitive personal information, including passwords to Microsoft services, secret keys and more than 30,000 internal Microsoft Teams messages from hundreds of Microsoft employees.

The URL, which had exposed this data since 2020, was also misconfigured to allow “full control” rather than “read-only” permissions, according to Wiz, which meant anyone who knew where to look could potentially delete, replace and inject malicious content into the stored files.

Wiz notes that the storage account wasn’t directly exposed. Rather, the Microsoft AI developers included an overly permissive shared access signature (SAS) token in the URL. SAS tokens are a mechanism used by Azure that allows users to create shareable links granting access to an Azure Storage account’s data.
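To make the mechanism concrete, here is a minimal, illustrative sketch of how a signed, scoped, expiring link of this kind works in general: the server signs the permission string and expiry with a secret key, so the URL itself carries a verifiable grant. This is not Azure’s actual SAS format or signing scheme; the field names (`sp`, `se`, `sig`) merely echo SAS query-parameter conventions, and the key and URL are hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

# Hypothetical server-side signing key; in Azure this role is played
# by the storage account key.
SECRET_KEY = b"server-side-signing-key"

def make_signed_url(base_url: str, permissions: str, ttl_seconds: int) -> str:
    """Build a URL carrying a signed grant of `permissions` until expiry."""
    expiry = int(time.time()) + ttl_seconds
    payload = f"{permissions}|{expiry}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    query = urlencode({"sp": permissions, "se": expiry, "sig": signature})
    return f"{base_url}?{query}"

def verify(permissions: str, expiry: int, signature: str) -> bool:
    """Check that the claimed permissions/expiry match the signature."""
    if time.time() > expiry:
        return False  # token expired
    payload = f"{permissions}|{expiry}".encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

# A read-only link ("sp=r") and a full-control one ("sp=rwd") differ
# only in the signed permission string -- which is why publishing a
# token signed with overly broad permissions is so dangerous.
url = make_signed_url("https://example.blob.core.windows.net/models", "r", 3600)
```

The signature binds the permissions and expiry together, so a recipient cannot upgrade a read-only token to full control without the key; the danger in this incident came from the token being *issued* with full-control scope and a long lifetime in the first place.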

“AI unlocks huge potential for tech companies,” Wiz co-founder and CTO Ami Luttwak told TechCrunch. “However, as data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards. With many development teams needing to manipulate massive amounts of data, share it with their peers or collaborate on public open source projects, cases like Microsoft’s are increasingly hard to monitor and avoid.”

Wiz said it shared its findings with Microsoft on June 22, and Microsoft revoked the SAS token two days later, on June 24. Microsoft said it completed its investigation into potential organizational impact on August 16.

In a blog post shared with TechCrunch before publication, Microsoft’s Security Response Center said that “no customer data was exposed, and no other internal services were put at risk because of this issue.”

Microsoft said that as a result of Wiz’s research, it has expanded GitHub’s secret scanning service, which monitors all public open source code changes for plaintext exposure of credentials and other secrets, to include any SAS tokens that may have overly permissive expirations or privileges.
