No, Microsoft isn’t using your Office docs to train its AI

But people are right to be worried about companies scraping user data without permission.

by Jess Weatherbed

Nov 27, 2024, 11:15 AM UTC

Jess Weatherbed is a news writer focused on creative industries, computing, and internet culture. Jess started her career at TechRadar, covering news and hardware reviews.

Microsoft says it isn’t using customer data from its Microsoft 365 apps to train its AI models. The clarification addresses reports circulating online in the last few weeks claiming Microsoft required Word and Excel users to opt out of training the company’s AI systems.

The confusion arose from a privacy setting in Microsoft Office that toggles “optional connected experiences,” a feature that helps users “search for online pictures” or “find information available online,” according to Microsoft. The toggle is switched on by default, and its disclosure does not mention AI training. A Microsoft learning document posted on October 21st, 2024 also seems to have contributed to the confusion: it describes a long list of connected experiences in Office that “analyze your content” without explicitly excluding the training of large language models (LLMs).

“In the M365 apps, we do not use customer data to train LLMs,” the Microsoft 365 X account said, responding to claims. “This setting only enables features requiring internet access like co-authoring a document.” Microsoft’s communications head Frank Shaw also chimed in on Bluesky to debunk the claims.

Adobe faced a similar backlash earlier this year after its user terms were widely misinterpreted to mean the company was training generative AI on the work of its users. Adobe swiftly updated the language in its terms of service to clarify this wasn’t the case.

The Adobe and Microsoft incidents suggest that people are increasingly concerned about tech companies using their personal data to train AI models without express permission. The concern is understandable: companies like Meta, X, and Google opt their users into AI training by default, and vast quantities of online content are being scraped for that purpose.
