GitHub added a new Section J to its Terms of Service this week. It says that starting April 24, inputs, outputs, code snippets, and associated context from Copilot Free, Pro, and Pro+ users will be used to train AI models, unless those users manually opt out.
Notice the carve-out: Business and Enterprise customers are excluded. So are students and teachers accessing Copilot Pro for free. If your GitHub account is a member of, or an outside collaborator on, a paid organization, your interaction data is excluded from training too, even if you're on a personal Copilot plan. The protection follows the org relationship, not just the subscription tier.
The people who pay the most get contractual data protections. Everyone else gets a default toggle flipped to "on" and 30 days' notice.
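Read as a decision procedure, the exclusions above come down to a few checks. The sketch below is only an illustration of the rules as this article describes them, not GitHub's implementation: the CopilotUser type, its field names, and the used_for_training function are all hypothetical, and the real eligibility logic may include conditions not covered here.

```python
from dataclasses import dataclass

# Plans the article says are subject to training by default.
# The lowercase names are an assumption made for this sketch.
TRAINING_ELIGIBLE_PLANS = {"free", "pro", "pro+"}


@dataclass
class CopilotUser:
    """Hypothetical model of the account attributes the article mentions."""
    plan: str                            # "free", "pro", "pro+", "business", "enterprise"
    free_education_access: bool = False  # student or teacher getting Copilot Pro for free
    in_paid_org: bool = False            # member or outside collaborator of a paid organization
    opted_out: bool = False              # user manually turned the training toggle off


def used_for_training(user: CopilotUser) -> bool:
    """Return True if, per the article's description, interaction data would be used."""
    if user.plan.lower() not in TRAINING_ELIGIBLE_PLANS:
        return False  # Business and Enterprise are excluded
    if user.free_education_access:
        return False  # students and teachers on free Copilot Pro are excluded
    if user.in_paid_org:
        return False  # the org relationship overrides the personal plan
    return not user.opted_out  # otherwise the default is "on" unless the user opts out
```

The last line is the one the article is about: for everyone who falls through the exclusions, data is used unless the user has acted.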
What GitHub is actually collecting
The blog post from Mario Rodriguez, GitHub's Chief Product Officer, lays out the scope. The interaction data collected includes prompts sent to Copilot, generated suggestions, accepted or modified outputs, code context around your cursor, comments and documentation, file names, repository structure, navigation patterns, and thumbs up/down feedback.
This is not metadata. This is the substance of how developers write code.
Here's the part that redefines "private": if you have model training enabled and you're actively using Copilot in a private repository, code snippets from that repo can be collected during your session. GitHub's FAQ clarifies they don't pull code from private repos "at rest," but that distinction matters less than it sounds. If Copilot is running while you code, your private repo content is in play.
The legal scaffolding
The updated Terms of Service are doing real structural work here. The new Section J consolidates all AI-related terms and creates an explicit license grant: unless you opt out, you're granting GitHub and its affiliates (read: Microsoft) the right to collect and use your inputs and outputs to develop, train, and improve AI models.
The Privacy Statement update is equally pointed. For users in the EEA and UK, GitHub is claiming "legitimate interest" as the lawful basis for processing data for AI development. That's a GDPR mechanism that lets companies process personal data without explicit consent, provided that interest is not overridden by the user's own interests or fundamental rights. It's legal. It's also the kind of basis that data protection authorities tend to scrutinize when it involves large-scale data collection from millions of users.
GitHub also expanded data sharing with affiliates. The Privacy Statement now allows Microsoft to use shared data for "developing and improving artificial intelligence and machine learning technologies." The company says third-party AI model providers don't get access. But Microsoft isn't a third party here. Microsoft is the parent company that owns GitHub and builds the models Copilot runs on.



