Your org uses Copilot. GitHub just changed the data rules. Now what?
On March 25, GitHub added Section J to its Terms of Service, a new provision consolidating all AI-related data terms into one place. The policy grants GitHub and its affiliates a license to use Copilot interaction data for model training, with collection starting April 24, 2026. The default for Free, Pro, and Pro+ users is opted in. Business and Enterprise customers are contractually excluded.
The news coverage has been wall-to-wall. What it hasn't covered well: the actual decision tree for engineering orgs that need to figure out their exposure, verify their settings, and decide whether the opt-out toggle is sufficient for their threat model. That is what this guide covers.
Mapping the data surface
Before you toggle anything, understand what GitHub is actually collecting. GitHub's official blog post from Mario Rodriguez (GitHub CPO) lists the interaction data categories:
- Inputs sent to Copilot, including code snippets shown to the model
- Code context surrounding cursor position
- Comments and documentation written by the developer
- File names, repository structure, and navigation patterns
- Interactions with Copilot features (chat, inline suggestions)
- Feedback signals (thumbs up/down ratings)
- Outputs accepted or modified by the user
This data may be shared with GitHub affiliates (Microsoft). It will not go to third-party AI model providers, per GitHub's FAQ.
The critical distinction that keeps getting glossed over: private repository code "at rest" is not used for training. But code from private repos sent as context during active Copilot sessions qualifies as interaction data. GitHub's FAQ spells it out: "code snippets from private repositories can be collected and used for model training while the user is actively engaged with Copilot while working in that repository."
So "private repo code is safe" is only half true. The code sitting on disk is safe. The code that flows through Copilot while you type is not, unless you opt out or are on a Business/Enterprise plan.
The tier matrix
Not every Copilot plan is affected equally. Here is the breakdown based on GitHub's changelog and FAQ:
Training-eligible (opt-out required): Copilot Free, Copilot Pro, Copilot Pro+
Contractually excluded: Copilot Business, Copilot Enterprise, students/teachers on free Copilot Pro access
The edge case that matters: If a developer's GitHub account is a member of or outside collaborator with a paid organization, GitHub excludes their interaction data from training regardless of their personal subscription tier. This is from GitHub's FAQ: "If a user's GitHub account is a member of or outside collaborator with a paid organization, we exclude their interaction data from model training."
This means org membership itself is a data protection boundary. But it also means developers who are not org members and use personal Copilot accounts in your codebase fall outside that boundary.
The four-point audit
1. Confirm your org's Copilot tier
Open your GitHub organization's admin settings. If you are on Copilot Business or Enterprise, you are contractually excluded from Section J's training provisions. A quick confirmation: the training opt-out toggle does not even appear in Business/Enterprise org-level settings, as documented by DevelopersIO.
If your org uses individual Pro or Pro+ subscriptions instead of an org-managed Business plan, you are exposed. Every developer account needs individual opt-out action.
2. Hunt for shadow subscriptions
This is the risk most orgs miss. A developer had Copilot Pro before you rolled out Enterprise seats. They never switched over. Their personal account is not an org member. They push to your repos daily.
Their interaction data from sessions in your codebase? Training-eligible.
Audit: cross-reference your org member list with Copilot seat assignments. Any contributor working in your repos who is not on an org-managed seat needs attention.
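The cross-reference itself is a set difference. A minimal sketch, assuming you have already exported the two lists (for example via GitHub's REST endpoints `GET /orgs/{org}/members` and `GET /orgs/{org}/copilot/billing/seats`); the function name and the sample usernames are illustrative:

```python
def find_unmanaged_contributors(contributors, seat_assignees):
    """Return contributors who push to your repos but hold no
    org-managed Copilot seat -- the shadow-subscription risk."""
    return sorted(set(contributors) - set(seat_assignees))

# Hypothetical data: recent committers vs. org-managed seat holders
contributors = ["alice", "bob", "carol"]
seat_assignees = ["alice", "carol"]

print(find_unmanaged_contributors(contributors, seat_assignees))
# -> ['bob']  (pushes daily, but not on an org-managed seat)
```

Anyone this surfaces needs either an org-provisioned seat or a verified personal opt-out before April 24.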
3. Verify individual opt-out status
For accounts on Free/Pro/Pro+: navigate to github.com/settings/copilot. Under "Privacy," find "Allow GitHub to use my data for AI model training" and confirm it is disabled; leaving it enabled means you are opted in. The Register notes the setting is also accessible at /settings/copilot/features.
If a developer previously disabled the older "prompt and suggestion collection" setting, that preference carries forward automatically. No re-action needed.
4. Understand the opt-out boundaries
Opting out stops future collection. It applies to both GitHub and affiliates. But it does not:
- Purge data collected before the opt-out
- Prevent GitHub from processing code context to run the service (Copilot must see your code to suggest completions)
- Undo the original training corpus: as The Register noted, OpenAI's Codex was "a GPT language model fine-tuned on publicly available code from GitHub"
Decision tree: what to actually do
Your move depends on your compliance requirements and threat model.
Already on Business/Enterprise? Section J does not apply to org-managed seats. Your action item: audit for shadow personal subscriptions (point 2 above). Enforce that all developers contributing to your repos are org members using org-provisioned Copilot seats.
Running individual Pro/Pro+ subscriptions across a team? Two paths:
Path A, opt out and stay: Each developer toggles the setting individually. Cost: $0. Risk: there is no org-level enforcement. You are trusting every developer to opt out, and any new hire defaults to opted in.
Path B, upgrade to Business: Moves your team under the enterprise contract with a Data Protection Agreement. The cost delta (roughly $10/seat/month more than Pro) buys contractual certainty instead of hoping everyone found the toggle. For most engineering orgs, this is the obvious answer.
Zero-egress compliance requirement? Neither the toggle nor the enterprise contract prevents GitHub from processing your code server-side during Copilot sessions. The opt-out only prevents training use. If your compliance framework requires that proprietary code never leaves your infrastructure, evaluate:
- Self-hosted code completion via Ollama, vLLM, or llama.cpp running CodeLlama, StarCoder, or DeepSeek Coder. Full air gap. You own the GPU costs.
- Tabnine Enterprise, which offers VPC-deployed models.
- Continue.dev, an open-source IDE extension that connects to local model servers.
- Amazon Q Developer, governed by your AWS agreement instead of GitHub's ToS, though still cloud-hosted.
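For the self-hosted path, completions stay on your network when the editor extension points at a local inference server. As a sketch, here is a minimal client for Ollama's documented `/api/generate` endpoint, assuming a local Ollama instance on its default port 11434 with a `codellama` model pulled; the helper names are illustrative:

```python
import json
import urllib.request

# Local-only endpoint: completion requests never leave your infrastructure
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt, model="codellama"):
    """Assemble the request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def complete(prompt):
    """Send a completion request to the local model server."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running local Ollama server):
# complete("def fibonacci(n):")
```

Continue.dev and similar extensions wire this up for you inside the IDE; the point is that the only network hop is to localhost.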
When NOT to rely on the opt-out
The toggle is a settings preference, not a contractual guarantee for individual-tier accounts.
Do not treat it as sufficient when:
- Your industry requires Data Processing Agreements (HIPAA, SOX, FedRAMP, ITAR). Toggles do not satisfy auditors. You need the Business/Enterprise DPA or self-hosting.
- Developers work across personal and org accounts. The org-level protections only cover sessions where the developer is authenticated as an org member using an org-managed seat.
- Your IP is your product. Even with opt-out, code context flows to GitHub's servers during active sessions. Opting out prevents training, not processing.
- You need retroactive deletion guarantees. The policy says future collection stops. It makes no clear commitment on purging previously collected interaction data from pipelines already in flight.
How the community responded
The Register reported that the GitHub community discussion had 59 thumbs-down emoji votes and 3 rocket ships. Of 39 posts, only Martin Woodward (GitHub VP of Developer Relations) endorsed the policy. GitHub pointed to similar opt-out policies at Anthropic, JetBrains, and Microsoft as precedent.
Your calendar
- Now through April 23: Run the four-point audit above
- Before April 24: Ensure every affected developer has toggled the opt-out at github.com/settings/copilot, or upgrade to Business
- April 24: Collection begins for accounts that have not opted out
- Ongoing: New hires default to opted in. Build this check into your onboarding
GitHub's stated rationale: internal testing with Microsoft employee data produced measurable improvements in suggestion acceptance rates, per Rodriguez's blog post. Whether that tradeoff works for your org depends on what code you are protecting and who is asking about it at your next compliance review.
Nate Hargrove covers guides for The Daily Vibe.



