Getting started with gstack: how to set up Garry Tan's open-source AI coding factory

Garry Tan has been running Y Combinator while reportedly shipping more code than most full-time engineers. In the GitHub README for gstack, he claims 600,000+ lines of production code in 60 days (35% tests), generating 10,000-20,000 lines per day, part-time. Either the numbers are inflated or his toolkit is doing something worth paying attention to.

gstack, which Tan open-sourced in March 2026 at garrytan/gstack, is a collection of workflow skills that plug into Claude Code. It doesn't replace Claude Code or add a new model layer. Instead, it imposes structure on how you interact with it: separate slash commands for planning, code review, browser-based QA, and shipping. A generalist AI assistant becomes a set of specialists, each activated on demand.

The repo has grown fast. According to gstacks.org, it has over 16,000 GitHub stars and 1,800 forks. The toolkit now lists 15+ commands total, though the original launch centered on 8 core workflow skills. Here's how to get it running and what it actually does.

What you need before you start

gstack has a short dependency list, but a few items might catch you off guard:

Claude Code -- required, obviously
Git -- standard
Bun v1.0+ -- this is the one people miss. Bun is a fast JavaScript runtime (not Node). gstack uses it for compiled binaries, native SQLite access (for reading Chromium's cookie database), and its built-in HTTP server. Install at bun.sh if you don't have it.
Node.js -- Windows only

If you're on macOS or Linux with Claude Code already set up, the only probable blocker is Bun. Check with bun --version before you proceed.
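A quick way to run that check for all of the command-line prerequisites at once is sketched below. It assumes the Claude Code executable is named `claude`; the `check_tool` helper is a name made up for this example.

```shell
# Report whether a command is available on PATH.
# Prints "ok: <name>" or "missing: <name>" and returns non-zero when missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# `claude` is assumed to be the Claude Code executable name.
for tool in git bun claude; do
  check_tool "$tool" || true
done
```

If Bun shows up as missing, install it first. If it is present, `bun --version` should report 1.0 or later.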

Installing gstack (the two-command approach)

Installation works by pasting a prompt directly into Claude Code. You open Claude Code in any project, paste the install command from the gstack GitHub README, and Claude handles the rest: cloning into ~/.claude/skills/gstack, running ./setup (which compiles the browse binary and registers skills), and updating your CLAUDE.md file.

For team or project-level installs, a second command copies gstack into .claude/skills/gstack within your repo so teammates get the same setup on git clone. Everything lives inside .claude/. Nothing touches your PATH or runs in the background.
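To confirm where gstack ended up, you can check the two directories named above. This is a minimal sketch: the `gstack_locations` function is hypothetical, and it only reports which directories exist, not how Claude Code chooses between them.

```shell
# Print "global" and/or "project" depending on which install directories
# exist, or "none" if neither does. Takes the project directory as $1.
gstack_locations() {
  project_dir="${1:-.}"
  found=0
  if [ -d "$HOME/.claude/skills/gstack" ]; then
    echo "global"
    found=1
  fi
  if [ -d "$project_dir/.claude/skills/gstack" ]; then
    echo "project"
    found=1
  fi
  [ "$found" -eq 1 ] || echo "none"
}

gstack_locations .
```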

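The install command in the README does a shallow clone (git clone --depth 1), which fetches only the latest commit. If you plan to contribute to the repo or need full git history, drop the --depth 1 flag and do a standard clone instead.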
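If you have already installed from a shallow clone, you don't need to start over. The sketch below shows a full clone and an in-place conversion using standard git commands. The repository URL is inferred from the garrytan/gstack name and is an assumption, so confirm it against the README. The function names are made up for this example.

```shell
# Full-history clone into the directory given as $1 (no --depth flag).
# The URL is an assumption inferred from the repo name garrytan/gstack.
clone_full() {
  git clone https://github.com/garrytan/gstack.git "$1"
}

# Convert an existing shallow clone at $1 into a full clone in place.
unshallow() {
  if [ "$(git -C "$1" rev-parse --is-shallow-repository)" = "true" ]; then
    git -C "$1" fetch --unshallow
  else
    echo "already full history: $1"
  fi
}

# Example: unshallow "$HOME/.claude/skills/gstack"
```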

The 8 core skills and what they actually do

When gstack launched, it shipped 8 main slash commands. The toolkit has expanded since, but the original 8 are where most of the value lives.

Planning:

/plan-ceo-review is product-level planning. It challenges your feature framing, asks what the product is actually for (not just how to add the obvious feature), and identifies the larger product hiding inside your request. Think of it as the question a good founder would ask before any engineering starts.

/plan-eng-review handles architecture. It generates sequence diagrams, state diagrams, and data-flow charts, and surfaces failure modes, edge cases, and required test coverage before a line gets written.

Code review:

/review is focused on production risk. Per the MarkTechPost analysis of the repo, it looks for N+1 queries, race conditions, trust boundary violations, missing indexes, and broken retry logic. It also triages Greptile code review comments automatically if your team uses Greptile.

Shipping:

/ship handles release hygiene: sync main, run tests, resolve review issues, push the branch, open a PR. This is for a ready branch, not for deciding what to build.
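For comparison, here is roughly what those steps look like when done by hand. This is not gstack's implementation. It assumes a remote named origin with a main branch, `bun test` as the test runner, and the GitHub CLI (`gh`) for opening the PR. By default it only prints the commands; set RUN=1 to execute them.

```shell
# Print each command; execute it as well only when RUN=1.
step() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# Manual equivalent of the release steps, stopping at the first failure.
ship_by_hand() {
  step git fetch origin main &&
  step git merge origin/main &&
  step bun test &&
  step git push -u origin HEAD &&
  step gh pr create --fill
}

ship_by_hand
```

Resolving review issues is left out because it isn't something a script can do for you.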

Browser automation:

/browse gives Claude Code a live browser. It logs in, clicks through your app, takes screenshots, reads console errors, and catches breakage. The key technical piece is that this isn't a fresh Chromium instance on every call. gstack runs a persistent headless Chromium daemon over localhost HTTP. Per the MarkTechPost analysis, cold start runs a few seconds; subsequent commands run in roughly 100-200ms. Login state, cookies, tabs, and localStorage all persist across commands. The daemon shuts down after 30 minutes idle.

/setup-browser-cookies imports cookies from your local browser into the headless session so the agent can access authenticated pages without re-logging in every time.

QA:

/qa analyzes the branch diff, identifies the affected routes, and runs browser-based tests against those specific flows, not a generic full-app scan. The repo's example shows /qa inspecting 8 changed files and 3 affected routes, then testing those routes against a local app instance. The goal is tying source changes to actual application behavior.
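The diff-to-routes idea can be approximated with plain git. The sketch below is not how gstack does it. It assumes the base branch is main and uses a hypothetical convention where a file at pages/<path>.tsx is served at /<path>. Both function names are made up for illustration.

```shell
# Files changed on the current branch since it diverged from the base branch.
changed_files() {
  git diff --name-only "${1:-main}...HEAD"
}

# Hypothetical mapping: pages/<path>.tsx is served at /<path>.
# Changed files outside pages/ produce no route.
affected_routes() {
  changed_files "$1" | sed -n 's#^pages/\(.*\)\.tsx$#/\1#p'
}

# Example, run inside a repo on a feature branch: affected_routes main
```

In a real app the mapping depends on your framework's router, which is the part /qa has to infer from the codebase.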

Retrospective:

/retro pulls commit history and summarizes what shipped, what broke, and patterns worth noting across a development cycle.
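The raw material for a retrospective is available from git directly. A minimal sketch, with hypothetical function names and a time window you choose yourself:

```shell
# One line per commit in the window: short hash, author, subject.
commits_since() {
  git log --since="$1" --format='%h %an: %s'
}

# Commit counts per author over the same window, highest first.
authors_since() {
  git shortlog -sn --since="$1" HEAD
}

# Example, run inside a repo: commits_since "2 weeks ago"
```

This gives you the list of what landed. The summary of what broke and which patterns matter is the part /retro adds on top.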

Since launch, Tan has added many more commands, including /office-hours (founder-style product challenge), /cso (OWASP and STRIDE security audits), /canary, /design-consultation, and safety hooks like /careful, /freeze, and /guard for controlling how aggressively Claude executes changes.

When to use it vs vanilla Claude Code

Plain Claude Code works fine for isolated tasks: write a function, explain an error, generate a test. Problems emerge when you're managing a feature from concept through deployment. Context bleeds between planning and implementation. Code review depth varies. You re-explain context on every prompt.

gstack's core value is explicit mode-switching. /plan-ceo-review runs a different mental model than /review. Planning challenges whether you're building the right thing. Review assumes you are and checks whether you'll regret it in production. Keeping those separate stops planning and execution from blurring together in the same conversation thread.

For a solo developer or small technical team shipping features regularly, the structure makes sense. For one-off scripts, quick documentation, or exploratory prototyping, the overhead probably isn't worth it.

gstack adds least value when:

You don't have a running app to point it at, such as a staging URL or a local instance (the QA skills need something to hit)
You're Windows-only without WSL (browser binary support is macOS and Linux, x64 and arm64)
No one on your team is willing to maintain the CLAUDE.md configuration

Honest limitations

Tan himself has described the /office-hours skill as "only a 10% strength version of what a real YC partner can do for you," per discussion in the r/ycombinator subreddit. That's a useful framing for the whole toolkit: these skills impose structure, they don't substitute for engineering judgment.

The 600K lines / 10-20K lines per day figures come from Tan's own README and public posts. They're unverified, and lines of code is a notoriously poor proxy for meaningful output. High LOC numbers often include generated boilerplate, tests, and autogenerated files. Read the claim as evidence that structured AI workflows can enable high velocity, not as a validated productivity benchmark.

A few practical constraints worth knowing up front:

Bun is a real dependency, not optional. If your team's tooling doesn't include Bun, the browser features won't work.

The browser binary is macOS and Linux only. Windows users get planning, review, and ship skills but not the browser-based QA workflow.

There's a legitimate concern about feedback loops. Community discussion on Reddit has noted that a workflow asking Claude to review Claude's earlier output has a sycophancy risk baked in. The /review skill is only as useful as Claude's willingness to push back on its own prior work. The structured prompts in each skill help mitigate this, but don't treat /review output as equivalent to a senior engineer's review.

gstack won't convert a junior engineer into a senior one. The planning and review skills surface the right questions. Answering them well still requires domain knowledge.

Getting started today

The lowest-friction path: install globally, try it on a single feature branch, and evaluate before committing to team adoption.

1. Install Bun if needed: curl -fsSL https://bun.sh/install | bash
2. Open Claude Code in any project
3. Paste the install prompt from the gstack GitHub README
4. Run /office-hours and describe what you're building
5. Run /plan-ceo-review on a feature you've been putting off
6. After making changes, run /review before opening your PR

The feedback loop across steps 4-6 is the actual product. Whether it's worth adding to your team's workflow depends on how disciplined your current Claude Code usage already is. If you're already using Claude separately for planning vs. implementation, gstack formalizes something you're doing manually. If you're vibe-coding in a single long conversation, gstack forces you to slow down and separate modes, which is either exactly what you need or a constraint you'll resent.

MIT license, free to fork, actively maintained. Low bar to find out which camp you're in.

Sage Thornton covers guides and developer tools for The Daily Vibe.
