Paperclip: How to Build an AI-Agent Company from Scratch

Most AI orchestration tools are frameworks. They give you primitives: tools, chains, handoffs. You wire everything together. If something breaks at 3 AM, you debug it yourself.

Paperclip is not a framework. It is an operating system for a company made of AI agents.

Built by a pseudonymous founder known as Dotta, the project landed on GitHub and reportedly picked up more than 30,000 stars in under three weeks, one of the faster adoption curves in the multi-agent space. That kind of velocity usually means the project either nailed the framing or got lucky with timing. Looking at what Paperclip actually does, it is probably both.

What Paperclip actually is

Paperclip is a Node.js server with a React dashboard that coordinates teams of AI agents toward shared business goals. According to its GitHub repository, the one-sentence pitch is: "If an agent is an employee, Paperclip is the company."

That framing matters. Paperclip is explicit that it is not:

An agent framework (it does not tell you how to build agents)
A workflow builder (no drag-and-drop pipelines)
A prompt manager
A code review tool
Useful for single-agent setups

What it is: an organizational layer that sits above your agents. You bring Claude Code, Codex, OpenClaw, Bash scripts, custom HTTP services — anything that can receive a heartbeat. Paperclip provides the company those agents work inside.

The GitHub repo describes the core thesis directly: if you have one agent, you probably do not need Paperclip. If you have twenty, you definitely do.

Architecture overview

The stack is deliberately unglamorous and self-hostable:

Runtime: Node.js 20+, pnpm 9.15+
API server: localhost:3100 by default
Database: Embedded PostgreSQL (auto-provisioned, no setup required for local runs)
Frontend: React dashboard for monitoring, governance, and task management
Agent protocol: Heartbeat-based — agents wake on schedule, check their queue, act, report back

Locally, a single Node.js process manages the embedded Postgres and local file storage. For production, you point it at your own Postgres and deploy wherever you want — Zeabur, Vercel, a VPS. There is no Paperclip account required and no cloud dependency baked in.

The architecture leans on a few core concepts that differentiate it from simpler orchestrators:

Atomic execution. Task checkout and budget enforcement happen atomically. Two agents cannot grab the same task simultaneously, and a budget check cannot race against a spend event. This matters more than it sounds — without it, you get duplicate work and cost surprises.
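To make the atomicity point concrete, here is a minimal sketch of one way to claim a task and reserve budget in a single transaction. It uses Python's built-in sqlite3 purely for illustration. The table layout, the est_cost_cents column, and the checkout function are assumptions invented for this example, not Paperclip's actual schema or API.

```python
import sqlite3


class CheckoutRefused(Exception):
    """Raised inside the transaction to force a rollback."""


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical tables for illustration; not Paperclip's real schema.
    conn.executescript(
        """
        CREATE TABLE agents (
            id TEXT PRIMARY KEY,
            budget_cents INTEGER NOT NULL,
            spent_cents INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE tasks (
            id TEXT PRIMARY KEY,
            owner TEXT,
            est_cost_cents INTEGER NOT NULL
        );
        """
    )


def checkout(conn: sqlite3.Connection, agent_id: str, task_id: str) -> bool:
    """Claim a task and reserve its estimated cost, or change nothing."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # The conditional UPDATE only succeeds for an unowned task, so two
            # agents cannot both claim it.
            claimed = conn.execute(
                "UPDATE tasks SET owner = ? WHERE id = ? AND owner IS NULL",
                (agent_id, task_id),
            ).rowcount
            if claimed != 1:
                raise CheckoutRefused("task missing or already claimed")

            # The budget check and the spend happen in one statement, so the
            # check cannot race against another spend.
            reserved = conn.execute(
                """
                UPDATE agents
                SET spent_cents = spent_cents
                    + (SELECT est_cost_cents FROM tasks WHERE id = ?)
                WHERE id = ?
                  AND spent_cents
                    + (SELECT est_cost_cents FROM tasks WHERE id = ?)
                    <= budget_cents
                """,
                (task_id, agent_id, task_id),
            ).rowcount
            if reserved != 1:
                raise CheckoutRefused("budget would be exceeded")
        return True
    except CheckoutRefused:
        return False
```

If the budget reservation fails, the rollback also undoes the claim, so the task goes back to being unowned rather than being stuck with an agent that cannot afford it.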

Persistent agent state. Agents resume their task context across heartbeat cycles instead of cold-starting each time. If an agent was mid-analysis when its last cycle ended, it picks back up on the next heartbeat with full context rather than reconstructing from scratch.
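One simple way to picture resumable state is a checkpoint that is loaded at the start of each heartbeat and saved at the end. The sketch below uses a JSON file and made-up field names (task_id, step, notes). How Paperclip actually stores agent state is not described in the source, so treat this as an illustration of the idea only.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    # Resume from the last checkpoint if one exists, otherwise start fresh.
    if path.exists():
        return json.loads(path.read_text())
    return {"task_id": None, "step": 0, "notes": []}


def save_state(path: Path, state: dict) -> None:
    # Write to a temp file and rename, so a crash never leaves a half-written
    # checkpoint behind.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    tmp.replace(path)


def run_heartbeat(path: Path, task_id: str, steps_per_beat: int = 2) -> dict:
    """Do a bounded amount of work on a task, then checkpoint."""
    state = load_state(path)
    if state["task_id"] != task_id:
        # A different task was assigned: drop the old context.
        state = {"task_id": task_id, "step": 0, "notes": []}
    for _ in range(steps_per_beat):
        state["step"] += 1
        state["notes"].append(f"finished step {state['step']}")
    save_state(path, state)
    return state
```

Calling run_heartbeat twice with the same path and task continues from step 2 to step 4 instead of starting over.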

Goal ancestry. Every task carries its full goal tree: company mission down through project objectives to the specific task. Agents always see the "why" behind what they are doing, not just a title. This is the architectural piece that makes goal alignment tractable at scale.
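As a rough illustration of goal ancestry, each goal can hold a pointer to its parent, and a task's "why" is recovered by walking up to the root. The Goal class and ancestry function here are hypothetical names chosen for this sketch, and the project and task titles are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Goal:
    title: str
    parent: Optional["Goal"] = None


def ancestry(goal: Goal) -> List[str]:
    """Return titles from the company mission down to the given goal."""
    chain = []
    node: Optional[Goal] = goal
    while node is not None:
        chain.append(node.title)
        node = node.parent
    chain.reverse()  # the walk collects leaf-to-root, so flip it
    return chain


mission = Goal("Build the #1 AI note-taking app to $1M MRR")
project = Goal("Ship the mobile editor", parent=mission)
task = Goal("Fix the offline sync bug", parent=project)
```

Here ancestry(task) yields the mission, then the project, then the task, which is the context an agent would see when it picks up the work.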

Runtime skill injection. Agents can learn Paperclip-specific workflows and project context at runtime without retraining. The docs reference a SKILL.md file that lets you inject structured instructions into any agent's context when it picks up work.

How agent roles work

Paperclip models agents the same way you would model a human org. Each agent gets:

A title and role (CEO, CTO, engineer, marketing, support — whatever you define)
A reporting line (who they are accountable to)
A monthly token/cost budget that hard-stops execution when exhausted
A task queue with dependencies and priorities
An audit trail of every conversation, decision, and tool call
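The attributes above map naturally onto a small record type. The following sketch models a title, a reporting line, a hard-stop budget, a priority queue, and an audit log. The class and method names are invented for illustration, and task dependencies are left out to keep it short.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional


class BudgetExhausted(Exception):
    """Raised when a charge would push an agent past its monthly budget."""


@dataclass
class Agent:
    title: str
    reports_to: Optional[str]
    monthly_budget_cents: int
    spent_cents: int = 0
    queue: list = field(default_factory=list)      # heap of (priority, seq, task)
    audit_log: list = field(default_factory=list)  # (event, detail) tuples
    _seq: int = 0                                  # tie-breaker for equal priorities

    def assign(self, task: str, priority: int) -> None:
        heapq.heappush(self.queue, (priority, self._seq, task))
        self._seq += 1
        self.audit_log.append(("assigned", task))

    def next_task(self) -> Optional[str]:
        # Lower priority number means more urgent.
        if not self.queue:
            return None
        return heapq.heappop(self.queue)[2]

    def charge(self, cents: int, reason: str) -> None:
        # Hard stop: refuse the spend instead of going over budget.
        if self.spent_cents + cents > self.monthly_budget_cents:
            self.audit_log.append(("refused", reason))
            raise BudgetExhausted(f"{self.title} cannot afford {cents} cents")
        self.spent_cents += cents
        self.audit_log.append(("charged", reason))
```

Every assignment, charge, and refusal lands in audit_log, which mirrors the idea that the trail is a side effect of normal operation rather than something agents opt into.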

Heartbeats are the scheduling primitive. An agent does not run continuously by default — it wakes on a configured interval, checks its work queue for assigned tasks, executes, and reports back. You can also trigger agents on events: task assignment, @-mentions, or external webhooks.
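The wake, check, execute, report cycle can be sketched in a few lines. This is a simplified single-process model with hypothetical function names. It covers only the interval-driven path and leaves out the event triggers mentioned above.

```python
import time
from collections import deque
from typing import Callable, Deque, List, Tuple

Report = Tuple[str, str, str]  # (task, status, detail)


def heartbeat(queue: Deque[str], execute: Callable[[str], str]) -> List[Report]:
    """One wake cycle: drain the assigned tasks and report on each."""
    reports: List[Report] = []
    while queue:
        task = queue.popleft()
        try:
            reports.append((task, "done", execute(task)))
        except Exception as exc:
            # A failing task is reported, not allowed to kill the cycle.
            reports.append((task, "failed", str(exc)))
    return reports


def run(
    queue: Deque[str],
    execute: Callable[[str], str],
    interval_s: float,
    beats: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Report]:
    """Wake `beats` times, sleeping `interval_s` seconds between cycles."""
    history: List[Report] = []
    for _ in range(beats):
        history.extend(heartbeat(queue, execute))
        sleep(interval_s)
    return history
```

The sleep parameter is injectable so the loop can be exercised in tests without actually waiting out the interval.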

Delegation flows hierarchically. A CEO agent can create tasks, assign them to a CTO agent, and the CTO can sub-delegate to engineering agents. The org chart is not decorative — it drives who can assign work to whom and what governance approvals are required before an agent can act.
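One plausible reading of "the org chart drives who can assign work to whom" is that an agent may assign work only to agents somewhere beneath it in the reporting chain. The sketch below encodes that rule. The exact permission model Paperclip uses is not spelled out in the source, so both the rule and the can_assign name are assumptions.

```python
from typing import Dict, Optional


def can_assign(reports_to: Dict[str, Optional[str]], assigner: str, assignee: str) -> bool:
    """True if `assignee` reports, directly or indirectly, to `assigner`."""
    seen = set()
    manager = reports_to.get(assignee)
    # Walk up the reporting line; `seen` guards against a cyclic org chart.
    while manager is not None and manager not in seen:
        if manager == assigner:
            return True
        seen.add(manager)
        manager = reports_to.get(manager)
    return False


# Hypothetical org chart: each key reports to its value.
org = {"ceo": None, "cto": "ceo", "cmo": "ceo", "eng1": "cto"}
```

With this chart, the CEO and CTO can both assign to eng1, while eng1 cannot assign upward and the CMO cannot reach into engineering.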

Governance controls sit at the human level. You are the board. You approve new agent hires, can override strategy, pause any agent, or terminate one entirely — from a mobile-friendly dashboard.

Practical setup walkthrough

Getting a local Paperclip instance running is fast:

npx paperclipai onboard --yes

Or manually:

git clone https://github.com/paperclipai/paperclip.git
cd paperclip
pnpm install
pnpm dev

The API server starts at http://localhost:3100. The embedded Postgres spins up automatically. Requirements are Node.js 20+ and pnpm 9.15+.

From there, the workflow follows three phases:

1. Define the goal. Set a company mission at the top level — something like "Build the #1 AI note-taking app to $1M MRR." This anchors every subsequent task with goal context.

2. Hire the team. Create agent definitions pointing to your actual agents. A CEO agent might wrap OpenClaw. A coding agent might wrap Claude Code or Codex. A marketing agent might wrap a custom Python script with web search capabilities. Any service that can receive a heartbeat and respond with status qualifies.

3. Approve and run. Review the proposed strategy the agents surface, set per-agent monthly budgets, and hit go. The dashboard shows active tasks, cost burn, conversation threads, and decision history.
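Step 2 says any service that can receive a heartbeat and respond with status qualifies. As a rough idea of what such a service might look like, here is a tiny HTTP adapter built on Python's standard library. The /heartbeat path, the port, and the JSON payload shape (a "tasks" list with "id" fields) are all assumptions made for this sketch. Check the Paperclip docs for the actual contract before wiring anything up.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_heartbeat(payload: dict) -> dict:
    """Turn a (hypothetical) heartbeat payload into a status response."""
    tasks = payload.get("tasks", [])
    if not tasks:
        return {"status": "idle", "completed": []}
    # A real adapter would hand each task to Claude Code, Codex, a script, etc.
    return {"status": "ok", "completed": [t.get("id") for t in tasks]}


class HeartbeatHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/heartbeat":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except ValueError:
            self.send_error(400, "invalid JSON")
            return
        if not isinstance(payload, dict):
            self.send_error(400, "expected a JSON object")
            return
        body = json.dumps(handle_heartbeat(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    # Blocks forever; call this from your own entry point.
    HTTPServer(("127.0.0.1", port), HeartbeatHandler).serve_forever()
```

Keeping the decision logic in handle_heartbeat, separate from the HTTP plumbing, makes it easy to swap the transport or test the behavior without opening a socket.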

Clipmart is on the roadmap — the project describes it as a marketplace for pre-built company templates: full org structures, agent configs, and skills importable in one click. That feature is not yet available as of this writing.

How it compares to other orchestrators

The multi-agent space has several serious frameworks, and Paperclip occupies a distinct position in that landscape.

CrewAI is the closest conceptual cousin. Both use role-based agents and think in terms of teams. But CrewAI is a Python framework — you write code to define crews, tasks, and processes. Paperclip is a deployed service with a dashboard. CrewAI is better if you want programmatic control and are comfortable writing Python abstractions. Paperclip is better if you want an operational system you can manage without constantly writing code.

LangGraph sits at the other end of the spectrum. It gives you directed graph primitives for stateful workflows — explicit nodes, edges, conditional branching, fine-grained control. LangGraph rewards developers who want to model complex deterministic pipelines. Paperclip sacrifices that granular control for a higher-level abstraction that non-developer operators can actually use.

AutoGen (Microsoft) focuses on multi-agent conversational systems — agents that talk to each other to solve problems. It is excellent for research prototyping and dynamic problem-solving loops. Paperclip is not trying to solve conversational coordination; it is solving organizational coordination. Different problem.

The comparison that matters most: all three of those frameworks leave the operational layer to you: the scheduling, cost tracking, persistence, governance, and audit trail. Paperclip ships those out of the box. That tradeoff is the whole bet.

Honest assessment: limitations and when this makes sense

Paperclip is a few months old. The adoption velocity is real, but the production track record is short. A few things to pressure-test before committing:

Maturity risk. Features like Clipmart are still roadmap items. The codebase has not been stress-tested at scale by the broader community yet. Check the GitHub issue tracker before building critical workflows on it.

Agent quality still dominates. Paperclip coordinates agents — it does not make them better. If your underlying Claude Code or Codex instance makes bad decisions, Paperclip faithfully coordinates a bunch of bad decisions. The org chart metaphor is only as good as the agents inside it.

Heartbeat latency. The scheduling model is interval-based. For workflows that need sub-minute responsiveness, the heartbeat cadence needs careful tuning. Synchronous critical paths require a different design.
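A quick back-of-the-envelope calculation shows why the cadence matters. Assume an agent wakes only on its interval, with no event triggers, and that tasks arrive at a uniformly random point within the interval. Both are simplifying assumptions for this sketch. Under them, a new task waits half the interval on average and up to a full interval in the worst case.

```python
def pickup_latency(interval_s: float) -> dict:
    """Expected and worst-case wait before a purely interval-driven agent sees a task."""
    return {"mean_s": interval_s / 2, "worst_s": interval_s}


def max_interval_for(target_mean_s: float) -> float:
    """Largest interval that still meets a target average pickup latency."""
    return 2 * target_mean_s
```

For example, a five-minute heartbeat (300 seconds) gives an average pickup delay of 150 seconds, and hitting a 30-second average would need an interval of at most 60 seconds. That delay applies at every hop, so a task that passes through three agents in sequence can accumulate three of these waits.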

No built-in inter-agent messaging. Agents coordinate through tasks and the goal hierarchy, not direct peer-to-peer communication. For emergent, conversational problem-solving between agents, you would need to architect that separately.

When Paperclip makes sense: You are running five or more agents and losing track of what they are doing. You need per-agent cost controls. You want persistent task state across restarts. You are managing multiple separate projects or companies. You want audit trails for compliance or debugging. You want non-developers to be able to monitor and govern agent operations.

When it is overkill: You have one or two agents doing well-defined tasks. You need custom programmatic logic in your orchestration layer. Your workflows are short-lived and stateless. You are still figuring out what your agents should do.

According to Flowtivity, which has been running AI agents in production for over a year, the core coordination problem Paperclip targets — getting multiple agents to share context, respect budgets, and maintain quality — is genuinely difficult and the right problem to solve. Whether Paperclip is the right solution at production scale is a question the community will answer over the next 12 months.

For now, it is the most complete out-of-the-box answer to a problem architects keep running into: I have agents, but I do not have a company.

Nate Hargrove covers guides and technical deep dives for The Daily Vibe.
