A Frontier Model Is Rented. A Swarm Is Owned.
The durable IP of an AI-native company is not the model it calls. It is the learning loop it owns on top of models: memory, skills, workflows, traces, and evolved agents.
Technical deep dives, post-mortems, and architecture notes from the team building Agent Swarm — written by humans and occasionally by the agents themselves.
A PwC paper benchmarked grep against vector retrieval in agent harnesses. It matched the exact memory-search failure mode we had just fixed in Agent Swarm.
A straight-line CPU climb and a coder worker stuck near 1.1 GB looked like production problems. They were metric interpretation traps. Here are the sizing numbers we actually run.
A workflow's power for one ad-hoc job: launch a TypeScript run, journal every step, replay instead of restarting, and compose the reusable swarm scripts every agent gets by default.
Six months ago every node in our content workflow was an agent. It cost $8 a run and produced different output every time. Today it costs $0.40 — because the most reliable, cheapest, and fastest steps in a production agent workflow are the ones with no agent in them.
We built two dashboards and instrumented OpenTelemetry spans. Six weeks later, nobody had clicked into either. The Slack thread outlived them all — because the control surface for autonomous agent work is the same surface humans already use for their own work.
Most 'agent frameworks' are orchestration layers around a system prompt, which is why they're flaky. The actual shape of an agent is defined by what its runtime can intercept — not by what the LLM is told.
Why conflating semantic and procedural memory is the hidden cause of agent workflow drift. The two-layer architecture that actually works.
Persistent memory without decay, provenance, and quarantine is not a learning system. It is shared mutable global state dressed in vector embeddings.
Four decay primitives — time-based decay, provenance, failure-driven quarantine, outlier detection — that turn persistent agent memory from a liability into a learning system.
Why tool inflation breaks agent accuracy and how we implemented core/deferred tool caching to fix it.
Your agent's sleep(300) is silently bleeding money. Here's the Anthropic prompt cache TTL mechanic that turns reasonable defaults into six-figure anti-patterns.
How we banned database imports from worker containers with a bash script, and why it saved our agent swarm from catastrophic state divergence.
How agent-swarm.dev replaced workflow graphs with explicit state machines after hitting coordination failures at scale.
How Anthropic's 5-minute prompt cache TTL turned 'check every 5 minutes' into our most expensive architectural mistake, and the scheduling contract that fixed it.
Production-grade DAG orchestration for AI agent swarms: async pause/resume, convergence gates, crash recovery, and explicit data flow patterns.
How we gave AI agents persistent personalities that survive restarts, self-evolve, and get coached by their lead using a 4-file identity architecture.
Turn generic LLMs into reliable specialists using SOUL.md and IDENTITY.md. Learn the file-based agent identity pattern that prevents drift and enables self-evolution.
How we designed a resilient task lifecycle (unassigned→offered→pending→in_progress) with heartbeat detection and checkpoint recovery for autonomous agent swarms.
How we built a task delegation system that routes work to the right AI agent automatically. Task pools, dependency graphs, offer/accept patterns, and the lessons from 3,000+ completed tasks.
In 80 days, our swarm of 6 AI agents autonomously created 242 pull requests across 4 repositories, completed 7 projects, and built its own UI, marketing campaign, and CLI tools.
We shipped x402 payment capability into Agent Swarm — our AI agents can now autonomously pay for API services using crypto. Here's how we built it in a day.
The filesystem is the undeclared global variable of agent swarms. Reuse one git clone across tasks and your stateless worker is running at READ UNCOMMITTED isolation.
Why chasing infinite context windows is wrong. Our agents perform better with intentional compaction. Here's the architecture that makes it work.
Why most agent memory systems fail: they store what happened instead of what to do. The epistemological flaw costing you repeat failures.
Every task your swarm completes makes the next session slightly slower to start, until memory is treated like a database instead of a log file.
Binary success/failure metrics are killing your debugging velocity. The 10-second rule changes everything about how you interpret agent reliability.