A developer built a two-agent AI system on cheap VPS hardware using IRC as the messaging layer, with tiered Claude inference capped at $2/day.
George Larson published a working two-agent architecture: a public-facing agent (nullclaw) compiled as a 678 KB Zig binary using ~1 MB RAM, connected via IRC with a web chat frontend; and a private agent (ironclaw) handling email and scheduling over Tailscale using Google's A2A protocol. The system uses Claude Haiku 4.5 for cheap conversational inference and Sonnet 4.6 only for tool-calling, with a hard $2/day API spend cap. A2A passthrough means a single API key and billing relationship covers both agents regardless of which one initiated the request.
This is a fully working reference implementation for ultra-low-cost agent infrastructure. The key technical insight is using IRC as a durable, low-overhead transport instead of WebSockets or HTTP polling: a 678 KB Zig binary running in roughly 1 MB of RAM shows you don't need a Node.js or Python runtime stack to operate a production agent. The tiered inference pattern (Haiku for conversation, Sonnet only on tool calls) is the real cost-control mechanism worth stealing.
Replicate the tiered inference pattern in your own agent: route all conversational turns to Haiku 4.5 and gate Sonnet 4.6 calls behind a tool-use classifier; then measure the token-cost difference across your last 1,000 API calls to quantify the savings.
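The routing logic itself is small: pick the cheap model unless the turn needs tools, and refuse outright once the daily spend cap is hit. A sketch of that decision, assuming the model identifiers named in the article (the exact API model strings and the `needs_tools` classifier are yours to supply):

```python
# Model names as given in the article; substitute your provider's exact IDs.
HAIKU = "claude-haiku-4.5"     # cheap conversational turns
SONNET = "claude-sonnet-4.6"   # tool-calling turns only
DAILY_CAP_USD = 2.00           # hard spend ceiling from the article

class BudgetExceeded(Exception):
    """Raised instead of making an API call once the daily cap is reached."""

def pick_model(needs_tools: bool, spend_today_usd: float,
               cap_usd: float = DAILY_CAP_USD) -> str:
    # Enforce the cap before routing: no call is cheaper than every call.
    if spend_today_usd >= cap_usd:
        raise BudgetExceeded(
            f"daily spend ${spend_today_usd:.2f} >= cap ${cap_usd:.2f}")
    return SONNET if needs_tools else HAIKU
```

Hanging the cap check on the router rather than on each call site means a single counter (spend so far today) is the only state the cost control needs.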
Open your terminal and set your Anthropic API key: export ANTHROPIC_API_KEY=your_key