Skip to content

writing

Blog

Notes from production — AI systems, performance, and the infrastructure underneath.

MCP is everywhere now — and so is its oldest constraint. How a transparent caching proxy gets any MCP server past the 25,000-token response limit.

  • #mcp
  • #ai-infrastructure
  • #caching
  • #open-source

How a 24/7 AI agent fleet stays affordable on one subscription: deterministic code handles every tick, and the model only runs on real signals.

  • #ai-agents
  • #automation
  • #llmops
  • #self-hosting