Swapnil Surdi

I build production AI systems — RAG pipelines, agentic fleets, and the backend infrastructure that keeps them fast, cheap, and reliable.

[email protected] github linkedin

100K+

req/day @ 99.9%

30s → <1s

RAG retrieval

−30–50%

LLM cost (mcp-cache)

8min → 30s

MRI loads

02 — activity

Activity

github · swapnilsurdi

16 repos · 9 stars · 698 contributions/yr

lessmore

claude code · this machinepeak 467m

5.3b tokens total · ~146m/day (30d avg)

lessmore

as of 10 jun 2026

03 — selected work

Selected work

All projects →

▣ live · 3 nodes · 22 containers

LaunchLab Fleet

Three recycled laptops, each operated by its own headless Claude Code agent: a private 22-container homelab that monitors, heals, and reports on itself.

288 watchdog runs/day, zero tokens
4.18s → 18ms status query
22 containers

▣ npm · @hapus/mcp-cache · ★9

MCP-Cache: a transparent cache for any MCP server

A transparent proxy that caches oversized MCP tool responses and hands the model query tools — so any MCP server works past the 25K-token wall.

25K → unlimited token wall
−30–50% LLM API cost
<200ms cached query

▣ production · HIPAA · 4 yrs

Agentic RAG in regulated healthcare

Production agentic RAG over docs, code, Confluence, and Jira for a HIPAA/ISO 13485 platform — compliance retrieval 30s → sub-second, verification 60% faster.