
Swapnil Surdi

I build production AI systems — RAG pipelines, agentic fleets, and the backend infrastructure that keeps them fast, cheap, and reliable.

email · github · linkedin

  • 100K+ req/day @ 99.9%
  • 30s → <1s RAG retrieval
  • −30–50% LLM cost (mcp-cache)
  • 8min → 30s MRI loads

02 — activity

github · swapnilsurdi

16 repos · 9 stars · 698 contributions/yr

[contribution heatmap: daily GitHub contributions, May–June 2026, data through 10 Jun, peak 40 contributions on 19 May]
claude code · this machine · peak 467m

5.3b tokens total · ~146m/day (30d avg)

[token-usage heatmap: daily Claude Code tokens, May–June 2026, data through 10 Jun]

as of 10 jun 2026

03 — selected work

▣ live · 3 nodes · 22 containers

LaunchLab Fleet

Three recycled laptops, each operated by its own headless Claude Code agent: a private 22-container homelab that monitors, heals, and reports on itself.

  • 288 watchdog runs/day, zero tokens
  • 4.18s → 18ms status query
  • 22 containers

▣ npm · @hapus/mcp-cache · ★9

MCP-Cache: a transparent cache for any MCP server

A transparent proxy that caches oversized MCP tool responses and hands the model query tools — so any MCP server works past the 25K-token wall.

  • 25K → unlimited token wall
  • −30–50% LLM API cost
  • <200ms cached query
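
The caching idea described above can be sketched in a few lines. This is an illustrative sketch only, not the actual @hapus/mcp-cache implementation: the function names (`proxy_tool_call`, `query_cache`), the in-memory store, and the four-characters-per-token estimate are all assumptions made for the example. Only the 25K-token limit comes from the text.

```python
import hashlib
import json

TOKEN_LIMIT = 25_000        # response cap named in the text
CHARS_PER_TOKEN = 4         # assumption: crude size estimate, not a real tokenizer

_cache: dict[str, str] = {}  # hypothetical in-memory store


def estimate_tokens(text: str) -> int:
    """Rough token count from character length (assumed ratio)."""
    return len(text) // CHARS_PER_TOKEN


def proxy_tool_call(tool_name, args, call_upstream):
    """Forward a tool call; pass small responses through, cache large ones."""
    payload = call_upstream(tool_name, args)
    if estimate_tokens(payload) <= TOKEN_LIMIT:
        return {"cached": False, "content": payload}
    # Oversized: keep the payload out of the model's context and
    # return a short handle the model can query instead.
    raw_key = json.dumps([tool_name, args], sort_keys=True).encode()
    key = hashlib.sha256(raw_key).hexdigest()[:12]
    _cache[key] = payload
    return {"cached": True, "cache_key": key, "total_chars": len(payload)}


def query_cache(cache_key, needle, context_chars=200, max_hits=5):
    """Hypothetical query tool: return snippets around each match of needle."""
    if not needle:
        return []
    payload = _cache[cache_key]
    hits, start = [], 0
    while len(hits) < max_hits:
        idx = payload.find(needle, start)
        if idx == -1:
            break
        lo = max(0, idx - context_chars)
        hi = min(len(payload), idx + len(needle) + context_chars)
        hits.append(payload[lo:hi])
        start = idx + len(needle)
    return hits
```

The point of the design is that the model only ever sees a small handle plus the snippets it asks for, so the size of the upstream response no longer has to fit in a single tool result.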

▣ production · HIPAA · 4 yrs

Agentic RAG in regulated healthcare

Production agentic RAG over docs, code, Confluence, and Jira for a HIPAA/ISO 13485 platform — compliance retrieval 30s → sub-second, verification 60% faster.

  • 30s → <1s compliance retrieval
  • 60% faster verification

04 — writing

MCP is everywhere now — and so is its oldest constraint. How a transparent caching proxy gets any MCP server past the 25,000-token response limit.

  • #mcp
  • #ai-infrastructure
  • #caching
  • #open-source

How a 24/7 AI agent fleet stays affordable on one subscription: deterministic code handles every tick, and the model only runs on real signals.

  • #ai-agents
  • #automation
  • #llmops
  • #self-hosting
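
A minimal sketch of that gating pattern, under stated assumptions: the `Check` type, the `tick` function, and the `escalate` callback are hypothetical names for illustration, not code from the fleet itself. The idea is that plain code evaluates every tick, and the model is invoked only when a check fails.

```python
from dataclasses import dataclass


@dataclass
class Check:
    name: str       # hypothetical check identifier, e.g. a container name
    healthy: bool   # result of a deterministic probe (no model involved)


def tick(checks, escalate):
    """Run one watchdog tick; call escalate only when something failed.

    Returns the number of failing checks. A return value of 0 means the
    model was never invoked, so the tick cost zero tokens.
    """
    failures = [c.name for c in checks if not c.healthy]
    if not failures:
        return 0
    escalate(failures)  # assumed hook that hands the signal to an agent
    return len(failures)
```

For scale, 288 runs per day works out to one tick every five minutes (24 × 60 / 5 = 288), so keeping the healthy path model-free is what keeps the daily cost flat.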

Looking for the full picture — roles, stack, and the numbers behind the work?

View resume →