Leonard Austin

I Vibe Coded a Full-Stack Kubernetes App in 21 Days


tl;dr — I built a full-stack Kubernetes desktop application — Go backend, React/TypeScript frontend, brochure website — in 21 days using AI-assisted “vibe coding.” 74,675 lines of source code. 461 commits. 129 pull requests. 100% merge rate. I never read a single line of the source code. Here’s exactly how it worked, what I actually typed, and what I learned.

What I Built

Clusterfudge is a native macOS desktop app for managing Kubernetes clusters. Think of it as a Lens competitor: cluster overview, pod management, log streaming, exec terminals, Helm chart management, YAML editing, resource wizards, a troubleshooting page, and an AI debugging terminal that calls Claude Code to diagnose live pods. Plus a brochure website.

The stack is Go (backend, via the Wails framework), React with TypeScript (frontend), and Astro (brochure site). It’s a monorepo with 654 files.

I didn’t write any of it. I described what I wanted in plain English and the AI wrote the code, ran the tests, committed to branches, opened PRs, and iterated until everything worked.

By The Numbers

Metric Value
Development span 21 days
Source lines of code 74,675
Total commits 461
Pull requests (all merged) 129
AI sessions 135+
Human prompts typed 1,025+
AI responses generated 14,346+
Tokens processed 1.27 billion+
API calls 14,145+
Conversation log data 130 MB+

Source lines of code counts Go and TypeScript source files only. The total codebase — including configuration, generated code, tests, and the brochure site — grew to 128K lines; see the Codebase Growth chart in the raw data below.

Some derived metrics that tell a sharper story:

Metric Value
Lines of code per day 3,556
Commits per day 22
PRs per day 6.1
Git insertions per human prompt 167+
AI responses per human prompt 14.0

That last number means for every sentence I typed, the AI generated fourteen responses — reading files, editing code, running tests, committing, pushing. I wasn’t having a conversation. I was issuing work orders.
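The derived figures follow directly from the headline table. A minimal sketch of the arithmetic, using the lower-bound counts as given (the helper name `per` is only illustrative):

```python
# Headline figures copied from the tables above.
DAYS = 21
SOURCE_LINES = 74_675
COMMITS = 461
PULL_REQUESTS = 129
HUMAN_PROMPTS = 1_025   # lower bound ("1,025+")
AI_RESPONSES = 14_346   # lower bound ("14,346+")


def per(total: float, divisor: float, digits: int = 1) -> float:
    """Return total / divisor, rounded to the given number of decimals."""
    return round(total / divisor, digits)


derived = {
    "lines_per_day": per(SOURCE_LINES, DAYS, 0),
    "commits_per_day": per(COMMITS, DAYS, 0),
    "prs_per_day": per(PULL_REQUESTS, DAYS),
    "responses_per_prompt": per(AI_RESPONSES, HUMAN_PROMPTS),
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

Because the prompt and response counts are lower bounds, the responses-per-prompt ratio is approximate rather than exact.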

Note: Development happened across two machines. Machine 1 logged 77 sessions (396 prompts, 584M tokens, 63.7 MB). Machine 2 logged 58+ sessions (629+ prompts, 684M+ tokens, 66.3 MB). The “+” markers indicate conservative lower bounds — some sessions were too short to instrument, and the two machines’ Claude Code Desktop app sessions aren’t included.

The Real Workflow

Here’s what my day actually looked like:

  1. Morning: Open Claude Code, point it at a task list in a markdown file, say “/dev-loop”
  2. Walk away: Let it work autonomously. Some sessions ran 8-10 hours
  3. Check in: “PR open?” / “is the CI passing?”
  4. Test the app: Build and run the actual application, find UX issues by using it
  5. Report bugs: Describe what’s wrong in plain English, no code references
  6. Repeat

I was a product manager who could QA. The AI was the engineering team.

The first 10 days followed a phased architecture approach. I wrote nine phase documents (or rather, described what each phase should contain and had the AI write them), then pointed the AI at each one sequentially. PRs #6 through #16 implemented the entire application in order: Go backend, frontend shell, resource views, log streaming, Helm management, polish, and competitive feature parity. Each phase was a “fire and forget” — one prompt, hours of autonomous work.

What I Actually Typed

1,025+ human prompts across 135+ sessions on two machines. The median prompt was 12 words. 24% were five words or fewer. Here are the patterns that emerged.
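For anyone who wants to measure their own prompt logs the same way, here is a rough sketch of the two statistics. The sample is just five prompts quoted later in this post, not the full dataset, and splitting on whitespace is an assumed definition of a "word":

```python
from statistics import median

# A tiny illustrative sample: five prompts quoted elsewhere in this post.
sample_prompts = [
    "PR open?",
    "committed and pushed?",
    "is the CI passing?",
    "passed yet?",
    "No, I want to keep shortcuts. Just not make them editable.",
]

# Count whitespace-separated words in each prompt.
word_counts = [len(prompt.split()) for prompt in sample_prompts]

median_words = median(word_counts)
short_share = sum(1 for n in word_counts if n <= 5) / len(word_counts)

print(f"median words per prompt: {median_words}")
print(f"share with five words or fewer: {short_share:.0%}")
```

The sample skews short on purpose, so its numbers will not match the project-wide median of 12 words.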

Fire and Forget

The dominant pattern. Point at a task list, say “go”:

“please /lenny-dev-loop tier 1, tier 2 and then tier 3.”

“please /dev-loop for docs/AUDIT.md”

A single prompt could trigger the AI to read task files, implement features across dozens of files, run tests, fix failures, commit, push, open PRs, and work through an entire checklist. One prompt. Hours of work.

Delegate the Delegation

In the later days, the fire-and-forget pattern evolved. Instead of pointing at a task list, I told the AI to figure out its own parallelism:

“let’s go the whole hog and do it all, feel free to spin up a team.”

“Please spin up a team to check every single element and page for light mode as it seems like there are lots of bits that are suitable for light mode.”

“please go ahead and fix everything in the order that you think makes the most sense. Feel free to spin up a team as you see fit.”

The AI would spawn multiple sub-agents, divide the work, and coordinate. I wasn’t just delegating tasks — I was delegating the delegation itself.

Skill-Based Delegation

As the project matured, I stopped writing raw prompts for repetitive workflows and started invoking specialized skills:

“please use /frontend-design to review and see how we can improve things”

“maybe it’s worth using the /frontend-design skill to get a list of ideas”

“can you create me a new /lenny-blog-post”

Each skill encapsulates a multi-step workflow — research, plan, implement, test, commit, PR — behind a single invocation. The evolution from raw prompts to skill-based delegation mirrors how you’d build internal tools for a growing team.

QA by Using the App

I tested the running application and reported bugs the way a user would:

“the sidebar on the right that appears when click on a pod, for example. Can we make the size of it draggable (so a user can increase/decrease) the width and remembered when it’s opened in future again.”

“Deployments->Right hand panel, History table row has cursor pointer but doesn’t do anything?”

No file names. No component names. No line numbers. Just “this thing is broken” and the AI figured out the rest.

Status Pings

Brief check-ins. Often one or two words:

“PR open?”

“committed and pushed?”

“is the CI passing?”

“passed yet?”

I managed the AI like a junior developer — periodic check-ins without micromanaging.

Course Corrections

Quick redirects when the AI went the wrong way:

“No, I want to keep shortcuts. Just not make them editable.”

“yes revert, then come up with a plan to tackle H8, H10 and H15. Then /dev-loop them”

Correct once, then immediately delegate back. I rarely needed to correct the same thing twice.

Fact-Checking the AI

The AI occasionally made claims about the product that weren’t true. Catching these required actually knowing the product — another reason the human-as-QA role matters:

“wait, the app has been tested for 5k resources?”

It hadn’t. The AI had fabricated a marketing claim for a blog post. I caught it because I’d been using the app daily. Similarly:

“are we 100% confident that everything in the /features page is accurate and it’s working and live in the desktop app?”

Trust but verify. The AI is good enough that you stop reading the code — but you never stop testing the product.

Strategic Product Decisions

The highest-leverage prompts. A single sentence could trigger thousands of lines of change:

“can you move all the other files into desktop/ folder so we can turn this into a mono repo.”

That one prompt produced PR #82: 8,769 additions across 607 files. A complete monorepo restructure from a single sentence.

Ops Debugging by Pasting Errors

Days 18–21 introduced a completely different prompt style. Instead of describing UX bugs, I was pasting CI/CD failures, build logs, and GitHub Actions output directly into the chat:

“on my mac runner Build (macos-arm64) the follow step takes ten+ minutes. actions/setup-go@v5 Setup go version spec 1.25.0”

“What secrets do I need to add to the repo for the release actions”

“ok, secreet have all been added. They are the same as the ../iddio-mono project. Just want to check that is ok?”

One prompt was a 60-line Cloudflare build log, pasted raw. The AI parsed it and fixed the misconfigured build command. Release engineering, CI debugging, and infrastructure work followed the same pattern as feature development: describe the problem, let the AI fix it. The domain shifted from product to ops, but the workflow didn’t.

Typos Everywhere, and It Didn’t Matter

The prompts are littered with typos — “brqnch”, “commmit”, “truely”, “writen”, “packge”, “remainnig thins”, “fone size”, “nuch movement”, “Triangel”, “termology”, “stevedor”, “cna” — and the AI understood every single one without asking for clarification. I typed fast and sloppy because it didn’t matter. This is a genuine productivity feature.

The Build-Then-Fix Loop

Here’s something the numbers reveal clearly: ~30% of PRs were bug fixes or review follow-ups. The workflow wasn’t “build it perfectly the first time.” It was:

  1. Build it fast
  2. Test it by using it
  3. Report what’s broken
  4. Fix it
  5. Repeat

This is the “70% problem” in practice. The AI gets the broad strokes right on the first pass. The remaining 30% is iterative refinement — the human QA-ing, redirecting, and fine-tuning.

One session captures this perfectly. I typed 37 prompts. The AI generated 653 responses. Each fix revealed the next bug, like peeling layers of an onion:

“the app doesn’t work. The homescreen just has the spinner going round and not stopping.”

[AI fixes it]

“ok, there are no more errors, but it is showing zero for everything. Also I can’t copy any text or right hand click.”

[AI fixes it]

“still getting the error undefined is not an object”

[AI fixes it]

“ok, ive merged the PR. Let’s fix some other stuff. I can’t drag or resize the window”

That session generated 128,000 output tokens — roughly 96,000 words of code and reasoning — from a human who typed maybe 500 words total.

The Other Extreme: Rapid-Fire Design Sessions

Not every session was fire-and-forget. The welcome screen redesign on Day 19 was the opposite: 24 prompts in 80 minutes, each one a micro-adjustment:

“can we split it into two columns”

“ok, I think we need to move to a horizontal tabs with a slide in/out for each page”

“great stuff. Let’s centre it and set a max width”

“let’s centre align the nav items as well. Also let’s reduce the max width for just AI Assistant and Kubeconfig to 50% of what it currently is.”

“the green is a little too light, can we go a bit darker”

“can we move the toast down slightly, so it doesn’t sit above the very top title bar.”

This was standing over the AI’s shoulder, directing every decision in real time. One prompt every three minutes, each building on the visual result of the last. The contrast is stark — some sessions produce 8,769 lines from a single sentence, others produce fine-grained tweaks from a rapid stream of feedback. Both are vibe coding. The mode switches based on what the work needs.

The Moments That Made Me Laugh

The Naming Rabbit Hole

I spent an entire 30-minute session brainstorming product names. It went from licensing strategy to nautical terms to pirate words to foreign languages to shipping container terminology:

“what does kubernetes mean?”

“what other Nautical terms are there?”

“what about pirate words?”

“ok, let try foreign words”

“are there any other shipping container termology?”

“What about clusterfudge”

The name stayed as KubeViewer at the time. The same brainstorm surfaced again in a different session weeks later — “what about stevedore” — before being rejected again. Some decisions just keep circling back. Eventually, Clusterfudge won out.

Pixel Surgery

The most precise prompts were UI adjustments measured in individual pixels:

“Close icon move: 1px up, 2px left. Minus icon move: 1px up. Triangel icons move: 1px up, 1px right”

Then immediately:

“ok, that was way too nuch movement, can we go back and do half as much?”

Then:

“Almost perfect. Can we move the x a tidy bit to the right, and triangles to the left?”

Three rounds of pixel nudging to position window control icons. Vibe coding’s equivalent of standing behind a designer’s shoulder saying “a bit to the left… no, back a bit.” It ended with: “icons are a perfect size now. Please commit and push.”

AI Building AI (Then Debugging the AI Debugger)

The app has a built-in AI debugging terminal that calls Claude Code to analyze Kubernetes pods. Eight sessions were spawned not by me, but by the application itself — code I’d never read, invoking AI to debug live clusters. Genuinely recursive: I used AI to build a feature that autonomously calls AI.

Then Days 18–19 added another layer: I had to QA the AI feature itself. The prompts read like debugging inception:

“without the ability to type anything at the end. Also seems like I don’t see the claude code welcome screen, it seems like it skips that bit”

“The js library we are using for the terminal doesnt seem great when entering interactive shells like gemini and claude”

“flashing, text in wrong order, repeated text”

I was using AI (Claude Code in my terminal) to fix the AI feature (the in-app debugger) that calls AI (Claude/Gemini/Codex) to diagnose Kubernetes pods. Three levels deep. At one point I had the desktop app open with a live AI debugging session, Claude Code fixing the terminal rendering in a split pane, and the AI inside the app re-running to verify the fix worked. Turtles all the way down.

Accidental Sessions

One session contains exactly one human message: “ccx”. Another includes the prompt “ta]”. Both were accidental keypresses that spawned AI responses. Even typos get logged — and cost tokens.

The Token Economy

1.27 billion tokens sounds absurd. It is. Here’s the breakdown across both machines:

Metric Value
Total tokens ~1,268,000,000
Cache read tokens ~1,228,000,000 (97%)
Cache write tokens ~36,500,000 (3%)
Output tokens (actual code) ~2,600,000 (0.2%)
API calls 14,145+

97% of tokens were cache reads — the AI re-reading the same codebase context thousands of times as it iterated. Only 2.6 million tokens were actual generated output. The rest was the AI maintaining its mental model of a growing codebase across 14,000+ API calls.
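The percentages in the table can be checked against the approximate token counts. A small sketch, treating the rounded figures above as exact for the sake of the arithmetic:

```python
# Approximate token counts from the table above.
TOTAL = 1_268_000_000
breakdown = {
    "cache_read": 1_228_000_000,
    "cache_write": 36_500_000,
    "output": 2_600_000,
}

# Share of the total for each category, as a percentage.
shares = {name: 100 * count / TOTAL for name, count in breakdown.items()}

# Whatever is left over would be ordinary (uncached) input tokens.
remainder = TOTAL - sum(breakdown.values())

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
print(f"unaccounted for: {remainder:,} tokens")
```

The small remainder is within the rounding error of the "~" figures, so it should not be read as a precise count of uncached input.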

To put 1.27 billion tokens in perspective, that’s roughly equivalent to reading 3,800 novels. The AI effectively read the codebase cover-to-cover thousands of times during development.

What This Would Have Cost on the API

At pay-as-you-go API pricing (Claude Opus 4.6 on OpenRouter: $5 per million input tokens, $25 per million output tokens), 1.27 billion tokens would have run up a serious bill. In an agentic coding workflow, input tokens dominate heavily, because the AI reads far more code than it writes. The table below prices three input/output splits, with 80/20 as my working assumption:

Split (in/out) Input Cost Output Cost Total
90/10 $5,715 $3,175 ~$8,890
80/20 $5,080 $6,350 ~$11,430
70/30 $4,445 $9,525 ~$13,970

The most likely estimate: ~$11,000 for 21 days of development.
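The table is straightforward to reproduce. A minimal sketch, assuming the prices quoted above and a flat 1.27 billion tokens; the function name `api_cost` and its integer-percent parameter are illustrative. The split itself is an assumption: the token breakdown earlier puts generated output at roughly 0.2% of all tokens, and this sketch does not price cache reads separately.

```python
TOTAL_TOKENS = 1_270_000_000
INPUT_PRICE_PER_M = 5    # USD per million input tokens (as quoted above)
OUTPUT_PRICE_PER_M = 25  # USD per million output tokens (as quoted above)


def api_cost(input_pct: int) -> tuple[float, float, float]:
    """Return (input cost, output cost, total) in USD for a given input share.

    input_pct is the percentage of tokens billed as input; the rest
    is billed as output.
    """
    input_tokens = TOTAL_TOKENS * input_pct // 100
    output_tokens = TOTAL_TOKENS - input_tokens
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost, output_cost, input_cost + output_cost


for pct in (90, 80, 70):
    input_cost, output_cost, total = api_cost(pct)
    print(f"{pct}/{100 - pct}: ${input_cost:,.0f} + ${output_cost:,.0f} = ${total:,.0f}")
```

Changing `input_pct` shows how sensitive the estimate is to the split: every ten points moved from input to output adds about $2,540.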

Several factors could shift this significantly. Prompt caching — if sessions reused large system prompts — can drop input costs by 90%. The batch API cuts everything by 50% for non-interactive work. And if any of those tokens included extended thinking (billed as output at $25/M), the output share climbs and the bill could push toward $15–20K+.

I did this on a Max Plan — a flat monthly subscription. The entire 21-day project cost a single month’s fee. At API rates, the same work would have cost roughly 50–70x more. The Max Plan doesn’t just change the economics of vibe coding — it makes this style of development viable in the first place. Without flat-rate pricing, you’d self-censor every “let it run for 8 hours” session. The moment you start watching the meter, you stop letting the AI work autonomously, and that’s where most of the value is.

Context Window Pressure

The AI context window peaked at 168K tokens — approaching the ~200K limit. Seven sessions hit this ceiling. As the codebase grew, the conversation history competed with the source code for context space. The following data is from Machine 2 through Day 17:

Context Size API Calls Share
Under 50K tokens 1,163 25%
50-100K tokens 1,635 35%
100-150K tokens 1,339 29%
Over 150K tokens (near ceiling) 475 10%

Most calls used 50-150K of context. The sessions that hit the ceiling were all long feature-implementation sessions where the AI was working with both the full codebase and a deep conversation history.
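The shares can be recomputed from the call counts. A quick sketch using the four buckets from the table; the computed percentages may differ from the printed ones by a point because of rounding:

```python
# API calls per context-size bucket, from the table above.
buckets = {
    "under 50K": 1_163,
    "50-100K": 1_635,
    "100-150K": 1_339,
    "over 150K": 475,
}

total_calls = sum(buckets.values())
shares = {name: 100 * calls / total_calls for name, calls in buckets.items()}

# Combined share of the two middle buckets (the "most calls" range).
mid_range_share = shares["50-100K"] + shares["100-150K"]

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
print(f"50-150K combined: {mid_range_share:.1f}% of {total_calls:,} calls")
```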

On Day 19 (March 18 at ~10am), Claude’s context window expanded from ~200K to 1 million tokens. The timing was remarkable — the rename to Clusterfudge (PR #90) landed just before the switch, and then the floodgates opened: blog posts, multi-AI provider support, MIT license, OSS sync, CalVer release workflow, welcome screen redesign, website audit, and mobile fixes all shipped in the hours after. The ceiling that seven sessions had hit in the first 17 days simply stopped existing.

The efficiency gain shows up clearly in the data. Commits per API call — a rough measure of how much work each AI round-trip produced — jumped dramatically:

Day Commits API Calls Commits/Call
Mar 6 (pre-1M) 43 801 0.054
Mar 8 (pre-1M) 18 532 0.034
Mar 10 (pre-1M) 26 1,285 0.020
Mar 18 (1M arrives) 52 768 0.068
Mar 19 (post-1M) 49 233 0.210
Mar 20 (post-1M) 70 123 0.569

March 20 was nearly 30x more efficient than March 10 in commits per API call — 70 commits and 19 merged PRs from just 123 API calls, with a 128K-line codebase. The nature of the work shifted too (big features → smaller polish PRs), so it’s not a clean comparison, but the trend is hard to ignore. With 5x the context headroom, the AI spent less time re-reading and more time shipping.
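The ratio column and the "nearly 30x" figure come from simple division. A sketch using the six rows from the table:

```python
# (commits, API calls) per day, from the table above.
activity = {
    "Mar 6": (43, 801),
    "Mar 8": (18, 532),
    "Mar 10": (26, 1_285),
    "Mar 18": (52, 768),
    "Mar 19": (49, 233),
    "Mar 20": (70, 123),
}

# Commits produced per API call on each day.
commits_per_call = {
    day: commits / calls for day, (commits, calls) in activity.items()
}

# How much more "efficient" March 20 was than March 10 by this measure.
speedup = commits_per_call["Mar 20"] / commits_per_call["Mar 10"]

for day, ratio in commits_per_call.items():
    print(f"{day}: {ratio:.3f} commits per call")
print(f"Mar 20 vs Mar 10: {speedup:.1f}x")
```

As the paragraph above says, the mix of work changed over the same period, so the ratio describes the trend without isolating the effect of the larger context window.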

What I’d Do Differently (Tips for Vibe Coding at Scale)

Write phase documents first

The phased approach was the single best decision. Nine documents describing what each phase should contain, implemented sequentially. The AI had clear scope, clear deliverables, and could work autonomously for hours. Without phase docs, I’d have been micromanaging every feature.

Use audit documents as task lists

Three times during development I had the AI audit its own work — scanning for placeholder data, broken routes, dead code, security issues. Each audit produced a markdown checklist. Then I pointed the AI at the audit document and said “fix everything.” Self-auditing is one of the most powerful vibe coding patterns.
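To make the "audit document as task list" idea concrete, here is a hypothetical sketch of pulling the open items out of a markdown checklist. The post does not show the real audit files, so the GitHub-style `- [ ]` checkbox format, the sample items, and the `open_tasks` helper are all assumptions:

```python
import re

# Matches "- [ ] task" or "- [x] task" (also "*" bullets); assumed format.
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def open_tasks(markdown: str) -> list[str]:
    """Return the text of every unchecked checklist item, in order."""
    tasks = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            tasks.append(match.group(2).strip())
    return tasks


# Hypothetical audit file, loosely based on the audit categories named above.
audit = """\
# Audit
- [x] Replace placeholder data
- [ ] Remove dead code
- [ ] Fix broken routes
"""

for task in open_tasks(audit):
    print("TODO:", task)
```

In the workflow described here the AI reads the checklist itself; a script like this would only be useful for tracking how much of an audit is still open.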

Build, then QA by actually using the app

Don’t try to get it right the first time. Build fast, then test the running application yourself. The prompts that produced the best results were the ones describing real UX problems I found by using the thing: “the spinner never stops”, “I can’t drag the window”, “this dropdown needs styling.” The AI is better at fixing problems you can describe than predicting problems you can’t.

Keep prompts short

My median prompt was 12 words. The most effective prompts were one-sentence directives. The AI doesn’t need context — it has the codebase. It needs direction. “Make every table column sortable. Commit to a branch and open a PR” is a perfect prompt. It’s a complete work order in two sentences.

Let it run

Some of my sessions ran 8+ hours. The temptation is to check in constantly. Don’t. Let it work. Check when you see a PR notification. The “fire and forget” pattern produced the most code per prompt of any approach.

Use the interrupt

When the AI starts heading the wrong direction, interrupt immediately. Don’t wait for it to finish a wrong approach. I interrupted 21 times on one machine alone. It’s not rude — it’s efficient.

Embrace the build-then-fix loop

30% of PRs were bug fixes. That’s not a failure rate — it’s the process. Build at 70%, QA it yourself, fix the remaining 30%. The cost of iteration has collapsed. Perfectionism on the first pass is waste.

Don’t read the code

This sounds counterintuitive. But the moment you start reading source code, you’re doing the AI’s job. Your job is to use the application, describe what’s wrong, and make product decisions. If you can test it, you can fix it — without ever opening a file.

Days 18–21: From App to Product

The first 17 days built an application. The next four turned it into a product. 176 commits. 40 pull requests. 21 additional AI sessions logging 111 million tokens across 1,124 API calls. The work shifted from features to everything around features — naming, licensing, release engineering, website polish, and launch readiness.

The Rename (Day 19)

KubeViewer officially became Clusterfudge. One prompt. PR #90 touched 140 files — every import path, every config reference, every UI string. The AI did a clean rename across the entire monorepo without breaking a single test. The brainstorming session from Day 12 finally paid off.

Going Open Source (Day 19)

Three PRs in rapid succession: add an MIT license (#93), build an OSS sync workflow to push a clean public repo (#94), and move internal GTM documents out of the public directory. The AI separated private strategy docs from public source code, set up an orphan branch sync to a separate GitHub repo, and wired it all into CI. Open-sourcing a project is exactly the kind of tedious, error-prone work that AI handles perfectly — lots of file moves, config changes, and workflow YAML that needs to be exactly right.

Release Engineering (Days 19–21)

This was the most surprising productivity gain. In three days, the AI:

  • Built a CalVer release workflow that auto-generates changelogs by prompting itself to summarise the diff (#95)
  • Configured self-hosted macOS runners and fixed Go toolchain PATH issues
  • Added Linux cross-compilation and APT repository publishing to the release pipeline
  • Wired an update checker into the title bar so users see new versions (#127, #128)
  • Cut four tagged releases (v2026.0319.1034, v2026.0319.1610, v2026.0319.1825, v2026.0320.2328)
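The release tags above look like a `vYYYY.MMDD.HHMM` scheme. That format is inferred from the four tags, not taken from the workflow file, and which clock or timezone the real workflow uses is unknown. Under that assumption, a tag could be generated like this (`calver_tag` is an illustrative name):

```python
from datetime import datetime


def calver_tag(moment: datetime) -> str:
    """Format a timestamp as vYYYY.MMDD.HHMM (format inferred from the tags)."""
    return moment.strftime("v%Y.%m%d.%H%M")


# Example: a build at 10:34 on 19 March 2026.
print(calver_tag(datetime(2026, 3, 19, 10, 34)))
```

One property of this scheme is that tags sort chronologically as plain strings, and two releases on the same day stay distinct as long as they are at least a minute apart.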

Release engineering is traditionally the work nobody wants to do. Workflow YAML, signing, packaging, distribution. The AI treated it the same as any other task — read the docs, write the config, iterate until CI passes.

Website Polish (Days 19–21)

The brochure site got dark/light mode with system theme inheritance, mobile responsiveness fixes, a demo GIF, copy-to-clipboard on the install command, and an interactive canvas hero background. Eleven PRs of pure front-end polish across the Astro site — the kind of work that makes a product feel real.

The Biggest Day (Day 21)

March 20 was the single busiest day of the entire project: 70 commits and 19 merged PRs. Launch prep creates a long tail of small fixes — a broken dark mode icon, a missing cursor pointer on a button, a namespace default that should be “all” instead of “default.” Each fix was its own branch, its own PR, its own merge. The AI handled the volume without slowing down.

The day ended with the start of the next feature: port forwarding. PR #126 planned the architecture. PR #130 built the dialog and wired it into the pod list. Even on launch day, the AI was already building the next thing.

Multi-AI Provider Support (Day 19)

The built-in AI debugging terminal was hard-coded to one provider. One prompt turned it into a pluggable system supporting multiple AI backends, plus added a local terminal mode for users who want to run their own tools. PR #92 — 573 insertions across 19 files. The AI refactored its own AI integration.
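The real implementation is Go code that this post never shows, so the following is only a sketch of the general "pluggable provider" idea, written in Python for brevity. The provider names come from this post; the executable names, the `Provider` type, and both functions are assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """One selectable backend for the in-app terminal (hypothetical type)."""
    label: str
    command: tuple[str, ...]  # executable to launch; assumed names, no flags


def build_registry(user_shell: str) -> dict[str, Provider]:
    """Map provider keys to launch commands instead of hard-coding one."""
    return {
        "claude": Provider("Claude Code", ("claude",)),
        "gemini": Provider("Gemini", ("gemini",)),
        "codex": Provider("Codex", ("codex",)),
        # Local terminal mode: just start the user's own shell.
        "local": Provider("Local terminal", (user_shell,)),
    }


def command_for(registry: dict[str, Provider], key: str) -> tuple[str, ...]:
    """Look up the command for a provider key, failing clearly if unknown."""
    try:
        return registry[key].command
    except KeyError:
        raise ValueError(f"unknown provider: {key!r}") from None


registry = build_registry("/bin/zsh")
print(command_for(registry, "local"))
```

The point of the pattern is that adding a backend becomes a one-line registry entry rather than a change to the terminal code.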

Blog Posts Written by AI (Days 19–20)

The meta moment: I pointed the AI at the codebase and asked it to write blog posts about the features it had built. PR #91 produced “Building a Pod Security Scanner You Actually Use.” PR #107 produced “How the Troubleshoot Engine Turns Status into Diagnosis.” The AI wrote marketing content about code it had written, for a product it had built. Turtles all the way down.

The Uncomfortable Truth

I built a 74,000-line, full-stack Kubernetes desktop application with a brochure website, release pipeline, and open-source distribution in 21 days. Every line AI-generated. 129 pull requests, all merged. Three self-audits. Security hardening. Unit tests for 44 components. Dead code cleanup. Four tagged releases. An APT repository. Auto-update notifications.

The uncomfortable truth isn’t that this was possible. It’s that this was easy. The hardest parts were product decisions — what to build, what to cut, what to name it. The engineering was the cheap part.

A year ago, this project would have taken a small team several months. Today, one person with AI tools and product instinct does it between checking emails. The leverage is absurd.

If you’re an engineer, the implication is clear: your value is in knowing what to build, not how to build it. Taste, product thinking, and the willingness to QA your own work are the skills that matter. The code writes itself.

If you’re a founder, the implication is bigger: the cost of building just dropped by an order of magnitude. The bottleneck is no longer “can we afford to build this?” It’s “should we build this?” That’s a taste question, not an engineering one.

1,025+ prompts. 21 days. 74,000 lines. 129 pull requests. 1.27 billion tokens. Four production releases. And the name? Clusterfudge. It made the cut after all.


Raw Data

Codebase Growth

How the codebase grew day by day. The initial import was templates and design system files; the real build started March 4.

         Net Lines of Code
         0        25K       50K       75K       100K      125K
         |---------|---------|---------|---------|---------|
Feb 28   ██████████▏                                          27K
Mar  1   ████████████▎                                        35K
Mar  2   █████████████▊                                       39K
Mar  3   ███████████████▏                                     43K
Mar  4   ████████████████████████████████▊                     80K  ← Phase 1-9
Mar  5   ████████████████████████████████▉                     81K
Mar  6   ███████████████████████████████████▍                  87K
Mar  7   █████████████████████████████████████████▊           102K  ← Peak commits in first 17 days (48)
Mar  8   ██████████████████████████████████████████▏          103K
Mar  9   ██████████████████████████████████████████▍          104K
Mar 10   ██████████████████████████████████████████▏          103K  ← Dead code cleanup
Mar 11   ██████████████████████████████████████████▋          105K
Mar 12   ██████████████████████████████████████████▊          105K
Mar 15   █████████████████████████████████████████████████▏   114K  ← Brochure site
Mar 16   █████████████████████████████████████████████████▍   116K
Mar 17   █████████████████████████████████████████████████▎   117K
Mar 18   █████████████████████████████████████████████████▎   117K  ← Rename + OSS
Mar 19   ████████████████████████████████████████████████████▉ 125K
Mar 20   ██████████████████████████████████████████████████████ 128K  ← Peak PRs (19)
Mar 21   ██████████████████████████████████████████████████████ 128K

Context Window Pressure Over Time

As the codebase grew, the AI needed more context to hold it in memory. Average context per API call tracked codebase size closely. The ~200K token limit created a hard ceiling that sessions started hitting once the codebase crossed ~80K lines.

         Avg Context (tokens)                                  Codebase
         0     25K    50K    75K   100K   125K   150K   175K
         |------|------|------|------|------|------|------|
Mar  2   ██████████████▍                                       39K lines
Mar  5   ██████████████████▊                                   81K lines
Mar  6   █████████████████████▎                                87K lines
Mar  7   █████████████████▍                                   102K lines
Mar  8   █████████████████████▎                               103K lines
Mar  9   ████████████████████▉                                104K lines
Mar 10   ██████████████████████▉  ← peak avg context         103K lines
Mar 11   ████████████▋            ← short sessions only       105K lines
Mar 16   ███████████████████▋                                 116K lines
         |------|------|------|------|------|------|------|
                              ▲ peak: 168K tokens (7 sessions hit ceiling)

Daily Activity

Commits, API calls (per machine), and output tokens by day. M1 and M2 refer to the two development machines.

Date    Commits  API Calls (M2)  API Calls (M1)  Output Tokens  Codebase
Feb 28  1        -               -               -              27K
Mar 1   7        -               -               -              35K
Mar 2   7        106             -               12K            39K
Mar 3   5        -               -               -              43K
Mar 4   35       -               -               -              80K
Mar 5   24       343             -               47K            81K
Mar 6   43       801             -               163K           87K
Mar 7   48       295             -               54K            102K
Mar 8   18       532             -               104K           103K
Mar 9   11       324             -               83K            104K
Mar 10  26       1,285           -               214K           103K
Mar 11  9        159             -               25K            105K
Mar 12  3        4               -               1K             105K
Mar 15  1        -               -               -              114K
Mar 16  43       794             1,506           117K           116K
Mar 17  5        -               -               -              117K
Mar 18  52       -               768             -              117K
Mar 19  49       -               233             -              125K
Mar 20  70       -               123             -              128K
Mar 21  4        -               -               -              128K

(- = no data logged)

API calls and token data are from Machine 2 through Mar 16, and Machine 1 from Mar 16 onward. Machine 1 logged 40 sessions across Days 17–21 with 2,630 API calls and 243 million tokens. Some days have commits but no API call data because those sessions ran on a machine without detailed logging.
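The daily commit counts can be cross-checked against the totals quoted elsewhere in the post. A short sketch, assuming Day 1 is Feb 28 so that Days 18–21 fall on Mar 17–20:

```python
# Commits per day, from the Daily Activity table.
daily_commits = {
    "Feb 28": 1, "Mar 1": 7, "Mar 2": 7, "Mar 3": 5, "Mar 4": 35,
    "Mar 5": 24, "Mar 6": 43, "Mar 7": 48, "Mar 8": 18, "Mar 9": 11,
    "Mar 10": 26, "Mar 11": 9, "Mar 12": 3, "Mar 15": 1, "Mar 16": 43,
    "Mar 17": 5, "Mar 18": 52, "Mar 19": 49, "Mar 20": 70, "Mar 21": 4,
}

total_commits = sum(daily_commits.values())

# Days 18-21 under the assumed day numbering (Feb 28 = Day 1).
final_stretch = sum(
    daily_commits[day] for day in ("Mar 17", "Mar 18", "Mar 19", "Mar 20")
)
busiest_day = max(daily_commits, key=daily_commits.get)

print(f"total commits: {total_commits}")
print(f"Days 18-21: {final_stretch}")
print(f"busiest day: {busiest_day} ({daily_commits[busiest_day]} commits)")
```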

The Full PR List

Every pull request, in order. All 129 merged. Zero rejected. Four PR numbers (#110, #117, #131, #132) were either closed or still open — the rest shipped.

PR Lines Changed Files What It Did
#1 +4,204 9 UI templates with demo data
#2 +1,790/-55 40 Collapsible sidebar, 18 resource pages
#3 +681/-12 9 Table header hover, hex grid gaps
#4 +1,224 1 Phase 9 spec document
#5 +139/-139 5 Rename frontend/ to ui/
#6-#8 +5,942/-22 44 Phase 1: Go backend foundation
#9 +9,479/-1,901 27 Phase 2: Project setup
#10 +7,254/-71 35 Phase 3: Core Kubernetes backend
#11 +5,569/-369 50 Phase 4: Frontend shell
#12 +5,956/-205 99 Phase 5: Resource views
#13 +2,456/-120 28 Phase 6: Log streaming, exec terminal
#14 +2,950/-59 32 Phase 7: Helm, YAML editor
#15 +3,050/-713 46 Phase 8: Polish, packaging
#16 +4,968 64 Phase 9: Competitive features
#17-#20 +1,352/-531 64 Review feedback, CI fixes
#21 +3,190/-2,441 79 Replace all placeholder data
#22-#28 +2,237/-1,541 90 Audit fixes, security hardening
#29-#35 +2,763/-477 83 ESLint, settings wiring, build
#36 +151/-157 33 Make all columns sortable
#37-#43 +2,270/-247 76 Error handling, hex grid, no-explicit-any
#44-#45 +4,818/-6 50 Unit tests (44 components + 4 handlers)
#46-#55 +10,139/-583 98 Feature parity: secrets, CRDs, port forwarding, Helm repos
#56-#60 +2,652/-245 52 Competitor analysis, resizable panels, welcome redesign
#61-#67 +1,363/-3,677 45 Cursor fixes, beta toggle, dead code removal
#68-#74 +993/-357 40 Documentation, audit items, backups
#75 +0/-0 1 Fix Cmd+Tab icon size
#76-#79 +2,110/-54 33 AI debugging terminal
#80-#86 +10,685/-419 681 Wire placeholder pages, brochure site, monorepo
#87 +2,796/-2,738 23 Vibe code stats analysis
#88 +664/-397 19 Bottom tray redesign as dock with pod picker
#89 +169/-50 10 Fix terminal resize, exec pipe error, log timestamps
#90 +724/-707 140 Rename KubeViewer → Clusterfudge
#91 +325/-238 13 Blog post: Pod Security Scanner
#92 +573/-262 19 Multi-AI provider support, local terminal
#93-#94 +472/-693 49 MIT license, OSS sync workflow
#95 +938/-4 7 CalVer release workflow
#96 +808/-161 11 Welcome screen redesign
#97-#99 +4,762/-7,024 26 Website audit fixes, mobile responsiveness
#100-#101 +8,286/-8,415 23 Demo hero image, website cleanup
#102 +208/-66 27 Light mode audit across full UI
#103-#106 +66/-34 12 PATH fix, binary size copy, version refs, blog date
#107-#108 +129/-234 7 Blog post: Troubleshoot Engine, docs review
#109, #111–#116 +1,227/-1,786 52 Restore blog posts, README overhaul, Lens comparison, mobile fixes
#118-#119 +12/-12 3 Dark mode icon fix, cursor theme toggle
#120 +1,132/-3 12 Demo cluster setup with Kind + Podman
#121-#123 +36/-1,156 19 Beta nav fix, default all namespaces, hero text
#124-#125 +108/-25 6 Update asset names, AWS CLI PATH fix
#126 +390/-57 6 Port forwarding architecture plan
#127-#129 +185/-366 8 Update checker fix, title bar notification, release v2026.0320
#130 +338/-2 4 Port forward dialog
#133 +1/-1 1 Lint rule strengthening

The “s-curve” is visible. Early PRs were massive foundation work (+9,479 lines). Middle PRs were targeted fixes (+151 lines to make columns sortable). Late PRs swung big again for the monorepo restructure and brochure site (+8,769 lines in PR #82). The final stretch (Days 18–21) shifted to product work (rename, OSS, releases, website polish) with bursts of small, focused PRs. March 20 alone produced 19 merged PRs and 70 commits, the single busiest day of the project.