
March 2026

The Practice: What 78 Days of AI Engineering Taught Me

Not the data — the lessons. How showing up every day with a specific problem changed the way I think about building software.

Practice · AI Engineering · Build-in-Public
01

The Numbers

Total Tokens: 27.4M processed by Claude Code
Sessions: 1,200 over 78 days
Active Days: 68 of 78 (87% coverage)
Current Streak: 42 days, also the longest

37 times the text of War and Peace — processed in 78 days of daily practice.

I only know these numbers because Claude Code tracks them. I didn't set out to hit a token milestone. I set out to ship software for clients and family projects that had been stuck in “someday when I can afford a developer” for years.

The numbers are a byproduct of daily practice. That's the part worth examining — not the aggregate, but what produced it: 1-2 hour sessions most days, compounding over a quarter. The longest single session was 4 days, 14 hours, 7 minutes. But that's the exception. Most of the output came from showing up every morning with a specific problem.

02

What I Actually Built

Not toy apps. Not demos that need to be rebuilt properly later. Five production systems, all deployed, all serving real data to real users.

R9 Bull Sales

26,000+ auction lots · OLS hedonic pricing model · ML price prediction · Clerk auth · Docker + Caddy + GitHub Actions · 285 tests

R9 Herd

Prisma v7 + PostgreSQL · Breeding cycle AI logging · Calving board · 8-generation COI calculations · 453 tests

R9 Lineage

192,000+ animals · 250,000+ pedigree edges · Maternal line scoring engine · Semen company scrapers · 273 tests

CSL Capital Pipeline

BigQuery ML risk scoring on a $40M loan portfolio · 840 API calls → 6 · 30 minutes → 26 seconds · Proprietary performance data

OC Tendencies

113,000+ plays · 44 coordinators · Situation-specific play-calling aggregations · Live at nektarinsights.com

Each has auth that works, deployments that don't break, and test suites that catch regressions. The CSL Capital pipeline runs in production on a live loan portfolio. The R9 apps run on a DigitalOcean droplet with Docker Compose and Caddy. Real infrastructure, not scaffolding.

03

The Real Ratio

The productivity claims around AI coding have gotten out of hand. “100x developer.” “1,000x velocity.” I'm not going to say that.

Here's what I actually observed: roughly 6 to 8 times my hours in equivalent output. My 270 or so hours produced what a senior developer would need 1,500 to 2,000 hours to build: a full year of work, worth $230,000 to $400,000 at market rates of $150 to $200 an hour.

Six to eight times sounds modest next to the marketing claims. But that is the difference between shipping and not shipping. For someone without a development team, it's the difference between having production software and having a problem.

The ratio is not constant

Early in a new codebase: maybe 3x. After three weeks of deep context — where I've loaded the schema, the architecture, the edge cases — closer to 10x. The ratio compounds with practice. That's the key thing.

None of this is autonomous. I was present for every architecture decision, every schema choice, every judgment call about whether an output was actually correct. The AI generated the syntax. I directed the system. The leverage is real — but it requires a human in the loop who understands the domain.

04

What Changed About How I Think

The most significant change was not in the tools. It was in the feedback loop.

Week 1
Translating

I was converting. Domain knowledge on one side. Technical specification on the other. The bottleneck was me — getting specific enough that the system could execute clearly. The gap between what I knew and what I could articulate was enormous.

Week 3
Compressing

The loop tightened. I stopped translating between “what I need” and “what to ask.” The two began to collapse into one motion. I could open a session, describe a problem in plain language, and get something I could evaluate and refine.

Week 6
Pattern-matching

I could hand a hard problem to a session and know by the end whether the output was right — not because I read every line, but because my mental model of the system had gotten accurate enough to recognize the shape of correct.

This is the compounding no one talks about. The AI doesn't improve. You get faster at extracting value from it. The investment is in developing that feedback loop, and it only tightens through daily practice.

The difference between someone who uses AI occasionally and someone who uses it every day is not incremental. It's a different category of competency — the same way someone who plays an instrument for 15 minutes every day gets better faster than someone who plays for two hours once a week, even if the second person is putting in more total time.

05

Vibe Coding vs. AI-Native Engineering

There's a term in circulation: “vibe coding.” It means prompting your way to an output without understanding the underlying system — following the AI's suggestions and accepting what comes.

It works great for prototypes. It fails for anything you have to maintain or trust. What I'm describing is different.

Vibe Coding
  • Prompt → accept → move on
  • No test coverage
  • AI decides the architecture
  • Prototype quality
  • Works until it doesn't
AI-Native Engineering
  • Harness first, then routes
  • Smoke test before reading the code
  • Architecture decisions are yours
  • Production quality
  • 1,000+ tests across 5 apps

The discipline starts before any code is written. Before I ask for the first route or function, I ask for the test harness: an in-memory database, mocked external APIs, a framework that runs the full pipeline locally in seconds without credentials. That harness is the foundation. Without it, you can't iterate freely, you can't run CI, and you can't trust that anything actually works.
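
A minimal sketch of what that harness can look like, assuming a Vitest setup; the lot shape, the repository interface, and the scoreLots pipeline are illustrative stand-ins, not code from any of the apps above.

```typescript
// Hypothetical harness: an in-memory repository standing in for PostgreSQL,
// plus a mocked external pricing API, so the pipeline runs locally in
// seconds with no credentials.
import { beforeEach, describe, expect, it, vi } from "vitest";

interface Lot { id: string; breed: string; price: number }

interface LotRepository {
  insert(lot: Lot): void;
  all(): Lot[];
}

// In-memory stand-in for the real database: same interface, zero setup.
function inMemoryLotRepository(): LotRepository {
  const rows: Lot[] = [];
  return { insert: (lot) => rows.push(lot), all: () => [...rows] };
}

// The unit under test: average lot prices, then ask an external API to score them.
async function scoreLots(
  repo: LotRepository,
  scoreApi: (avgPrice: number) => Promise<number>,
): Promise<number> {
  const lots = repo.all();
  const avg = lots.reduce((sum, l) => sum + l.price, 0) / lots.length;
  return scoreApi(avg);
}

describe("pricing pipeline", () => {
  let repo: LotRepository;

  beforeEach(() => {
    repo = inMemoryLotRepository();
    repo.insert({ id: "1", breed: "Angus", price: 12_000 });
    repo.insert({ id: "2", breed: "Angus", price: 8_000 });
  });

  it("runs the full path locally with no external services", async () => {
    const scoreApi = vi.fn().mockResolvedValue(0.9); // mocked external API
    const score = await scoreLots(repo, scoreApi);
    expect(scoreApi).toHaveBeenCalledWith(10_000);
    expect(score).toBe(0.9);
  });
});
```

The point is the shape: the real adapters get swapped for cheap fakes at the boundary, and the whole pipeline becomes something you can run on every change.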

After that: smoke test before reading generated code. Does it start? Does the main path run? If the smoke test fails, the code isn't real, and there's no point reading it. If it passes, keep going.
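
In practice that can be as small as a script that boots the app and hits one representative route. The URLs and the health endpoint below are assumptions for illustration, not endpoints from the actual apps.

```typescript
// Hypothetical smoke check (Node 18+, built-in fetch): does it start, and
// does the main path answer? If either fails, there is nothing worth reading yet.
const BASE = process.env.SMOKE_BASE_URL ?? "http://localhost:3000";

async function smoke(): Promise<void> {
  const health = await fetch(`${BASE}/healthz`);
  if (!health.ok) throw new Error(`health check failed: ${health.status}`);

  const lots = await fetch(`${BASE}/api/lots?limit=1`);
  if (!lots.ok) throw new Error(`main path failed: ${lots.status}`);

  console.log("smoke passed");
}

smoke().catch((err) => {
  console.error(err);
  process.exit(1);
});
```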

The architecture decisions stay mine. The AI doesn't decide the schema. It doesn't decide the auth pattern. It doesn't decide how errors are handled at the boundary between the system and external APIs. Those decisions require domain understanding the AI doesn't have. The 1,000+ tests, the CI/CD pipelines, the Docker deployments, the database migrations — these are the evidence of engineering discipline applied to AI-generated code. The tools are generative. The practice is engineering.

06

Multi-Model Routing

Claude Code is where I live. But it's not the only tool in the workflow. Over a 40-day window within the same period, I ran 136 Codex sessions across 31 active days: a second model, used very differently.

Claude Code — Daily Driver
Sessions: 1,200 · Days active: 68 of 78 · Tokens: 27.4M · Median session: multi-turn

Codex — Escalation Specialist
Sessions: 136 · Days active: 31 of 40 · Tokens: ~1.1M (est.) · Median session: 1 message

The median Codex session is one message. That's not a daily driver — that's a surgical tool. The routing rule: if Claude produces a wrong diagnosis or gets the same bug wrong twice, stop and escalate to Codex before a third attempt. Don't retry the same reasoning model on the same problem.
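
Written down as a rule, it looks something like the sketch below; the attempt log and function names are mine, invented for illustration, not part of either tool.

```typescript
// Hypothetical escalation rule: two wrong Claude diagnoses on the same bug
// means the next attempt goes to Codex, not back to the same reasoning model.
type Model = "claude-code" | "codex";

interface Attempt {
  bugId: string;
  model: Model;
  correct: boolean;
}

function nextModel(bugId: string, history: Attempt[]): Model {
  const wrongClaudeAttempts = history.filter(
    (a) => a.bugId === bugId && a.model === "claude-code" && !a.correct,
  ).length;
  return wrongClaudeAttempts >= 2 ? "codex" : "claude-code";
}
```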

The 42 web searches and roughly 2,900 tool calls in those Codex sessions are mostly codebase exploration — reading files, tracing call stacks, isolating the actual root cause before any fix is proposed. The models are not interchangeable. They have different reasoning patterns, and routing between them deliberately is part of the practice.

Work with Nektar

Nektar embeds in your domain — spending enough time understanding how your business actually works that when we build the analytical layer, it reflects your reality rather than a generic interpretation of a spec. If you have deep expertise and a problem that's never been worth the cost of solving, let's talk.

Book a free data audit