The Counterintuitive Decision

Everyone is building AI to replace the human in the loop. This week I deliberately put one back in. [Why — and what it taught me about wrong solutions and AI]

May 06, 2026

Okay so life happened… you know - the backyards, plantings, mulching, many trips to the store… yes all of that.

Which means not a lot of writing about the AI tax assistant happened… But here is what DID happen in the middle of it.

I decided to try and parse my W-2 by asking AI to write the entire system. Well, I did ask for the breakdown and all - if you have followed this thread so far, you know the drill [if not, see this Post]. For the coding, I could have used one of those slick vibe coding apps or Claude Code, etc., but I wanted to do “basic”. So I gave ChatGPT (free) some instructions and had it write some code. And it did, and it worked - a little bit.

The thing to understand about AI (which floats around quite a bit in noob AI circles but I totally did not appreciate) is that when AI reads $i-i!t, it doesn’t remember it automatically. Between every session, it forgets everything: the code, the context, the decisions, all of it. I realized that this was going to become complex really quickly because the code needs to be built iteratively. SO, that meant I needed to devise a way to create memory for AI. AND of the many ways I tried to do it, here are the TWO REAL learnings:

LEARNING ONE: Use a Claude Project and upload your key files as context - that way they do not have to sit in the temporary memory, because even in the version where you are logged in, having that kind of complexity (i.e., code) will mean that the context becomes super (and I mean SUPER) heavy exponentially fast.

Creating a project means the context is relatively small. [See sidebar1 below]

LEARNING TWO: Create a State file and update it with information from each session, with progress, roadblocks, decisions, architecture, etc. This is the memory you want to create. Upload it as a file in the Project documents. Ask AI to read this at the start of every session. [See sidebar 2 below]
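For the curious, here is roughly what a State file could look like if you kept it as JSON and updated it with a tiny script. This is a minimal sketch, not what my setup actually uses: the file name `project_state.json`, the four section names, and the helper functions are all made up for illustration.

```python
import json
from datetime import date
from pathlib import Path

# Hypothetical file name and section names; adjust to your own project.
STATE_FILE = Path("project_state.json")
SECTIONS = ("progress", "roadblocks", "decisions", "architecture")


def load_state(path=STATE_FILE):
    """Return the saved state, or an empty skeleton if no file exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {section: [] for section in SECTIONS}


def log_session(state, section, note, day=None):
    """Append one dated note to a section of the state."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    entry = {"date": (day or date.today()).isoformat(), "note": note}
    state[section].append(entry)
    return state


def save_state(state, path=STATE_FILE):
    """Write the state back to disk so it can be re-uploaded to the Project."""
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
```

At the end of a session you would call `log_session(...)` for each thing worth remembering, then `save_state(...)`, and upload the refreshed file. A plain text or Markdown file you edit by hand would work just as well; the point is that the memory lives in a file, not in the chat.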

[Sidebar1: Look up what AI context is if this is not familiar - it wasn’t to me before I dove into this thing.

Sidebar2: One could try to solve this problem by buying a subscription that allows a lot more resources. I haven’t bought a paid subscription yet. Frugality is going to be a consistent theme. I think doing more with less makes things more interesting and brings out human ingenuity.]

Okay, so back to the parsing of W-2s. Like I said, reading PDFs is not new, this problem has been solved. When I asked AI to solve it, I expected a good solution first up. But AI did not do a good job of it. It wrote a pretty convincing pile of code and added pretty sleek bells and whistles (fully annotated like a good dev, with a test file, and test code). I read bits of the code (I know, I know - I was being lazy) and it looked like it was going to work. And it did, with the test code. AI said, “You’re done, man!” Well, I wasn’t…

The moment I tested it on my W-2, it was utter chaos. What came back looked like this:

158AMOUNT 39AMOUNT 158AMOUNT 39AMOUNT…

My W-2 is printed multiple times side-by-side: Employee Copy, Federal Copy, State Copy. With that, my journey of debugging with AI began. A long runway of problem solving has given me (and most probably a lot of others like me) a really good intuition that is hard for AI to beat. At least in this case it was. AI tried to tinker around with the reading of the file - a bandage here, an antibiotic there - but it did not work.

This is when I had the biggest insight so far. AI does not solve problems holistically. It solves for the most proximate problem, which may not always be the best for the entire system.

This is also when I landed on LEARNING ONE and LEARNING TWO - we were trying to solve a complex problem and AI kept hitting its compute upper limit. It would keep asking me to start a new session. And lose ALL the context. SO I figured out the workarounds I mentioned above.

After a few iterations, I decided to take matters into my own hands. I stopped asking AI to fix its own fix, and told it to go look at how the world had solved this problem. Why try to reinvent the wheel for a pretty well-solved problem? I don’t know why AI did not do it already (or why this was not built into its intuition). Anywho, after this it came back with better solutions and a better diagnosis of the problem.

The parser didn’t see columns. It read left-to-right across all copies at once. Labels were nowhere near their values. Numbers were split across three separate lines. The regex would never be able to parse that reliably. Then it found a better library (pdfplumber) to solve this problem - pdfplumber reads with X/Y coordinates, so every word knows exactly where it sits on the page. It also built a local LLM fallback to parse the data, which did not quite work well either. I set that aside as a more complicated issue to be handled later. The improvement was material, but it did not take us to 100%, not even 90%. My W-2 is quirky - I’ll give it that…
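To make the X/Y idea concrete, here is a small sketch of how coordinates can untangle side-by-side copies. It is not the code AI wrote for me. It assumes each word is a dict with `text`, `x0`, `x1`, and `top` keys (the shape pdfplumber's `page.extract_words()` returns), and it assumes the copies sit in equal-width vertical bands, which may well not hold for a quirky W-2 like mine. The function names and the `y_tolerance` default are my own inventions.

```python
from collections import defaultdict


def split_into_copies(words, page_width, n_copies):
    """Assign each word to a copy using its horizontal midpoint.

    Assumption: the copies occupy equal-width bands across the page.
    """
    band = page_width / n_copies
    copies = defaultdict(list)
    for w in words:
        mid = (w["x0"] + w["x1"]) / 2
        idx = min(int(mid // band), n_copies - 1)  # clamp words on the right edge
        copies[idx].append(w)
    return copies


def to_lines(words, y_tolerance=3):
    """Rebuild reading order inside one copy: rows top to bottom, then left to right."""
    rows = []
    for w in sorted(words, key=lambda w: (w["top"], w["x0"])):
        # A word joins the current row if it is vertically close to the row's first word.
        if rows and abs(w["top"] - rows[-1][0]["top"]) <= y_tolerance:
            rows[-1].append(w)
        else:
            rows.append([w])
    return [
        " ".join(x["text"] for x in sorted(row, key=lambda w: w["x0"]))
        for row in rows
    ]


# With pdfplumber installed, the inputs would come from something like:
#   with pdfplumber.open("w2.pdf") as pdf:      # "w2.pdf" is a placeholder path
#       page = pdf.pages[0]
#       copies = split_into_copies(page.extract_words(), page.width, n_copies=3)
```

Once each copy is separated, labels and values that belong together end up on the same rebuilt line, which is exactly what the left-to-right reader could not give me.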

Then I met my friend and senior from Uni, Mithun, the other day at an Alumni meet. Mithun is a devoted engineer, quite unlike me. Turns out he is solving a similar problem, in a very different setting, but at the most abstract level (which I think engineers have an innate ability to extract - #plugforengineers) very similar to mine. Mithun advised me to use an AI model specifically built to read PDFs (the name escapes me and I need to call Mithun #mentalnote). Having two AIs multitasking and a system optimizing the operation would be a pretty cool set up.

But then I had another idea: why can’t I just have a human in the loop? Three things prompted this idea.

Thing ONE: I just need it to read mine.

Right now, I don’t need a complex system that reads all the bloody W-2s in the www (whole wide world). I just need it to read mine. Scale is a future problem. Complexity is a today problem.

Thing TWO: Focus is a finite resource.

Having two AIs would be beautiful, but it would need configuration and tuning on my munchkin Mac Pro. Mithun asked why I did not buy one of the other laptops that were more AI friendly. I DID consider it. Or more like AI did. But I have become quite the Apple guy now - the ecosystem works, this machine works, and that decision has been made. As Kahneman says, System 2 is a finite resource - I need it on the parsing problem, not on switching hardware or creating a cool two-AI architecture.

Thing THREE: I am one of the best PDF readers in the world.

(and by “I”, I mean “We”, as in The Human EYE)

Why can’t I help? It would decrease the complexity drastically in this system and add a maker-checker control on the most critical input to the entire system. So that’s what I am building right now. The base parser reads my W-2 but it is not 100% accurate. The human validation layer goes on top of it next.
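Since that layer is not built yet, treat the following as a guess at its shape rather than the real thing. It is a minimal maker-checker sketch: the parser is the maker, I am the checker. The function name `review_fields`, the prompt wording, and the audit record format are all hypothetical.

```python
def review_fields(parsed, ask=input):
    """Show each parsed value to a human; Enter accepts it, typing replaces it.

    `parsed` maps a field name to the value the parser extracted.
    `ask` is any function that takes a prompt and returns the human's answer,
    so it can be swapped out for testing.
    """
    confirmed = {}
    audit = []
    for field, value in parsed.items():
        prompt = f"{field} = {value!r}  [Enter to accept, or type the correct value]: "
        answer = ask(prompt).strip()
        final = answer or value  # empty answer means the parsed value stands
        confirmed[field] = final
        audit.append(
            {"field": field, "parsed": value, "final": final, "corrected": final != value}
        )
    return confirmed, audit
```

The audit list is the useful by-product: every correction I make is a record of where the parser went wrong, which tells me what to fix next without having to chase 100% accuracy up front.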

In all of this, I did not write a single line of code.

They say AI is going to take away coding jobs. Yes, to some extent, BUT NOT REALLY, not anytime soon. I needed my engineering chops to figure this out and spar with it quite a bit. It DID make it possible for me to get on with it. BUT I was the one who got on with it.

I would say: Me-the vision/the hustle; AI-the expertise/the straight and narrow.

This is Part 5 of an ongoing series on building a private, local AI tax assistant — one hour a week, on consumer hardware, without sending financial data anywhere.

Part 1: Building a Private AI Tax Assistant: In public, on a MacBook!

Part 2: The Infrastructure Tax

Part 3: The Blueprint

Part 4: It Actually Works. Kinda.

If you’re building something similar or have any questions/ideas to share, I’d love to hear from you. Cheers!

