My 3-Tool Rotation for AI Engineering
After 10 months of daily AI experimentation, here's my stack.
Hey!
Chris here. Welcome to Blueprint—the newsletter to help you build a winning engineering team.
I'm often asked about my AI tool stack.
People want a list of tools they can pick up and start using tomorrow.
But these tools change so quickly that the best engineers now work on a rotation instead.
My workflow changes almost every week, and sometimes multiple times a day. Models update without warning, so the tools that worked perfectly on Monday might be slower or broken by Friday.
Let me show you what's actually working for me right now—and more importantly, how I think about adapting when everything changes constantly. 👇️
📒 DEEP DIVE
My 3-Tool Rotation for AI Engineering
How I use each one (and the escalation system for knowing when to switch).

I started experimenting with AI engineering tools at the literal beginning of this AI era.
In February of this year, I was copying and pasting code into ChatGPT. Since then, I've been through Cursor, Claude Code's early versions, and probably a dozen other tools that either didn't stick or were replaced by better ones over time.
Right now, I'm rotating between 3 primary platforms:
1. Claude Code
This became my primary workhorse after the Opus 4.5 release. The model's tool usage is exceptional, especially at command-line execution.
While I initially found the planning mode annoying, I did a 180 once I finally committed to using it properly.
The planning mode asks every question I would have forgotten to ask. It predicts roadblocks and sequences work intelligently. Then you can flip it into "no prompt confirmations needed" mode and watch it execute.
I recently tested it on a feature the model estimated would take 10 weeks. After flipping it into no-confirmation mode, I left, made lunch, and came back 1.5 hours later.
It ran that entire time without asking me a single question. It wrote code, tested it, and validated it, all completely autonomously. I'd never seen anything like it.
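If you want to script that plan-then-execute flow instead of driving it by hand, here's a minimal sketch using Claude Code's headless mode. The flags are my assumptions about the current CLI (`--permission-mode plan` and `--dangerously-skip-permissions`), so check `claude --help` on your version before trusting it.

```python
import subprocess

FEATURE = "Add rate limiting to the public API"  # hypothetical task description

# Pass 1: plan only. Claude asks its clarifying questions up front and
# sequences the work without touching the repo.
plan = subprocess.run(
    ["claude", "-p", f"Plan this feature before writing any code: {FEATURE}",
     "--permission-mode", "plan"],
    capture_output=True, text=True,
).stdout

# Pass 2: execute the plan unattended ("no prompt confirmations needed").
# Only run this in a sandboxed checkout you're willing to throw away.
subprocess.run([
    "claude", "-p",
    f"Execute this plan step by step, testing and validating as you go:\n\n{plan}",
    "--dangerously-skip-permissions",
])
```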
2. Codex
I'm talking about the CLI-based engineering agent built on GPT-5.1 Max High. It's my diagnostic engine.
When Claude gets stuck, I move the entire context into Codex. It thinks and works more slowly, but it almost always finds the right answer.
3. Gemini
Gemini isn't a primary workhorse (yet), but the Nano Banana multimodal features have been really impressive.
I can take foreign codebases I've never seen before, generate descriptions, feed them into Nano Banana, and get infographics that make everything comprehensible. This allows me to visualize codebases in a way I never have before.
I don't think this way naturally, but as I've played with the tool more, I've found ways to use it that accelerate my learning dramatically.
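If you want to try the same trick, here's a rough sketch of that loop using the google-genai Python SDK. The model name and the summary file are assumptions on my part rather than a fixed pipeline; substitute whatever your account actually exposes.

```python
from google import genai

client = genai.Client()  # picks up GEMINI_API_KEY from the environment

# A plain-language codebase summary I'd generate with another model first
# (hypothetical file name).
summary = open("codebase_summary.md").read()

prompt = (
    "Turn this codebase description into a single architecture infographic "
    "showing the main modules and how data flows between them:\n\n" + summary
)

response = client.models.generate_content(
    model="gemini-2.5-flash-image",  # "Nano Banana"; the exact name may differ
    contents=prompt,
)

# Save the first image the model returns.
for part in response.candidates[0].content.parts:
    if part.inline_data:
        with open("architecture.png", "wb") as f:
            f.write(part.inline_data.data)
        break
```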
My Multi-LLM Escalation System
Having 3 powerful tools means nothing if you don't know how to orchestrate them.
Here's my exact process:
Step 1: Start with Claude Code for Speed
It's the fastest executor, great at Bash and agentic loops, and very reliable for multi-step workflows.
Step 2: Watch Its Thought-Stream
I always keep streams on because reading the model's real-time reasoning helps me steer. I look for signs of confusion, repeated failed attempts, or silent loops.
If it's stuck, I stop it and move on to the next step.
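There's no formal detector for this; it's pattern-matching on the stream. But as a toy illustration of the "repeated failed attempts" signal I'm scanning for, something like this captures the idea (the threshold and the action list are mine, not a feature of any tool):

```python
from collections import Counter

def looks_stuck(recent_actions: list[str], threshold: int = 3) -> bool:
    """Flag a session when the same command or edit keeps coming back.

    recent_actions: the last few tool calls or commands pulled from the stream.
    """
    repeats = Counter(recent_actions)
    return any(count >= threshold for count in repeats.values())

# Example: the same failing test command shows up three times.
actions = [
    "pytest tests/test_api.py",
    "edit api.py",
    "pytest tests/test_api.py",
    "pytest tests/test_api.py",
]
print(looks_stuck(actions))  # True -> stop Claude and escalate to Codex
```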
Step 3: Hand Off to Codex When It Gets Lost
I copy everything Claude generated and ask Codex: "One of my engineers has been working on this problem for three days and can't solve it. Diagnose what's wrong."
Codex will take its time to think, but it almost always finds a solution.
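Scripted, the handoff looks something like this. I'm assuming the Codex CLI's non-interactive `exec` subcommand and a hypothetical dump file for Claude's context; adjust for however you capture your sessions.

```python
import subprocess
from pathlib import Path

# Everything Claude produced: its plan, diffs, failing test output, and so on
# (hypothetical dump file; capture your sessions however you like).
context = Path("claude_session_dump.md").read_text()

prompt = (
    "One of my engineers has been working on this problem for three days "
    "and can't solve it. Diagnose what's wrong.\n\n" + context
)

# Hand the whole bundle to Codex and let it take its time.
subprocess.run(["codex", "exec", prompt])
```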
Step 4: Use API Keys If Both Struggle
Sometimes ChatGPT Max gets throttled. Calling the models directly with API keys skirts the limits and gives me full horsepower. I'll burn $20 in tokens without hesitation if it returns value.
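A minimal sketch of that direct fallback, using the OpenAI Python SDK's Responses API. The model name is a placeholder for whatever the strongest reasoning model on your account is, and the context file is the same hypothetical dump as above.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY; billed per token, not the app's limits

context = open("claude_session_dump.md").read()  # hypothetical context bundle

response = client.responses.create(
    model="gpt-5.1",  # assumption: swap in your account's strongest reasoning model
    input="One of my engineers has been working on this problem for three days "
          "and can't solve it. Diagnose what's wrong.\n\n" + context,
)
print(response.output_text)
```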
My guiding philosophy: Always use the best possible model.
Speed and price aren't a concern for me. Intelligence is the factor I care about.
The Training Bottleneck
I spend 25-40% of my time just learning about these tools and the models they're built on. I'll run experiments with no end goal other than to understand the AI better.
I've tried training my team to think the way I do. We've made progress on less creative tasks, like debugging.
But the results I get—the 1.5-hour autonomous sessions or the instant problem diagnosis—come from an interplay of instinct, experience, curiosity, taste, and metacognition.
Teaching this is like teaching comedic timing. You can't just hand someone a set list and make them a great comedian.
I'm pretty plugged in to the way this is done. I've been using these tools since the very beginning, but I still can't figure out how to package the teaching in a way that scales.
And maybe it's not supposed to scale. It might be more like developing taste.
I've been honing mine for 44 years and haven't found a shortcut to get there yet.
BEFORE YOU GO…
Listen, I get it. Switching tools every week sounds exhausting.
It would certainly be easier to find a tool that works and stick with it.
But that's not the world we're in anymore. The tool that works today is second-tier tomorrow and ancient technology a week from now.
The engineers who thrive right now are the ones who can read the release notes on a new model, test it against their current workflow, and switch within an hour if it's better.
It's occasionally tedious and requires constant learning, but it delivers incredible results.
If you want to build, if you want to stay ahead, this is what it takes now.
Talk soon,
Chris.