The biggest mistake I see engineers make with AI coding tools isn't using them too much. It's using them the same way regardless of what they're building.

Whipping out Claude Code for a production service the same way you'd use it for a throwaway script is how you end up with a codebase nobody can maintain (including you).

After months of using AI across everything from quick scripts to full product builds, I've settled into a three-tier system. The rules change depending on what you're shipping.

Tier 1: Simple scripts & throwaway tasks

Fire up Claude Code in the terminal, give it a clear goal, and take a walk. Don't bother reviewing the code; just check the output and the final product. These tasks are low-stakes, fast, and the cost of a mistake is near zero. Let the AI cook. Of course, if you're doing things like giving it write access to anything, review the code and be extremely careful. I'm mostly talking about simple read-only tasks. To make this even safer, dump some of your data from the database as a CSV instead of handing over an API key if you're worried. Criteria for this tier:

  • One-off data transforms, file renaming, quick automation

  • Nothing going to production, no review needed

  • Speed is the only metric that matters here
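The CSV-dump idea above can be sketched in a few lines. This is a minimal example assuming a SQLite database; the `dump_table_to_csv` helper and the table name are hypothetical, but the point stands for any database: hand the agent a flat file, not credentials.

```python
import csv
import sqlite3

def dump_table_to_csv(db_path: str, table: str, out_path: str) -> int:
    """Export one table to CSV so an agent can read the data
    without ever touching database credentials. Returns row count."""
    conn = sqlite3.connect(db_path)
    try:
        # Table name comes from you, not the agent, so interpolation is fine here.
        cur = conn.execute(f"SELECT * FROM {table}")
        rows = cur.fetchall()
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow([col[0] for col in cur.description])  # header row
            writer.writerows(rows)
        return len(rows)
    finally:
        conn.close()
```

Point Claude Code at the resulting file and it can transform, summarize, or sanity-check the data with zero blast radius.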

Tier 2: Small feature additions

Use an IDE like Cursor or Claude in VS Code. Understand the context before you let it write anything. Give it clear acceptance criteria, ideally connected directly to your ticket via a Linear MCP server or similar.

  • Let it write the code and the tests

  • Make sure it knows the build command and has access to logs

  • Your review job: focus on code quality, readability, and integrations with other internal services. I find the logic itself is usually correct (especially if given the right tools) but it can make bad assumptions about internal tools that are poorly documented. Make sure these integrations are correct and tight.

Tier 3: Large projects & product builds

Write a detailed spec (PRD) before a single line of code gets written. Your job shifts from coder to supervisor. I still only use one core agent for this (typically Claude Code or Codex) but heavily utilize subagents for speed and review. This is as simple as telling Claude to “use subagents”.

  • Everything must be testable via code (Playwright MCP is great for this)

  • Make it explicitly call out what it can't test (OAuth flows, DNS, etc.) so you handle those manually

  • Never go more than 10 minutes without checking in, and give it clear success criteria each time

  • For products where AI is the core: keep the file system human-readable and consolidate all prompts in one place so you can iterate on them by hand

  • Invest in good logging and tracing. Things will break; you need to be able to figure out what broke and why.
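As a sketch of what "good logging" can mean in practice, here's one pattern using only Python's standard library: tag every log line for a given run or request with a short trace ID, so when something breaks you can grep out the full story for that one request. The `get_request_logger` helper and the "checkout" name are assumptions for illustration, not a prescribed setup.

```python
import logging
import uuid

def get_request_logger(name: str) -> logging.LoggerAdapter:
    """Return a logger whose every line carries a per-request trace ID."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s trace=%(trace_id)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    trace_id = uuid.uuid4().hex[:8]  # short ID, unique enough to grep for
    # LoggerAdapter injects trace_id into every record from this logger.
    return logging.LoggerAdapter(logger, {"trace_id": trace_id})

log = get_request_logger("checkout")
log.info("starting payment flow")
log.warning("provider call slow, retrying")
```

In production you'd likely reach for structured JSON logs or a tracing library, but even this much turns "something broke overnight" into a five-minute grep.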

I’m working on a YouTube video breaking this down in a little more depth. Stay tuned for that.

Have a good week!
Arjay.
