
How We Use AI Coding Agents
Our engineering team's best practices for working with AI coding agents.
by YK Sugi

Last week, we had an hour-long session to discuss our workflows and tips for using AI coding agents. Here's what we learned.
Start with research, not code
One of the biggest lessons: don't fire up Claude Code or Cursor to start writing code the moment you get a task. Start with research.
For complex features, especially ones that touch multiple parts of the codebase, spend time in the Claude or ChatGPT desktop app first. Have an architecture discussion. Figure out which libraries to use, what the latest best practices are, how things should be structured. Alternatively, use plan mode in Claude Code, or just instruct it not to write code right away.
The goal is to build vocabulary. If you don't understand what you're describing, you won't be able to articulate it to the agent, and the agent won't know what you're talking about either. Do the messy, convoluted research separately. Then curate the best parts into a clean plan before bringing it to your coding agent.
Claude Code vs Cursor
Our team is split into a few different camps.
Claude Code only: Some of us use Claude Code exclusively. It's built by Anthropic specifically for Claude, so the system prompt and tool descriptions are supposedly optimized for their models.
Cursor only: Others prefer Cursor. The argument: it's ridiculously fast. For small, contained changes where you know exactly what needs to happen, Cursor's Composer rips through it. The workflow is to use Cursor's plan mode with Opus to generate a detailed plan, then switch to Composer, which executes the plan at high speed. Cursor's RAG over your codebase also pulls in relevant context automatically.
Switching between tools: Some of us go back and forth intentionally. Claude Code for complex agentic work. Cursor for speed on small surgical fixes.
"Claude completely crushes this" (and when it doesn't)
Some tasks AI rips through. Others, not so much.
Where AI shines:
- Isolated bugs in a single file
- Small, self-contained features
- Boilerplate and scripts
- Writing tests (sometimes too verbose, but functional)
- Tasks where you can easily verify the output
One engineer (Conner) described a secrets backfill task: "Just barfs Python that works. Super easy."
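A backfill like that is a short, linear script whose output you can spot-check by hand, which is exactly the shape of task agents handle well. A rough, hypothetical sketch of that shape (not the actual task, and the file names are made up):

```python
import json
from pathlib import Path

# Hypothetical backfill: copy secrets from a legacy JSON dump into a new
# one-file-per-secret layout, skipping anything already migrated.
legacy = json.loads(Path("legacy_secrets.json").read_text())
out_dir = Path("migrated_secrets")
out_dir.mkdir(exist_ok=True)

for name, value in legacy.items():
    target = out_dir / f"{name}.secret"
    if target.exists():
        continue  # already backfilled; easy to verify by listing the directory
    target.write_text(value)
    print(f"migrated {name}")
```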
Where AI struggles:
Code that touches everything. Auth is the classic example. It's a thin thread through a complex path with many pieces that need to work together. AI has a hard time with this.
Parallelism and concurrency. Anything with systems-level complexity. One example: AI wrote code that passed all tests but included logic along the lines of "if no available workers, sleep for 5 seconds." The tests passed, but the user experience would have been terrible.
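To make that failure mode concrete, here's a minimal sketch (paraphrased, not the actual code from that incident) of the poll-and-sleep pattern next to a blocking alternative that wakes up as soon as a worker frees up:

```python
import queue
import time

# Anti-pattern (paraphrased): poll the pool and sleep.
# Result-only tests still pass, but every caller can stall for up to
# 5 seconds even when a worker frees up immediately.
def acquire_worker_polling(available_workers: list) -> str:
    while not available_workers:
        time.sleep(5)
    return available_workers.pop()


# Blocking alternative: keep free workers on a queue and wait on it.
# The caller is woken as soon as a worker is released, with a timeout
# instead of a hard-coded sleep.
worker_pool: queue.Queue = queue.Queue()

def acquire_worker_blocking(timeout: float = 30.0) -> str:
    return worker_pool.get(timeout=timeout)  # blocks until a worker is free

def release_worker(worker_id: str) -> None:
    worker_pool.put(worker_id)
```

Both versions eventually return a worker, which is why result-only tests can't tell them apart; only a latency check or a human reviewer catches the difference.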
Staying in control
The biggest frustration with AI agents: they make a wall of changes you can't track.
Here's how we deal with it:
Use plan mode. Before letting the agent make changes, have it outline what it's going to do. Vet the plan. Agents have gotten eager lately. They'll implement something the moment you ask a question about it. Tell them explicitly: "Don't make any edits. Recommend what you're going to do first."
Work in phases. During planning, break the work into phases. Implement phase one, review the changes in your IDE's git view, make manual edits if needed, commit. Then tell the agent: "I've committed this with git hash X. Let's continue to phase two." This builds context in your head and creates checkpoints.
Commit incrementally. Some of us have Claude Code auto-commit after every chunk of progress with a descriptive message. This avoids the "what did it even change?" problem.
Use draft PRs. Have the agent create a draft PR with a description. Review everything before marking it ready.
Agents sometimes write code faster than you can keep track of. Structure the workflow so you can.
Working in parallel
One workflow that's been effective: run multiple Claude Code sessions at the same time.
Open 3-5 terminal tabs. Start a different task in each. While one is exploring the codebase, another might be fixing a CI issue, another helping you review a PR. Check back on each as they make progress.
Use git worktrees to avoid conflicts when running multiple tasks in the same repo. Ask Claude Code to create a new worktree for a task so you're not switching branches in your main working directory.
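Claude Code can run the git commands for you, but the underlying mechanism is just `git worktree add`. If you'd rather script it yourself, a minimal sketch might look like this (the helper and branch naming are hypothetical, not part of Claude Code or our setup):

```python
import subprocess
from pathlib import Path

def create_task_worktree(repo: Path, task: str, base: str = "main") -> Path:
    """Create a sibling worktree with its own branch for one agent session.

    Hypothetical helper: wraps `git worktree add -b <branch> <path> <base>`.
    """
    path = repo.parent / f"{repo.name}-{task}"
    subprocess.run(
        ["git", "-C", str(repo), "worktree", "add",
         "-b", f"task/{task}", str(path), base],
        check=True,
    )
    return path

# Example: one isolated checkout per Claude Code session.
# for task in ("ci-fix", "pr-review", "flaky-test"):
#     create_task_worktree(Path.home() / "code" / "myrepo", task)
```

Each session then runs in its own directory on its own branch, so nothing steps on your main checkout.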
This type of workflow has helped us crush CI issues in our open source repo, including one that had been sitting there for over a year, slowly eating away at everyone's attention.
PR reviews with AI
We use Greptile for automated PR reviews. It's good at catching details: off-by-one errors, edge cases, small bugs. It gives you the freedom to review at a higher level because you're not hunting for typos.
The downside: it's too nice. It praises everything. We wish it would be more critical.
The other approach is conversational review. Instead of a one-shot AI review, open Claude Code, check out the PR locally, and ask questions. "What does this function do?" "What does this line do?" You can start at a high level and dive in where you want to.
One thing we agree on: humans still own architectural review. AI can catch implementation bugs. It's not going to tell you if the overall design makes sense.
Join us
Want to join an AI-forward team with solid engineering processes? We're hiring. If any of the positions looks like a good fit, feel free to email me at yksu@eventualcomputing.com.