The Power and Peril of Using Multiple Simultaneous AI Coding Agents

By Ed Lyons

Image by Ed Lyons via Midjourney

I first heard of using multiple simultaneous coding agents on your code several weeks ago. I loaded up a YouTube video where a very enthusiastic promoter of AI coding tools was showing how he used Cursor to spin up multiple “background agents” to build his new application. After watching his very contrived and unrealistic example, I shook my head and thought, “This is madness.” I did not ‘like and subscribe’ as the host asked.

I am a big fan of agent coding, where I give instructions and the agent modifies the code for me across many files. However, significant supervision is required. The more changes you ask it to make in a single prompt, the more you need to supervise. So having more than one agent working in parallel felt like a recipe for disaster. There would be too much code to review, and there would be ugly conflicts due to agents all modifying the same files in different ways.

However, since then, the practice of using multiple agents has evolved, and is now available in more than one form. I have found there are many situations where they are useful.

An example from a major tool vendor is Cursor’s “background agents.” This feature gives you the ability to run Cursor in multiple, separate, cloud-based environments, each containing a copy of your project. You can then give each one a task that runs while you do something else, including using Cursor as you always have on your local machine.

This setup isn’t unique to Cursor. OpenAI’s Codex agents work in a similar way.

The recommended way to avoid merge conflicts is to use git worktrees. A worktree is a separate working directory checked out on its own branch, so the flow differs from switching between branches in a single directory, but it accomplishes a similar goal. Developers who use background agents typically assign each worktree to a different area of the project so that there will be no conflicts later. The agents usually deliver their work as pull requests, which the human developer then reviews.
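To make the worktree idea concrete, here is a minimal Python sketch that builds one `git worktree add` command per task. The repository path, the sibling-directory layout, and the `agent/<task>` branch naming are all hypothetical choices for illustration, not something any of these tools require.

```python
from pathlib import Path
import subprocess


def worktree_command(repo: Path, task: str) -> list[str]:
    """Build the git command that creates a worktree on a new branch for one task."""
    # Hypothetical layout: each worktree sits next to the repo, e.g. ../myrepo-add-tests
    worktree_dir = repo.parent / f"{repo.name}-{task}"
    # Hypothetical naming scheme: one branch per agent task
    branch = f"agent/{task}"
    # `git worktree add -b <branch> <path>` creates <branch> and checks it out in <path>
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(worktree_dir)]


def create_worktrees(repo: Path, tasks: list[str], dry_run: bool = True) -> list[list[str]]:
    """Return the commands for all tasks; run them only when dry_run is False."""
    commands = [worktree_command(repo, task) for task in tasks]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of touching a real repository
    for cmd in create_worktrees(Path("/work/myrepo"), ["add-tests", "new-client"]):
        print(" ".join(cmd))
```

Each agent would then be pointed at its own directory, so two agents never edit the same checkout, and each branch can become its own pull request.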

I am not a fan of reviewing agent work in a pull request. I would rather review changes in something like Visual Studio Code so I can make alterations right there before a local commit. Also, by running tasks in my terminal, I can stop a long-running task that is clearly headed in the wrong direction. Though I can imagine some kinds of tasks where I wouldn’t mind letting it finish without intervention.

Another issue to consider is what happens to the check-ins that agents typically do on a task to ask your permission for some operation or to explain their progress. I have found these check-ins to be essential. Of course, you can watch the multiple agent windows and work through the check-ins as they come up. But this isn’t what the promotional videos show, as it would severely reduce the speed gains of parallel work.

You could just turn off the requests. In an instructional multi-agent Claude Code video, I saw a developer set each Claude agent to “dangerous” or “YOLO” permissions mode, where it does whatever it wants to your system so that you aren’t interrupted by permission requests. Hmmmm... So if Claude labels using one agent in this way as “dangerous,” I wonder what we should call having multiple agents act dangerously at the same time?

This situation once again demonstrates the tradeoff between speed and safety, which is always part of using these tools. So sure, the dangerous multi-agent demonstrations look incredible. Yet I am quite sure your project manager, never mind your company’s chief security officer, does not want you using “YOLO” mode.

I can see using these kinds of agents in a large codebase where there are long-running, low-risk tasks in different sections of the project. I could have one agent work on adding and refining test cases while I was working on a feature elsewhere. 

There are other scenarios where a group of coding tasks is a well-understood assignment. For example, perhaps adding a new client at your company requires several standard coding and configuration changes to a codebase. You could then specify fine-grained permissions and access controls so that your agents have only what they need to add that client. Reviewing the resulting PRs would not be a problem at all.

But I rarely use simultaneous agents for my coding. The benefits of parallel work are severely reduced by the effort of managing the agents and reviewing their results. After all, my reviews are going to be sequential, not in parallel.
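A back-of-the-envelope sketch of that point, with made-up numbers: suppose four tasks each take an agent 20 minutes and take me 15 minutes to review. The simplifying assumptions here are that reviews start only after the agents finish and that managing the agents costs nothing, which flatters the parallel case.

```python
def wall_clock_minutes(agent_minutes, review_minutes, parallel):
    """Estimate total elapsed time when one person reviews every agent's output."""
    if parallel:
        # Agents overlap with each other, but the reviews still happen one at a time
        return max(agent_minutes) + sum(review_minutes)
    # One agent at a time, each followed by its review
    return sum(agent_minutes) + sum(review_minutes)


# Hypothetical figures, chosen only to illustrate the shape of the tradeoff
agent = [20, 20, 20, 20]
review = [15, 15, 15, 15]

sequential = wall_clock_minutes(agent, review, parallel=False)
parallel = wall_clock_minutes(agent, review, parallel=True)
print(sequential, parallel, round(sequential / parallel, 2))  # 140 80 1.75
```

Four agents yield a speedup of 1.75 in this toy model, not 4, because the human review time does not shrink.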

Note that the separate environments of the independent agents of OpenAI and Cursor have the advantage of a distinct context for all of their activities (one agent will have no knowledge of what any other has done). Experienced users of coding agents know that the context window and your session conversation history are important for improving accuracy, yet they are limited resources. 

So it is attractive to those power users that the agent you have assigned to test cases is not hearing a lot of noise on unrelated items, and can fill up its own context window with content that is just about testing.

Claude Code implemented this agent separation concept in a very elegant way, and it offers another method of using multiple agents that I believe is far more useful.

Claude’s new ‘sub-agents’ are not in the cloud, but all live in your local environment, and are more like separate personas of your primary agent. The main benefit is that there is a separate context for each one, so each can specialize. 

Claude has a nice setup feature where you can decide what you want your sub-agent to specialize in, and then you save its configuration. That new agent can either be used only on your project, or re-used across multiple projects on your machine. Claude will offer you a default description and a prompt to go with your queries, which is the typical mystical incantation to get it to produce better results.
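For a rough idea of what gets saved, here is a Python sketch that renders a sub-agent definition. It assumes, based on my understanding of Claude Code at the time of writing, that a sub-agent is a Markdown file with `name` and `description` front matter followed by the prompt, stored under `.claude/agents/` in the project or in your home directory. Check the current documentation before relying on these details; the agent name and prompt text are invented for illustration.

```python
from pathlib import Path


def render_subagent(name: str, description: str, prompt: str) -> str:
    """Render a sub-agent definition: front matter, then the system prompt."""
    # Assumed format; keep values simple, since a ": " inside them would need YAML quoting
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"{prompt}\n"
    )


def subagent_path(name: str, project_root=None) -> Path:
    """Project-level file if project_root is given, otherwise a user-level file."""
    base = Path(project_root) if project_root else Path.home()
    # Assumed location: <base>/.claude/agents/<name>.md
    return base / ".claude" / "agents" / f"{name}.md"


if __name__ == "__main__":
    text = render_subagent(
        "test-writer",  # hypothetical agent name
        "Writes and refines unit tests",
        "You are a testing specialist. Focus only on test code.",
    )
    print(subagent_path("test-writer", "/work/myrepo"))
    print(text)
```

The two return paths of `subagent_path` correspond to the choice described above: one agent scoped to a single project, or one shared across every project on the machine.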

When you save the configuration, Claude will invoke the sub-agent when it sees a good match between what you ask for and the sub-agents you have registered. Or you can ask Claude to use a specific agent explicitly by naming it in your prompt.

It is also possible to combine background and sub-agent techniques in Claude and have multiple sub-agents work on your codebase at the same time, though you would then run into the same tradeoffs you have with Cursor and Codex agents. 

As with background agents, the specialization benefits of sub-agents are only going to be worth it to developers who are experienced with these tools and can take advantage of the separate contexts.

In conclusion, think of using multiple coding agents as you would think of walking multiple dogs at once. If you’re going to hold four leashes at a time, you’d better be good with dogs, and you should know where you are going to take them. And you’d better have already done the walking route with just one dog. 
