Ask HN: How are people testing while using agent orchestrators?
A developer using Conductor reports challenges with testing code in isolated environments, particularly due to unreliable syncing via the 'Spotlight' feature and limitations of current workarounds. They have experimented with local setup scripts, per-PR staging environments using Terraform, and deploying directly behind feature flags, each presenting issues around port conflicts, cost, deployment delays, or risk of regressions. The developer is considering local VMs per worktree but is seeking input on existing solutions.
Opening excerpt (first ~120 words):
I'm using Conductor and overall it's been a game changer for my productivity. The one hiccup is that their "Spotlight" feature, which is supposed to sync the worktree with my root and thus make local testing possible, doesn't work reliably. Even if it did, it wouldn't be exactly what I need, because I want each workstream to be able to test independently.

Three things I've tried so far, none of which are working well:

1. I used a Conductor setup script that runs my local dev setup in each worktree. This didn't work because of port collisions between Docker containers.

2. I'm using Terraform, so it was trivial to spin up a copy of my staging infra (with fewer resources) for every PR. This let each Claude session in Conductor use Playwright to test its code.
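The port collisions in approach 1 can sometimes be sidestepped by deriving a distinct port offset per worktree inside the setup script. A minimal sketch, assuming a docker-compose stack that reads host ports from environment variables; the variable names (`WEB_PORT`, `DB_PORT`) and the hashing scheme are illustrative assumptions, not part of Conductor's API:

```shell
#!/bin/sh
# Sketch: give each worktree its own host ports so parallel
# docker-compose stacks don't collide. Hypothetical setup script.

WORKTREE_DIR="${1:-$(pwd)}"

# Hash the worktree path into a small number (0-99) with POSIX cksum,
# then offset the base ports by it. Two worktrees can still collide
# if their hashes match mod 100; widen the range if that matters.
HASH=$(printf '%s' "$WORKTREE_DIR" | cksum | cut -d' ' -f1)
OFFSET=$((HASH % 100))

WEB_PORT=$((3000 + OFFSET))        # e.g. app server
DB_PORT=$((5432 + OFFSET * 10))    # e.g. Postgres

echo "WEB_PORT=$WEB_PORT"
echo "DB_PORT=$DB_PORT"

# A setup script could then launch an isolated stack per worktree,
# using a distinct compose project name so containers don't clash:
#   WEB_PORT=$WEB_PORT DB_PORT=$DB_PORT \
#     docker compose -p "wt-$OFFSET" up -d
```

Each Claude session then points its tests (e.g. Playwright's base URL) at its own `WEB_PORT` rather than a shared default.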
…
Excerpt limited to ~120 words for fair-use compliance. The full thread is on Hacker News.