Every time an agent run goes sideways, the first instinct is to blame the model. Very convenient for the workflow that handed it a vague task, no memory, and a finish line drawn in pencil.
A lot of agent demos still work like magic tricks. One impressive output, everyone nods, nobody asks what happens on run two when the task is messier, the state is stale, and the last agent left a small crime scene in the repo.
Real agent-built products look less like hiring a genius and more like running a weirdly competent shift. Scoped tasks. Stable memory. Checks before done. Stop conditions before improvisation. A handoff the next run can survive without spiritual interpretation.
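For the literal-minded, here is roughly what that shift could look like in code. This is a minimal sketch, not anyone's actual framework: `Task`, `Handoff`, `run_shift`, and `fake_step` are hypothetical names invented for illustration, memory is assumed to be a plain dict carried between runs, and the "agent" is a stand-in function.

```python
from dataclasses import dataclass, field
from typing import Callable

State = dict  # assumed: persistent memory carried from one run to the next


@dataclass
class Task:
    goal: str                              # scoped task: one concrete outcome
    checks: list[Callable[[State], bool]]  # "done" means every check passes
    max_steps: int = 5                     # stop condition: a hard step budget


@dataclass
class Handoff:
    status: str                            # "done" or "stopped"
    steps_used: int
    notes: list[str] = field(default_factory=list)  # what the next run should read


def run_shift(task: Task, state: State, step: Callable[[Task, State], str]) -> Handoff:
    """Call `step` until every check passes or the step budget is spent."""
    notes: list[str] = []
    steps = 0
    while not all(check(state) for check in task.checks):
        if steps >= task.max_steps:
            # Budget spent: stop and report instead of improvising.
            return Handoff("stopped", steps, notes)
        notes.append(step(task, state))    # the agent acts and updates state
        steps += 1
    return Handoff("done", steps, notes)


# Hypothetical stand-in for an agent step: pretends to fix one file per call.
def fake_step(task: Task, state: State) -> str:
    state["files_fixed"] = state.get("files_fixed", 0) + 1
    return f"fixed file {state['files_fixed']}"


task = Task(
    goal="fix 3 lint errors",
    checks=[lambda s: s.get("files_fixed", 0) >= 3],
)
memory = {"files_fixed": 1}                # state left behind by the previous run
result = run_shift(task, memory, fake_step)
print(result.status, result.steps_used)    # done 2
```

Nothing in there is clever, which is the point. "Done" is whatever the checks say, the step budget ends the run before it wanders off, and the `Handoff` notes are what the next run reads instead of guessing.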
Slightly rude to the fantasy, yes. But useful. The product edge is not the prompt that worked once. It is the operating structure that keeps working when nobody is in the mood to babysit the machine.
Loop #0029 — the one where prompt shopping loses the plot.