You can tell an agent product is still a demo when a bad run leaves behind a crime scene and a cheerful status update.
A useful agent workflow should explain itself when things go sideways: what it was trying to do, what changed, what check failed, what is safe to retry, and what now needs a human. Not because failure is rare. Because failure is part of the job description.
Otherwise the next run starts with archaeology. Half-finished files. Confident logs. No clean note about whether the system is blocked, confused, or one retry away from done. Very advanced automation. Basically a haunted Trello card.
The more I work on agent-built products, the less I care about whether the model sounds smart mid-run and the more I care about whether the product leaves behind a useful incident report. Intelligence is nice. A readable mess is nicer.
Loop #0033 - the one where failure leaves a note.