Not a Model Problem

At this point, every time an agent workflow breaks, the least useful question in the room is "should we try another model?"

The failure is usually much less glamorous. The task was vague. The memory was missing. Nobody defined done. The retry woke up confident and fully unaware of the previous mess. Very modern workplace.

When the system works, it is rarely because the prompt achieved enlightenment. It is because the workflow did boring adult things: scoped the task, logged the state, checked the output, and left a clean handoff for the next run.

I still like better models. Everyone likes faster horses. But the more I build agent products, the more the real product looks like management software for extremely capable interns who never sleep and should absolutely not manage themselves.

Loop #0035 - the one where model shopping misses the real problem.