There’s a strange bargain being struck in the world of AI. Companies are launching powerful tools in what feels like a perpetual beta, with developers and early adopters serving as the de facto QA team. We hear endlessly about the power of these new coding assistants, yet so many users are caught in a frustrating loop of one step forward, two steps back. Features like Claude Code’s Sub Agents are added to amplify their capabilities, but without careful configuration and supervision, they often just produce a high volume of unhelpful code.
In this race for market and mind share, it feels as though the Overton window for what is considered “production-ready” has been fundamentally shifted. We have tacitly agreed that it’s okay for our most advanced tools to be non-deterministic—for them to be confidently wrong. This has created an unprecedented dynamic where the burden of managing a tool’s inherent flaws is passed from its creator to its user. I can’t think of another major technology wave where the responsibility to use a product correctly—to work around its core deficiencies—was so squarely placed on the customer.
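To make the shape of that burden concrete, here is a minimal sketch, in Python, of the defensive wrapper users now routinely write around a model call. Everything in it is illustrative: `call_model` is a hypothetical stand-in for whatever SDK you actually use (simulated here with a coin flip), and the pattern of validate, retry, and fail loudly after a fixed budget is the point.

```python
import json
import random

MAX_ATTEMPTS = 3

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM SDK call.

    Simulated here as a model that returns malformed JSON about half
    the time, which is the failure mode the wrapper below absorbs.
    """
    if random.random() < 0.5:
        return "Sure! Here is the JSON you asked for: {oops"
    return json.dumps({"answer": 42})

def extract_json(prompt: str) -> dict:
    """Ask the model for JSON and refuse to trust it until it parses.

    This retry-and-validate loop is boilerplate the user writes to
    absorb the tool's non-determinism: the vendor ships the flaw,
    the customer ships the workaround.
    """
    last_error = None
    for _ in range(MAX_ATTEMPTS):
        raw = call_model(prompt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            # Feed the failure back into the prompt and sample again;
            # the next attempt may succeed, or it may not.
            prompt = f"{prompt}\n\nYour last reply was not valid JSON: {exc}"
    raise RuntimeError(f"model never produced valid JSON: {last_error}")

print(extract_json("Return the answer as JSON."))
```

None of this logic has anything to do with the user’s actual problem; it exists purely to manage the tool’s unreliability.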
As we release ever more powerful models, the immediate risk isn’t just a catastrophic failure, but a more pervasive and corrosive frustration for the people trying to integrate them into their work.
Out of this mess, a new cottage industry is being born. Communities of practice are forming to share tips and workarounds. A massive opportunity has opened up for consultants, service providers, and content creators who can help others make sense of the chaos. In the short term, the people who will gain an edge are those who persistently experiment, learn from the collective, and keep tweaking these tools until they behave. The tech giants will surely improve their models over time, but we are still a long way from a truly seamless experience.
This same friction between expectation and reality is why the first wave of AI-native hardware startups failed so miserably. Their products collided head-on with what we demand from a physical device: reliability. A piece of hardware is supposed to just work. Accepting AI-native gadgets would require a cognitive upheaval from consumers, a new willingness to accept a certain level of failure from our devices. We don’t extend that tolerance to our phones, and the inconsistency of early voice assistants like Alexa bred a deep-seated distrust that they never fully overcame.
So, how do we, as a consumer base, become okay with the non-deterministic nature of AI? No one has a clear answer. It’s a dealbreaker in coding, in finance, and in countless other B2B scenarios where precision is non-negotiable. The only places this unpredictability is celebrated are in creative applications—image and text generation—where the hallucinations are reframed as serendipity.
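For anyone who wants the mechanics behind “non-deterministic”: generation is sampling. A model assigns a score to every candidate next token, a temperature parameter rescales those scores, and the output is drawn from the resulting distribution. The toy sketch below, in plain numpy with a four-token vocabulary and scores invented purely for illustration, shows why the same prompt can yield different answers, and why temperature zero buys consistency but not correctness.

```python
import numpy as np

rng = np.random.default_rng()

# Made-up next-token scores (logits) for a toy four-token vocabulary.
vocab = ["42", "41", "forty-two", "unknown"]
logits = np.array([2.0, 1.5, 1.0, 0.2])

def sample(logits: np.ndarray, temperature: float) -> str:
    """Draw one token after temperature-scaling the distribution."""
    if temperature == 0.0:
        # Greedy decoding: deterministic, but determinism is not
        # correctness; the argmax can still be confidently wrong.
        return vocab[int(np.argmax(logits))]
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return vocab[rng.choice(len(vocab), p=probs)]

for t in (0.0, 0.7, 1.5):
    print(f"temperature={t}:", [sample(logits, t) for _ in range(8)])
# At temperature 0 every run agrees; as it rises, the same prompt
# yields different answers. Serendipity for art, a dealbreaker for finance.
```

The gap between consistent and correct is exactly the gap the enterprise cares about.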
When you look at the mountain of capital that has poured into the AI market over the last five years, the playbook so far looks like a combination of storytelling, freebies, influencer marketing, and a general fake-it-till-you-make-it attitude. For the valuations to make sense, the industry has to solve the problem of reliability. It must meet the deterministic expectations of the enterprise and the average consumer, or risk the entire boom ending in another long AI winter before fundamental breakthroughs such as continual learning come about. All in all, what a time to be alive!