The end of product development? Just give it to AI.
Is AI part of the product, or the product itself?
Number of baby/toddler interruptions while writing this text: 2
Apologies for the spicy title - but as you’ll soon learn, not only do I not believe there’s an end to product development; I also believe there are two separate strands of product development to think about here.
This post was spurred on by a recent juxtaposition I saw1 where “The Gen AI is the product - just iterate within it” and “Product development is still about UX and quality / risk management - AI is just a delivery mechanism” were put forward as two opposing extremes. As usual, I believe there’s truth to both, and a different framing is needed - one that I think helps explain why 2025 looks very different, AI-wise, from 2021 or 2019 or any other year where the same class of models existed but public opinion and framing were different.
The idea it boils down to is this - AI is the product AND products using AI are products. The GenAI that the labs are competing on right now is not a fundamental change of environment for digital product development (no matter how much they’d like you to believe it’s a stone’s throw from human-level AI), but it is a new component that shifts the paradigm - although the extent of that shift will differ depending on where on the Sceptic-Hyper spectrum you sit.
But to frame it in terms that, I think, are compelling to follow, I first need to take you on a journey through some of the AI jargon that is floating around and why it’s important.
A (selected) overview of AI terms
Not so long ago, I would’ve abhorred describing myself as “being in AI2”. To me, artificial intelligence is a term that makes most sense when reserved for the actual cutting-edge research where machines are being put to reason, to solve novel challenges, and to push the boundary of what silicon can do in relation to organic brains3.
I would’ve called myself a decision scientist, and 99% of what others would call AI, I would call automated decisioning. This is because, until the advent of generative AI, the only thing advanced machine learning models did was make decisions within a constrained set of options:
They might’ve decided to label a document with labels (from a constrained set) that are important for the business process
They might’ve decided to create a perimeter over a subset of an image, labelling it with a label from a set of options, to facilitate object detection
They might’ve decided to label a customer as high-risk for churn, or for fraud, which fed into a business process that processed those labels
… crucially, all of these decision processes had a very simple objective that was fairly well aligned to the space in which they made decisions (the decision space). This made them specialised decision engines.
But with what the industry is calling generative AI, the decision space exploded. The machines could now decide from a seemingly unconstrained space of natural language (LLMs, or Large Language Models, e.g. ChatGPT, Gemini or Claude) or visual imagery (text-to-image, e.g. Midjourney or Dall-E). They could decide, for a prompt, what the best continuation is: a series of words that forms a reasonably astute answer, if you squint while you look at it. Or they’d decide that, for a prompt, a certain arrangement of pixels would describe it well.
The unconstrained nature of the decision space that generative AI models make decisions in is incredibly important. This is what allows them to be generic - which is why we also call them generalised AI, or indeed, if you want, AGI (artificial generalised intelligence). AGI is a very loaded term though, which is why I make the distinction here. I think the generative AI models we have today would be better classed as Generalised Automated Decision Engines than Generalised Artificial Intelligence - for the same reason I wouldn’t have called myself someone working in AI 5 years ago: because automated decisioning from a constrained decision space and artificial intelligence are not the same kettle of fish. Well, at least to me.
I’ve used “decision engine” a couple of times now, so it’s worth explaining what I mean by that. You can think of automated decisioning (or really, any decisioning) as a simple process: you have a constrained set of inputs that inform the decision and a constrained set of outputs (we call them options in decision-science parlance), and a decision is made by using the decision engine to choose the optimal option.
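To make that concrete, here’s a minimal sketch of a specialised decision engine - the churn example, field names and scoring rule are all made up for illustration, not taken from any real system. Constrained inputs go in, a constrained set of options comes out, and the engine picks the option it scores highest.

```python
from dataclasses import dataclass

# Constrained input space: the only signals allowed to inform the decision.
@dataclass
class CustomerSignals:
    months_since_last_purchase: int
    support_tickets_last_90d: int
    on_legacy_plan: bool

# Constrained option space: the only decisions the engine can make.
OPTIONS = ["low_churn_risk", "medium_churn_risk", "high_churn_risk"]

def score(option: str, signals: CustomerSignals) -> float:
    """Toy scoring rule standing in for a trained, specialised model."""
    risk = (
        0.5 * signals.months_since_last_purchase
        + 1.0 * signals.support_tickets_last_90d
        + (2.0 if signals.on_legacy_plan else 0.0)
    )
    fits = {
        "low_churn_risk": risk < 3,
        "medium_churn_risk": 3 <= risk < 6,
        "high_churn_risk": risk >= 6,
    }
    return 1.0 if fits[option] else 0.0

def decide(signals: CustomerSignals) -> str:
    """The decision engine: choose the option with the highest score."""
    return max(OPTIONS, key=lambda option: score(option, signals))

print(decide(CustomerSignals(8, 2, True)))  # -> "high_churn_risk"
```

The objective (spot likely churners) and the option space (three risk labels) line up almost perfectly - which is exactly the alignment the next paragraph says generative models gave up.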
By using language and arrangements of pixels as the option space, Large Language Models and text-to-image models achieved something so far unthinkable in the late 2010s: they exploded the option space to a scale unfathomable to humans. This is because you can arrange pixels an almost infinite number of ways4, and natural language is also practically intractable to humans5. This explosion means there’s suddenly a vast array of objectives you can achieve with these models, as the output space is unconstrained. But what happened in doing so is that the tight alignment that specialised models had between the option space and the objective that guides them was broken. Remember, the objective of LLMs is to predict the next token - no more, no less.
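To underline how narrow that objective is, here is a hedged sketch of the generation loop - the `next_token_distribution` function and its toy probability table are stand-ins of my own invention, not any particular library’s API. Everything “generative” is just one constrained decision, which token comes next, repeated over and over.

```python
import random
from typing import Dict, List

def next_token_distribution(context: List[str]) -> Dict[str, float]:
    """Stand-in for a trained language model: maps a context to a
    probability distribution over a (large but finite) vocabulary."""
    # A trivial hard-coded table, purely so the loop below runs.
    table = {
        ("the",): {"cat": 0.6, "dog": 0.3, "<end>": 0.1},
        ("the", "cat"): {"sat": 0.7, "<end>": 0.3},
        ("the", "cat", "sat"): {"<end>": 1.0},
    }
    return table.get(tuple(context), {"<end>": 1.0})

def generate(prompt: List[str], max_tokens: int = 10) -> List[str]:
    """Generation is the same constrained decision, repeated:
    pick the next token, append it, ask again."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        distribution = next_token_distribution(tokens)
        choice = random.choices(list(distribution), weights=list(distribution.values()))[0]
        if choice == "<end>":
            break
        tokens.append(choice)
    return tokens

print(generate(["the"]))  # e.g. ['the', 'cat', 'sat']
```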
But this still wasn’t what caused the paradigm shift. Up until 2022, the models were still used by ML professionals in a specialised way, even if they did have generic outputs, and these professionals carefully chose what to feed as inputs.
This is where 2022 comes in: Sam Altman does the world a (dis)service6 and launches ChatGPT7. A chatbot interface was bolted onto these models, allowing anyone to input anything and ask the model for an output. In doing so, the input space exploded too, and suddenly you could type in anything and get a reasonable enough answer on the other side. This made the models truly generalisable and captured the public imagination.
I’d say the spectrums you should keep in mind are:
Output: Constrained Models ↔ Generative Models
Input: Constrained Models ↔ Chatbots
Function: Specialised Models (deep but narrow) ↔ Generalised Models (broad but shallow)
The key insight is that combining a generative model with a chatbot interface pushes a product far to the right on all three spectrums, creating a truly generalisable tool.
Enter the metaphor
The change that has happened is that something (automated decisioning algorithms) that was the domain of brainiacs in specialised labs (ML departments) has suddenly been unleashed to be used by your neighbour Steve, should he want to. Not only that, but he could use it without any of the interfaces that digital software is known for - he could just talk to it.

The previous paradigm shift this most resembles, in my mind, is the shift from specialised business mainframes to the personal computer being available to any household in the world. This is where I agree with Bill Gates8 - it’s similar to what the combination of the PC (once it reached a mass-market price point) and the GUI9 did for the adoption of computers as productivity tools. If PC+Windows was the inflection point for usage of deterministic automation (what software engineers do), then GenAI chatbots are the inflection point for usage of stochastic10 automation (what AI engineers do).
So there are a couple of ways you can take this metaphor further:
The new era means we have a new operating system (Chatbots) - in which we’ll be building new applications (The Microsoft Windows metaphor).
The new era means we have a new way to visualise output in relation to input - and we should expect outputs of software to be increasingly personal as they don’t have to be constrained (The GUI metaphor)
The new era means we have a new component in the complex system of personal computing; one that can handle infinite complexity and infinite outputs seamlessly, and can be programmed by your neighbour, Steve (The PC metaphor)
I make the case that we are largely11 living through the third one - the generalised generative model being the component that allows us to tackle problems differently. Under that metaphor, the AI model is both the product, in the same way the PC was the product that defined multiple companies (think Intel, AMD, Dell, IBM, Apple), and the component that enabled multiple other companies to build software for it (think the IT industry).
This would mean the AI labs are here to churn out increasingly better generalised models (your Googles, OpenAIs, Metas, DeepSeeks and Anthropics of the world) as components to be used by builders to develop their products (everyone else).
I do not think we are living in the Windows metaphor, because I think the stochastic nature of the chatbot isn’t what the end user will want for their day-to-day interface. They’ll want a stable interface, where they can invoke the AI when they need to - and, when they don’t invoke it, still get their printer to print or their slide deck to move to the next slide.
I do not think we are living in the GUI metaphor, because I think the complex outputs are only necessary for some tasks, and for the majority of applications the user will expect a level of output they can always trust - something that is not necessarily possible with a stochastic model alone.
So while the 'OS' and 'GUI' metaphors are tempting, they are ultimately misleading. The 'Component' metaphor is the most useful for builders. It frames GenAI not as a mystical new environment we live inside, but as a powerful, if sometimes unreliable, new part - like a CPU or a graphics card. The fundamental job of the product developer hasn't ended; they've simply gained a powerful new piece in their toolkit. The crucial skill remains component selection: knowing when to use this stochastic engine, and when to rely on the deterministic tools we already have.
So how to think about this for product development?
Developing products under the new paradigm
The skill that is becoming increasingly important, in product management but also in tech product development more generally, is recognising where it’s appropriate to implement a stochastic solution. These are solutions that don’t have to be right all of the time, but that can deal with complexity without too much faff. In product management parlance, you’re trading off accuracy for simplicity of execution.
This is where generalised models work very well for a couple of reasons:
They’re incredibly easy to prototype, without needing to hire specialised resource. This means the time to test product-market fit for a business problem should drop dramatically, as long as customers know, and can accept, that the system is fallible
They’re very good as a first-stab solution where you need a decisioning engine and can live with lower accuracy and/or higher cost; eventually you’ll swap in a specialised model to increase accuracy and/or lower the cost (sketched just after this list)
They suit niche use cases, where you can deliver 60% of the value at low effort until you understand more about that value; as long as you can live with the solution being stochastic rather than deterministic
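As a sketch of that second point (the interface and both implementations below are hypothetical - this isn’t any particular library’s API): if you hide the decision behind a small interface, you can prototype with a generalist model on day one and swap in a specialised, cheaper, more accurate model later without touching the rest of the product.

```python
from typing import Protocol

LABELS = ["refund_request", "delivery_issue", "other"]

class TicketClassifier(Protocol):
    def classify(self, ticket_text: str) -> str: ...

class GeneralistLLMClassifier:
    """Day-one prototype: prompt a general-purpose model.
    Quick to build, fallible, and relatively expensive per call."""
    def __init__(self, llm_call):
        self._llm_call = llm_call  # injected function wrapping whichever LLM you use

    def classify(self, ticket_text: str) -> str:
        prompt = f"Classify this support ticket as one of {LABELS}:\n{ticket_text}"
        answer = self._llm_call(prompt).strip()
        return answer if answer in LABELS else "other"  # guard against off-menu outputs

class SpecialisedClassifier:
    """Later replacement: a narrow model trained on your own labelled tickets."""
    def __init__(self, trained_model):
        self._model = trained_model

    def classify(self, ticket_text: str) -> str:
        return self._model.predict(ticket_text)

def route_ticket(classifier: TicketClassifier, ticket_text: str) -> str:
    # The rest of the product only sees this interface,
    # so swapping the implementation later is a one-line change.
    return classifier.classify(ticket_text)

# Stubbed usage: a lambda stands in for a real LLM call.
print(route_ticket(GeneralistLLMClassifier(lambda p: "refund_request"), "I want my money back"))
```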
We’ve had moments like this with technology before. In the early days of computing, you’d have computers doing all kinds of things they were overkill for, such as serving as microcontrollers in the networking systems of small-to-mid-sized organisations; eventually this got commoditised and is now done on specialised chips in specialised hardware. In the early Web 2.0 days, we had websites doing all kinds of things that specialised software would do. In the early Excel days, we had all kinds of solutions built in Excel that are better done as specialised software.
The big difference from all the historic paradigm shifts is that this one is stochastic - and Homo sapiens doesn’t play well with uncertainty. I am not sure I want to predict what will happen when the masses finally internalise that using chatbots comes with rolling a die. Especially for that startup that claims it can diagnose your child.
This informs what the incentive for the product development of the providers of generalised models should be: a lot more clarity on what the models can and cannot do and what their documented failure modes are. At the same time, they should be increasing the accuracy of the generalised models, and eventually, in the same way that the CPU and the GPU have split, we should probably start splitting generalised models into smaller models that are still pretty generalised within a domain, but are not good for all domains. A tiny degree of specialisation. This would be trading off generalisability for accuracy without sacrificing within-domain flexibility.
That all sounds great and exciting - so what’s the catch?
If you’re new to this blog, you might not know that I have a bone to pick with the industry at the moment. I am frustrated by both the sceptics and the hypers of the industry, and I believe a more nuanced view would be more helpful - and would actually get us to the benefits this paradigm shift facilitates, faster.
This is where I have a gripe with the amount of AI hype. The fact that we’re in the AI hype cycle means that the AI labs are actually de-incentivised from doing the one thing that would make it easier for organisations to use their solutions in the appropriate use cases: labelling very clearly where the failure modes are and what the models aren’t good at.
Without this, the models are just handed to organisations along with an existential directive to implement this new magical technology, as the industry is worried (for good reason) that it’ll be left behind if it doesn’t reap the benefits of the paradigm shift. At the same time, the people on the ground who should be using it are aware of the many failure modes through the media, but don’t have the knowledge to judge whether those failure modes matter for their business process or use case. This paralyses adoption. It also fuels AI sceptics and gives them a voice.
The hype cycle currently peddles the narrative that the AI models are omnipotent and will automate 50% of the workforce within a couple of years. To put it in as tempered terms as possible - this is unbelievably unlikely. It would’ve been equally12 unbelievable if someone had told you this in 201013. To step away from that narrative, however, means losing air-time and potential for capital and user traction. And given that some big AI labs live off the hype cycle exclusively, as they don’t have a product otherwise14, and are burning cash at an immense rate (looking at you, OpenAI) - disengaging from the cycle means ceding air-time, and therefore user-time, to those labs; which, even in a hype cycle, is just too risky when you’re basically producing a commodity.
And so I am not very hopeful for the product development of the models themselves - at least until the bubble bursts. It doesn’t help that all kinds of AI experts are coming out of the woodwork to sell the latest snake oil (the narrative you’ll hear this described as is: NFT peddlers have become AI peddlers), which, in the eyes of those with enough memory, further diminishes the industry.
But this shouldn’t stop us from using the models in ways that are appropriate - if we can take a breather and step away from the hype and the scepticism. So what’s the path forward? It’s not to wait for the hype bubble to burst, nor is it to blindly “implement AI”. The real work is to think like an engineer choosing a component. Recognise these models for what they are: powerful, general-purpose, stochastic decision engines. Use them where their fallibility is a fair trade-off for their speed and scale. The challenge for product leaders isn’t to become AI experts overnight, but to cultivate the product wisdom to know when to use this revolutionary new component, and when to leave it in the box. The future of product development isn’t over; it just got a lot more interesting.
Coding agents are a good place to start. If you don’t know where to go from there, look to your resident AI expert - just check that they weren’t an NFT expert three years ago.
I’d link to it, but I saw it on my phone on the tube, and I’m a chaotic person - it only occurred to me that I’d write a post about it 7 hours after I read it, and by that time I could no longer find it.
The fact I now do has less to do with the field changing dramatically (it hasn’t) and more with the fact that I can’t escape the necessity of branding myself that way in the current climate. It should be very clear, reader, that artificial intelligence used in the population at large (outside of academia) is a marketing concept.
It should also be clear, reader, that by estimates of some AI experts, we’ve not even broken the threshold of intelligence that a cat possesses, let alone humans.
It is, actually, finite, constrained by how many bits of colour you use and how many pixels the image has. This is still incalculably large but, crucially, orders of magnitude smaller than the resolution of reality. You might think this doesn’t matter; until you need to generate images with HD-level quality and pixel-level precision, at which point you’ll find that state-of-the-art diffusion models fail in the tails of the distribution quite often.
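For a sense of scale (assuming, purely for illustration, a 1920×1080 image with 24-bit colour), the number of distinct images is

$$(2^{24})^{1920 \times 1080} = 2^{24 \times 2{,}073{,}600} = 2^{49{,}766{,}400} \approx 10^{1.5 \times 10^{7}}$$

Finite, but far beyond anything you could enumerate - and still a coarse grid compared to reality.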
But the fact that humans can’t devise a formal system for natural language doesn’t mean one doesn’t exist; it just means we humans don’t bother trying to get to it, because we don’t need to - our brains solve that complexity for us. Whether LLMs have actually “cracked the formality” of natural language, or are just good approximations of it, is something I’ll leave to PhDs in computational linguistics to ponder, as I have neither the skills nor the qualifications to answer that.
More on why this is both a service and a disservice, later
I couldn’t live with myself if I didn’t say that Meta launched technologically the same thing a few months earlier and got massively dunked on; and Google did the same thing a few months earlier still, and got dunked on even more. OpenAI benefited from a) much better marketing; b) reinforcing the model to not fail on the basics, in ways that give it virality but not necessarily more utility; c) not being a big-tech company, so people - and more importantly, media - inherently forgave them more. OpenAI from 2022 onward is a master class in product marketing; but they’re not leaders in AI development. Hill I’ll die on.
I am incredibly humble
Graphical User Interface - i.e. moving from MS-DOS to Windows
If you’re new to this blog, think of “stochastic” as “there’s a gremlin in the room rolling a 1000-faced die every time the automation does something, and if they roll 2-9, you get a crap output. If they roll a 1, something catastrophic happens or a chicken randomly appears in your room”
Voice assistants are still a thing, and within voice assistants we should be thinking of applications that can work within that operating system - but I’d argue by allowing voice assistants to interact with other apps and receive their outputs, not by making the voice assistants do things themselves. However, Apple is not wrong to say that it will take a while before voice assistants can work as well as ChatGPT at mass-market price points.
Hill I’ll die on. Come at me.
And it is worth noting that a similar prediction has been made continuously since at least the 1970s.
The valuation of a company building a commodity in an ecosystem is very different from the valuation of a company building the new operating-system paradigm that will also become an omnipresent and magical god in just a few years.