There’s a particular kind of organizational vertigo that happens when a new technology mandate lands before anyone has figured out what it actually means in practice. Everyone is nodding in meetings, the roadmap slides are full of the right buzzwords, and somewhere in the shuffle, the people who are supposed to be the bridge between what the technology can do and what the humans using it actually need get quietly moved to the side. I’ve been a UX leader long enough to recognize that pattern, and I’ve watched it play out with every major shift in how we build products. What’s different about AI isn’t the pattern. It’s the speed and the confidence with which organizations are moving before they’ve asked the questions that matter most.
When the mandate came down at my organization to go AI-first following a recent acquisition, I tried to understand what it actually meant before I reacted to it. What I found was that “AI-first” meant something very specific to the people delivering it, and something significantly more complicated to the people who had to make it work in practice. The assumption embedded in the directive was that AI could do the heavy lifting at the front end of the design process. Why spend weeks on discovery when you have a language model trained on more data than any research team could ever collect? Why talk to customers directly when you could synthesize their feedback at scale and prompt your way to a prototype by end of sprint?
That assumption is where things started to go sideways, and it’s the argument I’ve been making, in different ways to different audiences, ever since.
Peter Merholz has been writing about this tension more pointedly than almost anyone else in the field right now. He cited Figma’s 2025 research finding that teams using AI went faster but didn’t produce better work. They were generating more output, faster, but the quality of what they were producing wasn’t improving, because speed and quality are not the same thing, and AI doesn’t know the difference. That finding didn’t surprise me at all, because I was watching a version of it happen in real time. The output looked polished and moved fast, but it was confidently wrong in ways that were genuinely hard to detect until you held it up against what actual customers were telling us.
Jesse James Garrett has been making a related point about where the AI conversation in design is actually happening. His observation is that we’re talking almost entirely about the surface layer of screen design and visual output, and almost never about what AI is doing at the structure, scope, and strategy levels. Those lower layers of the design process are where human judgment isn’t just useful, it’s irreplaceable. And those are exactly the layers that were being skipped when prompt engineers were handed the keys to the product and called UX practitioners.
Here’s what that looked like on the ground. The organization had access to LLM-synthesized data drawn from support tickets and ad-hoc user comments, and on the surface, that data felt like insight. It was voluminous, it was organized, and it arrived fast. The problem was that what it was actually doing was reflecting existing assumptions back at us in a very sophisticated way. Support team opinions dressed up as user data. Stale information with no context for when or why it was generated. Comments captured from users who were frustrated in a moment and not necessarily describing the underlying problem they actually needed solved. The model had no mechanism for distinguishing between a user venting about a bad day and a user identifying a genuine workflow gap, so it ingested all of it and handed us a confident summary that felt like research but wasn’t.
My team was running actual user interviews through Maze and doing real synthesis alongside that LLM data. When we put both side by side, the gap was uncomfortable for everyone in the room, and not because the AI had produced something wildly off-base. It was wrong in the specific ways that matter most. It had optimized for what users said rather than what they meant. It had no way to ask the follow-up question that surfaces the real problem underneath a stated request. It couldn’t tell the difference between a feature request driven by genuine need and one driven by habit or a misunderstanding of what the product was already capable of doing. That distinction, between what someone asks for and what they actually need, is the core of what a skilled UX practitioner does. An LLM can tell you what people are saying. It cannot tell you whether what they’re saying is the thing worth building toward.
Getting that argument to land required different approaches for different audiences, which is honestly just standard UX leadership work applied to a new context. With product leadership, letting the research speak for itself did most of the heavy lifting. Showing the gap between what the model had synthesized and what our interviews actually revealed was more persuasive than any framework I could have invoked. With engineering, the argument was about downstream risk, because building to an unvalidated assumption is expensive regardless of how fast you got there. With executive leadership, the conversation was about what earns recognition in a competitive market. You don’t become an industry leader by shipping fast. You get there by shipping right.
None of this is a case against AI, and I want to be direct about that, because the UX community has a habit of staking out the human-centered side of this debate in ways that read as resistant rather than strategic. AI belongs in the product development process. It’s genuinely useful for synthesis, for pattern recognition, for surface-level generation that used to consume enormous amounts of designer time. My team uses it, and I actively encourage them to. The question has never been whether to use it. The question is what it can do, what it cannot do, and who carries responsibility for the judgment calls that live in the space between those two things.
That responsibility is what a DesignOps practice for an AI-first organization actually has to address. It’s not a checklist, and it’s not a set of guardrails you bolt onto a process that was already designed without you. It’s the governance layer that asks, before anything ships, whether the AI output has been tested against the reality it’s supposed to represent. It’s the feedback loop that catches the model when it’s confidently synthesizing assumptions into something that passes for data. It’s the team culture that holds the line between moving fast and moving right, because those are not the same destination even when they look like it from a distance.
Merholz and Garrett have both been saying versions of this in different ways for years now. Whoever controls the prompt controls the product. The conversation about AI and design is happening at the surface when it needs to be happening at the strategy layer. Design quality doesn’t manage itself, and no amount of AI acceleration changes that. What it does is accelerate everything, including your mistakes, at exactly the same rate as your wins.
We are in the business of building products that solve real problems for real people. Not the next shiny thing that looked great in a prototype and validated nothing before it shipped. Not a feature backlog assembled from support tickets passed through a language model and handed back as if it were customer research. Someone still has to talk to those people, understand what they’re actually trying to accomplish, and build the organizational infrastructure that keeps that understanding at the center of every AI-assisted decision that follows. That job hasn’t gone anywhere. The tools around it have changed, and the urgency to do it well has only grown.