Intent Engineering Is the Missing Layer of AI
In the AI space we are building AI systems to solve problems, what interesting is that we often build things and then we stare at the outputs and quietly ask ourselves a question we don't always say out loud: is this actually working?
The outputs look good. They read well. They feel right. But "feels right" isn't the same as "is right," and most of the time we can't tell the two apart because we never got clear on what right was supposed to mean in the first place.
I've come to think getting this right takes three layers of intentionality: prompt engineering, context engineering, and intent engineering. Most teams get good at the first two and assume the third will sort itself out. It won't. The third layer is where the real work lives.
Prompt engineering
Prompt engineering is the foundation. You're writing the instructions that tell the system how to behave, how to reason, how to respond. And it matters. A sloppy prompt gets you sloppy output.
But prompting tends to be a single-turn way of thinking. You ask, it answers, you move on. Plenty of problems fit that shape, and for those, a sharp prompt is enough. The trouble is that a lot of the problems worth solving aren't single-turn. They're conversations. They unfold. They need the system to gather information, adjust, and act over time. Prompting alone gets you part of the way, and then you hit a wall.
Context engineering
Context engineering is how you get past that wall, and it's bigger than most people think. It isn't just "what should I stuff into the prompt." It's designing the whole system so the AI can get to the information it needs, when it needs it.
Sometimes that's retrieval via RAG, pulling the right record at the right moment. Sometimes it's baking knowledge into a system prompt. Sometimes it's shaping how you ask the user a question so they hand you the very thing you were missing. The constraint underneath all of it is that you can't fit everything in. So context engineering is really about building pathways (plural) that let the system see what matters and ignore what doesn't.
Get this right and you've got a capable system. Capable, but not necessarily good. That distinction turns out to be the whole point.
Intent engineering
Here's the part almost everyone skips. You can have brilliant prompts and beautifully engineered context, and the thing can still do the wrong job perfectly well. Because nobody decided what the right job actually was.
Intent engineering is the work of getting specific about what you care about. Not the mission-statement version. The real version, the one that tells the system what to do when two good options pull in opposite directions.
Klarna is the cautionary tale here, and the interesting part is that the AI didn't fail. It worked. The assistant handled around two-thirds of the company's customer service chats, dropped resolution time from eleven minutes to two, and scored about the same on customer satisfaction as the human agents it replaced. By every number anyone usually points at, it was a success.
Then the CEO went on Bloomberg and explained why he was hiring people back. His own diagnosis was that cost had become "a too predominant evaluation factor," and the result was lower quality where it actually mattered. Read that again, because it's the whole thing. The problem wasn't the model. The problem was what they chose to measure. The capability was real. The intent was wrong. They optimized for cost, and they got exactly what they optimized for: cheaper service that customers liked less. The thing customers actually wanted, the reassurance that a real person was there when the situation got hard, was never what the system was built to protect. The system did its job. The job was just the wrong one.
Now flip it. Picture a service company, say a plumbing business, building an agent to handle dispatch. A customer calls with a burst pipe. Does someone go today, or can it wait until next week? Do you prioritize the big commercial account or the homeowner who's been calling you for fifteen years?
Good prompting tells the agent how to operate. Good context lets it see the schedule, the customer history, the urgency. But the decision it lands on depends entirely on what you told it to value. If the owner's real intent is "relationships are everything, we win by being the people customers trust and call back," then the agent stops optimizing for raw speed or margin and starts protecting the relationship. The loyal customer gets cared for the way you'd want to be cared for. The decision reflects what the business actually believes, not just what's fastest or cheapest.
This is really about evaluation
Here's what ties it together, and it's the part I keep coming back to. You cannot evaluate a system you haven't defined the intent for.
We talk a lot about taste, about whether a model has good judgment. But taste has to come from somewhere, and where it comes from is intent. How do you know a dispatch decision was good? You hold it up against what you said you cared about. Did it strengthen the relationship? Did it reflect our values in the moment we had to choose? That question is your rubric. Without it, you're guessing, and "it feels right" is quietly doing all the work.
Most people treat evaluation as a technical problem you solve after the build. It isn't. Evaluation is inseparable from intent. The unglamorous work of saying out loud what you value, what you'd trade off, and what you'd never compromise is the same work that tells you whether the thing you built is any good.
Klarna had the capability. What they were missing wasn't a better model. It was a clearer answer to a simpler question: what are we actually trying to do here?
That question is the whole job.
