A few weeks ago, we used an AI agent inside Umbraco to write a blog post. The process worked — the article was researched, drafted, structured and published with minimal human input. It was also eye-wateringly expensive. By the time we hit save, we had burnt through $15 in AI tokens on a single piece of content.
That number gave us pause. Not because it broke the bank, but because it pointed to something important: integrating AI into a CMS is not just a technical challenge, it is an economic one. And the difference between an AI workflow that is genuinely useful and one that is quietly costly often comes down to a handful of decisions made before a single prompt is written.
Why AI tokens cost more than you think
To understand what happened, it helps to understand what AI tokens actually are. Text that flows through a large language model, both in and out, is split into tokens: short chunks that are often a whole word but can also be part of a word, a punctuation mark or a line break. You are charged for the tokens you send to the model as well as the tokens it returns. In a straightforward chat interface, those numbers are manageable. In an agentic workflow inside a CMS, they can grow very quickly.
When an AI agent operates inside Umbraco, it does not just receive a single prompt. It receives your instruction, the full context of the content item it is working on, the schema of the content type, the existing block structure, any system prompts configured by the CMS, and potentially the content of related pages it has been asked to reference. That context can run to thousands of tokens before the agent has written a single word of the actual article. Each tool call (fetching content, checking schemas, retrieving images, saving drafts) is a further round trip to the model, and each round trip typically sends the accumulated context again along with the latest tool result. Multiply the context by several of those round trips and $15 starts to look less surprising.
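To make the arithmetic concrete, here is a rough sketch of how cost compounds across round trips. It assumes the whole conversation is re-sent on every call with no prompt caching, and the prices are placeholders rather than any provider's real rates. The names `Turn`, `turn_cost` and `simulate_session` are invented for this illustration.

```python
from dataclasses import dataclass

# Hypothetical prices in US dollars per million tokens; substitute real rates.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


@dataclass
class Turn:
    """One model round trip: tokens sent in and tokens returned."""
    input_tokens: int
    output_tokens: int


def turn_cost(turn: Turn) -> float:
    """Cost of a single round trip in dollars."""
    return (
        turn.input_tokens * INPUT_PRICE_PER_MTOK
        + turn.output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000


def simulate_session(
    base_context: int, tool_result_tokens: list[int], reply_tokens: int
) -> list[Turn]:
    """Model a session in which each tool result joins the conversation and
    the whole conversation is sent again on the next round trip."""
    turns = []
    context = base_context
    for result_size in tool_result_tokens:
        turns.append(Turn(input_tokens=context, output_tokens=reply_tokens))
        # The model's reply and the tool result both join the running context.
        context += reply_tokens + result_size
    return turns


if __name__ == "__main__":
    # Illustrative figures: 8,000 tokens of starting context, ten tool calls
    # that each return 4,000 tokens, and a 200-token reply per round trip.
    session = simulate_session(8_000, [4_000] * 10, 200)
    print(f"${sum(turn_cost(t) for t in session):.3f}")
```

With these made-up figures the tenth round trip sends 45,800 input tokens, nearly six times the starting context, and the session as a whole costs about $0.84 at the placeholder prices. The point is the shape of the curve: input tokens, not generated text, dominate the bill.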
Where the tokens went
When we broke down the $15, the culprits were not where we expected. The actual writing — generating the article text — was a relatively small proportion of the total. The bulk of the cost came from three other areas:
- Redundant schema lookups. The agent was fetching content type schemas it had already retrieved earlier in the same session. Each fetch sent and received a large JSON payload.
- Over-eager content fetching. The agent pulled in related articles, sibling pages and media items that were not relevant to the task, because it had not been told not to.
- Unguided tool calls. Without clear instructions about when to use which tools, the agent made several exploratory calls — checking page context, listing children, searching for content — that added up to a significant token overhead before any writing had happened.
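One way to spot the first of these patterns is to scan a log of tool calls for exact repeats. The sketch below assumes you can export such a log as a list of dictionaries with `tool` and `arguments` keys. That shape, and the tool names used in the example, are hypothetical rather than anything Umbraco provides.

```python
import json
from collections import Counter


def call_key(tool_name: str, arguments: dict) -> str:
    """Stable fingerprint: tool name plus canonically ordered arguments."""
    return f"{tool_name}:{json.dumps(arguments, sort_keys=True)}"


def find_repeated_calls(call_log: list[dict]) -> dict[str, int]:
    """Return fingerprints of calls made more than once, with their counts."""
    counts = Counter(call_key(c["tool"], c["arguments"]) for c in call_log)
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical log entries for illustration only.
    log = [
        {"tool": "get_content_type_schema", "arguments": {"alias": "blogPost"}},
        {"tool": "list_children", "arguments": {"parent_id": "1234"}},
        {"tool": "get_content_type_schema", "arguments": {"alias": "blogPost"}},
    ]
    for key, count in find_repeated_calls(log).items():
        print(f"{count}x {key}")
```

Any call that appears here more than once within a single session is a candidate for a "use what you already have" rule.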
None of this was the AI behaving badly. It was the AI doing exactly what it was built to do: being thorough. The problem was that thorough, without direction, is expensive.
How system prompts changed everything
The fix came not from limiting what the AI could do, but from being more precise about how it should do it. System prompts — instructions that sit alongside the AI agent and shape its behaviour before any user request arrives — turned out to be the most effective lever we had.
We made a handful of targeted changes:
- We told the agent not to pre-fetch schemas already provided in the entity context. The Umbraco backoffice already passes a rich property context to the agent. Telling it to use that context first, rather than fetching fresh data, eliminated a significant proportion of redundant tool calls.
- We instructed the agent to confirm the correct entity before making any tool calls. A simple check at the start of each session — does the open workspace match the entity the user is asking about — prevented the agent from fetching content for the wrong page.
- We prohibited speculative child and sibling fetching. Unless the task explicitly required related content, the agent was told not to pull it in. This alone cut a meaningful slice of unnecessary token spend.
- We parallelised independent tool calls. Rather than chaining calls sequentially, the agent was instructed to batch independent operations into a single block. Fewer round trips, lower overhead.
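The four changes above could be expressed as prompt rules along the following lines. The wording is illustrative only and is not the prompt text we actually used. `build_system_prompt` is a made-up helper, and how the resulting string is registered with the agent depends on your setup.

```python
# Illustrative wording only; adapt the rules to your own content model.
TOKEN_SAVING_RULES = [
    "Use the schema and property data in the entity context before fetching "
    "anything; do not re-fetch a schema you already have.",
    "Before any tool call, confirm that the open workspace matches the entity "
    "the user is asking about.",
    "Do not fetch child, sibling or related content unless the task "
    "explicitly requires it.",
    "When several tool calls are independent of each other, issue them "
    "together in one batch rather than one at a time.",
]


def build_system_prompt(role: str, rules: list[str]) -> str:
    """Assemble a system prompt from a role line and a numbered rule list."""
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))
    return f"{role}\n\nFollow these rules on every request:\n{numbered}"


SYSTEM_PROMPT = build_system_prompt(
    "You are a content editing assistant working inside a CMS backoffice.",
    TOKEN_SAVING_RULES,
)

if __name__ == "__main__":
    print(SYSTEM_PROMPT)
```

Keeping the rules in a list makes it easy to add or remove one and then re-measure, so each rule can justify its own place in the prompt.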
The results
The impact was immediate and measurable. The same blog article workflow — same complexity, same content type, same block structure — ran at a fraction of the previous token cost. The writing quality did not drop. The agent was not less capable; it was just better directed.
What this demonstrated, quite clearly, is that the quality of an AI integration in a CMS is not determined solely by the model. It is determined by the scaffolding around it. The system prompt is not a footnote; it is one of the most important pieces of engineering in the whole setup.
Good system prompts do several things at once. They reduce wasted compute. They make agent behaviour more predictable. They lower the risk of an agent making unexpected changes. And they make the overall experience faster, because fewer redundant operations mean quicker responses for the person doing the editing.
What this means for Umbraco teams using AI
If you are exploring AI-assisted content workflows in Umbraco — or already running them — there are a few practical conclusions to draw from our experience.
Token cost is a design consideration, not an afterthought. Before you build an AI workflow into your CMS, think about what context the agent will receive, how many tool calls the task requires, and where you can reduce redundancy without reducing capability.
System prompts are worth investing in properly. A well-crafted system prompt is not just a set of instructions; it is a set of constraints that makes the agent more useful, not less. The goal is not to restrict what the AI can do — it is to focus where it spends its effort.
Measure before you optimise. We only knew where the tokens were going because we looked. Most model APIs report input and output token counts with each response, and many provider dashboards break usage down further. Use that data. The patterns are usually revealing.
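A minimal sketch of that kind of breakdown follows, assuming you record one entry per model call with a category label you assign yourself. The record shape and the category names are assumptions, and the sample numbers are invented, not our measurements.

```python
from collections import defaultdict


def usage_by_category(records: list[dict]) -> list[tuple[str, int, float]]:
    """Sum input and output tokens per category and return
    (category, tokens, share_of_total) rows, largest first."""
    totals: dict[str, int] = defaultdict(int)
    for record in records:
        totals[record["category"]] += (
            record["input_tokens"] + record["output_tokens"]
        )
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    rows = [(cat, tokens, tokens / grand_total) for cat, tokens in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = [
        {"category": "schema_lookup", "input_tokens": 6000, "output_tokens": 100},
        {"category": "writing", "input_tokens": 2000, "output_tokens": 1500},
        {"category": "schema_lookup", "input_tokens": 6200, "output_tokens": 100},
        {"category": "content_fetch", "input_tokens": 4000, "output_tokens": 100},
    ]
    for category, tokens, share in usage_by_category(sample):
        print(f"{category:<15} {tokens:>7} {share:.1%}")
```

Sorting by total tokens puts the most expensive category first, which is usually where the first optimisation should go.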
The model choice matters, but not as much as the workflow design. Switching to a cheaper model will reduce costs, but it can also reduce quality. Optimising the workflow design keeps quality intact while bringing costs down. Both levers are worth pulling, but workflow comes first.
An honest reflection on AI in the CMS
We are genuinely enthusiastic about AI-assisted content workflows. The productivity gains are real, the quality ceiling is high, and the tools are improving quickly. But the $15 blog post was a useful reminder that enthusiasm is not a strategy.
Integrating AI into Umbraco well requires the same rigour as any other piece of CMS engineering: clear requirements, thoughtful configuration, and a willingness to measure what is actually happening rather than assume it is going well. The good news is that getting it right is not particularly difficult. It mostly requires paying attention to the right things at the right time.
If you are working on AI integration in your Umbraco build and want to talk through the approach, we are happy to share what we have learnt. The system prompts that made the biggest difference took an afternoon to refine. The savings they generated were immediate.
