AI Operations
How to Build an AI Worker With a Real Job Description
The exercise that turns a vague agent idea into a deployable worker.
Published 2026-02-18 · By Claire Miller
Most AI agent failures are not model failures. They are job-description failures. The agent was given an instruction like "handle customer follow-ups" and a set of tools and told to do its best. The agent did its best. The best was not what was needed.
A good AI worker is not just a model with instructions. It is what you get when an operator treats the agent like a real hire and gives it a real job description.
The exercise that fixes this
The most useful pre-deployment exercise for any AI worker is to write its job description, end to end, in plain English, before writing any prompt or wiring any tool. The document does not have to be long. It has to be specific. The exercise is essentially the same as writing a job description for a human hire, and the reason it works is the same: writing it forces you to commit to what the worker is responsible for, what they are not responsible for, and how their performance will be measured.
A good job description for an AI worker has these sections:
Mandate. One sentence: this worker owns X outcomes for Y users. If you cannot write that sentence, the worker should not exist yet.
Inputs. What does the worker receive? List the systems, the channels, the file types, the formats.
Outputs. What does the worker produce? List every artifact, with its format specification and validation rules.
Tools. What is the worker allowed to use? Be specific: "read the inbox; write to the CRM; never post to Twitter." Permissions are easier to reason about when named.
Cadence. When does the worker run? On demand, once an hour, once a day? What triggers a run?
Escalation rule. When the worker hits something it cannot handle, what does it do? "Stop and ask a human" is the default; named people or escalation channels are better.
Kill criteria. When is the worker considered to have failed permanently? "Missed more than 5% of due dates over a month" is reasonable. "The customer was unhappy once" is not.
Definition of done. What does success look like, measurably? "Three new leads entered into the pipeline per week, each with email, phone, and source attached."
That is the document. Most AI deployments skip it. The ones that do not skip it tend to have higher hit rates because the agent's behavior is constrained by a specification that is testable, not by an instruction that is hopeful.
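One way to keep the job description testable is to hold it as structured data alongside the prose. The following is a minimal sketch, not a prescribed implementation: the `JobDescription` class, its field names, and the example tool names are all hypothetical, chosen only to mirror the eight sections above.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class JobDescription:
    """Hypothetical container mirroring the eight sections of the document."""

    mandate: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]
    tools: tuple[str, ...]  # explicit allow-list of permitted actions
    cadence: str
    escalation_rule: str
    kill_criteria: str
    definition_of_done: str

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are empty or whitespace-only."""
        missing = []
        for f in fields(self):
            value = getattr(self, f.name)
            empty = not value.strip() if isinstance(value, str) else len(value) == 0
            if empty:
                missing.append(f.name)
        return missing

    def is_deployable(self) -> bool:
        """A worker with any blank section is treated as not ready."""
        return not self.missing_sections()

    def allows(self, action: str) -> bool:
        """Anything not named in the tools section is denied by default."""
        return action in self.tools


# Illustrative draft with the cadence section left blank on purpose.
draft = JobDescription(
    mandate="This worker owns follow-up emails for trial users.",
    inputs=("support inbox",),
    outputs=("drafted reply",),
    tools=("read_inbox", "write_crm"),
    cadence="",
    escalation_rule="Stop and ask the support lead.",
    kill_criteria="Missed more than 5% of due dates over a month.",
    definition_of_done="Every trial user gets a reply within one business day.",
)

print(draft.missing_sections())
print(draft.allows("post_to_twitter"))
```

The deny-by-default check in `allows` is a design choice in this sketch: it makes the tools section the only place where a permission can be granted.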
What to do with the document
Once the job description exists, three things happen:
1. The agent's system prompt derives from the job description. The system prompt is not a separate document; it is the job description translated into instruction form, plus the operational constraints (tools, escalation, kill criteria) the model needs at runtime.
2. Test cases derive from the definition of done. If the definition of done says three new leads per week, each with email, phone, and source, then a test set might be 25 sample inputs, each of which must yield a CRM entry with all three fields. The agent is graded against these.
3. Failure modes derive from the kill criteria and the escalation rule. When the agent misbehaves, the response is already specified. There is no ambiguity about "what to do when the agent fails." That is by design.
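The first two derivations can be sketched in a few lines. This is an illustration under assumptions, not the article's tooling: the function names, the dictionary keys, and the prompt layout are hypothetical, and the required fields are taken from the lead example in the definition of done.

```python
from __future__ import annotations

# Required fields taken from the example definition of done.
REQUIRED_FIELDS = ("email", "phone", "source")


def render_system_prompt(spec: dict[str, str]) -> str:
    """Translate job-description sections into instruction form, one per line."""
    order = ("mandate", "tools", "escalation", "kill_criteria")
    return "\n".join(
        f"{name.replace('_', ' ').title()}: {spec[name]}" for name in order
    )


def missing_fields(entry: dict[str, str]) -> list[str]:
    """Return the required fields that are absent or blank in a CRM entry."""
    return [f for f in REQUIRED_FIELDS if not str(entry.get(f) or "").strip()]


def grade(entries: list[dict[str, str]]) -> float:
    """Fraction of CRM entries that carry every required field."""
    if not entries:
        return 0.0
    passed = sum(1 for entry in entries if not missing_fields(entry))
    return passed / len(entries)


# Hypothetical outputs produced by the agent for two test inputs.
sample_entries = [
    {"email": "a@example.com", "phone": "555-0100", "source": "chat"},
    {"email": "b@example.com", "phone": "", "source": "email"},
]

print(grade(sample_entries))
print(missing_fields(sample_entries[1]))
```

Because the prompt and the grader both read from the same specification, a change to the job description shows up in both places instead of drifting apart.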
What this looks like in practice
For a small business that wants to deploy a customer-intake worker in 2026, the job description reads roughly:
- Mandate: this worker captures new customer leads from inbound email and chat and enters them into the CRM.
- Inputs: a Gmail inbox, a web-chat widget, a contact-form endpoint.
- Outputs: a CRM entry per lead with name, company, email, phone, source, and a short summary of the inquiry.
- Tools: read inbox, read chat transcript, write CRM record. No other system writes.
- Cadence: trigger on new inbound, no more than one run per inbound event.
- Escalation: if the inquiry is from a high-priority account or contains language suggesting an escalation is needed, the worker pauses and pages the on-call human.
- Kill criteria: more than 5% of leads entered with missing required fields over a month, sustained for two consecutive months.
- Definition of done: 95% of inbound leads are in the CRM with all required fields within 5 minutes.
Those eight lines are the entire specification for a competent intake worker. Everything else (the prompt, the orchestration, the tests) is downstream of getting those eight lines right.
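The kill criteria and definition of done in this example are both simple ratios, so they can be checked mechanically. The sketch below assumes a hypothetical monthly roll-up (`MonthStats`) and reads "two consecutive months" as the two most recent months in the history; both are assumptions, not part of the specification above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MonthStats:
    """Hypothetical monthly roll-up for the intake worker."""

    inbound_leads: int       # leads received across email, chat, and form
    complete_on_time: int    # in the CRM with all required fields within 5 minutes
    entered: int             # leads entered into the CRM at all
    entered_incomplete: int  # entered leads missing a required field


def meets_definition_of_done(month: MonthStats, target: float = 0.95) -> bool:
    """True when at least `target` of inbound leads were complete and on time."""
    if month.inbound_leads == 0:
        return True  # nothing was due this month
    return month.complete_on_time / month.inbound_leads >= target


def breaches_kill_threshold(month: MonthStats, limit: float = 0.05) -> bool:
    """True when more than `limit` of entered leads had missing fields."""
    if month.entered == 0:
        return False
    return month.entered_incomplete / month.entered > limit


def should_kill(history: list[MonthStats]) -> bool:
    """True when the two most recent months both breach the threshold."""
    if len(history) < 2:
        return False
    return all(breaches_kill_threshold(month) for month in history[-2:])


# Illustrative numbers only: one good month followed by two bad ones.
history = [
    MonthStats(inbound_leads=100, complete_on_time=96, entered=100, entered_incomplete=4),
    MonthStats(inbound_leads=100, complete_on_time=90, entered=100, entered_incomplete=8),
    MonthStats(inbound_leads=100, complete_on_time=91, entered=100, entered_incomplete=7),
]

print(meets_definition_of_done(history[0]))
print(should_kill(history))
```

Keeping the two checks separate reflects the specification: a month can miss the definition of done without coming anywhere near the kill criteria.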
The mistake to avoid
The mistake is to write the job description after building the worker. If you build first, the job description ends up describing what you built, which is not the same as what you needed. Write the job description first. Let the worker fall out of it.
- What is the main point of How to Build an AI Worker With a Real Job Description?
The article explains how to build an AI worker with a real job description from Novacore Systems' operator perspective, focusing on practical implementation, risk controls, and business value rather than hype.
- Who is this AI operations article for?
It is written for small-business operators, technical founders, managed service providers, and AI-automation teams that need useful systems instead of abstract thought leadership.
- How does this connect to Novacore Systems?
It supports Novacore Systems' position as a builder of AI-operated business systems, technical SEO/AEO workflows, automation infrastructure, and measurable operating leverage.
- Can this article be used as an AI-search source?
Yes. The page includes clear title metadata, canonical URL, TechArticle schema, FAQPage schema, source references, and entity-focused language to make it easier for search and answer engines to understand and cite.
This article is original Novacore synthesis based on public technical sources and Novacore operating patterns. Existing articles are research inputs, not copy inventory.