
How to Build an AI Worker With a Real Job Description

The exercise that turns a vague agent idea into a deployable worker.

Published 2026-02-18 · By Claire Miller

Most AI agent failures are not model failures. They are job-description failures. The agent was given an instruction like "handle customer follow-ups" and a set of tools and told to do its best. The agent did its best. The best was not what was needed.

A good AI worker is not a model with instructions. It is a model with a real job description, written by an operator who treats the agent like a real hire.

The exercise that fixes this

The most useful pre-deployment exercise for any AI worker is to write its job description, end to end, in plain English, before writing any prompt or wiring any tool. The document does not have to be long. It has to be specific. The exercise is essentially the same as writing a job description for a human hire, and the reason it works is the same: writing it forces you to commit to what the worker is responsible for, what they are not responsible for, and how their performance will be measured.

A good job description for an AI worker has these sections:

Mandate. One sentence: this worker owns X outcomes for Y users. If you cannot write that sentence, the worker should not exist yet.

Inputs. What does the worker receive? List the systems, the channels, the file types, the formats.

Outputs. What does the worker produce? List every artifact, along with its format specification and validation rules.

Tools. What is the worker allowed to use? Be specific: "read the inbox; write to the CRM; never post to Twitter." Permissions are easier to reason about when named.

Cadence. When does the worker run? On demand, once an hour, once a day? What triggers a run?

Escalation rule. When the worker hits something it cannot handle, what does it do? "Stop and ask a human" is the default; named people or escalation channels are better.

Kill criteria. When is the worker considered to have failed permanently? "Missed more than 5% of due dates over a month" is reasonable. "The customer was unhappy once" is not.

Definition of done. What does success look like, measurably? "Three new leads entered into the pipeline per week, each with email, phone, and source attached."

That is the document. Most AI deployments skip it. The ones that do not skip it tend to have higher hit rates because the agent's behavior is constrained by a specification that is testable, not by an instruction that is hopeful.
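One way to keep the document honest is to hold it as structured data and refuse to proceed while any section is blank. The sketch below is illustrative only: the `JobDescription` class, its field names, and the `missing_sections` helper are assumptions made for this example, not part of any library, and the sample values simply reuse the phrases quoted above.

```python
from dataclasses import dataclass, fields


@dataclass
class JobDescription:
    """The eight sections listed above (hypothetical schema for illustration)."""
    mandate: str
    inputs: list[str]
    outputs: list[str]
    tools: list[str]
    cadence: str
    escalation_rule: str
    kill_criteria: list[str]
    definition_of_done: str


def missing_sections(jd: JobDescription) -> list[str]:
    """Return the names of sections that are empty or whitespace-only."""
    missing = []
    for f in fields(jd):
        value = getattr(jd, f.name)
        if isinstance(value, str):
            empty = not value.strip()
        else:
            # A list section counts as empty if it has no non-blank items.
            empty = not any(item.strip() for item in value)
        if empty:
            missing.append(f.name)
    return missing


# Example draft with one section still undecided (values are illustrative).
draft = JobDescription(
    mandate="Owns lead-capture outcomes for the sales team.",
    inputs=["shared inbox email"],
    outputs=["CRM lead entry with email, phone, and source"],
    tools=["read the inbox", "write to the CRM", "never post to Twitter"],
    cadence="",  # not decided yet
    escalation_rule="Stop and ask a human.",
    kill_criteria=["Missed more than 5% of due dates over a month"],
    definition_of_done=(
        "Three new leads entered into the pipeline per week, "
        "each with email, phone, and source attached."
    ),
)

print(missing_sections(draft))  # ['cadence']
```

A non-empty result is a signal that the worker should not be built yet, in the same spirit as the mandate rule: if a section cannot be written, the design is not finished.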

What to do with the document

Once the job description exists, three things happen:

1. The agent's system prompt derives from the job description. The system prompt is not a separate document; it is the job description translated into instruction form, plus the operational constraints (tools, escalation, kill criteria) the model needs at runtime.

2. Test cases derive from the definition of done. If the definition of done says three new leads per week, each with email, phone, and source, the test suite might be 25 sample inputs, each checked to confirm it yields a CRM entry with all three fields. The agent is graded against these.

3. Failure modes derive from the kill criteria and the escalation rule. When the agent misbehaves, the response is already specified. There is no ambiguity about "what to do when the agent fails." That is by design.
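The three derivations above can be sketched in a few small functions. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`build_system_prompt`, `grade`, `should_kill`), the dictionary layout of the job description, and the shape of a CRM entry are all hypothetical. Only the required fields and the 5% threshold come from the examples in this article.

```python
REQUIRED_FIELDS = ("email", "phone", "source")  # from the definition of done


def build_system_prompt(job: dict[str, str]) -> str:
    """Translate job-description sections into instruction form."""
    order = ("mandate", "tools", "escalation_rule", "kill_criteria")
    lines = [f"{name.replace('_', ' ').upper()}: {job[name]}" for name in order]
    return "\n".join(lines)


def grade(entries: list[dict[str, str]]) -> float:
    """Fraction of CRM entries that carry every required field, non-empty."""
    if not entries:
        return 0.0
    passed = sum(
        all(entry.get(field, "").strip() for field in REQUIRED_FIELDS)
        for entry in entries
    )
    return passed / len(entries)


def should_kill(missed: int, total: int, limit: float = 0.05) -> bool:
    """Apply a 'missed more than 5% of due dates over a month' criterion."""
    if total == 0:
        return False
    return missed / total > limit


# Illustrative data; a real job description would carry all eight sections.
job = {
    "mandate": "Owns lead-capture outcomes for the sales team.",
    "tools": "read the inbox; write to the CRM; never post to Twitter",
    "escalation_rule": "Stop and ask a human.",
    "kill_criteria": "Missed more than 5% of due dates over a month.",
}
prompt = build_system_prompt(job)

entries = [
    {"email": "a@example.com", "phone": "555-0100", "source": "web form"},
    {"email": "b@example.com", "phone": "", "source": "referral"},
]
print(grade(entries))      # 0.5, the second entry has an empty phone field
print(should_kill(3, 40))  # True, 7.5% missed
print(should_kill(2, 40))  # False, exactly 5% is not "more than 5%"
```

The point of the sketch is the direction of dependency: the prompt, the grader, and the kill check all read from the job description, so changing the specification changes all three in one place.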

What this looks like in practice

For a small business that wants to deploy a customer-intake worker in 2026, the job description reads roughly:

That paragraph block is the entire specification for a competent intake worker. Everything else, the prompt, the orchestration, the tests, is downstream of getting those ten paragraphs right.

The mistake to avoid

The mistake is to write the job description after building the worker. If you build first, the job description ends up describing what you built, which is not the same as what you needed. Write the job description first. Let the worker fall out of it.


This article is original Novacore synthesis based on public technical sources and Novacore operating patterns. Existing articles are research inputs, not copy inventory.