When teams start building agent systems, they spend most of their time on model configuration: choosing the right model, tuning the temperature, writing the system prompt, adding tools. This is necessary work. It is also the wrong place to spend most of your time.
The teams that build reliable agent systems spend most of their time on knowledge engineering — specifically, on the structured reference documents that give agents the domain expertise to do their jobs well. These documents are what we call skill documents, and they are the most underestimated component in any serious agent system.
The common mistake
The common mistake is trying to put all the agent's knowledge into the system prompt. This works in demos and breaks in production. System prompts have practical length limits. More importantly, they mix three different categories of information that need to be separated for the system to be maintainable: the agent's role and operating constraints, the domain knowledge the agent needs, and the task-specific context for the current session.
When all three are in the system prompt, updating the domain knowledge means rewriting the system prompt. Testing whether a knowledge change improved agent quality means re-running the full system. Reusing the agent in a different context means adapting the entire prompt instead of swapping a single document.
What a skill document actually is
A skill document is a structured reference document that contains the domain knowledge an agent needs to do a specific job well. It is not a prompt. It is reference material — the equivalent of a technical handbook that a specialist consults when doing their work.
A skill document for a security review agent contains: the categorised taxonomy of vulnerability classes the agent should check for, the severity rating framework, the output format for findings, and the exceptions that require immediate human escalation. The agent does not need to reason about what a SQL injection vulnerability is — the skill document defines it precisely. The agent applies the definition to the code it is reviewing.
A skill document for an architecture planning agent contains: the organisation's standard patterns for different application types, the approved technology choices for different constraint profiles, the anti-patterns that have caused problems in past projects, and the checklist that must be completed before an architecture document is considered complete.
The skill document is curated by humans with domain expertise. It encodes accumulated knowledge in a form that agents can apply reliably and consistently.
The three layers
System prompt (generic container). Defines the agent's role, its operating constraints, and its output format requirements. Changes infrequently. The same system prompt can be used across many different task contexts by swapping the skill document and session context.
Skill document (domain knowledge). Contains the specialised reference knowledge the agent needs for a specific domain or task type. Changes when the domain knowledge changes — when new vulnerability classes are discovered, when new architecture patterns are adopted, when hard-won lessons from production need to be encoded. Maintained like code: versioned, reviewed, and tested.
Session context (task-specific input). Contains the specific information for the current task: the code to review, the requirements to analyse, the document to process. Changes with every task invocation. Generated by the pipeline, not maintained by humans.
Separating these three layers enables something that the monolithic system prompt approach cannot: you can update the agent's knowledge without changing its operating constraints, test knowledge changes in isolation, and reuse the same agent architecture across different domains by swapping skill documents.
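To make the separation concrete, here is a minimal sketch of how the three layers might be assembled at invocation time, assuming a generic chat-style API. The file paths and the build_messages helper are illustrative, not part of any particular framework.

from pathlib import Path

def build_messages(system_prompt_path: str, skill_doc_path: str, session_context: str) -> list[dict]:
    """Assemble the three layers into one model invocation.

    Layer 1: system prompt (generic container), changes infrequently.
    Layer 2: skill document (domain knowledge), versioned reference material.
    Layer 3: session context (task-specific input), generated per task.
    """
    system_prompt = Path(system_prompt_path).read_text()
    skill_doc = Path(skill_doc_path).read_text()
    return [
        {"role": "system", "content": system_prompt},
        # The skill document travels as reference material alongside the
        # task, not as extra instructions baked into the system prompt.
        {"role": "user",
         "content": f"SKILL DOCUMENT:\n{skill_doc}\n\nTASK INPUT:\n{session_context}"},
    ]

# Hypothetical paths; swapping the skill document retargets the agent
# without touching the system prompt or the pipeline code.
messages = build_messages(
    "prompts/reviewer_system.txt",
    "skills/code-review-sentinel.yaml",
    Path("src/app/api/users/route.ts").read_text(),
)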
How skill documents enable agent reuse
A code review agent with the right skill document can review Python, TypeScript, or Go — not because the model has specific language expertise baked in, but because the skill document contains the language-specific patterns and anti-patterns the agent needs to apply. Swapping the skill document changes the agent's domain expertise without changing anything about the agent's architecture.
This composability is why skill documents are the building blocks of a scalable agent system. A library of well-maintained skill documents is a library of domain expertise that can be applied flexibly across different tasks and contexts. This is qualitatively different from a library of system prompts, each of which is monolithic and hard to reuse.
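A hedged sketch of what that reuse looks like in practice, assuming skill documents live in a directory keyed by language; the layout and the load_skill helper are this sketch's inventions, not an established convention.

from pathlib import Path

SKILL_LIBRARY = Path("skills")  # hypothetical directory of versioned skill documents

def load_skill(language: str) -> str:
    """Select the language-specific skill document; fail loudly if none exists."""
    path = SKILL_LIBRARY / f"code-review-{language}.yaml"
    if not path.exists():
        raise FileNotFoundError(f"no skill document for {language}: the agent has no expertise to apply")
    return path.read_text()

# One agent architecture, three domains of expertise:
for language in ("python", "typescript", "go"):
    skill_doc = load_skill(language)
    # ...pass skill_doc into build_messages() as in the previous sketch...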
Versioning skill documents like code
Skill documents need to be version-controlled for the same reason code does: when something goes wrong, you need to know what changed. When an agent's performance degrades, the first question is whether the skill document changed. When a new vulnerability class becomes relevant, you need to update the security review skill document and track that update across all deployments of that agent.
The versioning discipline: every skill document has a version number in its header. Every agent invocation logs which version of the skill document it used. When you investigate an agent output, you can retrieve the exact skill document the agent was working from at that point. This is the audit trail that makes agent systems maintainable over time.
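A minimal sketch of that discipline, assuming the VERSION field sits in the document header as in the example in the next section. The log schema here is illustrative.

import json
import re
import time
import uuid

def log_invocation(agent_name: str, skill_doc: str, log_path: str = "agent_audit.jsonl") -> str:
    """Record which skill document version this invocation used."""
    match = re.search(r"^VERSION:\s*(\S+)", skill_doc, re.MULTILINE)
    version = match.group(1) if match else "UNVERSIONED"  # worth flagging in itself
    entry = {
        "invocation_id": str(uuid.uuid4()),
        "agent": agent_name,
        "skill_version": version,
        "timestamp": time.time(),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry["invocation_id"]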
Example structure: a code review agent skill document
SKILL: code-review-sentinel
VERSION: 2.4.1
DOMAIN: Security and quality review for TypeScript/Next.js applications

REVIEW_TAXONOMY:
  critical:
    - Authentication bypass vectors
    - SQL/NoSQL injection paths
    - Exposed secret patterns (regex: sk_|pk_|Bearer\s)
    - Insecure direct object references
  high:
    - Missing input validation at API boundaries
    - Unhandled error states that expose stack traces
    - Session token handling that deviates from OWASP guidance
  medium:
    - Missing rate limiting on public endpoints
    - Overly permissive CORS configuration
    - Dependency versions with known CVEs

OUTPUT_FORMAT:
  - finding_id: sequential integer
  - severity: critical | high | medium | low
  - location: file_path:line_number
  - description: plain-language explanation (max 2 sentences)
  - recommendation: specific remediation step

ESCALATION_TRIGGERS:
  - Any critical finding requires immediate human review before the pipeline continues
  - Three or more high findings in a single module require architect review

SCOPE_EXCLUSIONS:
  - Test files (*.test.ts, *.spec.ts) — reviewed separately
  - Generated files (*.generated.ts) — flag only critical findings
This is not a prompt. It is a structured reference that tells the agent precisely what to look for, how to rate what it finds, how to format its output, and when to stop and escalate. The agent applies this structure to the code it is reviewing. The result is consistent, auditable, and predictable.
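The escalation triggers are mechanical enough that the pipeline can enforce them in code rather than trusting the model to stop itself. A minimal sketch, assuming the agent's findings have been parsed into dictionaries matching OUTPUT_FORMAT; the function name and the module heuristic are assumptions of this sketch, not a published API.

from collections import Counter

def check_escalations(findings: list[dict]) -> list[str]:
    """Enforce the skill document's ESCALATION_TRIGGERS in pipeline code."""
    escalations = []
    # Trigger 1: any critical finding stops the pipeline for human review.
    if any(f["severity"] == "critical" for f in findings):
        escalations.append("critical finding present: immediate human review")
    # Trigger 2: three or more high findings in one module need an architect.
    # "Module" is approximated here as the top-level directory of location,
    # which is formatted as file_path:line_number per OUTPUT_FORMAT.
    highs_per_module = Counter(
        f["location"].split("/")[0] for f in findings if f["severity"] == "high"
    )
    for module, count in highs_per_module.items():
        if count >= 3:
            escalations.append(f"{count} high findings in {module}: architect review")
    return escalations

Keeping the triggers in pipeline code means an updated skill document changes what the agent looks for, while the stop conditions stay deterministic.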
Skill documents as the real IP
In an AI-native development shop, models are a commodity. Any competent team can access the same models. The competitive advantage is the accumulated knowledge encoded in skill documents — the lessons from hundreds of code reviews, the architecture patterns proven across dozens of projects, the failure modes catalogued from production incidents.
This knowledge takes time to build. It is built by engineers who know what to look for, who capture what they find in structured form, who maintain and update the documents as the domain evolves. The model reads the document. The engineer writes it. The quality of the agent is bounded by the quality of the knowledge the engineer provides.
At SocioFi, skill documents are maintained in a versioned repository, reviewed by senior engineers when updated, and tested against a library of known-good and known-bad examples before deployment. They are the thing we would most closely guard if we were worried about someone replicating our capability — not because they contain secrets, but because they represent years of accumulated, structured engineering knowledge that cannot be generated or purchased.
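The simplest form of that pre-deployment test looks something like the following sketch; the fixture layout and the run_review_agent stand-in are hypothetical, not our actual harness.

from pathlib import Path

def run_review_agent(skill_doc: str, source: str) -> list[dict]:
    """Stand-in for the real agent invocation (see the earlier sketches)."""
    raise NotImplementedError

def skill_passes_regression(skill_doc: str) -> bool:
    """Gate a skill document update on known-good and known-bad fixtures."""
    for sample in Path("fixtures/known_bad").glob("*.ts"):
        # Samples with seeded vulnerabilities: the agent must report something.
        if not run_review_agent(skill_doc, sample.read_text()):
            print(f"REGRESSION: {sample.name} produced no findings")
            return False
    for sample in Path("fixtures/known_good").glob("*.ts"):
        # Clean samples: critical or high findings here are false positives.
        findings = run_review_agent(skill_doc, sample.read_text())
        if any(f["severity"] in ("critical", "high") for f in findings):
            print(f"FALSE POSITIVE: {sample.name} was flagged")
            return False
    return True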