Documenting an agent is not governing it
Most organisations can describe their AI agents in detail: the architecture, the tools, the memory, the base model, the benchmarks, the internal safety testing. What far fewer can do is govern what those agents decide once they are running. Documentation answers the question "what is this agent and what can it do?" Governance answers a harder one: "can we trust the decisions it makes over time?" The two are routinely confused, and an agent that is thoroughly documented but ungoverned is exactly the kind of system that passes every review and then fails in production.
This is part of the Agentic AI element of the GovCompass-7.
The comfortable half of the problem
There is a great deal of information available about what AI agents are. Vendors describe their planning and reasoning capabilities, their memory, their tool integrations, and their base models. Teams produce model cards, benchmark results, and reports from internal safety testing and red teaming. Deployment is documented: the interfaces, the APIs, the hosting, the access controls.
This is the comfortable half of the problem, and organisations do it well because the information is abundant and the questions are static. What is the agent built on? What can it do? How is it deployed? These are answerable at a point in time, and once answered they tend to stay answered until the next version. Documentation of this kind is necessary. For high-risk systems the EU AI Act requires much of it: technical documentation under Article 11, record-keeping under Article 12. An organisation that cannot describe its agents has a more basic problem than governance.
But describing an agent is not the same as governing it, and the gap between the two is where most agentic risk lives.
The harder half
Governance answers a different question, and it is not a point-in-time question. It asks whether the decisions the agent makes can be trusted as they accumulate, change, and compound over the agent's operating life. This is the half that documentation does not reach, and it has three parts.
Decision governance. An agent does not just have capabilities; it exercises decision authority. The governance questions are about that authority, not the capability. How much decision authority does this agent hold, and who delegated it? When does its goal need to be revalidated, because the goal it was given six months ago may no longer be the goal the organisation wants it pursuing? Are the policies it operates under expressed in a form the agent actually follows at runtime, rather than written in a document no running system reads? Documentation describes what the agent can decide. Decision governance constrains what it is allowed to decide and keeps that constraint current.
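One way to make these questions concrete is to express the delegation as data that a running system checks before the agent acts. The following is a minimal sketch, not a prescribed implementation: the `Authority` levels, the `Delegation` fields, and the `authorise` function are all hypothetical names introduced for illustration, and the revalidation interval is an assumed policy choice.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import IntEnum


class Authority(IntEnum):
    """Hypothetical ladder of decision authority, lowest to highest."""
    RECOMMEND = 1         # agent proposes, a human decides
    ACT_WITH_REVIEW = 2   # agent acts, a human reviews afterwards
    ACT_AUTONOMOUSLY = 3  # agent decides and acts without review


@dataclass(frozen=True)
class Delegation:
    agent_id: str
    delegated_by: str            # the named person who granted the authority
    ceiling: Authority           # highest authority the agent may exercise
    goal: str
    goal_validated_on: date      # when the goal was last confirmed as current
    revalidate_every: timedelta  # how long a goal validation stays valid


def authorise(delegation: Delegation, requested: Authority, today: date) -> tuple[bool, str]:
    """Return (allowed, reason) for a decision the agent wants to make."""
    # A stale goal blocks every decision, however small.
    if today - delegation.goal_validated_on > delegation.revalidate_every:
        return False, "goal validation has expired; revalidate before acting"
    # The agent may never exceed the authority it was delegated.
    if requested > delegation.ceiling:
        return False, f"requested {requested.name} exceeds ceiling {delegation.ceiling.name}"
    return True, f"within authority delegated by {delegation.delegated_by}"
```

The point of the sketch is that the policy lives where the agent runs: the answer to "who delegated this, and is the goal still current?" is computed on every decision rather than recorded once in a document.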
Runtime governance. A model card describes the agent as it was at the moment of testing. But an agent in production drifts: its inputs change, its memory accumulates, its behaviour shifts. Governance has to operate at runtime, not only at design time. Dynamic risk management adjusts as the agent's behaviour and context change. Runtime assurance and drift detection surface the moment the agent starts behaving outside its expected envelope. State and memory governance keeps the agent's accumulated context from quietly corrupting its future decisions. None of this appears in documentation, because documentation is static and the risk is dynamic.
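Drift detection can start very simply. The sketch below compares the mix of actions an agent took during a baseline period with the mix in a recent window, using total variation distance. It is one illustrative technique among many, and it assumes the agent's actions are already logged as category labels; the function names and the 0.2 threshold are hypothetical and would need tuning for any real agent.

```python
from collections import Counter


def action_distribution(actions: list[str]) -> dict[str, float]:
    """Share of each action type in a list of logged agent actions."""
    counts = Counter(actions)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}


def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    """Total variation distance: 0 means identical, 1 means no overlap."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def has_drifted(baseline: list[str], recent: list[str], threshold: float = 0.2) -> bool:
    """Flag drift when recent behaviour moves past the threshold from the baseline."""
    distance = total_variation(action_distribution(baseline), action_distribution(recent))
    return distance > threshold


# Behaviour observed at testing time versus behaviour in a recent window.
baseline = ["approve"] * 80 + ["escalate"] * 20
recent = ["approve"] * 50 + ["escalate"] * 10 + ["refund"] * 40

if has_drifted(baseline, recent):
    print("agent is outside its expected envelope; trigger review")
```

A check like this says nothing about why behaviour changed. Its value is that the question is asked continuously, which a model card cannot do.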
Accountability and trust. Documentation can show that an agent was tested. It cannot, on its own, establish that a specific decision the agent made in production can be explained, that an identifiable person is answerable for it, that the organisation's reliance on the agent is calibrated to its actual reliability, and that the agent is governed across its whole lifecycle from design to retirement. These are the trust questions, and they are answered by governance operating continuously, not by a document produced once.
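Decision-level accountability implies a record per decision, not per release. As a rough sketch under stated assumptions, each decision could be logged with its rationale and a named owner, and the records chained by hash so that later edits are detectable. The `DecisionRecord` fields and the `append` and `verify` helpers are hypothetical, and a real deployment would also need secure storage and an external anchor for the most recent record.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DecisionRecord:
    agent_id: str
    decision: str
    rationale: str          # explanation captured at decision time
    accountable_owner: str  # named person answerable for the agent
    timestamp: str          # ISO 8601 string supplied by the caller
    prev_hash: str          # digest of the previous record, "" for the first

    def digest(self) -> str:
        """SHA-256 over the record's fields in a stable key order."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append(log: list[DecisionRecord], **fields: str) -> DecisionRecord:
    """Add a record that points at the digest of the one before it."""
    prev = log[-1].digest() if log else ""
    record = DecisionRecord(prev_hash=prev, **fields)
    log.append(record)
    return record


def verify(log: list[DecisionRecord]) -> bool:
    """True if every record still points at the digest of its predecessor.

    Editing any record except the last breaks the chain; the last record
    has no successor, so it needs an external anchor to be protected.
    """
    expected = ""
    for record in log:
        if record.prev_hash != expected:
            return False
        expected = record.digest()
    return True
```

A log of this shape is closer to evidence that a control operated than to a description of how it was designed, because each entry ties one production decision to an explanation and a person.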
The governance shift
The move from documenting models to governing autonomous decisions is a shift along several axes at once, and naming them makes the gap concrete.
Traditional AI governance governs models; agent governance governs the autonomous decisions those models drive. Traditional governance relies on static controls set at deployment; agent governance needs dynamic controls that adjust at runtime. Traditional governance produces point-in-time assurance, a snapshot that was true when the assessment was done; agent governance needs continuous assurance, because the thing being assured keeps changing. Traditional governance assumes human execution, a person acting on the model's output; agent governance has to account for autonomous execution, where the agent acts without that human step. Traditional governance produces compliance documentation; agent governance has to produce regulatory evidence, which is documentation that demonstrates the controls actually operated, not merely that they were designed. And traditional governance provides model oversight; agent governance requires decision oversight, supervision of the choices the agent makes rather than the model that makes them.
Each of these is a move from something static and describable to something dynamic and governed. An organisation that has done the traditional half of each pair and believes it has done the agent half has the exact gap this article is about.
Why the confusion is dangerous
The danger is that documentation creates the appearance of governance. An agent with a thorough model card, clean benchmarks, and a documented deployment looks governed. It has passed the reviews that ask "what is this agent and what can it do?" But those reviews do not ask whether the decisions it makes next month, on inputs it has not yet seen, after its memory has accumulated, can be trusted. The agent that fails in production is frequently the one that was best documented, because the documentation gave everyone confidence to widen its autonomy without building the runtime governance that wider autonomy requires.
This is the same pattern that appears across responsible AI: a control designed once and never operated, a pillar that looks compliant on paper and fails silently in production. With agents the pattern is sharper, because the gap between what the documentation describes and what the agent does grows every day the agent runs.
What to do about it
The practical response is to treat documentation and governance as two separate deliverables and to require both. Documentation answers what the agent is; require it, because the EU AI Act requires it for high-risk systems and because you cannot govern what you cannot describe. But do not let a complete set of documentation stand in for governance.
For each agent, in addition to the documentation, establish the three governance capabilities that documentation does not provide. Decision governance: who holds the agent's decision authority, when its goals are revalidated, and how its policies are enforced at runtime. Runtime governance: dynamic risk management, drift detection, and memory governance that operate while the agent runs. Accountability: decision-level explainability, a named answerable owner, calibrated reliance, and lifecycle governance from design to retirement.
These map onto the GovCompass pillars that autonomy stresses most: accountability, transparency and explainability, and the integrating agentic element that binds them. The test of whether you have governed an agent, rather than merely documented it, is simple. Documentation lets you answer what the agent is and can do. Governance lets you answer whether you can trust the decisions it is making right now, and whether you would know if you could not.