Documenting an agent is not governing it

By GovCompass.ai · Last verified June 2026 · Agentic governance is moving fast; this maps the documentation-versus-governance gap onto the EU AI Act and the GovCompass-7.

Most organizations can describe their AI agents in detail: the architecture, the tools, the memory, the base model, the benchmarks, the internal safety testing. What far fewer can do is govern what those agents decide once they are running. Documentation answers the question "what is this agent and what can it do?" Governance answers a harder one: "can we trust the decisions it makes over time?" The two are routinely confused, and an agent that is thoroughly documented but ungoverned is exactly the kind of system that passes every review and then fails in production.

This is part of the Agentic AI element of the GovCompass-7.

The comfortable half of the problem

There is a great deal of information available about what AI agents are. Vendors describe their planning and reasoning capabilities, their memory, their tool integrations, and their base models. Teams produce model cards, benchmark results, and reports from internal safety testing and red teaming. Deployment is documented: the interfaces, the APIs, the hosting, the access controls.

This is the comfortable half of the problem, and organizations do it well because the information is abundant and the questions are static. What is the agent built on? What can it do? How is it deployed? These are answerable at a point in time, and once answered they tend to stay answered until the next version. Documentation of this kind is necessary. The EU AI Act requires much of it for high-risk systems: technical documentation under Article 11, record-keeping under Article 12. An organization that cannot describe its agents has a more basic problem than governance.

But describing an agent is not the same as governing it, and the gap between the two is where most agentic risk lives.

The harder half

Governance answers a different question, and it is not a point-in-time question. It asks whether the decisions the agent makes can be trusted as they accumulate, change, and compound over the agent's operating life. This is the half that documentation does not reach, and it has three parts.

Decision governance. An agent does not just have capabilities; it exercises decision authority. The governance questions are about that authority, not the capability. How much decision authority does this agent hold, and who delegated it? When does its goal need to be revalidated, because the goal it was given six months ago may no longer be the goal the organization wants it pursuing? Are the policies it operates under expressed in a form the agent actually follows at runtime, rather than written in a document no running system reads? Documentation describes what the agent can decide. Decision governance constrains what it is allowed to decide and keeps that constraint current.
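One way to picture a policy "expressed in a form the agent actually follows at runtime" is a check that runs before the agent acts. The following is a minimal sketch, not a prescribed implementation: the `Authority` levels, the `Mandate` fields, the 180-day revalidation interval, and the `invoice-agent` example are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import IntEnum


class Authority(IntEnum):
    """Hypothetical authority levels, ordered from least to most consequential."""
    RECOMMEND = 1          # agent proposes, a human decides
    ACT_WITH_REVIEW = 2    # agent acts, a human reviews afterwards
    ACT_AUTONOMOUSLY = 3   # agent decides and acts without review


@dataclass(frozen=True)
class Mandate:
    """What the agent is allowed to decide, and who delegated it."""
    agent_id: str
    delegated_by: str            # named person who granted the authority
    max_authority: Authority
    goal: str
    goal_validated_on: date
    revalidate_every: timedelta  # assumed review interval

    def goal_is_current(self, today: date) -> bool:
        return today <= self.goal_validated_on + self.revalidate_every


def authorize(mandate: Mandate, requested: Authority, today: date) -> tuple[bool, str]:
    """Runtime check called before the agent acts; returns (allowed, reason)."""
    if not mandate.goal_is_current(today):
        return False, "goal overdue for revalidation"
    if requested > mandate.max_authority:
        return False, "requested authority exceeds delegation"
    return True, "within mandate"


# Illustrative mandate; every value here is made up for the example.
mandate = Mandate(
    agent_id="invoice-agent",
    delegated_by="head-of-finance",
    max_authority=Authority.ACT_WITH_REVIEW,
    goal="approve supplier invoices under the agreed threshold",
    goal_validated_on=date(2026, 1, 10),
    revalidate_every=timedelta(days=180),
)
```

The design point is that the mandate is data the running system reads, so an expired goal or an over-reaching action is refused at the moment of decision rather than discovered in a later review.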

Runtime governance. A model card describes the agent as it was at the moment of testing. But an agent in production drifts: its inputs change, its memory accumulates, its behavior shifts. Governance has to operate at runtime, not only at design time. Dynamic risk management adjusts as the agent's behavior and context change. Runtime assurance and drift detection surface the moment the agent starts behaving outside its expected envelope. State and memory governance keeps the agent's accumulated context from quietly corrupting its future decisions. None of this appears in documentation, because documentation is static and the risk is dynamic.
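Drift detection can be sketched as a comparison between what the agent does now and the envelope observed when it was tested. The example below is one simple approach under stated assumptions, not a standard method: `DriftMonitor` is a hypothetical class, the metric is any single number tracked per decision (for instance the share of cases the agent escalates), and the rolling window and the three-standard-deviation tolerance are illustrative choices.

```python
from collections import deque
from statistics import mean, stdev


class DriftMonitor:
    """Flags when a behavioural metric leaves the envelope seen at testing time."""

    def __init__(self, baseline: list[float], window: int = 20, tolerance: float = 3.0):
        if len(baseline) < 2:
            raise ValueError("baseline needs at least two observations")
        self.center = mean(baseline)    # expected value from testing
        self.spread = stdev(baseline)   # sample standard deviation from testing
        self.tolerance = tolerance      # allowed deviation, in units of spread
        self.recent = deque(maxlen=window)

    def observe(self, value: float) -> bool:
        """Record one runtime observation; return True if the agent has drifted."""
        self.recent.append(value)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough runtime evidence yet
        deviation = abs(mean(self.recent) - self.center)
        return deviation > self.tolerance * self.spread


# Baseline taken at testing time (made-up numbers for the example).
monitor = DriftMonitor(baseline=[0.10, 0.12, 0.11, 0.09, 0.10, 0.11], window=5)
```

A monitor like this only surfaces the drift. What happens next, such as narrowing the agent's authority or pausing it, belongs to the dynamic risk management the paragraph describes.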

Accountability and trust. Documentation can show that an agent was tested. It cannot, on its own, establish that a specific decision the agent made in production can be explained, that an identifiable person is answerable for it, that the organization's reliance on the agent is calibrated to its actual reliability, and that the agent is governed across its whole lifecycle from design to retirement. These are the trust questions, and they are answered by governance operating continuously, not by a document produced once.
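Two of these trust questions, whether a specific decision can be explained and whether a named person answers for it, can be checked per decision if each decision leaves a record. The sketch below assumes a hypothetical `DecisionRecord` shape and hypothetical helper names. It is an illustration of the idea, not a format required by the EU AI Act or defined by GovCompass.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One production decision, captured so it can be explained later."""
    decision_id: str
    agent_id: str
    agent_version: str
    answerable_owner: str   # a named person, never the agent itself
    inputs_summary: str
    action_taken: str
    rationale: str          # the reason given for this specific decision
    decided_at: str         # ISO 8601 timestamp, UTC


def accountability_gaps(record: DecisionRecord) -> list[str]:
    """Return the accountability questions this record cannot answer."""
    gaps = []
    if not record.answerable_owner.strip():
        gaps.append("no named owner")
    elif record.answerable_owner == record.agent_id:
        gaps.append("owner is the agent itself")
    if not record.rationale.strip():
        gaps.append("decision cannot be explained")
    return gaps


def to_audit_line(record: DecisionRecord) -> str:
    """Serialize the record as one JSON line for an append-only audit trail."""
    return json.dumps(asdict(record), sort_keys=True)


# Illustrative record with the owner left blank; all values are made up.
record = DecisionRecord(
    decision_id="d-0001",
    agent_id="invoice-agent",
    agent_version="1.4.2",
    answerable_owner="",
    inputs_summary="invoice from a known supplier, amount under threshold",
    action_taken="approved",
    rationale="matched purchase order and amount within delegated limit",
    decided_at=datetime(2026, 6, 1, 9, 30, tzinfo=timezone.utc).isoformat(),
)
print(accountability_gaps(record))  # prints ['no named owner']
```

Calibrated reliance and lifecycle governance are not covered by this sketch; they need evidence gathered across many decisions rather than a check on one.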

The governance shift

The move from documenting models to governing autonomous decisions is a shift along several axes at once, and naming them makes the gap concrete.

Traditional AI governance governs models; agent governance governs the autonomous decisions those models drive. Traditional governance relies on static controls set at deployment; agent governance needs dynamic controls that adjust at runtime. Traditional governance produces point-in-time assurance, a snapshot that was true when the assessment was done; agent governance needs continuous assurance, because the thing being assured keeps changing. Traditional governance assumes human execution, a person acting on the model's output; agent governance has to account for autonomous execution, where the agent acts without that human step. Traditional governance produces compliance documentation; agent governance has to produce regulatory evidence, which is documentation that demonstrates the controls actually operated, not merely that they were designed. And traditional governance provides model oversight; agent governance requires decision oversight, supervision of the choices the agent makes rather than the model that makes them.

Each of these is a move from something static and describable to something dynamic and governed. An organization that has done the traditional half of each pair and believes it has done the agent half has the exact gap this article is about.

Why the confusion is dangerous

The danger is that documentation creates the appearance of governance. An agent with a thorough model card, clean benchmarks, and a documented deployment looks governed. It has passed the reviews that ask "what is this agent and what can it do?" But those reviews do not ask whether the decisions it makes next month, on inputs it has not yet seen, after its memory has accumulated, can be trusted. The agent that fails in production is frequently the one that was best documented, because the documentation gave everyone confidence to widen its autonomy without building the runtime governance that wider autonomy requires.

This is the same pattern that appears across responsible AI: a control designed once and never operated, a pillar that looks compliant on paper and fails silently in production. With agents the pattern is sharper, because the gap between what the documentation describes and what the agent does grows every day the agent runs.

What to do about it

The practical response is to treat documentation and governance as two separate deliverables and to require both. Documentation answers what the agent is; require it, because the EU AI Act does and because you cannot govern what you cannot describe. But do not let a complete set of documentation stand in for governance.

For each agent, in addition to the documentation, establish the three governance capabilities that documentation does not provide. Decision governance: who holds the agent's decision authority, when its goals are revalidated, and how its policies are enforced at runtime. Runtime governance: dynamic risk management, drift detection, and memory governance that operate while the agent runs. Accountability: decision-level explainability, a named answerable owner, calibrated reliance, and lifecycle governance from design to retirement.

These map onto the GovCompass pillars that autonomy stresses most: accountability, transparency and explainability, and the integrating agentic element that binds them. The test of whether you have governed an agent, rather than merely documented it, is simple. Documentation lets you answer what the agent is and can do. Governance lets you answer whether you can trust the decisions it is making right now, and whether you would know if you could not.

Legal references: Art. 11, Art. 12

More on Accountability

Art. 10 EU AI Act: data and data governance for high-risk AI

Reference

Art. 10 requires that the training, validation, and testing data for high-risk AI systems meets quality criteria: relevant, sufficiently representative, and as free of errors and complete as possible for the intended purpose. It also requires documented data governance practices covering collection, preparation, bias examination, and gap mitigation, and it permits the limited processing of special-category data where strictly necessary to detect and correct bias, under safeguards.

Art. 12 EU AI Act: record-keeping and logging for high-risk AI

Reference

Art. 12 requires high-risk AI systems to technically allow for the automatic recording of events (logs) over their lifetime. The logging must enable traceability of the system's functioning at a level appropriate to its intended purpose, support post-market monitoring, and help identify situations that may lead to risk or substantial modification. It is a design obligation on the provider that makes the system auditable by construction.

Art. 19 EU AI Act: keeping the automatically generated logs

Reference

Art. 19 requires providers of high-risk AI systems to keep the logs that the system automatically generates (under Art. 12) for as long as they control them, for a period appropriate to the intended purpose and at least six months unless other law requires longer. It is the retention counterpart to the Art. 12 logging capability, and it works alongside the deployer retention duty in Art. 26.6.

Art. 26.1 EU AI Act: following provider instructions as a deployer

Reference

Art. 26.1 requires deployers to use high-risk AI systems strictly in accordance with the provider's instructions for use. This means using the system only for its intended purpose, within its specified technical configuration, and by qualified users, and documenting that compliance. Deviating from the instructions can shift liability entirely to the deployer.
