Progressive autonomy: a maturity model for agent deployment
The safest way to deploy an agent is to grant it the least autonomy that lets it do its job, then widen that autonomy only as evidence of reliable behaviour accumulates. Progressive autonomy is to agentic governance what the three control layers are to the rest of the GovCompass-7: the operating discipline that turns a principle into a practice. This article sets out a maturity model for agent deployment along three dimensions (decision authority, process autonomy, and accountability scope) and the controls that should be in place at each level.
This is part of the Agentic AI element of the GovCompass-7.
Why autonomy should be earned, not granted
The most common agentic governance failure is over-reliance: an organisation grants an agent more autonomy than its demonstrated reliability justifies, usually because the agent performed well in a demo. Demo performance is the agentic equivalent of a pre-deployment bias test that is never repeated. It tells you the agent worked once, on a known input, in a controlled setting. It tells you nothing about how it behaves across the live distribution of inputs, over time, in combination with other agents.
Progressive autonomy treats autonomy as something an agent earns through evidence, not something it is granted by configuration. The agent starts with a narrow scope and low-consequence actions, operates under close observation, and is given wider scope only when the evidence supports it. This is the same logic as the GovCompass-7 maturity ladder, where an element moves from preventive-only to fully governed as more control layers come into operation. Applied to agents, it becomes a deployment discipline.
Three dimensions of autonomy
Autonomy is not one quantity. It is useful to track an agent along three dimensions, because an agent can be advanced on one and restricted on another.
Decision authority is how consequential the decisions the agent is permitted to make are. At the low end, the agent recommends and a human decides. At the high end, the agent decides and acts without review.
Process autonomy is how much of a multi-step process the agent runs without a human checkpoint. At the low end, the agent completes one step and hands back. At the high end, the agent runs an entire workflow, including invoking sub-agents, end to end.
Accountability scope is how far the consequences of the agent's actions reach. At the low end, actions are internal, reversible, and low-value. At the high end, actions affect customers, are hard to reverse, and carry legal or financial weight.
A well-governed deployment is explicit about where each agent sits on all three dimensions, and moves an agent along a dimension only when the controls for the next level are in place and the evidence justifies it.
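One way to make an agent's position explicit is to record it as data. The sketch below is illustrative only: the three-point `Rating` scale, the `AutonomyProfile` class, and the `widen` method are hypothetical names, and the article itself describes only the low and high ends of each dimension. It shows the rule stated above, that an agent moves one step along one dimension only when both controls and evidence are in place.

```python
from dataclasses import dataclass, replace
from enum import IntEnum


class Rating(IntEnum):
    """Hypothetical three-point scale (assumption, not from the article)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# The three dimensions named in the article, as field names.
DIMENSIONS = ("decision_authority", "process_autonomy", "accountability_scope")


@dataclass(frozen=True)
class AutonomyProfile:
    """Where one agent sits on each dimension (illustrative record)."""
    agent: str
    decision_authority: Rating
    process_autonomy: Rating
    accountability_scope: Rating

    def widen(self, dimension: str, *, controls_in_place: bool,
              evidence_ok: bool) -> "AutonomyProfile":
        """Return a new profile one step higher on a single dimension."""
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        # Both preconditions must hold; neither alone is enough.
        if not (controls_in_place and evidence_ok):
            raise PermissionError("controls and evidence must both be in place")
        current = getattr(self, dimension)
        if current is Rating.HIGH:
            raise ValueError(f"{dimension} is already at its highest rating")
        return replace(self, **{dimension: Rating(current + 1)})


profile = AutonomyProfile("invoice-triage", Rating.LOW, Rating.MEDIUM, Rating.LOW)
wider = profile.widen("process_autonomy", controls_in_place=True, evidence_ok=True)
print(wider.process_autonomy.name)  # HIGH
```

Keeping the profile immutable means every change produces a new record, which makes it easier to keep a history of when and why an agent's scope was widened. The other two dimensions stay where they were, so the agent can be advanced on one and restricted on another.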
A four-level model
Level one, assisted. The agent recommends; a human reviews and executes every action. Decision authority, process autonomy, and accountability scope are all low. Required controls: action logging, a clear human decision point, and a documented scope. This is where every high-consequence agent should start.
Level two, supervised. The agent executes low-consequence actions autonomously but escalates anything above a defined threshold to a human. Process autonomy rises; decision authority and accountability scope stay bounded. Required controls: escalation triggers, behavioural monitoring against an expected envelope, and the ability to halt the agent.
Level three, bounded autonomous. The agent runs full workflows within a defined boundary, including invoking sub-agents, and a human is on the loop rather than in it. Required controls: action-level logging across the chain, drift detection, circuit breakers, rollback capability where the domain allows, and a tested incident process. This is the level at which the EU AI Act's high-risk obligations bite hardest, because the system now materially influences decisions without per-action human review.
Level four, cross-agent managed. Multiple agents operate as a managed system, with a governance view across the whole, attention allocated to the weakest agent, and the agentic estate run as a living management system. This is the agentic analogue of the ISO/IEC 42001 management-system maturity that the GovCompass-7 framework points to.
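The level-two pattern, in which the agent acts on low-consequence actions and escalates anything above a defined threshold, can be sketched as a small routing function. This is a minimal illustration under stated assumptions: the `consequence` score on a 0 to 1 scale, the default threshold of 0.3, and the names `ProposedAction` and `route` are all hypothetical. How consequence is scored is a domain decision the article does not prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    name: str
    consequence: float  # hypothetical 0..1 score; scoring method is an assumption


def route(action: ProposedAction, threshold: float = 0.3,
          halted: bool = False) -> str:
    """Decide what happens to an action proposed by a supervised agent."""
    if halted:
        # The halt control overrides everything else.
        return "blocked"
    if action.consequence > threshold:
        # Above the defined threshold: hand the decision to a human.
        return "escalate"
    # At or below the threshold: the agent may act on its own.
    return "execute"


print(route(ProposedAction("tag_ticket", 0.05)))         # execute
print(route(ProposedAction("issue_refund", 0.80)))       # escalate
print(route(ProposedAction("tag_ticket", 0.05), halted=True))  # blocked
```

Checking the halt flag first reflects the idea that the ability to stop the agent should not depend on how any individual action is scored.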
How to use the model
For each agent in your inventory, place it on the three dimensions and the four levels, and confirm that the controls required for its level are designed, implemented, and evidenced. An agent operating at level three with only level-one controls is the agentic equivalent of an element governed by preventive controls alone: it looks capable in a demo and fails silently in production.
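The check described above can be expressed as a gap analysis between the controls a level requires and the controls an agent can evidence. The sketch below is illustrative: the control identifiers are hypothetical labels for the controls listed under levels one to three, and treating the requirements as cumulative (each level inherits the controls of the levels below it) is an assumption. Level four is left out because the article describes it as a management system rather than a list of controls.

```python
# Hypothetical identifiers for the controls the article lists per level.
REQUIRED_CONTROLS = {
    1: {"action_logging", "human_decision_point", "documented_scope"},
    2: {"escalation_triggers", "behavioural_monitoring", "halt_capability"},
    3: {
        "chain_action_logging",
        "drift_detection",
        "circuit_breakers",
        "rollback_capability",  # the article says "where the domain allows"
        "tested_incident_process",
    },
}


def missing_controls(level: int, evidenced: set[str]) -> set[str]:
    """Return the controls required at `level` that are not yet evidenced."""
    if level not in REQUIRED_CONTROLS:
        raise ValueError(f"no control list defined for level {level}")
    # Assumption: requirements accumulate from level one upward.
    required: set[str] = set()
    for lower in range(1, level + 1):
        required |= REQUIRED_CONTROLS[lower]
    return required - evidenced


# An agent running at level three with only level-one controls evidenced.
gaps = missing_controls(3, {"action_logging", "human_decision_point",
                            "documented_scope"})
print(sorted(gaps))
```

A non-empty result for an agent already in production is the warning sign the paragraph above describes: the agent is operating above the level its controls support.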
The discipline is to make autonomy a decision that is documented, controlled, and revisited, not a default that creeps upward as the team grows comfortable with the agent. Comfort is not evidence. The agent earns its next level by demonstrating reliable behaviour under monitoring, and the organisation grants it deliberately, with the controls for that level already in place.
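To make the promotion decision documented and revisited rather than implicit, it can be captured as a record with a named approver and a review date. The sketch below is one possible shape, not a prescribed method: the `Evidence` fields, the default thresholds (1,000 monitored actions, 30 days, a 0.1% incident rate, a 90-day review interval), and all names are hypothetical and would need to be set per agent and per domain.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Evidence:
    monitored_actions: int  # actions observed under monitoring
    incidents: int          # actions that breached the expected envelope
    days_observed: int


@dataclass(frozen=True)
class PromotionDecision:
    agent: str
    from_level: int
    to_level: int
    granted: bool
    reason: str
    approver: str
    review_by: date  # the decision is revisited, whether granted or not


def decide_promotion(agent: str, level: int, evidence: Evidence,
                     controls_ready: bool, approver: str, today: date,
                     min_actions: int = 1000, min_days: int = 30,
                     max_incident_rate: float = 0.001,
                     review_days: int = 90) -> PromotionDecision:
    """Record a deliberate decision on moving an agent up one level."""
    if not 1 <= level < 4:
        raise ValueError("level must be 1, 2, or 3 to be promoted")
    if not controls_ready:
        granted, reason = False, "controls for the next level are not in place"
    elif (evidence.monitored_actions < min_actions
          or evidence.days_observed < min_days):
        granted, reason = False, "insufficient monitored evidence"
    elif evidence.incidents > max_incident_rate * evidence.monitored_actions:
        granted, reason = False, "incident rate above tolerance"
    else:
        granted, reason = True, "evidence and controls support the next level"
    return PromotionDecision(
        agent=agent,
        from_level=level,
        to_level=level + 1 if granted else level,
        granted=granted,
        reason=reason,
        approver=approver,
        review_by=today + timedelta(days=review_days),
    )


decision = decide_promotion(
    "invoice-triage", 2, Evidence(2000, 1, 45),
    controls_ready=True, approver="risk-owner", today=date(2025, 1, 1),
)
print(decision.granted, decision.to_level)  # True 3
```

The function has no input for how comfortable the team feels with the agent. Only measured evidence and the readiness of controls can change the outcome, and a refusal is recorded with the same detail as an approval.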