A refund agent at a fintech approves a $4,000 chargeback at 2 a.m. Nobody saw it before the money left. By morning, three people have to answer for it: the CIO who deployed the agent, the Risk officer who signed the control framework, and the Legal lead who now has to explain the call to a regulator. Not one of them can point to a rule that said this action was allowed to run on its own.
That missing rule is the actual governance problem with AI agents. The frameworks you have probably already read, the Gartner six-step, the IBM playbook, the vendor security checklists, all tell you to "maintain oversight." None of them tell you where the decision right should live for a specific thing an agent can do. So the hardest question in AI agent governance is not "which model?" Models are interchangeable, and they get better every month. The durable question is the one nobody hands you an answer to: where does the decision right live when an agent can execute?
AI agent governance is the practice of deciding who holds the right to authorize each action an autonomous agent can take. Every action should sit in one of three decision-rights modes: Autonomous (the agent acts alone), Approve-before (a human signs off before execution), or Review-after (the agent acts, a human reviews the trail afterward). The right mode is set by the action's reversibility, blast radius, and regulatory exposure - not by the model.
What follows is a decision tree to place any agent action into one of those three modes, the criteria to defend each placement when an auditor asks, and a production example for all three. I wrote it for the people who carry the accountability when it goes wrong: CIO, Risk, and Legal.
Why "which model?" is the wrong governance question
Model selection feels like the big decision because it is the loud one. Procurement argues about it, vendors pitch it, and the benchmarks turn over every quarter. But a model is a component, and you can swap a component. What you cannot swap cheaply is the set of rules deciding which actions an agent may take with no human in the path. That ruleset is your control surface. It outlives any single model you bolt underneath it.
One distinction is worth making once, because the rest of this piece leans on it. An agent that only recommends is a different governance object than an agent that can execute. A recommendation engine drafts a reply, scores a lead, flags an anomaly. None of that creates an obligation, because a human still acts on it. The decision-rights question only switches on when the agent takes the consequential step itself: sends the email, moves the money, changes the record, files the document, touches the infrastructure.
Here is the part the existing checklists miss. Most governance advice was written to protect you from agents you did not know existed. Shadow AI. Employees pasting data into consumer tools. Unsanctioned plug-ins. That is a real problem and worth controlling, but it is a different problem. It tells you nothing about how much autonomy to grant the agents you deployed on purpose, for a business reason, with a budget line behind them. For those, "have oversight" is not an answer. Oversight at what point? Before the action, or after it? Owned by whom? That is the gap this framework closes.
The three decision-rights modes, defined
Strip away the vendor vocabulary and a human decision can sit in exactly three places relative to an agent's action. Before it. After it. Or nowhere in the immediate path. Name them plainly and the whole governance conversation gets shorter.
Autonomous - the agent acts alone
In Autonomous mode, the agent takes the action with no human in the execution path. It is safe when the action is reversible, the blast radius is small, and you can see whether the agent did the right thing. Reach for it where being wrong once costs almost nothing and asking a human every single time would burn the value to the ground. Autonomous is not "unwatched." It is watched in aggregate, not per action.
Approve-before - the agent proposes, a human signs off
In Approve-before mode, the agent prepares the action and a human authorizes it before anything runs. This is the pattern most people picture when they say "human in the loop." The human is a gate, and the action has to pass through it. It is the right default for anything irreversible, high-blast-radius, or legally loaded. The cost is latency and human attention, which is the exact reason you save it for actions that earn the friction.
Review-after - the agent acts, a human reviews the trail
In Review-after mode, the agent executes immediately and a human reviews afterward, on a sample or on every action, with a way to roll back and a way to escalate. This is what the literature calls "human on the loop," or supervisory control. It is the mode most governance frameworks skip outright, and it happens to be where a lot of the real agentic value lives. You get the speed of autonomy with an accountability record stapled to it.
The terminology matters because most published guidance leaves it muddy. So, plainly: Approve-before maps to human-in-the-loop, and Review-after maps to human-on-the-loop. If you have been treating "human in the loop" as a single switch, human present or human absent, that is the first thing to fix. There are three positions here, not two.
The decision tree: how to place any agent action
This is the centerpiece. When a new agent use case lands on your desk, you do not have to relitigate your whole AI strategy. You walk one action through five questions, in order, and the mode drops out the bottom.
- Reversibility - can the action be undone cheaply? If undoing it is expensive, slow, or flat-out impossible (a wire transfer, a public statement, a deleted record with no backup), the action is a candidate for Approve-before or human-only. Reversible actions stay in the running for Review-after or Autonomous.
- Blast radius - how many systems, customers, or dollars does one action touch? A single action that can reach thousands of customers or six figures of spend belongs in Approve-before, even when each instance feels tiny. Aggregate exposure is still exposure.
- Regulatory and legal exposure - does the action create a compliance or contractual obligation? If yes, Approve-before, with Legal or Risk explicitly in the path. Anything that makes a representation to a regulator, a customer, or a court is not the place to discover your agent got creative.
- Confidence and observability - can you measure whether the agent did the right thing, and roll back if it did not? High observability plus reversibility is what makes Review-after and Autonomous defensible. If you cannot tell after the fact whether the action was correct, you cannot use the modes that rely on after-the-fact review.
- Frequency and latency cost - would human approval kill the value? High-volume, low-stakes actions, where a human gate would build an impossible queue, point toward Autonomous or Review-after. This question is the counterweight to the first three. It stops you from gating everything into a bottleneck.
Walk those in order and the logic is plain. Irreversible, or high-blast-radius, or regulated pushes the action up to Approve-before. Reversible, plus observable, plus high-frequency pulls it down to Review-after or Autonomous. The middle cases are where judgment lives, and that is fine. The point is that the placement is now a documented, defensible decision instead of a vibe somebody had in a standup.
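The five questions above can be sketched as a small placement function. This is a minimal illustration, not a policy engine: the `ActionProfile` fields and the `place` function are hypothetical names, each answer is flattened to a yes/no, and the rule that splits Autonomous from Review-after (high frequency goes to Autonomous, everything else reversible and observable goes to Review-after) is an assumption that stands in for the judgment the middle cases need.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"
    APPROVE_BEFORE = "approve-before"
    REVIEW_AFTER = "review-after"


@dataclass(frozen=True)
class ActionProfile:
    """Answers to the five questions for one action class (illustrative fields)."""
    reversible: bool          # 1. can it be undone cheaply?
    high_blast_radius: bool   # 2. does one action reach many customers, systems, or dollars?
    regulated: bool           # 3. does it create a compliance or contractual obligation?
    observable: bool          # 4. can you measure correctness and roll back?
    high_frequency: bool      # 5. would a human gate build an impossible queue?


def place(action: ActionProfile) -> Mode:
    # Questions 1-3: any one of these pushes the action up to Approve-before.
    if not action.reversible or action.high_blast_radius or action.regulated:
        return Mode.APPROVE_BEFORE
    # Question 4: without observability, the after-the-fact modes are not defensible.
    if not action.observable:
        return Mode.APPROVE_BEFORE
    # Question 5 (assumed split): high-volume work is the candidate for no per-action human.
    if action.high_frequency:
        return Mode.AUTONOMOUS
    return Mode.REVIEW_AFTER


ticket_tagging = ActionProfile(True, False, False, True, True)
wire_transfer = ActionProfile(False, True, True, True, False)
```

Calling `place(ticket_tagging)` returns `Mode.AUTONOMOUS` and `place(wire_transfer)` returns `Mode.APPROVE_BEFORE`. The value of writing the rule down, even this crudely, is that the inputs to each placement become a record someone can audit.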
The framework matrix below is the one structured asset this model needs. It maps each mode to its action profile, who holds the decision right, what the audit trail has to capture, and the failure it is built to prevent.
| Mode | Typical action profile | Who holds the decision right | Audit trail must capture | Failure it prevents |
|---|---|---|---|---|
| Autonomous | Reversible, low blast radius, high frequency, high observability | The agent, within hard pre-set scopes | Every action logged, scope config, rate limits, kill-switch events | Bottlenecking high-volume low-stakes work into a human queue |
| Approve-before | Irreversible, high blast radius, or regulated | A named human approver (often Risk or Legal for high-stakes classes) | The proposed action, the approver identity, the decision, the timestamp | An agent committing an unrecoverable or non-compliant action alone |
| Review-after | Reversible, observable, time-sensitive, moderate stakes | The agent acts, a named reviewer owns the after-review | The action, the review record, the sampling rate, rollback events, escalations | Speed without accountability, and silent error accumulation |
Autonomous in production: what it looks like and where it is safe
Autonomous works when the worst single outcome is cheap and visible. A support-ops agent that triages and tags inbound tickets, routing each to the right queue, is a clean fit. A mistag reverses in seconds and shows up in queue metrics anyway. Same story for an agent that enriches CRM records from public data, or one that drafts, but does not send, routine internal summaries. The thread running through all of them: a wrong action is small, undoable, and observable.
The guardrails stay non-negotiable even here. Autonomous does not mean unbounded. It means:
- Hard scopes - the agent acts only inside an explicit allowlist of action types and targets.
- Rate limits - a ceiling on actions per minute or hour, so a runaway loop cannot scale the damage.
- A kill switch - one control that halts the agent instantly, owned by a named person.
- Full logging - every action recorded, because Autonomous runs on aggregate review, and you cannot review what you never logged.
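A rough sketch of how those four guardrails might wrap a single execution path. Everything here is hypothetical: the `GuardedAgent` class, its method names, and the one-minute sliding window are illustrative choices, and a real deployment would persist the log and the kill-switch state outside the agent process.

```python
import time
from collections import deque


class GuardedAgent:
    """Wraps action execution with scope, rate limit, kill switch, and logging."""

    def __init__(self, allowed_actions, max_per_minute, clock=time.monotonic):
        self.allowed_actions = frozenset(allowed_actions)  # hard scope (allowlist)
        self.max_per_minute = max_per_minute               # rate limit
        self.killed = False                                # kill-switch state
        self.log = []                                      # full action log
        self._recent = deque()                             # timestamps of executed actions
        self._clock = clock

    def kill(self, owner):
        """Halt the agent; the named owner is recorded with the event."""
        self.killed = True
        self.log.append({"event": "kill", "owner": owner, "at": self._clock()})

    def execute(self, action_type, target, run):
        now = self._clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self._recent and now - self._recent[0] >= 60:
            self._recent.popleft()
        if self.killed:
            outcome = "blocked:killed"
        elif action_type not in self.allowed_actions:
            outcome = "blocked:out_of_scope"
        elif len(self._recent) >= self.max_per_minute:
            outcome = "blocked:rate_limited"
        else:
            self._recent.append(now)
            run(target)
            outcome = "executed"
        # Log refusals as well as executions, so aggregate review sees both.
        self.log.append({"event": outcome, "action": action_type, "target": target, "at": now})
        return outcome
```

Logging the blocked attempts is a deliberate choice in this sketch: a burst of out-of-scope or rate-limited events is often the earliest visible sign that an agent is drifting.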
The trap to watch is aggregation. An action that looks low-stakes per instance can quietly become high blast radius at volume. The lesson that keeps surfacing in field reports from organizations running dozens of agents is blunt: autonomy granted action by action, without anyone watching the aggregate, is how scope drifts. Each grant looked reasonable on its own. The sum did not. Set the aggregate exposure budget, not just the per-action rule.
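One way to make an aggregate exposure budget concrete is a running total per action class per day, checked before each action runs. The sketch below assumes exposure can be expressed as a money amount; the `ExposureBudget` class, the `"reprice"` class name, and the daily granularity are all illustrative.

```python
from collections import defaultdict
from decimal import Decimal


class ExposureBudget:
    """Tracks cumulative exposure per action class against a daily ceiling."""

    def __init__(self, daily_limits):
        self.daily_limits = daily_limits      # e.g. {"reprice": Decimal("5000")}
        self.spent = defaultdict(Decimal)     # (action_class, day) -> running total

    def try_spend(self, action_class, day, amount):
        """Reserve `amount` if the class stays within budget; otherwise refuse."""
        key = (action_class, day)
        if self.spent[key] + amount > self.daily_limits[action_class]:
            return False
        self.spent[key] += amount
        return True


budget = ExposureBudget({"reprice": Decimal("100")})
first = budget.try_spend("reprice", "2024-01-01", Decimal("60"))
second = budget.try_spend("reprice", "2024-01-01", Decimal("60"))
```

Here `first` is `True` and `second` is `False`: each action is small enough on its own, but the second would push the class past its ceiling for the day. A refusal is the natural point to route the action to a human instead of dropping it.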
Approve-before in production: the high-stakes default
When the action is irreversible or legally loaded, Approve-before is the default, and you should be comfortable saying that out loud in the room. An agent that sends an external customer email, kicks off a refund above a threshold, modifies production infrastructure, or files anything with financial or legal weight should propose, not execute. The human approval is the control that turns a would-be incident into a reviewed decision.
The objection never changes: approval queues kill the ROI of automation. They do, if you build them carelessly. They do not have to. The patterns that keep Approve-before fast:
- Threshold-based approval - auto-execute below a risk line, require sign-off above it. A refund agent might clear anything under $200 on its own and route the larger amounts for approval.
- Batched approvals - group similar pending actions so a human approves a reviewed set instead of one pop-up at a time.
- Explicit reviewer SLAs - if a human sits in the path, that human owes a response time, and the queue has an owner. An approval step with no SLA is not a control. It is a delay generator.
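Threshold-based approval and reviewer SLAs can be sketched together. The $200 line comes from the refund example above; the four-hour SLA, the `PendingApproval` record, and the function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal

AUTO_APPROVE_LIMIT = Decimal("200")   # the $200 line from the refund example
REVIEWER_SLA = timedelta(hours=4)     # assumed response-time target


@dataclass
class PendingApproval:
    refund_id: str
    amount: Decimal
    queue_owner: str
    respond_by: datetime


def route_refund(refund_id, amount, queue_owner, now=None):
    """Return ("execute", None) below the limit, else ("hold", PendingApproval)."""
    now = now or datetime.now(timezone.utc)
    if amount < AUTO_APPROVE_LIMIT:
        return "execute", None
    return "hold", PendingApproval(refund_id, amount, queue_owner, now + REVIEWER_SLA)


def overdue(pending, now):
    """Approvals past their SLA; surfacing these is the queue owner's job."""
    return [p for p in pending if now > p.respond_by]
```

Under this routing, a $4,000 refund at 2 a.m. would be held with a named queue owner and a response deadline rather than executed. Batching fits on top of the same record: group `PendingApproval` items by type and present them as one reviewed set.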
This is also the point where Risk and Legal stop being abstractions and turn into a named position in the path. For action classes with regulatory exposure, the brief is short: the approver for that class is a specific Risk or Legal owner, and that placement is itself part of your accountability record. When the regulator asks who authorized the action, the answer is a name. Not "the system."
Review-after in production: speed with accountability
Review-after is the mode most governance content ignores, and it earns real depth, because it is where a lot of practical agentic value actually sits. The agent acts immediately. It resolves a low-risk ticket, posts an internal status update, re-prices inside a guardrailed band. Then a human reviews the trail afterward. You get the speed of Autonomous with an accountability layer Autonomous does not carry.
It only counts as governance if the review is real. Review-after has four requirements, and dropping any one of them turns it into Autonomous in a costume:
- Complete, tamper-evident logs - the full record of what the agent did, in a form nobody can quietly edit.
- A defined sampling or audit cadence - either every action gets reviewed, or a stated percentage does, on a schedule someone owns.
- A rollback path - a concrete way to undo an action the review catches, fast.
- An escalation rule - if review keeps finding errors in an action class, that class gets promoted to Approve-before automatically. The mode is not permanent.
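The first requirement, tamper-evident logs, is the one that most benefits from a concrete picture. A common technique is a hash chain, where each entry's hash covers the previous entry's hash, so editing any past record breaks every hash after it. The sketch below is a minimal in-memory version with hypothetical function names. The chain only proves tampering if the latest hash is also stored somewhere the editor cannot reach, which this example does not show.

```python
import hashlib
import json


def append_entry(log, record):
    """Append a record whose hash covers its content and the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
    log.append({"record": record, "prev": prev_hash, "hash": entry_hash})


def verify(log):
    """Recompute the chain; return the index of the first broken entry, or None."""
    prev_hash = "0" * 64
    for i, entry in enumerate(log):
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return i
        prev_hash = entry["hash"]
    return None


trail = []
append_entry(trail, {"action": "resolve_ticket", "id": 1})
append_entry(trail, {"action": "resolve_ticket", "id": 2})
```

Running `verify(trail)` on the untouched log returns `None`. If someone later edits the first record in place, `verify` returns `0`, the index where the chain stops matching.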
The dangerous failure here is the one that looks fine on paper: Review-after with logs nobody reads. It is worse than honest Autonomous, because it wears the appearance of governance with none of the substance. If you are going to claim an action class is reviewed, fund the review.
Making the model auditable: what Risk and Legal need
The reason this framework holds up under scrutiny is simple. The placement decision is the accountability record. You are not bolting a governance layer on top of the agents. You are documenting, per action class, who holds the decision right and who reviews, which is exactly what an auditor walks in asking for.
Map each mode to the artifact someone will eventually request:
- Autonomous - the scope and rate-limit configuration, the action logs, and evidence the kill switch exists and is owned.
- Approve-before - the approval logs showing proposed action, approver identity, decision, and timestamp.
- Review-after - the review records, the sampling cadence, rollback evidence, and the escalation history.
There is one more move that separates a mature program from a static one. Modes should migrate. Start a new action class in Approve-before while confidence is low. As the logs show the agent is reliable and observability is solid, graduate the class to Review-after. Eventually, for genuinely low-stakes reversible work, it can earn Autonomous. Governance is a dial you turn as evidence accumulates, not a switch you flip once at deployment and forget. Document the migration, and the dial itself becomes part of the audit story.
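The dial can be written as a rule that takes review evidence and returns the next mode along with a reason to record. This is a sketch under stated assumptions: the thresholds (500 reviewed actions, 1% to graduate, 5% to fall back) are placeholders rather than recommendations, and `Evidence` and `next_mode` are hypothetical names.

```python
from dataclasses import dataclass

LADDER = ("approve-before", "review-after", "autonomous")  # least to most autonomy


@dataclass(frozen=True)
class Evidence:
    reviewed: int                 # actions a human actually reviewed in the period
    errors: int                   # of those, how many the review flagged
    low_stakes_reversible: bool   # precondition for ever reaching Autonomous


def next_mode(current, ev, min_reviewed=500, promote_below=0.01, escalate_above=0.05):
    """Return (mode, reason). Thresholds are placeholders, not recommendations."""
    if ev.reviewed == 0:
        return current, "no reviewed actions; no evidence to move on"
    error_rate = ev.errors / ev.reviewed
    # Too many errors sends the class back to Approve-before, whatever mode it was in.
    if error_rate > escalate_above and current != "approve-before":
        return "approve-before", f"error rate {error_rate:.1%} exceeded escalation threshold"
    # Enough clean evidence moves the class one step toward autonomy, never two.
    if ev.reviewed >= min_reviewed and error_rate < promote_below:
        step = LADDER.index(current) + 1
        if step < len(LADDER) and (LADDER[step] != "autonomous" or ev.low_stakes_reversible):
            return LADDER[step], f"error rate {error_rate:.1%} over {ev.reviewed} reviewed actions"
    return current, "evidence does not justify a change"
```

Storing the returned reason alongside each mode change is what turns the migration into part of the audit story: every move up or down the ladder carries the evidence that justified it.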
Common mistakes when assigning decision rights
The framework is simple to state and easy to get wrong in predictable ways. The four that show up most:
- Treating human-in-the-loop as binary. Human in or human out is two options. There are three. Collapsing Review-after into either Autonomous or Approve-before throws away the mode where most safe-but-fast automation actually lives.
- Setting the mode by model capability instead of action risk. A more capable model does not earn an action more autonomy. The action's reversibility and blast radius set the mode. The model is a component, not a license.
- No migration path. Everything frozen in Approve-before produces automation theater, a human rubber-stamping a queue, all the latency of oversight with none of the throughput of automation. If nothing ever graduates, the framework has calcified.
- Review-after with no real review. Keeping logs nobody reads is the most expensive mistake, because it stays invisible right up until the incident. If you cannot fund the review, the action does not belong in Review-after.
Key takeaways
- The governance question for AI agents is decision rights, not model choice. Models swap. Decision-rights placement is the control surface that lasts.
- Every agent action sits in one of three modes: Autonomous (acts alone), Approve-before (human signs off first), or Review-after (human reviews the trail).
- Place each action by walking five criteria in order: reversibility, blast radius, regulatory exposure, observability, and frequency.
- Approve-before is human-in-the-loop. Review-after is human-on-the-loop. Different controls, and Review-after is the one most programs neglect.
- Governance is a dial. Migrate action classes toward more autonomy as confidence and observability grow, and document the migration as part of your audit trail.
Map your agent use cases to the three modes
Name every action your agents can take, and the mode each one sits in, and you have a defensible governance program. If you cannot, that is the gap to close before the next deployment. Not after the next incident.
Book a Discovery Sprint - a focused engagement that maps your specific agent use cases onto the Autonomous, Approve-before, and Review-after modes and produces an auditable decision-rights policy your Risk and Legal teams can stand behind.