Governing Agentic Workflows Before They Scale

Abstract

An agent is not a bigger chatbot. It is a system that turns one user action into many model calls, selects its own tools, reaches across enterprise data, and decides for itself what context goes into each request. That autonomy is exactly what makes agents useful and exactly what makes them hard to govern. This paper maps the new risk surface agents introduce, explains why per-application controls fail at agent scale, and presents a control framework that lives at the context gateway beneath every agent. It includes a phased rollout that moves an organization from observe-only to fully governed without rewriting a single agent.

Key takeaways

1. Agents move the decision about what enters each model call from the developer to the runtime, which is precisely what governance has to catch up to.
2. Autonomy creates new risks: uncontrolled data egress, runaway cost, unexplained action, prompt injection through tools, and excessive agency.
3. Per-application governance fails at agent scale because controls drift, every new agent is a new gap, and no consolidated record exists.
4. Governing the context layer applies policy, redaction, data-class rules, approvals, and audit once, beneath every agent.
5. Effective control combines preventive, detective, and corrective mechanisms, each with a concrete gateway implementation.
6. Observability for autonomy means tracing, replaying, and attributing every individual agent step, not just the top-level request.
7. A phased rollout that starts observe-only proves auditability before it changes any agent behavior, and de-risks every step that follows.

Executive summary

The shift from assistants to agents is a shift in who decides. A conversational feature sends the context a developer wrote. An agent assembles its own context at runtime, chooses which tools to call, fans a single instruction into a chain of model calls, and reaches into systems the developer never explicitly named. The thing you most need to govern, what was sent to a model and why, is now decided by the agent rather than by a person.

Most agent governance today is per application. Each team wires its own redaction, its own allow-lists, its own logging, if it wires them at all. That arrangement does not survive contact with scale. Every new agent is a new gap, controls drift apart across teams, and no one holds a consolidated record of what the fleet actually did. The first serious incident is also the first time anyone looks for evidence that was never centrally captured.

5–100+ model calls a single agentic workflow can fan out per user action (typical range)

70–90% of agent traffic that bypasses any central policy when governance is per application (illustrative)

1 of N agents needed to leak data or take an unexplained action before the whole fleet is in scope for review (illustrative)

The durable answer is to govern the context layer, not each agent. A gateway that sits between every agent and every model can apply policy, redaction, data-class rules, approvals, and audit once, beneath all of them. This paper sets out the risk surface, a preventive-detective-corrective control framework mapped to gateway mechanisms, and a five-step rollout. The argument is simple: govern before you scale, because retrofitting controls across a fleet of deployed agents costs far more than building the layer they all pass through.

I. What changes with agents

A conversational AI feature has a comforting property: a developer decides what goes into the prompt. The system instructions, the retrieval template, the tool list, the safety rules, all of it is authored ahead of time and reviewed like any other code. An agent breaks that property. Given a goal, it plans its own steps, decides which tools to call and in what order, chooses what to retrieve, and assembles the context for each call at runtime. The most governance-relevant decision in the system, what reaches the model, is now made by the agent itself.

One action becomes many calls

A user clicks once. Behind that click an agent may produce a plan, select a tool, call the model to format the arguments, execute the tool, retrieve supporting documents, reason over the result, call another tool, and synthesize a final answer. Each of those is a separate model call carrying its own context. The fan-out is not incidental; it is how agents work. A workflow that reads as one request in a design document can be anywhere from five to more than a hundred calls in production.

Tool use widens the blast radius

Tools turn a text generator into an actor. An agent that can query a database, call an internal API, send a message, or write to a ticketing system can cause effects in the world, not just produce words. Every tool is a new path along which data can leave and along which untrusted content can enter. The agent decides which of those paths to take, and it decides per step.

Data reach is set at runtime

Because the agent chooses what to retrieve and which tool to invoke, the set of systems it touches is not fixed at design time. An agent given access to a document store and a search tool can reach data the developer never specifically enumerated, simply because the goal led it there. Autonomy and data reach are the same property viewed from two angles, and both are decided while the workflow runs.

II. The new risk surface

The risks agents introduce are not new categories of harm. They are familiar harms made faster, more frequent, and harder to attribute because a machine is now making the choices that used to require a person. The table below names the five that matter most for an enterprise putting agents into production.

Risk	How agents introduce it	Consequence
Uncontrolled data egress	Agent decides at runtime what to retrieve and place in a prompt, reaching data no developer enumerated	Regulated or confidential data sent to a model or a third party with no record
Runaway cost	Fan-out, retries, and reasoning loops multiply calls per action with no per-workflow ceiling	A single misbehaving agent generates a large unbudgeted spend before anyone notices
Unexplained action	Tool calls execute effects in source systems without a step-level record of why	An action lands in production that no one can reconstruct or justify after the fact
Prompt injection via tools	Untrusted content returned by a tool or document is treated as instruction by the agent	Attacker-controlled text redirects the agent to exfiltrate data or misuse a tool
Excessive agency	Agent holds broad standing permissions and acts beyond what the task required	A small reasoning error becomes a large effect because the agent could do more than it needed to

The agent risk surface: how autonomy introduces each risk and what it costs.

Three of these, prompt injection, excessive agency, and the data-exposure path behind uncontrolled egress, map directly onto the most cited categories of language-model application risk in public guidance such as the OWASP Top 10 for LLM Applications. The other two, runaway cost and unexplained action, are operational rather than adversarial, but they are the two that surface first in real deployments and the two that erode executive trust fastest.

III. Why per-application governance fails at agent scale

The instinct is to govern each agent where it lives. The team that builds an agent adds redaction in its own code, maintains its own list of approved tools, and writes its own logs. This works for one agent owned by one careful team. It does not survive a fleet.

Controls drift

Two teams implement redaction differently. One masks national identifiers, the other does not consider them sensitive. One logs full prompts, the other logs only metadata. There is no single definition of a policy and no single place to change it, so the controls diverge the moment there is more than one of them, and they keep diverging as people come and go.

Every new agent is a new gap

Governance that lives in application code is opt-in by construction. A new agent is compliant only if its author remembers to wire in every control and wires each one correctly. At a handful of agents this is merely fragile. At dozens, built by different teams under deadline, the probability that all of them are fully governed approaches zero. The fleet is exactly as strong as its least careful integration.
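The arithmetic behind that claim can be sketched in a few lines. The sketch assumes, purely for illustration, that each agent is wired correctly with the same probability and independently of the others; the 95% figure is a made-up input, not a measurement.

```python
def prob_fleet_fully_governed(p_per_agent: float, n_agents: int) -> float:
    """Chance that all n agents are fully governed, assuming each one is
    correct with probability p_per_agent, independently (an assumption)."""
    if not 0.0 <= p_per_agent <= 1.0:
        raise ValueError("p_per_agent must be between 0 and 1")
    if n_agents < 0:
        raise ValueError("n_agents must be non-negative")
    return p_per_agent ** n_agents


# Illustrative input: every team gets it right 95% of the time.
for n in (5, 20, 50):
    print(n, round(prob_fleet_fully_governed(0.95, n), 3))
```

Under those assumptions the chance of a fully governed fleet is roughly 77% at five agents, 36% at twenty, and 8% at fifty, even though each individual team is careful.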

No consolidated record

When logs live with each application, there is no single answer to the question a regulator or an incident responder will ask: what did our agents send to models, and what did they do, across the organization, in this window. Reconstructing that means querying many systems with different schemas and different retention, assuming each one captured anything at all. The evidence you need most is the evidence that was never centrally kept.

A fleet governed application by application is governed only as well as its weakest integration, and you will not know which one that is until it fails.

IV. Govern the context layer

The structural fix is the same one that controls cost: move governance to the place all context passes through. A gateway between every agent and every model is the one point where policy, redaction, data-class rules, approvals, and audit can be applied once and apply to all. The agent still decides what it wants to send. The gateway decides what is allowed to pass, and records what did.

Policy, defined once

Rules about which models may receive which classes of data, which tools an agent may call, and which actions require review are authored in one place and enforced for every agent. A change to policy is a change in one location, not a coordinated edit across many codebases that some teams will miss.
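One way to picture policy defined once is as a single data structure that the gateway consults for every agent. The sketch below is a minimal illustration; the agent names, tool names, and the `check_tool_call` function are hypothetical and do not describe any particular product.

```python
# Hypothetical central policy. All names here are illustrative assumptions.
POLICY = {
    "tools_by_agent": {
        "support-agent": {"search_kb", "create_ticket", "issue_refund"},
        "research-agent": {"search_kb"},
    },
    "actions_requiring_review": {"issue_refund", "send_external_email"},
}


def check_tool_call(agent: str, tool: str) -> str:
    """Return 'allowed', 'held_for_approval', or 'blocked' for a tool call."""
    # Agents with no policy entry get an empty grant, so a new agent is
    # denied by default until someone scopes it.
    granted = POLICY["tools_by_agent"].get(agent, set())
    if tool not in granted:
        return "blocked"
    if tool in POLICY["actions_requiring_review"]:
        return "held_for_approval"
    return "allowed"


print(check_tool_call("support-agent", "search_kb"))
print(check_tool_call("support-agent", "issue_refund"))
print(check_tool_call("brand-new-agent", "search_kb"))
```

Because the lookup defaults to an empty grant, changing one entry in `POLICY` changes behavior for every agent at once, and an agent nobody has scoped yet can call nothing.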

Redaction and data-class rules at the boundary

Every outbound request is inspected and classified before it reaches a model. Sensitive spans are masked or blocked according to the data class and the destination, so a request that would carry regulated data to a model that is not approved for it never leaves the boundary. Because this runs beneath the agent, it holds even when the agent's own logic is wrong or its prompt has been manipulated.
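A minimal sketch of boundary inspection follows, assuming two toy data classes detected by regular expressions and a hypothetical table of which destinations may receive which classes. Real classifiers are far more thorough; the patterns, model names, and `inspect_outbound` function are illustrative only.

```python
import re
from dataclasses import dataclass

# Toy detectors for two data classes (illustrative, not production-grade).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical policy: data classes each destination may receive unmasked.
MODEL_ALLOWED_CLASSES = {
    "internal-model": {"email"},
    "external-model": set(),
}


@dataclass
class Inspection:
    text: str       # the request text after any masking
    found: set      # data classes detected in the original text
    redacted: set   # data classes masked for this destination


def inspect_outbound(text: str, model: str) -> Inspection:
    """Classify an outbound request and mask classes the destination may not see."""
    # Unknown destinations are approved for nothing.
    allowed = MODEL_ALLOWED_CLASSES.get(model, set())
    found, redacted = set(), set()
    for data_class, pattern in PATTERNS.items():
        if pattern.search(text):
            found.add(data_class)
            if data_class not in allowed:
                text = pattern.sub(f"[{data_class.upper()} REDACTED]", text)
                redacted.add(data_class)
    return Inspection(text=text, found=found, redacted=redacted)


result = inspect_outbound("Reach jane@example.com, SSN 123-45-6789", "internal-model")
print(result.text)
```

The same request sent to `external-model` would have both spans masked, because the decision depends on the destination as well as the data class.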

Approvals as a gate, not an afterthought

Defined high-consequence actions are held at the gateway pending human approval rather than executed and reviewed later. The agent proposes; a person disposes; the gateway enforces the pause. The control does not depend on each agent remembering to ask.

Audit by default

Every request and every tool action passes through the gateway, so the record is a property of the path rather than something each team opts into. There is one schema, one retention policy, and one place to answer what the fleet did.
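One common way to make such a record tamper-evident is to chain entries by hash, so that altering any past entry invalidates everything after it. The paper does not prescribe this technique; the sketch below is one possible approach, and the `AuditLog` class and event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


class AuditLog:
    """Append-only log in which each entry commits to the one before it."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        digest = self._digest(prev_hash, event)
        self.entries.append({"event": event, "prev": prev_hash, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append({"agent": "support-agent", "step": 1, "decision": "allowed"})
log.append({"agent": "support-agent", "step": 2, "decision": "redacted"})
print(log.verify())
```

An in-memory list is of course not immutable storage; the point of the sketch is only that a single schema at the gateway makes this kind of integrity check straightforward to apply to every agent.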

V. A control framework

Governance is not a single feature. It is a layered set of controls that prevent what they can, detect what they cannot prevent, and correct what gets through. The value of placing them at the gateway is that each one is implemented once and inherited by every agent. The table maps the three control types to concrete gateway mechanisms.

Control type	Objective	Gateway mechanism
Preventive	Stop disallowed context or action before it happens	Data-class redaction, model allow-lists, tool scoping, per-workflow cost ceilings
Preventive	Limit standing authority to the task at hand	Least-privilege tool grants and short-lived, step-scoped permissions
Detective	See what every agent step did, in time to act	Full request and tool-call tracing, anomaly and cost alerts, injection heuristics
Detective	Attribute behavior to a workflow, team, and identity	Per-step tagging and a consolidated, queryable audit record
Corrective	Stop a misbehaving agent quickly	Circuit breakers, kill switches, and automatic throttling on threshold breach
Corrective	Reconstruct and remediate after an incident	Step-level replay and an immutable record of context and actions

Preventive, detective, and corrective controls mapped to gateway mechanisms.

No single row is sufficient on its own. Prevention will miss novel attacks, detection without correction only documents the damage, and correction without detection has nothing to trigger it. The point of the gateway is that an organization can hold all three at once, consistently, for every agent it runs, rather than hoping each team assembled its own version of the same defenses.
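As an illustration of how a preventive ceiling and a corrective circuit breaker can share one mechanism, the sketch below tracks spend and call count per workflow and refuses further calls once either limit would be exceeded. The `WorkflowBudget` class and its thresholds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class WorkflowBudget:
    """Per-workflow ceiling that trips like a circuit breaker when breached."""
    ceiling_usd: float
    max_calls: int
    spent_usd: float = 0.0
    calls: int = 0
    tripped: bool = False

    def admit(self, estimated_cost_usd: float) -> bool:
        """Return True if the next call may proceed, False if it is refused."""
        if self.tripped:
            return False  # breaker stays open until someone resets it
        over_calls = self.calls + 1 > self.max_calls
        over_cost = self.spent_usd + estimated_cost_usd > self.ceiling_usd
        if over_calls or over_cost:
            self.tripped = True
            return False
        self.calls += 1
        self.spent_usd += estimated_cost_usd
        return True

    def reset(self) -> None:
        """Manual corrective action after review."""
        self.spent_usd, self.calls, self.tripped = 0.0, 0, False


budget = WorkflowBudget(ceiling_usd=1.00, max_calls=50)
print([budget.admit(0.40) for _ in range(4)])
```

With a one-dollar ceiling and calls estimated at forty cents each, the first two calls are admitted and every later call is refused until the breaker is reset.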

VI. Observability for autonomy

You cannot govern what you cannot see, and an agent is mostly invisible at the level a normal request log operates. A top-level log says a user asked a question and received an answer. It does not show the eight model calls, three tool invocations, and two retrievals that happened in between, which is exactly where the governance-relevant decisions were made. Observability for autonomy means instrumenting the steps, not the request.

Trace every step

Each model call and each tool invocation in a workflow is recorded as a node in a single trace, linked to the user action that started it. The trace shows the plan, the order of operations, the context sent at each call, the tool arguments, and the result returned. The unit of observation is the agent step.

Replay to reconstruct

Because the gateway holds the exact context and arguments of every step, a past workflow can be replayed to understand why an agent did what it did. This is the difference between an incident review that produces an explanation and one that produces a guess.

Attribute every call

Each step carries the identity of the agent, the workflow, the team, and the end user or service on whose behalf it ran. Attribution makes it possible to answer who, not just what, and to scope an incident to a single agent rather than the whole fleet.

Per agent step, the record should capture at minimum the following.

The full context sent to the model and the data classes it contained after redaction.
The model and version that received it, and whether that destination was approved for that data class.
The tool invoked, its arguments, and the result returned, including whether the result was treated as untrusted.
The identity chain: agent, workflow, team, and the principal on whose behalf the step ran.
Cost and latency for the step, so fan-out is visible as it accumulates rather than only at month end.
Any policy decision applied: allowed, redacted, blocked, or held for approval, and the rule that decided it.
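The fields above can be sketched as a single record type, with one record written per agent step and linked to its originating user action by a trace identifier. The field names and the `fan_out_summary` helper are hypothetical; they mirror the list rather than any specific schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepRecord:
    trace_id: str                 # links the step to the user action that started it
    step_index: int
    context_sent: str             # full context after redaction
    data_classes: tuple           # data classes present after redaction
    model: str
    model_version: str
    destination_approved: bool    # was this model approved for these classes?
    tool_name: Optional[str]
    tool_args: Optional[dict]
    tool_result: Optional[str]
    result_untrusted: bool        # was the tool result treated as untrusted?
    agent: str
    workflow: str
    team: str
    principal: str                # end user or service the step ran for
    cost_usd: float
    latency_ms: int
    policy_decision: str          # allowed, redacted, blocked, held_for_approval
    policy_rule: str              # the rule that produced the decision


def fan_out_summary(records):
    """Aggregate step count and cost per trace so fan-out is visible as it grows."""
    summary = {}
    for record in records:
        entry = summary.setdefault(record.trace_id, {"steps": 0, "cost_usd": 0.0})
        entry["steps"] += 1
        entry["cost_usd"] += record.cost_usd
    return summary
```

Grouping by `trace_id` is what turns a pile of individual calls back into the workflow a user actually triggered; grouping by `agent`, `team`, or `principal` answers the attribution questions in the same way.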

VII. Human-in-the-loop and approvals

Autonomy is a setting, not a constant. The right amount depends on the consequence of the action, and a governed system makes that tradeoff explicit rather than leaving it to whatever the agent decides in the moment. The question is not whether to keep a human in the loop but precisely when autonomy should pause.

When autonomy should pause

Pause where an action is hard to reverse, touches regulated data or money, affects a customer directly, or exceeds a cost or scope threshold. Read-only retrieval and low-stakes internal steps can run unattended. A write to a system of record, an external message, a payment, or a bulk operation should stop and ask. The dividing line is consequence, and it should be written down as policy, not improvised per agent.

Approval policies for defined actions

Each high-consequence action is named in policy with the approval it requires: who can approve it, how long the request may wait, and what happens on timeout. The gateway holds the action until the approval arrives, attaches the approver's identity to the audit record, and refuses to execute anything in the protected set without it. Because the gate lives beneath the agent, an agent cannot skip it by malfunctioning or by being manipulated into trying.
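A small sketch of such a policy and the gate that enforces it is shown below. The role names, the one-hour wait, and the `decide` function are illustrative assumptions; the current time is passed in explicitly so the logic is easy to test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApprovalPolicy:
    action: str
    approver_roles: frozenset     # who can approve
    max_wait_seconds: float       # how long the request may wait
    on_timeout: str = "deny"      # what happens on timeout


@dataclass
class PendingAction:
    policy: ApprovalPolicy
    requested_at: float
    approved_by: Optional[str] = None  # recorded for the audit trail


def decide(pending: PendingAction, now: float,
           approver: Optional[str] = None,
           approver_role: Optional[str] = None) -> str:
    """Return 'execute', 'hold', or the policy's timeout outcome."""
    # Timeout is checked first, so a late approval cannot revive an expired request.
    if now - pending.requested_at > pending.policy.max_wait_seconds:
        return pending.policy.on_timeout
    if approver is not None and approver_role in pending.policy.approver_roles:
        pending.approved_by = approver
        return "execute"
    return "hold"


refund = ApprovalPolicy("issue_refund", frozenset({"finance_lead"}), 3600.0)
request = PendingAction(policy=refund, requested_at=0.0)
print(decide(request, now=10.0))
print(decide(request, now=20.0, approver="alice", approver_role="finance_lead"))
```

The agent never calls `decide` itself in this picture; the gateway does, which is what keeps the pause in place when the agent malfunctions.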

Approvals are a control, not a courtesy

An approval step that the agent itself implements can be removed, bypassed, or simply forgotten by the next team to ship. An approval enforced at the gateway is a property of the path every agent shares. Put the gate where the agent cannot route around it.

VIII. Regulatory pressure

The case for governing agents is not only operational. The external expectations on AI systems are converging on exactly the controls a context gateway provides: documented risk management, data governance, human oversight, logging, and traceability. An organization that can show what its agents sent and did is also an organization that can answer a regulator.

What the frameworks ask for, and where the gateway answers

The EU AI Act requires, for high-risk systems, record-keeping and automatic logging of events, human oversight, and data governance over what enters the system. The NIST AI Risk Management Framework organizes its expectations around Govern, Map, Measure, and Manage, which in practice means knowing your AI systems, measuring their behavior, and managing it continuously. The OWASP Top 10 for LLM Applications names prompt injection, sensitive information disclosure, and excessive agency as leading risks. Emerging audit expectations push toward a consolidated, tamper-evident record of AI activity. A context gateway answers each of these in the same place: redaction and data-class rules satisfy data governance and disclosure control, approvals satisfy human oversight, per-step tracing and an immutable record satisfy logging and traceability, and tool scoping with least privilege answers excessive agency. The alternative, assembling this evidence per application after the fact, is the expensive path the next section is written to avoid.

None of these frameworks mandates a gateway by name. They mandate outcomes: that you can govern, document, oversee, and reconstruct what your AI does. A single layer beneath all agents is the most direct way to produce those outcomes consistently, rather than proving them one agent at a time and hoping the coverage holds.

IX. A phased governance rollout

Governance does not have to arrive all at once, and it should not change agent behavior on day one. The sequence below establishes visibility first, then control, with each step de-risking the next. Adoption is a configuration change: point each agent's model and tool calls at the gateway. No agent is rewritten.

1. Route observe-only. Send every agent's model and tool traffic through the gateway and change nothing about behavior. Establish a per-workflow baseline of steps, fan-out, cost, latency, and what each agent actually sends and does.
2. Classify data. Inspect the traffic now flowing through the gateway and classify what agents send by data class. You will usually find sensitive data reaching destinations no one intended, which is the evidence that turns governance from a project into a priority.
3. Enable redaction and approvals. With the data picture in hand, turn on redaction and data-class rules at the boundary, and place defined high-consequence actions behind approval gates. Behavior changes only where a rule says it must.
4. Add policy. Author model allow-lists, tool scoping, least-privilege grants, and per-workflow cost ceilings as central policy enforced for every agent, including ones not yet built.
5. Prove auditability. Exercise the consolidated record: replay a past workflow end to end, attribute it to an identity, and produce the report a regulator or incident responder would ask for. Auditability you have rehearsed is the only kind you can rely on.

Each phase delivers value before the next begins. Observe-only alone surfaces runaway cost and surprising data reach. Classification alone justifies the controls that follow. By the time policy is enforced, the organization already trusts the layer enforcing it, because it has been watching it work in read-only mode the whole time.
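The difference between observe-only and enforcement can be reduced to a single mode switch: in both modes the gateway evaluates the rule and records the decision, but only in enforce mode does the decision change what happens. The sketch below assumes a hypothetical `handle` function and a toy blocking rule.

```python
from enum import Enum


class Mode(Enum):
    OBSERVE = "observe"   # evaluate and record, never change behavior
    ENFORCE = "enforce"   # evaluate, record, and apply the decision


def handle(request_text, mode, would_block, log):
    """Evaluate a rule against a request; return the text to forward, or None."""
    decision = "blocked" if would_block(request_text) else "allowed"
    log.append({
        "decision": decision,
        "mode": mode.value,
        "enforced": mode is Mode.ENFORCE,
    })
    if decision == "blocked" and mode is Mode.ENFORCE:
        return None
    return request_text


def contains_secret(text):
    """Toy rule used only for this illustration."""
    return "SECRET" in text


log = []
print(handle("project SECRET plan", Mode.OBSERVE, contains_secret, log))
print(handle("project SECRET plan", Mode.ENFORCE, contains_secret, log))
print(log)
```

The observe-only log already shows what would have been blocked, which is the evidence the later phases rely on before any behavior changes.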

X. Conclusion

Agents are worth building, and they are being built whether or not the governance is ready. The choice an enterprise actually faces is not whether to adopt agents but whether to put the control layer in before the fleet grows or after. Before, it is one gateway that every new agent inherits by pointing at a base URL. After, it is a retrofit across dozens of deployed agents owned by different teams, each with its own controls to reconcile and its own logs to reconcile them against, undertaken under the pressure of an incident or an audit rather than at leisure.

The economics are decisive. Governing the context layer once, beneath all agents, costs a fraction of governing each agent separately and reconciling them later, and it is the only approach that holds as the fleet scales. Govern the layer they share, prove you can reconstruct what they did, and do it while the fleet is small enough that the layer is cheap to install. The stakes of autonomy are real, but they are manageable for the organization that builds the layer all context passes through before it has many agents sending context through it.

References

[1] NIST, Artificial Intelligence Risk Management Framework (AI RMF 1.0), NIST AI 100-1, 2023.
[2] European Union, Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act), 2024.
[3] OWASP, Top 10 for Large Language Model Applications, 2025.
[4] MITRE, ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) knowledge base.
[5] NIST, Generative AI Profile (AI 600-1), Companion to the AI Risk Management Framework, 2024.