Risk

Data risk classification is a standard part of enterprise software governance. What makes AI agents different is not that they introduce risk where none existed before, but that their risk profile is harder to assess statically — and that new categories of risk emerge from their non-deterministic nature.

A traditional software system that queries a database does so in a way that is mostly fixed at design time — you can read the code and know what it accesses. An AI agent makes decisions dynamically: which tools it calls, what data it retrieves, what actions it takes. Two runs of the same agent on different inputs can touch entirely different categories of data and take entirely different kinds of actions.

The non-determinism also means that vulnerabilities can arise from combinations that would not be obvious from inspecting each capability in isolation. An agent with access to an email tool and an outbound HTTP tool might be entirely safe when each is used independently. Combined in a single context, for example when an email contains an injected prompt that causes the agent to chain both tools together, those same capabilities can become an exfiltration vector.
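
The email-plus-HTTP case can be made concrete with a small check over the tools a single run actually used. This is a minimal sketch, not a description of how any particular product detects such chains: the tool names `read_email` and `http_post`, the `RISKY_COMBINATIONS` table, and the function name are all hypothetical.

```python
# Hypothetical table of tool pairs that are only risky when used together.
RISKY_COMBINATIONS = {
    frozenset({"read_email", "http_post"}):
        "untrusted email content can reach an outbound channel",
}


def flag_risky_combinations(tools_used):
    """Return the reason for every risky combination present in one run."""
    used = set(tools_used)
    return [
        reason
        for combo, reason in RISKY_COMBINATIONS.items()
        if combo <= used  # every tool in the combination was called
    ]


# Each tool on its own raises no flag; the pair in one run does.
alone = flag_risky_combinations(["read_email"])
chained = flag_risky_combinations(["read_email", "summarise", "http_post"])
```

The point of the sketch is that the check runs over what a run did, not over what the agent is permitted to do, because the risky pairing only exists once both tools appear in the same context.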

In formal risk management — the kind used in insurance, finance, and enterprise governance — risk is not simply danger or the presence of something hazardous. It is a way of quantifying uncertainty about future events in terms of two properties: the likelihood that an event occurs, and the magnitude of its consequences if it does. An event that is highly likely but has minor consequences may be lower risk than one that is unlikely but catastrophic.
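
A common simplification of this idea treats risk as the product of likelihood and magnitude. The sketch below assumes that product form and uses made-up numbers purely to illustrate the comparison in the paragraph above; the text does not prescribe a specific formula.

```python
def risk_score(likelihood: float, magnitude: float) -> float:
    """Expected-loss style score: probability of the event times its cost.

    Assumes likelihood is a probability in [0, 1] and magnitude is a
    non-negative cost in whatever unit the organisation uses.
    """
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must be between 0 and 1")
    if magnitude < 0:
        raise ValueError("magnitude must be non-negative")
    return likelihood * magnitude


# Illustrative values only: a frequent minor event versus a rare severe one.
frequent_minor = risk_score(likelihood=0.9, magnitude=1_000)
rare_catastrophic = risk_score(likelihood=0.01, magnitude=1_000_000)

# Under this simplification the rare event carries the larger exposure.
rare_event_dominates = rare_catastrophic > frequent_minor
```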

Risk is always prospective — it describes exposure to possible outcomes, not outcomes that have already occurred. Risk management exists to make that exposure legible before anything goes wrong, so that organisations can make informed decisions about what level of exposure is acceptable and where controls are warranted.

In data governance, this translates to questions like: what sensitive information does this system have access to, what could happen if it were mishandled, and how likely is mishandling given how the system operates?

Every action an agent takes has two risk-relevant properties: the sensitivity of the data involved, and the consequence of the action being performed.

Data sensitivity reflects the regulatory and ethical weight attached to a category of information. Personal identifiers, financial records, and health data each carry different implications if they are mishandled, exposed, or processed without appropriate controls.

Action consequence reflects what the agent does with the data. Reading a record is different from writing one. Retrieving information is different from initiating a transaction. The potential for harm scales with the type of action: an agent that reads a health record poses a different risk than one that modifies it.
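
One way to picture the two properties is as separate fields on a record of each action. The following is a sketch under assumed names: the `Sensitivity` and `Consequence` scales, their three levels, and the `AgentAction` type are illustrative and not taken from any defined schema.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical scale for the data an action touches."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Consequence(IntEnum):
    """Hypothetical scale for what the action does with that data."""
    READ = 1
    WRITE = 2
    TRANSACT = 3


@dataclass(frozen=True)
class AgentAction:
    tool: str
    sensitivity: Sensitivity
    consequence: Consequence


# Same data category, different action: the two dimensions vary independently.
read_health = AgentAction("get_record", Sensitivity.HIGH, Consequence.READ)
modify_health = AgentAction("update_record", Sensitivity.HIGH, Consequence.WRITE)
```

Keeping the dimensions as separate fields, instead of collapsing them into one number at capture time, leaves room to weight them differently later.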

What risk classification does and does not tell you

A risk classification is an assessment of what data an agent handled and how consequentially it acted — not a judgement about whether the agent behaved correctly or produced accurate outputs. Those are quality questions. Risk and quality are orthogonal: an agent can be high-risk and functioning exactly as intended, or low-risk and producing poor outputs.

A High classification means the run involved high-sensitivity data, high-consequence actions, or both. It identifies runs that warrant closer attention; it does not mean something went wrong.
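
The "either dimension, or both" rule can be sketched as taking the worst level seen across a run. This is a deliberate simplification with hypothetical names (`Level`, `classify_run`); an actual classification may be driven by configurable weights, multipliers, and thresholds instead of a plain maximum.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def classify_run(actions):
    """Classify a run from (sensitivity, consequence) pairs, one per action.

    Assumed rule: the run takes the highest level reached on either
    dimension by any action. An empty run is treated as LOW.
    """
    worst = Level.LOW
    for sensitivity, consequence in actions:
        worst = max(worst, sensitivity, consequence)
    return worst


# A single read of highly sensitive data is enough to make the run High,
# even though nothing was modified and nothing went wrong.
run = [(Level.LOW, Level.LOW), (Level.HIGH, Level.LOW)]
classification = classify_run(run)
```

Note that the function takes no input about correctness or output quality, which mirrors the separation between risk and quality described above.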

  • Risk profile — how to configure the weights, multipliers, and thresholds that translate risk dimensions into a classification.
  • Span — the unit at which risk is assessed.
  • Instance — where the aggregate risk classification is recorded.
  • Quality and performance — the separate question of how well an agent does its job.