About the Identifier Scheme

How risks are organized and referenced in the AI Risk Registry

The AI Risk Registry uses a hierarchical identifier scheme inspired by library classification systems. These identifiers provide stable, human-readable codes that make it easy to reference, cite, and discuss specific AI risks.

Why Structured Identifiers?

As AI risk taxonomies grow and evolve, having stable identifiers becomes essential for:

  • Citation and reference — Easily cite specific risks in reports, policies, and research
  • Cross-referencing — Map risks to other taxonomies like NIST AI RMF, OWASP LLM Top 10, and MITRE ATLAS
  • Communication — Provide a shared vocabulary for discussing AI risks across organizations
  • Stability — Identifiers never change once assigned, ensuring links and references remain valid

Identifier Format

Every risk in the registry has a unique identifier following this pattern:

R-CDPS.NNN

Component  Meaning
R          Registry prefix
C          Class (1-4)
D          Domain within the class (1-9)
P          Pattern (0-9, tens digit)
S          Slot for expansion (0-9)
NNN        Sequence number within the pattern (001-999)

Example: R-4210.003 breaks down as:

  • R — Registry prefix
  • 4 — Class 4 (Harms)
  • 2 — Domain 2 (Content & Conduct Harms)
  • 1 — Pattern 1 (Content Safety Harms)
  • 0 — Expansion slot (unused)
  • 003 — Third risk in that pattern
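As a minimal sketch, an R-ID can be validated and decomposed with a regular expression. This is illustrative only: the field names (`cls`, `domain`, and so on) are assumptions, not part of the registry specification.

```python
import re

# Illustrative parser for the R-CDPS.NNN format described above.
# Ranges follow the component table: class 1-4, domain 1-9,
# pattern and slot 0-9, sequence 001-999.
ID_RE = re.compile(
    r"^R-(?P<cls>[1-4])(?P<domain>[1-9])"
    r"(?P<pattern>[0-9])(?P<slot>[0-9])\.(?P<seq>[0-9]{3})$"
)

def parse_risk_id(risk_id: str) -> dict:
    """Split an R-ID like 'R-4210.003' into integer components."""
    m = ID_RE.match(risk_id)
    if m is None:
        raise ValueError(f"not a valid R-ID: {risk_id!r}")
    fields = {k: int(v) for k, v in m.groupdict().items()}
    if fields["seq"] == 0:  # sequence numbers run 001-999
        raise ValueError(f"sequence number must be 001-999: {risk_id!r}")
    return fields
```

For R-4210.003 this yields class 4, domain 2, pattern 1, slot 0, and sequence 3, matching the breakdown above.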

The Four-Class Model

The registry organizes all AI risks into four fundamental classes based on "what kind of thing" the risk represents:

Class                Code  Description
Threats              1     Adversarial actions that an attacker can intentionally trigger
Failure Modes        2     Non-adversarial technical issues arising from limitations, bugs, or drift
Governance Failures  3     Organizational and process failures in controls and accountability
Harms                4     Impact statements describing effects on people, organizations, or society

Why Separate Classes?

Mixing fundamentally different risk types (threats, failures, governance gaps, and harms) creates practical problems:

  • Ownership ambiguity — Threats are owned by security teams, Failure Modes by engineering, Governance Failures by legal/compliance, and Harms by policy/product.
  • Inconsistent assessment — The likelihood of an adversarial attack is assessed differently from the likelihood of a system bug.
  • Incoherent controls — Controls for attack techniques look nothing like controls for organizational gaps.

Separating these classes enables clearer ownership, consistent assessment, and coherent control mapping.

Classification Structure

Each class contains Domains (broad categories), which contain Patterns (specific risk families), which contain individual Risks.

Class (e.g., Threats)
└── Domain (e.g., Prompt & Interface Attacks)
    └── Pattern (e.g., Prompt Injection)
        └── Risk (e.g., Indirect Prompt Injection)
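One way to picture the hierarchy is as a nested mapping, using the example names from the tree above; the dict layout is just one possible representation, not the registry's actual storage format.

```python
# Class -> Domain -> Pattern -> list of Risks, using the example names above.
registry_tree = {
    "Threats": {
        "Prompt & Interface Attacks": {
            "Prompt Injection": [
                "Indirect Prompt Injection",
            ],
        },
    },
}

# Walking the four levels top-down mirrors the classification structure.
risks = registry_tree["Threats"]["Prompt & Interface Attacks"]["Prompt Injection"]
```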

Class 1: Threats

Adversarial actions an attacker can intentionally trigger in a realistic deployment.

Domain                       ID      Examples
Prompt & Interface Attacks   R-1100  Prompt injection, jailbreaking
Identity & Trust Attacks     R-1200  Impersonation, channel compromise
Persistence & Supply Chain   R-1300  Persistent compromise, artifact tampering
Data & Model Attacks         R-1400  Data poisoning, model extraction
Tooling & Privilege Attacks  R-1500  Integration abuse, privilege escalation
Malicious Use                R-1600  Content abuse, weaponization

Class 2: Failure Modes

Non-adversarial technical issues arising from limitations, bugs, drift, or emergent behavior.

Domain                     ID      Examples
Reliability & Calibration  R-2100  Hallucination, capability limitations
Alignment & Capability     R-2200  Goal misalignment, dangerous capabilities
Model Modification Drift   R-2300  Fine-tuning degradation, quantization drift
Assurance Gaps             R-2400  Transparency deficits, evaluation gaps
Emergent & Systemic        R-2500  Capability thresholds, multi-agent dynamics

Class 3: Governance Failures

Organizational and process failures in controls, accountability, or lifecycle management.

Domain                      ID      Examples
Regulatory & Legal          R-3100  Compliance failures
Accountability & Oversight  R-3200  Governance gaps
Lifecycle & Operations      R-3300  Change management, monitoring gaps
Safety Management           R-3400  Evaluation program failures
Human-in-the-Loop           R-3500  Review workflow failures

Class 4: Harms

Impact statements describing effects on people, organizations, society, or environment.

Domain                         ID      Examples
Information & Trust            R-4100  Information integrity harms
Content & Conduct              R-4200  Content safety harms
Privacy & Civil Liberties      R-4300  Privacy harms, surveillance harms
Safety & Cyber-Physical        R-4400  Cyber-physical harms, availability harms
Fairness & Discrimination      R-4500  Discrimination harms
Economic & Labor               R-4600  Economic inequality, labor market harms
Power & Governance-of-Society  R-4700  Autonomy loss, dependency harms
Creative Economy & IP          R-4800  IP harms, creative economy harms
Environmental                  R-4900  Environmental impact harms

How Risks Are Classified

When a new risk is identified, it is classified through the following steps to ensure consistency.

Step 0: Choose Class

Select based on the primary causal driver:

If the risk...                                                  Assign to Class
Can be intentionally triggered by an adversary                  Threats (1)
Arises from limitations, bugs, drift, or emergent behavior      Failure Modes (2)
Is primarily a failure of controls, accountability, or process  Governance Failures (3)
Describes an impact to people, organizations, or society        Harms (4)

Step 1: Choose Domain

Select the domain where the primary harm mechanism is best controlled — where the most direct technical or organizational control belongs.

Step 2: Choose Pattern

Select the pattern that best describes:

For Class            Pattern describes
Threats              Attack technique or vector
Failure Modes        Failure mechanism
Governance Failures  Broken control or process
Harms                Impact category (who/what is harmed, how)

Handling Complex Risks

When a risk spans multiple classes, split it into linked entries:

Example: "Data poisoning by an insider" involves:

  • A Threat (the poisoning attack technique)
  • A Governance Failure (inadequate access controls)
  • A Harm (data integrity compromise)

Create three linked entries rather than one overloaded entry.

Relationships Between Risks

The registry models causal chains between risks using typed relationships:

Type        Use Case                        Example
causes      Mechanism → Harm                Prompt injection causes information integrity harm
exploits    Threat → Failure Mode           Attack exploits a hallucination vulnerability
enabled_by  Mechanism → Governance Failure  Attack enabled by inadequate access controls
related     General association             Two risks in similar domains

These relationships help trace how threats lead to harms and what governance gaps enable them.
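The typed relationships above can be modeled as (source, type, target) triples. The R-IDs below are hypothetical placeholders chosen for illustration; only the relationship types come from the table.

```python
# Hypothetical example entries linked by the typed relationships above.
RELATIONS = [
    ("R-1110.001", "causes", "R-4110.001"),      # prompt injection causes an integrity harm
    ("R-1110.001", "exploits", "R-2110.001"),    # attack exploits a hallucination failure mode
    ("R-1110.001", "enabled_by", "R-3210.001"),  # attack enabled by a governance gap
]

def related_to(risk_id: str, rel_type: str) -> list:
    """All targets linked from risk_id by the given relationship type."""
    return [t for s, r, t in RELATIONS if s == risk_id and r == rel_type]
```

Querying `related_to("R-1110.001", "causes")` then traces the causal chain from a threat to the harm it produces.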

Stability Guarantees

Identifiers are permanent. Once a risk is assigned an R-ID, that identifier will never be reassigned to a different risk. This ensures that:

  • External references remain valid indefinitely
  • Historical analysis can track risks over time
  • Cross-taxonomy mappings stay accurate

When risks are merged, split, or deprecated:

Scenario          What Happens
Merged risks      The lowest ID is kept; others are marked deprecated with superseded_by pointer
Split risks       The original ID stays with one child; new IDs are assigned to others
Deprecated risks  The ID is preserved with a deprecation notice
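As a sketch of the merge scenario, a deprecated entry can be resolved to its successor by following the superseded_by pointer; the IDs and field names here are hypothetical.

```python
# Hypothetical entries: R-1110.002 was merged into R-1110.001.
RISKS = {
    "R-1110.001": {"status": "active", "superseded_by": None},
    "R-1110.002": {"status": "deprecated", "superseded_by": "R-1110.001"},
}

def resolve(risk_id: str) -> str:
    """Follow superseded_by pointers until an active entry is reached."""
    seen = set()  # guards against accidental pointer cycles
    while RISKS[risk_id]["superseded_by"] and risk_id not in seen:
        seen.add(risk_id)
        risk_id = RISKS[risk_id]["superseded_by"]
    return risk_id
```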

Using the Identifiers

You can use R-IDs to:

  1. Reference risks in documentation — "Our system implements controls for R-1110.007 (Indirect Prompt Injection)"
  2. Map to compliance frameworks — Cross-reference R-IDs with NIST AI RMF, ISO 42001, or other standards
  3. Track risk assessments — Use R-IDs as stable keys in your risk management systems
  4. Communicate with stakeholders — Provide precise risk references in reports and discussions

Cross-References

Each risk in the registry includes mappings to external taxonomies where applicable, including:

  • NIST AI Risk Management Framework
  • OWASP LLM Top 10
  • MITRE ATLAS
  • IBM AI Risk Atlas
  • Cisco AI Taxonomy
  • EU AI Act risk categories
  • MIT AI Risk Repository

These mappings help you understand how risks relate to frameworks you may already be using.

Registry Versioning

The registry uses version numbers in the format YYYYMM.X:

  • YYYY — Year
  • MM — Month
  • X — Release within month (0-9)

Examples:

  • 202601.0 — First release, January 2026
  • 202601.1 — Second release, January 2026
  • 202602.0 — First release, February 2026

Each risk tracks when it was introduced and, if applicable, when it was deprecated, enabling historical analysis.
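A version string in this format can be split into a comparable (year, month, release) tuple, which makes chronological sorting straightforward; the function name is illustrative.

```python
def parse_version(version: str) -> tuple:
    """Parse a YYYYMM.X registry version into (year, month, release)."""
    stamp, release = version.split(".")
    return (int(stamp[:4]), int(stamp[4:6]), int(release))

# Tuples compare element by element, so versions sort chronologically.
ordered = sorted(["202602.0", "202601.1", "202601.0"], key=parse_version)
# ordered is ['202601.0', '202601.1', '202602.0']
```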