AI Agent Identity Assurance: The NIST IAL/AAL Crosswalk

5/6/2026 · 11 min read · ai · security · identity · agents · government · agentic-auth

TLDR: When an agent acts on your behalf inside a system that requires identity assurance, it doesn’t create a new identity - it borrows yours. NIST’s IAL and AAL framework tells you exactly how much trust the system is expecting from that identity. Here’s how to read that framework for agents - and why, when something goes wrong, the forensic trail will always point to you.

This is Part 1 of a three-part series. Part 2 covers securing the browser session - prompt injection, dry runs, and action-time assurance. Part 3 covers what system owners need to build to be agent-aware.

The Setup

NIST SP 800-63 is the federal standard for digital identity. Two concepts matter most:

IAL (Identity Assurance Level) - how well was the person’s real-world identity verified at enrollment? Three levels. IAL1: self-asserted, no proofing. IAL2: remote proofing with validated evidence (think ID.me, Login.gov). IAL3: in-person, supervised, with biometrics - PIV card issuance, USPS proofing, Trusted Traveler enrollment.

AAL (Authentication Assurance Level) - how strong is the proof that the right person is authenticating right now? Three levels. AAL1: single factor (password). AAL2: two factors (password + OTP, push notification, or similar). AAL3: hardware-backed, phishing-resistant MFA - a YubiKey, a PIV card, a platform authenticator tied to a hardware security enclave.

One thing to be precise about before going further: IAL and AAL requirements are set by the relying party - the system being accessed - based on the sensitivity of what it protects. A benefits portal might require IAL2. Federal systems operating under M-22-09 require phishing-resistant MFA, the property that AAL3 mandates. The credential either meets the bar or it doesn’t. You don’t negotiate it based on what you’re planning to do.
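
A minimal sketch of that pass/fail check, in Python. The `Requirement` and `Credential` types and the `meets_bar` function are hypothetical names for illustration; they are not part of SP 800-63 or any library.

```python
from dataclasses import dataclass
from enum import IntEnum


class IAL(IntEnum):
    IAL1 = 1
    IAL2 = 2
    IAL3 = 3


class AAL(IntEnum):
    AAL1 = 1
    AAL2 = 2
    AAL3 = 3


@dataclass(frozen=True)
class Requirement:
    """What the relying party demands (hypothetical type)."""
    ial: IAL
    aal: AAL


@dataclass(frozen=True)
class Credential:
    """What the presented session was actually established with (hypothetical type)."""
    ial: IAL
    aal: AAL


def meets_bar(cred: Credential, req: Requirement) -> bool:
    # Both dimensions must meet or exceed the requirement;
    # a strong authenticator cannot compensate for weak proofing.
    return cred.ial >= req.ial and cred.aal >= req.aal


benefits_portal = Requirement(IAL.IAL2, AAL.AAL2)
print(meets_bar(Credential(IAL.IAL2, AAL.AAL2), benefits_portal))  # True
print(meets_bar(Credential(IAL.IAL1, AAL.AAL3), benefits_portal))  # False
```

Note that nothing in the check asks what the caller intends to do. The requirement belongs to the system, not to the request.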

This framework was built for humans at a keyboard. Agents are entering that picture fast. And they change almost every assumption the framework makes.

What Agents Do to Identity

When an agent uses computer-use or browser-use to act on your behalf, it isn’t creating a new identity. It’s borrowing yours.

The agent logs in using your session cookie, your stored credential, your OAuth token. The downstream system sees you. It applies whatever IAL and AAL requirements it has - and as far as it can tell, those requirements were met when you authenticated.

That’s the core problem. And here’s the corollary that doesn’t get talked about enough: if something goes wrong - the agent submits the wrong form, takes an action you didn’t intend, gets manipulated into changing your account details - you are the actor of record in every downstream system.

The audit log shows your credentials. The access token carries your identity. The timestamp falls inside your authenticated session. There is no field in a standard server log for “AI agent was involved.” The investigator, the auditor, the fraud examiner - they all see you.

Getting that record corrected, if it can be corrected at all, requires proving a negative: that you didn’t authorize what the system says you did. That’s a hard position to be in. It’s one of the stronger arguments for building agent attribution infrastructure before something goes wrong, not after.
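
What attribution could look like at the log level, as a rough sketch. The `agent_id` and `delegation_id` fields and the `to_log_line` helper are assumptions made up for this example; no standard log format defines them today.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    subject: str                          # the human whose credential was used
    action: str
    timestamp: str
    agent_id: Optional[str] = None        # hypothetical: which agent acted, if any
    delegation_id: Optional[str] = None   # hypothetical: which grant authorized it


def to_log_line(event: AuditEvent) -> str:
    fields = asdict(event)
    # Derive an explicit actor type so investigators do not have to infer it.
    fields["actor_type"] = "agent" if event.agent_id else "human"
    return json.dumps(fields, sort_keys=True)


event = AuditEvent(
    subject="alice@example.gov",
    action="update_address",
    timestamp=datetime.now(timezone.utc).isoformat(),
    agent_id="browser-agent-7",
    delegation_id="grant-123",
)
print(to_log_line(event))
```

The human is still the subject of record. The difference is that the record now says an agent was in the loop and which grant it was acting under.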

The Crosswalk

Not official NIST guidance. A practical mental model for teams building now.

Systems Requiring IAL1 (No Proofing Required)

The system doesn’t require real-world identity verification. Public portals, open data APIs, unauthenticated services. The system decided the stakes are low enough that it doesn’t need to know who you are.

For agents: full autonomy is reasonable. The agent is acting as an unverified principal on a system that wasn’t expecting verification anyway. Nothing is being amplified that the system wasn’t already comfortable with.

Risk: Low. The agent isn’t acting as anyone specific.

Systems Requiring IAL2 (Remote Identity Proofing)

The system required a real human to prove their identity remotely before getting access - a document submission, a video proofing flow, a knowledge-based verification. The system made a judgment: we need to know this is a real person who is who they say they are.

For agents: the human met IAL2 at enrollment. The agent did not - and cannot. The agent is now operating on a system that granted access based on a trust decision about a specific human. None of that trust extends to the agent’s judgment, scope, or behavior. The system has no idea the agent exists.

This is where most deployments are flying blind. The human verifies once. The session token goes to the agent. The agent can do everything that session allows. That’s too much.

What to do at IAL2 systems:

Scope the agent’s permissions explicitly at token issuance - read vs. write, which endpoints, which resources
Log every agent action with attribution back to the authorizing human
Time-limit tokens handed to agents
Require human confirmation before write, submit, or delete actions
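
The four controls above can be sketched together in a few lines of Python. Everything here is illustrative: `AgentGrant`, the `verb:resource` scope strings, and the `confirm` callback are assumed names, and a real deployment would enforce this at the token issuer and resource server rather than in agent-side code.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Optional

WRITE_VERBS = frozenset({"write", "submit", "delete"})


@dataclass
class AgentGrant:
    human: str
    agent: str
    scopes: FrozenSet[str]      # e.g. {"read:claims", "submit:claims"}
    expires_at: float           # Unix seconds
    log: List[str] = field(default_factory=list)

    def authorize(
        self,
        scope: str,
        confirm: Callable[[str], bool],
        now: Optional[float] = None,
    ) -> bool:
        now = time.time() if now is None else now
        verb = scope.split(":", 1)[0]
        if now >= self.expires_at:
            decision = "denied:expired"
        elif scope not in self.scopes:
            decision = "denied:out-of-scope"
        elif verb in WRITE_VERBS and not confirm(scope):
            # Write, submit, and delete need a human "yes" every time.
            decision = "denied:not-confirmed"
        else:
            decision = "allowed"
        # Every attempt is logged with attribution, including denials.
        self.log.append(f"{self.agent} on behalf of {self.human}: {scope} -> {decision}")
        return decision == "allowed"


grant = AgentGrant(
    human="alice",
    agent="browser-agent-7",
    scopes=frozenset({"read:claims", "submit:claims"}),
    expires_at=time.time() + 900,  # 15-minute grant
)
grant.authorize("read:claims", confirm=lambda scope: False)    # allowed, no prompt needed
grant.authorize("submit:claims", confirm=lambda scope: False)  # denied, human said no
```

The `confirm` callback stands in for whatever out-of-band prompt reaches the human. It only means something if the agent cannot answer it itself.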

Systems Requiring IAL3 (In-Person Proofing)

The system required physical presence, biometric collection, and supervised document verification. The system decided the stakes are high enough that remote proofing isn’t sufficient.

For agents: they shouldn’t be executing autonomously here. And there’s nothing stopping teams from deploying them anyway. That’s the part nobody says out loud.

IAL3 systems require what the spec doesn’t define but practitioners understand: intent assurance. Not just “we verified who this person is” but “we verified this person intended this specific action at this specific moment.” That’s a human-present problem. Agents can draft, prepare, and organize. They shouldn’t execute.

AAL - How the Authentication Side Breaks Down

IAL is about enrollment. AAL is about the moment of access. This is where agents run into a hard wall.

AAL1 (password): An agent can store and replay a password. Trivially. This is why password-only auth has been inadequate for a decade - and why, to the system, an agent replaying a stored password looks no different from anyone else replaying a stolen one.

AAL2 (multi-factor): Here’s where precision matters. AAL2 requires two distinct authentication factors - typically something you know plus something you have. An agent that has access to both (a stored password and a phone that receives SMS OTPs or push notifications) can mechanically satisfy the two-factor requirement. But it defeats the purpose.

AAL2’s intent is to prove a human is in control. An agent passing through a code proves only that someone has access to those credentials - not that a human is present and consenting. It’s the letter of the spec with none of the spirit. For SMS OTP specifically, NIST 800-63B marks it RESTRICTED due to SIM-swapping risks. An agent reading SMS codes is a second-order version of that same attack surface.

The blunter version: an agent intercepting MFA isn’t satisfying AAL2. It’s doing what a phishing kit does. The mechanism is the same. The authorization isn’t.

AAL3 (hardware-backed, phishing-resistant): An agent cannot touch a YubiKey. It cannot provide a fingerprint via a hardware enclave. It cannot satisfy verifier impersonation resistance. Agents are structurally incapable of meeting AAL3. Full stop.

For federal systems requiring AAL3, a human must authenticate first, then delegate a carefully scoped credential to the agent. That delegation needs to be auditable, time-limited, and scoped to the minimum necessary. Part 3 of this series covers how to build systems that can actually receive and enforce that delegation.

Dual Control Is Not a New Idea

Before we had agents, we had high-stakes decisions that no single person should be able to make alone. The pattern is everywhere:

Banking (maker-checker): The person who initiates a wire cannot also approve it. Two authenticated users, two separate credentials. SWIFT mandates this for international transfers.
Nuclear (Two-Person Integrity): Physically impossible to arm a weapon alone. Two keys, two bodies, two simultaneous authentication events. Institutionalized paranoia as policy.
Aviation (cross-check callouts): Every critical action requires one crew member to call it and another to confirm before execution. Neither can skip the other. Non-negotiable by regulation.
Pharmaceuticals: Controlled substance dispensing requires two pharmacists to independently verify. One signature isn’t enough.
Accounting (SOX / segregation of duties): The person who initiates a transaction cannot also approve it. Codified after Enron. The financial world’s maker-checker.
Legal (notarization): Signing a legal document requires a third party to witness identity at the moment of signing - not just that you exist, but that you were present and consenting right now.

The thread: the higher the stakes, the more the system is designed so no single point of trust can be compromised. Agents collapse that by default. One compromised session token and the agent can do everything that session allows.

The question isn’t whether dual control applies to agents. It’s how to implement it for a principal that doesn’t have a body. Part 2 covers the architectural patterns for doing that.
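
Stripped to its core, maker-checker is a small invariant: whoever proposes an action cannot be the one who approves it. A minimal Python sketch, with `PendingAction` and the `agent:` / `human:` principal prefixes as assumed names for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingAction:
    description: str
    maker: str                      # principal that proposed it (the agent)
    checker: Optional[str] = None   # principal that approved it (a human)

    def approve(self, checker: str) -> None:
        # Core maker-checker rule: the proposer can never approve its own action.
        if checker == self.maker:
            raise PermissionError("maker cannot approve its own action")
        self.checker = checker

    def execute(self) -> str:
        if self.checker is None:
            raise PermissionError("action has not been approved")
        return f"executed: {self.description} (maker={self.maker}, checker={self.checker})"


action = PendingAction("submit benefits form", maker="agent:form-filler")
action.approve("human:alice")
print(action.execute())
# executed: submit benefits form (maker=agent:form-filler, checker=human:alice)
```

The sketch only works if the agent and the human are distinguishable principals. When the agent is riding the human's session, they are not, which is exactly the gap this post describes.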

Where the Standards Work Stands

This space is active and unsettled. Three efforts worth watching:

WIMSE (Workload Identity in Multi-System Environments) - An IETF working group (draft-ietf-wimse-arch) focused on workload-to-workload identity across distributed systems, with AI agents explicitly in scope. Combines SPIFFE workload credentials with OAuth. The most serious standards effort happening right now on this problem, though it was built for microservices and is being stretched to cover agents rather than designed for them from scratch.

Transaction Tokens for Agents - An IETF draft (draft-araut-oauth-transaction-tokens-for-agents) that extends OAuth transaction tokens to carry agent context through a call chain. The key constraint: scope narrows at each hop. A sub-agent can only do less than the agent that delegated to it, never more. This is the enforcement mechanism most current deployments lack entirely.
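
The narrowing rule is easy to show in isolation. This sketch is not the draft's token format or processing rules; `delegate` and the scope strings are hypothetical, and the only point is that set intersection makes escalation impossible by construction.

```python
from typing import FrozenSet, Iterable


def delegate(parent_scopes: FrozenSet[str], requested: Iterable[str]) -> FrozenSet[str]:
    # Intersection guarantees the child can never hold a scope the parent lacks.
    return parent_scopes & frozenset(requested)


human = frozenset({"read:records", "write:records", "submit:forms"})
agent = delegate(human, {"read:records", "submit:forms"})
# The sub-agent asks for write access its parent never had; the request is dropped.
sub_agent = delegate(agent, {"read:records", "write:records"})

print(sorted(agent))      # ['read:records', 'submit:forms']
print(sorted(sub_agent))  # ['read:records']
```

A stricter variant would reject the over-broad request outright instead of silently trimming it, which makes attempted escalation visible in the logs.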

HDP (Human Delegation Provenance) - A lightweight cryptographic protocol (arXiv 2604.04522) proposing a tamper-evident chain of custody from the authorizing human to every downstream agent action. Verifiable offline, no central registry required. Reference TypeScript SDK available on GitHub. Early work - but the design space is open enough that teams building on it now will have real influence over where it goes.
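
To make "tamper-evident chain of custody" concrete, here is a generic hash chain in Python. This is not the HDP protocol or its SDK; the entry layout, `append`, and `verify` are assumptions for illustration, and a real design would also sign entries so the chain head itself cannot be swapped out.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64


def _digest(prev_hash: str, payload: Dict[str, str]) -> str:
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: List[dict], payload: Dict[str, str]) -> None:
    # Each entry commits to the hash of the entry before it.
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)})


def verify(chain: List[dict]) -> bool:
    # Recompute every link; editing any earlier entry breaks all later hashes.
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


chain: List[dict] = []
append(chain, {"actor": "human:alice", "event": "authorized agent for read:claims"})
append(chain, {"actor": "agent:browser-7", "event": "read claim 1042"})
print(verify(chain))  # True

chain[0]["payload"]["event"] = "authorized agent for write:claims"
print(verify(chain))  # False
```

Verification needs only the chain itself, which is the "offline, no central registry" property in miniature.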

None of these are published RFCs. No official standard exists yet. That’s both the risk and the opportunity.

Quick Reference

System IAL Requirement	System AAL Requirement	What the Agent Can Do	Human Role
IAL1 (none)	None	Full autonomy	Optional oversight
IAL2	AAL1	Read with logging	Define scope, review logs
IAL2	AAL2	Read with logging	Confirm writes; never let agent pass MFA
IAL3	AAL2	Prepare and surface only	Execute all writes independently
IAL3	AAL3	Prepare and surface only	Authenticate and execute; agent is a drafting tool

Two things this table won’t tell you: what scope the agent should have within a given IAL/AAL tier (that’s your job to define), and how to build the human step-up patterns for each row (that’s Part 2).
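
For teams that want the table as a policy lookup, a small Python sketch follows. The `CROSSWALK` mapping mirrors the rows above; the `Posture` type, the use of `0` for "no AAL requirement", and the fall-back to the most restrictive row for unlisted combinations are assumptions of this sketch, not part of the table.

```python
from typing import Dict, NamedTuple, Tuple


class Posture(NamedTuple):
    agent_may: str
    human_role: str


# Keys are (IAL, AAL); AAL 0 stands for "None" in the first row.
CROSSWALK: Dict[Tuple[int, int], Posture] = {
    (1, 0): Posture("Full autonomy", "Optional oversight"),
    (2, 1): Posture("Read with logging", "Define scope, review logs"),
    (2, 2): Posture("Read with logging", "Confirm writes; never let agent pass MFA"),
    (3, 2): Posture("Prepare and surface only", "Execute all writes independently"),
    (3, 3): Posture("Prepare and surface only", "Authenticate and execute; agent is a drafting tool"),
}

MOST_RESTRICTIVE = CROSSWALK[(3, 3)]


def posture_for(ial: int, aal: int) -> Posture:
    # Unlisted combinations fail closed to the most restrictive row
    # (an assumption of this sketch).
    return CROSSWALK.get((ial, aal), MOST_RESTRICTIVE)


print(posture_for(2, 2).agent_may)   # Read with logging
print(posture_for(1, 3).agent_may)   # Prepare and surface only
```

Failing closed is the design choice worth keeping even if the rest changes: a combination nobody thought about should not default to autonomy.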

What This Means If You’re in Government

Federal teams deploying agentic AI against systems with IAL/AAL requirements need to answer four questions before going live:

What does the target system require? IAL and AAL are the system’s requirements - not yours to negotiate. If you don’t know, find out before deployment.
How are agent actions scoped and logged? If something goes wrong, can you reconstruct exactly what the agent did, when, and under whose authorization?
What’s your step-up trigger? Which actions require human confirmation? Define this before deployment, not after an incident.
Does the system have any policy for agents? The answer is almost always no. Document it in your deployment notes, escalate it to your security officer before go-live, and establish human step-up for every write action as your default posture until a policy exists. The absence of a prohibition is not authorization - and if something goes wrong before a policy exists, that distinction won’t protect you.

Part 3 covers what system owners need to build to actually have a policy surface for agents.

The Bigger Picture

The identity frameworks we have were built when “users” meant humans. That assumption is breaking down fast. Agents are already authenticating, acting, and submitting inside systems designed for people - and those systems have no way to know.

IAL and AAL are still the right foundation. The assurance levels map clearly to the risk profile of different systems. The missing layer is scope, delegation provenance, and action-time human presence - which is what the next two posts in this series are about.

Building agents against government or regulated systems? I’d genuinely love to compare notes. Reach out.
