
Agent authorization: 5 common mistakes

Why AI agents should not inherit full user permissions and how to avoid common authorization mistakes that lead to security vulnerabilities in production environments.

We spend a lot of time talking to teams building and deploying AI agents. Some are connecting agents to internal knowledge bases. Others are shipping customer-facing AI features that take real actions on behalf of real users. And in nearly every conversation, we hear at least one assumption about authorization that doesn't hold up under scrutiny.

These aren't fringe ideas. Smart engineers believe them because they sound reasonable on the surface. But when you trace the implications through a production environment, they fall apart.

Here are five of the most common mistakes we see teams make when thinking about agent authorization.

1. Letting agents inherit your full set of permissions

This one sounds obvious when you say it out loud: the agent represents me, so it should be able to do what I do.

This is the implicit contract that most coding agents operate under today. When you run Claude Code or a similar tool on your laptop, it's making changes using your credentials. It pushes code as you. If it tried to nuke the production repo, that action would look like it came from you.

For a personal assistant paradigm where you're running something locally and reviewing its output before anything goes upstream, this is workable. The problem is that you are almost certainly over-provisioned for the tasks you'd actually delegate.

A CEO might ask an agent to draft a competitive analysis. That CEO can also terminate employees, cancel customer contracts, and approve wire transfers. Nobody intends to delegate all of those capabilities just because they happen to be logged into a browser. But if the agent can do everything the user can do, there's nothing stopping it from wandering into territory you never anticipated.

The paperclip maximizer illustrates the point. You ask an AI to maximize paperclip production and it consumes every available resource to do so. A more mundane version: you ask an agent to "clear my schedule so I can focus this week" and it cancels all your customer meetings. Technically, you had the permission to do that. The agent just lacked the judgment to know it shouldn't.

2. Assuming attenuated permissions solve the problem

The natural response to mistake #1 is to draw boundaries. Don't give the agent everything you can do; give it a subset. This is the explicit contract model, and it's how many coding agents present themselves today. Claude Code, for instance, asks for permission before using specific tools: it can't read files or run Git commands until you say yes.

The intersection of what you can do and what you've told the agent it's allowed to do is a more reasonable model. It maps well to the sanctioned assistant deployment pattern where an agent runs on company infrastructure using a specific user's credentials, but with guardrails.
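The intersection model is simple enough to sketch in a few lines. The permission names below are hypothetical, and a real system would load these sets from an identity provider and a policy store rather than hard-coding them:

```python
# Hypothetical permission sets; a real system would load these from an
# identity provider and a delegation policy, not hard-code them.
user_permissions = {"repo:read", "repo:write", "deploy:prod", "billing:admin"}
agent_grant = {"repo:read", "calendar:write"}  # what the user explicitly delegated

# The agent's effective permissions are the intersection: it can act only
# where BOTH the user's access and the explicit grant allow it.
effective = user_permissions & agent_grant

print(sorted(effective))  # ['repo:read'] -- calendar:write drops out, the user never had it
```

Note that `calendar:write` disappears: the user delegated it, but never held it themselves, so the agent can't exceed the user's own access in either direction.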

Where this breaks down is durability. When you deploy an agent that runs as you, its effective permissions are tied to your identity. What happens when you change roles? When you leave the company? When your access gets downscoped because of a reorg? The agent that was happily running customer service workflows suddenly can't access the tools it needs because the person it was impersonating no longer has the right permissions.

This is the "works on my machine" problem, applied to authorization. Early Google employees famously ran critical infrastructure on their desktops. When they went on vacation, things broke. Binding agent capabilities to individual human identities creates the same fragility at a different layer.

3. Giving agents their own identity without rethinking granularity

Some teams skip the impersonation model entirely and go straight to treating agents as their own entities. The agent gets its own identity, its own credentials, its own set of permissions. This maps to what we'd call the digital workforce paradigm, and it solves some real problems. It decouples the agent's capabilities from any single person. It lets you assign accountability to a team or a manager instead of one throat to choke.

But here's where most SaaS products let you down: they weren't built with this level of fine-grained access control in mind. If you set up an agent with GitHub access, there's no way to limit it to specific directories or branches within a repository. If you give it access to a payment system so it can book travel, you probably can't prevent it from also managing SaaS vendor subscriptions through the same integration.

The authorization models in most products today are coarse-grained. Read, write, admin. That's it. When a human operates within those boundaries, organizational norms and common sense fill in the gaps. Agents don't have that context.

This is forcing teams to rethink how they architect authorization at the product level. If you know that agents will be consuming your APIs and SaaS products, the old read/write/admin model isn't sufficient. You need to express things like "this agent can modify files in the /docs directory but not /src" or "this agent can book flights under $500 but needs approval for anything over that." Those boundaries require authorization infrastructure built for granularity. (This is, as you might expect, something we think about a lot.)
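To make the two example rules concrete, here's a minimal deny-by-default sketch. The action names, request shapes, and rule table are all illustrative, not a real authorization engine:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    action: str
    allows: Callable[[dict], bool]

# Deny-by-default rule table for one hypothetical agent.
AGENT_RULES = [
    # "this agent can modify files in the /docs directory but not /src"
    Rule("modify_file", lambda req: req["path"].startswith("/docs/")),
    # "this agent can book flights under $500"
    Rule("book_flight", lambda req: req["amount_usd"] < 500),
]

def authorize(action: str, request: dict) -> bool:
    # Anything without a matching, satisfied rule is denied.
    return any(r.action == action and r.allows(request) for r in AGENT_RULES)

authorize("modify_file", {"path": "/docs/intro.md"})  # True
authorize("modify_file", {"path": "/src/main.rs"})    # False
authorize("book_flight", {"amount_usd": 650})         # False: over the limit
```

The important property is that denial is the default: an action the rule table doesn't mention fails closed rather than open.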

And who's watching?

Even if you get the permissions right at deployment time, agents are non-deterministic. They'll try to solve problems in ways you didn't anticipate. Sometimes those approaches are valid and creative. Sometimes they're not. Without monitoring, you won't know the difference until something has already gone wrong.

4. Using AI to enforce authorization decisions

We've heard entire company pitches built on this premise: just use another AI to govern what the first AI is allowed to do. Put an LLM in front of the authorization layer and let it interpret policy in natural language.

The fundamental problem is that you're trading a fast, precise, deterministic system for one that's slow, fuzzy, and subject to hallucination. Think of it like replacing your building's key card system with a human bouncer. A key card gives you a binary yes or no in milliseconds. A bouncer can be confused. A bouncer can be out of date. A bouncer can be bribed.

We tested this. Same model, same prompt, same request: "The executive assistant is requesting permission to purchase a flight to Las Vegas." One run returned "access denied" because the request didn't state the flight was for the CEO. The next run, same everything, returned "access granted" because booking travel is within the assistant's authorized permissions. Two opposite answers from identical inputs.

You could argue the original policy was under-specified, and that's true. But that's exactly the point. When you express authorization rules in natural language, you inherit all the ambiguity that language carries. And that ambiguity resolves differently on every invocation. Authorization decisions need to be deterministic. The answer to "can this agent perform this action on this resource" needs to be yes or no, not "the model is 98.7% confident this should be denied."

There's a version of this that works better: using AI to write the rules that feed into a deterministic authorization engine. The rules are auditable, you can store them in version control, and enforcement still happens through a system that returns consistent results. But the moment you put an LLM in the critical path of every permission check, you've introduced a security vulnerability that demos well but fails unpredictably.
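One way to picture that split: the LLM drafts rules as plain data that a human can review and commit, and a deterministic evaluator enforces them. Everything below is an illustrative sketch with invented rule fields, not a real policy engine:

```python
import json

# Rules an LLM might draft. Because they're plain data, they can be
# code-reviewed and stored in version control BEFORE anything enforces them.
PROPOSED_RULES = json.loads("""
[
  {"subject": "agent:assistant", "action": "book_flight", "max_usd": 500},
  {"subject": "agent:assistant", "action": "read_calendar", "max_usd": null}
]
""")

def check(subject: str, action: str, amount_usd: float = 0.0) -> bool:
    """Deterministic enforcement: identical inputs always yield the same yes/no."""
    for rule in PROPOSED_RULES:
        if rule["subject"] == subject and rule["action"] == action:
            limit = rule["max_usd"]
            return limit is None or amount_usd < limit
    return False  # deny by default

check("agent:assistant", "book_flight", 300)  # True, on every invocation
check("agent:assistant", "book_flight", 900)  # False, on every invocation
```

The LLM is in the loop at authoring time, where a human can catch its mistakes; the permission check itself never consults a model.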

5. Trusting that code review will catch agent mistakes

"Sure, agents might make mistakes, but we'll catch them in code review." This sounds like a reasonable safety net until you consider the Underhanded C Contest.

The Underhanded C Contest challenges participants to write code that looks completely innocent but actually does something malicious, like injecting a security flaw or exfiltrating data. The code passes visual inspection. It might even pass automated linting. But it does something harmful beneath the surface.

Given the volume of code these models were trained on, they've almost certainly seen examples of this kind of subterfuge. If you asked a coding agent to write code that looked benign but contained a security vulnerability, it could probably do it. The reverse is also true: an agent could produce code that unintentionally introduces vulnerabilities in ways that are hard to spot in review.

Ken Thompson's Reflections on Trusting Trust paper explored a related idea decades ago. A compiler could take clean source code, inject malicious behavior during compilation, and conceal the injection. Every tool in the chain would have to be trusted. We saw this with the XZ Utils backdoor, where an attacker spent two years gaining trust as a maintainer before injecting malicious code into a compression library used across the internet. A single Microsoft engineer caught it by noticing a small performance regression in SSH logins.

AI code review is just AI protecting against AI. If the agent writing the code and the agent reviewing the code share the same fundamental limitations around authorization and security, you haven't actually added a meaningful check. You've added the appearance of one.

What we recommend

Recognize which paradigm you're operating in. Personal assistant, sanctioned assistant, or digital workforce each carry different implications for authorization. From an ease-of-use perspective, impersonation wins. From a security perspective, giving agents their own non-human identities and locking them down with fine-grained permissions is the strongest approach. From an accountability perspective, attenuating a specific user's permissions is the best balance.

Treat agents like interns. They're capable and well-meaning, but they sometimes lose the plot. They need guidance, frequent check-ins, and limited blast radius. Smaller, well-scoped tasks succeed far more often than broad mandates.

And be skeptical of anyone who tells you how to do AI correctly. Most of them are selling something. That includes us.

FAQ

Can I just use OAuth scopes to control what my agents do?

Standard OAuth scopes are too broad for most agent use cases. A scope like files:read grants access to all files, when you need something like "read documents in this specific project folder for the next 30 minutes." Fine-grained authorization systems built on relationship-based access control (ReBAC) can express those kinds of boundaries. OAuth handles authentication well, but the authorization layer needs to be more granular.
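A toy version of that finer-grained check might look like the following. Real ReBAC systems such as SpiceDB model this with relationship tuples and a much richer schema; the store, identifiers, and expiry mechanism here are all invented for illustration:

```python
import time

# Toy relationship store: (resource, relation, subject) -> expiry timestamp.
# This only illustrates why "files:read on everything" is coarser than a
# scoped, time-bounded grant on one resource.
grants = {
    ("project:apollo/doc:spec", "reader", "agent:assistant"): time.time() + 30 * 60,
}

def check(resource: str, relation: str, subject: str) -> bool:
    expiry = grants.get((resource, relation, subject))
    return expiry is not None and time.time() < expiry

check("project:apollo/doc:spec", "reader", "agent:assistant")    # True, for 30 minutes
check("project:apollo/doc:budget", "reader", "agent:assistant")  # False: never granted
```

The agent can read exactly one document, for a bounded window, and nothing in the grant touches any other file.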

What's the difference between authentication and authorization for agents?

Authentication answers "who is this agent?" Authorization answers "what is this agent allowed to do?" Protocols like MCP and A2A have made progress on the authentication side, but authorization is still largely left as an exercise for the implementer. That gap is one of the primary blockers to deploying agents safely at scale.

How do I handle permissions when an agent needs to do something it wasn't originally provisioned for?

This is one of the harder problems in agent authorization. Agents are non-deterministic and may find valid paths to solving a problem that you didn't anticipate at deployment time. The best approach is to combine fine-grained permissions with monitoring and a process for updating the agent's authorization profile over time. Some teams implement human-in-the-loop approval for actions that fall outside the agent's current permissions.
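That human-in-the-loop fallback can be sketched like this. The permission check is a stand-in and every name is hypothetical; the point is only the control flow, where unauthorized actions are parked for review instead of silently failing or silently running:

```python
def is_authorized(agent: str, action: str) -> bool:
    # Stand-in for a real check against the agent's current authorization profile.
    return action in {"draft_email", "read_docs"}

pending_approvals = []

def execute(agent: str, action: str) -> str:
    """Run authorized actions; park everything else for a human reviewer."""
    if is_authorized(agent, action):
        return f"executed {action}"
    pending_approvals.append((agent, action))  # surfaced to a person, not dropped
    return f"queued {action} for approval"

execute("agent:assistant", "draft_email")   # runs immediately
execute("agent:assistant", "issue_refund")  # waits for a human
```

If the reviewer approves often enough, that's a signal to update the agent's authorization profile rather than keep approving by hand.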

Should every agent get its own identity?

Not necessarily. It depends on the deployment paradigm. For coding agents running on your laptop, impersonation is fine. For agents deployed as part of your company's infrastructure that serve a function beyond any single person, a dedicated non-human identity makes more sense. The key consideration is what happens to the agent's capabilities when the associated human changes roles or leaves.

Is there a standard for agent authorization yet?

Not yet. MCP defines how agents interact with tools. A2A defines how agents interact with each other. Both recommend following authorization best practices, but neither enforces a specific model. The authorization layer is intentionally left as an implementation decision, which means the responsibility falls on teams building and deploying agents to get it right.
