May 20, 2026

How Tool Calling Auth Changes When You Move from Single-Tenant to Multi-Tenant

Team Scalekit

TL;DR

In single-tenant agent auth, one user's delegated grant uniquely covers every tool call. "Valid token" and "authorized to act" are the same answer. Moving to multi-tenant pulls them apart permanently, and the gap between them is where silent cross-tenant actions live.
A wrong database query returns bad data. A misconfigured agent acts: it updates the wrong opportunity stage, posts to the wrong Slack channel, creates the wrong Linear ticket, irreversibly, in systems you don't own, without raising an exception. The blast radius is categorically different.
The identity chain in an agent system is multi-hop: the LLM selects a tool, the agent executes it, the grant was issued by a human days or weeks ago, the action lands in a third-party system. Every hop must be traceable to the original authorization event.
Tool selection is itself an auth decision. Whether the LLM should be able to call salesforce_delete_record in this workflow context is not answered by checking whether the token is valid. It is a separate authorization question that precedes execution.
The tenant_id column is where multi-tenant auth surfaces in code. It is nowhere near where the problem ends.

Your agent needs to call Salesforce on behalf of a user. You run the Authorization Code flow (RFC 6749), store the access_token and refresh_token against that user's connected account, and ship. The agent acts on their records. Everything works.

Six months later, the second enterprise customer onboards. Their Salesforce grant goes into the same store. A background job that runs hourly has no customer context in its execution scope. It resolves the first Salesforce token it finds. It updates records in Salesforce.

Not their records. The first customer's.

No exception. No 403. Salesforce accepted the call because the access_token was valid. The delegated grant was legitimate. The agent was authorized — to act for the wrong person, in the wrong org, on data it had no business touching.

The anomaly surfaces six weeks later, in a customer review.

What Single-Tenant Agent Auth Gets Right

One delegated grant per tool means one answer to every auth question.

get_token("salesforce") returns Alice's grant because there is only one. The OAuth callback has one possible owner. The refresh lock has one key space. Revocation of Alice's Salesforce grant correctly terminates all Salesforce access because Alice's grant was all the Salesforce access. Audit attribution is implicit.

These aren't oversights. They're correct simplifications. The physical reality of one user, one grant per tool, does the work that explicit enforcement would otherwise require.

What changes at the second customer isn't the complexity of the code. What changes is that "valid credential" is no longer the same as "authorized to act." Every layer of the auth stack that relied on those being identical now has to enforce the difference explicitly.

That's the shift. Not a schema migration. A different problem class.

Why Agent Auth Was Never Just Credential Management

Before naming what breaks at multi-tenant, it helps to be precise about what was always true about agent auth — and invisible at one tenant.

Five structural properties distinguish agent auth from application auth. At one tenant, none of them create visible problems. At N tenants, every one of them does.

The credential belongs to the user, not to you

An OAuth grant is delegated authority. Alice authorized your agent to act on her behalf in Salesforce. The grant is hers. She can revoke it from her phone at 11pm. Salesforce can invalidate it when she changes her password. Her employer can revoke it as part of offboarding. None of these events notify your agent before the next tool call fails.

In application auth, you issue the session. You control it. In agent auth, you hold a credential whose validity is controlled by three parties you don't own: the user, the OAuth provider, and sometimes the user's employer.

Actions are external and irreversible

Your database has transactions. It has rollbacks. A wrong write can be compensated.

Salesforce doesn't. Slack doesn't. Linear doesn't. When your agent updates a Salesforce opportunity to "Closed Won" under the wrong user's grant, downstream commission calculations trigger. When it posts to the wrong Slack channel, the message is read. When it creates the wrong Linear ticket, it exists.

A misconfigured agent doesn't return bad data. It acts. In live external systems. Irreversibly. Without raising an exception. Because the access_token was valid and the downstream API accepted the call.

The blast radius of an auth failure in agent systems is not a data inconsistency. It is an action taken in an external system on behalf of someone who never authorized it.

The identity chain is multi-hop

In application auth, the identity chain is short: user authenticates, session is created, request is scoped to that session. Two hops, both visible.

In an agent system, the chain is longer:

1. A human grants your agent access to Salesforce on a Monday
2. The LLM, days later, decides to call salesforce_update_record based on context in its prompt
3. The agent execution layer resolves a credential and calls the Salesforce API
4. The action lands in Salesforce's systems

Four hops. The human who authorized the action in step 1 is not present at step 4. The LLM that decided to act in step 2 has no direct relationship to the grant from step 1. For any action to be traceable back to its authorization, every hop must be linked — and that linkage must be maintained explicitly by the auth infrastructure, not inferred from ambient state.

This is what connection_id is for. Not tenant_id. Not user_id. An immutable identifier for the specific OAuth grant event that authorizes the entire chain. Every tool call in the audit log references it. Without it, you can prove a valid token was used. You cannot prove the specific authorization that covered the action hadn't been revoked between step 1 and step 4.

Background agents have no session context

Standard auth infrastructure assumes a session: a user is present, they authenticated, requests are scoped to that session. The session ties identity to action.

Background agents have no session. The user who authorized the agent is not present when the agent acts. The job queue that fires at 3am has no ambient identity context. It has whatever was injected into the job payload at dispatch time — and if tenant_id and user_id weren't injected at dispatch, the job has no legitimate way to resolve them later.

This is the temporal gap. The grant was issued on Monday. The agent acts on Thursday at 3am. The auth infrastructure must explicitly maintain the connection between those two moments across potentially days of elapsed time, across job queue restarts, across process boundaries. An architecture that relies on session context to carry identity will silently fall back to the wrong grant — or the first one it finds — when the session is gone.
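One way to make that connection explicit is to refuse to enqueue any job that lacks its identity context. The sketch below is illustrative only: JobPayload, run_job, and the field layout are hypothetical names, not part of any specific queue library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobPayload:
    """Identity context captured at dispatch time (hypothetical shape)."""
    tenant_id: str
    user_id: str
    connection_id: str
    tool: str

    def __post_init__(self) -> None:
        # Fail at dispatch rather than letting a worker guess identity later.
        for name in ("tenant_id", "user_id", "connection_id"):
            if not getattr(self, name):
                raise ValueError(f"{name} must be set at dispatch time")


def run_job(payload: JobPayload) -> str:
    # The worker reads identity only from the payload, never from
    # environment variables or process-level defaults.
    return (
        f"{payload.tenant_id}/{payload.user_id}/"
        f"{payload.connection_id}:{payload.tool}"
    )
```

Because the payload is frozen and validated on construction, a job that reaches a worker on Thursday at 3am carries the same identity it was dispatched with on Monday.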

Tool selection is an authorization decision

Whether the token is valid answers one question: can this agent authenticate to Salesforce?

It does not answer a different, more important question: should this agent be able to call salesforce_delete_record in this workflow context?

In application auth, access control governs what an authenticated user can read or write. The user interface constrains which operations are surfaced. The authorization model governs access to resources.

In agent systems, the LLM selects tools dynamically. If salesforce_delete_record is in the tool registry and the token has write scope, the LLM can call it. Whether it should — whether that capability is appropriate for this workflow, this tenant, this level of user consent — is not answered by token validity. It is a separate authorization layer that must be enforced before execution: which tools does this agent have the right to use, in this context, on behalf of this user, at this tenant?

This is not a token problem. It is a capability scoping problem. But it is an auth problem — and at single-tenant, it's invisible, because the agent's tool registry is yours and the blast radius of a wrong selection is local. At multi-tenant, a tool that is appropriate for one tenant's workflow may be catastrophically wrong for another's.
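A minimal sketch of capability scoping, assuming a simple in-memory policy table: the tenant names, TENANT_TOOL_POLICY, and both helper functions are hypothetical. The point it illustrates is that the check runs independently of token validity, once when building the tool list the LLM sees and again at execution time.

```python
# Hypothetical per-tenant capability policy; names are illustrative.
TENANT_TOOL_POLICY: dict[str, frozenset[str]] = {
    "org_finance": frozenset({"salesforce_update_record"}),
    "org_startup": frozenset(
        {"salesforce_update_record", "salesforce_delete_record"}
    ),
}

TOOL_REGISTRY = ["salesforce_update_record", "salesforce_delete_record"]


class ToolNotPermitted(Exception):
    """Raised when a tenant's policy does not include the requested tool."""


def tools_visible_to_llm(tenant_id: str) -> list[str]:
    # Filter the registry before the LLM sees it; unknown tenants get nothing.
    allowed = TENANT_TOOL_POLICY.get(tenant_id, frozenset())
    return [tool for tool in TOOL_REGISTRY if tool in allowed]


def authorize_tool_call(tenant_id: str, tool: str) -> None:
    # Second check at execution time, independent of token validity or scope.
    if tool not in TENANT_TOOL_POLICY.get(tenant_id, frozenset()):
        raise ToolNotPermitted(f"{tool} is not enabled for tenant {tenant_id}")
```

The default-deny lookup is a deliberate choice in this sketch: a tenant with no policy entry gets an empty tool list rather than the full registry.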

What Multi-Tenancy Exposes

Every structural property above was true for one tenant. Multi-tenancy makes each one a live, concurrent failure mode.

Revocation mid-workflow across N tenants. Each of your N customers' users can independently revoke their OAuth grants. At one tenant, you notice immediately. At twenty, a revocation event for a user at Org 7 is one of hundreds of credential lifecycle events happening simultaneously. Without webhook-driven revocation propagation (not polling, not discovering it on the next 401), your agent continues operating under authority that no longer exists, until an action fails partway through a multi-step workflow.
Concurrent refresh races load the wrong tenant's token. Two background jobs for Org 1 and Org 2 hit Salesforce token expiry in the same second. The distributed refresh lock is keyed on provider. Both queue. The first job refreshes and writes Org 1's new access_token + refresh_token. The second acquires the lock, re-reads the store, and loads Org 1's freshly issued token while processing Org 2's workflow. The lock must be keyed on (tenant_id, provider, user_id), not on provider. The same compound key applies at every layer: the lock, the lookup, the schema primary key, the execute_tool call signature, and the revocation handler.
The background job has no session to fall back on. A background job dispatched for Alice at Org 1 must carry tenant_id and user_id in its payload at dispatch time. A job that resolves these from environment variables or process-level defaults inherits whatever context the worker process has, which in a multi-tenant job queue is arbitrary. If tenant_id isn't in the payload, the job's resolution of Alice's Salesforce grant is undefined.
The audit chain breaks without connection_id. At one tenant, you can reconstruct authorization retrospectively. At twenty tenants, with tool calls firing across hundreds of users, the only reliable linkage from action to authorization is connection_id in every log entry. When the enterprise security review asks "prove that the agent's Salesforce call at 2:47am on March 15 was covered by a grant that was still in force at that moment," user_id alone cannot answer it. connection_id can, because it links the tool call entry directly to the oauth.authorization_complete event that created the grant, including its scopes, its issuance time, and its revocation event if one exists.
Tool registry scope diverges across tenants. An enterprise customer with a finance team may require that salesforce_delete_record never be accessible to the agent in their environment, regardless of OAuth scope. A startup customer may not care. At one tenant, you set the tool registry once. At N tenants, tool availability is a per-tenant configuration that must be enforced before the LLM ever selects a tool, not after the token is resolved.
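The refresh race above can be sketched with a lock per compound key. This is a single-process stand-in that uses asyncio.Lock where production code would use a distributed lock; TokenStore and fake_provider_refresh are hypothetical names introduced for illustration.

```python
import asyncio
from collections import defaultdict

GrantKey = tuple[str, str, str]  # (tenant_id, provider, user_id)


class TokenStore:
    """In-memory store where both the tokens and the locks use the compound key."""

    def __init__(self) -> None:
        self._tokens: dict[GrantKey, str] = {}
        self._locks: defaultdict[GrantKey, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def refresh(self, key: GrantKey, fetch_new) -> str:
        # Jobs for different tenants take different locks, and the write
        # lands under the same key the lock was taken on.
        async with self._locks[key]:
            new_token = await fetch_new(key)
            self._tokens[key] = new_token
            return new_token

    def get(self, key: GrantKey) -> str:
        return self._tokens[key]


async def fake_provider_refresh(key: GrantKey) -> str:
    # Stand-in for the provider's token endpoint.
    await asyncio.sleep(0)
    tenant_id, _provider, user_id = key
    return f"token-for-{tenant_id}-{user_id}"


async def demo() -> tuple[str, str]:
    store = TokenStore()
    org1 = ("org_1", "salesforce", "alice")
    org2 = ("org_2", "salesforce", "bob")
    await asyncio.gather(
        store.refresh(org1, fake_provider_refresh),
        store.refresh(org2, fake_provider_refresh),
    )
    return store.get(org1), store.get(org2)
```

Running asyncio.run(demo()) refreshes both grants concurrently, and each lookup returns the token issued for its own tenant because no layer is keyed on provider alone.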

Here is where each structural property lands in code:

| Surface | Single-tenant assumption | Multi-tenant requirement |
| --- | --- | --- |
| Credential resolution | get_token("salesforce") | get_token(tenant_id, provider, user_id) |
| Distributed refresh lock | Keyed on provider | Keyed on (tenant_id, provider, user_id) |
| OAuth state parameter (RFC 6749 §4.1.1) | CSRF token only | Signed payload: tenant_id, user_id, HMAC-verified |
| Token schema primary key | (provider) | (tenant_id, provider, COALESCE(user_id, '')) + connection_id |
| execute_tool call signature | execute_tool("salesforce_update", params) | execute_tool("salesforce_update", params, {tenant_id, user_id}) |
| Revocation scope | revoke("salesforce") | revoke(tenant_id, provider, user_id) |
| Tool registry | Shared across all callers | Per-tenant capability configuration |
| Audit log | Action + timestamp | Action + connection_id + tenant_id + user_id + oauth_scope + token_valid_at_execution |

These aren't the hard part. They're where the structural problem lands in code.
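As one concrete instance, the signed state parameter from the table could look roughly like the following. This is a sketch under stated assumptions: the payload layout, the SECRET placeholder, and the function names are hypothetical, and a real callback would also check the nonce against a server-side record to keep the CSRF protection the state parameter is meant to provide.

```python
import base64
import hashlib
import hmac
import json
import secrets

SECRET = b"replace-with-a-real-secret"  # placeholder; load from a secret store


def sign_state(tenant_id: str, user_id: str, secret: bytes = SECRET) -> str:
    # Bind the callback to a tenant and user before redirecting to the provider.
    payload = json.dumps(
        {
            "tenant_id": tenant_id,
            "user_id": user_id,
            "nonce": secrets.token_urlsafe(16),
        },
        sort_keys=True,
    ).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_state(state: str, secret: bytes = SECRET) -> dict:
    # Recompute the HMAC over the received payload and compare in constant time.
    payload_b64, signature_b64 = state.split(".", 1)
    payload = base64.urlsafe_b64decode(payload_b64)
    signature = base64.urlsafe_b64decode(signature_b64)
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("state signature mismatch")
    return json.loads(payload)
```

On the callback, verify_state returns the tenant_id and user_id the flow was started for, so the new grant is stored under the right compound key instead of whichever account the callback handler assumes.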

Authentication and Authorization Are Not the Same Question Anymore

In single-tenant agent auth, two questions share one answer.

"Does this agent have a valid credential for Salesforce?" Yes. "Is this agent authorized to act on Salesforce?" Also yes — because the credential is Alice's, Alice is the only user, and the scope covers the operation.

At multi-tenant with dynamic tool selection, these diverge completely.

Authentication asks: does this agent hold a valid access_token for Salesforce, for Alice at Org 1, with the right scopes?
Authorization asks: is this agent permitted to call salesforce_delete_record, for Alice at Org 1, in this workflow context, given what Alice consented to when she authorized the agent?

A valid credential answers the first question. It says nothing about the second. An agent can be fully authenticated, holding a live, scoped access_token, and still be operating outside the bounds of what was authorized: if the tool it's calling was never part of Alice's consent, if Org 1's configuration restricts that capability, or if the workflow context doesn't justify that level of access.

"What the user can't do, the agent can't do" — but also: what the user didn't explicitly consent to delegate, the agent shouldn't be able to select.

The five identity layers that must be tracked explicitly in multi-tenant agent systems:

| Layer | Single-tenant | Multi-tenant requirement |
| --- | --- | --- |
| Trigger identity: who initiated this workflow? | Implicit; inferred | Explicit; injected into job payload at dispatch |
| Execution identity: which credential is running? | Implicit; one per provider | (tenant_id, provider, user_id) at every layer |
| Delegated authority: whose grant backs this call? | Implicit; one grant | Resolved per call; bound to connection_id |
| Tenant identity: which org's resources are affected? | Implicit; one org | Enforced at schema, lock, revocation, and audit layers |
| Attribution identity: who is the actor in downstream systems? | Implicit | Explicit; traceable from grant issuance to audit query |

Collapsing any two of these produces a silent failure. The most common collapse: execution identity and tenant identity sharing a service account with no per-tenant binding. The agent holds a valid credential. The action goes to the wrong org. Salesforce accepts it.

Recommended Reading: Access Control for Multi-Tenant AI Agents covers the three privilege escalation patterns that emerge when these layers collapse.

The Enterprise Security Review Doesn't Grade on a Curve

Everything above is a production correctness problem. What follows is a commercial one.

The first enterprise customer's security team arrives with a questionnaire. Not a feature request. A forensic audit.

Under whose delegated OAuth grant did this agent call Salesforce at 2:47am on March 15?
Was that grant still in force at execution time, or had the user revoked it?
Was the user who granted it still employed at that point?
Which scopes were active on that credential at the moment of action?
Are tokens encrypted at rest, isolated per tenant? Can one customer's configuration expose another's?

These questions require connection_id in every audit log entry from the first connected account. They require per-tenant credential isolation enforced at the schema level. They require revocation events propagated in real time, not discovered on the next failed tool call. They require token validity captured at execution time, not inferred from the fact that the call succeeded.

None of these can be retrofitted onto log entries written before the columns existed. Every tool call entry written without tenant_id and connection_id is a permanent gap. The audit gap is the period under review.
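A rough sketch of what writing those fields at execution time might look like. The record shape and the audit_entry helper are hypothetical; the field names follow the ones used in this article, and executed_at is an added assumption for the timestamp.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ToolCallAuditEntry:
    """One log line per tool call (hypothetical record shape)."""
    action: str
    connection_id: str
    tenant_id: str
    user_id: str
    oauth_scope: str
    token_valid_at_execution: bool
    executed_at: str


def audit_entry(
    action: str,
    connection_id: str,
    tenant_id: str,
    user_id: str,
    oauth_scope: str,
    token_valid: bool,
) -> str:
    # Capture validity and scope when the call runs, not at query time.
    entry = ToolCallAuditEntry(
        action=action,
        connection_id=connection_id,
        tenant_id=tenant_id,
        user_id=user_id,
        oauth_scope=oauth_scope,
        token_valid_at_execution=token_valid,
        executed_at=datetime.now(timezone.utc).isoformat(),
    )
    return json.dumps(asdict(entry), sort_keys=True)
```

Because every field is a required constructor argument, a call site that lacks connection_id or tenant_id fails when the entry is built instead of silently writing an unattributable log line.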

SOC 2 CC6.1 requires correlated evidence that credentials were valid at the time of each action, tied to specific principals. GDPR Article 6 requires every data access to be tied to a lawful basis traceable per action, per user, per tenant. The lawful basis for an agent action is the user's delegated grant.

The teams that built their own auth infrastructure spend the six weeks before enterprise close retrofitting properties the schema was never designed to carry. The teams that didn't build it spend those six weeks on the product.

What Scalekit Implements So You Don't Have To

The problem in production: you are maintaining live delegated authority relationships across N users' connected accounts, in M external systems with different token lifecycles, each capable of unilateral state changes, across K tenants, with compliance requirements arriving as a step function.

Authentication must be built into the infrastructure. That is what Scalekit is: your agent auth stack.

Connected account is always (user_id, tenant_id, provider)

There is no single-tenant primitive. Every connected account is scoped to a specific user at a specific tenant for a specific provider from the moment of first authorization. get_token("salesforce") is not a valid operation. Resolution is always explicit:

    # Identity is passed explicitly on every call; there is no ambient default.
    result = await client.execute_tool(
        tool="salesforce_update_record",
        params={"record_id": record_id, "fields": update_fields},
        user_id=current_user.id,
        tenant_id=current_org.id,
    )

The compound key is structural. Not a convention. Not a best practice to document and hope engineers follow.

Provider-specific token lifecycle, not a generic refresh loop

Each provider's grant model is handled per its actual behavior:

Slack: Single-use refresh_token rotation; each successful refresh issues a new refresh_token that immediately invalidates the previous one; rotated atomically under a distributed lock keyed on (tenant_id, provider, user_id)
Google Workspace: 1-hour access_token; refresh_token revoked on password change, inactivity, or consent screen modification; proactively refreshed before expiry, not on 401
Salesforce: 2-hour access_token; session invalidation on IP restriction change or admin revocation; surfaced via connected_account.token_invalid before the next tool call fails
API key tools (Datadog, Exa, Brave): No expires_in; validity confirmed on use; admin rotation surfaced as connected_account.reauthorization_required, not a silent 401
Service accounts (BigQuery, Snowflake): Per-tenant, IAM-scoped, org-level authority; no user grant in the loop

When Slack changes their token rotation semantics — as they did in 2024 — the change is scoped to Scalekit's Slack connector. Agent code doesn't change. Silent failures don't cascade across tenants.

Revocation propagated before the next tool call

When Alice disconnects her Salesforce integration, or an IT admin runs an offboarding script, Scalekit receives the provider-side webhook and fires a structured event to your application:

connected_account.disconnected → pause workflow, prompt re-authorization
connected_account.token_invalid → halt background execution, surface to operator
token.refresh_failed → distinguish invalid_grant (re-auth required) from transient error (retry)

The signal arrives when the authority changes. Not when the next 401 does. No polling. No stale state.
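On the application side, handling these events might be sketched as a small dispatcher. The event names come from the list above; the payload shape (a dict with "type" and "error" keys) and the returned action strings are assumptions made for illustration, not a documented webhook schema.

```python
def handle_event(event: dict) -> str:
    """Map a lifecycle event to the action the application should take.

    Assumes a hypothetical payload: {"type": ..., "error": ...}.
    """
    event_type = event.get("type")
    if event_type == "connected_account.disconnected":
        return "pause_workflow_and_prompt_reauthorization"
    if event_type == "connected_account.token_invalid":
        return "halt_background_execution_and_notify_operator"
    if event_type == "token.refresh_failed":
        # invalid_grant means the grant is gone; anything else is retried.
        if event.get("error") == "invalid_grant":
            return "prompt_reauthorization"
        return "retry_with_backoff"
    # Unknown event types are ignored rather than guessed at.
    return "ignore"
```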

The identity chain is traceable end-to-end

Every execute_tool call emits:

| Field | What it enables |
| --- | --- |
| connection_id | Links the action to the specific OAuth grant event; required for SOC 2 CC6.1 chain-of-authority |
| tenant_id | Per-org forensic queries; required for GDPR Article 6 |
| user_id | Principal attribution; whose delegated authority backed this action |
| oauth_scope | Active scopes at execution time; proves least-privilege at the moment of action |
| token_valid_at_execution | Asserted by successful execution; answers "was the grant in force?" |

The SOC 2 auditor's question is answerable. So is the GDPR question. The chain from Alice's Monday authorization to Thursday's 3am tool call is intact in the log.

What stays in your application

Routing logic: which tool to call, when, with what parameters
Business logic: what to do with the result
Consent UX: triggering the Scalekit-provided authorization URL

Token storage, refresh scheduling, concurrent locking, provider-specific lifecycle handling, revocation propagation, and audit emission run outside your application code. Credentials never touch the agent runtime.

The build vs. buy analysis covers what it costs to own what Scalekit owns. The short version: it's not one sprint, and the sprint estimate doesn't include Slack changing their token rotation policy six months post-ship.

FAQs

Why does the auth failure mode in agent systems feel different from a database tenant_id filter bug?

In SaaS auth, a missing tenant_id filter returns the wrong data. Your application can catch it; your customer can report it; you can write a compensation. In agent auth, a missing tenant_id resolution causes your agent to act — under the wrong user's delegated authority, in a live external system, irreversibly. Salesforce accepted the write because the access_token was valid. Slack posted the message because the token was valid. There is no rollback. The authority backing the action belonged to someone who never authorized it, and the downstream system has no way to know the difference.

Our agent only calls three OAuth tools today. At what point does the identity chain complexity actually matter?

From the first background job that runs without a user session present. The moment your agent operates outside a synchronous request context — a scheduled workflow, an event-driven trigger, a queued task — the session that authenticated the user is gone. The only link between "Alice authorized this agent on Monday" and "this job running on Thursday at 3am" is what was explicitly injected into the job payload at dispatch time. If tenant_id, user_id, and a reference to the originating connection_id weren't included at dispatch, the job has no valid path to Alice's grant. Three tools doesn't change this. One background job does.

What is connection_id and why isn't user_id sufficient for the audit trail?

connection_id is an immutable identifier for a specific OAuth grant event — the record created when Alice completed the authorization flow and your system received her access_token + refresh_token. user_id identifies Alice. connection_id identifies the specific act of Alice granting access on a specific date, with specific scopes, from a specific OAuth app version. The compliance question isn't "did Alice's agent call Salesforce?" It's "was the grant covering that call still in force at execution time, and had Alice's consent not been superseded by a revocation?" Only connection_id links the tool call entry to the grant event that can answer that question. user_id alone cannot — Alice may have multiple Salesforce grants across re-authorization cycles.

Tool selection feels like a product decision, not an auth decision. Why is it being treated as auth?

Because the LLM selects tools dynamically based on context, not based on what the user explicitly consented to delegate. When Alice authorized your agent, she consented to a set of operations described in the OAuth consent screen. If the tool registry has salesforce_delete_record and the access_token has write scope, the LLM can call it — whether or not deleting records was part of what Alice understood she was authorizing. Tool availability per workflow context, per tenant, per user consent level is an authorization question: what capabilities should this agent be allowed to exercise here? Answering it with "whatever the token's scope covers" is not an authorization policy. It is an absence of one.

What's the difference between invalid_grant on a token refresh and a network error, and why does the handling have to differ?

invalid_grant means the refresh_token was revoked, expired, or already consumed in single-use rotation. The user's delegated grant is gone. No retry will recover it. The agent must stop, emit a token.refresh_failed event, and surface a re-authorization prompt. A network error means the refresh HTTP request failed transiently; retrying with exponential backoff is correct. Treating invalid_grant as a transient error causes the agent to retry until the backoff budget is exhausted, consuming OAuth quota and leaving the workflow in a partially-executed state. Treating a network error as invalid_grant causes the agent to halt and prompt re-authorization for a grant that is still valid. The distinction requires explicit branching on the OAuth error response, not a generic exception handler.
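That branching could be sketched as below, assuming the token endpoint returns the standard OAuth JSON error body with an "error" field. RefreshOutcome and classify_refresh_failure are hypothetical names, and the ESCALATE bucket for other errors is a design choice of this sketch, not something the answer above prescribes.

```python
import json
from enum import Enum
from typing import Optional


class RefreshOutcome(Enum):
    REAUTHORIZE = "reauthorize"  # grant is gone; prompt the user
    RETRY = "retry"              # transient; retry with backoff
    ESCALATE = "escalate"        # anything else; surface to an operator


def classify_refresh_failure(status_code: Optional[int], body: str) -> RefreshOutcome:
    """Classify a failed refresh. status_code is None when no HTTP response arrived."""
    if status_code is None or status_code >= 500:
        # Network failure or provider-side outage: the grant may still be valid.
        return RefreshOutcome.RETRY
    try:
        error = json.loads(body).get("error")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object.
        error = None
    if error == "invalid_grant":
        return RefreshOutcome.REAUTHORIZE
    return RefreshOutcome.ESCALATE


def backoff_delays(attempts: int, base_seconds: float = 0.5) -> list[float]:
    # Exponential backoff schedule for the RETRY branch.
    return [base_seconds * (2 ** attempt) for attempt in range(attempts)]
```

Keeping the classification separate from the retry loop means a generic exception handler cannot collapse the two cases into one.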

For background agents with no user in the loop — scheduled syncs, pipeline automation — which flow covers the tool calls?

OAuth Client Credentials (RFC 6749) for org-level automation in systems that support it: Salesforce service orgs, HubSpot, some Google Workspace APIs. It produces an org-scoped access_token with a standard expires_in and a defined refresh path, with no user authorization required. Service accounts (GCP IAM, AWS IAM) for Google Cloud APIs and AWS services. The critical constraint either way: the credential must be bound per tenant. A Client Credentials grant or service account shared across tenants means background jobs for Tenant B can execute under Tenant A's org-level authority with no error raised, because the token is valid and the downstream API accepts it.

What does the audit trail need to answer for SOC 2 CC6.1?

The auditor's core question: "For this specific agent action at this specific time, which authorization grant was in force, and was it still valid?" That requires: connection_id linking to the original oauth.authorization_complete event, tenant_id, user_id, active oauth_scope at execution time, and a token_valid_at_execution assertion. The connection_id is what makes the chain traceable. Without it, you can prove a valid access_token was used. You cannot prove the specific grant backing it hadn't been revoked between issuance and use — which is exactly the question SOC 2 CC6.1 asks. Audit entries written without these fields cannot be retroactively attributed. The gap is permanent.
