
Build production-ready agent workflows with remote MCP servers

Hrishikesh Premkumar
Founding Architect

TL;DR

  • Remote MCP servers are becoming the standard way teams expose internal capabilities used across infra, support automation, DevOps, CRM, and onboarding because they replace scattered scripts and API keys with a secure, typed, centrally authenticated interface.
  • FastMCP acts as the resource server, exposing clean tool definitions, while Scalekit issues short-lived OAuth 2.1 tokens with scopes and automatically rotates them.
  • Local MCP is great for quick experiments, but Remote MCP unlocks multi-agent, multi-team, production-grade automation with auditability and zero secret sharing.
  • Browse the Todo example to learn the full pattern: scoped tools, JWKS-validated tokens, in-memory CRUD, and end-to-end authentication using Scalekit.
  • Once authentication is solved, teams can scale internal capabilities, GitOps triggers, CRM updates, and onboarding flows without reinventing security.

Why teams are moving to remote MCP servers

Most engineering teams don’t set out to create chaos; it accumulates over time. A small utility script written for one teammate gets copied into another repo, then wrapped in a quick internal endpoint, and eventually wired into an agent workflow without anyone fully documenting how it works. Before long, LLM agents start hitting these internal tools in ways they weren’t designed for. Secrets end up duplicated across repos, API keys get shared casually in chats just to unblock someone, and onboarding a new engineer means walking them through a scattered collection of scripts and endpoints that nobody fully owns anymore. Each tool behaves a little differently, each uses its own conventions for authentication, and over time, the system becomes brittle in ways that only surface when something breaks.

The shift begins when developers realise the real friction isn’t the tools themselves but the lack of a standard interface and a secure trust model around them. Agents need typed, consistent contracts to call tools reliably. Security teams need scopes, rotation, and auditability. Developers need to stop hand-rolling auth logic for every script. Remote MCP servers solve all of this by flipping the model: instead of exposing scattered endpoints and long-lived secrets, teams expose internal capabilities through FastMCP, secure them with OAuth 2.1 via Scalekit, and give agents a clean, universal interface backed by short-lived, scoped tokens. The result is no secrets in repos, no custom middleware, and no brittle integrations: just a proper boundary between tools and the agents that call them.

This guide walks through that shift step by step, moving from local, ad-hoc tools to a secure Remote MCP server backed by OAuth 2.1 scopes and FastMCP’s typed capabilities. You’ll learn why Remote MCP is a better mental model than local scripts or ad-hoc APIs, how OAuth 2.1 replaces API keys cleanly, and how Scalekit integrates with FastMCP to create a secure, production-grade Remote MCP Server. 

Along the way, we’ll build a complete Todo server secured with scoped OAuth tokens, validate them using Remote OAuth, and show how this pattern scales into a proper internal automation platform. By the end, you’ll understand not just how to implement Remote MCP but why teams are standardising their entire internal tool layer around it.

Local vs remote MCP: A developer’s mental model

Most developers first encounter MCP through local tooling: experiments, plugins, small scripts, or quick prototypes where the agent runs on their machine and interacts with local tools. This setup feels effortless: no servers, no tokens, no infra. But the convenience fades quickly once the team tries to share tools, operate automations in production, or plug agents into internal systems. Local MCP wasn’t designed for multi-user, multi-agent, or multi-team environments, and the limitations show up immediately: only the person running the agent on their laptop can use the tools, nothing is shareable, and every agent session requires direct access to the developer’s local machine.

Remote MCP changes how internal tools are built and shared. Instead of every developer running tools locally on their laptop, those tools become network-accessible services with stable URLs, real authentication, audit logs, and scoped permissions. The team no longer passes around API keys or custom scripts; Scalekit issues signed tokens, and FastMCP enforces them, giving you an enterprise-grade trust boundary without building your own security infrastructure.

For developers familiar with REST, Remote MCP will feel both familiar and upgraded. You get typed interfaces, predictable request/response formats, and tool definitions designed specifically for LLM agents, not human clients. Unlike REST, Remote MCP is purpose-built for ephemeral task execution, fine-grained scopes, and real agent workflows. It becomes an internal “API layer for agents,” replacing scattered scripts, custom endpoints, and brittle secrets with a unified, secure, and standardised protocol.

Once this distinction clicks, developers quickly understand why teams are standardising on Remote MCP. What starts as a local experiment evolves into a secure, shared automation platform that scales across tools, workflows, and entire engineering orgs.

| Feature | Local MCP | Remote MCP | Internal REST APIs |
|---|---|---|---|
| Security model | None | OAuth 2.1 + scoped tokens | API keys or custom logic |
| Token rotation | Not needed | Automatic via Scalekit | Manual, error-prone |
| Standardization | Minimal | Strong, typed MCP tools | Varies by service |
| Multi-agent use | Hard | Native | Custom integration |
| Developer experience | Great for prototypes | Great for shared infra | Mixed, depends on API |
| Scaling to teams | Breaks quickly | Designed to scale | Requires maintenance |
| Auditability | None | Centralised and scoped | Depends on implementation |

Why remote OAuth beats API keys and local secrets

Developers often start with API keys because they’re simple: drop a key into a .env file and everything works locally. But the moment multiple agents or services need access, the model falls apart. One engineer copies the key into a script, another adds it to a CI variable, someone else pastes it into Slack to unblock a teammate, and now the same secret exists in half a dozen places. Because no one knows all the places it lives, rotation becomes risky or gets delayed, and the key quietly gains far more reach than intended. Without scopes or expiry, any leak exposes everything behind it, making API keys fragile and hard to govern as systems grow.

Remote OAuth reverses this dynamic. Instead of long-lived secrets, clients receive short-lived, signed tokens from an authorization server. Scalekit issues OAuth 2.1 tokens with embedded scopes that define exactly what the caller can do, and automatically rotates keys. FastMCP acts as the protected resource server, validating each token via Scalekit’s JWKS endpoint before any tool executes. No secrets flow through the system; agents never store credentials; and permissions are explicit rather than implicit.
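To make the JWKS step less abstract, here is a minimal, standard-library sketch of one part of it: reading the `kid` from a token's header and picking the matching public key out of a JWKS document. The helper names and the demo token and keys are made up for illustration. The sketch stops before the cryptographic signature check, which FastMCP performs for you.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def select_signing_key(token: str, jwks: dict) -> dict:
    """Return the JWKS entry whose `kid` matches the token header."""
    header = json.loads(b64url_decode(token.split(".")[0]))
    for key in jwks.get("keys", []):
        if key.get("kid") == header.get("kid"):
            return key
    raise LookupError(f"no JWKS key with kid {header.get('kid')!r}")


# Made-up token and key set, for illustration only.
demo_token = ".".join([
    b64url_encode({"alg": "RS256", "kid": "key-2"}),
    b64url_encode({"scope": "todo:read"}),
    "signature-placeholder",
])
demo_jwks = {"keys": [{"kid": "key-1", "kty": "RSA"}, {"kid": "key-2", "kty": "RSA"}]}

print(select_signing_key(demo_token, demo_jwks))  # {'kid': 'key-2', 'kty': 'RSA'}
```

Because keys are looked up by `kid`, the authorization server can publish a new key alongside the old one and rotate without the resource server changing any configuration.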

This is especially important for MCP, where agents are inherently dynamic. An LLM shouldn’t hold a permanent API key, and a workflow that only needs read access shouldn’t be granted write access. Scoped tokens fix that. FastMCP exposes token scopes directly to your tools, making it straightforward to enforce boundaries without custom auth logic.

The end result is a system that feels natural for developers and safe for organisations: secrets disappear from repos, tokens expire automatically, access is auditable, and authentication becomes a reliable, centralised layer instead of a brittle one-off implementation in every script.

Architecture overview: How remote tool calls flow

Once a developer understands why Remote OAuth replaces brittle API keys, the next question becomes: how does a real Remote MCP request actually move through the system? The flow is intentionally simple, but it introduces a security boundary that developers never get with local MCP or ad-hoc scripts. Instead of clients calling your FastMCP server directly with long-lived secrets, every request is validated against Scalekit’s authorisation layer before a tool ever runs. This turns your MCP server from a “trusted client” model into a proper “trusted token” model, exactly how modern APIs and cloud services operate.

A typical agent interaction begins when an MCP client (such as an automated workflow, IDE plugin, or internal bot) requests a scoped OAuth 2.1 access token from Scalekit. The token encodes who is calling, what they are allowed to do, and which resource server it applies to. With the token in hand, the client sends a standard MCP tool call to your FastMCP server using a Bearer header. FastMCP verifies the token signature using Scalekit’s JWKS endpoint, checks its expiry and scopes, and only then executes the tool. This keeps the tool layer clean, predictable, and fully decoupled from authentication logic.
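As a rough illustration of the client side of that flow, the sketch below builds (but does not send) a JSON-RPC `tools/call` request with a Bearer header. The function name and the token placeholder are hypothetical. A real client would normally use an MCP client library, which also handles the initial handshake and the OAuth flow.

```python
import json
import urllib.request


def build_tool_call(server_url: str, access_token: str, tool: str,
                    arguments: dict, request_id: int = 1) -> urllib.request.Request:
    """Build an MCP tools/call request carrying the OAuth access token."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        server_url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # The token travels in the header, never in the tool arguments.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
            # The HTTP transport may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
    )


# "<access-token>" stands in for a token obtained from the authorization server.
request = build_tool_call(
    "http://localhost:3002/mcp", "<access-token>", "list_todos", {"completed": False}
)
print(request.get_method(), request.full_url)
```

The tool name and arguments sit in the request body while the credential sits in the header, which is what lets the server reject a call before any tool code sees it.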

Because all validation happens before code execution, multiple agents and even multiple teams can safely call the same MCP server without stepping on each other. One agent may only read tasks; another may create and update them; an internal automation might run full workflows. The server remains the same. Scalekit issues different scoped tokens depending on who is calling. This separation of concerns is why Remote MCP servers scale far beyond local MCP workflows.
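The order of checks described above can be sketched in plain Python. This is an illustration under stated assumptions, not FastMCP's actual implementation: it assumes the signature has already been verified, that scopes arrive in a space-delimited `scope` claim (or a list), and that the target resource appears in `aud`. The exact claim layout of Scalekit tokens may differ.

```python
import time
from typing import Optional


def check_claims(claims: dict, expected_audience: str, required_scope: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return a rejection reason, or None when the claims pass every check."""
    now = time.time() if now is None else now

    # 1. Expiry: short-lived tokens stop working on their own.
    if claims.get("exp", 0) <= now:
        return "token expired"

    # 2. Audience: the token must have been issued for this resource server.
    audience = claims.get("aud", [])
    audiences = [audience] if isinstance(audience, str) else list(audience)
    if expected_audience not in audiences:
        return "token was issued for a different resource"

    # 3. Scope: the caller needs the permission this tool requires.
    raw_scope = claims.get("scope", "")
    granted = set(raw_scope.split()) if isinstance(raw_scope, str) else set(raw_scope)
    if required_scope not in granted:
        return f"missing scope {required_scope}"
    return None


# Hypothetical claims for a read-only agent.
reader = {"exp": 2_000, "aud": "res_67890", "scope": "todo:read"}
print(check_claims(reader, "res_67890", "todo:read", now=1_000))   # None
print(check_claims(reader, "res_67890", "todo:write", now=1_000))  # missing scope todo:write
print(check_claims(reader, "res_67890", "todo:read", now=3_000))   # token expired
```

The same server code serves both callers. Only the claims differ, which is why one deployment can safely back agents with very different permissions.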

You can also explore the complete working example in the GitHub repository, which includes the full FastMCP server, Scalekit configuration, and all CRUD tools end-to-end.

Preparing the local environment

Before building any tools, you want a local setup that behaves the same for everyone on the team. A simple FastMCP workspace with a virtual environment, a .env file, and a starter server.py provides a predictable foundation for testing authentication, iterating safely, and scaling the project without configuration drift.

1. Create a clean project structure

  • Make a small Python workspace
  • Add a server.py and a tools/ folder
  • Create a virtual environment (python3 -m venv venv)

2. Add environment variables

Use a .env/.env.example file to keep secrets out of code:

    PORT=3002
    SCALEKIT_ENVIRONMENT_URL=https://your-env.scalekit.com
    SCALEKIT_CLIENT_ID=your_client_id
    SCALEKIT_RESOURCE_ID=mcp_server_id
    MCP_URL=http://localhost:3002/

This mirrors how real teams share safe templates without exposing secrets.
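One way to keep that template honest is to fail fast when a setting is missing. The sketch below is a hypothetical startup check, not part of FastMCP or Scalekit. It uses only the standard library and the variable names from the template above.

```python
import os
from typing import Mapping

# Settings the server cannot start without (PORT has a default, so it is optional).
REQUIRED_SETTINGS = (
    "SCALEKIT_ENVIRONMENT_URL",
    "SCALEKIT_CLIENT_ID",
    "SCALEKIT_RESOURCE_ID",
    "MCP_URL",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> list:
    """List required settings that are unset or empty."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


def require_settings(env: Mapping[str, str] = os.environ) -> None:
    """Raise at startup with a readable message instead of failing mid-request."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))


# Example with a partially filled environment.
print(missing_settings({"SCALEKIT_CLIENT_ID": "your_client_id", "MCP_URL": ""}))
# ['SCALEKIT_ENVIRONMENT_URL', 'SCALEKIT_RESOURCE_ID', 'MCP_URL']
```

Calling `require_settings()` right after loading the .env file surfaces a typo in a variable name immediately, rather than as a confusing authentication error later.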

3. Install dependencies

  • Install FastMCP
  • Install the Scalekit provider
  • Load env variables with dotenv

4. Add a minimal FastMCP server

A basic server.py with the Scalekit provider is enough to test remote OAuth end-to-end:

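    import os

    from fastmcp import FastMCP
    from fastmcp.server.auth.providers.scalekit import ScalekitProvider

    # Delegate authentication to Scalekit; values come from the .env file.
    mcp = FastMCP(
        "Todo Server",
        stateless_http=True,
        auth=ScalekitProvider(
            environment_url=os.getenv("SCALEKIT_ENVIRONMENT_URL"),
            client_id=os.getenv("SCALEKIT_CLIENT_ID"),
            resource_id=os.getenv("SCALEKIT_RESOURCE_ID"),
            mcp_url=os.getenv("MCP_URL"),
        ),
    )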

5. Validate your setup

Once everything is wired up, run the server to make sure the core authentication flow works end-to-end. FastMCP should load Scalekit’s JWKS keys, validate incoming tokens, and return a simple response from the placeholder tool. If those pieces work, you’ve confirmed the skeleton is functioning correctly, giving you a reliable base to start adding real tools, scopes, and modules as your internal automation layer expands.

Registering the remote MCP server in Scalekit

With your local environment prepared, the next step is to tell Scalekit what this MCP server is and how it should be trusted. FastMCP doesn’t mint or manage tokens; it relies on an external OAuth 2.1 authority to define who can call the server and what they’re allowed to do. That’s why the real registration happens in Scalekit. It becomes the single source of truth for scopes, environments, and access boundaries.

1. Create a new MCP Server resource in Scalekit

Open the Scalekit dashboard → MCP Servers → Add Server.

Enter the public or local URL of your server (e.g., http://localhost:3002/). This tells Scalekit where tokens issued for this resource will be used.

2. Define the scopes your tools require

For the Todo server, you’ll add:

  • todo:read → list + fetch todos
  • todo:write → create + update + delete

 These scopes become part of the contract between every agent and your server.

3. Copy the generated identifiers

Scalekit creates two IDs you’ll need to configure your server:

  • RESOURCE_ID → uniquely identifies your MCP server
  • CLIENT_ID → identifies which client can request tokens for it

Together with your environment URL, these become the authentication config for FastMCP.

4. Add the values to your environment variables

These go into your .env:

    PORT=3002
    SCALEKIT_ENVIRONMENT_URL=https://your-env.scalekit.com
    SCALEKIT_CLIENT_ID=skc_12345
    SCALEKIT_RESOURCE_ID=res_67890
    MCP_URL=http://localhost:3002/

Once these are set, FastMCP automatically handles token validation, discovery of JWKS keys, signature verification, scope verification, and permission enforcement without any custom auth code on your side.

Designing a remote MCP server: Boundaries, tool modules & scopes

Once your server is registered in Scalekit, the real design work begins. A Remote MCP server isn’t just a place to expose functions; it becomes a capability surface that multiple teams and agents rely on. This means the structure of your tools, the boundaries between them, and the scopes that protect them matter just as much as the Python code beneath.

A good starting point is deciding how to break tools into modules. Smaller, focused tools are easier for agents to reason about and far safer to secure. A Todo service naturally splits into read and write actions, mapping cleanly to todo:read and todo:write. Larger systems follow the same pattern, with tasks, users, billing, and deployments each forming their own module with narrowly defined scopes. This keeps the server aligned with a least-privilege model from day one.

Scopes eventually become the vocabulary of your internal platform. A clear scope, like deploy.trigger, signals intent immediately, whereas broad labels like admin or full_access blur boundaries and create long-term risk. Remote MCP treats scopes as first-class citizens: FastMCP exposes them inside every tool call, and Scalekit embeds them directly into each token. Thoughtful scope design early on pays off significantly as more teams and agents begin consuming the same server.
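A small sketch can make the tool-to-scope mapping concrete. The registry and helper below are hypothetical and not a FastMCP feature. They use the tool names and scopes from the Todo example to show which tools a given set of granted scopes unlocks.

```python
# Hypothetical registry: each tool declares the single scope it requires.
TOOL_SCOPES = {
    "list_todos": "todo:read",
    "get_todo": "todo:read",
    "create_todo": "todo:write",
    "update_todo": "todo:write",
    "delete_todo": "todo:write",
}


def callable_tools(granted_scopes) -> list:
    """Return the tools a caller may use, given the scopes in its token."""
    granted = set(granted_scopes)
    return sorted(tool for tool, scope in TOOL_SCOPES.items() if scope in granted)


print(callable_tools(["todo:read"]))
# ['get_todo', 'list_todos']
print(callable_tools(["todo:read", "todo:write"]))
# ['create_todo', 'delete_todo', 'get_todo', 'list_todos', 'update_todo']
```

Keeping the mapping in one place also makes it easy to review: a tool whose required scope looks broader than its behaviour stands out immediately.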

Once your tools are modular and scopes well-defined, you have the foundation for a secure, scalable internal automation layer. With those boundaries established, you’re ready to implement the full FastMCP Todo server, complete with Remote OAuth and scope-based access control.

Implementing the FastMCP Todo server

Once your scopes and boundaries are defined in Scalekit, building the Todo server becomes straightforward. FastMCP handles the protocol, and Scalekit handles authentication, so your only job is to express capabilities as typed tools. Here’s the full implementation broken into clear, practical steps:

1. Load environment variables and wire up the Scalekit OAuth provider

FastMCP doesn’t manage authentication internally; it delegates to Scalekit. Add the required environment variables and initialize the ScalekitProvider:

    import os
    import uuid
    from typing import Optional

    from dotenv import load_dotenv
    from fastmcp import FastMCP
    from fastmcp.server.auth.providers.scalekit import ScalekitProvider
    from fastmcp.server.dependencies import AccessToken, get_access_token

    # Read the SCALEKIT_* values and MCP_URL from the local .env file.
    load_dotenv()

    mcp = FastMCP(
        "Todo Server",
        stateless_http=True,
        auth=ScalekitProvider(
            environment_url=os.getenv("SCALEKIT_ENVIRONMENT_URL"),
            client_id=os.getenv("SCALEKIT_CLIENT_ID"),
            resource_id=os.getenv("SCALEKIT_RESOURCE_ID"),
            mcp_url=os.getenv("MCP_URL"),
        ),
    )

This wiring enables your server to validate OAuth 2.1 bearer tokens using Scalekit’s JWKS keys.

2. Create a lightweight in-memory store

We rely on a lightweight in-memory store for todos, which keeps the example simple while still showcasing the full Remote OAuth flow from token validation to scoped tool execution.

_TODO_STORE: dict[str, TodoItem] = {}
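The article does not show the TodoItem definition, so the following is only a plausible sketch inferred from how the tools use it (an `id`, a `title`, an optional `description`, a `completed` flag, and a `to_dict()` method). The version in the example repository may differ.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TodoItem:
    """Assumed shape of a todo, inferred from the tool code in this guide."""
    id: str
    title: str
    description: Optional[str] = None
    completed: bool = False

    def to_dict(self) -> dict:
        """Return a JSON-serialisable dict for tool responses."""
        return asdict(self)


item = TodoItem(id="demo-1", title="Write docs")
print(item.to_dict())
# {'id': 'demo-1', 'title': 'Write docs', 'description': None, 'completed': False}
```

A dataclass keeps the fields typed and mutable, which matches the way update_todo assigns to `title`, `description`, and `completed` in place.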

3. Add a tiny helper for scope enforcement

All permission checks flow through this one function:

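    def _require_scope(scope: str) -> Optional[str]:
        """Return an error message if the caller's token lacks `scope`, else None."""
        token: Optional[AccessToken] = get_access_token()
        # Treat a missing token the same as a missing scope.
        if token is None or scope not in token.scopes:
            return f"Insufficient permissions: `{scope}` scope required."
        return None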

4. Implement each Todo tool with typed inputs + scope checks

FastMCP tools need only focus on the logic; the token is already validated.

Create Todo

    @mcp.tool
    def create_todo(title: str, description: Optional[str] = None) -> dict:
        error = _require_scope("todo:write")
        if error:
            return {"error": error}
        todo = TodoItem(id=str(uuid.uuid4()), title=title, description=description)
        _TODO_STORE[todo.id] = todo
        return {"todo": todo.to_dict()}

List Todos

    @mcp.tool
    def list_todos(completed: Optional[bool] = None) -> dict:
        error = _require_scope("todo:read")
        if error:
            return {"error": error}
        # With no filter, return everything; otherwise match on completion state.
        todos = [
            todo.to_dict()
            for todo in _TODO_STORE.values()
            if completed is None or todo.completed == completed
        ]
        return {"todos": todos}

Get Todo

    @mcp.tool
    def get_todo(todo_id: str) -> dict:
        error = _require_scope("todo:read")
        if error:
            return {"error": error}
        todo = _TODO_STORE.get(todo_id)
        if todo is None:
            return {"error": f"Todo `{todo_id}` not found."}
        return {"todo": todo.to_dict()}

Update Todo

    @mcp.tool
    def update_todo(
        todo_id: str,
        title: Optional[str] = None,
        description: Optional[str] = None,
        completed: Optional[bool] = None,
    ) -> dict:
        error = _require_scope("todo:write")
        if error:
            return {"error": error}
        todo = _TODO_STORE.get(todo_id)
        if todo is None:
            return {"error": f"Todo `{todo_id}` not found."}
        # Only overwrite the fields the caller actually supplied.
        if title is not None:
            todo.title = title
        if description is not None:
            todo.description = description
        if completed is not None:
            todo.completed = completed
        return {"todo": todo.to_dict()}

Delete Todo

    @mcp.tool
    def delete_todo(todo_id: str) -> dict:
        error = _require_scope("todo:write")
        if error:
            return {"error": error}
        todo = _TODO_STORE.pop(todo_id, None)
        if todo is None:
            return {"error": f"Todo `{todo_id}` not found."}
        return {"deleted": todo_id}

5. Start the MCP server

    if __name__ == "__main__":
        mcp.run(transport="http", port=int(os.getenv("PORT", "3002")))

This launches a fully authenticated MCP server that validates tokens, checks scopes, and exposes typed CRUD tools.

6. Understand the bigger picture

All of this code works because:

  • Scalekit issues short-lived OAuth tokens
  • FastMCP validates them before running any tool
  • Tools enforce narrow scopes with a tiny helper function

Nothing here is Todo-specific: swap in GitOps triggers, CRM updates, deployment actions, onboarding workflows, or internal utilities, and the pattern stays identical.

Running the FastMCP server locally

Once your server code and environment variables are in place, the first real milestone is starting the FastMCP server and watching it accept authenticated requests for the very first time. This part feels simple on the surface, just a Python process listening on a port, but behind the scenes, FastMCP is already preparing to validate real OAuth 2.1 tokens from Scalekit, discover JWKS keys, and enforce scopes for every incoming call.

    source venv/bin/activate
    python server.py

When the server boots, FastMCP announces the HTTP transport and begins listening on:

http://localhost:3002/

Everything behind /mcp is now protected. The moment a request comes in, FastMCP fetches your Scalekit JWKS keys, validates signatures and scopes, and executes the tool. If something is wrong, like an expired token or a missing scope, the request never reaches your business logic.

Token enforcement

Every tool you defined calls _require_scope, ensuring immediate, predictable scope enforcement. If you see a response like:

{"error": "Insufficient permissions: `todo:write` scope required."}

it means the caller’s token is valid but lacks the required permission, which is exactly how a secure resource server should behave.

Connecting with an MCP client 

Once your server is running, the next step is to interact with it using a real MCP client. The simplest option during development is the MCP Inspector, which lets you explore tool schemas, run calls, and verify authentication end-to-end. Here’s the full flow broken into steps:

1. Start the MCP Inspector

Run the Inspector locally:

npx @modelcontextprotocol/inspector@latest

2. Enter your server’s connection URL

In the Inspector UI, set:

http://localhost:3002/mcp

3. Click Connect and complete the OAuth approval

Before any tools load, Scalekit will show an OAuth consent screen. The Inspector needs authorization to connect to your MCP server, so you’ll be asked to approve the connection. Once you confirm, Scalekit issues a short-lived, signed OAuth token and returns it to the Inspector, allowing it to automatically authenticate tool calls.

4. Explore and test your tools

After you approve the OAuth screen, the Inspector automatically discovers all tools, their input schemas, output types, and metadata. You can then invoke any of them: create_todo, list_todos, get_todo, update_todo, or delete_todo, and watch scope enforcement in action. Calls requiring todo:read will succeed with a read token, while write operations will fail cleanly unless the token includes todo:write. All validation happens through Scalekit’s JWKS keys before your tool logic executes, confirming that your Remote OAuth setup is functioning end-to-end.

How this pattern scales across teams and projects

Before looking at how this pattern scales, it helps to close the loop. Once you’ve run the MCP server locally and tested it through the Inspector, you have your first fully functional Remote MCP setup. The pieces work end-to-end: the server exposes typed tools, the Inspector connects through OAuth, Scalekit issues scoped tokens, and FastMCP validates them. With that working baseline in place, it becomes clear why the same architecture scales naturally across teams.

Once your first Remote MCP server is running, typed tools, scoped access, and centralised token issuance start to show their real value. Billing, GitOps, onboarding, and internal developer utilities can all publish their own MCP modules without having to reinvent authentication or share secrets. Each team inherits the same trust boundary from Scalekit: no custom auth middleware, no secret sprawl, and no duplicated logic.

Because scopes map directly to capabilities, every agent receives only the access it needs. Monitoring workflows get read-only scopes; automation pipelines receive targeted write permissions. FastMCP enforces these boundaries uniformly, so dozens of agents and teams can interact with internal systems safely without interfering with one another.

As more teams adopt the pattern, its benefits compound: new tools drop in cleanly, new agents onboard without code changes, environments stay isolated, and every action remains auditable and revocable. The Todo server is a small example, but its architecture provides a scalable template for exposing internal capabilities with predictable behaviour and a shared trust model across the entire organisation.

Best practices for production MCP servers

Running a Remote MCP server in production is about creating a dependable surface for multiple teams and agents, and the practices behind the small Todo example scale directly into larger internal systems.

1. Design meaningful, intent-based scopes

Avoid broad permissions and define clear read/write boundaries so each tool declares exactly what it needs. Scalekit-issued tokens keep access minimal, predictable, and automatically enforced by FastMCP. In larger orgs, this prevents issues like a reporting bot accidentally receiving deployment rights simply because an “admin” scope was too loosely defined.
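A lightweight way to enforce this is a check that runs in code review or CI. The sketch below is hypothetical: the list of broad names and the `resource:action` / `resource.action` pattern are assumptions drawn from the scope examples in this guide, not a Scalekit or FastMCP rule.

```python
import re

# Scope names this guide warns against.
BROAD_SCOPES = {"admin", "full_access"}

# Accept the two styles used in this guide: todo:read and deploy.trigger.
SCOPE_PATTERN = re.compile(r"^[a-z][a-z0-9_]*[:.][a-z][a-z0-9_]*$")


def scope_problems(scopes) -> dict:
    """Map each questionable scope name to the reason it was flagged."""
    problems = {}
    for scope in scopes:
        if scope in BROAD_SCOPES:
            problems[scope] = "too broad: does not name a specific action"
        elif not SCOPE_PATTERN.match(scope):
            problems[scope] = "expected resource:action or resource.action"
    return problems


print(scope_problems(["todo:read", "deploy.trigger", "admin", "Billing"]))
# {'admin': 'too broad: does not name a specific action',
#  'Billing': 'expected resource:action or resource.action'}
```

The pattern is deliberately strict. Loosen it if your organisation already uses a different naming convention, but keep the principle that every scope names one resource and one action.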

2. Separate environments cleanly

Production, staging, and development should never share the same MCP resource or client configuration. Using separate URLs and IDs ensures staging tokens can’t ever hit production systems. Enterprises often require this by policy, e.g., a QA agent should never be able to modify live customer data, even by mistake.
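As a sketch of how this separation might be checked automatically, the snippet below keeps one settings entry per environment and reports any URL or resource ID that two environments share. Every name, URL, and ID here is a placeholder invented for illustration.

```python
# Hypothetical per-environment settings; all values are placeholders.
ENVIRONMENTS = {
    "dev": {"mcp_url": "http://localhost:3002/", "resource_id": "res_dev_example"},
    "staging": {"mcp_url": "https://mcp.staging.example.com/", "resource_id": "res_staging_example"},
    "prod": {"mcp_url": "https://mcp.example.com/", "resource_id": "res_prod_example"},
}


def shared_values(environments: dict, field: str) -> dict:
    """Map each value of `field` used by more than one environment to those environments."""
    users: dict = {}
    for name, settings in environments.items():
        users.setdefault(settings[field], []).append(name)
    return {value: names for value, names in users.items() if len(names) > 1}


def assert_isolated(environments: dict) -> None:
    """Raise if any environment shares a server URL or resource ID with another."""
    for field in ("mcp_url", "resource_id"):
        clashes = shared_values(environments, field)
        if clashes:
            raise ValueError(f"{field} shared across environments: {clashes}")


assert_isolated(ENVIRONMENTS)  # passes silently: nothing is shared
```

Running a check like this in CI catches the common mistake of copying a staging configuration into production and forgetting to change the resource ID.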

3. Add observability without logging tokens

Log tool calls, scope failures, and unexpected behaviour, but never the tokens themselves. This gives teams visibility into how agents interact with internal systems, making debugging much easier. In practice, when a workflow pipeline starts failing in production, these logs reveal whether the issue is due to expired tokens, missing scopes, or an actual tool failure.
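One possible safeguard is a logging filter that redacts anything resembling a Bearer credential before a record is written. This is a minimal standard-library sketch. The logger name is hypothetical, and the pattern only catches tokens that appear after the word "Bearer", so it complements careful logging rather than replacing it.

```python
import logging
import re

# Matches "Bearer <token>" and keeps only the "Bearer " prefix.
BEARER_PATTERN = re.compile(r"(Bearer\s+)[A-Za-z0-9\-._~+/]+=*", re.IGNORECASE)


def redact(text: str) -> str:
    """Replace any Bearer credential in `text` with a fixed marker."""
    return BEARER_PATTERN.sub(r"\1[REDACTED]", text)


class RedactTokensFilter(logging.Filter):
    """Rewrite each log record so Bearer credentials never reach a handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first, so tokens passed as %-style arguments are caught too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True


logger = logging.getLogger("mcp.audit")  # hypothetical logger name
logger.addFilter(RedactTokensFilter())

print(redact("Authorization: Bearer abc.def.ghi rejected"))
# Authorization: Bearer [REDACTED] rejected
```

Attaching the filter to the logger rather than to individual call sites means new tools get the protection without any extra code.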

4. Deploy like a modern microservice

Containerise the MCP server, run it behind HTTPS, autoscale when needed, and rely on Scalekit for JWKS rotation and stable token validation. Enterprises benefit from this because it slots naturally into existing Kubernetes, service mesh, or API gateway setups, meaning MCP tools can operate with the same reliability guarantees as any internal service.

5. Keep tools small and boundaries clear

Each tool should stay focused and lightweight, clearly stating the exact scope it requires. Authentication stays outside application logic, keeping your code clean and predictable. This mirrors how enterprise teams structure microservices as small, well-defined capabilities that are easy to test, audit, and reason about across large organisations.

Conclusion: Building a Reliable Foundation for Agent-Ready Internal Tools

Internal tooling doesn’t fail because it’s complex; it fails when it grows without structure. Remote MCP servers address this by providing teams with a single, secure way to expose internal capabilities through typed tools, short-lived OAuth tokens, and clear scope boundaries. No more long-lived secrets or ad-hoc endpoints, just predictable, discoverable interfaces agents can trust.

The Todo server here is small, but the pattern scales naturally: add new modules without rewriting auth, onboard teams without increasing risk, and let agents call multiple internal systems without ever holding sensitive credentials. As AI-driven automation becomes standard, this model becomes the safest and most maintainable way to build internal tooling.

If you want to build further, start by adding a real database, splitting the server into domain modules, designing production-grade scopes, and creating separate dev/stage/prod MCP resources in Scalekit. For deeper guidance, explore Securing FastMCP with Scalekit: Remote OAuth Done Right and the official FastMCP Quickstart. Both walk through the same architecture you used here and show how to grow it into a full internal platform your entire engineering org can rely on.

FAQ

How is a remote MCP server different from a traditional REST API?

A Remote MCP server doesn’t expose endpoints; it exposes capabilities. Instead of routing, JSON handling, and custom middleware, you define typed tools with clear schemas. FastMCP handles discovery automatically, and authentication is handled via OAuth 2.1 rather than scattered API keys. REST is built for humans calling endpoints; MCP is built for agents executing tasks with predictable, structured, scope-aware interfaces.

Why does remote MCP need OAuth instead of API keys?

API keys are simple but impossible to manage safely at scale. They can’t express scopes, rarely get rotated, and a single leak compromises everything. OAuth 2.1 solves this with short-lived, signed tokens that carry explicit permissions. Scalekit issues the tokens, FastMCP validates them, and every call enforces least-privilege access, making the system far safer, more auditable, and better suited for agent workflows.

How does Scalekit validate access to my MCP tools?

Scalekit never validates requests directly. Instead, it issues JWT access tokens and exposes JWKS keys that FastMCP uses to verify each incoming call. FastMCP checks the token’s signature, expiration, scopes, and intended resource before executing a tool. If anything is wrong, such as a missing scope, an expired token, or a resource mismatch, the server rejects the request cleanly. All trust flows from Scalekit’s signed tokens, not from secrets embedded in your code.

Can multiple agents or internal teams safely share the same MCP server?

Yes, that’s one of the main reasons Remote MCP exists. Each agent receives a token with only the scopes it needs, and FastMCP enforces those scopes per tool call. A read-only monitoring agent might get todo:read, while an internal workflow might receive todo:write. The server doesn’t change; the permissions do. This allows many agents and teams to rely on a single server without risking cross-access or privilege leaks.

How should scopes be organised for larger teams?

Scopes work best when they clearly describe intent and stay specific. A billing tool might expose billing.read and billing.charge, while a GitOps tool might define deploy.trigger and deploy.rollback. Broad scopes like admin or full_access almost always create problems later because they don’t map to real actions or enforce least privilege. When scopes reflect actual capabilities, teams can reason about access cleanly and scale safely as new tools and workflows are added.
