Six months ago, most of us hadn’t even heard of MCP.
There was no standard playbook, no “right” way to expose tools to LLMs. Just a few GitHub repos, some early experiments, and a lot of shared frustration in Discord threads. But something about it clicked — the idea that AI agents could go beyond chat and actually do things, securely and reliably.
As an auth platform, we knew we had to take MCP seriously as infrastructure. Because when agents start taking actions, everything comes back to trust, auth, and scopes.
So we rolled up our sleeves and brought up an internal MCP server for Scalekit. Along the way, we leaned on every working implementation out there, learned from the community’s growing pains, contributed back fixes, and most importantly, shaped the infrastructure we’d want other teams (and their agents) to rely on.
Before we even touched scopes or auth, we had one big question to answer: “How should responses be streamed back to the agent?”
This wasn’t just about UX or output polish. It impacted everything, from how clients parsed responses, to how rich the tool outputs could be, to how easily agents could consume what we were sending.
It sounds like a small implementation detail, but it shaped how we designed the entire thing. So let me put it the way we thought about it…
Picture this: you're at a restaurant. You tell the waiter, “Keep bringing me dishes as the kitchen prepares them.”
That’s what Server-Sent Events (SSE) does. It delivers each item — one at a time, in a predictable format, without you having to ask again. If the connection drops, it picks up where it left off. Reliable, minimal, and great when you just need simple updates.
But that’s all you get: simple, one-way, text-based updates.
Now imagine instead of a waiter, you had a delivery van from the kitchen.
It can carry full meals, sides, drinks, instructions, and even partial updates, all streamed to you however you want to unpack them. That’s Streamable HTTP.
It’s a bit more work to manage, since you have to handle reconnects and unpacking, but the flexibility is worth it when you’re working with LLMs that need structured JSON, multi-step outputs, or large payloads.
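In practice, the choice shows up as a few lines of transport setup. Here’s a rough sketch of the Streamable HTTP side using the official TypeScript SDK, in the stateless per-request style its docs describe (buildServer is a placeholder for however you construct your server; this is a sketch, not our exact code):

```ts
import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

// Placeholder: construct an McpServer with your tools registered.
function buildServer(): McpServer {
  return new McpServer({ name: "scalekit-mcp", version: "0.1.0" });
}

const app = express();
app.use(express.json());

// Each POST carries one or more JSON-RPC messages; the transport decides
// whether to answer with plain JSON or a stream of chunks.
app.post("/mcp", async (req, res) => {
  const server = buildServer();
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined, // stateless mode
  });
  res.on("close", () => {
    transport.close();
    server.close();
  });
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});

app.listen(3000);
```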
We started small — just a couple of internal tools wired up to the MCP server. These weren’t production-critical yet. The goal was to validate:
Can agents call our tools, and can we control what happens when they do?
Each tool had a clear structure: a name, a human-readable description, an input schema, and a handler that did the actual work.
Here’s what one of the earliest tools looked like — list_environments:
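Stripped to its essentials, it was roughly this shape (listEnvironments stands in for our internal API client):

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
// Placeholder import; the real call goes through our internal Scalekit client.
import { listEnvironments } from "./scalekit-client.js";

const server = new McpServer({ name: "scalekit-mcp", version: "0.1.0" });

server.tool(
  "list_environments",
  "List the environments in the current Scalekit workspace",
  async () => {
    const environments = await listEnvironments();
    // Tools return a content array; plain text is the lowest common
    // denominator every client can render.
    return {
      content: [
        { type: "text", text: JSON.stringify(environments, null, 2) },
      ],
    };
  }
);
```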
We used Zod schemas to validate inputs for tools that needed them:
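For example, a tool that fetches a single environment might declare its inputs like this (tool and field names here are illustrative):

```ts
import { z } from "zod";

// The Zod shape doubles as runtime validation and as the input schema
// the client sees when it lists tools.
const getEnvironmentInput = {
  environmentId: z.string().describe("ID of the environment to fetch"),
  includeDomains: z
    .boolean()
    .optional()
    .describe("Whether to include custom domains in the response"),
};
```

Passing a shape like this into the tool registration gets you parsed, typed arguments in the handler and a published schema for the client, with no extra glue.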
And we kept registration clean and centralized:
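Concretely, that meant a single entry point that wires every tool onto the server (the per-domain module names here are hypothetical):

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
// Hypothetical per-domain modules, each exporting a register function.
import { registerEnvironmentTools } from "./tools/environments.js";
import { registerOrganizationTools } from "./tools/organizations.js";

export function registerAllTools(server: McpServer) {
  registerEnvironmentTools(server);
  registerOrganizationTools(server);
  // Every new tool gets added here, so the exposed surface lives in one file.
}
```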
Having all tools registered declaratively in one place made it easy to reason about what we were exposing and how.
Even at this early stage, we knew that eventually every tool would need clear access boundaries. But first, we just wanted the wiring to feel right, both for us as developers and for the agents calling them.
Here's a full demo of how Scalekit's MCP servers work:
With a few tools wired up, it was time to protect them.
Since the MCP server was going to be a public-facing surface, callable by real agents over the internet, we had no business leaving it open. This wasn’t just for demo use. It was a real integration point, and we treated it like one from the start.
From the very beginning, we approached the MCP server with a platform mindset: auth required by default, every tool explicitly scoped, and nothing exposed that we wouldn’t treat as a real integration surface.
This wasn’t about locking things down for the sake of it. It was about clarity — for us, and for anyone calling the tools.
We also knew agents needed to discover and understand what access they had, so we aligned on OAuth 2.1 and the protected resource metadata spec: any agent can programmatically work out which scopes are required, how to get tokens, and what the rules of the game are.
This decision addressed concerns about accidental overreach.
In the next section, we’ll break down exactly how we implemented this — the four pieces that made our auth layer work.
We didn’t have to look far for auth — we used Scalekit’s own OAuth2-based MCP auth. It gave us everything we needed to lock down access while keeping agent interactions smooth.
Every agent or system that talks to the MCP server needs a token. We used the OAuth 2.1 client credentials grant, which is ideal for machine-to-machine use cases — like AI agents acting on behalf of a workspace or org.
These tokens are scoped, short-lived, and issued from the same authorization server we already run for all our Scalekit integrations.
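The request itself is plain OAuth: exchange a client ID and secret for a scoped access token. A sketch, with a placeholder token endpoint and scope names:

```ts
// Placeholder endpoint and scopes; the flow is the standard OAuth 2.1
// client credentials grant.
const response = await fetch("https://auth.scalekit.example/oauth/token", {
  method: "POST",
  headers: { "Content-Type": "application/x-www-form-urlencoded" },
  body: new URLSearchParams({
    grant_type: "client_credentials",
    client_id: process.env.SCALEKIT_CLIENT_ID!,
    client_secret: process.env.SCALEKIT_CLIENT_SECRET!,
    scope: "environments:read organizations:read",
  }),
});

const { access_token, expires_in } = await response.json();
// access_token goes into the Authorization header of every MCP request;
// expires_in tells the caller when to fetch a fresh one.
```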
Every incoming request carries a Bearer token in the header:
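```
Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IjEifQ...
```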
The token is a JWT — signed, structured, and verifiable. If the signature or claims are invalid, the request is rejected right away. No tool logic runs until auth is confirmed.
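Verification itself is boilerplate JWT handling. Sketched here with the jose library and placeholder issuer and audience values:

```ts
import { createRemoteJWKSet, jwtVerify, type JWTPayload } from "jose";

// Placeholder URLs; the real values come from our authorization server config.
const ISSUER = "https://auth.scalekit.example";
const AUDIENCE = "https://mcp.scalekit.example";
const JWKS = createRemoteJWKSet(new URL(`${ISSUER}/.well-known/jwks.json`));

export async function verifyRequest(authorization?: string): Promise<JWTPayload> {
  const token = authorization?.startsWith("Bearer ")
    ? authorization.slice("Bearer ".length)
    : undefined;
  if (!token) throw new Error("Missing bearer token");

  // Throws if the signature, issuer, audience, or expiry don't check out.
  const { payload } = await jwtVerify(token, JWKS, {
    issuer: ISSUER,
    audience: AUDIENCE,
  });
  return payload; // claims (including scopes) flow into the scope checks below
}
```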
To support dynamic discovery, especially for standards-aware agents, we exposed a protected resource metadata endpoint at the standard well-known path.
It returns a JSON document that tells clients how this resource is protected and what scopes it supports:
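Sketched as an Express route, with placeholder values for the resource, authorization server, and scopes, the shape follows the protected resource metadata spec (RFC 9728):

```ts
import express from "express";

const app = express();

// Well-known path defined by the protected resource metadata spec;
// the values below are placeholders.
app.get("/.well-known/oauth-protected-resource", (_req, res) => {
  res.json({
    resource: "https://mcp.scalekit.example",
    authorization_servers: ["https://auth.scalekit.example"],
    bearer_methods_supported: ["header"],
    scopes_supported: [
      "environments:read",
      "organizations:read",
      "organizations:write",
    ],
  });
});
```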
This makes the server self-describing, an important feature when agents are dynamically choosing tools and preparing their auth flows.
Once the token is verified, we inspect the scopes in the claims:
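In code, that’s a small lookup from tool name to required scopes, compared against the token’s scope claim (scope names here are illustrative):

```ts
import type { JWTPayload } from "jose";

// Illustrative scope names; every tool declares exactly what it needs.
const REQUIRED_SCOPES: Record<string, string[]> = {
  list_environments: ["environments:read"],
  get_environment: ["environments:read"],
  create_organization: ["organizations:write"],
};

export class InsufficientScopeError extends Error {
  constructor(public readonly missing: string[]) {
    super(`Missing required scopes: ${missing.join(", ")}`);
  }
}

export function assertScopes(toolName: string, claims: JWTPayload): void {
  // OAuth access tokens carry scopes as a space-delimited string claim.
  const granted = String(claims.scope ?? "").split(" ").filter(Boolean);
  const required = REQUIRED_SCOPES[toolName] ?? [];
  const missing = required.filter((scope) => !granted.includes(scope));
  if (missing.length > 0) {
    throw new InsufficientScopeError(missing);
  }
}
```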
Every tool defines what scopes it needs. If the token doesn’t carry those scopes, the request is rejected — clearly, with a structured error message.
This keeps access granular and explicit — no tool can be called without the exact permission attached.
That’s it. Auth-first, scoped by default, and standards-aligned.
This setup gave us confidence to expose tools to external agents, knowing they’d be properly isolated, traceable, and revocable.
Once auth was wired in and a few tools were scoped, we needed to answer the real question: “Can agents actually use this?”
It’s one thing to build a working endpoint. It’s another to see how an actual agent interacts with your tool — what it discovers, what it fails on, and how it reacts to unclear definitions or missing scopes.
So we tested with real clients like Claude and ChatGPT.
To connect them, we used mcp-remote — which turned out to be critical.
This little proxy handled a bunch of headaches for us: the OAuth handshake, token handling, and bridging our remote HTTP endpoint to the local stdio transport most desktop clients still expect.
It became our primary bridge between dev instances and agent clients — and made local debugging feel almost production-grade.
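Wiring it into a desktop client like Claude is a single config entry that shells out to mcp-remote with your server’s URL (shown here with a placeholder endpoint):

```json
{
  "mcpServers": {
    "scalekit": {
      "command": "npx",
      "args": ["mcp-remote", "https://mcp.scalekit.example/mcp"]
    }
  }
}
```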
Real client testing surfaced a ton of useful issues early: vague tool descriptions, missing scopes, and response shapes agents struggled to parse.
We’d tweak a tool, rerun mcp-remote, and immediately see how Claude or ChatGPT reacted. That build, proxy, test feedback loop helped us debug fast, ship safely, and shape better defaults.
Testing with live agents gave us real-world validation, but when you're in the middle of development, opening up Claude or ChatGPT just to check if a tool is wired correctly isn’t exactly efficient.
That’s why MCP Inspector quickly became our go-to. We kept it open constantly, like a second terminal.
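It ships as a package you can run directly; point the UI it opens at whatever endpoint and transport you’re developing against:

```bash
npx @modelcontextprotocol/inspector
```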
It gave us a live view of every tool we’d registered.
We could plug in sample inputs, see what responses came back, and tweak things without jumping into a full agent session. That tight loop helped us catch all kinds of small mistakes early — missing scopes, unclear param shapes, unexpected return formats.
Inspector became our daily safety net. If you’re building an MCP server, this kind of introspection tool isn’t a nice-to-have. It’ll save you hours of blind debugging, especially once you’ve got more than a handful of tools to manage.
We’re far from done. This first version was about standing something up, proving that a secure, agent-friendly MCP server could be more than a demo. Now we’re pushing further.
Next on our list: building toward something that’s not just agent-compatible but agent-native, a tool surface designed from the ground up to support secure, scoped, LLM-triggered actions.
If you’re thinking of building your own MCP server, or even just experimenting, here’s what we’d tell you, dev to dev: settle the transport question early, treat auth as a first-class design decision rather than an afterthought, scope every tool explicitly, and test against real agents (and an inspector) from day one.
We’re still building, but excited about where this is going. If the first era of software was humans calling APIs, and the second is agents calling tools, then MCP is where it all gets real.