Jun 29, 2026

Microsoft Word MCP vs API for developing AI Agents

TL;DR

The official Work IQ Word MCP server (preview, part of Microsoft Agent 365) exposes four tools: create a document, read document content and comments, add a comment, and reply to a comment. It does not edit document body content, apply formatting, or manage tracked changes. Those gaps are structural, not a backlog item.
Neither path gives you a structured Word content API. Unlike Excel, which has a rich /workbook surface in Microsoft Graph, Word .docx files are binary driveItems. The Graph path is to download the file, edit it locally with a library, then re-upload. Surgical paragraph and run edits happen in your code, not in the API.
Both paths are OAuth-first. The Work IQ Word MCP authenticates through Microsoft Entra with a delegated On-Behalf-Of flow or an agent identity, and requires a Microsoft 365 Copilot license. There is no API key shortcut on the MCP path.
For multi-tenant B2B agents, both paths produce one credential per user. Neither stores, refreshes, or revokes those tokens for you. That stays an infrastructure problem regardless of which path you choose.
Scalekit's Microsoft Word connector and Token Vault handle the per-user OAuth flow, encrypted token storage, and automatic refresh for both paths, so the MCP versus API decision does not change your auth infrastructure.

Your agent needs to read and write Microsoft Word documents. Microsoft now ships an official Word MCP server, the Work IQ Word server inside Microsoft Agent 365, and Word has long been reachable through Microsoft Graph. They are not the same object: different capability coverage, a different auth path, and a different operational surface in production. The choice is not permanent either. What your agent actually does with Word decides which path fits. Here is how to pick.

What Microsoft Word MCP and Microsoft Word API actually are

Microsoft Word MCP

The official Word MCP server is the Work IQ Word server, identified as mcp_WordServer, shipped as part of Microsoft Agent 365. It is in preview. It is a Microsoft-built, Microsoft-hosted remote MCP server, and the remote endpoint takes the form https://agent365.svc.cloud.microsoft/agents/tenants/{tenant_id}/servers/mcp_WordServer.
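As a rough illustration of the endpoint shape above, the sketch below fills in the tenant ID and builds a standard MCP tools/list JSON-RPC request body. It does not send anything: acquiring the Entra bearer token and choosing a transport are outside this snippet, and the helper names and the placeholder tenant ID are hypothetical.

```python
import json
from urllib.parse import quote

# Endpoint form quoted in the text; only the tenant ID varies.
WORD_MCP_TEMPLATE = (
    "https://agent365.svc.cloud.microsoft/agents/tenants/{tenant_id}"
    "/servers/mcp_WordServer"
)


def word_mcp_endpoint(tenant_id: str) -> str:
    """Return the Work IQ Word MCP URL for one tenant (hypothetical helper)."""
    if not tenant_id:
        raise ValueError("tenant_id is required")
    return WORD_MCP_TEMPLATE.format(tenant_id=quote(tenant_id, safe=""))


def tools_list_request(request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 body asking an MCP server to enumerate its tools."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


# Placeholder tenant ID, for illustration only.
endpoint = word_mcp_endpoint("00000000-0000-0000-0000-000000000000")
body = tools_list_request()
```

Listing tools first, rather than hard-coding their schemas, is one way to notice schema changes while the server is in preview.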

An agent connects through Microsoft Entra. Authentication is OAuth, either an On-Behalf-Of delegated user flow or an agent identity, and a Microsoft 365 Copilot license is required. There is no API key or static credential option on this path.

Microsoft Word API

There is no standalone Word REST API. Word documents are reached through Microsoft Graph as driveItem resources in OneDrive or SharePoint. You download a .docx with GET /drives/{drive-id}/items/{item-id}/content, and replace it with PUT .../content for small files or a resumable upload session for files over 4 MB.
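The download and replace calls described above can be sketched with the standard library alone. This is a minimal sketch, assuming you already hold a valid Graph access token; the helper names are hypothetical, and the resumable upload session for larger files is deliberately left out.

```python
import urllib.request
from typing import Optional
from urllib.parse import quote

GRAPH = "https://graph.microsoft.com/v1.0"
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def content_url(drive_id: str, item_id: str, fmt: Optional[str] = None) -> str:
    """Build the driveItem /content URL; fmt="pdf" asks Graph for a PDF rendition."""
    url = f"{GRAPH}/drives/{quote(drive_id, safe='')}/items/{quote(item_id, safe='')}/content"
    return f"{url}?format={quote(fmt, safe='')}" if fmt else url


def download_request(token: str, drive_id: str, item_id: str,
                     fmt: Optional[str] = None) -> urllib.request.Request:
    """GET the file bytes (or a converted rendition) for one driveItem."""
    return urllib.request.Request(
        content_url(drive_id, item_id, fmt),
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def upload_request(token: str, drive_id: str, item_id: str,
                   docx_bytes: bytes) -> urllib.request.Request:
    """PUT replacement bytes for a small file in a single call."""
    return urllib.request.Request(
        content_url(drive_id, item_id),
        data=docx_bytes,
        headers={"Authorization": f"Bearer {token}", "Content-Type": DOCX_MIME},
        method="PUT",
    )


def send(request: urllib.request.Request) -> bytes:
    """Execute a prepared request; urlopen follows the download redirect."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Keeping request construction separate from send makes the URL and header logic easy to unit test without touching the network.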

Authentication is the Microsoft identity platform: OAuth Authorization Code for delegated access, and client credentials or JWT-bearer patterns for app-only, server-to-server access with no user in the loop.

Comparing them where it matters for agents

What your agent can actually do

The Work IQ Word MCP covers a narrow, document-lifecycle slice: create a document from HTML or plain text, read its text and comments, and run comment threads. The Graph path covers more, but with a sharp limit of its own: it has no structured Word body model, so any real editing happens in your code after a download.

| Capability | Work IQ Word MCP | Microsoft Graph (Word) |
| --- | --- | --- |
| Create a new document | Yes (WordCreateNewDocument) | Yes (upload .docx bytes) |
| Read document text content | Yes (WordGetDocumentContent) | Partial (download file, parse locally; or convert to PDF) |
| Read document comments | Yes | No (download and parse the OOXML yourself) |
| Add a comment | Yes (WordCreateNewComment) | No |
| Reply to a comment | Yes (WordReplyToComment) | No |
| Edit body content in place | No | No structured API; edit locally and re-upload |
| Apply formatting or styles | No | No structured API; edit locally and re-upload |
| Tracked changes | No | No structured API; edit locally and re-upload |
| List and search documents | No | Yes (driveItem list, search, delta) |
| Convert document to PDF | No | Yes (GET .../content?format=pdf) |
| React to document changes | No | Partial (folder-level driveItem webhooks, updated only) |

Where the ceilings are

The MCP ceiling is the tool count. Four tools is enough for a comment-and-create assistant, not enough for an agent that edits documents or browses a workspace. The server has no list, no search, and no body-editing surface.

The Graph ceiling is different but just as real. Word has no equivalent to the Excel /workbook content model. You cannot patch a paragraph or a run through an endpoint. The agent downloads the binary, edits it with a library such as python-docx or the Open XML SDK, and uploads the result. That is more control and more code.
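To make the "parse locally" half of that loop concrete, here is a minimal sketch that reads paragraph text and comments out of downloaded .docx bytes using only the standard library. It relies on the fact that a .docx is a ZIP archive holding word/document.xml and, when comments exist, word/comments.xml. The function names and the demo document are hypothetical, and this is read-only: for in-place edits, a library such as python-docx or the Open XML SDK is the safer route.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace, in ElementTree's {uri}tag notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _text_of(node: ET.Element) -> str:
    """Concatenate every w:t text run found under the node."""
    return "".join(t.text or "" for t in node.iter(f"{W}t"))


def read_docx(docx_bytes: bytes) -> dict:
    """Return paragraph texts and comments from raw .docx bytes."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        body = ET.fromstring(archive.read("word/document.xml"))
        # Note: nested paragraphs (for example inside text boxes) are not de-duplicated.
        paragraphs = [_text_of(p) for p in body.iter(f"{W}p")]
        comments = []
        if "word/comments.xml" in archive.namelist():
            root = ET.fromstring(archive.read("word/comments.xml"))
            for comment in root.iter(f"{W}comment"):
                comments.append(
                    {"author": comment.get(f"{W}author"), "text": _text_of(comment)}
                )
    return {"paragraphs": paragraphs, "comments": comments}


def _demo_docx() -> bytes:
    """Build a tiny stand-in archive with just the two parts read above."""
    ns = 'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"'
    document = (
        f"<w:document {ns}><w:body><w:p><w:r><w:t>Hello </w:t></w:r>"
        "<w:r><w:t>Word</w:t></w:r></w:p></w:body></w:document>"
    )
    comments = (
        f'<w:comments {ns}><w:comment w:id="0" w:author="Ada"><w:p><w:r>'
        "<w:t>Looks good</w:t></w:r></w:p></w:comment></w:comments>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document)
        archive.writestr("word/comments.xml", comments)
    return buffer.getvalue()


parsed = read_docx(_demo_docx())
```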

The auth path each one puts you on

The Work IQ Word MCP is OAuth-only through Microsoft Entra, and it requires a Microsoft 365 Copilot license. An agent either runs the On-Behalf-Of flow, exchanging a signed-in user's token to act with that user's permissions, or uses an agent identity with its own assigned permissions. Interactive consent or a pre-provisioned agent identity is required. There is no static credential.

Microsoft Graph is broader. Delegated OAuth Authorization Code works for user-present agents, and app-only client credentials or JWT-bearer flows give clean server-to-server access for headless work. A background job that processes documents on a schedule has a first-class auth path on Graph; on the MCP path it does not.
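The difference between the delegated and headless paths shows up in the token request itself. The sketch below builds the form bodies for an On-Behalf-Of exchange and for an app-only client credentials grant against the Microsoft identity platform v2.0 token endpoint. It only constructs the requests; the function names are hypothetical, and the scope passed to the On-Behalf-Of call is a placeholder you would replace with whatever the target resource requires.

```python
from urllib.parse import urlencode


def token_endpoint(tenant_id: str) -> str:
    """Microsoft identity platform v2.0 token endpoint for one tenant."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def obo_form(client_id: str, client_secret: str,
             user_assertion: str, scope: str) -> str:
    """Form body that exchanges a signed-in user's token for a downstream token."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": user_assertion,        # the incoming user access token
        "scope": scope,                     # placeholder: depends on the target resource
        "requested_token_use": "on_behalf_of",
    })


def client_credentials_form(
    client_id: str,
    client_secret: str,
    scope: str = "https://graph.microsoft.com/.default",
) -> str:
    """Form body for an app-only Graph token, with no user in the loop."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
```

The On-Behalf-Of body cannot be built without a user assertion, which is why a scheduled job with no signed-in user has no equivalent on the delegated path.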

Why this matters for multi-tenant agents

For a B2B agent serving 40 users across several tenants, both paths converge on the same shape: one OAuth identity per user, scoped to that user's OneDrive and SharePoint permissions. The Copilot license requirement on the MCP path is the extra gate to plan for. Neither path stores, refreshes, or revokes those per-user tokens for you. That is infrastructure you build or buy.

What you own in production

On the MCP path, Microsoft manages hosting, tool schemas, and runtime governance through the Microsoft 365 admin center, and tool calls are traceable in Microsoft Defender. You still own per-user token acquisition through Entra, the Copilot license footprint, and adapting to schema changes while the server is in preview.

On the Graph path, you own everything above the wire: endpoint selection, the download-edit-upload loop, local document parsing, retry and pagination logic, change-notification subscriptions and their renewal, and the full token lifecycle. The maintenance surface is larger; the determinism is higher because you pin behavior rather than consume a preview contract.
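The retry and pagination logic mentioned above might look like the following sketch. It follows the @odata.nextLink value that Graph returns on paged collections and backs off on throttling responses. The fetch callable is an assumption introduced here so the loop can be exercised without a network; in practice it would wrap your HTTP client and token handling.

```python
import time
from typing import Callable, Dict, Iterator, Tuple

# fetch(url) -> (status_code, response_headers, parsed_json_body)
Fetch = Callable[[str], Tuple[int, Dict[str, str], dict]]


def iter_drive_items(
    first_url: str,
    fetch: Fetch,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield every item across all pages, retrying throttled requests."""
    url = first_url
    while url:
        for attempt in range(max_retries + 1):
            status, headers, payload = fetch(url)
            if status not in (429, 503):
                break
            if attempt == max_retries:
                raise RuntimeError(f"gave up after {max_retries} retries: {url}")
            # Assumes Retry-After is given in seconds; otherwise back off exponentially.
            sleep(float(headers.get("Retry-After", 2 ** attempt)))
        if status >= 400:
            raise RuntimeError(f"Graph returned {status} for {url}")
        yield from payload.get("value", [])
        url = payload.get("@odata.nextLink")  # absent on the last page
```

Injecting fetch and sleep keeps the pagination loop deterministic under test, which fits the pinned-behavior argument for the Graph path.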

When to use MCP, when to use the API

Use the Work IQ Word MCP when:

Your agent is interactive and user-present, inside Copilot Studio, Microsoft Foundry, or a coding agent, and the user can complete the Entra consent flow.
The job is creating documents and running comment workflows, not editing document bodies or formatting.
Your organization already holds Microsoft 365 Copilot licenses and wants admin-center governance and Defender tracing by default.
You want Microsoft to own server hosting and tool maintenance.

Use Microsoft Graph directly when:

Your agent runs headless on a schedule, where app-only or JWT-bearer auth fits and the MCP path's interactive consent does not.
The agent must edit document content, apply formatting, or handle tracked changes, which means a download-edit-upload loop in your own code.
The agent needs to list, search, or browse documents across a drive, or convert documents to PDF.
You need a deterministic pipeline where you pin Graph behavior rather than depend on a preview server's evolving schema.
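The two checklists above can be condensed into a small decision helper. This is only one reading of the criteria, with hypothetical field names, and it treats a missing Copilot license as ruling out the MCP path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentNeeds:
    headless: bool = False                    # runs on a schedule, no user present
    edits_body_or_formatting: bool = False    # body edits, styles, tracked changes
    lists_searches_or_converts: bool = False  # drive browsing or PDF conversion
    has_copilot_license: bool = True          # required by the Work IQ Word MCP


def choose_word_path(needs: AgentNeeds) -> str:
    """Return "graph" or "mcp" according to the checklists in this section."""
    requires_graph = (
        needs.headless
        or needs.edits_body_or_formatting
        or needs.lists_searches_or_converts
    )
    if requires_graph or not needs.has_copilot_license:
        return "graph"
    return "mcp"
```

An agent with mixed workloads would call this per workflow rather than once per product, since an interactive assistant and a background pipeline can land on different paths.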

The credential problem that exists on both paths

Both paths hand you a token per user. Neither hands you a vault, rotation logic, or a revocation flow. In a multi-tenant agent, which is the B2B default, every user has their own Microsoft credential. That is N tokens to store encrypted, refresh proactively, and revoke when someone leaves.

The same problem, a different token type

The MCP path gives you an Entra-issued delegated token per user. The Graph path gives you an OAuth access token per user, or an app-only token per tenant for background work. In both cases the token must live somewhere isolated per tenant, never in the agent runtime, never in LLM context, and revocable on disconnect. The token type differs; the credential infrastructure required does not.
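A minimal sketch of the store those requirements imply is shown below: tokens keyed by tenant and user, a proactive refresh check, and revocation that makes later lookups fail closed. The class and field names are hypothetical, and the in-memory dictionary is a stand-in only; a real store would encrypt at rest and live outside the agent runtime.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class StoredToken:
    access_token: str
    refresh_token: str
    expires_at: float  # epoch seconds


class TokenStore:
    """In-memory stand-in for a per-tenant, per-user credential vault."""

    def __init__(self, refresh_skew_seconds: float = 300.0) -> None:
        self._tokens: Dict[Tuple[str, str], StoredToken] = {}
        self._skew = refresh_skew_seconds

    def put(self, tenant_id: str, user_id: str, token: StoredToken) -> None:
        self._tokens[(tenant_id, user_id)] = token

    def get(self, tenant_id: str, user_id: str) -> StoredToken:
        # Raises KeyError for unknown or revoked users, so callers fail closed.
        return self._tokens[(tenant_id, user_id)]

    def needs_refresh(self, tenant_id: str, user_id: str,
                      now: Optional[float] = None) -> bool:
        """True once the token is inside the skew window before expiry."""
        current = time.time() if now is None else now
        return current >= self.get(tenant_id, user_id).expires_at - self._skew

    def revoke(self, tenant_id: str, user_id: str) -> None:
        """Drop one user's credential without touching anyone else's."""
        self._tokens.pop((tenant_id, user_id), None)
```

Keying on the (tenant, user) pair rather than the user alone is what keeps one tenant's lookup from ever resolving another tenant's credential.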

Where Scalekit fits

Scalekit's Microsoft Word connector handles the OAuth flow, encrypted token storage, and automatic refresh for both paths, so the MCP versus API decision does not change your auth infrastructure. What the user can't do, the agent can't do, because access is resolved from the user's own connected account, not a shared service credential.

Building a Word agent with Scalekit

The value for a document agent

A document agent rarely needs raw Word alone. It reads a proposal, drafts a follow-up, files a contract, and posts a summary somewhere else. Every one of those steps is a separate OAuth dialect if you wire it yourself. Scalekit's AgentKit gives the agent authenticated access to Microsoft Word, and to the other connectors the workflow touches, behind one execution model. The agent calls a tool; Scalekit resolves the right per-user token server-side and runs the call.

Token Vault keeps credentials out of the agent

Credentials live in Scalekit's managed Token Vault, encrypted at rest and namespaced per tenant. They are resolved at request time and never appear in the agent runtime or the LLM context. User A's documents are never reachable by an agent acting for user B, even on the same connection. Revocation is a single action, and the next tool call for that user fails closed without affecting anyone else.

Connect the tools with the Python SDK

The Scalekit Microsoft Word connector exposes prebuilt, LLM-ready tools. The two documented tools are microsoftword_read_document, which exports a .docx from OneDrive as a PDF for client-side parsing, and microsoftword_create_document, which starts a resumable upload session for a new document. You call them through execute_tool after the user authorizes once. Note that connection_name must match the connection you configured in the Scalekit dashboard exactly; a mismatch is the most common integration error.

    import os

    from dotenv import load_dotenv
    from scalekit.client import ScalekitClient

    load_dotenv()

    scalekit_client = ScalekitClient(
        env_url=os.getenv("SCALEKIT_ENV_URL"),
        client_id=os.getenv("SCALEKIT_CLIENT_ID"),
        client_secret=os.getenv("SCALEKIT_CLIENT_SECRET"),
    )
    actions = scalekit_client.actions

    connection_name = "microsoftword"  # must match the dashboard connection name exactly
    identifier = "user_123"

    # One-time per user: send them through the OAuth consent flow
    link_response = actions.get_authorization_link(
        connection_name=connection_name,
        identifier=identifier,
    )
    print("Authorize Microsoft Word:", link_response.link)
    input("Press Enter after authorizing...")

    # Execute a scoped tool as the authorized user
    result = actions.execute_tool(
        tool_name="microsoftword_read_document",
        connection_name=connection_name,
        identifier=identifier,
        tool_input={"item_id": "YOUR_ITEM_ID"},
    )
    print(result)

Scope the surface before the agent runs

The agent should see only the tools the current user is authorized to call, not a full catalog. A bloated tool surface degrades selection accuracy and burns tokens before the agent does any work. Scoping the surface is the lever, and you set it with a tool filter when you list the user's scoped tools.

    scoped = actions.list_scoped_tools(
        identifier="user_123",
        filter={
            "connection_names": ["microsoftword"],
            "tool_names": [
                "microsoftword_read_document",
                "microsoftword_create_document",
            ],
        },
    )

Or expose Word through a Virtual MCP server

If you would rather hand any MCP-compatible framework a pre-authenticated endpoint instead of driving the loop yourself, use Scalekit's Virtual MCP server. You declare a config once with create_config, naming the connections and the exact tools to expose, then call ensure_instance per user. Scalekit mints a per-user URL of the form https://yourcompany.scalekit.com/mcp/v3/servers/<server-id>. One server definition serves every user; each run resolves that user's own credentials. The endpoint is static; the identity is per-user, and there is no MCP server to deploy, host, or maintain.

Audit logs make multi-tool, multi-tenant agents observable

Every Microsoft Word tool call is logged: who triggered it, which document was touched, and what came back, tied to the user who authorized it rather than a shared bot account. For a multi-tenant agent that also reaches into Slack, a CRM, or a ticketing system, that per-user attribution is what makes the audit trail accurate and the agent answerable to a security review.

Need a Word tool Scalekit does not expose yet

The documented connector centers on creating and reading documents. If your agent needs a Word action that is not yet in the catalog, request it: join the Scalekit Slack community or talk to the team. New tool requests are typically turned around quickly, on the same auth plumbing as every other connector.

Which one to build against

If your agent is interactive, user-present, and its job is creating documents and managing comments, the Work IQ Word MCP is the faster path, provided you have Microsoft 365 Copilot licenses and can live with a preview surface. If your agent runs headless, edits document content, or needs to list, search, and convert across a drive, build on Microsoft Graph and own the download-edit-upload loop. Most production Word agents end up using both: the MCP for the interactive assistant, Graph for the background pipeline. Either way, the credential management problem is the same, and that is what needs production-grade infrastructure.

Browse the Scalekit Microsoft Word connector.
Read Scalekit's Word connector docs, the official Work IQ Word MCP reference, and the Microsoft Graph driveItem reference.
