
Apify MCP Integration for AI Agents

The Apify API is well-documented. Getting your agent to discover, call, and retrieve results from Actors correctly — across users, with token isolation, and without silent failures — is the part that takes longer than it should.
Apify MCP · Live
Categories: AI Automation · Developer Tools
Status: Live · Tools: 8 pre-built · Auth: Bearer Token
Features: Credential storage · Sandbox support · Per-user isolation · Actor execution · RAG-ready web scraping

The real problem

Why this is harder than it looks

Apify's API is straightforward and well-documented. You can get a working Actor run against your own account in under an hour. The complexity arrives when you're building a multi-tenant product — where each of your users brings their own Apify account and API token.

Apify authenticates via API token — a credential that is account-scoped. In a multi-tenant system, you're now managing a credential vault: one token per user, each with its own quota pool and Actor run history. There's no OAuth redirect to standardize collection — you need a UI to gather tokens, store them encrypted, and route every Actor call to the right account. Tokens don't expire automatically, but users can revoke or rotate them in the Apify console at any time. When that happens, your agent gets silent 401 errors with no automated recovery path unless you've built revocation detection and re-credentialing flows yourself.

Beyond credential management, Actor execution has its own failure surface. Actors can run synchronously or asynchronously — sync mode times out for long-running tasks, async mode requires polling apifymcp_get_actor_run and a separate apifymcp_get_actor_output call once status is SUCCEEDED. Calling an Actor without first fetching its input schema via apifymcp_fetch_actor_details produces silent failures or incorrect results. And the Analytics and Data API quota pools are tracked independently per Apify account — exceeding one doesn't affect the other, but failures look identical.

Scalekit handles token storage, per-user credential isolation, and account status tracking. Your agent names a tool and passes parameters. The auth plumbing is not your problem.

Capabilities

What your agent can do with Apify MCP

Once connected, your agent has 8 pre-built tools covering Actor discovery, execution, result retrieval, and real-time web browsing:

  • Discover Actors for any use case: search the Apify Store by keyword to find scrapers for specific platforms before deciding what to run
  • Inspect Actor schemas before calling: fetch input schema, pricing, and MCP tool list for any Actor to ensure correct invocation
  • Run any Actor synchronously or asynchronously: call Actors with the exact input they require; use async mode for long-running tasks and poll for completion
  • Retrieve full Actor output: fetch paginated dataset results with field selection after a run completes
  • Real-time web scraping for RAG: search Google and return clean Markdown from top pages — no Actor selection needed for immediate web lookups
  • Search and fetch Apify documentation: look up platform docs, SDK references, and Crawlee guides programmatically
Setup context

What we're building

This guide connects a data extraction agent to Apify — helping users discover and run web scraping Actors, retrieve structured results, and ground LLM responses with real-time web content, all without leaving your product.

🤖
Example agent
Data extraction assistant discovering and running Apify Actors to scrape, retrieve, and structure web data on behalf of each user
🔐
Auth model
API Key (Bearer Token) — each user supplies their Apify API token. identifier = your user ID
⚙️
Scalekit account
app.scalekit.com — Client ID, Secret, Env URL
🔑
Apify account
Create a free account at console.apify.com. Generate an API token under Settings → API & Integrations → API tokens
Setup

1 Setup: One SDK, one credential

Install the Scalekit SDK. The only credential your application manages is the Scalekit API key — no Apify secrets, no user tokens, nothing belonging to your users.

```shell
# Python
pip install scalekit-sdk-python

# Node.js
npm install @scalekit-sdk/node
```
```python
import os

from dotenv import load_dotenv
import scalekit.client

load_dotenv()

scalekit = scalekit.client.ScalekitClient(
    client_id=os.getenv("SCALEKIT_CLIENT_ID"),
    client_secret=os.getenv("SCALEKIT_CLIENT_SECRET"),
    env_url=os.getenv("SCALEKIT_ENV_URL"),
)
actions = scalekit.actions
```
```javascript
import { ScalekitClient } from '@scalekit-sdk/node';
import 'dotenv/config';

const scalekit = new ScalekitClient(
  process.env.SCALEKIT_ENV_URL,
  process.env.SCALEKIT_CLIENT_ID,
  process.env.SCALEKIT_CLIENT_SECRET
);
const actions = scalekit.actions;
```
Connected Accounts

2 Per-User Auth: Creating connected accounts

Apify MCP uses API token authentication — there is no OAuth redirect flow. Each user's connected account is provisioned with their Apify API token. The identifier is any unique string from your system.

In the Scalekit dashboard, go to AgentKit → Connected Accounts for your Apify MCP connection and click Add account. Supply the user's identifier and their Apify API token. Scalekit stores the credential encrypted and injects it into every tool call automatically.

```python
response = actions.get_or_create_connected_account(
    connection_name="apifymcp",
    identifier="user_apify_123"  # your internal user ID
)
connected_account = response.connected_account
print(f"Status: {connected_account.status}")
# Status: ACTIVE — credentials were supplied at account creation
```
```javascript
const response = await actions.getOrCreateConnectedAccount({
  connectionName: "apifymcp",
  identifier: "user_apify_123" // your internal user ID
});
const connectedAccount = response.connectedAccount;
console.log(`Status: ${connectedAccount.status}`);
// Status: ACTIVE — credentials were supplied at account creation
```

This call is idempotent — safe to call on every session start; it returns the existing account if one already exists.

Credential handling

3 Credential management

Because Apify MCP uses API tokens rather than OAuth, there is no user-facing authorization step. Credentials are entered once per connected account. Scalekit stores them encrypted and uses them on every tool call.

Credential storage is automatic
Once a connected account is provisioned with an Apify API token, Scalekit stores it in its encrypted vault and the account is immediately ACTIVE. Every tool call is authenticated with the stored token — your agent code never touches it. If a user rotates or revokes their token in the Apify console, the account moves to REVOKED. Check account.status before critical operations and surface a re-credentialing prompt.
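The status check described above can be wrapped in a small guard that runs before critical operations. This is a minimal sketch: `ensure_active_account` is a hypothetical helper name, and it assumes the `actions` client from step 1 plus the `ACTIVE`/`REVOKED` status values described in this section.

```python
def ensure_active_account(actions, identifier: str) -> bool:
    """Return True if the user's Apify connected account is usable.

    Sketch only: `actions` is the Scalekit actions client from step 1;
    the ACTIVE/REVOKED status strings follow the behavior described above.
    """
    response = actions.get_or_create_connected_account(
        connection_name="apifymcp",
        identifier=identifier,
    )
    status = response.connected_account.status
    if status == "REVOKED":
        # Token was rotated or revoked in the Apify console:
        # surface a re-credentialing prompt instead of a generic error.
        return False
    return status == "ACTIVE"
```

Call it at session start and route the user to your token-collection UI whenever it returns `False`.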
Generating an Apify API token
In the Apify console, go to Settings → API & Integrations → API tokens and click + Create new token. Give it a name (e.g. Agent Auth) and click Create token. Copy the token immediately — it is only shown once. Use a token scoped to the operations your agent will perform.
Calling Apify MCP

4 Calling Apify MCP: What your agent writes

With the connected account active, your agent calls Apify actions using execute_tool. Name the tool, pass parameters. Scalekit handles token retrieval and request construction.

Search for an Actor

Discover what scraping Actors exist for a platform or use case before deciding what to run. Always search before calling an unknown Actor.

```python
result = actions.execute_tool(
    identifier="user_apify_123",
    tool_name="apifymcp_search_actors",
    tool_input={
        "keywords": "LinkedIn company posts",
        "limit": 5
    }
)
# Returns: list of matching Actors with id, name, description, stats
```
```javascript
const result = await actions.executeTool({
  identifier: "user_apify_123",
  toolName: "apifymcp_search_actors",
  toolInput: {
    "keywords": "LinkedIn company posts",
    "limit": 5
  }
});
// Returns: list of matching Actors with id, name, description, stats
```

Fetch an Actor's input schema

Always retrieve the input schema before calling an Actor. Passing inputSchema: true in the output filter keeps the response token-efficient.

```python
result = actions.execute_tool(
    identifier="user_apify_123",
    tool_name="apifymcp_fetch_actor_details",
    tool_input={
        "actor": "apify/linkedin-company-posts-scraper",
        "output": {"inputSchema": True}
    }
)
# Returns: Actor input schema with required fields and types
```
```javascript
const result = await actions.executeTool({
  identifier: "user_apify_123",
  toolName: "apifymcp_fetch_actor_details",
  toolInput: {
    "actor": "apify/linkedin-company-posts-scraper",
    "output": { "inputSchema": true }
  }
});
// Returns: Actor input schema with required fields and types
```

Call an Actor synchronously

Run an Actor and wait for results. Use the input schema from the previous step to pass the correct fields. For long-running Actors, use async: true instead.

```python
result = actions.execute_tool(
    identifier="user_apify_123",
    tool_name="apifymcp_call_actor",
    tool_input={
        "actor": "apify/linkedin-company-posts-scraper",
        "input": {
            "companyUrls": [{"url": "https://www.linkedin.com/company/apify/"}],
            "maxResults": 20
        },
        "previewOutput": True
    }
)
# Returns: run metadata + preview of dataset items
```
```javascript
const result = await actions.executeTool({
  identifier: "user_apify_123",
  toolName: "apifymcp_call_actor",
  toolInput: {
    "actor": "apify/linkedin-company-posts-scraper",
    "input": {
      "companyUrls": [{ "url": "https://www.linkedin.com/company/apify/" }],
      "maxResults": 20
    },
    "previewOutput": true
  }
});
// Returns: run metadata + preview of dataset items
```
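For long-running Actors, the async path (call, poll, fetch output) can be sketched end to end. This is a sketch under assumptions: `run_actor_async` is a hypothetical helper, and the exact shape of each `execute_tool` result (e.g. `result["runId"]`, `run["status"]`) is an assumption about how your agent unpacks responses; only the tool names come from the tool reference in this guide.

```python
import time


def run_actor_async(actions, identifier: str, actor: str, actor_input: dict,
                    poll_interval: float = 5.0, timeout: float = 600.0) -> list:
    """Start an async Actor run, poll until it finishes, fetch the dataset.

    Sketch only: result payload shapes (runId, status, datasetId, items)
    are assumptions about how your agent unpacks execute_tool responses.
    """
    started = actions.execute_tool(
        identifier=identifier,
        tool_name="apifymcp_call_actor",
        tool_input={"actor": actor, "input": actor_input, "async": True},
    )
    run_id = started["runId"]  # async mode returns a runId immediately, no results

    deadline = time.monotonic() + timeout
    while True:
        run = actions.execute_tool(
            identifier=identifier,
            tool_name="apifymcp_get_actor_run",
            tool_input={"runId": run_id},
        )
        if run["status"] == "SUCCEEDED":
            break
        if run["status"] in ("FAILED", "ABORTED", "TIMED-OUT"):
            raise RuntimeError(f"Actor run {run_id} ended with {run['status']}")
        if time.monotonic() > deadline:
            raise TimeoutError(f"Actor run {run_id} still running after {timeout}s")
        time.sleep(poll_interval)

    # Hand off the datasetId from the run metadata to the output fetch
    output = actions.execute_tool(
        identifier=identifier,
        tool_name="apifymcp_get_actor_output",
        tool_input={"datasetId": run["datasetId"]},
    )
    return output["items"]
```

The explicit terminal-state check matters: without it, a failed run would keep polling until the timeout instead of surfacing the failure immediately.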

Real-time web scraping for RAG

Fetch clean Markdown from the top Google Search results for LLM grounding — no Actor selection needed. Use for time-sensitive queries like current prices, news, or live data.

```python
result = actions.execute_tool(
    identifier="user_apify_123",
    tool_name="apifymcp_rag_web_browser",
    tool_input={
        "query": "Apify pricing plans 2026",
        "maxResults": 3,
        "outputFormats": ["markdown"]
    }
)
# Returns: Markdown content from top 3 search result pages
```
```javascript
const result = await actions.executeTool({
  identifier: "user_apify_123",
  toolName: "apifymcp_rag_web_browser",
  toolInput: {
    "query": "Apify pricing plans 2026",
    "maxResults": 3,
    "outputFormats": ["markdown"]
  }
});
// Returns: Markdown content from top 3 search result pages
```
Framework wiring

5 Wiring into your agent framework

Scalekit integrates directly with LangChain. The agent decides what to call; Scalekit handles token injection on every invocation. No credential plumbing in your agent logic.

```python
from langchain_anthropic import ChatAnthropic
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from scalekit.langchain import get_tools

apify_tools = get_tools(
    connection_name="apifymcp",
    identifier="user_apify_123"
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a web data extraction assistant. Use the available tools to discover Apify Actors, run scraping tasks, and retrieve structured results."),
    MessagesPlaceholder("chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

agent = create_tool_calling_agent(
    ChatAnthropic(model="claude-sonnet-4-6"), apify_tools, prompt
)
result = AgentExecutor(agent=agent, tools=apify_tools).invoke({
    "input": "Find an Actor that scrapes Amazon product reviews and run it for ASIN B09V3KXJPB"
})
```
```javascript
import { ChatAnthropic } from "@langchain/anthropic";
import { AgentExecutor, createToolCallingAgent } from "langchain/agents";
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";
import { getTools } from "@scalekit-sdk/langchain";

const apifyTools = getTools({
  connectionName: "apifymcp",
  identifier: "user_apify_123"
});

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "You are a web data extraction assistant. Use the available tools to discover Apify Actors, run scraping tasks, and retrieve structured results."],
  new MessagesPlaceholder("chat_history", true),
  ["human", "{input}"],
  new MessagesPlaceholder("agent_scratchpad"),
]);

const agent = await createToolCallingAgent({
  llm: new ChatAnthropic({ model: "claude-sonnet-4-6" }),
  tools: apifyTools,
  prompt
});
const result = await AgentExecutor.fromAgentAndTools({ agent, tools: apifyTools }).invoke({
  input: "Find an Actor that scrapes Amazon product reviews and run it for ASIN B09V3KXJPB"
});
```
Other frameworks supported
Tool reference

All 8 Apify MCP tools

Grouped by capability. Your agent calls tools by name — no API wrappers to write.

Actor Discovery
apifymcp_search_actors
Search the Apify Store by keyword to discover available scraping Actors for a platform or use case. Does not run any Actor — discovery only
apifymcp_fetch_actor_details
Retrieve Actor metadata, input schema, pricing, and MCP tool list. Always pass output: {"inputSchema": true} to avoid a token-heavy full response
Actor Execution & Results
apifymcp_call_actor
Call any Actor with input matching its schema. Runs synchronously by default; use async: true for long-running tasks. Fetch the input schema first
apifymcp_get_actor_run
Poll the status and metadata of an async Actor run by runId. Returns datasetId, keyValueStoreId, and performance stats
apifymcp_get_actor_output
Retrieve paginated dataset items from a completed run using datasetId. Supports field selection with dot notation
Web Browsing
apifymcp_rag_web_browser
Search Google and return clean Markdown from the top N result pages. Use for real-time data retrieval and LLM grounding — no Actor selection needed
Documentation
apifymcp_search_apify_docs
Full-text search across Apify platform docs, Crawlee JS, or Crawlee Python documentation. Use keywords, not sentences
apifymcp_fetch_apify_docs
Fetch the full content of a specific Apify or Crawlee documentation page by URL
Connector notes

Apify MCP-specific behavior

Always fetch the input schema before calling an Actor
Calling apifymcp_call_actor without the Actor's input schema produces silent failures or incorrect results. Use apifymcp_fetch_actor_details with output: {"inputSchema": true} first. Omitting the output filter returns the full README and all metadata — extremely token-heavy and unnecessary for most calls.
Async runs require polling and a separate output fetch
When calling an Actor with async: true, the response returns a runId immediately but no results. Poll apifymcp_get_actor_run until status is SUCCEEDED, then call apifymcp_get_actor_output with the returned datasetId to retrieve results. Sync mode (default) blocks until completion and returns a preview inline.
API tokens don't expire — but they do get rotated
Apify API tokens have no automatic expiry, but users can revoke or regenerate them in the Apify console at any time. When that happens, the connected account moves to REVOKED in Scalekit. Surface a re-credentialing prompt rather than returning a generic error to the user.
Infrastructure decision

Why not build this yourself

The Apify API is documented. Token storage isn't technically hard. But here's what you're actually signing up for:

PROBLEM 01
Per-user token isolation in a multi-tenant system — one user's Apify credentials must never authenticate calls for another user, even within the same organization
PROBLEM 02
API tokens don't expire but do get rotated — you need revocation detection and re-credentialing flows without the OAuth lifecycle hooks other connectors provide
PROBLEM 03
Async Actor execution requires polling and separate output retrieval — your agent logic must handle run states, timeouts, and datasetId handoff without a built-in abstraction
PROBLEM 04
Encrypted credential storage, account status tracking, and a UI for collecting user API tokens at onboarding — all before you've written a single Actor call

That's one connector. Your agent product will eventually need Salesforce, Gmail, GitHub, Zendesk, and whatever else your customers ask for. Each has its own auth quirks and failure modes.

Scalekit maintains every connector. You maintain none of them.

Ready to ship

Scrape the web, instantly

Free to start. Token storage fully handled.