
BigQuery (Service Account) Integration for AI Agents

The BigQuery API is well-documented. Getting your agent to query real customers' warehouses — correctly, across multiple GCP projects, without exposing service account credentials — is the part that takes longer than it should.
BigQuery (Service Account)
Status: Live
Category: Data & Analytics · Data Warehouse
Tools: 16 pre-built
Auth: Service Account
Includes: Credential storage · Sandbox support

  • Service Account Auth
  • Per-user isolation
  • Multi-project routing
  • Query + metadata tools

The real problem

Why this is harder than it looks

BigQuery's REST API is well-documented and the SQL dialect is familiar. You can get a working query running against your own GCP project in under an hour. The complexity arrives when you try to do this for real customers in a multi-tenant product — and it's different from OAuth-based connectors because BigQuery Service Account auth doesn't involve a redirect flow at all.

Service account authentication means each of your customers supplies a GCP service account JSON credential: a long-lived private key that grants access to their BigQuery project. Unlike OAuth tokens, these don't expire on a schedule, don't provide a standardized revocation signal, and carry whatever IAM permissions were granted at the time the key was created. In a multi-tenant product, you're now managing a credential vault: one service account JSON per customer, each scoped to a different GCP project ID. Every API call must be routed to the correct project. BigQuery's base URL includes the project ID, so a routing error doesn't produce a permission error; it produces a "project not found" response that looks the same whether you routed to the wrong project ID or the customer's project was deleted.

Beyond storage and routing, the operational surface compounds quickly. Customers rotate service account keys without warning; your calls start returning 401s with no OAuth lifecycle hook to trigger re-credentialing, so you need per-account status tracking and a re-credentialing flow you build yourself. Customers may also scope their service account to a specific GCP region, meaning location parameters on jobs and datasets must match the region where the data actually lives, or queries fail with opaque location mismatch errors.
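To make that concrete, here is a minimal sketch of the DIY plumbing in Python. Everything in it is hypothetical (the vault interface, the secret naming); it shows the shape of per-tenant routing, not Scalekit's implementation.

import json

# Hypothetical DIY sketch, not Scalekit's implementation.
# Assumes a `vault` object that returns the service account JSON
# you stored for this tenant at onboarding.

def bigquery_query_url(tenant_id: str, vault) -> str:
    """Build the project-scoped jobs.query URL for one tenant."""
    creds = json.loads(vault.get_secret(f"bq_sa_json:{tenant_id}"))
    project_id = creds["project_id"]  # the routing key lives inside the credential
    # The REST endpoint embeds the project ID in the path; route this
    # to the wrong project and you get "project not found", not a
    # permission error.
    return f"https://bigquery.googleapis.com/bigquery/v2/projects/{project_id}/queries"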

Scalekit handles credential storage, per-tenant project routing, and account status tracking. Your agent names a tool and passes parameters. The plumbing is not your problem.

Capabilities

What your agent can do with BigQuery (Service Account)

Once connected, your agent has 16 pre-built tools covering the full BigQuery data workflow:

  • Run queries synchronously or asynchronously: execute SQL against any dataset with bigqueryserviceaccount_run_query for short queries, or submit async jobs with bigqueryserviceaccount_insert_query_job for long-running workloads
  • Validate queries before running: dry-run any SQL to check syntax and estimate byte cost before committing to execution
  • Explore datasets and schema: list all datasets in a project, list tables within a dataset, and retrieve full column-level schema for any table or view
  • Read table data directly: fetch rows from a table without writing SQL — useful for sampling or paginating through known datasets
  • Manage and monitor jobs: list running and completed jobs, retrieve job status, cancel in-flight jobs, and paginate through query results
  • Inspect ML models and routines: retrieve BigQuery ML model metadata and list stored procedures and UDFs in any dataset
Setup context

What we're building

This guide connects a data assistant agent to BigQuery using a GCP service account — helping analysts run queries, explore schemas, and retrieve results without leaving your product.

🤖
Example agent
Data assistant running SQL queries, exploring dataset schemas, and retrieving BigQuery results on behalf of each analyst
🔐
Auth model
Service Account — each customer supplies their GCP service account JSON. identifier = your user ID
⚙️
Scalekit account
app.scalekit.com — Client ID, Secret, Env URL
☁️
GCP service account
A GCP service account JSON key with BigQuery permissions. Create at console.cloud.google.com
Setup

1 Setup: One SDK, one credential

Install the Scalekit SDK. The only credential your application manages is the Scalekit API key — no GCP secrets, no service account keys, nothing belonging to your customers.

pip install scalekit-sdk-python
npm install @scalekit-sdk/node
import os

import scalekit.client
from dotenv import load_dotenv

load_dotenv()

scalekit = scalekit.client.ScalekitClient(
    client_id=os.getenv("SCALEKIT_CLIENT_ID"),
    client_secret=os.getenv("SCALEKIT_CLIENT_SECRET"),
    env_url=os.getenv("SCALEKIT_ENV_URL"),
)
actions = scalekit.actions
import { ScalekitClient } from '@scalekit-sdk/node';
import 'dotenv/config';

const scalekit = new ScalekitClient(
  process.env.SCALEKIT_ENV_URL,
  process.env.SCALEKIT_CLIENT_ID,
  process.env.SCALEKIT_CLIENT_SECRET
);
const actions = scalekit.actions;
Connected Accounts

2 Per-User Auth: Creating connected accounts

BigQuery Service Account uses static credential auth — there is no OAuth redirect flow. Each connected account is provisioned directly with the customer's GCP service account JSON. The identifier is any unique string from your system.

Pass the service account JSON at account creation time. Scalekit stores it encrypted, extracts the GCP project ID automatically, and routes every tool call to the correct project.

response = actions.get_or_create_connected_account(
    connection_name="bigqueryserviceaccount",
    identifier="user_bq_123",  # your internal user ID
    authorization_details={
        "static_auth": {
            "service_account_json": SERVICE_ACCOUNT_JSON  # customer's GCP service account JSON string
        }
    }
)
connected_account = response.connected_account
print(f"Status: {connected_account.status}")
# Status: ACTIVE — credentials supplied at account creation
const response = await actions.getOrCreateConnectedAccount({
  connectionName: "bigqueryserviceaccount",
  identifier: "user_bq_123", // your internal user ID
  authorizationDetails: {
    staticAuth: {
      serviceAccountJson: SERVICE_ACCOUNT_JSON // customer's GCP service account JSON string
    }
  }
});
const connectedAccount = response.connectedAccount;
console.log(`Status: ${connectedAccount.status}`);
// Status: ACTIVE — credentials supplied at account creation

This call is idempotent and safe to call on every session start; it returns the existing account if one already exists.

Credential handling

3 Credential management

Because BigQuery Service Account uses static credentials, there is no user-facing authorization step. Credentials are supplied once per connected account and Scalekit handles everything from there.

Credential storage is automatic
Once a connected account is provisioned with a service account JSON, Scalekit stores it in its encrypted vault, extracts the GCP project ID, and the account is immediately ACTIVE. Every tool call is routed to the correct BigQuery project — your agent code never touches the credentials. If credentials become invalid (e.g. the customer deletes or rotates the service account key), the account moves to REVOKED. Check account.status before critical operations and surface a re-credentialing prompt.
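A minimal sketch of that status gate, reusing the connected_account object from step 2; prompt_for_new_credentials is a hypothetical helper in your own product, not a Scalekit API.

# Sketch: gate a critical operation on account status before calling a tool.
# Assumes `connected_account` from step 2 and a hypothetical
# prompt_for_new_credentials() helper in your own product.

if connected_account.status == "ACTIVE":
    result = actions.execute_tool(
        identifier="user_bq_123",
        tool_name="bigqueryserviceaccount_run_query",
        tool_input={"query": "SELECT 1", "location": "US"},
    )
else:
    # REVOKED: the customer likely rotated or deleted the key in GCP.
    prompt_for_new_credentials(user_id="user_bq_123")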
Creating a GCP service account key
In Google Cloud Console, go to IAM & Admin → Service Accounts. Create or select a service account, then go to Keys → Add Key → Create new key → JSON. Download the JSON file. Grant the service account at minimum the BigQuery Data Viewer and BigQuery Job User roles on the target project. For write operations, BigQuery Data Editor is required. The JSON key is shown only once — store it securely before passing it to Scalekit.
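Because the key is collected once at onboarding, it is worth sanity-checking the upload before passing it to Scalekit. A minimal sketch; the field names checked here are the standard contents of a GCP service account key file.

import json

# Standard fields present in every GCP service account key file.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def validate_service_account_json(raw: str) -> str:
    """Basic sanity checks on a customer-supplied key before storing it."""
    creds = json.loads(raw)  # raises ValueError on malformed JSON
    missing = REQUIRED_FIELDS - creds.keys()
    if missing:
        raise ValueError(f"Key file is missing fields: {sorted(missing)}")
    if creds["type"] != "service_account":
        raise ValueError("Not a service account key (got user credentials?)")
    return creds["project_id"]  # useful to echo back in your onboarding UI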
Calling BigQuery (Service Account)

4 Calling BigQuery: What your agent writes

With the connected account active, your agent calls BigQuery actions using execute_tool. Name the tool, pass parameters. Scalekit handles credential retrieval, project routing, and response parsing.

Run a synchronous query

Execute SQL and get results immediately. Best for short-running queries. Returns rows inline without requiring job polling.

result = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_run_query",
    tool_input={
        "query": "SELECT id, name, revenue FROM `myproject.sales.accounts` WHERE revenue > 1000000 ORDER BY revenue DESC LIMIT 20",
        "location": "US",
        "max_results": 20
    }
)
# Returns: rows with schema and data inline
const result = await actions.executeTool({
  identifier: "user_bq_123",
  toolName: "bigqueryserviceaccount_run_query",
  toolInput: {
    query: "SELECT id, name, revenue FROM `myproject.sales.accounts` WHERE revenue > 1000000 ORDER BY revenue DESC LIMIT 20",
    location: "US",
    max_results: 20
  }
});
// Returns: rows with schema and data inline

Validate a query before running

Dry-run any SQL to check syntax and estimate byte cost without executing. Use before submitting large or expensive queries.

result = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_dry_run_query",
    tool_input={
        "query": "SELECT * FROM `myproject.analytics.events` WHERE event_date = '2026-04-01'",
        "location": "US"
    }
)
# Returns: { "statistics": { "totalBytesProcessed": "2483648000" } }
# — check byte estimate before committing to the real job
const result = await actions.executeTool({
  identifier: "user_bq_123",
  toolName: "bigqueryserviceaccount_dry_run_query",
  toolInput: {
    query: "SELECT * FROM `myproject.analytics.events` WHERE event_date = '2026-04-01'",
    location: "US"
  }
});
// Returns: { statistics: { totalBytesProcessed: "2483648000" } }
// — check byte estimate before committing to the real job

Submit an async query job and retrieve results

For long-running queries, submit an async job and poll until done. Use bigqueryserviceaccount_insert_query_job to submit, bigqueryserviceaccount_get_job to poll, and bigqueryserviceaccount_get_query_results to paginate through results.

# 1. Submit the async job
job = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_insert_query_job",
    tool_input={
        "query": "SELECT user_id, SUM(amount) AS total FROM `myproject.billing.transactions` GROUP BY user_id",
        "location": "US",
        "priority": "INTERACTIVE"
    }
)
job_id = job["jobReference"]["jobId"]

# 2. Poll until DONE
status = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_get_job",
    tool_input={"job_id": job_id, "location": "US"}
)
# status["status"]["state"] == "DONE" when complete

# 3. Retrieve paginated results
results = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_get_query_results",
    tool_input={"job_id": job_id, "location": "US", "max_results": 1000}
)
// 1. Submit the async job
const job = await actions.executeTool({
  identifier: "user_bq_123",
  toolName: "bigqueryserviceaccount_insert_query_job",
  toolInput: {
    query: "SELECT user_id, SUM(amount) AS total FROM `myproject.billing.transactions` GROUP BY user_id",
    location: "US",
    priority: "INTERACTIVE"
  }
});
const jobId = job.jobReference.jobId;

// 2. Poll until DONE
const status = await actions.executeTool({
  identifier: "user_bq_123",
  toolName: "bigqueryserviceaccount_get_job",
  toolInput: { job_id: jobId, location: "US" }
});
// status.status.state === "DONE" when complete

// 3. Retrieve paginated results
const results = await actions.executeTool({
  identifier: "user_bq_123",
  toolName: "bigqueryserviceaccount_get_query_results",
  toolInput: { job_id: jobId, location: "US", max_results: 1000 }
});
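The snippets above poll once for brevity. In practice you poll in a loop with a cap; here is a sketch in Python (the interval and attempt count are arbitrary choices, not Scalekit or BigQuery requirements).

import time

# Sketch: poll the async job until it reaches DONE, with a simple cap.
for attempt in range(30):  # arbitrary cap; tune for your workloads
    status = actions.execute_tool(
        identifier="user_bq_123",
        tool_name="bigqueryserviceaccount_get_job",
        tool_input={"job_id": job_id, "location": "US"},
    )
    if status["status"]["state"] == "DONE":
        # A DONE job may still have failed; check errorResult before
        # fetching results.
        break
    time.sleep(2)  # fixed 2s interval; exponential backoff also works
else:
    raise TimeoutError(f"BigQuery job {job_id} did not finish in time")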

Explore dataset schema

List all datasets in a project, then inspect the tables and schema in any dataset. Useful for agents that need to discover available data before writing queries.

# List all datasets
datasets = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_list_datasets",
    tool_input={"max_results": 50}
)

# List tables in a dataset
tables = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_list_tables",
    tool_input={"dataset_id": "sales", "max_results": 100}
)

# Get schema for a specific table
schema = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_get_table",
    tool_input={"dataset_id": "sales", "table_id": "accounts"}
)
# Returns: column names, types, descriptions, and table properties
// List all datasets
const datasets = await actions.executeTool({
  identifier: "user_bq_123",
  toolName: "bigqueryserviceaccount_list_datasets",
  toolInput: { max_results: 50 }
});

// List tables in a dataset
const tables = await actions.executeTool({
  identifier: "user_bq_123",
  toolName: "bigqueryserviceaccount_list_tables",
  toolInput: { dataset_id: "sales", max_results: 100 }
});

// Get schema for a specific table
const schema = await actions.executeTool({
  identifier: "user_bq_123",
  toolName: "bigqueryserviceaccount_get_table",
  toolInput: { dataset_id: "sales", table_id: "accounts" }
});
// Returns: column names, types, descriptions, and table properties
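These tools compose naturally into a discovery pass before query writing. Below is a Python sketch that summarizes one dataset for an agent prompt; it assumes the tool responses mirror BigQuery's raw shapes (tables[].tableReference.tableId, schema.fields[]), so adjust the key paths to what you actually receive.

# Sketch: summarize one dataset's tables and columns for an agent prompt.
# Assumes raw BigQuery response shapes; adjust key paths as needed.

def describe_dataset(dataset_id: str) -> str:
    tables = actions.execute_tool(
        identifier="user_bq_123",
        tool_name="bigqueryserviceaccount_list_tables",
        tool_input={"dataset_id": dataset_id, "max_results": 100},
    )
    lines = []
    for t in tables.get("tables", []):
        table_id = t["tableReference"]["tableId"]
        detail = actions.execute_tool(
            identifier="user_bq_123",
            tool_name="bigqueryserviceaccount_get_table",
            tool_input={"dataset_id": dataset_id, "table_id": table_id},
        )
        cols = ", ".join(
            f"{f['name']}:{f['type']}" for f in detail["schema"]["fields"]
        )
        lines.append(f"{dataset_id}.{table_id}({cols})")
    return "\n".join(lines)  # drop into the agent's system prompt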
Framework wiring

5 Wiring into your agent framework

Scalekit integrates directly with LangChain. The agent decides what to query; Scalekit handles credential routing on every invocation. No credential plumbing in your agent logic.

from langchain_anthropic import ChatAnthropic
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from scalekit.langchain import get_tools

bq_tools = get_tools(
    connection_name="bigqueryserviceaccount",
    identifier="user_bq_123"
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a data assistant. Use the available tools to query BigQuery datasets, explore schemas, and retrieve results on behalf of the user."),
    MessagesPlaceholder("chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

agent = create_tool_calling_agent(ChatAnthropic(model="claude-sonnet-4-6"), bq_tools, prompt)
result = AgentExecutor(agent=agent, tools=bq_tools).invoke({
    "input": "Show me the top 10 customers by revenue this quarter from the sales dataset"
})
import { ChatAnthropic } from "@langchain/anthropic";
import { AgentExecutor, createToolCallingAgent } from "langchain/agents";
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";
import { getTools } from "@scalekit-sdk/langchain";

const bqTools = getTools({
  connectionName: "bigqueryserviceaccount",
  identifier: "user_bq_123"
});

const prompt = ChatPromptTemplate.fromMessages([
  ["system", "You are a data assistant. Use the available tools to query BigQuery datasets, explore schemas, and retrieve results on behalf of the user."],
  new MessagesPlaceholder("chat_history", true),
  ["human", "{input}"],
  new MessagesPlaceholder("agent_scratchpad"),
]);

const agent = await createToolCallingAgent({
  llm: new ChatAnthropic({ model: "claude-sonnet-4-6" }),
  tools: bqTools,
  prompt
});
const result = await AgentExecutor.fromAgentAndTools({ agent, tools: bqTools }).invoke({
  input: "Show me the top 10 customers by revenue this quarter from the sales dataset"
});
Other frameworks supported
Tool reference

All 16 BigQuery (Service Account) tools

Grouped by capability. Your agent calls tools by name — no API wrappers to write.

Query Execution
bigqueryserviceaccount_run_query
Execute a SQL query synchronously and return results immediately. Best for short-running queries. Use Insert Query Job for long-running workloads.
bigqueryserviceaccount_insert_query_job
Submit an asynchronous BigQuery query job. Returns a job ID to poll with Get Job or Get Query Results. Supports destination table, write disposition, and byte billing limits.
bigqueryserviceaccount_dry_run_query
Validate a SQL query and estimate its byte cost without executing. Returns totalBytesProcessed so agents can gate expensive queries before submission.
Jobs
bigqueryserviceaccount_get_job
Retrieve the status and configuration of a BigQuery job by ID. Use to poll for completion of async query jobs.
bigqueryserviceaccount_list_jobs
List BigQuery jobs in the project. Filter by state (done, pending, running) and control field projection. Supports pagination.
bigqueryserviceaccount_cancel_job
Request cancellation of a running BigQuery job. Cancellation is best-effort — the job may complete before it can be stopped.
bigqueryserviceaccount_get_query_results
Retrieve results of a completed query job with pagination support. Use after polling Get Job until status is DONE.
Datasets
bigqueryserviceaccount_list_datasets
List all BigQuery datasets in the project. Supports label filtering and pagination. Pass all: true to include hidden datasets.
bigqueryserviceaccount_get_dataset
Retrieve metadata for a specific dataset including location, description, labels, access controls, and creation/modification times.
Tables
bigqueryserviceaccount_list_tables
List all tables and views in a BigQuery dataset with pagination support.
bigqueryserviceaccount_get_table
Retrieve full schema and metadata for a table or view — column names, types, descriptions, and table properties.
bigqueryserviceaccount_list_table_data
Read rows directly from a table without SQL. Supports pagination, row offset, and field selection. Useful for sampling known datasets.
ML Models & Routines
bigqueryserviceaccount_list_models
List all BigQuery ML models in a dataset including model type, training status, and creation time.
bigqueryserviceaccount_get_model
Retrieve metadata for a specific BigQuery ML model — model type, feature columns, label columns, and training run details.
bigqueryserviceaccount_list_routines
List all stored procedures and UDFs in a dataset. Supports type filtering and pagination.
bigqueryserviceaccount_get_routine
Retrieve the definition and metadata of a stored procedure or UDF — arguments, return type, and body.
Connector notes

BigQuery (Service Account)-specific behavior

Service account keys don't expire — but they do get deleted or rotated
Unlike OAuth tokens, GCP service account keys have no automatic expiry. But customers can delete or rotate keys at any time in Google Cloud Console, and there is no revocation webhook. When this happens, tool calls will return 401 errors and Scalekit will move the connected account to REVOKED. Surface a re-credentialing prompt rather than returning a generic error to the end user.
Location must match where the data lives
BigQuery datasets and jobs are location-scoped. If a dataset is in EU and your query job is submitted to US, BigQuery returns a location mismatch error. Pass the correct location parameter — matching the dataset's region — on all job submission and query calls. Retrieve the dataset's location with bigqueryserviceaccount_get_dataset if unknown.
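A sketch of the lookup-then-query pattern, assuming the dataset metadata includes BigQuery's standard location field:

# Sketch: resolve a dataset's region first, then submit the query there.
dataset = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_get_dataset",
    tool_input={"dataset_id": "analytics"},
)
location = dataset["location"]  # e.g. "EU"

result = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_run_query",
    tool_input={
        "query": "SELECT COUNT(*) FROM `myproject.analytics.events`",
        "location": location,  # must match where the dataset lives
    },
)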
Use run_query for short queries; insert_query_job for large workloads
bigqueryserviceaccount_run_query is synchronous and returns results inline — it times out for queries running longer than the timeout_ms limit. For large aggregations, full-table scans, or queries writing to destination tables, use bigqueryserviceaccount_insert_query_job and poll for completion with bigqueryserviceaccount_get_job.
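One way to make that choice at runtime is to dry-run first and branch on the byte estimate. A sketch; the 1 GB threshold is an arbitrary example, not a BigQuery limit.

# Sketch: dry-run first, then pick sync vs. async based on byte estimate.
SQL = "SELECT user_id, SUM(amount) FROM `myproject.billing.transactions` GROUP BY user_id"

dry = actions.execute_tool(
    identifier="user_bq_123",
    tool_name="bigqueryserviceaccount_dry_run_query",
    tool_input={"query": SQL, "location": "US"},
)
estimated_bytes = int(dry["statistics"]["totalBytesProcessed"])

if estimated_bytes < 1_000_000_000:  # small: run synchronously
    result = actions.execute_tool(
        identifier="user_bq_123",
        tool_name="bigqueryserviceaccount_run_query",
        tool_input={"query": SQL, "location": "US"},
    )
else:  # large: submit an async job and poll (see step 4)
    job = actions.execute_tool(
        identifier="user_bq_123",
        tool_name="bigqueryserviceaccount_insert_query_job",
        tool_input={"query": SQL, "location": "US"},
    )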
Infrastructure decision

Why not build this yourself

The BigQuery REST API is documented. Credential storage isn't technically hard. But here's what you're actually signing up for:

PROBLEM 01
Per-customer GCP project routing — every API call must embed the correct project ID in the base URL, with the correct service account credentials, with no shared endpoint
PROBLEM 02
Service account keys don't expire but do get rotated or deleted — you need revocation detection and re-credentialing flows without any OAuth lifecycle hooks
PROBLEM 03
Per-user credential isolation in a multi-tenant system — one customer's service account key must never be used to query another customer's project
PROBLEM 04
Encrypted credential storage, account status tracking, and a UI for collecting service account JSON at customer onboarding — all before you've run a single query

That's one connector. Your agent product will eventually need Salesforce, Zendesk, Snowflake, GitHub, and whatever else your customers ask for. Each has its own auth quirks and failure modes.

Scalekit maintains every connector. You maintain none of them.

Ready to ship

Query BigQuery in minutes

Free to start. Service account routing fully handled.