MCP Proxy Server vs MCP Gateway: Complete Comparison
When you move beyond a single MCP server running on localhost, you immediately face an infrastructure decision that most tutorials skip entirely: do you need an MCP Proxy Server, an MCP Gateway, or both?
Get this wrong and you'll either under-engineer your deployment (no auth, no observability, no rate limiting) or over-engineer it (a full gateway for a two-server setup that needed a five-line proxy config). This guide gives you the mental model, architecture diagrams, and decision framework to get it right the first time.
What Is an MCP Proxy Server?
An MCP Proxy Server is a transparent intermediary that sits between an MCP client and one or more upstream MCP servers. Its primary job is protocol translation and message forwarding, not policy enforcement.
Think of it as a smart pipe. It speaks MCP on both ends, but may translate between transports (stdio ↔ HTTP/SSE, HTTP ↔ WebSocket), normalize message formats, or fan out a single client connection to multiple upstream servers and merge their tool registries.
What an MCP Proxy Server typically does:
- Transport bridging (stdio → SSE, SSE → WebSocket)
- Tool namespace aggregation from multiple upstream servers
- Transparent message forwarding with minimal transformation
- Basic connection pooling to upstream servers
- Protocol version normalization
What it typically does NOT do:
- Validate authentication tokens
- Enforce authorization policies
- Rate limit individual clients
- Route based on tool name semantics
- Provide centralized observability
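Tool namespace aggregation can be sketched in a few lines. The snippet below is a minimal illustration, not any particular proxy's implementation: the `ToolInfo` shape, the `mergeRegistries` helper, and the `upstream__tool` naming scheme are all assumptions made for the example. It prefixes each tool with its upstream's name so that two servers exposing the same tool name do not collide, and it keeps a reverse map for forwarding.

```typescript
// Hypothetical minimal tool description; real MCP tools also carry an input schema.
interface ToolInfo {
  name: string;
  description?: string;
}

interface MergedRegistry {
  tools: ToolInfo[]; // what the proxy would return from tools/list
  owners: Map<string, { upstream: string; originalName: string }>;
}

// Merge per-upstream tool lists, prefixing names with "<upstream>__" (assumed separator).
function mergeRegistries(upstreams: Record<string, ToolInfo[]>): MergedRegistry {
  const tools: ToolInfo[] = [];
  const owners = new Map<string, { upstream: string; originalName: string }>();
  for (const upstream of Object.keys(upstreams)) {
    for (const tool of upstreams[upstream]) {
      const exposedName = `${upstream}__${tool.name}`;
      tools.push({ ...tool, name: exposedName });
      owners.set(exposedName, { upstream, originalName: tool.name });
    }
  }
  return { tools, owners };
}

// Both upstreams expose a tool called "search"; the prefixes keep them apart.
const registry = mergeRegistries({
  files: [{ name: "search" }, { name: "read_file" }],
  database: [{ name: "search" }, { name: "run_query" }],
});
```

When forwarding a call, the proxy would look up the exposed name in `owners` and send `originalName` to the owning upstream.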
Proxy Architecture
┌─────────────────────────────────────────────────────────────┐
│                      MCP CLIENT LAYER                       │
│             (Claude Desktop, Cursor, AI Agent)              │
└────────────────────────┬────────────────────────────────────┘
                         │ stdio / SSE / WebSocket
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                      MCP PROXY SERVER                       │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │  Transport   │  │    Tool      │  │    Message       │   │
│  │  Adapter     │  │  Registry    │  │   Forwarder      │   │
│  │ (stdio→SSE)  │  │  (merged)    │  │  (passthrough)   │   │
│  └──────────────┘  └──────────────┘  └──────────────────┘   │
└────────┬──────────────────┬───────────────────┬─────────────┘
         │                  │                   │
         ▼                  ▼                   ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐
│  MCP Server  │  │  MCP Server  │  │      MCP Server      │
│   (Files)    │  │  (Database)  │  │     (Web Search)     │
└──────────────┘  └──────────────┘  └──────────────────────┘
The proxy exposes a **single unified MCP endpoint** to the client. The client calls `tools/list` and gets back a merged registry of every tool across all upstream servers. When the client calls `tools/call` for a specific tool, the proxy knows which upstream owns that tool and forwards accordingly.
### A Minimal MCP Proxy in Node.js
Here's a functional proxy that aggregates two upstream MCP servers:
```typescript
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
  type Tool,
} from "@modelcontextprotocol/sdk/types.js";

const UPSTREAM_SERVERS = [
  { name: "files", url: "http://localhost:3001/sse" },
  { name: "database", url: "http://localhost:3002/sse" },
];

async function createProxy() {
  // Connect to all upstream servers and record which tools each one owns
  const clients: Array<{ name: string; client: Client; tools: string[] }> = [];
  const allTools: Tool[] = [];

  for (const upstream of UPSTREAM_SERVERS) {
    const client = new Client(
      { name: `proxy-${upstream.name}`, version: "1.0.0" },
      { capabilities: {} }
    );
    await client.connect(new SSEClientTransport(new URL(upstream.url)));
    const toolList = await client.listTools();
    const toolNames = toolList.tools.map((t) => t.name);
    clients.push({ name: upstream.name, client, tools: toolNames });
    allTools.push(...toolList.tools);
  }

  // Expose aggregated interface
  const server = new Server(
    { name: "mcp-proxy", version: "1.0.0" },
    { capabilities: { tools: {} } }
  );

  // The SDK registers handlers by request schema, not by method-name string
  server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: allTools }));

  server.setRequestHandler(CallToolRequestSchema, async (request) => {
    const toolName = request.params.name;
    // First upstream that advertised the tool wins; duplicate names are not disambiguated
    const owner = clients.find((c) => c.tools.includes(toolName));
    if (!owner) throw new Error(`Unknown tool: ${toolName}`);
    return owner.client.callTool(request.params);
  });

  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("MCP Proxy running on stdio");
}

createProxy().catch(console.error);
```
This is production-viable for simple multi-server aggregation. Notice what's missing: no auth, no rate limiting, no audit log. For a local developer setup, that's fine. For a team of 20 engineers sharing the same MCP infrastructure, it's a security gap.
What Is an MCP Gateway?
An MCP Gateway is a control plane for MCP traffic. It's what you build when a proxy is no longer sufficient — when you need policy enforcement, identity-aware routing, observability, and enterprise-grade reliability.
If an MCP Proxy is a smart pipe, an MCP Gateway is a traffic cop, auditor, and load balancer combined.
What an MCP Gateway does:
- Validates authentication (OAuth 2.0, API keys, mTLS)
- Enforces authorization per tool, per client, per team
- Rate limits at multiple granularities
- Routes tool calls to the correct upstream cluster based on tool name
- Aggregates logs and metrics from all MCP traffic
- Manages upstream server health and circuit breaking
- Enforces TLS everywhere
- Provides a developer portal or admin API for managing upstream registrations
Gateway Architecture
┌─────────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
│ Claude Desktop │ Cursor │ AI Agents │ CI Pipelines │
└────────┬─────────────┬────────────┬──────────────┬──────────────┘
         │             │            │              │
         └─────────────┴────────────┴──────────────┘
                              │ HTTPS/SSE or WebSocket
                              │ Bearer Token / mTLS
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│ MCP GATEWAY │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ Auth & │ │ Rate │ │ Tool Router │ │
│ │ AuthZ │ │ Limiter │ │ (name → upstream) │ │
│ │ (OAuth/ │ │ (per client │ │ │ │
│ │ API Keys) │ │ per tool) │ │ │ │
│ └─────────────┘ └──────────────┘ └────────────────────────┘ │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ Audit Log │ │ Circuit │ │ Load Balancer │ │
│ │ & Metrics │ │ Breaker │ │ (upstream pool) │ │
│ └─────────────┘ └──────────────┘ └────────────────────────┘ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Upstream Registry ││
│ │ (server configs, health endpoints, capability cache) ││
│ └─────────────────────────────────────────────────────────────┘│
└───────────┬──────────────┬───────────────┬────────────┬─────────┘
            │              │               │            │
            ▼              ▼               ▼            ▼
       ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
       │   MCP    │   │   MCP    │   │   MCP    │   │   MCP    │
       │  Server  │   │  Server  │   │ Cluster  │   │  Server  │
       │ (Files)  │   │   (DB)   │   │ (Exec×3) │   │ (Search) │
       └──────────┘   └──────────┘   └──────────┘   └──────────┘
The key architectural distinction: the Gateway owns the policy layer. Upstream MCP servers don't need to know anything about authentication or rate limiting — they just receive validated, authorized tool call requests from the Gateway.
Side-by-Side Comparison
| Dimension | MCP Proxy Server | MCP Gateway |
|---|---|---|
| Primary function | Transport translation, tool aggregation | Policy enforcement, traffic control |
| Authentication | Pass-through (rarely validates) | Full validation (OAuth, API keys, mTLS) |
| Authorization | None | Per-tool, per-client, per-team ACLs |
| Rate limiting | None or basic | Multi-dimensional (client/tool/global) |
| Tool routing | Static (by ownership at startup) | Dynamic (name-based, weighted, canary) |
| Load balancing | Minimal or none | Full (round-robin, least-conn, weighted) |
| Circuit breaking | Rarely | Yes |
| Observability | Basic logging | Metrics, traces, audit logs, dashboards |
| TLS termination | Optional | Required |
| Upstream health checks | No | Yes (active + passive) |
| Configuration | Code or config file | Admin API / control plane |
| Deployment complexity | Low | Medium–High |
| Latency overhead | 0.1–1ms | 1–10ms |
| Best for | Developer local setups, simple aggregation | Production multi-tenant, enterprise |
| Horizontal scalability | Limited | Designed for it |
| Spec compliance validation | No | Optional (integrate with MCPForge Verify) |
Deep Dive: Authentication and Authorization
How Proxies Handle Auth
Most MCP Proxy implementations treat authentication as an upstream concern. If Client A sends a Bearer token in an HTTP header, the proxy forwards that header unchanged to the upstream MCP server, which is responsible for validation.
This works, but it creates a distributed auth problem: every upstream server must independently validate tokens. If you have 8 upstream MCP servers, you have 8 places where auth logic can diverge, token validation libraries can be outdated, or JWT secrets can be misconfigured.
Client → [Bearer token] → Proxy → [Bearer token forwarded] → Server A
                                → [Bearer token forwarded] → Server B
                                → [Bearer token forwarded] → Server C
Each server validates independently. Rotating a secret requires touching every server.
How Gateways Handle Auth
An MCP Gateway validates auth once, at the edge, then passes a verified identity context to upstream servers — typically as an internal header or a short-lived internal token that upstream servers trust unconditionally from the Gateway.
Client → [Bearer token] → Gateway validates → Internal JWT with identity claims
                                            → Server A (trusts Gateway internal token)
                                            → Server B (trusts Gateway internal token)
                                            → Server C (trusts Gateway internal token)
Key benefits:
- Token validation logic lives in one place
- Secret rotation is a Gateway-only operation
- Upstream servers can be internal-only (not internet-exposed)
- Identity context (user ID, team, scopes) propagates to all upstreams automatically
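OAuth Flow for MCP Gateway
The MCP specification's authorization framework for HTTP-based transports builds on OAuth 2.1, the consolidated successor to OAuth 2.0. A production Gateway should implement token validation at the edge plus tool-level authorization, along these lines: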
// Gateway auth middleware (Express example)
import { expressjwt } from "express-jwt";
import jwksRsa from "jwks-rsa";

const validateToken = expressjwt({
  secret: jwksRsa.expressJwtSecret({
    cache: true,
    rateLimit: true,
    jwksUri: `${process.env.AUTH_DOMAIN}/.well-known/jwks.json`,
  }),
  audience: process.env.MCP_API_AUDIENCE,
  issuer: process.env.AUTH_DOMAIN,
  algorithms: ["RS256"],
});

// Tool-level authorization
function authorizeToolCall(
  userScopes: string[],
  toolName: string,
  toolPermissions: Record<string, string[]>
): boolean {
  const required = toolPermissions[toolName] ?? ["mcp:tools:read"];
  return required.every((scope) => userScopes.includes(scope));
}

// Example tool permission map
const toolPermissions: Record<string, string[]> = {
  read_file: ["mcp:files:read"],
  write_file: ["mcp:files:write"],
  execute_code: ["mcp:execution:run", "mcp:execution:admin"],
  list_databases: ["mcp:db:read"],
  run_query: ["mcp:db:read"],
  drop_table: ["mcp:db:admin"], // Requires elevated scope
};
This authorization model is tool-aware — not just endpoint-aware. It's one of the most important capabilities a Gateway provides that a proxy cannot replicate without becoming a gateway itself.
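To make the tool-aware check concrete, the sketch below goes one step beyond a boolean and reports which scopes a caller is missing, which is useful for building an actionable "forbidden" response. The `missingScopes` helper, the sample scope map, and the fallback scope are illustrative assumptions rather than part of any MCP SDK.

```typescript
// Illustrative permission map; scope names follow the pattern used above.
const samplePermissions: Record<string, string[]> = {
  read_file: ["mcp:files:read"],
  execute_code: ["mcp:execution:run", "mcp:execution:admin"],
  drop_table: ["mcp:db:admin"],
};

// Hypothetical helper: returns the scopes the caller still lacks for a tool.
// An empty array means the call is authorized.
function missingScopes(
  userScopes: string[],
  toolName: string,
  toolPermissions: Record<string, string[]>,
  defaultScopes: string[] = ["mcp:tools:read"]
): string[] {
  // Tools without an explicit entry fall back to the default scope list
  const required = toolPermissions[toolName] ?? defaultScopes;
  const granted = new Set(userScopes);
  return required.filter((scope) => !granted.has(scope));
}

const analystScopes = ["mcp:files:read", "mcp:execution:run"];
const readCheck = missingScopes(analystScopes, "read_file", samplePermissions);
const execCheck = missingScopes(analystScopes, "execute_code", samplePermissions);
const unlistedCheck = missingScopes(analystScopes, "unlisted_tool", samplePermissions);
```

Here `readCheck` is empty, `execCheck` names the admin scope the caller lacks, and `unlistedCheck` shows the fallback in action. Whether unlisted tools should fall back to a default scope or be denied outright is a policy decision for the Gateway operator.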
Tool Routing: Static vs Dynamic
Proxy Tool Routing
In a proxy, routing is ownership-based: at startup, the proxy queries each upstream server's tools/list, builds a map of tool name → upstream server, and uses that map for all subsequent tools/call routing. This map is typically static for the lifetime of the proxy process.
Problem: If a new tool is registered on an upstream server after proxy startup, the proxy won't know about it until it restarts or re-queries. This is a real operational issue in dynamic environments where MCP servers are deployed frequently.
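One way to soften the stale-map problem is to rebuild the ownership map on a timer or on reconnect rather than only at startup. The sketch below assumes a hypothetical `ToolOwnershipMap` class and takes plain arrays of tool names as input, so it runs without the MCP SDK; in a real proxy the names would come from each upstream's tools/list response.

```typescript
// Hypothetical helper that tracks which upstream owns which tool name.
class ToolOwnershipMap {
  private owners = new Map<string, string>(); // tool name -> upstream name

  // Replace everything known about one upstream with a fresh tool list.
  // Returns the names that were added and removed, e.g. for logging.
  refresh(upstream: string, toolNames: string[]): { added: string[]; removed: string[] } {
    const fresh = new Set(toolNames);
    const removed: string[] = [];
    this.owners.forEach((owner, tool) => {
      if (owner === upstream && !fresh.has(tool)) {
        this.owners.delete(tool);
        removed.push(tool);
      }
    });
    const added: string[] = [];
    fresh.forEach((tool) => {
      // A name already claimed by another upstream is left alone (first owner wins)
      if (!this.owners.has(tool)) {
        this.owners.set(tool, upstream);
        added.push(tool);
      }
    });
    return { added, removed };
  }

  ownerOf(tool: string): string | undefined {
    return this.owners.get(tool);
  }
}

const ownership = new ToolOwnershipMap();
ownership.refresh("files", ["read_file", "write_file"]);
// A later poll: the files server dropped write_file and gained list_dir
const delta = ownership.refresh("files", ["read_file", "list_dir"]);
```

Calling `refresh` from a `setInterval` loop, or from a reconnect handler, keeps the map current without restarting the proxy process.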
Gateway Tool Routing
A Gateway implements dynamic tool routing with several routing strategies:
1. Name-based routing (most common)
# Gateway routing config
routes:
  - match:
      tool_prefix: "file_"
    upstream: files-cluster
  - match:
      tool_prefix: "db_"
    upstream: database-cluster
  - match:
      tool_name: "execute_python"
    upstream: execution-sandbox
  - match:
      default: true
    upstream: general-cluster
2. Weighted routing (canary deployments)
routes:
  - match:
      tool_name: "search_web"
    upstreams:
      - target: search-v1
        weight: 90
      - target: search-v2
        weight: 10  # Canary: 10% to new version
3. Header-based routing (A/B testing)
routes:
  - match:
      header:
        x-mcp-environment: "staging"
    upstream: staging-cluster
  - match:
      default: true
    upstream: production-cluster
The Gateway polls upstream servers' tools/list endpoints on a configurable interval and updates its routing table without downtime. New tools appear in the aggregated registry within one polling cycle.
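The name-based and weighted strategies above reduce to a small matching function. The sketch below is one possible shape, not a real gateway's config schema: the `Route` type, `resolveUpstream`, and `pickWeighted` are hypothetical names, and the random source is injectable so the weighted choice can be tested deterministically. Header-based matching is omitted for brevity.

```typescript
interface WeightedTarget {
  target: string;
  weight: number;
}

interface Route {
  toolName?: string;   // exact match
  toolPrefix?: string; // prefix match
  isDefault?: boolean; // catch-all
  upstreams: WeightedTarget[];
}

// Pick a target in proportion to its weight. `rand` must return a value in [0, 1).
function pickWeighted(targets: WeightedTarget[], rand: () => number = Math.random): string {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  if (total <= 0) throw new Error("Route has no positive weights");
  let threshold = rand() * total;
  for (const t of targets) {
    threshold -= t.weight;
    if (threshold < 0) return t.target;
  }
  return targets[targets.length - 1].target; // guard against floating-point drift
}

// First matching route wins, so order routes from most to least specific.
function resolveUpstream(routes: Route[], tool: string, rand?: () => number): string {
  for (const route of routes) {
    const matches =
      route.toolName === tool ||
      (route.toolPrefix !== undefined && tool.startsWith(route.toolPrefix)) ||
      route.isDefault === true;
    if (matches) return pickWeighted(route.upstreams, rand);
  }
  throw new Error(`No route for tool: ${tool}`);
}

const routeTable: Route[] = [
  { toolPrefix: "file_", upstreams: [{ target: "files-cluster", weight: 1 }] },
  {
    toolName: "search_web",
    upstreams: [
      { target: "search-v1", weight: 90 },
      { target: "search-v2", weight: 10 },
    ],
  },
  { isDefault: true, upstreams: [{ target: "general-cluster", weight: 1 }] },
];
```

With this table, `file_read` resolves to `files-cluster`, an unmatched tool falls through to `general-cluster`, and `search_web` lands on `search-v2` for roughly one call in ten.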
Rate Limiting in MCP Infrastructure
Rate limiting MCP traffic is more nuanced than rate limiting REST APIs because the limiter has to be protocol-aware: you want to rate limit at the tool-call level, not just the connection level.
Proxy Rate Limiting
Most proxy implementations do not implement rate limiting. If they do, it's a blunt connection-level limit:
Max 100 requests/minute per connection IP
This is inadequate for production because it doesn't distinguish between:
- A client calling list_files 1000 times (cheap)
- A client calling execute_code 1000 times (expensive, potentially dangerous)
Gateway Rate Limiting
A production MCP Gateway implements multi-dimensional rate limiting:
interface RateLimitConfig {
  // Per authenticated client
  perClient: {
    requestsPerMinute: number;
    burstSize: number;
  };
  // Per tool name
  perTool: Record<string, {
    requestsPerMinute: number;
    requestsPerDay?: number;
  }>;
  // Per upstream server (protect upstream capacity)
  perUpstream: Record<string, {
    maxConcurrent: number;
    requestsPerSecond: number;
  }>;
  // Global fallback
  global: {
    requestsPerSecond: number;
  };
}

const config: RateLimitConfig = {
  perClient: {
    requestsPerMinute: 600,
    burstSize: 50,
  },
  perTool: {
    execute_code: { requestsPerMinute: 30, requestsPerDay: 500 },
    search_web: { requestsPerMinute: 60 },
    read_file: { requestsPerMinute: 600 },
  },
  perUpstream: {
    "execution-sandbox": { maxConcurrent: 10, requestsPerSecond: 5 },
    "database-cluster": { maxConcurrent: 50, requestsPerSecond: 100 },
  },
  global: {
    requestsPerSecond: 500,
  },
};
Rate limit violations return a structured MCP error response:
{
  "jsonrpc": "2.0",
  "id": "req_123",
  "error": {
    "code": -32029,
    "message": "Rate limit exceeded",
    "data": {
      "limit_type": "per_tool",
      "tool": "execute_code",
      "retry_after": 45,
      "limit": 30,
      "window": "60s"
    }
  }
}
This response follows JSON-RPC 2.0 error format and gives clients actionable retry information — a detail that most custom proxy implementations forget entirely.
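A per-client, per-tool limit like the one configured above is often implemented as a token bucket. The sketch below is a minimal in-memory version under stated assumptions: `TokenBucket` and `checkToolCall` are hypothetical names, time is passed in explicitly so the behavior is deterministic, and a Gateway running multiple instances would need a shared store instead of process memory.

```typescript
// Token bucket: refills continuously at `ratePerMinute`, holds at most `burst` tokens.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private ratePerMinute: number, private burst: number, nowMs: number) {
    this.tokens = burst;
    this.lastRefillMs = nowMs;
  }

  // Returns 0 if the call is allowed, otherwise the seconds until one token is available.
  tryTake(nowMs: number): number {
    const elapsedMin = (nowMs - this.lastRefillMs) / 60000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedMin * this.ratePerMinute);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - this.tokens) / this.ratePerMinute) * 60);
  }
}

const buckets = new Map<string, TokenBucket>();

// Hypothetical check keyed by client and tool. Returns null when the call is allowed,
// or a JSON-RPC error object shaped like the response above when it is limited.
function checkToolCall(clientId: string, tool: string, limitPerMinute: number, nowMs: number) {
  const key = `${clientId}:${tool}`;
  let bucket = buckets.get(key);
  if (!bucket) {
    bucket = new TokenBucket(limitPerMinute, limitPerMinute, nowMs);
    buckets.set(key, bucket);
  }
  const retryAfter = bucket.tryTake(nowMs);
  if (retryAfter === 0) return null;
  return {
    code: -32029,
    message: "Rate limit exceeded",
    data: {
      limit_type: "per_tool",
      tool,
      retry_after: retryAfter,
      limit: limitPerMinute,
      window: "60s",
    },
  };
}
```

Because the bucket refills continuously, `retry_after` reflects the time until the next single token rather than the end of a fixed window, which tends to spread retries out instead of bunching them at window boundaries.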
Observability: The Biggest Gap Between Proxies and Gateways
If you're running MCP in production and you can't answer these questions in under 30 seconds, you don't have enough observability:
- Which tool is being called most frequently right now?
- Which client is generating the most load?
- What's the p99 latency of execute_code today?
- Did any upstream server fail in the last hour?
- Which tool calls returned errors and why?
A proxy gives you log lines. A Gateway gives you answers.
Metrics to Collect
// Gateway metrics (Prometheus format), using the prom-client package
import { Counter, Gauge, Histogram } from 'prom-client';

const metrics = {
  // Counter: total tool calls
  mcp_tool_calls_total: new Counter({
    name: 'mcp_tool_calls_total',
    help: 'Total MCP tool calls',
    labelNames: ['tool_name', 'upstream', 'client_id', 'status'],
  }),
  // Histogram: tool call duration
  mcp_tool_call_duration_seconds: new Histogram({
    name: 'mcp_tool_call_duration_seconds',
    help: 'MCP tool call duration in seconds',
    labelNames: ['tool_name', 'upstream'],
    buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 30, 120],
  }),
  // Gauge: active connections
  mcp_active_connections: new Gauge({
    name: 'mcp_active_connections',
    help: 'Current active MCP client connections',
    labelNames: ['client_type'],
  }),
  // Counter: auth failures
  mcp_auth_failures_total: new Counter({
    name: 'mcp_auth_failures_total',
    help: 'Authentication failures',
    labelNames: ['reason', 'client_ip'],
  }),
  // Gauge: circuit breaker state
  mcp_circuit_breaker_state: new Gauge({
    name: 'mcp_circuit_breaker_state',
    help: 'Circuit breaker state (0=closed, 1=open, 2=half-open)',
    labelNames: ['upstream'],
  }),
};
Structured Audit Logging
For compliance and debugging, every tool call should produce a structured audit log entry:
{
  "timestamp": "2025-01-28T14:32:10.421Z",
  "event": "tool_call",
  "request_id": "req_7f3a9b2c",
  "client": {
    "id": "client_abc123",
    "type": "claude_desktop",
    "ip": "10.0.1.45",
    "user_id": "user_456",
    "team_id": "team_engineering"
  },
  "tool": {
    "name": "execute_code",
    "upstream": "execution-sandbox",
    "upstream_instance": "sandbox-pod-3"
  },
  "duration_ms": 1847,
  "status": "success",
  "tokens_used": null,
  "rate_limit_remaining": 27
}
This structured format is queryable in any log aggregation system (Datadog, Grafana Loki, Elastic, CloudWatch) and supports compliance requirements like SOC 2 audit trails.
Load Balancing in MCP Gateway Deployments
When a single MCP server can't handle your tool call volume, you need to run multiple instances and load balance across them. This is a Gateway capability — proxies don't have the upstream pool management to do this reliably.
Health Check Configuration
upstreams:
  execution-sandbox:
    instances:
      - url: http://sandbox-1:8080
      - url: http://sandbox-2:8080
      - url: http://sandbox-3:8080
    health_check:
      path: /health
      interval: 10s
      timeout: 3s
      healthy_threshold: 2
      unhealthy_threshold: 3
    load_balancing:
      algorithm: least_connections  # Route to instance with fewest active calls
    circuit_breaker:
      enabled: true
      failure_threshold: 5   # Open after 5 failures
      success_threshold: 2   # Close after 2 successes in half-open
      timeout: 30s           # Stay open for 30s before half-open
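The circuit_breaker settings above describe a three-state machine: closed, open, and half-open. A minimal sketch follows, assuming a hypothetical `CircuitBreaker` class, an explicit clock argument so it can be exercised without real timers, and the interpretation that `failure_threshold` counts consecutive failures.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private successes = 0;
  private openedAtMs = 0;

  constructor(
    private failureThreshold = 5, // open after this many consecutive failures
    private successThreshold = 2, // close after this many successes in half-open
    private openTimeoutMs = 30000 // stay open this long before probing again
  ) {}

  // Should a request be sent to the upstream right now?
  allowRequest(nowMs: number): boolean {
    if (this.state === "open" && nowMs - this.openedAtMs >= this.openTimeoutMs) {
      this.state = "half-open";
      this.successes = 0;
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    if (this.state === "half-open") {
      this.successes++;
      if (this.successes >= this.successThreshold) this.reset();
    } else {
      this.failures = 0; // a success while closed breaks the failure streak
    }
  }

  recordFailure(nowMs: number): void {
    if (this.state === "open") return; // late failures do not extend the open period
    this.failures++;
    // Any failure while probing, or too many in a row while closed, opens the circuit
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAtMs = nowMs;
    }
  }

  getState(): BreakerState {
    return this.state;
  }

  private reset(): void {
    this.state = "closed";
    this.failures = 0;
    this.successes = 0;
  }
}

const breaker = new CircuitBreaker(5, 2, 30000);
```

A Gateway would typically keep one breaker per upstream instance, call `allowRequest` before forwarding, and record the outcome afterwards.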
The MCP-Specific Load Balancing Challenge
SSE (Server-Sent Events) transports create a sticky session requirement: once a client establishes an SSE connection to an upstream instance, subsequent messages in that session should go to the same instance (because MCP session state may be held in memory on that instance).
This means your Gateway needs session-aware load balancing for SSE transports, not pure round-robin:
// Session-affinity load balancer
type HealthState = 'healthy' | 'unhealthy';

class MCPSessionBalancer {
  private sessionMap = new Map<string, string>(); // sessionId → instanceUrl
  private healthStatus = new Map<string, HealthState>(); // instanceUrl → last health check result
  private roundRobinIndex = 0;

  constructor(private instances: string[]) {
    // Treat every instance as healthy until a health check reports otherwise
    for (const url of instances) this.healthStatus.set(url, 'healthy');
  }

  // Called by the health checker after each probe
  setHealth(url: string, state: HealthState): void {
    this.healthStatus.set(url, state);
  }

  assignInstance(sessionId: string): string {
    // Reuse the session's instance while it stays healthy
    const assigned = this.sessionMap.get(sessionId);
    if (assigned !== undefined && this.isHealthy(assigned)) return assigned;
    // No assignment yet, or instance unhealthy: fall through to (re)assign

    // Assign new instance (round-robin over healthy instances)
    const healthy = this.instances.filter((url) => this.isHealthy(url));
    if (healthy.length === 0) throw new Error('No healthy upstream instances');
    const selected = healthy[this.roundRobinIndex % healthy.length];
    this.roundRobinIndex++;
    this.sessionMap.set(sessionId, selected);
    return selected;
  }

  releaseSession(sessionId: string): void {
    this.sessionMap.delete(sessionId);
  }

  private isHealthy(url: string): boolean {
    // Check against health check results
    return this.healthStatus.get(url) === 'healthy';
  }
}
For stateless MCP servers (those that don't hold session state in memory), standard least-connections or round-robin load balancing works fine. Designing your upstream MCP servers to be stateless is a best practice that simplifies your Gateway configuration significantly.
Security Implications: The Whole Picture
Threat Model Comparison
| Threat | Proxy Protection | Gateway Protection |
|---|---|---|
| Unauthenticated tool calls | ❌ None | ✅ Auth required at gateway |
| Tool call injection (malicious input) | ❌ None | ⚠️ Input validation optional |
| Excessive tool call volume | ❌ None | ✅ Rate limiting |
| Unauthorized tool access | ❌ None | ✅ Per-tool ACLs |
| Credential leakage in logs | ⚠️ Risk if logging headers | ✅ Redact sensitive fields |
| Upstream server enumeration | ❌ Exposed topology | ✅ Topology hidden behind gateway |
| Man-in-the-middle (client→proxy) | ⚠️ If no TLS | ✅ TLS termination enforced |
| Compromised upstream server | ❌ Affects all clients | ⚠️ Isolated by circuit breaker |
| Audit trail for compliance | ❌ None | ✅ Structured audit log |
Network Topology Security
One of the most underappreciated security benefits of a Gateway: your upstream MCP servers become internal-only.
# Without Gateway
Internet → Proxy → MCP Server (must be internet-accessible)
# With Gateway
Internet → Gateway (public, TLS, auth) → MCP Servers (internal network only)
                                         (no public exposure)
Upstream MCP servers in a Gateway architecture only need to accept connections from the Gateway's IP range. This eliminates an entire class of attack surface — external actors can't probe your MCP server capabilities or attempt direct auth bypass.
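Restricting upstreams to the Gateway's address range is normally done in network policy (security groups, firewall rules), but a defense-in-depth check inside the upstream server is easy to sketch. The helper names and the 10.0.1.0/24 range below are assumptions for illustration, and the code handles IPv4 only.

```typescript
// Convert dotted-quad IPv4 to an unsigned 32-bit integer; returns null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True if `ip` falls inside `cidr` (for example "10.0.1.0/24").
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) return false;
  // Two addresses share a range when they fall in the same block of 2^(32 - bits) addresses
  const blockSize = Math.pow(2, 32 - bits);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

const GATEWAY_RANGE = "10.0.1.0/24"; // assumed Gateway subnet

const fromGateway = inCidr("10.0.1.45", GATEWAY_RANGE);
const fromElsewhere = inCidr("203.0.113.7", GATEWAY_RANGE);
```

An upstream server could run this against the peer address of each incoming connection and reject anything outside the range. The arithmetic avoids bitwise operators because JavaScript's bitwise operations work on signed 32-bit values.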
Security Scanning and Compliance Validation
Before promoting any MCP Proxy or Gateway deployment to production, run it through MCPForge Verify to catch protocol compliance issues that create security gaps. Common findings include:
- Missing Content-Type validation on HTTP transports
- JSON-RPC error messages that leak internal stack traces
- tools/list responses that expose internal server metadata
- Capability negotiation accepting unsupported capability flags
Review the MCPForge Security Reports page for known vulnerability patterns in MCP implementations — several proxy libraries have historically forwarded authorization headers to logs, creating credential exposure incidents.
Enterprise Deployment Patterns
Pattern 1: Developer Workstation (Proxy Only)
Claude Desktop (stdio)
↓
Local MCP Proxy (stdio → aggregates 3–5 local/remote MCP servers)
↓
[Files MCP] [GitHub MCP] [Slack MCP] [DB MCP]
When to use: Individual developer productivity. No shared infrastructure. No multi-tenancy. Simple setup with a config file.
Tools: mcp-remote, custom Node.js proxy, Claude Desktop mcpServers config with multiple entries.
Pattern 2: Team Shared Infrastructure (Gateway)
[Dev laptops] → Claude Desktop
[CI/CD agents] → AI automation
[Cursor IDEs] ────────────────→ MCP Gateway (auth, routing, rate limits)
                                     ↓
                    [Files] [DB] [Code Execution] [APIs]
When to use: 5–50 engineers sharing MCP infrastructure. Need audit trails, per-team rate limits, access control.
Key requirements: Identity provider integration (Okta, Auth0, Google), per-team tool ACLs, centralized logging.
Pattern 3: Multi-Region Enterprise (Gateway + Proxy)
               ┌─── Global Load Balancer ───┐
               ↓                            ↓
     Gateway (us-east-1)             Gateway (eu-west-1)
      ↓               ↓               ↓               ↓
  [Proxy A]       [Proxy B]       [Proxy C]       [Proxy D]
      ↓               ↓               ↓               ↓
[MCP Servers]   [MCP Servers]   [MCP Servers]   [MCP Servers]
 (regional)      (regional)      (regional)      (regional)
When to use: Enterprise with data residency requirements (GDPR), global AI deployments, multiple business units with different tool sets.
Pattern detail: The Gateway handles authentication and global routing decisions. Regional Proxy Servers handle transport normalization and local server aggregation. Data never crosses region boundaries without explicit routing policy.
Pattern 4: Claude + Cursor Multi-Transport
This is a common real-world scenario: Claude Desktop uses stdio transport, but Cursor and web-based agents use HTTP/SSE. A Gateway solves the multi-transport problem:
Claude Desktop (stdio)
↓
Local stdio proxy (thin bridge) ──→ MCP Gateway (HTTP/SSE)
↓
Cursor IDE (HTTP/SSE) ───────────→ MCP Gateway (HTTP/SSE)
↓
Web Agent (WebSocket) ───────────→ MCP Gateway (WebSocket)
↓
[Single upstream MCP fleet]
All three client types share the same MCP server fleet, the same audit log, and the same rate limits — but connect via their native transport. The Gateway normalizes transport differences before hitting upstreams.
Performance Considerations
Latency Budget
For every MCP tool call, understand your latency budget:
Total call time = Client processing
                + Network (client → gateway)
                + Gateway overhead (auth + routing + rate check)
                + Network (gateway → upstream)
                + Upstream processing time
                + Network (upstream → gateway)
                + Network (gateway → client)
In a local network:
- Proxy overhead: 0.1–1ms (pure forwarding)
- Gateway overhead: 1–10ms (auth validation, routing lookup, rate limit check)
- Upstream processing: 1ms to 120 seconds (depends on tool)
For tools that take > 500ms (web search, code execution, database queries), Gateway overhead is negligible — under 2% of total call time. For ultra-low-latency tools (< 10ms), Gateway overhead may be meaningful.
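The percentage claim is easy to check with the numbers above. The helper below is plain arithmetic; `overheadShare` is a hypothetical name, and network time is ignored to keep the comparison simple.

```typescript
// Share of total call time spent in the gateway, as a percentage.
function overheadShare(gatewayMs: number, upstreamMs: number): number {
  return (gatewayMs / (gatewayMs + upstreamMs)) * 100;
}

// Worst-case gateway overhead (10 ms) in front of a 500 ms tool: about 1.96%
const slowToolShare = overheadShare(10, 500);

// The same 10 ms in front of a 10 ms tool: 50%
const fastToolShare = overheadShare(10, 10);
```

The same 10 ms that disappears behind a web search doubles the latency of a fast in-memory lookup, which is why the low-latency case deserves its own measurement.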
Connection Pooling
Don't create new connections to upstream MCP servers per tool call. Maintain persistent connection pools:
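// Gateway upstream connection pool
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

interface PooledClient {
  client: Client;
  busy: boolean; // true while a tool call holds this connection
}

class UpstreamConnectionPool {
  private pools = new Map<string, PooledClient[]>();
  private waiters = new Map<string, Array<(entry: PooledClient) => void>>();

  constructor(private maxPoolSize = 10) {}

  async acquire(upstreamUrl: string): Promise<PooledClient> {
    const pool = this.pools.get(upstreamUrl) ?? [];
    this.pools.set(upstreamUrl, pool);

    // Return idle connection if available
    const idle = pool.find((entry) => !entry.busy);
    if (idle) {
      idle.busy = true;
      return idle;
    }

    // Create new connection if pool not at capacity
    if (pool.length < this.maxPoolSize) {
      const entry = { busy: true } as PooledClient; // reserve the slot before awaiting
      pool.push(entry);
      try {
        entry.client = await this.createClient(upstreamUrl);
      } catch (err) {
        pool.splice(pool.indexOf(entry), 1); // give the slot back on failure
        throw err;
      }
      return entry;
    }

    // Wait for an available connection
    return new Promise((resolve) => {
      const queue = this.waiters.get(upstreamUrl) ?? [];
      queue.push(resolve);
      this.waiters.set(upstreamUrl, queue);
    });
  }

  // Hand the connection to the next waiter, or mark it idle
  release(upstreamUrl: string, entry: PooledClient): void {
    const next = this.waiters.get(upstreamUrl)?.shift();
    if (next) next(entry); // stays busy; ownership moves to the waiter
    else entry.busy = false;
  }

  private async createClient(url: string): Promise<Client> {
    const client = new Client(
      { name: 'gateway-upstream-client', version: '1.0.0' },
      { capabilities: {} }
    );
    await client.connect(new SSEClientTransport(new URL(url)));
    return client;
  }
}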
Without connection pooling, every tool call incurs SSE handshake overhead (100–300ms). With pooling, connections are warm and tool calls use an already-established channel.
Decision Matrix: When to Use What
Quick Decision Guide
Start here:
│
├─ Single developer, local tools only?
│ └─→ No proxy or gateway needed. Configure Claude Desktop directly.
│
├─ Multiple local MCP servers, single developer?
│ └─→ MCP Proxy. Simple aggregation, no auth needed.
│
├─ Remote MCP servers + Claude Desktop (stdio only)?
│ └─→ MCP Proxy for stdio→HTTP bridging.
│
├─ Multiple developers sharing MCP infrastructure?
│ └─→ MCP Gateway. Auth + rate limits + audit log.
│
├─ Need per-tool access control?
│ └─→ MCP Gateway. Proxies can't do this.
│
├─ Multiple teams, different tool sets per team?
│ └─→ MCP Gateway with team-scoped tool routing.
│
├─ Compliance requirements (SOC 2, GDPR)?
│ └─→ MCP Gateway. Audit log is non-negotiable.
│
├─ High-volume AI pipeline (>1000 tool calls/minute)?
│ └─→ MCP Gateway with load balancing + circuit breaking.
│
└─ Multi-region or data residency requirements?
   └─→ MCP Gateway (global) + MCP Proxies (regional aggregation).
Detailed Decision Table
| Scenario | Proxy | Gateway | Both |
|---|---|---|---|
| Solo developer, local tools | ✅ | ❌ Overkill | ❌ |
| Team of 2–5, internal tools | ✅ | ⚠️ Consider | ❌ |
| Team of 10+, shared MCP | ❌ Insufficient | ✅ | ❌ |
| Claude Desktop → remote servers | ✅ (bridge) | ❌ | ✅ (bridge + gateway) |
| External clients (SaaS AI) | ❌ | ✅ Required | ❌ |
| Compliance/audit requirements | ❌ | ✅ Required | ❌ |
| Multi-region deployment | ❌ | ✅ | ✅ (gateway + regional proxy) |
| Legacy MCP server (old transport) | ✅ (normalize) | ❌ | ✅ (proxy normalizes, gateway routes) |
| High availability (99.9%+ SLA) | ❌ | ✅ | ❌ |
| Canary deployments for MCP servers | ❌ | ✅ | ❌ |
Production Deployment Checklist
Before going to production with either component, validate the following:
MCP Proxy Checklist
- Upstream server URLs are environment-variable-driven, not hardcoded
- Proxy restarts gracefully (re-queries upstream tool lists on reconnect)
- Error responses from upstreams are properly propagated to clients (not swallowed)
- Connection timeouts are configured (avoid hanging indefinitely on unresponsive upstreams)
- Proxy logs include correlation IDs for tracing tool calls end-to-end
- Transport choice is documented and matches client capabilities
- Tested with MCPForge Verify for protocol compliance
MCP Gateway Checklist
- TLS configured for all client-facing and upstream connections
- Authentication validated (test with expired token, malformed token, missing token)
- Rate limits configured per tool category (not just globally)
- Circuit breakers tested (manually kill upstream, verify graceful degradation)
- Health check endpoints verified for all upstream servers
- Audit logging streams to SIEM or log aggregation
- Metrics exported to monitoring stack (Prometheus/Grafana or equivalent)
- Zero-downtime deployment tested (rolling restart without dropping active SSE sessions)
- Upstream server topology not exposed in error messages
- Admin API protected (separate auth from client API)
- Run MCPForge Verify against Gateway endpoint for spec compliance
- Review MCPForge Security Reports for known gateway vulnerabilities
See the MCP in Production guide for a complete infrastructure walkthrough covering deployment pipelines, secret management, and monitoring setup.
Common Mistakes to Avoid
1. Using a proxy when you need a gateway. The most common mistake. Teams start with a proxy, it gets used by more engineers, and suddenly there's no audit trail for a compliance review. Design for your team size in 6 months, not today.
2. Building gateway features into upstream MCP servers. Auth logic, rate limiting, and logging scattered across 8 upstream servers means 8 places to update when requirements change. Centralize all policy in the Gateway.
3. Not accounting for SSE session affinity in load balancing. Round-robining SSE connections breaks sessions when an upstream MCP server holds in-memory state. Either design stateless upstream servers or implement session affinity in your Gateway.
4. Logging full tool call payloads without redaction. Tool call inputs often contain sensitive data (API keys passed as arguments, PII in queries). Gateway audit logs should redact fields matching patterns for secrets, credentials, and PII before writing to storage.
5. Skipping protocol compliance validation. An MCP Gateway or proxy that forwards malformed JSON-RPC messages, returns incorrect error codes, or mishandles capability negotiation will silently break client integrations. Use MCPForge Verify before any production deployment.
6. Single-region Gateway with no fallback. A Gateway is now a critical path component. Design for Gateway HA from day one: multiple instances behind a load balancer, stateless Gateway design, shared session store if needed.
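Mistake 4 is the easiest to address mechanically. The sketch below redacts values by key name and by a couple of value patterns before an audit entry is written. The key list, the regular expressions, and the `redact` function are illustrative assumptions, not a complete PII or secret detector.

```typescript
// Field names whose values are always replaced, regardless of content
const SENSITIVE_KEYS = /^(authorization|password|secret|token|api[_-]?key)$/i;

// Patterns replaced wherever they appear inside string values
const SENSITIVE_VALUES = [
  /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g, // bearer tokens embedded in strings
  /\b[\w.+-]+@[\w-]+(\.[\w-]+)+\b/g,   // email addresses
];

// Recursively copy `value`, replacing sensitive fields with "[REDACTED]".
function redact(value: unknown, key = ""): unknown {
  if (SENSITIVE_KEYS.test(key)) return "[REDACTED]";
  if (typeof value === "string") {
    return SENSITIVE_VALUES.reduce(
      (text, pattern) => text.replace(pattern, "[REDACTED]"),
      value
    );
  }
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const k of Object.keys(source)) {
      out[k] = redact(source[k], k);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}

const safeArgs = redact({
  query: "orders for alice@example.com",
  api_key: "sk-123",
  headers: { Authorization: "Bearer abc.def" },
  limit: 10,
}) as Record<string, unknown>;
```

The function returns a copy and leaves the original arguments untouched, so the tool call itself still receives the real values while only the audit log sees the redacted ones.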
Key Takeaways
- MCP Proxy Server = transport translation + tool aggregation. Solves the "I have multiple MCP servers and one client" problem. No policy enforcement.
- MCP Gateway = policy enforcement layer. Solves auth, authz, rate limiting, observability, and reliability for multi-client, multi-server MCP deployments.
- Use both when: Claude Desktop (stdio) needs to reach a remote Gateway (HTTP), or when regional MCP proxy aggregation sits behind a global Gateway.
- The key architectural principle: push policy to the Gateway, keep upstream servers simple and stateless. This makes individual MCP servers easier to build, test, and scale.
- Validate any production MCP deployment, proxy or gateway, with MCPForge Verify for spec compliance and Security Reports for known vulnerability patterns.
- As AI agent architectures mature, the Gateway pattern will become the standard for any non-trivial MCP deployment, just as API gateways became standard for microservices. Plan your MCP infrastructure accordingly.