What is the key difference between an MCP Proxy Server and an MCP Gateway?

An MCP Proxy Server is a transparent intermediary that forwards MCP protocol messages between a single client and one or more upstream servers, typically adding transport translation or simple routing. An MCP Gateway is a full control plane that enforces authentication, authorization, rate limiting, tool routing, and observability across multiple MCP servers in a fleet — closer in spirit to an API gateway than a simple proxy.

Can I run an MCP Proxy Server and an MCP Gateway together?

Yes, and this is the recommended pattern for enterprise deployments. The Gateway sits at the network edge handling auth, rate limiting, and routing. Individual MCP Proxy Servers sit between the Gateway and specific upstream MCP servers, handling transport translation (e.g., HTTP/SSE to stdio) or protocol normalization for legacy servers.

Does Claude Desktop work with MCP Gateways?

Claude Desktop's locally configured servers (the mcpServers entries in its config file) are launched over stdio, so pointing one of those entries at a remote MCP Gateway requires a local MCP Proxy Server that bridges stdio (local) to HTTP/SSE or WebSocket (remote Gateway). Tools like mcp-remote or a local proxy sidecar solve this bridging problem.

How does authentication work differently in an MCP Proxy vs an MCP Gateway?

An MCP Proxy typically passes authentication credentials through unchanged — it may forward Bearer tokens or API keys but rarely validates them itself. An MCP Gateway actively enforces authentication: it validates OAuth tokens, API keys, or mTLS certificates before any tool call reaches an upstream server, acting as the single authentication enforcement point.

What happens to MCP tool calls during upstream server failures in a Gateway setup?

A well-designed MCP Gateway implements circuit breaker patterns and health checks per upstream server. When a server becomes unhealthy, the Gateway removes it from the routing pool and returns structured MCP error responses to clients rather than hanging. Individual proxy connections to that server are drained. Recovery uses health probe polling before re-adding the server to the pool.

Is rate limiting possible at the MCP protocol level?

Yes. MCP Gateways can implement rate limiting at multiple granularities: per client identity, per tool name, per upstream server, or globally. Because MCP uses JSON-RPC 2.0 over a persistent transport, rate limiting is typically applied at the connection or request level, not HTTP-layer — though HTTP-based MCP deployments can also use standard HTTP rate limiting headers.

What observability data should I collect from an MCP Gateway?

At minimum: tool call latency (p50/p95/p99), error rates by tool and upstream server, active connection count, token usage per client (if accessible), authentication failure rates, and circuit breaker state changes. For AI-specific observability, log the full tool call request and response payloads in a structured format for debugging agent behavior.

Can an MCP Gateway do semantic tool routing based on the tool name or input?

Yes. Advanced MCP Gateways implement tool-aware routing, where the routing decision is made based on the tools/call method's tool name parameter. This allows you to route `search_web` to one upstream cluster and `run_code` to a sandboxed execution cluster — all from a single MCP endpoint exposed to clients.

How do I validate that my MCP Proxy or Gateway is spec-compliant before production?

Use MCPForge Verify (/verify) to run automated protocol compliance checks against your proxy or gateway endpoint. It tests JSON-RPC message framing, capability negotiation, tool discovery, error response formatting, and transport behavior — catching spec deviations before they break client integrations.

What are the performance trade-offs of adding a Gateway layer to MCP?

A Gateway adds one network hop and processing overhead per tool call, typically 1–10ms for auth validation, routing, and rate-limit checks on a local network. For long-running tool calls (e.g., code execution, web search), this overhead is negligible. For latency-sensitive, high-frequency tool calls, minimize Gateway middleware and use connection pooling to upstream servers to reduce per-call overhead.

MCP Proxy Server vs MCP Gateway: Complete Comparison

When you move beyond a single MCP server running on localhost, you immediately face an infrastructure decision that most tutorials skip entirely: do you need an MCP Proxy Server, an MCP Gateway, or both?

Get this wrong and you'll either under-engineer your deployment (no auth, no observability, no rate limiting) or over-engineer it (a full gateway for a two-server setup that needed a five-line proxy config). This guide gives you the mental model, architecture diagrams, and decision framework to get it right the first time.

What Is an MCP Proxy Server?

An MCP Proxy Server is a transparent intermediary that sits between an MCP client and one or more upstream MCP servers. Its primary job is protocol translation and message forwarding, not policy enforcement.

Think of it as a smart pipe. It speaks MCP on both ends, but may translate between transports (stdio ↔ HTTP/SSE, HTTP ↔ WebSocket), normalize message formats, or fan out a single client connection to multiple upstream servers and merge their tool registries.

What an MCP Proxy Server typically does:

Transport bridging (stdio → SSE, SSE → WebSocket)
Tool namespace aggregation from multiple upstream servers
Transparent message forwarding with minimal transformation
Basic connection pooling to upstream servers
Protocol version normalization

What it typically does NOT do:

Validate authentication tokens
Enforce authorization policies
Rate limit individual clients
Route based on tool name semantics
Provide centralized observability

Proxy Architecture

┌─────────────────────────────────────────────────────────────┐
│                      MCP CLIENT LAYER                       │
│             (Claude Desktop, Cursor, AI Agent)              │
└────────────────────────┬────────────────────────────────────┘
                         │  stdio / SSE / WebSocket
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                      MCP PROXY SERVER                       │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │  Transport   │  │    Tool      │  │    Message       │   │
│  │  Adapter     │  │   Registry   │  │   Forwarder      │   │
│  │ (stdio→SSE)  │  │   (merged)   │  │  (passthrough)   │   │
│  └──────────────┘  └──────────────┘  └──────────────────┘   │
└────────┬──────────────────┬───────────────────┬─────────────┘
         │                  │                   │
         ▼                  ▼                   ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐
│  MCP Server  │  │  MCP Server  │  │      MCP Server      │
│   (Files)    │  │  (Database)  │  │     (Web Search)     │
└──────────────┘  └──────────────┘  └──────────────────────┘


The proxy exposes a **single unified MCP endpoint** to the client. The client calls `tools/list` and gets back a merged registry of every tool across all upstream servers. When the client calls `tools/call` for a specific tool, the proxy knows which upstream owns that tool and forwards accordingly.

### A Minimal MCP Proxy in Node.js

Here's a functional proxy that aggregates two upstream MCP servers:

```typescript
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
  type Tool,
} from "@modelcontextprotocol/sdk/types.js";

const UPSTREAM_SERVERS = [
  { name: "files", url: "http://localhost:3001/sse" },
  { name: "database", url: "http://localhost:3002/sse" },
];

async function createProxy() {
  // Connect to all upstream servers and record which tools each one owns
  const clients: Array<{ name: string; client: Client; tools: string[] }> = [];
  const allTools: Tool[] = [];

  for (const upstream of UPSTREAM_SERVERS) {
    const client = new Client(
      { name: `proxy-${upstream.name}`, version: "1.0.0" },
      { capabilities: {} }
    );
    await client.connect(new SSEClientTransport(new URL(upstream.url)));

    const toolList = await client.listTools();
    const toolNames = toolList.tools.map((t) => t.name);
    clients.push({ name: upstream.name, client, tools: toolNames });
    allTools.push(...toolList.tools);
  }

  // Expose aggregated interface
  const server = new Server(
    { name: "mcp-proxy", version: "1.0.0" },
    { capabilities: { tools: {} } }
  );

  // The SDK registers handlers by request schema, not by method-name string
  server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: allTools }));

  server.setRequestHandler(CallToolRequestSchema, async (request) => {
    const toolName = request.params.name;
    // First upstream that lists the tool wins; name collisions are not detected
    const owner = clients.find((c) => c.tools.includes(toolName));
    if (!owner) throw new Error(`Unknown tool: ${toolName}`);
    return owner.client.callTool(request.params);
  });

  const transport = new StdioServerTransport();
  await server.connect(transport);
  // Log to stderr: stdout is reserved for MCP messages on the stdio transport
  console.error("MCP Proxy running on stdio");
}

createProxy().catch(console.error);
```

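This is workable for simple multi-server aggregation, with two caveats visible in the code: if two upstreams expose a tool with the same name, the first upstream in the list always wins the lookup, and the tool list is captured once at startup. Notice what else is missing: no auth, no rate limiting, no audit log. For a local developer setup, that's fine. For a team of 20 engineers sharing the same MCP infrastructure, it's a security gap.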
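
A first-match lookup such as `clients.find(...)` in the proxy above cannot distinguish two upstreams that expose the same tool name. One possible workaround is to prefix each tool with its upstream's name before merging. The sketch below is illustrative only: the `ToolRegistry` class, its method names, and the `__` separator are assumptions made for this example and are not part of the MCP SDK.

```typescript
// Hypothetical collision-safe registry: tools are exposed as "<upstream>__<tool>".
interface ToolInfo {
  name: string;
  description?: string;
}

interface RoutedTool {
  upstream: string;     // upstream server that owns the tool
  originalName: string; // name to send when forwarding tools/call
}

class ToolRegistry {
  private routes = new Map<string, RoutedTool>();
  private exposed = new Map<string, ToolInfo>();

  // Register every tool from one upstream under a prefixed name.
  register(upstream: string, tools: ToolInfo[]): void {
    for (const tool of tools) {
      const exposedName = `${upstream}__${tool.name}`;
      this.routes.set(exposedName, { upstream, originalName: tool.name });
      this.exposed.set(exposedName, { ...tool, name: exposedName });
    }
  }

  // What the proxy would return from tools/list.
  list(): ToolInfo[] {
    return [...this.exposed.values()];
  }

  // Map an exposed name back to its owner; undefined for unknown tools.
  resolve(exposedName: string): RoutedTool | undefined {
    return this.routes.get(exposedName);
  }
}

// Two upstreams that both expose a tool called "search"
const registry = new ToolRegistry();
registry.register("files", [{ name: "search" }]);
registry.register("database", [{ name: "search" }]);
```

With this approach the client sees `files__search` and `database__search` as separate tools, and the proxy strips the prefix again before forwarding. The trade-off is that tool names visible to the model no longer match the names the upstream servers document.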

What Is an MCP Gateway?

An MCP Gateway is a control plane for MCP traffic. It's what you build when a proxy is no longer sufficient — when you need policy enforcement, identity-aware routing, observability, and enterprise-grade reliability.

If an MCP Proxy is a smart pipe, an MCP Gateway is a traffic cop, auditor, and load balancer combined.

What an MCP Gateway does:

Validates authentication (OAuth 2.0, API keys, mTLS)
Enforces authorization per tool, per client, per team
Rate limits at multiple granularities
Routes tool calls to the correct upstream cluster based on tool name
Aggregates logs and metrics from all MCP traffic
Manages upstream server health and circuit breaking
Enforces TLS everywhere
Provides a developer portal or admin API for managing upstream registrations

Gateway Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      CLIENT LAYER                                │
│  Claude Desktop  │  Cursor  │  AI Agents  │  CI Pipelines       │
└────────┬─────────────┬────────────┬──────────────┬──────────────┘
         │             │            │              │
         └─────────────┴────────────┴──────────────┘
                              │  HTTPS/SSE or WebSocket
                              │  Bearer Token / mTLS
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                     MCP GATEWAY                                  │
│                                                                  │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────────────────┐ │
│  │   Auth &    │  │    Rate      │  │   Tool Router          │ │
│  │   AuthZ     │  │   Limiter    │  │   (name → upstream)    │ │
│  │  (OAuth/    │  │  (per client │  │                        │ │
│  │  API Keys)  │  │  per tool)   │  │                        │ │
│  └─────────────┘  └──────────────┘  └────────────────────────┘ │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────────────────┐ │
│  │  Audit Log  │  │   Circuit    │  │   Load Balancer        │ │
│  │  & Metrics  │  │   Breaker    │  │   (upstream pool)      │ │
│  └─────────────┘  └──────────────┘  └────────────────────────┘ │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                  Upstream Registry                           ││
│  │   (server configs, health endpoints, capability cache)       ││
│  └─────────────────────────────────────────────────────────────┘│
└───────────┬──────────────┬───────────────┬────────────┬─────────┘
            │              │               │            │
            ▼              ▼               ▼            ▼
     ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐
     │  MCP     │  │  MCP     │  │  MCP     │  │  MCP     │
     │ Server   │  │ Server   │  │ Cluster  │  │ Server   │
     │ (Files)  │  │  (DB)    │  │ (Exec×3) │  │ (Search) │
     └──────────┘  └──────────┘  └──────────┘  └──────────┘

The key architectural distinction: the Gateway owns the policy layer. Upstream MCP servers don't need to know anything about authentication or rate limiting — they just receive validated, authorized tool call requests from the Gateway.

Side-by-Side Comparison

| Dimension | MCP Proxy Server | MCP Gateway |
|---|---|---|
| Primary function | Transport translation, tool aggregation | Policy enforcement, traffic control |
| Authentication | Pass-through (rarely validates) | Full validation (OAuth, API keys, mTLS) |
| Authorization | None | Per-tool, per-client, per-team ACLs |
| Rate limiting | None or basic | Multi-dimensional (client/tool/global) |
| Tool routing | Static (by ownership at startup) | Dynamic (name-based, weighted, canary) |
| Load balancing | Minimal or none | Full (round-robin, least-conn, weighted) |
| Circuit breaking | Rarely | Yes |
| Observability | Basic logging | Metrics, traces, audit logs, dashboards |
| TLS termination | Optional | Required |
| Upstream health checks | No | Yes (active + passive) |
| Configuration | Code or config file | Admin API / control plane |
| Deployment complexity | Low | Medium–High |
| Latency overhead | 0.1–1ms | 1–10ms |
| Best for | Developer local setups, simple aggregation | Production multi-tenant, enterprise |
| Horizontal scalability | Limited | Designed for it |
| Spec compliance validation | No | Optional (integrate with MCPForge Verify) |

Deep Dive: Authentication and Authorization

How Proxies Handle Auth

Most MCP Proxy implementations treat authentication as an upstream concern. If Client A sends a Bearer token in an HTTP header, the proxy forwards that header unchanged to the upstream MCP server, which is responsible for validation.

This works, but it creates a distributed auth problem: every upstream server must independently validate tokens. If you have 8 upstream MCP servers, you have 8 places where auth logic can diverge, token validation libraries can be outdated, or JWT secrets can be misconfigured.

Client → [Bearer token] → Proxy → [Bearer token forwarded] → Server A
                                  → [Bearer token forwarded] → Server B
                                  → [Bearer token forwarded] → Server C

Each server validates independently. Rotating a secret requires touching every server.

How Gateways Handle Auth

An MCP Gateway validates auth once, at the edge, then passes a verified identity context to upstream servers — typically as an internal header or a short-lived internal token that upstream servers trust unconditionally from the Gateway.

Client → [Bearer token] → Gateway validates → Internal JWT with identity claims
                                             → Server A (trusts Gateway internal token)
                                             → Server B (trusts Gateway internal token)
                                             → Server C (trusts Gateway internal token)

Key benefits:

Token validation logic lives in one place
Secret rotation is a Gateway-only operation
Upstream servers can be internal-only (not internet-exposed)
Identity context (user ID, team, scopes) propagates to all upstreams automatically
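
To make the "short-lived internal token" idea concrete, here is a minimal sketch of how a Gateway could sign an identity context and how an upstream could verify it. It assumes a secret shared between the Gateway and its upstreams and uses only Node's built-in `node:crypto` module. The function names, the `IdentityContext` shape, and the token layout are hypothetical; a real deployment might prefer a standard JWT library with asymmetric keys.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical identity context the Gateway attaches after validating the client
interface IdentityContext {
  userId: string;
  teamId: string;
  scopes: string[];
  expiresAt: number; // Unix epoch milliseconds
}

// Gateway side: returns "<payload>.<signature>", both base64url-encoded.
function issueInternalToken(identity: IdentityContext, secret: string): string {
  const payload = Buffer.from(JSON.stringify(identity)).toString("base64url");
  const signature = createHmac("sha256", secret).update(payload).digest("base64url");
  return `${payload}.${signature}`;
}

// Upstream side: returns the identity if the signature matches and the token
// has not expired, otherwise null.
function verifyInternalToken(
  token: string,
  secret: string,
  nowMs: number = Date.now()
): IdentityContext | null {
  const [payload, signature] = token.split(".");
  if (!payload || !signature) return null;

  const expected = createHmac("sha256", secret).update(payload).digest();
  const received = Buffer.from(signature, "base64url");
  // Compare in constant time; lengths must match before timingSafeEqual
  if (received.length !== expected.length || !timingSafeEqual(received, expected)) {
    return null;
  }

  const identity = JSON.parse(
    Buffer.from(payload, "base64url").toString("utf8")
  ) as IdentityContext;
  return identity.expiresAt > nowMs ? identity : null;
}
```

Keeping the expiry short (seconds to minutes) limits the damage if an internal token leaks, and rotating the shared secret remains a Gateway-side operation plus a config push to upstreams.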

OAuth 2.0 Flow for MCP Gateway

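The MCP specification's authorization framework for HTTP-based transports builds on OAuth 2.1, so standard bearer-token validation applies. A production Gateway should implement token validation plus tool-level authorization along these lines: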

typescript

// Gateway auth middleware (Express example)
import { expressjwt } from "express-jwt";
import jwksRsa from "jwks-rsa";

const validateToken = expressjwt({
  secret: jwksRsa.expressJwtSecret({
    cache: true,
    rateLimit: true,
    jwksUri: `${process.env.AUTH_DOMAIN}/.well-known/jwks.json`,
  }),
  audience: process.env.MCP_API_AUDIENCE,
  issuer: process.env.AUTH_DOMAIN,
  algorithms: ["RS256"],
});

// Tool-level authorization
function authorizeToolCall(
  userScopes: string[],
  toolName: string,
  toolPermissions: Record<string, string[]>
): boolean {
  const required = toolPermissions[toolName] ?? ["mcp:tools:read"];
  return required.every((scope) => userScopes.includes(scope));
}

// Example tool permission map
const toolPermissions: Record<string, string[]> = {
  read_file: ["mcp:files:read"],
  write_file: ["mcp:files:write"],
  execute_code: ["mcp:execution:run", "mcp:execution:admin"],
  list_databases: ["mcp:db:read"],
  run_query: ["mcp:db:read"],
  drop_table: ["mcp:db:admin"], // Requires elevated scope
};

This authorization model is tool-aware — not just endpoint-aware. It's one of the most important capabilities a Gateway provides that a proxy cannot replicate without becoming a gateway itself.

Tool Routing: Static vs Dynamic

Proxy Tool Routing

In a proxy, routing is ownership-based: at startup, the proxy queries each upstream server's tools/list, builds a map of tool name → upstream server, and uses that map for all subsequent tools/call routing. This map is typically static for the lifetime of the proxy process.

Problem: If a new tool is registered on an upstream server after proxy startup, the proxy won't know about it until it restarts, re-queries tools/list, or handles the upstream's notifications/tools/list_changed notification. This is a real operational issue in dynamic environments where MCP servers are deployed frequently.

Gateway Tool Routing

A Gateway implements dynamic tool routing with several routing strategies:

1. Name-based routing (most common)

yaml

# Gateway routing config
routes:
  - match:
      tool_prefix: "file_"
    upstream: files-cluster
  - match:
      tool_prefix: "db_"
    upstream: database-cluster  
  - match:
      tool_name: "execute_python"
    upstream: execution-sandbox
  - match:
      default: true
    upstream: general-cluster

2. Weighted routing (canary deployments)

yaml

routes:
  - match:
      tool_name: "search_web"
    upstreams:
      - target: search-v1
        weight: 90
      - target: search-v2
        weight: 10  # Canary: 10% to new version

3. Header-based routing (A/B testing)

yaml

routes:
  - match:
      header:
        x-mcp-environment: "staging"
    upstream: staging-cluster
  - match:
      default: true
    upstream: production-cluster

The Gateway polls upstream servers' tools/list endpoints on a configurable interval and updates its routing table without downtime. New tools appear in the aggregated registry within one polling cycle.
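
The routing configs above can be evaluated with very little code. The following sketch shows one possible way to do it, assuming first-match-wins ordering and weight-proportional selection for canary routes. The `Route` shape, `resolveUpstream`, and `pickWeighted` are names invented for this example and do not correspond to any particular gateway product.

```typescript
interface WeightedTarget {
  target: string;
  weight: number;
}

// Hypothetical in-memory form of one entry from the YAML routing config
interface Route {
  toolName?: string;   // exact tool name match
  toolPrefix?: string; // prefix match, e.g. "file_"
  isDefault?: boolean; // catch-all route
  upstreams: WeightedTarget[];
}

// Pick a target in proportion to its weight. `random` is injectable for tests.
function pickWeighted(
  targets: WeightedTarget[],
  random: () => number = Math.random
): string {
  const total = targets.reduce((sum, t) => sum + t.weight, 0);
  if (total <= 0) throw new Error("Route has no positive-weight upstreams");
  let threshold = random() * total;
  for (const t of targets) {
    threshold -= t.weight;
    if (threshold < 0) return t.target;
  }
  return targets[targets.length - 1].target; // guards against float rounding
}

// First matching route wins, so list specific routes before the default.
function resolveUpstream(
  routes: Route[],
  toolName: string,
  random?: () => number
): string {
  for (const route of routes) {
    const matches =
      route.toolName === toolName ||
      (route.toolPrefix !== undefined && toolName.startsWith(route.toolPrefix)) ||
      route.isDefault === true;
    if (matches) return pickWeighted(route.upstreams, random);
  }
  throw new Error(`No route for tool: ${toolName}`);
}

const routes: Route[] = [
  { toolPrefix: "file_", upstreams: [{ target: "files-cluster", weight: 1 }] },
  {
    toolName: "search_web",
    upstreams: [
      { target: "search-v1", weight: 90 },
      { target: "search-v2", weight: 10 }, // canary
    ],
  },
  { isDefault: true, upstreams: [{ target: "general-cluster", weight: 1 }] },
];
```

Because the table is plain data, a polling loop can build a new `routes` array from fresh `tools/list` results and swap it in atomically, which is what makes zero-downtime updates straightforward.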

Rate Limiting in MCP Infrastructure

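Rate limiting MCP traffic is more nuanced than rate limiting REST APIs. Every tool call arrives as a JSON-RPC message over the same endpoint or connection, so limits keyed on URL path or connection cannot tell a cheap call from an expensive one. You want to rate limit at the tool call level, not just the connection level.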

Proxy Rate Limiting

Most proxy implementations do not implement rate limiting. If they do, it's a blunt connection-level limit:

Max 100 requests/minute per connection IP

This is inadequate for production because it doesn't distinguish between:

A client calling list_files 1000 times (cheap)
A client calling execute_code 1000 times (expensive, potentially dangerous)

Gateway Rate Limiting

A production MCP Gateway implements multi-dimensional rate limiting:

typescript

interface RateLimitConfig {
  // Per authenticated client
  perClient: {
    requestsPerMinute: number;
    burstSize: number;
  };
  // Per tool name
  perTool: Record<string, {
    requestsPerMinute: number;
    requestsPerDay?: number;
  }>;
  // Per upstream server (protect upstream capacity)
  perUpstream: Record<string, {
    maxConcurrent: number;
    requestsPerSecond: number;
  }>;
  // Global fallback
  global: {
    requestsPerSecond: number;
  };
}

const config: RateLimitConfig = {
  perClient: {
    requestsPerMinute: 600,
    burstSize: 50,
  },
  perTool: {
    execute_code: { requestsPerMinute: 30, requestsPerDay: 500 },
    search_web: { requestsPerMinute: 60 },
    read_file: { requestsPerMinute: 600 },
  },
  perUpstream: {
    "execution-sandbox": { maxConcurrent: 10, requestsPerSecond: 5 },
    "database-cluster": { maxConcurrent: 50, requestsPerSecond: 100 },
  },
  global: {
    requestsPerSecond: 500,
  },
};

Rate limit violations return a structured MCP error response:

json

{
  "jsonrpc": "2.0",
  "id": "req_123",
  "error": {
    "code": -32029,
    "message": "Rate limit exceeded",
    "data": {
      "limit_type": "per_tool",
      "tool": "execute_code",
      "retry_after": 45,
      "limit": 30,
      "window": "60s"
    }
  }
}

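This response follows the JSON-RPC 2.0 error format and gives clients actionable retry information, a detail that most custom proxy implementations forget entirely. The code -32029 is an arbitrary choice within the -32000 to -32099 range that JSON-RPC 2.0 reserves for implementation-defined server errors.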
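
One common way to implement limits with both a sustained rate and a burst size is a token bucket. The sketch below is a minimal single-process version under that assumption; the `TokenBucket` class and `rateLimitError` helper are hypothetical names, and a multi-instance Gateway would normally keep the counters in a shared store instead of process memory.

```typescript
interface BucketConfig {
  requestsPerMinute: number; // sustained refill rate
  burstSize: number;         // maximum tokens the bucket can hold
}

class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private config: BucketConfig, nowMs: number = Date.now()) {
    this.tokens = config.burstSize; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns 0 when the call is allowed, otherwise the number of whole
  // seconds the caller should wait before retrying.
  tryConsume(nowMs: number = Date.now()): number {
    const ratePerMs = this.config.requestsPerMinute / 60_000;
    const elapsedMs = Math.max(0, nowMs - this.lastRefillMs);
    this.tokens = Math.min(this.config.burstSize, this.tokens + elapsedMs * ratePerMs);
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil((1 - this.tokens) / ratePerMs / 1000);
  }
}

// Build a JSON-RPC error in the same shape as the example response above.
function rateLimitError(
  id: string | number,
  limitType: string,
  tool: string,
  retryAfterSeconds: number
) {
  return {
    jsonrpc: "2.0" as const,
    id,
    error: {
      code: -32029,
      message: "Rate limit exceeded",
      data: { limit_type: limitType, tool, retry_after: retryAfterSeconds },
    },
  };
}
```

A Gateway would typically hold one bucket per key (client ID, tool name, or both), check the most specific bucket first, and call `rateLimitError` with the returned wait time whenever `tryConsume` is non-zero.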

Observability: The Biggest Gap Between Proxies and Gateways

If you're running MCP in production and you can't answer these questions in under 30 seconds, you don't have enough observability:

Which tool is being called most frequently right now?
Which client is generating the most load?
What's the p99 latency of execute_code today?
Did any upstream server fail in the last hour?
Which tool calls returned errors and why?

A proxy gives you log lines. A Gateway gives you answers.

Metrics to Collect

typescript

// Gateway metrics (Prometheus format); assumes the prom-client package
import { Counter, Gauge, Histogram } from 'prom-client';

const metrics = {
  // Counter: total tool calls
  mcp_tool_calls_total: new Counter({
    name: 'mcp_tool_calls_total',
    help: 'Total MCP tool calls',
    labelNames: ['tool_name', 'upstream', 'client_id', 'status'],
  }),
  
  // Histogram: tool call duration
  mcp_tool_call_duration_seconds: new Histogram({
    name: 'mcp_tool_call_duration_seconds',
    help: 'MCP tool call duration in seconds',
    labelNames: ['tool_name', 'upstream'],
    buckets: [0.01, 0.05, 0.1, 0.5, 1, 5, 30, 120],
  }),
  
  // Gauge: active connections
  mcp_active_connections: new Gauge({
    name: 'mcp_active_connections',
    help: 'Current active MCP client connections',
    labelNames: ['client_type'],
  }),
  
  // Counter: auth failures
  mcp_auth_failures_total: new Counter({
    name: 'mcp_auth_failures_total',
    help: 'Authentication failures',
    labelNames: ['reason', 'client_ip'],
  }),
  
  // Gauge: circuit breaker state
  mcp_circuit_breaker_state: new Gauge({
    name: 'mcp_circuit_breaker_state',
    help: 'Circuit breaker state (0=closed, 1=open, 2=half-open)',
    labelNames: ['upstream'],
  }),
};

Structured Audit Logging

For compliance and debugging, every tool call should produce a structured audit log entry:

json

{
  "timestamp": "2025-01-28T14:32:10.421Z",
  "event": "tool_call",
  "request_id": "req_7f3a9b2c",
  "client": {
    "id": "client_abc123",
    "type": "claude_desktop",
    "ip": "10.0.1.45",
    "user_id": "user_456",
    "team_id": "team_engineering"
  },
  "tool": {
    "name": "execute_code",
    "upstream": "execution-sandbox",
    "upstream_instance": "sandbox-pod-3"
  },
  "duration_ms": 1847,
  "status": "success",
  "tokens_used": null,
  "rate_limit_remaining": 27
}

This structured format is queryable in any log aggregation system (Datadog, Grafana Loki, Elastic, CloudWatch) and supports compliance requirements like SOC 2 audit trails.

Load Balancing in MCP Gateway Deployments

When a single MCP server can't handle your tool call volume, you need to run multiple instances and load balance across them. This is a Gateway capability — proxies don't have the upstream pool management to do this reliably.

Health Check Configuration

yaml

upstreams:
  execution-sandbox:
    instances:
      - url: http://sandbox-1:8080
      - url: http://sandbox-2:8080
      - url: http://sandbox-3:8080
    health_check:
      path: /health
      interval: 10s
      timeout: 3s
      healthy_threshold: 2
      unhealthy_threshold: 3
    load_balancing:
      algorithm: least_connections  # Route to instance with fewest active calls
    circuit_breaker:
      enabled: true
      failure_threshold: 5          # Open after 5 failures
      success_threshold: 2          # Close after 2 successes in half-open
      timeout: 30s                  # Stay open for 30s before half-open
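
The circuit_breaker settings above describe a three-state machine (closed, open, half-open). A minimal sketch of that state machine follows, assuming that failure_threshold counts consecutive failures and that any failure during half-open reopens the circuit. The `CircuitBreaker` class and its method names are illustrative, not taken from a specific gateway.

```typescript
type BreakerState = "closed" | "open" | "half-open";

interface BreakerConfig {
  failureThreshold: number; // consecutive failures before opening
  successThreshold: number; // successes in half-open before closing
  openTimeoutMs: number;    // time to stay open before allowing probes
}

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private successes = 0;
  private openedAtMs = 0;

  constructor(private config: BreakerConfig) {}

  // Call before forwarding a tool call; false means fail fast.
  allowRequest(nowMs: number = Date.now()): boolean {
    if (this.state === "open" && nowMs - this.openedAtMs >= this.config.openTimeoutMs) {
      this.state = "half-open"; // let probe traffic through
      this.successes = 0;
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    if (this.state === "half-open") {
      this.successes += 1;
      if (this.successes >= this.config.successThreshold) {
        this.state = "closed";
        this.failures = 0;
      }
    } else {
      this.failures = 0; // a success resets the consecutive-failure count
    }
  }

  recordFailure(nowMs: number = Date.now()): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.config.failureThreshold) {
      this.state = "open";
      this.openedAtMs = nowMs;
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

The numeric encoding used by the mcp_circuit_breaker_state gauge (0=closed, 1=open, 2=half-open) maps directly onto `getState()`, so state changes can be exported as metrics whenever a transition happens.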

The MCP-Specific Load Balancing Challenge

SSE (Server-Sent Events) transports create a sticky session requirement: once a client establishes an SSE connection to an upstream instance, subsequent messages in that session should go to the same instance (because MCP session state may be held in memory on that instance).

This means your Gateway needs session-aware load balancing for SSE transports, not pure round-robin:

typescript

// Session-affinity load balancer
type HealthState = 'healthy' | 'unhealthy';

class MCPSessionBalancer {
  private sessionMap = new Map<string, string>(); // sessionId → instanceUrl
  private healthStatus = new Map<string, HealthState>(); // instanceUrl → last probe result
  private roundRobinIndex = 0;

  constructor(private instances: string[]) {
    // Treat instances as healthy until a health check reports otherwise
    for (const url of instances) this.healthStatus.set(url, 'healthy');
  }

  // Called by the health checker after each probe
  setHealth(url: string, state: HealthState): void {
    this.healthStatus.set(url, state);
  }

  assignInstance(sessionId: string): string {
    // Reuse the session's existing instance while it is healthy
    const assigned = this.sessionMap.get(sessionId);
    if (assigned !== undefined && this.isHealthy(assigned)) return assigned;
    // No assignment yet, or instance unhealthy: fall through to reassign

    // Assign new instance (round-robin over healthy instances)
    const healthy = this.instances.filter((url) => this.isHealthy(url));
    if (healthy.length === 0) throw new Error('No healthy upstream instances');

    const selected = healthy[this.roundRobinIndex % healthy.length];
    this.roundRobinIndex++;
    this.sessionMap.set(sessionId, selected);
    return selected;
  }

  releaseSession(sessionId: string): void {
    this.sessionMap.delete(sessionId);
  }

  private isHealthy(url: string): boolean {
    // Check against health check results
    return this.healthStatus.get(url) === 'healthy';
  }
}

For stateless MCP servers (those that don't hold session state in memory), standard least-connections or round-robin load balancing works fine. Designing your upstream MCP servers to be stateless is a best practice that simplifies your Gateway configuration significantly.

Security Implications: The Whole Picture

Threat Model Comparison

| Threat | Proxy Protection | Gateway Protection |
|---|---|---|
| Unauthenticated tool calls | ❌ None | ✅ Auth required at gateway |
| Tool call injection (malicious input) | ❌ None | ⚠️ Input validation optional |
| Excessive tool call volume | ❌ None | ✅ Rate limiting |
| Unauthorized tool access | ❌ None | ✅ Per-tool ACLs |
| Credential leakage in logs | ⚠️ Risk if logging headers | ✅ Redact sensitive fields |
| Upstream server enumeration | ❌ Exposed topology | ✅ Topology hidden behind gateway |
| Man-in-the-middle (client→proxy) | ⚠️ If no TLS | ✅ TLS termination enforced |
| Compromised upstream server | ❌ Affects all clients | ⚠️ Isolated by circuit breaker |
| Audit trail for compliance | ❌ None | ✅ Structured audit log |

Network Topology Security

One of the most underappreciated security benefits of a Gateway: your upstream MCP servers become internal-only.

# Without Gateway
Internet → Proxy → MCP Server (must be internet-accessible)

# With Gateway  
Internet → Gateway (public, TLS, auth) → MCP Servers (internal network only)
                                                           (no public exposure)

Upstream MCP servers in a Gateway architecture only need to accept connections from the Gateway's IP range. This eliminates an entire class of attack surface — external actors can't probe your MCP server capabilities or attempt direct auth bypass.
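
Network-level controls (security groups, firewall rules) are the usual way to restrict upstreams to the Gateway's IP range, but an application-level check can serve as defense in depth. The following sketch shows an IPv4 CIDR membership test under that assumption; `ipv4ToInt` and `isInCidr` are hypothetical helper names, and IPv6 is deliberately out of scope.

```typescript
// Convert a dotted-quad IPv4 address to a 32-bit unsigned integer.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".");
  if (parts.length !== 4) throw new Error(`Invalid IPv4 address: ${ip}`);
  return parts.reduce((acc, part) => {
    if (!/^\d{1,3}$/.test(part) || Number(part) > 255) {
      throw new Error(`Invalid IPv4 address: ${ip}`);
    }
    return acc * 256 + Number(part);
  }, 0);
}

// True when `ip` falls inside `cidr`, e.g. isInCidr("10.0.1.45", "10.0.0.0/16").
function isInCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  if (!base || !bitsText || !/^\d{1,2}$/.test(bitsText) || Number(bitsText) > 32) {
    throw new Error(`Invalid CIDR block: ${cidr}`);
  }
  // Two addresses share a block when they agree after dropping the host bits
  const blockSize = 2 ** (32 - Number(bitsText));
  return (
    Math.floor(ipv4ToInt(ip) / blockSize) === Math.floor(ipv4ToInt(base) / blockSize)
  );
}
```

An upstream server could call `isInCidr(remoteAddress, gatewayCidr)` when a connection is accepted and close any connection that fails the check.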

Security Scanning and Compliance Validation

Before promoting any MCP Proxy or Gateway deployment to production, run it through MCPForge Verify to catch protocol compliance issues that create security gaps. Common findings include:

Missing Content-Type validation on HTTP transports
JSON-RPC error messages that leak internal stack traces
tools/list responses that expose internal server metadata
Capability negotiation accepting unsupported capability flags

Review the MCPForge Security Reports page for known vulnerability patterns in MCP implementations — several proxy libraries have historically forwarded authorization headers to logs, creating credential exposure incidents.
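
The credential-exposure pattern described above is usually avoided by redacting headers before they reach any logger. A minimal sketch follows; the `redactHeaders` helper and the specific list of header names are assumptions for illustration, and a real deployment should extend the list to match its own auth scheme.

```typescript
// Header names treated as sensitive in this sketch (compared case-insensitively)
const SENSITIVE_HEADERS = new Set([
  "authorization",
  "proxy-authorization",
  "cookie",
  "set-cookie",
  "x-api-key",
]);

// Return a copy of the headers that is safe to log; the input is not mutated.
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    safe[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? "[REDACTED]" : value;
  }
  return safe;
}

const loggable = redactHeaders({
  Authorization: "Bearer abc.def.ghi",
  "Content-Type": "application/json",
});
```

Redacting at the point where the log entry is built, instead of relying on downstream log filters, means a misconfigured log pipeline cannot reintroduce the leak.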

Enterprise Deployment Patterns

Pattern 1: Developer Workstation (Proxy Only)

Claude Desktop (stdio)
    ↓
Local MCP Proxy (stdio → aggregates 3–5 local/remote MCP servers)
    ↓
[Files MCP] [GitHub MCP] [Slack MCP] [DB MCP]

When to use: Individual developer productivity. No shared infrastructure. No multi-tenancy. Simple setup with a config file.

Tools: mcp-remote, custom Node.js proxy, Claude Desktop mcpServers config with multiple entries.

Pattern 2: Team Shared Infrastructure (Gateway)

[Dev laptops] → Claude Desktop
[CI/CD agents] → AI automation
[Cursor IDEs]  ────────────────→  MCP Gateway (auth, routing, rate limits)
                                        ↓
                     [Files] [DB] [Code Execution] [APIs]

When to use: 5–50 engineers sharing MCP infrastructure. Need audit trails, per-team rate limits, access control.

Key requirements: Identity provider integration (Okta, Auth0, Google), per-team tool ACLs, centralized logging.

Pattern 3: Multi-Region Enterprise (Gateway + Proxy)

                     ┌─── Global Load Balancer ───┐
                     ↓                            ↓
            Gateway (us-east-1)         Gateway (eu-west-1)
               ↓         ↓                ↓         ↓
         [Proxy A]  [Proxy B]       [Proxy C]  [Proxy D]
            ↓              ↓           ↓              ↓  
      [MCP Servers]  [MCP Servers] [MCP Servers] [MCP Servers]
      (regional)     (regional)    (regional)    (regional)

When to use: Enterprise with data residency requirements (GDPR), global AI deployments, multiple business units with different tool sets.

Pattern detail: The Gateway handles authentication and global routing decisions. Regional Proxy Servers handle transport normalization and local server aggregation. Data never crosses region boundaries without explicit routing policy.

Pattern 4: Claude + Cursor Multi-Transport

This is a common real-world scenario: Claude Desktop uses stdio transport, but Cursor and web-based agents use HTTP/SSE. A Gateway solves the multi-transport problem:

Claude Desktop (stdio)
    ↓
Local stdio proxy (thin bridge)  ──→  MCP Gateway (HTTP/SSE)
                                            ↓
Cursor IDE (HTTP/SSE) ───────────→  MCP Gateway (HTTP/SSE)
                                            ↓
Web Agent (WebSocket) ───────────→  MCP Gateway (WebSocket)
                                            ↓
                              [Single upstream MCP fleet]

All three client types share the same MCP server fleet, the same audit log, and the same rate limits — but connect via their native transport. The Gateway normalizes transport differences before hitting upstreams.

Performance Considerations

Latency Budget

For every MCP tool call, understand your latency budget:

Total call time = Client processing
               + Network (client → gateway)
               + Gateway overhead (auth + routing + rate check)
               + Network (gateway → upstream)
               + Upstream processing time
               + Network (upstream → gateway)
               + Network (gateway → client)

In a local network:

Proxy overhead: 0.1–1ms (pure forwarding)
Gateway overhead: 1–10ms (auth validation, routing lookup, rate limit check)
Upstream processing: 1ms to 120 seconds (depends on tool)

For tools that take > 500ms (web search, code execution, database queries), Gateway overhead is negligible — under 2% of total call time. For ultra-low-latency tools (< 10ms), Gateway overhead may be meaningful.
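
The arithmetic behind that claim is simple enough to check directly. The sketch below computes the Gateway's share of total call time for a given overhead; the function name is hypothetical and network time is lumped into the second argument for simplicity.

```typescript
// Fraction of total call time spent in the Gateway itself.
// `otherMs` covers upstream processing plus all network legs.
function gatewayOverheadShare(gatewayOverheadMs: number, otherMs: number): number {
  return gatewayOverheadMs / (gatewayOverheadMs + otherMs);
}

// Worst-case 10ms overhead on a 500ms tool call: 10 / 510, just under 2%
const slowToolShare = gatewayOverheadShare(10, 500);

// The same 10ms overhead on a 10ms tool call: 10 / 20, half the total time
const fastToolShare = gatewayOverheadShare(10, 10);
```

The second case is why high-frequency, low-latency tools are the ones worth profiling before adding Gateway middleware.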

Connection Pooling

Don't create new connections to upstream MCP servers per tool call. Maintain persistent connection pools:

typescript

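// Gateway upstream connection pool
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { SSEClientTransport } from '@modelcontextprotocol/sdk/client/sse.js';

// The SDK Client has no built-in idle flag, so the pool tracks it
interface PooledClient {
  client: Client;
  busy: boolean;
}

class UpstreamConnectionPool {
  private pools = new Map<string, PooledClient[]>();
  // Callers waiting for a connection, per upstream URL
  private waiters = new Map<string, Array<(client: Client) => void>>();

  constructor(private maxPoolSize = 10) {}

  async acquire(upstreamUrl: string): Promise<Client> {
    const pool = this.pools.get(upstreamUrl) ?? [];
    this.pools.set(upstreamUrl, pool);

    // Return idle connection if available
    const idle = pool.find((entry) => !entry.busy);
    if (idle) {
      idle.busy = true;
      return idle.client;
    }

    // Create new connection if pool not at capacity
    if (pool.length < this.maxPoolSize) {
      // Reserve the slot before awaiting so concurrent callers cannot exceed the cap
      const entry: PooledClient = { client: this.newClient(), busy: true };
      pool.push(entry);
      try {
        await entry.client.connect(new SSEClientTransport(new URL(upstreamUrl)));
      } catch (err) {
        pool.splice(pool.indexOf(entry), 1); // free the slot on connect failure
        throw err;
      }
      return entry.client;
    }

    // Wait for an available connection
    return new Promise<Client>((resolve) => {
      const queue = this.waiters.get(upstreamUrl) ?? [];
      queue.push(resolve);
      this.waiters.set(upstreamUrl, queue);
    });
  }

  // Hand the connection to the next waiter, or mark it idle
  release(upstreamUrl: string, client: Client): void {
    const next = this.waiters.get(upstreamUrl)?.shift();
    if (next) {
      next(client);
      return;
    }
    const entry = this.pools.get(upstreamUrl)?.find((e) => e.client === client);
    if (entry) entry.busy = false;
  }

  private newClient(): Client {
    return new Client(
      { name: 'gateway-upstream-client', version: '1.0.0' },
      { capabilities: {} }
    );
  }
}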

Without connection pooling, every tool call pays for a fresh connection: the SSE handshake plus the MCP initialize exchange (100–300ms). With pooling, connections stay warm and tool calls reuse an already-established channel.

Decision Matrix: When to Use What

Quick Decision Guide

Start here:
│
├─ Single developer, local tools only?
│   └─→ No proxy or gateway needed. Configure Claude Desktop directly.
│
├─ Multiple local MCP servers, single developer?
│   └─→ MCP Proxy. Simple aggregation, no auth needed.
│
├─ Remote MCP servers + Claude Desktop (stdio only)?
│   └─→ MCP Proxy for stdio→HTTP bridging.
│
├─ Multiple developers sharing MCP infrastructure?
│   └─→ MCP Gateway. Auth + rate limits + audit log.
│
├─ Need per-tool access control?
│   └─→ MCP Gateway. Proxies can't do this.
│
├─ Multiple teams, different tool sets per team?
│   └─→ MCP Gateway with team-scoped tool routing.
│
├─ Compliance requirements (SOC 2, GDPR)?
│   └─→ MCP Gateway. Audit log is non-negotiable.
│
├─ High-volume AI pipeline (>1000 tool calls/minute)?
│   └─→ MCP Gateway with load balancing + circuit breaking.
│
└─ Multi-region or data residency requirements?
    └─→ MCP Gateway (global) + MCP Proxies (regional aggregation).
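The guide above can also be read as a small function. The sketch below is one possible encoding, not a definitive rule set: the `Requirements` fields and the `recommend` function are hypothetical names, and the only threshold is the 1000 calls/minute figure from the guide.

```typescript
type Recommendation = 'none' | 'proxy' | 'gateway' | 'gateway+proxy';

// Hypothetical input shape: one field per question in the guide above
interface Requirements {
  developers: number;
  localServers: number;
  remoteServers: boolean;
  stdioOnlyClient: boolean; // e.g. Claude Desktop
  perToolAccessControl: boolean;
  multipleTeams: boolean;
  compliance: boolean; // e.g. SOC 2, GDPR
  callsPerMinute: number;
  multiRegion: boolean;
}

function recommend(r: Requirements): Recommendation {
  // Any policy, sharing, scale, or residency requirement calls for a Gateway
  const needsGateway =
    r.developers > 1 ||
    r.perToolAccessControl ||
    r.multipleTeams ||
    r.compliance ||
    r.callsPerMinute > 1000 ||
    r.multiRegion;

  // A Proxy is still needed for stdio bridging or regional aggregation
  const needsBridge = (r.remoteServers && r.stdioOnlyClient) || r.multiRegion;

  if (needsGateway) return needsBridge ? 'gateway+proxy' : 'gateway';
  if (needsBridge || r.localServers > 1) return 'proxy';
  return 'none';
}

// Example: a solo developer with a single local server
const solo: Requirements = {
  developers: 1,
  localServers: 1,
  remoteServers: false,
  stdioOnlyClient: true,
  perToolAccessControl: false,
  multipleTeams: false,
  compliance: false,
  callsPerMinute: 10,
  multiRegion: false,
};

console.log(recommend(solo)); // 'none'
console.log(recommend({ ...solo, localServers: 3 })); // 'proxy'
console.log(recommend({ ...solo, developers: 8, remoteServers: true })); // 'gateway+proxy'
```

The last case returns both components because a shared Gateway is remote, so a stdio-only client still needs a local bridge to reach it.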

Detailed Decision Table

Scenario	Proxy	Gateway	Both
Solo developer, local tools	✅ (only to aggregate several servers)	❌ Overkill	❌
Team of 2–5, internal tools	✅	⚠️ Consider	❌
Team of 10+, shared MCP	❌ Insufficient	✅	❌
Claude Desktop → remote servers	✅ (bridge)	❌	✅ (bridge + gateway)
External clients (SaaS AI)	❌	✅ Required	❌
Compliance/audit requirements	❌	✅ Required	❌
Multi-region deployment	❌	✅	✅ (gateway + regional proxy)
Legacy MCP server (old transport)	✅ (normalize)	❌	✅ (proxy normalizes, gateway routes)
High availability (99.9%+ SLA)	❌	✅	❌
Canary deployments for MCP servers	❌	✅	❌

Production Deployment Checklist

Before going to production with either component, validate the following:

MCP Proxy Checklist

Upstream server URLs are environment-variable-driven, not hardcoded
Proxy restarts gracefully (re-queries upstream tool lists on reconnect)
Error responses from upstreams are properly propagated to clients (not swallowed)
Connection timeouts are configured (avoid hanging indefinitely on unresponsive upstreams)
Proxy logs include correlation IDs for tracing tool calls end-to-end
Transport choice is documented and matches client capabilities
Tested with MCPForge Verify for protocol compliance
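For the connection timeout item, one common approach is to race each upstream call against a timer. The sketch below is a minimal version under that assumption. `withTimeout` and its `label` parameter are hypothetical names, and the wrapper only stops waiting: it does not cancel the underlying request.

```typescript
// Hypothetical helper: rejects if the wrapped call takes longer than timeoutMs.
// The underlying call keeps running; only the caller stops waiting for it.
async function withTimeout<T>(
  call: Promise<T>,
  timeoutMs: number,
  label: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} did not respond within ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    // Whichever promise settles first decides the outcome
    return await Promise.race([call, timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive
    clearTimeout(timer);
  }
}
```

A proxy could wrap each forwarded call this way and turn the rejection into an error response for the client, so an unresponsive upstream never leaves a request hanging.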

MCP Gateway Checklist

TLS configured for all client-facing and upstream connections
Authentication validated (test with expired token, malformed token, missing token)
Rate limits configured per tool category (not just globally)
Circuit breakers tested (manually kill upstream, verify graceful degradation)
Health check endpoints verified for all upstream servers
Audit logging streams to SIEM or log aggregation
Metrics exported to monitoring stack (Prometheus/Grafana or equivalent)
Zero-downtime deployment tested (rolling restart without dropping active SSE sessions)
Upstream server topology not exposed in error messages
Admin API protected (separate auth from client API)
Run MCPForge Verify against Gateway endpoint for spec compliance
Review MCPForge Security Reports for known gateway vulnerabilities
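To make the "topology not exposed in error messages" item concrete, the sketch below logs the full upstream error internally and returns only a generic JSON-RPC error to the client. The function name, the `log` parameter, and the `correlationId` field are hypothetical. The only fixed value is -32603, the code JSON-RPC 2.0 reserves for internal errors.

```typescript
// JSON-RPC 2.0 reserves -32603 for "Internal error"
const INTERNAL_ERROR = -32603;

interface JsonRpcErrorResponse {
  jsonrpc: '2.0';
  id: string | number | null;
  error: { code: number; message: string; data?: { correlationId: string } };
}

// Hypothetical helper: keeps upstream details in the internal log only
function sanitizeUpstreamError(
  requestId: string | number | null,
  upstreamError: unknown,
  correlationId: string,
  log: (entry: Record<string, unknown>) => void = e => console.error(JSON.stringify(e)),
): JsonRpcErrorResponse {
  // Full detail (hostnames, ports, stack traces) stays server-side
  log({
    correlationId,
    detail: upstreamError instanceof Error ? upstreamError.message : String(upstreamError),
  });
  // The client sees a generic message plus an ID that support can look up
  return {
    jsonrpc: '2.0',
    id: requestId,
    error: {
      code: INTERNAL_ERROR,
      message: 'Upstream tool call failed',
      data: { correlationId },
    },
  };
}
```

Returning the correlation ID lets an operator match a client report to the internal log entry without revealing which upstream failed.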

See the MCP in Production guide for a complete infrastructure walkthrough covering deployment pipelines, secret management, and monitoring setup.

Common Mistakes to Avoid

1. Using a proxy when you need a gateway
The most common mistake. Teams start with a proxy, it gets used by more engineers, and suddenly there's no audit trail for a compliance review. Design for your team size in 6 months, not today.

2. Building gateway features into upstream MCP servers
Auth logic, rate limiting, and logging scattered across 8 upstream servers means 8 places to update when requirements change. Centralize all policy in the Gateway.

3. Not accounting for SSE session affinity in load balancing
Round-robining SSE connections breaks sessions when an upstream MCP server holds in-memory state. Either design stateless upstream servers or implement session affinity in your Gateway.

4. Logging full tool call payloads without redaction
Tool call inputs often contain sensitive data (API keys passed as arguments, PII in queries). Gateway audit logs should redact fields matching patterns for secrets, credentials, and PII before writing to storage.

5. Skipping protocol compliance validation
An MCP Gateway or proxy that forwards malformed JSON-RPC messages, returns incorrect error codes, or mishandles capability negotiation will silently break client integrations. Use MCPForge Verify before any production deployment.

6. Single-region Gateway with no fallback
A Gateway is now a critical path component. Design for Gateway HA from day one: multiple instances behind a load balancer, stateless Gateway design, shared session store if needed.
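For mistake 4, a redaction pass can run over tool call arguments before they reach the audit log. The sketch below is a minimal illustration with assumed patterns: `SENSITIVE_KEY`, `EMAIL`, and `redact` are hypothetical names, and a real deployment would need a pattern list matched to its own data.

```typescript
// Assumed patterns, for illustration only
const SENSITIVE_KEY = /(api[_-]?key|token|secret|password|authorization)/i;
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Recursively copy a payload, masking sensitive keys and email addresses
function redact(value: unknown): unknown {
  if (typeof value === 'string') {
    return value.replace(EMAIL, '[REDACTED_EMAIL]');
  }
  if (Array.isArray(value)) {
    return value.map(redact);
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, v] of Object.entries(value as Record<string, unknown>)) {
      // Mask the whole value when the key name looks like a credential
      out[key] = SENSITIVE_KEY.test(key) ? '[REDACTED]' : redact(v);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}

const logged = redact({
  tool: 'crm_lookup',
  arguments: { apiKey: 'sk-123', query: 'contact alice@example.com' },
});
console.log(JSON.stringify(logged));
// {"tool":"crm_lookup","arguments":{"apiKey":"[REDACTED]","query":"contact [REDACTED_EMAIL]"}}
```

The function returns a copy rather than mutating its input, so the original arguments can still be forwarded to the upstream server unchanged.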

Key Takeaways

MCP Proxy Server = transport translation + tool aggregation. Solves the "I have multiple MCP servers and one client" problem. No policy enforcement.
MCP Gateway = policy enforcement layer. Solves auth, authz, rate limiting, observability, and reliability for multi-client, multi-server MCP deployments.
Use both when: Claude Desktop (stdio) needs to reach a remote Gateway (HTTP), or when regional MCP proxy aggregation sits behind a global Gateway.
The key architectural principle: push policy to the Gateway, keep upstream servers simple and stateless. This makes individual MCP servers easier to build, test, and scale.
Validate any production MCP deployment — proxy or gateway — with MCPForge Verify for spec compliance and Security Reports for known vulnerability patterns.
As AI agent architectures mature, the Gateway pattern will become the standard for any non-trivial MCP deployment — just as API gateways became standard for microservices. Plan your MCP infrastructure accordingly.
