WTF MCP

Not heard enough about MCP yet?

Unlike traditional APIs that require rigid, pre-defined integrations, MCP acts as a universal translator between large language models and enterprise systems (and other loosely defined "tools"), maintaining context across interactions and enabling real-time discovery of resources. You knew this part already. Let's answer some common questions that have come up (including what's in the title). Later in this post, we'll dissect an actual MCP implementation for Amazon Bedrock Knowledge Bases to understand how this protocol bridges the gap between human-like queries and machine-readable data.

The Question Everyone Asks: "But someone still needs to build the APIs, right?"

Let's address this confusion directly: Yes, developers still create interfaces to data and tools - but MCP fundamentally changes how, when, and by whom these interfaces are used.

The key difference is that MCP creates a standardized way for these interfaces to be connected at runtime by users rather than hardcoded at design time by developers.

API integrations are like the socket behind your microwave. You never touch it. You can change the microwave, but that's costly. If the microwave breaks, you need your engineering team to fix it. Or worse, call a contractor and eat the cost of delays.

MCP is like a loosely hanging power strip. It can be moved and extended. Power strips may also come with surge protection built in, which can save your microwave. They may even come with USB ports and a wireless charging pad. However, your microwave might still need replacing or updating.

Why We Need MCP: The LLM Context Problem

Large Language Models face three critical limitations:

  1. Knowledge Cutoffs: LLMs can't access information beyond their training data
  2. Tool Manipulation: LLMs can't directly interact with external systems
  3. Context Windows: LLMs have limited memory for conversation history

Traditional solutions involve developers creating custom API integrations for each use case. This means:

  • Every new data source requires developer intervention
  • Updates need new code deployments
  • Users/Agents are limited to what developers anticipated

MCP creates a standardized protocol for runtime connections, solving these issues without requiring constant developer updates.

Traditional APIs are fixed connections buried in your infrastructure, while MCP provides flexible connection points that users can access, move, and extend without calling in specialists.
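
To make that concrete, here's a minimal sketch of an MCP server exposing a single tool, using the FastMCP helper from the MCP Python SDK. The server name and the tool itself are hypothetical; the point is that any MCP client can discover and call get_forecast at runtime, with no bespoke integration on the application side.

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")  # hypothetical server name

@mcp.tool()
async def get_forecast(city: str) -> str:
    """Return a short forecast for a city."""
    return f"Sunny in {city}"  # stubbed; a real tool would call a weather API here

if __name__ == "__main__":
    mcp.run()  # stdio transport by default; clients discover the tool at runtime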

Understanding MCP vs Traditional APIs: A Visual Comparison

graph TD

subgraph "MCP Integration"
        A17[MCP Developer] -->|Creates/Maintains MCP Client| B17[MCP Client Library]
        A2[Tool Developer] -->|Creates Tool| B2[Tool Library]
        C2[AI Application Developer] -->|Implements MCP| D2[AI Application]
        D2-->|Discover and use tools| D17[MCP Server]
        E2[User] -->|Accesses| D2
        D17 -->|SSE/STDIO| B17
        B2 <--> B17
    end
    
    subgraph "Traditional API Integration"
        A1[Developer 1] -->|Builds Custom Integration| B1[AI Application]
        A3[Developer 2] -->|Builds another Custom Integration| B1[AI Application]
        A4[Developer 3] -->|Builds yet another Custom Integration| B1[AI Application]
        B1 -->|Fixed Connections| C1[Data Sources]
        D1[User] -->|Limited to Pre-built Features| B1
    end
    
    

In traditional API integrations, developers must anticipate every tool and data source users might need, then hardcode those connections. With MCP, the application provides a standardized "socket" that users can plug different tools into as needed.

Yes, Tools Still Need Building - But Who Uses Them Changes

Let's be crystal clear: MCP doesn't eliminate the need for interfaces to data and functionality.

The key difference is in the separation of concerns. With MCP:

  1. Tool developers build MCP-compatible interfaces to their systems
  2. AI application developers implement the MCP standard
  3. Agents discover and select which tools to connect, and when

This runtime flexibility is impossible with traditional API approaches. Let's take a break and look at when to use each approach.

| Need | Traditional APIs | Model Context Protocol |
| --- | --- | --- |
| Simple, predefined workflows | ✓ Often simpler | May introduce build-time complexity, but runtime use is easier |
| Personalized tool connections | Limited to what's built | ✓ User-selected at runtime |
| Enterprise data access | Requires custom integration | ✓ Connect existing tools |
| Multiple specialist tools | Multiple API integrations | ✓ Standardized connections |
| Highly secure data | Custom security per API | ✓ Consistent security model |

Also, MCP is not a replacement for APIs. It's a standard protocol that simplifies agent-tool communication (for now). So if you're not using agents or tools but still want to use MCP, please do educate us on what the use case would be!

Real-World Example: Financial Analysis

Let's take a look at a real-world example. In the flow diagram below, a user asks the AI agent to analyze their investment portfolio. In this particular implementation we assume the agent needs to double-check with the user as to which data sources to use (this is not necessary, and the choice could be autonomous). The MCP server allows the agent to then discover tools and restrict its usage to the tools the user responds with (Fidelity and market data).

Without MCP, the AI application developer would need to build integrations with every possible financial institution and data source. With MCP, the user simply connects the AI to whatever financial tools they're already approved to use.

Ask yourself:

  • What would happen if Fidelity changes their underlying APIs?
  • Could there be multiple MCP clients for the same set of APIs? (Yes.)
  • How would authentication work at each level?
  • ...and more. Many of these questions don't have a single definitive answer.

sequenceDiagram
    User->>+AI: "Analyze my investment portfolio"
    AI->>+User: What data sources should I connect to?
    User->>+AI: "My Fidelity account and market data"
    AI->>+MCP: Connect to Fidelity
    MCP->>+FidelityTool: Retrieve portfolio
    FidelityTool-->>-MCP: Account data
    AI->>+MCP: Connect to market data
    MCP->>+MarketDataTool: Get current metrics
    MarketDataTool-->>-MCP: Market information
    MCP-->>-AI: Combined data
    AI->>User: "Portfolio analysis shows..."

This is too generic, show me the code

OK, let's look at the Bedrock Knowledge Bases MCP server published recently:

This can be found here - https://github.com/awslabs/mcp/tree/main/src/bedrock-kb-retrieval-mcp-server/awslabs/bedrock_kb_retrieval_mcp_server

    ├── bedrock-kb-retrieval-mcp-server
    │   ├── CHANGELOG.md
    │   ├── README.md
    │   ├── awslabs
    │   │   ├── __init__.py
    │   │   └── bedrock_kb_retrieval_mcp_server
    │   │       ├── __init__.py
    │   │       ├── knowledgebases
    │   │       │   ├── __init__.py
    │   │       │   ├── clients.py
    │   │       │   ├── discovery.py
    │   │       │   └── runtime.py
    │   │       ├── models.py
    │   │       └── server.py
    │   ├── pyproject.toml
    │   └── uv.lock

Let's start with the server...
The server imports clients to use Bedrock's underlying APIs:

from knowledgebases.clients import (
    get_bedrock_agent_client,  # Management API client
    get_bedrock_agent_runtime_client  # Query execution client
)

In server.py, we define one resource and one tool:

@mcp.resource(uri='resource://knowledgebases', ...)
async def knowledgebases_resource() -> str:
    """Lists available KBs and their data sources"""
    return json.dumps(await discover_knowledge_bases(kb_agent_mgmt_client))

@mcp.tool(name='QueryKnowledgeBases')
async def query_knowledge_bases_tool(...) -> str:
    """Executes natural language queries against KBs"""
    return await query_knowledge_base(...)

  • The resource acts as a dynamic registry of available knowledge bases
  • The tool handles natural language queries with automated result processing
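
For intuition, here's roughly the payload that resource returns, based on the return shape in discovery.py shown below. The IDs and names are made up, and the inner data-source fields are assumptions (the source elides them):

# Illustrative resource payload (hypothetical IDs and names)
{
    "KB123456": {
        "name": "product-docs-kb",
        "data_sources": [{"id": "DS987654", "name": "s3-docs-bucket"}],  # fields assumed
    }
}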

Within the "knowledgebases" client folder:

knowledgebases/
├── clients.py      # Initializes boto3 clients (not MCP clients!)
├── discovery.py    # Lists KBs (management API)
└── runtime.py      # Executes queries (runtime API)

Key client implementation snippets:

# discovery.py
async def discover_knowledge_bases(client) -> dict:
    """Lists KBs using boto3 client"""
    response = client.list_knowledge_bases()
    return {kb['knowledgeBaseId']: {
        'name': kb['name'],
        'data_sources': [...]
    } for kb in response['knowledgeBaseSummaries']}

# runtime.py
async def query_knowledge_base(...) -> str:
    """Executes query using RetrieveAndGenerate API"""
    response = client.retrieve_and_generate(
        input={'text': query},
        retrieveAndGenerateConfiguration={
            'knowledgeBaseId': knowledge_base_id,
            'modelArn': 'anthropic.claude-v2'
        }
    )
    return response['output']['text']

"Aha!", you may say, "caught red handed. I see you actually made those API calls inside the client files".
Exactly! But here's the crucial distinction:

MCP Protocol Layer

graph LR
    A[MCP Client] -->|Natural Language| B[MCP Server]
    B -->|Standardized Protocol| C[Knowledge Base Clients]
    C -->|boto3 API Calls| D[Bedrock Service]

Key Differences:

  1. Abstraction Level

    • API Calls: Direct service-specific operations (list_knowledge_bases)
    • MCP Tools: Protocol-defined operations (QueryKnowledgeBases with natural language)
  2. Client Responsibilities

    • API Client: Needs service-specific SDK/credentials
    • MCP Client: Only needs MCP protocol implementation
  3. Result Processing

    • API Response: Raw service response
    • MCP Response: Standardized, pre-processed results with reranking/filtering

Reinforcing the Distinction

import boto3

# Traditional API usage: needs the service-specific SDK, credentials, and exact parameters
client = boto3.client('bedrock-agent-runtime')
response = client.retrieve_and_generate(...)  # Needs exact parameters

# MCP client usage (illustrative wrapper): natural language in, processed results out
response = mcp_client.query("Explain AI safety principles")  # Natural language

Protocol Advantages:

  • Discovery Automation: Clients find KBs through resource://knowledgebases
  • Query Standardization: Natural language processing handled by protocol
  • Security Decoupling: MCP server manages credentials, clients only need protocol access
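
Here's a minimal sketch of what this looks like from the client side, using the MCP Python SDK over stdio. The launch command and the tool arguments are assumptions for illustration; the resource URI and tool name come from server.py above.

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Hypothetical command for launching the KB retrieval MCP server locally
    params = StdioServerParameters(command="uvx", args=["awslabs.bedrock-kb-retrieval-mcp-server"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discovery: read the dynamic registry of knowledge bases
            kbs = await session.read_resource("resource://knowledgebases")
            # Invocation: natural-language query via the protocol-defined tool
            result = await session.call_tool(
                "QueryKnowledgeBases",
                {"query": "Explain AI safety principles"},  # argument names assumed
            )

asyncio.run(main())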

Final Architecture Flow

sequenceDiagram
    User->>MCP Client: Natural Language Query
    MCP Client->>MCP Server: Standardized MCP Request
    MCP Server->>Bedrock API: boto3 Request
    Bedrock API->>MCP Server: Raw Response
    MCP Server->>MCP Client: Processed MCP Response
    MCP Client->>User: Structured Answer

What is SSE? Does it make my agents run faster?

No, SSE doesn't make your agents run faster. It's about the communication mechanism, not processing speed. Server-Sent Events (SSE) is one of the primary transport mechanisms in MCP, and there are some common misconceptions about what it does. Let's clarify:

sequenceDiagram
    participant Client
    participant Server
    Client->>Server: Initial HTTP request
    activate Server
    Note right of Server: Connection stays open
    Server-->>Client: Event 1
    Server-->>Client: Event 2
    Server-->>Client: Event 3
    Note right of Client: Client sends data via separate HTTP POST requests
    Client->>Server: HTTP POST (client-to-server data)
    Server-->>Client: Event 4
    deactivate Server

What SSE Actually Is

SSE (Server-Sent Events) is one of MCP's built-in transport types that handles how messages are transmitted between clients and servers. It's specifically designed for server-to-client streaming over HTTP, with separate HTTP POST requests handling client-to-server communication.

Under the hood, MCP uses JSON-RPC 2.0 as its wire format, and the transport layer (whether SSE or another option) is responsible for converting MCP protocol messages into JSON-RPC format for transmission.
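
For a feel of the wire format, here's roughly what a tool invocation looks like once framed as JSON-RPC 2.0, shown as a Python dict (simplified; real messages carry more fields):

# Sketch of an MCP tool call framed as a JSON-RPC 2.0 request
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",  # MCP's protocol method for invoking a tool
    "params": {
        "name": "QueryKnowledgeBases",
        "arguments": {"query": "Explain AI safety principles"},
    },
}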

The code example shows how to configure MCP to use SSE transport:

if args.sse:
    mcp.settings.port = args.port
    mcp.run(transport='sse')
else:
    mcp.run()  # Uses the default stdio transport

Connection Options: SSE vs stdio

MCP supports multiple transport options:

  • SSE: Good for server-to-client streaming, particularly useful in environments with restricted networks where WebSockets might be blocked. Uses HTTP for all communication.

  • stdio (Standard Input/Output): Useful for local integrations, command-line tools, and simple process communication. Particularly valuable when building shell scripts or command-line utilities.

Important security note: When using SSE transport, be aware of potential DNS rebinding attacks. Always validate Origin headers, avoid binding servers to all network interfaces (use localhost instead), and implement proper authentication.
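
As a minimal sketch of that Origin check, here's a Starlette middleware (Starlette is what the SSE server sample below is built on; the allow-list and middleware name are assumptions, not part of the MCP SDK):

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import PlainTextResponse

ALLOWED_ORIGINS = {"http://localhost:8000"}  # hypothetical allow-list

class OriginCheckMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        origin = request.headers.get("origin")
        # Reject cross-origin requests that aren't explicitly allowed
        if origin is not None and origin not in ALLOWED_ORIGINS:
            return PlainTextResponse("Forbidden", status_code=403)
        return await call_next(request)

You'd register it via Starlette's middleware parameter so every request is checked before it reaches the SSE endpoints.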

| stdio | SSE |
| --- | --- |
| Allows a client to send data to a server through the standard input and receive responses via the standard output streams. Primarily used for inter-process communication within the same system (local, synchronous). | Enables servers to push real-time updates to web clients over a single, long-lived HTTP connection (network-based, real-time). |
| Local and fast; does not need a network connection. | Latency subject to the network connection. |
| Particularly suitable for command-line tools and local integrations where the client and server run on the same machine. | Allows for efficient, one-way communication from the server to the client, making it suitable for applications that require real-time data updates. |
| Multiple connections subject to local resources (usually a single client). | Allows servers to handle multiple client connections efficiently. |
| No native support for authentication. | Supports features like authentication (JWT, API keys). |
| | The server provides two main endpoints: an SSE endpoint (clients connect to it to receive messages from the server) and an HTTP POST endpoint (clients use it to send messages to the server). |
Server sample code (stdio):

from mcp.server import Server
from mcp.server.stdio import stdio_server

app = Server("example-server")

# Run within an async event loop
async with stdio_server() as streams:
    await app.run(
        streams[0],
        streams[1],
        app.create_initialization_options()
    )
Server sample code (SSE):

from mcp.server import Server
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.routing import Route

app = Server("example-server")
sse = SseServerTransport("/messages")

async def handle_sse(scope, receive, send):
    async with sse.connect_sse(scope, receive, send) as streams:
        await app.run(streams[0], streams[1], app.create_initialization_options())

async def handle_messages(scope, receive, send):
    await sse.handle_post_message(scope, receive, send)

starlette_app = Starlette(
    routes=[
        Route("/sse", endpoint=handle_sse),
        Route("/messages", endpoint=handle_messages, methods=["POST"]),
    ]
)
Client sample code (stdio):

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(
    command="./server",
    args=["--option", "value"]
)

# Run within an async event loop
async with stdio_client(params) as streams:
    async with ClientSession(streams[0], streams[1]) as session:
        await session.initialize()
Client sample code (SSE):

from mcp import ClientSession
from mcp.client.sse import sse_client

# Run within an async event loop
async with sse_client("http://localhost:8000/sse") as streams:
    async with ClientSession(streams[0], streams[1]) as session:
        await session.initialize()

Key Takeaway

And now you're a certified MCP ninja! Maybe not. Seriously, though, MCP isn't replacing APIs - it's creating a standardized conversation layer that uses APIs as implementation details, not primary interfaces.
