MCP Context Gateway
Reference architecture for a "DevOps Copilot" agent fronting a 60-tool MCP-style gateway. Demonstrates the contextweaver launch narrative end-to-end: a large tool catalog, compact
ChoiceCards, lazy schema hydration, the context firewall on a large upstream result, and an answer-phase prompt that sees only a summary and an artifact handle — never the raw rowset.
!!! info "Simulated, not connected"
This example simulates the MCP gateway flow using contextweaver
primitives. It does not connect to a real MCP server. The same
ContextManager.ingest_mcp_result and Router APIs used here are
the APIs you wire to a live ProxyRuntime in production —
see MCP Integration and
Gateway Spec.
TL;DR
| What | Where |
|---|---|
| The script | examples/architectures/mcp_context_gateway/main.py |
| The catalog | examples/architectures/mcp_context_gateway/catalog.yaml |
| Captured output | examples/architectures/mcp_context_gateway/OUTPUT.md |
| Local README | examples/architectures/mcp_context_gateway/README.md |
Variants
| Variant | Script | What it adds |
|---|---|---|
| Single-turn (canonical) | main.py |
The reference shape. Uses the public routing.hydration.SchemaSource (#261) instead of a hand-rolled _FULL_SCHEMAS dict. |
| Live transport | main_live.py |
Same scenario but routed through a real mcp.server.Server + ClientSession over the in-memory MCP transport (#260). |
| Multi-turn | main_multi.py |
4-turn transcript (BigQuery → Linear → Slack → PagerDuty) with fact accumulation across turns; turn 1's artifact survives via dependency closure (#262). |
| Real-MCP catalogs | main_real.py |
Runs the same shape against committed snapshots of three real public MCP servers (server-time, server-filesystem, server-everything) under real_catalogs/ (#280). |
| CLI surface | contextweaver demo --scenario mcp-gateway-full |
The single-turn variant from the CLI without invoking the example script (#264). |
| Benchmark range | benchmarks/gateway_scorecard.md |
Five gateway-shaped scenarios — firewall reduction 0.0 % – 98.8 %. Regenerate with make benchmark-gateway && make gateway-scorecard (#270). |
Run it:
python examples/architectures/mcp_context_gateway/main.py
(Or make architectures / make example.)
The shape
| Step | contextweaver primitive | What happens |
|---|---|---|
| 1. Load catalog | load_catalog_yaml + Catalog.register |
60 mocked tools across 10 namespaces are registered. |
| 2. Route | Router.route(query) + make_choice_cards |
The 60-tool catalog is narrowed to 5 compact ChoiceCards (536 chars total, no schemas). |
| 3. Hydrate schema | Explicit lookup in _FULL_SCHEMAS |
Only the chosen tool's JSON Schema (~854 chars) is materialised. The other 59 stay at zero bytes. |
| 4. Call upstream | Mock function returns MCP wire shape {"content": [...], "isError": False} |
A ~16 KB rowset comes back. |
| 5. Firewall | ContextManager.ingest_mcp_result |
Raw 16,507 chars → 194-char summary + extracted facts + artifact artifact:result:tc1. The full bytes live in the artifact store. |
| 6. Persist fact | ContextManager.add_fact_sync |
A durable summary (growth -> starter ...) is written so the answer prompt doesn't need to re-read the rowset. |
| 7. Answer | ContextManager.build_sync(phase=Phase.answer, ...) |
A 142-token, 645-char final prompt with [FACTS], [TOOL RESULT] (summary only), [USER], [TOOL CALL]. No rowset. |
Captured metrics
From the deterministic run (see OUTPUT.md):
| Metric | Value |
|---|---|
catalog_tools |
60 |
exposed_choice_cards |
5 |
hydrated_schema_chars |
854 |
raw_result_chars |
16,507 |
injected_summary_chars |
194 |
firewall_reduction_pct |
98.8 % |
final_prompt_tokens |
142 |
final_prompt_chars |
645 |
How this maps to a real MCP runtime
The swap from this example to a real MCP gateway is mostly two changes:
- Replace
load_catalog_yaml(...)withProxyRuntime.register_tool_defs_sync(upstream.list_tools())— the gateway pulls tool defs from upstream MCP servers. - Replace the mock function with
await runtime.tool_execute(tool_id, args), which delegates to the upstream over the MCP transport.
ContextManager.ingest_mcp_result(...), Router.route(...),
make_choice_cards(...), and the answer-phase build are identical in
both setups. See examples/mcp_gateway_demo.py
for the live-transport variant using ProxyRuntime + StubUpstream.
Pinned invariants
tests/test_architectures_mcp_context_gateway.py
runs main() and asserts:
- Catalog loads exactly 60 tools / 10 namespaces.
- The route phase shortlists to 5 cards out of 60.
- ChoiceCards rendered to the model do not contain any
"inputSchema"or"properties"strings. - Only
bigquery.run_queryhas its schema hydrated; the other 59 are zero bytes. - The raw result is > 10 KB; the firewall summary is < 500 chars.
- The sentinel
"mrr_delta_usd": -450(only in day-47's row, deep in the rowset) is not present in the final answer-phase prompt. - Every documented metric (
catalog_tools,exposed_choice_cards,hydrated_schema_chars,raw_result_chars,injected_summary_chars,firewall_reduction_pct,artifact_handle,final_prompt_tokens,final_prompt_chars) is present in stdout.
A failure on any of these is a regression in the load-bearing launch narrative.
Limitations
- No live MCP transport. Use
examples/mcp_gateway_demo.pyfor that. - Single turn — for multi-turn investigations with fact accumulation see Slack ops bot.
- The intent map is held explicit so the routing outcome is testable without involving an LLM.
Follow-ups
Tracked under issues #243
and #246 — when the
contextweaver mcp serve CLI lands, this example gains a "Run against a
live MCP server" appendix using the same ingest_mcp_result /
Router.route calls shown here.