
/mcp [BETA] - Model Context Protocol

LiteLLM Proxy provides an MCP Gateway that gives you a fixed endpoint for all MCP tools and lets you control MCP access by Key and Team.

LiteLLM MCP Architecture: Use MCP tools with all LiteLLM supported models

Overview​

| Feature | Description |
|---------|-------------|
| MCP Operations | • List Tools<br />• Call Tools |
| Supported MCP Transports | • Streamable HTTP<br />• SSE<br />• Standard Input/Output (stdio) |
| MCP Tool Cost Tracking | ✅ Supported |
| Grouping MCPs (Access Groups) | ✅ Supported |
| LiteLLM Permission Management | ✨ Enterprise Only<br />• By Key<br />• By Team<br />• By Organization |

Adding your MCP​

On the LiteLLM UI, navigate to "MCP Servers" and click "Add New MCP Server".

In this form, enter your MCP server URL and select the transport you want to use.

LiteLLM supports the following MCP transports:

  • Streamable HTTP
  • SSE (Server-Sent Events)
  • Standard Input/Output (stdio)
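As an alternative to the UI, servers can also be declared in config.yaml. A minimal sketch follows; the server names and URLs are placeholders, and the transport values ("http", "sse") are assumptions to verify against your proxy version's config reference:

config.yaml
mcp_servers:
  my_http_server:
    url: "https://example.com/mcp"
    transport: "http"
  my_sse_server:
    url: "https://example.com/sse"
    transport: "sse"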

Adding a stdio MCP Server​

For stdio MCP servers, select "Standard Input/Output (stdio)" as the transport type and provide the stdio configuration in JSON format:
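A minimal sketch of that JSON, using the standard MCP stdio fields (command, args, and optional env); the server command shown is purely illustrative:

stdio configuration (example)
{
  "command": "npx",
  "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"],
  "env": {}
}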

Using your MCP​

Quick Start​

Connect via OpenAI Responses API​

Use the OpenAI Responses API to connect to your LiteLLM MCP server:

cURL Example
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
  "model": "gpt-4o",
  "tools": [
    {
      "type": "mcp",
      "server_label": "litellm",
      "server_url": "<your-litellm-proxy-base-url>/mcp",
      "require_approval": "never",
      "headers": {
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY"
      }
    }
  ],
  "input": "Run available tools",
  "tool_choice": "required"
}'

Specific MCP Servers​

You can restrict a request to specific MCP servers, and list only their tools, using the x-mcp-servers header. This header allows you to:

  • Limit tool access to one or more specific MCP servers
  • Control which tools are available in different environments or use cases

The header accepts a comma-separated list of server names: "Zapier_Gmail,Server2,Server3"

Notes:

  • Server names with spaces should be replaced with underscores
  • If the header is not provided, tools from all available MCP servers will be accessible
cURL Example with Server Segregation
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
"model": "gpt-4o",
"tools": [
{
"type": "mcp",
"server_label": "litellm",
"server_url": "<your-litellm-proxy-base-url>/mcp",
"require_approval": "never",
"headers": {
"x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
"x-mcp-servers": "Zapier_Gmail"
}
}
],
"input": "Run available tools",
"tool_choice": "required"
}'

In this example, the request will only have access to tools from the "Zapier_Gmail" MCP server.

Grouping MCPs (Access Groups)​

MCP Access Groups allow you to group multiple MCP servers together for easier management.

1. Create an Access Group​

To create an access group:

  • Go to MCP Servers in the LiteLLM UI
  • Click "Add a New MCP Server"
  • Under "MCP Access Groups", create a new group (e.g., "dev_group") by typing it
  • Add the same group name to other servers to group them together
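If your servers are declared in config.yaml rather than the UI, the grouping can be sketched there as well; the access_groups field name is an assumption to verify against your proxy version's config reference:

config.yaml
mcp_servers:
  zapier_gmail:
    url: "https://actions.zapier.com/mcp/sk-xxxxx/sse"
    access_groups: ["dev_group"]
  internal_tools:
    url: "https://example.com/mcp"
    access_groups: ["dev_group"]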

2. Use Access Group in Cursor​

Include the access group name in the x-mcp-servers header:

Cursor Configuration with Access Groups
{
  "mcpServers": {
    "LiteLLM": {
      "url": "<your-litellm-proxy-base-url>/mcp",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY",
        "x-mcp-servers": "dev_group"
      }
    }
  }
}

This gives you access to all servers in the "dev_group" access group.

Advanced: Connecting Access Groups to API Keys​

When creating API keys, you can assign them to specific access groups for permission management:

  • Go to "Keys" in the LiteLLM UI and click "Create Key"
  • Select the desired MCP access groups from the dropdown
  • The key will have access to all MCP servers in those groups
  • This is reflected on the Test Key page
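The same assignment can be sketched against the proxy's /key/generate endpoint. The object_permission payload shape below is an assumption based on how the UI surfaces these settings, not a confirmed schema, so check your proxy version's API reference:

cURL Example
curl --location '<your-litellm-proxy-base-url>/key/generate' \
--header 'Authorization: Bearer $LITELLM_MASTER_KEY' \
--header 'Content-Type: application/json' \
--data '{
  "object_permission": {
    "mcp_access_groups": ["dev_group"]
  }
}'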

Using your MCP with client-side credentials​

Use this if you want to pass a client-side authentication token to LiteLLM, for LiteLLM to forward to your MCP server.

You can specify your MCP auth token using the header x-mcp-auth. LiteLLM will forward this token to your MCP server for authentication.

Connect via OpenAI Responses API with MCP Auth​

Use the OpenAI Responses API and include the x-mcp-auth header for your MCP server authentication:

cURL Example with MCP Auth
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
  "model": "gpt-4o",
  "tools": [
    {
      "type": "mcp",
      "server_label": "litellm",
      "server_url": "<your-litellm-proxy-base-url>/mcp",
      "require_approval": "never",
      "headers": {
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
        "x-mcp-auth": "YOUR_MCP_AUTH_TOKEN"
      }
    }
  ],
  "input": "Run available tools",
  "tool_choice": "required"
}'

Customize the MCP Auth Header Name​

By default, LiteLLM uses x-mcp-auth to pass your credentials to MCP servers. You can change this header name in one of the following ways:

  1. Set the LITELLM_MCP_CLIENT_SIDE_AUTH_HEADER_NAME environment variable
Environment Variable
export LITELLM_MCP_CLIENT_SIDE_AUTH_HEADER_NAME="authorization"
  2. Set mcp_client_side_auth_header_name under general_settings in the config.yaml file
config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-xxxxxxx

general_settings:
  mcp_client_side_auth_header_name: "authorization"

Using the authorization header​

In this example, the authorization header is passed to the MCP server for authentication.

cURL with authorization header
curl --location '<your-litellm-proxy-base-url>/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $LITELLM_API_KEY" \
--data '{
  "model": "gpt-4o",
  "tools": [
    {
      "type": "mcp",
      "server_label": "litellm",
      "server_url": "<your-litellm-proxy-base-url>/mcp",
      "require_approval": "never",
      "headers": {
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
        "authorization": "Bearer sk-zapier-token-123"
      }
    }
  ],
  "input": "Run available tools",
  "tool_choice": "required"
}'

MCP Cost Tracking​

LiteLLM provides cost tracking for MCP tool calls, allowing you to monitor and control expenses associated with MCP operations. You can configure costs at two levels:

  • Default cost per tool: Set a uniform cost for all tools from a specific MCP server
  • Tool-specific costs: Define individual costs for specific tools (e.g., search_tool costs $10, while get_weather costs $5)

Configure cost tracking​

LiteLLM offers two approaches to track MCP tool costs, each designed for different use cases:

| Method | Best For | Capabilities |
|--------|----------|--------------|
| UI/Config-based Cost Tracking | Simple, static cost tracking scenarios | • Set default costs for all server tools<br />• Configure individual tool costs<br />• Automatic cost tracking based on configuration |
| Custom Post-MCP Hook | Dynamic, complex cost tracking requirements | • Custom cost calculation logic<br />• Real-time cost adjustments<br />• Response modification capabilities |

Configuration on UI/config.yaml​

On the UI, when adding a new MCP server, navigate to the "Cost Configuration" tab to configure costs for that server.
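The config.yaml equivalent can be sketched as follows; the mcp_server_cost_info field names mirror the two cost levels described above but are assumptions to verify against your proxy version's config reference:

config.yaml
mcp_servers:
  zapier_server:
    url: "https://actions.zapier.com/mcp/sk-xxxxx/sse"
    mcp_info:
      mcp_server_cost_info:
        # Default cost applied to every tool on this server
        default_cost_per_query: 5.00
        # Tool-specific overrides
        tool_name_to_cost_per_query:
          search_tool: 10.00
          get_weather: 5.00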

Custom Post-MCP Hook​

Use this when you need dynamic cost calculation or want to modify the MCP response before it's returned to the user.

1. Create a custom MCP hook file​

custom_mcp_hook.py
from typing import Optional

from litellm.integrations.custom_logger import CustomLogger
from litellm.types.mcp import MCPPostCallResponseObject


class CustomMCPCostTracker(CustomLogger):
    """
    Custom handler for MCP cost tracking and response modification
    """

    async def async_post_mcp_tool_call_hook(
        self,
        kwargs,
        response_obj: MCPPostCallResponseObject,
        start_time,
        end_time,
    ) -> Optional[MCPPostCallResponseObject]:
        """
        Called after each MCP tool call.
        Modify costs and response before returning to user.
        """
        # Extract tool information from kwargs
        tool_name = kwargs.get("name", "")
        server_name = kwargs.get("server_name", "")

        # Calculate custom cost based on your logic
        custom_cost = 42.00

        # Set the response cost
        response_obj.hidden_params.response_cost = custom_cost

        return response_obj


# Create instance for LiteLLM to use
custom_mcp_cost_tracker = CustomMCPCostTracker()

2. Configure in config.yaml​

config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-xxxxxxx

# Add your custom MCP hook
callbacks:
  - custom_mcp_hook.custom_mcp_cost_tracker

mcp_servers:
  zapier_server:
    url: "https://actions.zapier.com/mcp/sk-xxxxx/sse"

3. Start the proxy​

$ litellm --config /path/to/config.yaml 

When MCP tools are called, your custom hook will:

  1. Calculate costs based on your custom logic
  2. Modify the response if needed
  3. Track costs in LiteLLM's logging system

✨ MCP Permission Management​

LiteLLM supports managing permissions for MCP Servers by Key, Team, and Organization (entities) on LiteLLM. When an MCP client attempts to list tools, LiteLLM only returns the tools the entity has permission to access.

When Creating a Key, Team, or Organization, you can select the allowed MCP Servers that the entity has access to.

LiteLLM Proxy - MCP Gateway Walkthrough

LiteLLM exposes an MCP Gateway for admins to add all their MCP servers to LiteLLM. The key benefits of using LiteLLM Proxy with MCP are:

  1. Use a fixed endpoint for all MCP tools
  2. MCP Permission management by Key, Team, or User

This video demonstrates how you can onboard an MCP server to LiteLLM Proxy, use it, and set access controls.

LiteLLM Python SDK MCP Bridge​

The LiteLLM Python SDK acts as an MCP bridge, letting you use MCP tools with all LiteLLM supported models. LiteLLM offers the following features for using MCP:

  • List Available MCP Tools: OpenAI clients can view all available MCP tools
    • litellm.experimental_mcp_client.load_mcp_tools to list all available MCP tools
  • Call MCP Tools: OpenAI clients can call MCP tools
    • litellm.experimental_mcp_client.call_openai_tool to call an OpenAI tool on an MCP server

1. List Available MCP Tools​

In this example we'll use litellm.experimental_mcp_client.load_mcp_tools to list all available MCP tools on any MCP server. This method can be used in two ways:

  • format="mcp" - (default) Return MCP tools
    • Returns: mcp.types.Tool
  • format="openai" - Return MCP tools converted to OpenAI API compatible tools. Allows using with OpenAI endpoints.
    • Returns: openai.types.chat.ChatCompletionToolParam
MCP Client List Tools
# Create server parameters for stdio connection
import asyncio
import json
import os

import litellm
from litellm import experimental_mcp_client
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="python3",
    # Make sure to update to the full absolute path to your mcp_server.py file
    args=["./mcp_server.py"],
)


async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # Get tools in OpenAI format
            tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
            print("MCP TOOLS: ", tools)

            messages = [{"role": "user", "content": "what's (3 + 5)"}]
            llm_response = await litellm.acompletion(
                model="gpt-4o",
                api_key=os.getenv("OPENAI_API_KEY"),
                messages=messages,
                tools=tools,
            )
            print("LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str))


asyncio.run(main())

2. List and Call MCP Tools​

In this example we'll use

  • litellm.experimental_mcp_client.load_mcp_tools to list all available MCP tools on any MCP server
  • litellm.experimental_mcp_client.call_openai_tool to call an OpenAI tool on an MCP server

The first LLM response contains a list of OpenAI tool calls. We take the first tool call from the response and pass it to litellm.experimental_mcp_client.call_openai_tool, which calls the tool on the MCP server.

How litellm.experimental_mcp_client.call_openai_tool works​

  • Accepts an OpenAI Tool Call from the LLM response
  • Converts the OpenAI Tool Call to an MCP Tool
  • Calls the MCP Tool on the MCP server
  • Returns the result of the MCP Tool call
MCP Client List and Call Tools
# Create server parameters for stdio connection
import asyncio
import json
import os

import litellm
from litellm import experimental_mcp_client
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="python3",
    # Make sure to update to the full absolute path to your mcp_server.py file
    args=["./mcp_server.py"],
)


async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # Get tools in OpenAI format
            tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
            print("MCP TOOLS: ", tools)

            messages = [{"role": "user", "content": "what's (3 + 5)"}]
            llm_response = await litellm.acompletion(
                model="gpt-4o",
                api_key=os.getenv("OPENAI_API_KEY"),
                messages=messages,
                tools=tools,
            )
            print("LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str))

            # Take the first tool call from the LLM response
            openai_tool = llm_response["choices"][0]["message"]["tool_calls"][0]
            # Call the tool using MCP client
            call_result = await experimental_mcp_client.call_openai_tool(
                session=session,
                openai_tool=openai_tool,
            )
            print("MCP TOOL CALL RESULT: ", call_result)

            # Send the tool result back to the LLM
            messages.append(llm_response["choices"][0]["message"])
            messages.append(
                {
                    "role": "tool",
                    "content": str(call_result.content[0].text),
                    "tool_call_id": openai_tool["id"],
                }
            )
            print("final messages with tool result: ", messages)
            llm_response = await litellm.acompletion(
                model="gpt-4o",
                api_key=os.getenv("OPENAI_API_KEY"),
                messages=messages,
                tools=tools,
            )
            print(
                "FINAL LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str)
            )


asyncio.run(main())