v1.74.9-stable - Auto-Router

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Deploy this version

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:v1.74.9-stable

Key Highlights

  • Auto-Router - Automatically route requests to specific models based on request content.
  • Model-level Guardrails - Only run guardrails when specific models are used.
  • MCP Header Propagation - Propagate headers from client to backend MCP.
  • New LLM Providers - Added Bedrock inpainting support and Recraft API support for image generation and image edits.

Auto-Router


This release introduces auto-routing to models based on request content: Proxy Admins can define a set of keywords that always route to specific models when users opt in to the auto-router.

This is great for internal use cases where you don't want users to think about which model to use - for example, routing coding prompts to Claude models and ad-copy prompts to GPT models.
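
Below is a minimal config sketch of what this can look like. The auto_router/ model prefix and the auto_router_* parameter names are assumptions based on this feature's documentation (see Read More for the exact schema); all values are illustrative:

model_list:
  - model_name: auto_router1
    litellm_params:
      model: auto_router/auto_router_1                  # assumed naming scheme
      auto_router_config_path: auto_router_config.json  # keyword/utterance -> model routes
      auto_router_default_model: gpt-4o-mini            # fallback when no route matches
      auto_router_embedding_model: text-embedding-3-large

Clients then opt in by sending model: "auto_router1" in their request; the proxy matches the request content against the configured routes and forwards the call to the mapped model.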

Read More


Model-level Guardrails


This release brings model-level guardrails support to your config.yaml + UI. This is great for cases where you have both an on-prem and a hosted model, and just want to prevent sending PII to the hosted model.

model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
      api_base: https://api.anthropic.com/v1
      guardrails: ["azure-text-moderation"] # 👈 KEY CHANGE

guardrails:
  - guardrail_name: azure-text-moderation
    litellm_params:
      guardrail: azure/text_moderations
      mode: "post_call"
      api_key: os.environ/AZURE_GUARDRAIL_API_KEY
      api_base: os.environ/AZURE_GUARDRAIL_API_BASE

Read More


MCP Header Propagation


v1.74.9-stable allows you to propagate MCP-server-specific authentication headers via LiteLLM:

  • Users can specify which header_name is propagated to which mcp_server (see the sketch after this list)
  • Different deployments of the same MCP server type can use different authentication headers
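
A hypothetical config sketch of the idea: the mcp_servers block is LiteLLM's existing MCP server config, but the headers field name and the URLs below are assumptions for illustration (see Read More for the exact schema):

mcp_servers:
  github_mcp:                              # first deployment of a server type
    url: https://api.githubcopilot.com/mcp
    headers: ["Authorization"]             # client header to forward (field name assumed)
  github_mcp_staging:                      # second deployment of the same server type
    url: https://staging.example.com/mcp   # hypothetical URL
    headers: ["X-Staging-Token"]           # a different auth header per deployment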

Read More


New Models / Updated Models

Pricing / Context Window Updates

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|---|
| Fireworks AI | fireworks/models/kimi-k2-instruct | 131k | $0.6 | $2.5 |
| OpenRouter | openrouter/qwen/qwen-vl-plus | 8192 | $0.21 | $0.63 |
| OpenRouter | openrouter/qwen/qwen3-coder | 8192 | $1 | $5 |
| OpenRouter | openrouter/bytedance/ui-tars-1.5-7b | 128k | $0.10 | $0.20 |
| Groq | groq/qwen/qwen3-32b | 131k | $0.29 | $0.59 |
| VertexAI | vertex_ai/meta/llama-3.1-8b-instruct-maas | 128k | $0.00 | $0.00 |
| VertexAI | vertex_ai/meta/llama-3.1-405b-instruct-maas | 128k | $5 | $16 |
| VertexAI | vertex_ai/meta/llama-3.2-90b-vision-instruct-maas | 128k | $0.00 | $0.00 |
| Google AI Studio | gemini/gemini-2.0-flash-live-001 | 1,048,576 | $0.35 | $1.5 |
| Google AI Studio | gemini/gemini-2.5-flash-lite | 1,048,576 | $0.1 | $0.4 |
| VertexAI | vertex_ai/gemini-2.0-flash-lite-001 | 1,048,576 | $0.35 | $1.5 |
| OpenAI | gpt-4o-realtime-preview-2025-06-03 | 128k | $5 | $20 |



LLM API Endpoints

Bugs

  1. /batches
    1. Skip invalid batch during cost tracking check (previously an invalid batch would stop all checks) - PR #12782
  2. /chat/completions
    1. Fix async retryer on .acompletion() - PR #12886

MCP Gateway

Features

  • Permission Management
    • Make permission management by key/team OSS - PR #12988
  • MCP Alias
    • Support mcp server aliases (useful for calling long mcp server names on Cursor) - PR #12994
  • Header Propagation
    • Support propagating headers from client to backend MCP (useful for sending personal access tokens to backend MCP) - PR #13003

Management Endpoints / UI

Features

  • Usage
    • Support viewing usage by model group - PR #12890
  • Virtual Keys
    • New key_type field on /key/generate - allows specifying if a key can call LLM API vs. Management routes (see the sketch after this list) - PR #12909
  • Models
    • Add ‘auto router’ on UI - PR #12960
    • Show global retry policy on UI - PR #12969
    • Add model-level guardrails on create + update - PR #13006
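
A rough sketch of generating a restricted key with the new field. The key_type value "llm_api" is an assumption of what the LLM-API-only option looks like (see PR #12909 for the exact values); the endpoint and auth header are standard proxy usage:

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"key_type": "llm_api"}'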

Bugs

  • SSO
    • Fix logout when SSO is enabled - PR #12703
    • Fix reset SSO when ui_access_mode is updated - PR #13011
  • Guardrails
    • Show correct guardrails when editing a team - PR #12823



Performance / Loadbalancing / Reliability improvements

Bugs

  • forward_clientside_headers
    • Filter out content-length from headers (caused backend requests to hang) - PR #12886
  • Message Redaction
    • Fix cannot pickle coroutine object error - PR #13005

General Proxy Improvements

Features

  • Benchmarks
    • Updated litellm proxy benchmarks (p50, p90, p99 overhead) - PR #12842
  • Request Headers
    • Added new x-litellm-num-retries request header (see the sketch after this list)
  • Swagger
    • Support local swagger on custom root paths - PR #12911
  • Health
    • Track cost + add tags for health checks done by LiteLLM Proxy - PR #12880
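
A rough usage sketch for the new header, assuming it sets the retry count for that individual request (the note above doesn't spell out the semantics):

curl 'http://localhost:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -H 'x-litellm-num-retries: 3' \
  -d '{"model": "claude-sonnet-4", "messages": [{"role": "user", "content": "hi"}]}'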

Bugs

  • Proxy Startup
    • Fix issue where a team member budget of None would block startup - PR #12843
  • Docker
    • Move non-root Docker image to a Chainguard base image (fewer vulnerabilities) - PR #12707
    • Add azure-keyvault==4.2.0 to the Docker image - PR #12873
  • Separate Health App
    • Pass through cmd args via supervisord (enables user config to still work via docker) - PR #12871
  • Swagger
    • Bump DOMPurify version (fixes vulnerability) - PR #12911
    • Add back local swagger bundle (enables swagger to work in air-gapped environments) - PR #12911
  • Request Headers
    • Make ‘user_header_name’ field check case insensitive (fixes customer budget enforcement for Open WebUI) - PR #12950
  • SpendLogs
    • Fix issues writing to DB when custom_llm_provider is None - PR #13001

New Contributors

Full Changelog