[Preview] v1.80.10.rc.1 - Agent Gateway: Azure Foundry & Bedrock AgentCore
Deploy this version
Docker:
```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  docker.litellm.ai/berriai/litellm:v1.80.10.rc.1
```
Pip:
```shell
pip install litellm==1.80.10
```
Key Highlights
- Agent (A2A) Gateway with Cost Tracking - Track per-query agent costs, set per-token pricing, and view agent usage in the dashboard
- 2 New Agent Providers - LangGraph Agents and Azure AI Foundry Agents for agentic workflows
- New Provider: SAP Gen AI Hub - Full support for SAP Generative AI Hub with chat completions
- New Bedrock Writer Models - Add Palmyra-X4 and Palmyra-X5 models on Bedrock
- OpenAI GPT-5.2 Models - Full support for GPT-5.2, GPT-5.2-pro, and Azure GPT-5.2 models with reasoning support
- 227 New Fireworks AI Models - Comprehensive model coverage for the Fireworks AI platform
- MCP Support on /chat/completions - Use MCP servers directly via chat completions endpoint
- Performance Improvements - Reduced memory leaks by 50%
Agent Gateway - 4 New Agent Providers
This release adds support for agents from the following providers:
- LangGraph Agents - Deploy and manage LangGraph-based agents
- Azure AI Foundry Agents - Enterprise agent deployments on Azure
- Bedrock AgentCore - AWS Bedrock agent integration
- A2A Agents - Agent-to-Agent protocol support
AI Gateway admins can now add agents from any of these providers, and developers can invoke them through a unified interface using the A2A protocol.
For all agent requests running through the AI Gateway, LiteLLM automatically tracks request/response logs, cost, and token usage.
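To make the flow concrete, here is a minimal sketch of invoking a registered agent through the gateway. It assumes a proxy at http://localhost:4000, a virtual key, an agent registered under the illustrative name my-langgraph-agent, and an /a2a/{agent_name} route that accepts standard A2A JSON-RPC message/send requests; none of these details are confirmed here, so check the A2A Gateway docs for the exact request shape.

```python
# Hedged sketch: invoke an agent registered on the LiteLLM AI Gateway via the
# A2A JSON-RPC "message/send" method. The proxy URL, virtual key, agent name,
# and per-agent route below are illustrative assumptions, not confirmed details.
import requests

LITELLM_PROXY = "http://localhost:4000"   # assumed proxy base URL
API_KEY = "sk-1234"                       # assumed virtual key

payload = {
    "jsonrpc": "2.0",
    "id": "1",
    "method": "message/send",             # standard A2A method
    "params": {
        "message": {
            "role": "user",
            "parts": [{"kind": "text", "text": "Summarize today's open incidents."}],
            "messageId": "msg-001",
        }
    },
}

resp = requests.post(
    f"{LITELLM_PROXY}/a2a/my-langgraph-agent",   # assumed per-agent route
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
print(resp.json())
```

Because the call goes through the gateway, the request/response logs, cost, and token usage for the agent are captured automatically, just like any LLM request.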
Agent (A2A) Usage UI
Users can now filter usage statistics by agent, with the same granular filtering available for teams, organizations, and customers.
Details:
- Filter usage analytics, spend logs, and activity metrics by agent ID
- View breakdowns on a per-agent basis
- Consistent filtering experience across all usage and analytics views
New Providers and Endpoints
New Providers (5 new providers)
| Provider | Supported LiteLLM Endpoints | Description |
|---|---|---|
| SAP Gen AI Hub | /chat/completions, /messages, /responses | SAP Generative AI Hub integration for enterprise AI |
| LangGraph | /chat/completions, /messages, /responses, /a2a | LangGraph agents for agentic workflows |
| Azure AI Foundry Agents | /chat/completions, /messages, /responses, /a2a | Azure AI Foundry Agents for enterprise agent deployments |
| Voyage AI Rerank | /rerank | Voyage AI rerank models support |
| Fireworks AI Rerank | /rerank | Fireworks AI rerank endpoint support |
New LLM API Endpoints (4 new endpoints)
| Endpoint | Method | Description | Documentation |
|---|---|---|---|
| /containers/{id}/files | GET | List files in a container | Docs |
| /containers/{id}/files/{file_id} | GET | Retrieve container file metadata | Docs |
| /containers/{id}/files/{file_id} | DELETE | Delete a file from a container | Docs |
| /containers/{id}/files/{file_id}/content | GET | Retrieve container file content | Docs |
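As a rough illustration of the routes above, the sketch below lists a container's files, downloads one, and deletes it through the proxy. The base URL, key, and container ID are placeholders, and the list-response shape is assumed to mirror the OpenAI Containers API.

```python
# Hedged sketch: exercise the new container-file routes via a LiteLLM proxy.
# Base URL, virtual key, and container ID are placeholders.
import requests

BASE = "http://localhost:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}
container_id = "cntr_abc123"  # placeholder container ID

# GET /containers/{id}/files - list files in the container
files = requests.get(f"{BASE}/containers/{container_id}/files", headers=HEADERS).json()
print(files)

# Assumes an OpenAI-style list response with a "data" array of file objects
file_id = files["data"][0]["id"]

# GET /containers/{id}/files/{file_id}/content - download the file's bytes
content = requests.get(
    f"{BASE}/containers/{container_id}/files/{file_id}/content", headers=HEADERS
)
print(content.content[:200])

# DELETE /containers/{id}/files/{file_id} - remove the file from the container
requests.delete(f"{BASE}/containers/{container_id}/files/{file_id}", headers=HEADERS)
```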
New Models / Updated Models
New Model Support (270+ new models)
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| OpenAI | gpt-5.2 | 400K | $1.75 | $14.00 | Reasoning, vision, PDF, caching |
| OpenAI | gpt-5.2-pro | 400K | $21.00 | $168.00 | Reasoning, web search, vision |
| Azure | azure/gpt-5.2 | 400K | $1.75 | $14.00 | Reasoning, vision, PDF, caching |
| Azure | azure/gpt-5.2-pro | 400K | $21.00 | $168.00 | Reasoning, web search |
| Bedrock | us.writer.palmyra-x4-v1:0 | 128K | $2.50 | $10.00 | Function calling, PDF input |
| Bedrock | us.writer.palmyra-x5-v1:0 | 1M | $0.60 | $6.00 | Function calling, PDF input |
| Bedrock | eu.anthropic.claude-opus-4-5-20251101-v1:0 | 200K | $5.00 | $25.00 | Reasoning, computer use, vision |
| Bedrock | google.gemma-3-12b-it | 128K | $0.10 | $0.30 | Audio input |
| Bedrock | moonshot.kimi-k2-thinking | 128K | $0.60 | $2.50 | Reasoning |
| Bedrock | nvidia.nemotron-nano-12b-v2 | 128K | $0.20 | $0.60 | Vision |
| Bedrock | qwen.qwen3-next-80b-a3b | 128K | $0.15 | $1.20 | Function calling |
| Vertex AI | vertex_ai/deepseek-ai/deepseek-v3.2-maas | 164K | $0.56 | $1.68 | Reasoning, caching |
| Mistral | mistral/codestral-2508 | 256K | $0.30 | $0.90 | Function calling |
| Mistral | mistral/devstral-2512 | 256K | $0.40 | $2.00 | Function calling |
| Mistral | mistral/labs-devstral-small-2512 | 256K | $0.10 | $0.30 | Function calling |
| Cerebras | cerebras/zai-glm-4.6 | 128K | - | - | Chat completions |
| NVIDIA NIM | nvidia_nim/ranking/nvidia/llama-3.2-nv-rerankqa-1b-v2 | - | Free | Free | Rerank |
| Voyage | voyage/rerank-2.5 | 32K | $0.05/1K tokens | - | Rerank |
| Fireworks AI | 227 new models | Various | Various | Various | Full model catalog |
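To illustrate, here is a hedged sketch of calling two of the newly added models with the LiteLLM Python SDK. It assumes the relevant provider credentials (OpenAI and AWS) are configured in the environment, and the reasoning_effort value is only an assumption about what the GPT-5.2 reasoning models accept.

```python
# Hedged sketch: call two of the newly added models via the LiteLLM SDK.
# Assumes OPENAI_API_KEY and AWS credentials are set in the environment.
import litellm

# GPT-5.2 (reasoning model per the table above)
resp = litellm.completion(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Plan a 3-step database migration."}],
    reasoning_effort="medium",  # assumed to be accepted by this model
)
print(resp.choices[0].message.content)

# Writer Palmyra X5 on Bedrock
resp = litellm.completion(
    model="bedrock/us.writer.palmyra-x5-v1:0",
    messages=[{"role": "user", "content": "Draft a one-line release summary."}],
)
print(resp.choices[0].message.content)
```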
Features
- OpenAI
- Azure
- Add Azure GPT-5.2 models support - PR #17866
- Azure AI
- Anthropic
- Prevent duplicate tool_result blocks with same tool - PR #17632
- Handle partial JSON chunks in streaming responses - PR #17493
- Preserve server_tool_use and web_search_tool_result in multi-turn conversations - PR #17746
- Capture web_search_tool_result in streaming for multi-turn conversations - PR #17798
- Add retrieve batches and retrieve file content support - PR #17700
- Bedrock
- Gemini
- Vertex AI
- Mistral
- Add Codestral 2508, Devstral 2512 models - PR #17801
- Cerebras
- DeepSeek
- Add native support for thinking and reasoning_effort params - PR #17712
- NVIDIA NIM Rerank
- Add llama-3.2-nv-rerankqa-1b-v2 rerank model (see the rerank sketch after this list) - PR #17670
- Fireworks AI
- Add 227 new Fireworks AI models - PR #17692
- Dashscope
- Fix default base_url error - PR #17584
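For the new rerank additions (Voyage rerank-2.5 and the NVIDIA NIM llama-3.2-nv-rerankqa-1b-v2 model noted above), a minimal sketch with litellm.rerank() could look like the following; it assumes the provider API key (e.g. VOYAGE_API_KEY) is set in the environment.

```python
# Hedged sketch: rerank a few documents with one of the newly added rerank models.
# Assumes VOYAGE_API_KEY is set; swap the model string for the NVIDIA NIM entry
# listed above to target that provider instead.
import litellm

docs = [
    "LiteLLM is an LLM gateway.",
    "Rerank models reorder documents by relevance to a query.",
    "A2A is an agent-to-agent protocol.",
]

resp = litellm.rerank(
    model="voyage/rerank-2.5",
    query="What does a rerank model do?",
    documents=docs,
    top_n=2,
)
print(resp.results)
```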
Bug Fixes
- Anthropic
- Azure
- Fix error about encoding video id for Azure - PR #17708
- Azure AI
- Fix LLM provider for azure_ai in model map - PR #17805
- Watsonx
- Fix Watsonx Audio Transcription to only send supported params to API - PR #17840
- Router
LLM API Endpoints
Features
- Responses API
- Add usage details in responses usage object - PR #17641
- Fix error for response API polling - PR #17654
- Fix streaming tool_calls being dropped when a response contains both text and tool_calls - PR #17652
- Transform image content in tool results for Responses API - PR #17799
- Fix Responses API not applying TPM rate limits on API keys - PR #17707
- Containers API
- Rerank API
- Add support for forwarding client headers in /rerank endpoint - PR #17873
- Files API
- Add support for the expires_after param in the Files endpoint (see the sketch after this list) - PR #17860
- Video API
- Embeddings API
- Fix handling token array input decoding for embeddings - PR #17468
- Chat Completions API
- Add v0 target storage support - store files in Azure AI storage and use with chat completions API - PR #17758
- generateContent API
- Support model names with slashes on Gemini generateContent endpoints - PR #17743
- General
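For the new expires_after support on the Files endpoint (referenced above), a sketch using the OpenAI SDK pointed at a LiteLLM proxy might look like this. The base URL, key, filename, and expiry window are placeholders, and the parameter shape follows the upstream OpenAI Files API, assuming a recent openai package that forwards expires_after.

```python
# Hedged sketch: upload a file with expires_after through a LiteLLM proxy,
# using the OpenAI SDK. URL, key, filename, and expiry are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

uploaded = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
    expires_after={"anchor": "created_at", "seconds": 2592000},  # ~30 days
)
print(uploaded.id)
```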
Bugs
- General
- Fix handling of string content in is_cached_message - PR #17853
Management Endpoints / UI
Features
- UI Settings
- Agent & Usage UI
- Daily Agent Usage Backend - PR #17781
- Agent Usage UI - PR #17797
- Add agent cost tracking on UI - PR #17899
- New Badge for Agent Usage - PR #17883
- Usage Entity labels for filtering - PR #17896
- Agent Usage Page minor fixes - PR #17901
- Usage Page View Select component - PR #17854
- Usage Page Components refactor - PR #17848
- Logs & Spend
- Virtual Keys
- Fix x-litellm-key-spend header update - PR #17864
- Models & Endpoints
- SSO & Auth
- Teams
- MCP Server Management
- Add extra_headers and allowed_tools to UpdateMCPServerRequest - PR #17940
- Notifications
- Show progress and pause on hover for Notifications - PR #17942
- General
Bugs
- UI Fixes
- Fix links + old login page deprecation message - PR #17624
- Filtering for Chat UI Endpoint Selector - PR #17567
- Race Condition Handling in SCIM v2 - PR #17513
- Make /litellm_model_cost_map public - PR #16795
- Custom Callback on UI - PR #17522
- Add User Writable Directory to Non Root Docker for Logo - PR #17180
- Swap URL Input and Display Name inputs - PR #17682
- Change deprecation banner to only show on /sso/key/generate - PR #17681
- Change credential encryption to only affect db credentials - PR #17741
- Auth & Routes
AI Integrations
New Integrations (4 new integrations)
| Integration | Type | Description |
|---|---|---|
| SumoLogic | Logging | Native webhook integration for SumoLogic - PR #17630 |
| Arize Phoenix | Prompt Management | Arize Phoenix OSS prompt management integration - PR #17750 |
| Sendgrid | | Sendgrid email notifications integration - PR #17775 |
| Onyx | Guardrails | Onyx guardrail hooks integration - PR #16591 |
Logging
- Langfuse
- Prometheus
- Add 'exception_status' to prometheus logger - PR #17847
- OpenTelemetry
- Add latency metrics (TTFT, TPOT, Total Generation Time) to OTEL payload - PR #17888
- General
- Add polling via cache feature for async logging - PR #16862
Guardrails
- HiddenLayer
- Add HiddenLayer Guardrail Hooks - PR #17728
- Pillar Security
- Add opt-in evidence results for Pillar Security guardrail during monitoring - PR #17812
- PANW Prisma AIRS
- Add configurable fail-open, timeout, and app_user tracking - PR #17785
- Presidio
- Add support for configurable confidence score thresholds and scope in Presidio PII masking - PR #17817
- LiteLLM Content Filter
- Mask all regex pattern matches, not just first - PR #17727
- Regex Guardrails
- Add enhanced regex pattern matching for guardrails - PR #17915
- Gray Swan Guardrail
- Add passthrough mode for model response - PR #17102
Prompt Management
- General
- New API for integrating prompt management providers - PR #17829
Spend Tracking, Budgets and Rate Limiting
- Service Tier Pricing - Extract service_tier from response/usage for OpenAI flex pricing - PR #17748
- Agent Cost Tracking - Track agent_id in SpendLogs - PR #17795
- Tag Activity - Deduplicate /tag/daily/activity metadata - PR #16764
- Rate Limiting - Dynamic Rate Limiter - allow specifying a TTL for the in-memory cache - PR #17679
MCP Gateway
- Chat Completions Integration - Add support for using MCP servers on /chat/completions (see the sketch after this list) - PR #17747
- UI Session Permissions - Fix UI session MCP permissions across real teams - PR #17620
- OAuth Callback - Fix MCP OAuth callback routing and URL handling - PR #17789
- Tool Name Prefix - Fix MCP tool name prefix - PR #17908
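For MCP on /chat/completions (referenced above), a hedged sketch is shown below. It assumes an OpenAI Responses-style "mcp" tool entry in the tools array; the field names may not match LiteLLM's exact schema, so treat them as illustrative and consult the MCP Gateway docs for the supported shape.

```python
# Hedged sketch: attach an MCP server to a /chat/completions call through the proxy.
# The "mcp" tool entry mirrors the OpenAI Responses-style shape and is an
# illustrative assumption; LiteLLM's accepted schema may differ.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

resp = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "List the open tickets in my tracker."}],
    tools=[
        {
            "type": "mcp",                          # assumed tool type
            "server_label": "ticket_tracker",       # assumed field names
            "server_url": "https://example.com/mcp",
            "require_approval": "never",
        }
    ],
)
print(resp.choices[0].message)
```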
Agent Gateway (A2A)
- Cost Per Query - Add cost per query for agent invocations - PR #17774
- Token Counting - Add token counting non streaming + streaming - PR #17779
- Cost Per Token - Add cost per token pricing for A2A - PR #17780
- LangGraph Provider - Add LangGraph provider for Agent Gateway - PR #17783
- Bedrock & LangGraph Agents - Allow using Bedrock AgentCore, LangGraph agents with A2A Gateway - PR #17786
- Agent Management - Allow adding LangGraph, Bedrock Agent Core agents - PR #17802
- Azure Foundry Agents - Add Azure AI Foundry Agents support - PR #17845
- Azure Foundry UI - Allow adding Azure Foundry Agents on UI - PR #17909
- Azure Foundry Fixes - Ensure Azure Foundry agents work correctly - PR #17943
Performance / Loadbalancing / Reliability improvements
- Memory Leak Fix - Cut memory leak in half - PR #17784
- Spend Logs Memory - Reduce memory accumulation of spend_logs - PR #17742
- Router Optimization - Replace time.perf_counter() with time.time() - PR #17881
- Filter Internal Params - Filter internal params in fallback code - PR #17941
- Gunicorn Suggestion - Suggest Gunicorn instead of uvicorn when using max_requests_before_restart - PR #17788
- Pydantic Warnings - Mitigate PydanticDeprecatedSince20 warnings - PR #17657
- Python 3.14 Support - Add Python 3.14 support via grpcio version constraints - PR #17666
- OpenAI Package - Bump openai package to 2.9.0 - PR #17818
Documentation Updates
- Contributing - Update clone instructions to recommend forking first - PR #17637
- Getting Started - Improve Getting Started page and SDK documentation structure - PR #17614
- JSON Mode - Make it clearer how to get Pydantic model output - PR #17671
- drop_params - Update litellm docs for drop_params - PR #17658
- Environment Variables - Document missing environment variables and fix incorrect types - PR #17649
- SumoLogic - Add SumoLogic integration documentation - PR #17647
- SAP Gen AI - Add SAP Gen AI provider documentation - PR #17667
- Authentication - Add Note for Authentication - PR #17733
- Known Issues - Add known issues to the 1.80.5-stable docs - PR #17738
- Supported Endpoints - Fix Supported Endpoints page - PR #17710
- Token Count - Document token count endpoint - PR #17772
- Overview - Clarify the difference between the LiteLLM proxy and SDK in the overview with a comparison table - PR #17790
- Containers API - Add docs for containers files API + code interpreter on LiteLLM - PR #17749
- Target Storage - Add documentation for target storage - PR #17882
- Agent Usage - Agent Usage documentation - PR #17931, PR #17932, PR #17934
- Cursor Integration - Cursor Integration documentation - PR #17855, PR #17939
- A2A Cost Tracking - A2A cost tracking docs - PR #17913
- Azure Search - Update azure search docs - PR #17726
- Milvus Client - Fix milvus client docs - PR #17736
- Streaming Logging - Remove streaming logging doc - PR #17739
- Integration Docs - Update integration docs location - PR #17644
- Links - Updated docs links for mistral and anthropic - PR #17852
- Community - Add community doc link - PR #17734
- Pricing - Update pricing for global.anthropic.claude-haiku-4-5-20251001-v1:0 - PR #17703
- gpt-image-1-mini - Correct model type for gpt-image-1-mini - PR #17635
Infrastructure / Deployment
- Docker - Use python instead of wget for healthcheck in docker-compose.yml - PR #17646
- Helm Chart - Add extraResources support for Helm chart deployments - PR #17627
- Helm Versioning - Add semver prerelease suffix to helm chart versions - PR #17678
- Database Schema - Add storage_backend and storage_url columns to schema.prisma for target storage feature - PR #17936
New Contributors
- @xianzongxie-stripe made their first contribution in PR #16862
- @krisxia0506 made their first contribution in PR #17637
- @chetanchoudhary-sumo made their first contribution in PR #17630
- @kevinmarx made their first contribution in PR #17632
- @expruc made their first contribution in PR #17627
- @rcII made their first contribution in PR #17626
- @tamirkiviti13 made their first contribution in PR #16591
- @Eric84626 made their first contribution in PR #17629
- @vasilisazayka made their first contribution in PR #16053
- @juliettech13 made their first contribution in PR #17663
- @jason-nance made their first contribution in PR #17660
- @yisding made their first contribution in PR #17671
- @emilsvennesson made their first contribution in PR #17656
- @kumekay made their first contribution in PR #17646
- @chenzhaofei01 made their first contribution in PR #17584
- @shivamrawat1 made their first contribution in PR #17733
- @ephrimstanley made their first contribution in PR #17723
- @hwittenborn made their first contribution in PR #17743
- @peterkc made their first contribution in PR #17727
- @saisurya237 made their first contribution in PR #17725
- @Ashton-Sidhu made their first contribution in PR #17728
- @CyrusTC made their first contribution in PR #17810
- @jichmi made their first contribution in PR #17703
- @ryan-crabbe made their first contribution in PR #17852
- @nlineback made their first contribution in PR #17851
- @butnarurazvan made their first contribution in PR #17468
- @yoshi-p27 made their first contribution in PR #17915

