# SAP Generative AI Hub
LiteLLM supports SAP Generative AI Hub's Orchestration Service.
| Property | Details |
|---|---|
| Description | SAP's Generative AI Hub provides access to OpenAI, Anthropic, Gemini, Mistral, NVIDIA, Amazon, and SAP LLMs through the AI Core orchestration service. |
| Provider Route on LiteLLM | `sap/` |
| Supported Endpoints | `/chat/completions`, `/embeddings` |
| API Reference | SAP AI Core Documentation |
## Authentication
SAP Generative AI Hub uses service key authentication. You can provide credentials via:

- **Environment variable** - Set `AICORE_SERVICE_KEY` with your service key JSON
- **Environment variables** - Set the individual credentials in a `.env` file
- **Direct parameter** - Pass `api_key` with the service key JSON string (see the sketch at the end of this section)

**Environment Variable**

```python
import os

os.environ["AICORE_SERVICE_KEY"] = '{"clientid": "...", "clientsecret": "...", ...}'
```

**Environment Variables (.env file)**

```shell
AICORE_AUTH_URL="https://***.authentication.sap.hana.ondemand.com/oauth/token"
AICORE_CLIENT_ID="***"
AICORE_CLIENT_SECRET="***"
AICORE_RESOURCE_GROUP="***"
AICORE_BASE_URL="https://api.ai.***.cfapps.sap.hana.ondemand.com/v2"
```
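For the direct-parameter option, the service key JSON string goes straight into the call. A minimal sketch, with placeholder credentials:

```python
from litellm import completion

# Placeholder service key JSON; substitute your real SAP AI Core service key.
service_key = '{"clientid": "...", "clientsecret": "...", ...}'

response = completion(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "Hello from LiteLLM"}],
    api_key=service_key,  # passed directly instead of setting AICORE_SERVICE_KEY
)
print(response)
```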
## Usage - LiteLLM Python SDK
**SAP Chat Completion**

```python
from litellm import completion
import os

os.environ["AICORE_SERVICE_KEY"] = '{"clientid": "...", "clientsecret": "...", ...}'

response = completion(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "Hello from LiteLLM"}]
)
print(response)
```
**SAP Chat Completion - Streaming**

```python
from litellm import completion
import os

os.environ["AICORE_SERVICE_KEY"] = '{"clientid": "...", "clientsecret": "...", ...}'

response = completion(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "Hello from LiteLLM"}],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```
**SAP Embedding**

```python
from litellm import embedding
import os

os.environ["AICORE_SERVICE_KEY"] = '{"clientid": "...", "clientsecret": "...", ...}'

result = embedding(
    model="sap/text-embedding-3-small",
    input="Answer to the ultimate question of life, the universe, and everything is 42"
)
print(result.data[0])
```
## Usage - LiteLLM Proxy
Add to your LiteLLM Proxy config:

**config.yaml**

```yaml
model_list:
  - model_name: "sap/*"
    litellm_params:
      model: "sap/*"

general_settings:
  master_key: your-proxy-api-key

environment_variables:
  AICORE_SERVICE_KEY: '{"clientid": "...", "clientsecret": "...", ...}'
```
Start the proxy:

```shell
litellm --config config.yaml
```
**Test Request (cURL)**

```shell
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-api-key" \
  -d '{
    "model": "sap/gpt-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
**OpenAI SDK**

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="your-proxy-api-key"
)

response = client.chat.completions.create(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)
```
**LiteLLM SDK**

```python
import os
import litellm

os.environ["LITELLM_PROXY_API_KEY"] = "your-proxy-api-key"
litellm.use_litellm_proxy = True  # required so the SDK routes requests through the proxy

response = litellm.completion(
    model="sap/gpt-4o",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    api_base="http://your-proxy-api-base"
)
print(response)
```
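The proxy also exposes the `/embeddings` route listed in the endpoints table above. A minimal sketch using the OpenAI SDK against the proxy, assuming the `sap/text-embedding-3-small` model from the SDK example is reachable through the wildcard route:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="your-proxy-api-key"
)

# Embedding request routed through the proxy's OpenAI-compatible /embeddings endpoint.
emb = client.embeddings.create(
    model="sap/text-embedding-3-small",
    input="Hello from LiteLLM"
)
print(emb.data[0].embedding[:5])  # first few dimensions of the vector
```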
## Supported Parameters
| Parameter | Description |
|---|---|
| `temperature` | Controls randomness |
| `max_tokens` | Maximum tokens in response |
| `top_p` | Nucleus sampling |
| `tools` | Function calling tools |
| `tool_choice` | Tool selection behavior |
| `response_format` | Output format (`json_object`, `json_schema`) |
| `stream` | Enable streaming |
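These parameters pass through `completion()` in the standard OpenAI format. A minimal sketch combining several of them, assuming `AICORE_SERVICE_KEY` is set; the `get_weather` tool definition is a hypothetical example for illustration, not part of the SAP provider:

```python
from litellm import completion

# Hypothetical tool definition in OpenAI function-calling format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool name
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = completion(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    temperature=0.2,
    max_tokens=256,
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message)
```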