# SAP Generative AI Hub
LiteLLM supports SAP Generative AI Hub's Orchestration Service.
| Property | Details |
|---|---|
| Description | SAP's Generative AI Hub provides access to OpenAI, Anthropic, Gemini, Mistral, NVIDIA, Amazon, and SAP LLMs through the AI Core orchestration service. |
| Provider Route on LiteLLM | `sap/` |
| Supported Endpoints | `/chat/completions`, `/embeddings` |
| API Reference | SAP AI Core Documentation |
## Authentication
SAP Generative AI Hub uses service key authentication. You can provide credentials via:

- **Environment variable** - Set `AICORE_SERVICE_KEY` with your service key JSON
- **Environment variables** - Set the individual credentials in a `.env` file
- **Direct parameter** - Pass `api_key` with the service key JSON string (see the sketch at the end of this section)

**Environment Variable**

```python
import os

os.environ["AICORE_SERVICE_KEY"] = '{"clientid": "...", "clientsecret": "...", ...}'
```

**Environment Variables (.env file)**

```shell
AICORE_AUTH_URL="https://***.authentication.sap.hana.ondemand.com/oauth/token"
AICORE_CLIENT_ID="***"
AICORE_CLIENT_SECRET="***"
AICORE_RESOURCE_GROUP="***"
AICORE_BASE_URL="https://api.ai.***.cfapps.sap.hana.ondemand.com/v2"
```
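For the direct-parameter option, the service key JSON string goes straight into the call. A minimal sketch, with placeholder credentials:

```python
from litellm import completion

# Placeholder service key JSON; substitute your real SAP AI Core service key.
service_key = '{"clientid": "...", "clientsecret": "...", ...}'

response = completion(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "Hello from LiteLLM"}],
    api_key=service_key,  # passed directly instead of setting AICORE_SERVICE_KEY
)
print(response)
```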
## Usage - LiteLLM Python SDK
**SAP Chat Completion**

```python
from litellm import completion
import os

os.environ["AICORE_SERVICE_KEY"] = '{"clientid": "...", "clientsecret": "...", ...}'

response = completion(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "Hello from LiteLLM"}]
)
print(response)
```
**SAP Chat Completion - Streaming**

```python
from litellm import completion
import os

os.environ["AICORE_SERVICE_KEY"] = '{"clientid": "...", "clientsecret": "...", ...}'

response = completion(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "Hello from LiteLLM"}],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```
**SAP Embedding**

```python
from litellm import embedding
import os

os.environ["AICORE_SERVICE_KEY"] = '{"clientid": "...", "clientsecret": "...", ...}'

result = embedding(
    model="sap/text-embedding-3-small",
    input="Answer to the ultimate question of life, the universe, and everything is 42"
)
print(result.data[0])
```
## Usage - LiteLLM Proxy
Add to your LiteLLM Proxy config:

**config.yaml**

```yaml
model_list:
  - model_name: "sap/*"
    litellm_params:
      model: "sap/*"

general_settings:
  master_key: your-proxy-api-key

environment_variables:
  AICORE_SERVICE_KEY: '{"clientid": "...", "clientsecret": "...", ...}'
```
Start the proxy:

```shell
litellm --config config.yaml
```
**Test Request (cURL)**

```shell
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-api-key" \
  -d '{
    "model": "sap/gpt-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
**OpenAI SDK**

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="your-proxy-api-key"
)

response = client.chat.completions.create(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)
```
**LiteLLM SDK**

```python
import os
import litellm

os.environ["LITELLM_PROXY_API_KEY"] = "your-proxy-api-key"
litellm.use_litellm_proxy = True  # required so the SDK routes requests through the proxy

response = litellm.completion(
    model="sap/gpt-4o",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    api_base="http://your-proxy-api-base"
)
print(response)
```
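The proxy also exposes the `/embeddings` route listed in the endpoints table above. A minimal sketch using the OpenAI SDK against the proxy, assuming the `sap/text-embedding-3-small` model from the SDK example is reachable through the wildcard route:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key="your-proxy-api-key"
)

# Embedding request routed through the proxy's OpenAI-compatible /embeddings endpoint.
emb = client.embeddings.create(
    model="sap/text-embedding-3-small",
    input="Hello from LiteLLM"
)
print(emb.data[0].embedding[:5])  # first few dimensions of the vector
```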
## Supported Parameters
| Parameter | Description |
|---|---|
| `temperature` | Controls randomness |
| `max_tokens` | Maximum tokens in response |
| `top_p` | Nucleus sampling |
| `tools` | Function calling tools |
| `tool_choice` | Tool selection behavior |
| `response_format` | Output format (`json_object`, `json_schema`) |
| `stream` | Enable streaming |
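These parameters pass through `completion()` in the standard OpenAI format. A minimal sketch combining several of them, assuming `AICORE_SERVICE_KEY` is set; the `get_weather` tool definition is a hypothetical example for illustration, not part of the SAP provider:

```python
from litellm import completion

# Hypothetical tool definition in OpenAI function-calling format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool name
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = completion(
    model="sap/gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    temperature=0.2,
    max_tokens=256,
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message)
```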