Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, Groq etc.]
LiteLLM manages:
- Translating inputs to the provider's completion, embedding, and image_generation endpoints
- Consistent output: text responses are always available at ['choices'][0]['message']['content']
Jump to LiteLLM Proxy (LLM Gateway) Docs
Jump to Supported LLM Providers
🚨 Stable Release: Use docker images with the -stable tag. These have undergone 12-hour load tests before being published. More information about the release cycle here.
Support for more providers: if a provider or LLM platform is missing, raise a feature request.
Important
LiteLLM v1.0.0 now requires openai>=1.0.0. Migration guide here.
LiteLLM v1.40.14+ now requires pydantic>=2.0.0. No changes required.
    from litellm import completion
    import os

    ## set ENV variables
    os.environ["OPENAI_API_KEY"] = "your-openai-key"
    os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

    messages = [{"content": "Hello, how are you?", "role": "user"}]

    # openai call
    response = completion(model="openai/gpt-4o", messages=messages)

    # anthropic call
    response = completion(model="anthropic/claude-sonnet-4-20250514", messages=messages)
    print(response)
    {
        "id": "chatcmpl-1214900a-6cdd-4148-b663-b5e2f642b4de",
        "created": 1751494488,
        "model": "claude-sonnet-4-20250514",
        "object": "chat.completion",
        "system_fingerprint": null,
        "choices": [
            {
                "finish_reason": "stop",
                "index": 0,
                "message": {
                    "content": "Hello! I'm doing well, thank you for asking. I'm here and ready to help with whatever you'd like to discuss or work on. How are you doing today?",
                    "role": "assistant",
                    "tool_calls": null,
                    "function_call": null
                }
            }
        ],
        "usage": {
            "completion_tokens": 39,
            "prompt_tokens": 13,
            "total_tokens": 52,
            "completion_tokens_details": null,
            "prompt_tokens_details": {
                "audio_tokens": null,
                "cached_tokens": 0
            },
            "cache_creation_input_tokens": 0,
            "cache_read_input_tokens": 0
        }
    }
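Because the response follows the OpenAI format, the reply text sits at the same path regardless of provider. The following is a minimal sketch, assuming the response has been converted to a plain dict shaped like the sample above; `reply_text` is a hypothetical helper for illustration, not part of LiteLLM.

```python
from typing import Any, Mapping


def reply_text(response: Mapping[str, Any]) -> str:
    """Return the assistant text from an OpenAI-format chat completion dict.

    Hypothetical helper: assumes a plain dict shaped like the sample response.
    """
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    content = choices[0]["message"]["content"]
    # content may be None, e.g. when the model answers with a tool call
    return content or ""


# Trimmed copy of the sample response, used only for illustration.
sample = {
    "model": "claude-sonnet-4-20250514",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "Hello! I'm doing well.", "role": "assistant"},
        }
    ],
    "usage": {"completion_tokens": 39, "prompt_tokens": 13, "total_tokens": 52},
}

print(reply_text(sample))
```

Treating a missing `content` as an empty string keeps callers from having to special-case responses that carry no text.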
Call any model supported by a provider, with model=<provider_name>/<model_name>. There may be provider-specific details, so refer to the provider docs for more information.
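To illustrate the `<provider_name>/<model_name>` convention, here is a small sketch that splits a model string at the first slash. `split_model` is a hypothetical helper written for this example; LiteLLM's own parsing may differ in details.

```python
from typing import Tuple


def split_model(model: str) -> Tuple[str, str]:
    """Split 'provider/model' at the first slash (illustrative only)."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected '<provider_name>/<model_name>', got {model!r}")
    return provider, name


print(split_model("openai/gpt-4o"))
print(split_model("huggingface/bigcode/starcoder"))
```

Splitting only at the first slash matters for names such as `huggingface/bigcode/starcoder`, where the model name itself contains a slash.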
    from litellm import acompletion
    import asyncio

    async def test_get_response():
        user_message = "Hello, how are you?"
        messages = [{"content": user_message, "role": "user"}]
        response = await acompletion(model="openai/gpt-4o", messages=messages)
        return response

    response = asyncio.run(test_get_response())
    print(response)
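One reason to use the async call is to send several prompts concurrently. The sketch below shows that pattern with `asyncio.gather`. To stay runnable without API keys it uses `fake_acompletion`, a stand-in coroutine invented for this example; `ask_all` is likewise a hypothetical helper.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def ask_all(
    call: Callable[..., Awaitable[Any]], model: str, prompts: List[str]
) -> List[Any]:
    """Send one request per prompt concurrently; results keep prompt order."""
    tasks = [
        call(model=model, messages=[{"role": "user", "content": prompt}])
        for prompt in prompts
    ]
    return await asyncio.gather(*tasks)


async def fake_acompletion(model: str, messages: list) -> dict:
    """Stand-in for a real async completion call (assumption for this demo)."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {"model": model, "echo": messages[0]["content"]}


results = asyncio.run(
    ask_all(fake_acompletion, "openai/gpt-4o", ["Hello", "How are you?"])
)
print(results)
```

Passing `acompletion` from `litellm` in place of `fake_acompletion` should send real requests, provided the relevant API key is set.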
LiteLLM supports streaming the model response back. Pass stream=True to get a streaming iterator in response. Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.).
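A common use of the streaming iterator is to join the incremental deltas into the full reply. The sketch below assumes each chunk exposes `choices[0].delta.content`, which can be `None` on some chunks. `join_stream` and `fake_chunk` are hypothetical names, and the chunks are simple stand-ins so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def join_stream(parts: Iterable[Any]) -> str:
    """Concatenate the text deltas of a stream of OpenAI-format chunks."""
    pieces = []
    for part in parts:
        if not part.choices:
            continue  # skip chunks that carry no choices
        pieces.append(part.choices[0].delta.content or "")
    return "".join(pieces)


def fake_chunk(text):
    """Build a stand-in chunk with the attribute layout assumed above."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hello"), fake_chunk(", world"), fake_chunk(None)]
print(join_stream(demo))
```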
    from litellm import completion

    messages = [{"content": "Hello, how are you?", "role": "user"}]

    response = completion(model="openai/gpt-4o", messages=messages, stream=True)
    for part in response:
        print(part.choices[0].delta.content or "")

    # claude sonnet 4
    response = completion("anthropic/claude-sonnet-4-20250514", messages, stream=True)
    for part in response:
        print(part)

Response chunk (OpenAI Format)
    {
        "id": "chatcmpl-fe575c37-5004-4926-ae5e-bfbc31f356ca",
        "created": 1751494808,
        "model": "claude-sonnet-4-20250514",
        "object": "chat.completion.chunk",
        "system_fingerprint": null,
        "choices": [
            {
                "finish_reason": null,
                "index": 0,
                "delta": {
                    "provider_specific_fields": null,
                    "content": "Hello",
                    "role": "assistant",
                    "function_call": null,
                    "tool_calls": null,
                    "audio": null
                },
                "logprobs": null
            }
        ],
        "provider_specific_fields": null,
        "stream_options": null,
        "citations": null
    }

Logging Observability (Docs)
LiteLLM exposes predefined callbacks to send data to Lunary, MLflow, Langfuse, DynamoDB, S3 buckets, Helicone, Promptlayer, Traceloop, Athina, and Slack.
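Before enabling callbacks, it can help to confirm that the credentials each logging tool needs are present. The sketch below is a hypothetical pre-flight check, not a LiteLLM feature. The variable names in `REQUIRED_ENV` are the ones used in this README's logging example, and the real integrations may accept other or additional settings.

```python
import os
from typing import Dict, Iterable, List, Mapping

# Assumed mapping, based only on the variables set in this README's example.
REQUIRED_ENV: Dict[str, List[str]] = {
    "lunary": ["LUNARY_PUBLIC_KEY"],
    "helicone": ["HELICONE_API_KEY"],
    "langfuse": ["LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY"],
    "athina": ["ATHINA_API_KEY"],
    "mlflow": [],  # the README notes MLflow needs no API key
}


def missing_env(
    callbacks: Iterable[str], env: Mapping[str, str] = os.environ
) -> Dict[str, List[str]]:
    """Return, per callback, the variables that are unset or empty."""
    missing: Dict[str, List[str]] = {}
    for name in callbacks:
        absent = [var for var in REQUIRED_ENV.get(name, []) if not env.get(var)]
        if absent:
            missing[name] = absent
    return missing


demo_env = {"LUNARY_PUBLIC_KEY": "your-lunary-public-key", "LANGFUSE_PUBLIC_KEY": ""}
print(missing_env(["lunary", "mlflow", "langfuse"], demo_env))
```

An empty string counts as missing here, which catches placeholder assignments that were never filled in.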
    import os

    import litellm
    from litellm import completion

    ## set env variables for logging tools (when using MLflow, no API key set up is required)
    os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
    os.environ["HELICONE_API_KEY"] = "your-helicone-auth-key"
    os.environ["LANGFUSE_PUBLIC_KEY"] = ""
    os.environ["LANGFUSE_SECRET_KEY"] = ""
    os.environ["ATHINA_API_KEY"] = "your-athina-api-key"

    os.environ["OPENAI_API_KEY"] = "your-openai-key"

    # set callbacks: log input/output to lunary, mlflow, langfuse, athina, helicone
    litellm.success_callback = ["lunary", "mlflow", "langfuse", "athina", "helicone"]

    # openai call
    response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])

LiteLLM Proxy Server (LLM Gateway) - (Docs)
Track spend + Load Balance across multiple projects
The proxy provides:
    pip install 'litellm[proxy]'

Step 1: Start litellm proxy

    $ litellm --model huggingface/bigcode/starcoder

    #INFO: Proxy running on http://0.0.0.0:4000

Step 2: Make ChatCompletions Request to Proxy
    import openai  # openai v1.0.0+

    client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")  # set proxy to base_url

    # request sent to model set on litellm proxy, `litellm --model`
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": "this is a test request, write a short poem"}
        ],
    )

    print(response)

Proxy Key Management (Docs)
Connect the proxy with a Postgres DB to create proxy keys
    # Get the code
    git clone https://github.com/BerriAI/litellm

    # Go to folder
    cd litellm

    # Add the master key - you can change this after setup
    echo 'LITELLM_MASTER_KEY="sk-1234"' > .env

    # Add the litellm salt key - you cannot change this after adding a model
    # It is used to encrypt / decrypt your LLM API Key credentials
    # We recommend - https://1password.com/password-generator/
    # password generator to get a random hash for litellm salt key
    echo 'LITELLM_SALT_KEY="sk-1234"' >> .env

    source .env

    # Start
    docker-compose up
UI on /ui on your proxy server
Set budgets and rate limits across multiple projects: POST /key/generate
    curl 'http://0.0.0.0:4000/key/generate' \
      --header 'Authorization: Bearer sk-1234' \
      --header 'Content-Type: application/json' \
      --data-raw '{"models": ["gpt-3.5-turbo", "gpt-4", "claude-2"], "duration": "20m", "metadata": {"user": "ishaan@berri.ai", "team": "core-infra"}}'
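The same request can be assembled from Python using only the standard library. This is a sketch under the assumption that the proxy accepts exactly the headers and JSON body shown in the curl command. `build_key_request` is a hypothetical helper, and the request is built but not sent, so the example runs without a proxy.

```python
import json
import urllib.request
from typing import Dict, List


def build_key_request(
    base_url: str,
    master_key: str,
    models: List[str],
    duration: str,
    metadata: Dict[str, str],
) -> urllib.request.Request:
    """Build (but do not send) a POST /key/generate request."""
    body = json.dumps(
        {"models": models, "duration": duration, "metadata": metadata}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/generate",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
    )


req = build_key_request(
    "http://0.0.0.0:4000",
    "sk-1234",
    ["gpt-3.5-turbo", "gpt-4", "claude-2"],
    "20m",
    {"user": "ishaan@berri.ai", "team": "core-infra"},
)
print(req.get_method(), req.full_url)
# To send it against a running proxy: urllib.request.urlopen(req)
```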
    {
        "key": "sk-kdEXbIqZRwEeEiHwdg7sFA", # Bearer token
        "expires": "2023-11-19T01:38:25.838000+00:00" # datetime object
    }

Supported Providers (Docs)
Interested in contributing? Contributions to LiteLLM Python SDK, Proxy Server, and LLM integrations are both accepted and highly encouraged!
Quick start: git clone → make install-dev → make format → make lint → make test-unit
See our comprehensive Contributing Guide (CONTRIBUTING.md) for detailed instructions.
For companies that need better security, user management and professional support
This covers:
We welcome contributions to LiteLLM! Whether you're fixing bugs, adding features, or improving documentation, we appreciate your help.
Quick Start for Contributors
This requires poetry to be installed.
    git clone https://github.com/BerriAI/litellm.git
    cd litellm
    make install-dev    # Install development dependencies
    make format         # Format your code
    make lint           # Run all linting checks
    make test-unit      # Run unit tests
    make format-check   # Check formatting only
For detailed contributing guidelines, see CONTRIBUTING.md.
LiteLLM follows the Google Python Style Guide.
Our automated checks include:
All these checks must pass before your PR can be merged.
Support / talk with founders

Run the dependent services:

    docker-compose up db prometheus

Create and activate a virtual environment, then install the package:

    python -m venv .venv
    source .venv/bin/activate
    pip install -e ".[all]"

Start the proxy backend:

    python3 /path/to/litellm/proxy_cli.py

For the frontend, go to ui/litellm-dashboard, then install dependencies and start the dashboard:

    npm install
    npm run dev