# 💥 LiteLLM Proxy Server
LiteLLM Proxy Server manages:

- Calling 100+ LLMs (Huggingface, Bedrock, TogetherAI, etc.) in the OpenAI ChatCompletions & Completions format
- Setting custom prompt templates and model-specific configs (temperature, max_tokens, etc.) - standard OpenAI parameters pass straight through, as the sketch below shows
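Because the proxy speaks the OpenAI format, per-request parameters such as `temperature` and `max_tokens` travel in the usual OpenAI request body, whichever model is backing the server. A minimal sketch, assuming a proxy is already running on `http://0.0.0.0:8000` (see the Quick Start below):

```python
import openai

# Assumption: a LiteLLM proxy is running locally (see Quick Start below).
openai.api_key = "anything"  # placeholder; the local proxy does not require a real key
openai.base_url = "http://0.0.0.0:8000"

response = openai.chat.completions.create(
    model="test",
    messages=[{"role": "user", "content": "Say hi in one word."}],
    temperature=0.2,  # model-specific configs ride along as normal OpenAI params
    max_tokens=32,
)
print(response.choices[0].message.content)
```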
## Quick Start

View all the supported args for the Proxy CLI here.

```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:8000
```
## Test

In a new shell, run the following; it makes an openai.ChatCompletion request to the proxy:

```shell
litellm --test
```

The proxy will now automatically route any request for gpt-3.5-turbo to bigcode/starcoder, hosted on Huggingface Inference Endpoints. The snippet below shows the same routing over raw HTTP.
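A minimal sketch using `requests` against the proxy started above; the model name in the body is whatever the client asks for, and the proxy maps it to the model it was started with:

```python
import requests

# Assumption: the Quick Start proxy is still running on http://0.0.0.0:8000.
resp = requests.post(
    "http://0.0.0.0:8000/chat/completions",
    json={
        "model": "gpt-3.5-turbo",  # routed by the proxy to huggingface/bigcode/starcoder
        "messages": [{"role": "user", "content": "def fib(n):"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```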
## Replace openai base

```python
import openai

# Point the OpenAI client at the local proxy instead of api.openai.com
openai.api_key = "anything"  # placeholder; the local proxy does not require a real key
openai.base_url = "http://0.0.0.0:8000"

print(openai.chat.completions.create(
    model="test",
    messages=[{"role": "user", "content": "Hey!"}],
))
```
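Streaming works through the same client when the backing model supports it. A sketch under the same assumptions as above:

```python
import openai

openai.api_key = "anything"  # placeholder; not validated by the local proxy
openai.base_url = "http://0.0.0.0:8000"

stream = openai.chat.completions.create(
    model="test",
    messages=[{"role": "user", "content": "Hey!"}],
    stream=True,
)
for chunk in stream:
    # chunks follow the OpenAI streaming format; content may be None on the final chunk
    print(chunk.choices[0].delta.content or "", end="")
print()
```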
## Supported LLMs

- Bedrock
- Huggingface (TGI)
- Anthropic
- vLLM
- OpenAI-compatible server
- TogetherAI
- Replicate
- Petals
- PaLM
- Azure OpenAI
- AI21
- Cohere
For example, to serve a Bedrock model, set your AWS credentials and pass a `bedrock/` model name:

```shell
$ export AWS_ACCESS_KEY_ID=""
$ export AWS_REGION_NAME="" # e.g. us-west-2
$ export AWS_SECRET_ACCESS_KEY=""

$ litellm --model bedrock/anthropic.claude-v2
```
## Server Endpoints

- POST `/chat/completions` - chat completions endpoint to call 100+ LLMs
- POST `/completions` - completions endpoint
- POST `/embeddings` - embeddings endpoint for Azure, OpenAI, and Huggingface models
- GET `/models` - available models on the server
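A quick way to sanity-check a running proxy is to list the models it exposes. A minimal sketch against the endpoints above:

```python
import requests

BASE = "http://0.0.0.0:8000"  # assumption: the proxy from the Quick Start

# GET /models - see which models the server is currently serving
print(requests.get(f"{BASE}/models").json())
```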