# 💥 LiteLLM Proxy Server
LiteLLM Proxy Server manages:

- Calling 100+ LLMs (Huggingface, Bedrock, TogetherAI, etc.) in the OpenAI ChatCompletions & Completions format
- Setting custom prompt templates and model-specific configs (temperature, max_tokens, etc.) - standard OpenAI parameters pass straight through, as the sketch below shows
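Because the proxy speaks the OpenAI format, per-request parameters such as `temperature` and `max_tokens` travel in the usual OpenAI request body, whichever model is backing the server. A minimal sketch, assuming a proxy is already running on `http://0.0.0.0:8000` (see the Quick Start below):

```python
import openai

# Assumption: a LiteLLM proxy is running locally (see Quick Start below).
openai.api_key = "anything"  # placeholder; the local proxy does not require a real key
openai.base_url = "http://0.0.0.0:8000"

response = openai.chat.completions.create(
    model="test",
    messages=[{"role": "user", "content": "Say hi in one word."}],
    temperature=0.2,  # model-specific configs ride along as normal OpenAI params
    max_tokens=32,
)
print(response.choices[0].message.content)
```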
## Quick Start

View all the supported args for the Proxy CLI here.

```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:8000
```
## Test

In a new shell, run the following; it makes an openai.ChatCompletion request to the proxy:

```shell
litellm --test
```

The proxy will now automatically route any request for gpt-3.5-turbo to bigcode/starcoder, hosted on Huggingface Inference Endpoints. The snippet below shows the same routing over raw HTTP.
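A minimal sketch using `requests` against the proxy started above; the model name in the body is whatever the client asks for, and the proxy maps it to the model it was started with:

```python
import requests

# Assumption: the Quick Start proxy is still running on http://0.0.0.0:8000.
resp = requests.post(
    "http://0.0.0.0:8000/chat/completions",
    json={
        "model": "gpt-3.5-turbo",  # routed by the proxy to huggingface/bigcode/starcoder
        "messages": [{"role": "user", "content": "def fib(n):"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```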
## Replace openai base

```python
import openai

# Point the OpenAI client at the local proxy instead of api.openai.com
openai.api_key = "anything"  # placeholder; the local proxy does not require a real key
openai.base_url = "http://0.0.0.0:8000"

print(openai.chat.completions.create(
    model="test",
    messages=[{"role": "user", "content": "Hey!"}],
))
```
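Streaming works through the same client when the backing model supports it. A sketch under the same assumptions as above:

```python
import openai

openai.api_key = "anything"  # placeholder; not validated by the local proxy
openai.base_url = "http://0.0.0.0:8000"

stream = openai.chat.completions.create(
    model="test",
    messages=[{"role": "user", "content": "Hey!"}],
    stream=True,
)
for chunk in stream:
    # chunks follow the OpenAI streaming format; content may be None on the final chunk
    print(chunk.choices[0].delta.content or "", end="")
print()
```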
## Supported LLMs

- Bedrock
- Huggingface (TGI)
- Anthropic
- vLLM
- OpenAI-compatible server
- TogetherAI
- Replicate
- Petals
- PaLM
- Azure OpenAI
- AI21
- Cohere
For example, to serve a Bedrock model, set your AWS credentials and pass a `bedrock/` model name:

```shell
$ export AWS_ACCESS_KEY_ID=""
$ export AWS_REGION_NAME="" # e.g. us-west-2
$ export AWS_SECRET_ACCESS_KEY=""

$ litellm --model bedrock/anthropic.claude-v2
```
## Server Endpoints

- POST `/chat/completions` - chat completions endpoint to call 100+ LLMs
- POST `/completions` - completions endpoint
- POST `/embeddings` - embeddings endpoint for Azure, OpenAI, and Huggingface models
- GET `/models` - available models on the server
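A quick way to sanity-check a running proxy is to list the models it exposes. A minimal sketch against the endpoints above:

```python
import requests

BASE = "http://0.0.0.0:8000"  # assumption: the proxy from the Quick Start

# GET /models - see which models the server is currently serving
print(requests.get(f"{BASE}/models").json())
```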