JSON mode
Effective August 1, 2024, the Anyscale Endpoints API will be available exclusively through the fully Hosted Anyscale Platform. Multi-tenant access to LLM models will be removed.
With the Hosted Anyscale Platform, you can access the latest GPUs billed by the second and deploy models on your own dedicated instances, with full customization to build your end-to-end applications on Anyscale. Get started today.
JSON mode enables the LLM to produce JSON-formatted responses, which is useful when you want to integrate the LLM with other systems that expect a specific JSON schema.
Enable JSON mode by setting the response_format = {"type": "json_object"} parameter in the POST request to the LLM. You can optionally provide a schema by setting response_format = {"type": "json_object", "schema": <schema>}. If you don't provide a schema, the LLM outputs any JSON object that fits.
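For instance, both request shapes below are valid. This is a minimal sketch in Python; the dict values mirror the response_format parameter described above:
# JSON mode without a schema: the model returns any valid JSON object.
response_format_any = {"type": "json_object"}

# JSON mode with a schema: the output must conform to the given JSON Schema.
response_format_schema = {
    "type": "json_object",
    "schema": {
        "type": "object",
        "properties": {"team_name": {"type": "string"}},
        "required": ["team_name"],
    },
}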
Supported models
The following models support JSON mode:
Support for mistralai/Mixtral-8x22B-Instruct-v0.1 is coming soon.
The top_p and top_k sampling parameters are incompatible with JSON mode.
Example
- cURL
- Python
- OpenAI Python SDK
curl "$ANYSCALE_BASE_URL/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ANYSCALE_API_KEY" \
-d '{
"model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
"messages": [{"role": "system", "content": "You are a helpful assistant that outputs in JSON."}, {"role": "user", "content": "Who won the world series in 2020"}],
"response_format": {"type": "json_object", "schema": {"type": "object", "properties": {"team_name": {"type": "string"}}, "required": ["team_name"]}},
"temperature": 0.7
}'
import os
import requests

s = requests.Session()

api_base = os.getenv("ANYSCALE_BASE_URL")
token = os.getenv("ANYSCALE_API_KEY")
url = f"{api_base}/chat/completions"
body = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant that outputs in JSON."},
        {"role": "user", "content": "Who won the world series in 2020"}
    ],
    "response_format": {
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {
                "team_name": {"type": "string"}
            },
            "required": ["team_name"]
        },
    },
    "temperature": 0.7
}

with s.post(url, headers={"Authorization": f"Bearer {token}"}, json=body) as resp:
    print(resp.json())
import openai

client = openai.OpenAI(
    base_url="https://api.endpoints.anyscale.com/v1",
    api_key="esecret_YOUR_API_KEY")

# Note: not all arguments are currently supported and will be ignored by the backend.
chat_completion = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that outputs in JSON."},
        {"role": "user", "content": "Who won the world series in 2020"}
    ],
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {
                "team_name": {"type": "string"}
            },
            "required": ["team_name"]
        },
    },
    temperature=0.7
)
print(chat_completion.model_dump())
The output will look like:
{'choices': [{'finish_reason': 'stop',
'index': 0,
'message': {'content': ' {\n'
'"team_name": " Los Angeles Dodgers "\n'
'}',
'function_call': None,
'role': 'assistant',
'tool_calls': None}}],
'created': 1701933511,
'id': 'mistralai/Mixtral-8x7B-Instruct-v0.1-4-7sX-l4EqKMl0FxZu7f5HQ8DXIKMIQgCkT1sWGLwn8',
'model': 'mistralai/Mixtral-8x7B-Instruct-v0.1',
'object': 'text_completion',
'system_fingerprint': None,
'usage': {'completion_tokens': 16, 'prompt_tokens': 30, 'total_tokens': 46}}
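The message content is itself a JSON-encoded string, so you typically parse it before use. A minimal sketch, assuming chat_completion is the response object from the OpenAI SDK example above:
import json

# The assistant's reply is a JSON-encoded string; decode it into a Python dict.
content = chat_completion.choices[0].message.content
result = json.loads(content)
print(result["team_name"])  # e.g. " Los Angeles Dodgers "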
If you don't explicitly instruct the model to output JSON, it's prone to emitting a string of whitespace before and after the JSON object, costing extra tokens and time. To prevent this issue, Anyscale raises an error if the word json doesn't appear anywhere in your messages (case-insensitive).
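A minimal client-side guard that mirrors this requirement might look like the sketch below; the helper name and validation logic are illustrative assumptions, not the exact server-side rule:
def ensure_json_mentioned(messages):
    # Mirror the documented requirement: at least one message must mention
    # "json", case-insensitively, before sending a JSON mode request.
    if not any("json" in (m.get("content") or "").lower() for m in messages):
        raise ValueError('JSON mode requires the word "json" in at least one message.')

ensure_json_mentioned([
    {"role": "system", "content": "You are a helpful assistant that outputs in JSON."},
    {"role": "user", "content": "Who won the world series in 2020"}
])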
Streaming JSON mode completions
You can also use the streaming API to get JSON mode completions. Here's an example using the Python SDK:
import openai

client = openai.OpenAI(
    base_url="https://api.endpoints.anyscale.com/v1",
    api_key="esecret_YOUR_API_KEY")

# Note: not all arguments are currently supported and will be ignored by the backend.
chat_completion = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that outputs in JSON."},
        {"role": "user", "content": "Who won the world series in 2020"}
    ],
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {
                "team_name": {"type": "string"}
            },
            "required": ["team_name"]
        },
    },
    temperature=0.7,
    stream=True
)
for chunk in chat_completion:
    print(chunk)
Example output:
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=ChoiceLogprobs(content=[]))], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content=' {', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='"', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='team', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='_', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='name', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='":', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content=' "', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='Los', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content=' Angeles', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content=' Dod', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='gers', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='"', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809316, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='}', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809317, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809318, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='\n', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809318, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content='', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1706809318, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage=None)
ChatCompletionChunk(id='mistralai/Mixtral-8x7B-Instruct-v0.1-ssaOTZXxxoS44tiPeIhJWcYKDnKLy3Lnxk20OzxAuwM', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1706809318, model='mistralai/Mixtral-8x7B-Instruct-v0.1', object='text_completion', system_fingerprint=None, usage={'prompt_tokens': 30, 'completion_tokens': 28, 'total_tokens': 58})
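Because each chunk carries only a delta of the final JSON string, you generally accumulate the deltas and parse the result once the stream finishes. A minimal sketch that replaces the print loop in the streaming example above:
import json

# Concatenate the content deltas; some chunks carry None or empty content.
parts = []
for chunk in chat_completion:
    delta = chunk.choices[0].delta.content
    if delta:
        parts.append(delta)

result = json.loads("".join(parts))
print(result)  # e.g. {'team_name': ' Los Angeles Dodgers '}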