Available now on ModelsLab · Language Model

Mixtral-8x7B Instruct v0.1
Sparse Experts, Dense Power

Try Mixtral-8x7B Instruct v0.1 · API Documentation

Run Mixtral Efficiently

Cost Efficient

12.9B Active Params

Mixtral-8x7B Instruct v0.1 has 46.7B total parameters but activates only 12.9B per token, for roughly 5x less compute per token than a dense 70B model.
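As a rough check on these figures, the ratios can be worked out directly. This is a back-of-the-envelope sketch that assumes per-token compute scales with the number of active parameters, which ignores attention cost and memory bandwidth; the constant and function names are illustrative only.

```python
# Parameter counts in billions, taken from the figures quoted on this page.
TOTAL_PARAMS_B = 46.7    # all experts plus shared layers
ACTIVE_PARAMS_B = 12.9   # parameters actually used for each token
DENSE_PARAMS_B = 70.0    # a dense 70B model uses every parameter per token


def compute_ratio(dense_b: float, active_b: float) -> float:
    """Approximate per-token compute ratio, assuming cost is proportional
    to active parameters (a simplifying assumption)."""
    return dense_b / active_b


active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
ratio = compute_ratio(DENSE_PARAMS_B, ACTIVE_PARAMS_B)

print(f"Share of weights active per token: {active_fraction:.1%}")
print(f"Dense 70B vs. Mixtral active parameters: {ratio:.1f}x")
```

Under that assumption the ratio comes out a little above 5, which is where the "5x" headline figure comes from.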

Top Benchmarks

Beats GPT-3.5 Turbo

Scores 8.30 on MT-Bench, 70.6% on MMLU, and 1114 Arena ELO, outperforming GPT-3.5 and Claude 2.1.

Long Context

32K Token Window

Handles extended dialogues and document analysis within a 32K-token context, while sparse expert routing keeps per-token compute well below that of a comparably sized dense model.
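To illustrate what sparse expert routing means, here is a toy sketch of top-2 gating over 8 experts: a router scores every expert for a token, only the two highest-scoring experts run, and their outputs are mixed using softmax weights over the two selected scores. The expert functions, the router scores, and names such as `route_token` are hypothetical stand-ins, not Mixtral's actual weights or implementation.

```python
import math

NUM_EXPERTS = 8  # the "8x" in Mixtral-8x7B
TOP_K = 2        # assumption for this sketch: two experts run per token


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Toy expert: scales its input. A real expert is a feed-forward block."""
    return lambda x: [scale * xi for xi in x]


EXPERTS = [make_expert(i + 1) for i in range(NUM_EXPERTS)]


def route_token(x, router_logits):
    """Run only the TOP_K best-scoring experts and mix their outputs."""
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:TOP_K]
    weights = softmax([router_logits[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, idx in zip(weights, chosen):
        expert_out = EXPERTS[idx](x)  # the other experts are never evaluated
        output = [o + weight * y for o, y in zip(output, expert_out)]
    return chosen, weights, output


token = [1.0, -2.0]                                        # toy hidden state
logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 1.0]        # made-up router scores
chosen, weights, output = route_token(token, logits)
print("experts used:", chosen)
print("mixing weights:", [round(w, 3) for w in weights])
print("output:", [round(o, 3) for o in output])
```

Only two of the eight experts do any work for this token, which is why compute per token tracks the active parameter count. All experts still have to be loaded, so the saving is in compute rather than in weight memory.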

Examples

See what Mixtral-8x7B Instruct v0.1 can create

Copy any prompt below and try it yourself in the playground.

Code Debug

“<s>[INST] Debug this Python function that calculates Fibonacci numbers inefficiently: def fib(n): if n <= 1: return n else: return fib(n-1) + fib(n-2) [/INST]”

Multilingual Chat

“<s>[INST] Explain quantum entanglement in French, then translate to English. Keep it simple for beginners. [/INST]”

Reasoning Task

“<s>[INST] Step-by-step: If a train leaves at 3 PM traveling 60 mph, and another at 4 PM at 80 mph, when does the second catch the first if 200 miles apart? [/INST]”

Story Continuation

“<s>[INST] Continue this sci-fi story: The last human awoke in a derelict spaceship, stars unfamiliar. Alarms blared as... [/INST]”
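The prompts above all share the same instruction wrapper, which can be generated rather than typed by hand. The sketch below is a minimal helper under stated assumptions: `build_prompt` is a hypothetical name, the single-turn form mirrors the examples on this page, and the multi-turn form (closing each earlier answer with `</s>`) follows the commonly documented Mistral instruct layout. Whether a given endpoint expects these markers in the prompt or adds them itself should be checked against the API documentation.

```python
BOS = "<s>"    # beginning-of-sequence marker, as shown in the prompts above
EOS = "</s>"   # assumed end-of-sequence marker closing each earlier answer


def build_prompt(instruction, history=None):
    """Wrap an instruction in [INST] ... [/INST] markers.

    `history` is an optional list of (user_message, model_answer) pairs
    from earlier turns; it is an illustrative parameter, not an API field.
    """
    parts = [BOS]
    for user_message, model_answer in history or []:
        parts.append(f"[INST] {user_message.strip()} [/INST] {model_answer.strip()}{EOS}")
    parts.append(f"[INST] {instruction.strip()} [/INST]")
    return "".join(parts)


prompt = build_prompt(
    "Explain quantum entanglement in French, then translate to English. "
    "Keep it simple for beginners."
)
print(prompt)

follow_up = build_prompt(
    "Now summarize that in one sentence.",
    history=[("Explain quantum entanglement.", "Entangled particles share a linked state.")],
)
print(follow_up)
```

Keeping the wrapper in one function makes it easy to drop it entirely if the endpoint turns out to apply the chat template on the server side.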

For Developers

A few lines of code.
Instruct chat. One call.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

Serverless: scales to zero, scales to millions
Pay per token, no minimums
Python and JavaScript SDKs, plus REST API

API Documentation

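import requests

# Endpoint as shown on this page; fill in the placeholders before running.
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # instruction text to send to the model
        "model_id": "",         # model identifier, as listed in the API documentation
    },
    timeout=60,  # avoid waiting indefinitely on a stalled connection
)
response.raise_for_status()  # raise on HTTP errors instead of parsing an error page
print(response.json())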

FAQ

Common questions about Mixtral-8x7B Instruct v0.1

Read the docs

What is Mixtral-8x7B Instruct v0.1?

Mixtral-8x7B Instruct v0.1 is Mistral AI's sparse mixture-of-experts LLM, instruction-tuned for dialogue, code, and reasoning. It has 46.7B total parameters and activates 12.9B per token. It is licensed under Apache 2.0.

How does the Mixtral-8x7B Instruct v0.1 API compare to GPT-3.5?

Mixtral-8x7B Instruct v0.1 beats GPT-3.5 Turbo on MT-Bench (8.30 vs 7.94), MMLU (70.6% vs 70.0%), and Arena ELO (1114 vs 1105), while activating only 12.9B parameters per token.

What is the context window for the Mixtral-8x7B Instruct v0.1 model?

It supports a 32K-token context for long dialogues and document analysis. Some providers cap input and output at 16K tokens.

Is Mixtral-8x7B Instruct v0.1 a good alternative to Llama 2 70B?

It matches Llama 2 70B quality across benchmarks at roughly 5x less inference compute, and it is well suited to multilingual tasks and chatbots.

Is Mixtral-8x7B Instruct v0.1 suited to code generation?

The instruct model was trained with supervised fine-tuning and DPO, and it handles code tasks. Use the chat template ([INST] ... [/INST]) for precise outputs.

Where can I access the Mixtral-8x7B Instruct v0.1 API?

It is available through ModelsLab endpoints for text generation, which accept JSON requests that use the instruction format.

Ready to create?

Start generating with Mixtral-8x7B Instruct v0.1 on ModelsLab.