Available now on ModelsLab · Language Model

Mixtral-8x7B Instruct v0.1: Sparse Experts, Dense Power

Run Mixtral Efficiently

Cost Efficient

12.9B Active Params

Mixtral-8x7B Instruct v0.1 has 46.7B total parameters but activates only 12.9B per token, so inference costs roughly 5x less compute than a dense 70B model.

Top Benchmarks

Beats GPT-3.5 Turbo

Scores 8.30 on MT-Bench, 70.6% on MMLU, and 1114 Arena Elo, outperforming GPT-3.5 Turbo and Claude 2.1.

Long Context

32K Token Window

Handles extended dialogues and long-document analysis; sparse expert routing (sketched below) avoids the memory and compute costs of a comparably capable dense model.
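
How does a token touch only 12.9B of 46.7B parameters? In each mixture-of-experts layer, a small router scores 8 expert MLPs and sends the token to just the top 2, so the other 6 never run. The sketch below is an illustrative top-2 router in PyTorch, not Mixtral's actual code; the dimensions and class name are made up for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Toy top-2 sparse mixture-of-experts layer (illustrative only)."""
    def __init__(self, dim=32, n_experts=8, k=2):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts, bias=False)  # the router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                        # x: (tokens, dim)
        scores, idx = torch.topk(self.gate(x), self.k, dim=-1)
        weights = F.softmax(scores, dim=-1)      # blend the 2 chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e         # tokens routed to expert e
                if mask.any():                   # unchosen experts never execute
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = Top2MoE()
print(moe(torch.randn(4, 32)).shape)  # each token ran only 2 of the 8 expert MLPs

Only the router and 2 of the 8 experts execute per token, which is why active parameters (12.9B) stay far below total parameters (46.7B).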

Examples

See what Mixtral-8x7B Instruct v0.1 can create

Copy any prompt below and try it yourself in the playground.

Code Debug

<s>[INST] Debug this Python function that calculates Fibonacci numbers inefficiently: def fib(n): if n <= 1: return n else: return fib(n-1) + fib(n-2) [/INST]
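
For reference, a correct fix replaces the exponential double recursion with an iterative loop. This sketch shows the kind of answer to expect, not Mixtral's literal output:

def fib(n):
    # Iterative Fibonacci: O(n) time and O(1) space,
    # versus the prompt's O(2^n) recursive version.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]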

Multilingual Chat

<s>[INST] Explain quantum entanglement in French, then translate to English. Keep it simple for beginners. [/INST]

Reasoning Task

<s>[INST] Step-by-step: If a train leaves at 3 PM traveling 60 mph, and another at 4 PM at 80 mph, when does the second catch the first if 200 miles apart? [/INST]
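
The puzzle is deliberately open-ended, but under one natural reading (the trains are 200 miles apart at 4 PM, when the second departs) the reference arithmetic is:

gap_miles = 200
closing_mph = 80 - 60                     # faster train gains 20 mph on the slower one
hours_to_catch = gap_miles / closing_mph
print(hours_to_catch)                     # 10.0 hours after 4 PM, i.e. 2 AM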

Story Continuation

<s>[INST] Continue this sci-fi story: The last human awoke in a derelict spaceship, stars unfamiliar. Alarms blared as... [/INST]
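
The prompts above use Mixtral's raw <s>[INST] ... [/INST] instruction format. If you build prompts programmatically, the Hugging Face tokenizer can generate that format for you; a minimal sketch, assuming transformers is installed and you can download the tokenizer:

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")
messages = [{"role": "user",
             "content": "Explain quantum entanglement in French, then translate to English."}]
prompt = tok.apply_chat_template(messages, tokenize=False)
print(prompt)  # -> "<s>[INST] Explain quantum entanglement ... [/INST]"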

For Developers

A few lines of code.
Instruct chat. One call.

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

  • Serverless: scales to zero, scales to millions
  • Pay per token, no minimums
  • Python and JavaScript SDKs, plus REST API
import requests

# Fill in your ModelsLab API key and the Mixtral-8x7B Instruct v0.1
# model id from your dashboard (see the ModelsLab docs for the exact id).
response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",
        "prompt": "Explain quantum entanglement in simple terms.",
        "model_id": "",
    },
)
print(response.json())

FAQ

Common questions about Mixtral-8x7B Instruct v0.1

Read the docs

What is Mixtral-8x7B Instruct v0.1?

Mixtral-8x7B Instruct v0.1 is Mistral AI's sparse mixture-of-experts LLM with 8x7B parameters, instruction-tuned for dialogue, code, and reasoning. It activates 12.9B parameters per token and is licensed under Apache 2.0.

How does it compare to GPT-3.5 Turbo?

Mixtral-8x7B Instruct v0.1 beats GPT-3.5 Turbo on MT-Bench (8.30 vs 7.94), MMLU (70.6% vs 70.0%), and Arena Elo (1114 vs 1105), while using less compute.

How long is the context window?

It supports a 32K token context for long dialogues and document analysis. Note that some providers cap input and output at 16K tokens.

How does it compare to Llama 2 70B?

It matches Llama 2 70B quality across benchmarks at roughly 5x less inference compute, making it well suited to multilingual tasks and chatbots.

Is it good at coding tasks?

Yes. It is fine-tuned for code tasks via supervised fine-tuning and DPO; use the [INST] chat template shown above for the most reliable outputs.

How do I call it?

It is available via ModelsLab text-generation endpoints and accepts JSON requests using the instruction format, as in the code example above.

Ready to create?

Start generating with Mixtral-8x7B Instruct v0.1 on ModelsLab.