---
title: Eleven Multilingual v2 API — Multilingual TTS via ModelsLab
description: Generate speech in 30+ languages with Eleven Multilingual v2 via REST API. ElevenLabs alternative pricing, broadcast quality, voice cloning support.
url: https://modelslab-frontend-v2-927501783998.us-east4.run.app/eleven-multilingual-v2
canonical: https://modelslab-frontend-v2-927501783998.us-east4.run.app/eleven-multilingual-v2
type: website
component: Seo/ModelPage
generated_at: 2026-05-21T09:25:42.638420Z
---

Available now on ModelsLab · Voice & Audio

Eleven Multilingual v2 API — Multilingual Speech Generation
---

Eleven Multilingual v2 TTS in 30+ languages via REST API. Pay per character.

[Get Eleven Multilingual v2 API key](/register) [API documentation](https://docs.modelslab.com)

Why teams ship with Eleven Multilingual v2
---

Multilingual v2

### 30+ languages from one model

Eleven Multilingual v2 produces natural speech in 30+ languages including English, Spanish, French, German, Italian, Portuguese, Polish, Mandarin, Japanese, Korean, Hindi, and Arabic. One model handles them all.

Voice library

### Pre-built voices for instant use

Choose from a library of pre-built voices spanning male, female, neutral, and character profiles. Each voice supports all 30+ languages without retraining.

Voice cloning compatible

### Use your cloned voices

Upload a voice sample to the voice cloning API, then synthesize multilingual speech with your custom voice. Same voice, every language.

Streaming output

### Low-latency audio streaming

Stream generated audio chunk-by-chunk for real-time conversational applications. Latency under 400ms to first audio chunk on dedicated infrastructure.

Output formats

### MP3, WAV, PCM, and Opus

Choose the output format that fits your pipeline: MP3 for web playback, WAV for editing, PCM for low-level processing, Opus for streaming applications.

Predictable pricing

### Pay per character generated

Per-character pricing, with no per-minute or per-month surprises. Run the math: at the $0.0002-per-character starting rate, a 5-minute audiobook chapter (about 3,750 characters) costs roughly $0.75 to generate.

No vendor lock-in

### Same key for image, video, LLM

ModelsLab gives you a single API key across modalities. Use Eleven Multilingual v2 alongside the image, video, and LLM APIs without juggling vendor accounts.

Compliance

### GDPR-ready, DPA available

Generated audio and source text are processed in compliant regions and removed after delivery. Signed DPAs and dedicated VPC deployments are available for enterprise customers.

Examples

Eleven Multilingual v2 use cases
---

Copy any prompt below and try it yourself in the [playground](/register).

Tech Demo

“Explain quantum computing basics in a clear, enthusiastic tone for a tech conference audience.”

Product Pitch

“Describe a new electric vehicle model, highlighting speed, range, and eco features with confident delivery.”

Nature Narration

“Narrate a serene forest walk, capturing bird calls and wind sounds in a calm, immersive voice.”

City Guide

“Guide tourists through urban architecture, pointing out historical landmarks with an engaging local accent.”

For Developers

A few lines of code.
Multilingual speech in one POST
---

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

- **Serverless:** scales to zero, scales to millions
- **Pay per character,** no minimums
- **Python and JavaScript SDKs,** plus REST API

[API Documentation](https://docs.modelslab.com)

```python
import requests

# Send the text and the voice ID to the text-to-speech endpoint.
response = requests.post(
    "https://modelslab.com/api/v7/voice/text-to-speech",
    json={
        "key": "YOUR_API_KEY",  # replace with your ModelsLab API key
        "prompt": "Hey, love. I just wanted to say… you're doing beautifully. Even if today felt a little messy, even if you didn't get everything done, that's okay. You're still growing, still trying, still shining. I see your heart, your effort, your gentleness. And I just hope you can feel how much you're loved. So rest easy now. You're safe, you're enough, and I'm proud of you, more than words can say.",
        "voice_id": "M7baJQBjzMsrxxZ796H6",
    },
)

# Print the parsed JSON response.
print(response.json())
```

FAQ

Common questions about Eleven Multilingual v2 API — Multilingual Speech Generation
---

[Read the docs](https://docs.modelslab.com)

### What is the Eleven Multilingual v2 API?

Eleven Multilingual v2 is a text-to-speech model that produces broadcast-quality speech in 30+ languages from a single API call. ModelsLab exposes the model via a REST endpoint with pay-per-character pricing — no ElevenLabs subscription required.

### Which languages does Eleven Multilingual v2 support?

30+ languages including English, Spanish, French, German, Italian, Portuguese, Polish, Mandarin, Japanese, Korean, Hindi, Arabic, Turkish, Dutch, Czech, Russian, Indonesian, Malay, Filipino, Bulgarian, Romanian, Ukrainian, Greek, Vietnamese, and more. Pass a language parameter or let the model auto-detect from text.

### How is this different from the ElevenLabs API?

The model is the same: Eleven Multilingual v2. The difference is pricing and integration. ModelsLab charges per character with no subscription, serves the model through its own REST endpoint, and bundles it with image, video, and LLM APIs on a single API key.

### Can I clone a voice and use it in multiple languages?

Yes. Use the ModelsLab voice cloning API to create a custom voice from a 10-second sample, then synthesize multilingual speech with that voice. The same voice works across all 30+ supported languages without retraining.

### Does the API support streaming output?

Yes. Set stream=true in the request to receive audio chunks via server-sent events. Latency to the first audio chunk is typically under 400ms, suitable for real-time conversational apps.
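As a rough illustration of consuming a server-sent event stream, the sketch below parses generic SSE framing: `data:` lines accumulate, and a blank line ends the event. The helper name `iter_sse_data` is hypothetical, and how the audio is encoded inside each event is not specified here, so the payload is treated as an opaque string.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event.

    Only `data:` fields are collected; comments and other fields
    (event, id, retry) are ignored. An event that is not terminated
    by a blank line is dropped, as in the SSE format.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith("data:"):
            value = line[5:]
            # SSE strips a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)


# Stand-in lines; a real stream would carry audio chunks in some encoding.
sample = ["data: chunk-1", "", "data: chunk-2", ""]
print(list(iter_sse_data(sample)))  # ['chunk-1', 'chunk-2']
```

With `requests`, the input lines could come from `response.iter_lines()` on a request made with `stream=True`, decoded to `str` before parsing.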

### What audio formats does the API output?

MP3 (default, web-friendly), WAV (lossless, editing-ready), PCM (raw audio for processing pipelines), and Opus (low-latency streaming). Specify with the output_format parameter.
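A minimal sketch of building a request body with an explicit format might look like the following. The `key`, `prompt`, `voice_id`, and `output_format` field names come from this page; the helper name `build_tts_payload` and the lowercase string values for each format are assumptions, so check the API documentation for the accepted values.

```python
# Format names from the list above; the exact strings the API accepts
# are an assumption.
SUPPORTED_FORMATS = {"mp3", "wav", "pcm", "opus"}


def build_tts_payload(api_key, prompt, voice_id, output_format="mp3"):
    """Return a JSON-ready request body, rejecting unknown formats early."""
    fmt = output_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported output_format {output_format!r}; "
            f"expected one of {sorted(SUPPORTED_FORMATS)}"
        )
    return {
        "key": api_key,
        "prompt": prompt,
        "voice_id": voice_id,
        "output_format": fmt,
    }


payload = build_tts_payload("YOUR_API_KEY", "Hola, mundo.", "M7baJQBjzMsrxxZ796H6", "wav")
print(payload["output_format"])  # wav
```

Validating the format on the client side avoids spending a request on a value the server would reject.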

### How much does Eleven Multilingual v2 cost?

Pricing is per character, starting at $0.0002 per character. A 1-minute audiobook chapter (~150 words, ~750 characters) costs approximately $0.15. No monthly minimum, no subscription.
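The arithmetic above can be checked with a short estimator. This is a sketch that assumes billing counts every character in the text, including spaces and punctuation, at the quoted starting rate; the function name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Starting rate quoted above; the rate on a given plan may differ.
RATE_PER_CHARACTER = Decimal("0.0002")


def estimate_cost(text: str, rate: Decimal = RATE_PER_CHARACTER) -> Decimal:
    """Return the estimated cost in USD of synthesizing `text`."""
    return len(text) * rate


# A 750-character passage, roughly one minute of narration.
sample = "a" * 750
print(estimate_cost(sample))  # 0.1500
```

`Decimal` keeps the result exact, which avoids floating-point drift when summing many small charges.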

### What latency should I expect?

For non-streaming, full audio for 100 characters of text generates in 1–2 seconds. For streaming, time-to-first-audio is under 400ms. Latency is consistent across requests — no cold starts.

### Are there usage rate limits?

Default limits are 60 requests per minute, scaling automatically with paid usage. Enterprise plans include higher limits and dedicated capacity. Contact sales for custom rate terms.
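To stay under the default limit on the client side, one option is a sliding-window limiter that blocks before each request. The sketch below is one possible approach, not part of any ModelsLab SDK; the class name `RateLimiter` and its parameters are hypothetical.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_calls` calls in any window of `period` seconds."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a slot is free, then record the call."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))


limiter = RateLimiter(max_calls=60, period=60.0)
limiter.acquire()  # call this before each API request
```

Calling `limiter.acquire()` ahead of every `requests.post` keeps a single process within 60 requests per minute; several processes sharing one key would need a shared limiter.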

### Is the Eleven Multilingual v2 API GDPR-compliant?

Yes. Source text and generated audio are processed in compliant regions and removed from infrastructure after delivery. Signed DPAs and dedicated VPC deployments are available for enterprise customers.

Ready to create?
---

Start generating with Eleven Multilingual v2 API — Multilingual Speech Generation on ModelsLab.

[Get Eleven Multilingual v2 API key](/register) [API documentation](https://docs.modelslab.com)

---

**Links:**
- [Website](https://modelslab.com)
- [API Documentation](https://docs.modelslab.com)
- [Blog](https://modelslab.com/blog)

---
*Generated by ModelsLab - 2026-05-21*