---
title: LFM2.5-1.2B-Instruct — Fast On-Device LLM | ModelsLab
description: Run LiquidAI's LFM2.5-1.2B-Instruct free LLM locally. 1.2B parameters, 32K context, sub-1GB memory. Generate chat, tool use, and reasoning on-device.
url: https://modelslab-frontend-v2-927501783998.us-east4.run.app/liquidai-lfm25-12b-instruct-free
canonical: https://modelslab-frontend-v2-927501783998.us-east4.run.app/liquidai-lfm25-12b-instruct-free
type: website
component: Seo/ModelPage
generated_at: 2026-05-13T10:35:54.649999Z
---

Available now on ModelsLab · Language Model

LiquidAI: LFM2.5-1.2B-Instruct (free)
Edge AI. No cloud costs.
---

[Try LiquidAI: LFM2.5-1.2B-Instruct (free)](/models/open_router/liquid-lfm-2.5-1.2b-instruct-free) [API Documentation](https://docs.modelslab.com)

Compact Power. Enterprise Speed.
---

Lightning-Fast Inference

### 239 tok/s on CPU

Blazing decode speeds on standard hardware with minimal latency overhead.

Minimal Footprint

### Runs Under 1GB

Deploy on mobile, IoT, and in-vehicle hardware without heavy memory requirements or server infrastructure.

Production-Ready

### Tool Use Built-In

Function calling and multi-step reasoning out of the box for agentic workflows.

Examples

See what LiquidAI: LFM2.5-1.2B-Instruct (free) can create
---

Copy any prompt below and try it yourself in the [playground](/models/open_router/liquid-lfm-2.5-1.2b-instruct-free).

Customer Support Bot

“You are a helpful customer support assistant. A user asks: 'How do I reset my password?' Provide a clear, step-by-step response with tool calls to retrieve account information if needed.”

Math Problem Solver

“Solve this math problem step-by-step: A train travels 120 miles in 2.5 hours. Calculate the average speed and determine how long it takes to travel 300 miles at this rate.”

Code Generation

“Write a Python function that takes a list of numbers and returns the sum of all even numbers. Include error handling for non-numeric inputs.”

Multi-Language Chat

“Respond to this user in their preferred language: 'Bonjour, comment puis-je optimiser mon application pour les appareils mobiles?' Provide technical recommendations.”
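To sanity-check the Code Generation prompt above, here is one function of the kind it asks for. This is a reference sketch written for this page, not actual model output: it sums the even integers in a list and raises on non-numeric input.

```python
def sum_even_numbers(numbers):
    """Return the sum of all even integers in `numbers`.

    Raises TypeError if any element is not a real number.
    """
    total = 0
    for item in numbers:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(item, bool) or not isinstance(item, (int, float)):
            raise TypeError(f"non-numeric input: {item!r}")
        if isinstance(item, int) and item % 2 == 0:
            total += item
    return total
```

Paste the prompt into the playground and compare the model's answer against a sketch like this one.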

For Developers

A 1.2B model. A few lines of code.
---

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

- **Serverless:** scales to zero, scales to millions
- **Pay per token,** no minimums
- **Python and JavaScript SDKs,** plus REST API

[API Documentation ](https://docs.modelslab.com)

Python


```python
import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "",           # your prompt text
        "model_id": "",         # model ID from the model page
    },
)
print(response.json())
```
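In application code you will usually want the raw request above behind a small wrapper, so the payload shape lives in one place and HTTP errors surface early. `build_payload` and `chat` are illustrative names for this sketch, not part of a ModelsLab SDK:

```python
import requests

ENDPOINT = "https://modelslab.com/api/v7/llm/chat/completions"

def build_payload(key, prompt, model_id):
    """Assemble the request body expected by the chat completions endpoint."""
    return {"key": key, "prompt": prompt, "model_id": model_id}

def chat(key, prompt, model_id, timeout=30):
    """POST a completion request and return the parsed JSON response."""
    response = requests.post(
        ENDPOINT,
        json=build_payload(key, prompt, model_id),
        timeout=timeout,  # avoid hanging forever on a slow connection
    )
    response.raise_for_status()
    return response.json()
```

Keeping payload construction separate from the network call also makes the request body easy to unit-test without hitting the API.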

FAQ

Common questions about LiquidAI: LFM2.5-1.2B-Instruct (free)
---

[Read the docs ](https://docs.modelslab.com)

### What makes LFM2.5-1.2B-Instruct different from larger models?

It delivers best-in-class performance at 1.2B parameters through extended pretraining (28T tokens) and large-scale reinforcement learning. It rivals much larger models while running entirely on-device under 1GB memory.

### Can I use LFM2.5-1.2B-Instruct for reasoning and tool use?

Yes. The model excels at instruction following and tool calling out of the box. For advanced reasoning tasks, consider LFM2.5-1.2B-Thinking, which adds explicit reasoning traces and improves math and planning capabilities.
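Tool calling generally works by passing the model a JSON schema of the functions it may invoke. The exact field names ModelsLab accepts may differ, so treat this OpenAI-style `tools` block as a hedged illustration; `get_account_info` is a hypothetical helper matching the support-bot example above:

```python
import json

# OpenAI-style tool definition; the field names are an assumption about the
# chat-completions format, not confirmed ModelsLab documentation.
get_account_tool = {
    "type": "function",
    "function": {
        "name": "get_account_info",  # hypothetical lookup for the support-bot example
        "description": "Look up a customer's account by email address.",
        "parameters": {
            "type": "object",
            "properties": {
                "email": {"type": "string", "description": "Customer email"},
            },
            "required": ["email"],
        },
    },
}

# The definition travels in the request body alongside the prompt.
payload_fragment = {"tools": [get_account_tool], "tool_choice": "auto"}
print(json.dumps(payload_fragment, indent=2))
```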

### What languages does this model support?

LFM2.5-1.2B-Instruct supports eight languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.

### How fast is the inference speed on mobile devices?

On mobile NPUs like Qualcomm Snapdragon Gen4, the model achieves 82 tok/s decode speed. On standard mobile CPUs, it reaches 70 tok/s with llama.cpp quantization.

### What's the context length and memory requirement?

The model supports 32,768 tokens of context and runs in under 1GB of memory on most devices, making it ideal for local deployment without cloud infrastructure.

### Which frameworks support LFM2.5-1.2B-Instruct?

The model has day-one support for llama.cpp, MLX, and vLLM. Additional optimizations are available through partners like AMD, Qualcomm, and Nexa AI for NPU deployment.

Ready to create?
---

Start generating with LiquidAI: LFM2.5-1.2B-Instruct (free) on ModelsLab.

[Try LiquidAI: LFM2.5-1.2B-Instruct (free)](/models/open_router/liquid-lfm-2.5-1.2b-instruct-free) [API Documentation](https://docs.modelslab.com)

---

*This markdown version is optimized for AI agents and LLMs.*

**Links:**
- [Website](https://modelslab.com)
- [API Documentation](https://docs.modelslab.com)
- [Blog](https://modelslab.com/blog)

---
*Generated by ModelsLab - 2026-05-13*