---
title: StepFun: Step 3.5 Flash — Fast Reasoning LLM | ModelsLab
description: Run StepFun: Step 3.5 Flash for 100-300 tok/s agentic reasoning and 256K context. Try this 196B MoE model with 11B active params now.
url: https://modelslab-frontend-v2-927501783998.us-east4.run.app/stepfun-step-35-flash
canonical: https://modelslab-frontend-v2-927501783998.us-east4.run.app/stepfun-step-35-flash
type: website
component: Seo/ModelPage
generated_at: 2026-05-13T09:42:14.099634Z
---

Available now on ModelsLab · Language Model

StepFun: Step 3.5 Flash
Flash Reasoning 196B MoE
---

[Try StepFun: Step 3.5 Flash](/models/open_router/stepfun-step-3.5-flash) [API Documentation](https://docs.modelslab.com)

Reason Deep. Run Fast.
---

MoE Efficiency

### 11B Active Params

Activates only 11B of its 196B parameters per token via sparse MoE routing, delivering large-model reasoning at 11B-class speed.

Blazing Speed

### 100-300 Tok/s

3-way Multi-Token Prediction delivers 100-300 tok/s, peaking at 350 tok/s for coding.
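Multi-Token Prediction can be pictured as draft-then-verify decoding: a small head drafts several tokens ahead, and the main model accepts the matching prefix in one pass. A toy sketch of the acceptance step (not StepFun's actual MTP implementation):

```python
def speculative_accept(draft: list[int], target: list[int]) -> list[int]:
    """Accept the longest prefix of a multi-token draft that matches the
    tokens the main model would have produced; every accepted token is
    one fewer sequential forward pass."""
    accepted = []
    for d, t in zip(draft, target):
        if d != t:
            break
        accepted.append(d)
    return accepted

# A 3-token draft with 2 correct guesses yields 2 tokens for the cost of
# roughly one verification step.
assert speculative_accept([5, 9, 2], [5, 9, 7]) == [5, 9]
```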

Long Context

### 256K Window

Hybrid Sliding Window Attention handles 256K context with low compute overhead.
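Sliding-window attention can be sketched as a banded causal mask, which is why compute grows linearly rather than quadratically with context length. A minimal illustration (the window size here is illustrative, not the model's actual configuration):

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal sliding-window attention mask: token i may attend to
    tokens j with i - window < j <= i."""
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(seq_len=8, window=3)
# Each row has at most `window` True entries, so per-token attention cost
# is bounded regardless of total sequence length.
assert all(sum(row) <= 3 for row in mask)
```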

Examples

See what StepFun: Step 3.5 Flash can create
---

Copy any prompt below and try it yourself in the [playground](/models/open_router/stepfun-step-3.5-flash).

Math Proof

“Solve this AIME-level math problem step-by-step: Prove that for integers n > 1, the sum of divisors function σ(n) satisfies certain bounds. Use chain-of-thought reasoning and verify with code execution if needed.”

Code Agent

“Write a Python function to parse a large codebase, identify bugs in async handlers, and suggest fixes. Output the refactored code with explanations.”

Logic Chain

“Analyze this complex logic puzzle involving 10 agents with constraints. Deduce the solution through multi-step reasoning, listing assumptions and eliminations.”

Data Summary

“Summarize key insights from a 200K token dataset on AI benchmarks, highlighting trends in MoE vs dense models, with quantitative comparisons.”

For Developers

Agentic inference in a few lines of code.
---

ModelsLab handles the infrastructure: fast inference, auto-scaling, and a developer-friendly API. No GPU management needed.

- **Serverless:** scales to zero, scales to millions
- **Pay per token,** no minimums
- **Python and JavaScript SDKs,** plus REST API

[API Documentation ](https://docs.modelslab.com)



```
import requests

response = requests.post(
    "https://modelslab.com/api/v7/llm/chat/completions",
    json={
        "key": "YOUR_API_KEY",  # your ModelsLab API key
        "prompt": "Solve this AIME-level math problem step-by-step: ...",  # any prompt, e.g. from the examples above
        "model_id": "stepfun-step-3.5-flash",  # model slug (assumed from this page's URL)
    },
)
print(response.json())
```

FAQ

Common questions about StepFun: Step 3.5 Flash
---

[Read the docs ](https://docs.modelslab.com)

### What is StepFun: Step 3.5 Flash?

StepFun: Step 3.5 Flash is an open-source 196B MoE LLM that activates 11B params per token. It excels at agentic reasoning, coding, and math at 100-300 tok/s, and supports 256K context via hybrid attention.

### How fast is StepFun: Step 3.5 Flash?

It achieves 100-300 tok/s typical throughput with 3-way Multi-Token Prediction (MTP-3), peaking at 350 tok/s for coding workloads on Hopper GPUs. This enables real-time multi-step reasoning.

### What is StepFun: Step 3.5 Flash API used for?

The StepFun: Step 3.5 Flash API powers fast agentic workflows, deep math reasoning, and long-context tasks. It is well suited to low-VRAM inference on unified-memory hardware and rivals proprietary models on benchmarks.

### StepFun: Step 3.5 Flash model specs?

196B total params, 11B active, 45-layer transformer, 256K context, 128K vocab. Uses 288 fine-grained experts per layer, top-8 routed. FP8 quantized.
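The top-8-of-288 expert routing above can be illustrated conceptually: a router scores every expert, keeps the top 8, and renormalizes their softmax weights. A toy sketch in plain Python (not StepFun's actual router):

```python
import math
import random

def top_k_route(router_logits: list[float], k: int = 8) -> list[tuple[int, float]]:
    """Select the top-k experts by router logit and renormalize their
    softmax weights, as in fine-grained MoE routing."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    exps = [math.exp(router_logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(288)]  # one logit per expert
routed = top_k_route(logits)
# Only 8 of 288 experts receive the token; their weights sum to 1.
assert len(routed) == 8
assert abs(sum(w for _, w in routed) - 1.0) < 1e-9
```

Because only the routed experts' weights are touched per token, the compute per token scales with the active parameters (11B), not the total (196B).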

### Is the StepFun: Step 3.5 Flash API an alternative to closed models?

Yes. StepFun: Step 3.5 Flash matches GPT/Claude/Gemini-class models on math (AIME 99.8%) and agentic benchmarks (ARC-AGI 56.5%), while remaining open-source and more efficient. Deploy it via the API for production workloads.

### StepFun: Step 3.5 Flash vs dense models?

MoE design gives 196B intelligence at 11B speed/latency. Outperforms on agentic tasks while using less VRAM. Handles long contexts cost-efficiently.

Ready to create?
---

Start generating with StepFun: Step 3.5 Flash on ModelsLab.

[Try StepFun: Step 3.5 Flash](/models/open_router/stepfun-step-3.5-flash) [API Documentation](https://docs.modelslab.com)

---

*This markdown version is optimized for AI agents and LLMs.*

**Links:**
- [Website](https://modelslab.com)
- [API Documentation](https://docs.modelslab.com)
- [Blog](https://modelslab.com/blog)

---
*Generated by ModelsLab - 2026-05-13*