Step 3.5 Flash
196B MoE (11B active)
196B MoE with only 11B active params. Open-weight (Apache 2.0). Uses MTP-3 for fast inference. One of the cheapest capable models available. Also on OpenRouter.
Pricing
| Type | USD / 1M tokens |
|---|---|
| Input | $0.1 |
| Output | $0.3 |
| Cached Input | $0.02 |
vs Western Models
Benchmarks
| Benchmark | Score |
|---|---|
| HumanEval | 81.1% |
IDE Setup
Cursor
Cline
Strengths
- + extreme cost-efficiency
- + fast inference (100-300 tok/s)
- + open-weight
- + agentic tasks
Watch Out For
- ! content censorship
- ! no flagship-tier model
- ! small ecosystem
International Access
Frequently Asked Questions
How much does Step 3.5 Flash cost?
Step 3.5 Flash costs $0.1 per million input tokens and $0.3 per million output tokens (USD). Cached input is $0.02/M.
Can I use Step 3.5 Flash from outside China?
Yes. Step 3.5 Flash is directly accessible internationally. Registration requires account + Stripe. Payment via credit card.
Is Step 3.5 Flash compatible with the OpenAI API?
Yes. Step 3.5 Flash uses an OpenAI-compatible API. You can use any OpenAI SDK by changing the base URL to api.stepfun.com.
What is the context window of Step 3.5 Flash?
Step 3.5 Flash supports a 262K token context window (262,144 tokens).
How do I use Step 3.5 Flash in Cursor or Cline?
Set the base URL to https://api.stepfun.com/v1 and the model name to step-3.5-flash. Both Cursor and Cline support OpenAI-compatible providers.
Related Articles & Guides
Ready to try Step 3.5 Flash?
Get your API key from the official platform.
Go to StepFun →