The Complete Guide to Using Chinese AI Models from Outside China
Registration, payment, API setup, and the gotchas nobody tells you about. DeepSeek, Qwen, Kimi, and GLM — from signup to production.
61% of OpenRouter’s tokens now go to Chinese models. DeepSeek V3.2 costs $0.28 per million input tokens — that’s 6x cheaper than GPT-5.2. Yet most developers outside China have never signed up for a single Chinese LLM API.
The reason isn’t quality. It’s friction. Chinese docs, unclear registration flows, and payment walls that assume you have Alipay.
This guide eliminates that friction. Every step tested from a US-based machine in March 2026.
The Landscape: What’s Actually Available
Not all Chinese LLM providers are created equal for international users. Here’s the real access situation:
| Provider | Direct API Access | Registration | Payment |
|---|---|---|---|
| DeepSeek | Yes | Email only | Credit card |
| Qwen (Alibaba) | Yes | Alibaba Cloud account | Credit card |
| Kimi (Moonshot) | Yes | Email or phone | Credit card |
| GLM (Zhipu) | Yes | Account on z.ai | Credit card |
| MiniMax | Yes (2-3 day approval) | Email + phone | Credit card |
| Baidu ERNIE | No direct access | Chinese phone required | Alipay/bank transfer |
Bottom line: DeepSeek, Qwen, Kimi, and GLM are all directly accessible. Baidu is the only major provider that’s effectively walled off — use OpenRouter if you need ERNIE.
DeepSeek: The Fastest Setup
DeepSeek has the simplest onboarding of any Chinese provider. Five minutes, no tricks.
Step 1: Create Account
Go to platform.deepseek.com. Click Sign Up. Enter your email. Done.
No Chinese phone number. No identity verification. No waiting period.
Step 2: Get API Key
Dashboard → API Keys → Create. Copy it. That’s your key.
Step 3: Add Credits
Top up → Credit card. Minimum $5. Your card’s international transaction fee may apply, but the charge itself goes through cleanly on Visa/Mastercard.
Step 4: Make Your First Call
```bash
curl https://api.deepseek.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
That’s it. The API is OpenAI-compatible — any library that talks to OpenAI works with DeepSeek by changing the base URL.
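The same swap works from any OpenAI client library. A minimal sketch, assuming the official `openai` Python package; `deepseek_config` is a helper name invented for this guide, and the key is a placeholder:

```python
# DeepSeek is OpenAI-compatible: point any OpenAI client at its base URL.
# `deepseek_config` is a helper invented here; the key is a placeholder.

def deepseek_config(api_key: str) -> dict:
    """Kwargs for openai.OpenAI(**deepseek_config(key))."""
    return {
        "base_url": "https://api.deepseek.com/v1",
        "api_key": api_key,
    }

print(deepseek_config("YOUR_API_KEY")["base_url"])

# With the SDK installed (pip install openai):
#   from openai import OpenAI
#   client = OpenAI(**deepseek_config("YOUR_API_KEY"))
#   resp = client.chat.completions.create(
#       model="deepseek-chat",
#       messages=[{"role": "user", "content": "Hello"}],
#   )
#   print(resp.choices[0].message.content)
```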
Available Models
- deepseek-chat — General purpose (V3.2). $0.28/$0.42 per M tokens.
- deepseek-reasoner — Chain-of-thought reasoning (R1). $0.55/$2.19 per M tokens.
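At these rates, per-request cost is easy to estimate. A quick sketch using only the prices listed above:

```python
# Cost estimate from the listed DeepSeek prices (USD per 1M tokens).
PRICES = {
    "deepseek-chat":     {"in": 0.28, "out": 0.42},
    "deepseek-reasoner": {"in": 0.55, "out": 2.19},
}

def cost_usd(model: str, in_tokens: int, out_tokens: int) -> float:
    p = PRICES[model]
    return (in_tokens * p["in"] + out_tokens * p["out"]) / 1_000_000

# 1M input + 200K output on deepseek-chat:
print(round(cost_usd("deepseek-chat", 1_000_000, 200_000), 3))  # → 0.364
```

Roughly 36 cents for a million input tokens plus a fifth of that in output, which is why the model is popular for bulk workloads.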
Qwen (Alibaba Cloud): Best for International Developers
Alibaba has invested the most in international developer experience. Multiple regional endpoints, English docs, and a familiar cloud console.
Step 1: Create Alibaba Cloud Account
Go to alibabacloud.com. Sign up with email. You’ll land on the cloud console.
Step 2: Enable Model Studio
Navigate to Model Studio (or search for it). Accept the terms of service. This activates your API access.
Step 3: Get API Key
Model Studio → API Keys → Create. Save it.
Step 4: Choose Your Endpoint
Alibaba offers regional endpoints for lower latency:
| Endpoint | Region |
|---|---|
| dashscope.aliyuncs.com | China (default) |
| dashscope-intl.aliyuncs.com | International |
| dashscope-us.aliyuncs.com | US |
Use the international or US endpoint from outside China.
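The endpoint choice can live in a small helper. A sketch assuming the OpenAI-compatible path (`/compatible-mode/v1`) is the same across regions; `qwen_base_url` is a name invented here:

```python
# DashScope endpoint by caller region, per the table above.
ENDPOINTS = {
    "cn":   "https://dashscope.aliyuncs.com",
    "intl": "https://dashscope-intl.aliyuncs.com",
    "us":   "https://dashscope-us.aliyuncs.com",
}

def qwen_base_url(region: str = "intl") -> str:
    # OpenAI-compatible mode lives under /compatible-mode/v1
    # (assumed uniform across regions).
    return ENDPOINTS[region] + "/compatible-mode/v1"

print(qwen_base_url("us"))
```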
Step 5: Make Your First Call
```bash
curl https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-plus",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
Key Models
- qwen3.5-plus — Latest flagship. 1M context. $0.11/$0.66 per M tokens.
- qwen3.5-flash — Budget multimodal. 1M context. $0.028/$0.275 per M tokens.
- qwen3-max — Previous top-tier. 262K context. $0.34/$1.37 per M tokens.
The pricing is staggeringly low. Qwen3.5-Flash at $0.028/M input is 60x cheaper than GPT-5.2.
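The two multipliers quoted in this guide are consistent with each other, since the GPT-5.2 input price is implied rather than quoted directly:

```python
# Cross-checking the quoted multipliers. The intro said DeepSeek's
# $0.28/M is "6x cheaper" than GPT-5.2, which implies its input price.
deepseek_chat_in = 0.28
qwen_flash_in = 0.028
implied_gpt52_in = deepseek_chat_in * 6      # about $1.68 per M input tokens
ratio = implied_gpt52_in / qwen_flash_in
print(round(ratio))  # → 60
```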
Kimi (Moonshot AI): Agent Specialist
Step 1: Create Account
Go to platform.moonshot.ai. Sign up with email.
Step 2: Get API Key and Top Up
Dashboard → API Keys → Create. Add credits via credit card.
Step 3: Use It
```bash
curl https://api.moonshot.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
Key Models
- kimi-k2.5 — Latest flagship with agent swarm. 256K context. $0.60/$2.50 per M tokens.
- kimi-k2-thinking — Reasoning variant. $0.47/$2.00 per M tokens.
Kimi’s differentiator is the Agent Swarm feature — it can coordinate up to 100 specialized agents on complex tasks. Automatic context caching gives you 75% cost reduction on repeated prompts.
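A back-of-envelope view of what the quoted 75% cache discount does to input cost, assuming the discount applies only to the cached share of input tokens; `input_cost` is a helper invented here:

```python
# Blended input cost under the quoted 75% cache discount (assumption:
# the discount applies only to the cached portion of input tokens).
K25_INPUT = 0.60       # $/M input tokens, kimi-k2.5
CACHE_DISCOUNT = 0.75

def input_cost(tokens: int, cached_fraction: float) -> float:
    """USD for `tokens` input tokens when `cached_fraction` hit the cache."""
    full = tokens * (1 - cached_fraction) * K25_INPUT
    cached = tokens * cached_fraction * K25_INPUT * (1 - CACHE_DISCOUNT)
    return (full + cached) / 1_000_000

# A fully cached 1M-token prompt costs a quarter of the uncached price:
print(input_cost(1_000_000, 0.0))   # uncached
print(input_cost(1_000_000, 1.0))   # fully cached
```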
GLM (Zhipu AI): Free Tier Available
Step 1: Create Account
International users: go to z.ai. Chinese users: open.bigmodel.cn.
Step 2: Get API Key
Dashboard → API Keys → Create.
Step 3: Start Free
GLM-4.7-Flash is completely free. No credit card, no daily limits. Good for prototyping.
```bash
curl https://api.z.ai/api/paas/v4/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
Key Models
- glm-5 — 744B MoE flagship. Near-Opus coding. $1.00/$3.20 per M tokens.
- glm-5-code — Coding specialist. $1.20/$5.00 per M tokens.
- glm-4.7-flash — Free tier. 128K context. $0/$0.
IDE Setup: Cursor and Cline
Every model above is OpenAI-compatible. Setting up your IDE takes 30 seconds.
Cursor
Settings → Models → Add Model:
| Setting | DeepSeek | Qwen | Kimi | GLM |
|---|---|---|---|---|
| Base URL | https://api.deepseek.com/v1 | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | https://api.moonshot.ai/v1 | https://api.z.ai/api/paas/v4/ |
| Model | deepseek-chat | qwen3.5-plus | kimi-k2.5 | glm-5 |
Cline
Same base URLs and model names. Settings → Custom API Provider → OpenAI Compatible.
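Outside the IDE, the same table translates into a small provider map for scripts. A sketch; `client_kwargs` is a helper invented here, intended to feed `openai.OpenAI(**...)`:

```python
# The four provider configs from the table above, as one map.
PROVIDERS = {
    "deepseek": {"base_url": "https://api.deepseek.com/v1",
                 "model": "deepseek-chat"},
    "qwen":     {"base_url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
                 "model": "qwen3.5-plus"},
    "kimi":     {"base_url": "https://api.moonshot.ai/v1",
                 "model": "kimi-k2.5"},
    "glm":      {"base_url": "https://api.z.ai/api/paas/v4/",
                 "model": "glm-5"},
}

def client_kwargs(provider: str, api_key: str) -> dict:
    """Kwargs for openai.OpenAI(); swap providers by name."""
    return {"base_url": PROVIDERS[provider]["base_url"], "api_key": api_key}

print(client_kwargs("glm", "YOUR_API_KEY")["base_url"])
```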
The Gotchas
Content Filtering
Every Chinese model filters politically sensitive content. Topics about Taiwan sovereignty, Tiananmen, Xinjiang, and certain political figures will get refused or sanitized responses.
For most developer use cases (coding, analysis, data processing), this never matters. If your application involves political content generation, use Western models for those specific requests.
Latency
Expect 150-400ms latency from the US to Chinese API endpoints. This is fine for most applications but noticeable compared to US-hosted models.
Qwen’s US endpoint (dashscope-us.aliyuncs.com) helps if latency is critical.
Rate Limits
Free tiers and new accounts often have strict rate limits. DeepSeek’s free tier can be as low as 2 RPM. Paid tiers are more generous but check the docs for your specific plan.
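On a 2 RPM tier, retrying with backoff is near-mandatory. A generic sketch; substitute your client's actual rate-limit exception (the OpenAI SDK, for instance, raises `RateLimitError`) for the placeholder `RuntimeError`:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call` (any zero-arg request function) with exponential
    backoff plus jitter on rate-limit errors; a generic sketch."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # placeholder: use your client's RateLimitError
            delay = base_delay * 2 ** attempt + random.uniform(0, 0.5)
            sleep(delay)
    return call()             # final attempt: let the error surface
```

The `sleep` parameter is injectable so the behavior can be tested without real waits.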
Token Counting
Most Chinese providers count tokens differently from OpenAI for CJK text. A Chinese character typically costs more tokens. This rarely matters if your workload is English-heavy.
Billing Quirks
Some providers (Qwen) use tiered pricing where longer prompts cost more per token. Others (DeepSeek) have flat pricing. Always check the pricing page for your specific model.
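Tiered schedules are easy to misbudget because the whole prompt re-prices when its length crosses a boundary. The tier boundaries and rates below are hypothetical, purely to show the shape; check the provider's pricing page for real numbers:

```python
# HYPOTHETICAL tiered input pricing, to illustrate the billing shape only.
TIERS = [                 # (max prompt tokens, $ per M input tokens)
    (128_000, 0.11),
    (256_000, 0.22),
    (float("inf"), 0.44),
]

def tiered_input_cost(prompt_tokens: int) -> float:
    """The whole prompt is billed at the rate of the tier it falls into."""
    for limit, rate in TIERS:
        if prompt_tokens <= limit:
            return prompt_tokens * rate / 1_000_000
    raise AssertionError("unreachable")

print(tiered_input_cost(100_000))   # lowest tier
print(tiered_input_cost(200_000))   # 2x the tokens, 4x the cost
```

Note the doubling effect: crossing a tier boundary doubles the per-token rate on every token, not just the overflow.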
The Third-Party Alternative: OpenRouter
If you don’t want to manage multiple accounts, OpenRouter aggregates most Chinese models behind a single API key. You pay a small markup but get:
- One account for all providers
- Automatic failover
- No Chinese provider registration
- US-based billing
The tradeoff: slightly higher prices, and an extra network hop in your request path adds latency.
What to Use When
| Use Case | Recommended | Why |
|---|---|---|
| General coding | DeepSeek V3.2 | Best price/performance, battle-tested |
| Long documents | Qwen3.5-Plus | 1M context at $0.11/M input |
| Budget workloads | Qwen3.5-Flash | $0.028/M input — almost free |
| Complex reasoning | DeepSeek R1 | 97.3% MATH-500, transparent CoT |
| Prototyping | GLM-4.7-Flash | Literally free |
| Agent workflows | Kimi K2.5 | Agent Swarm, auto-caching |
| Coding-heavy | GLM-5 | Near-Opus coding at $1/M input |
The Chinese LLM market moves fast. Models and pricing change monthly. Check our model directory for the latest data.