
The Complete Guide to Using Chinese AI Models from Outside China

Registration, payment, API setup, and the gotchas nobody tells you about. DeepSeek, Qwen, Kimi, and GLM — from signup to production.

61% of OpenRouter’s tokens now go to Chinese models. DeepSeek V3.2 costs $0.28 per million input tokens — that’s 6x cheaper than GPT-5.2. Yet most developers outside China have never signed up for a single Chinese LLM API.

The reason isn’t quality. It’s friction. Chinese docs, unclear registration flows, and payment walls that assume you have Alipay.

This guide eliminates that friction. Every step tested from a US-based machine in March 2026.

The Landscape: What’s Actually Available

Not all Chinese LLM providers are created equal for international users. Here’s the real access situation:

Provider         Direct API Access       Registration             Payment
DeepSeek         Yes                     Email only               Credit card
Qwen (Alibaba)   Yes                     Alibaba Cloud account    Credit card
Kimi (Moonshot)  Yes                     Email or phone           Credit card
GLM (Zhipu)      Yes                     Account on z.ai          Credit card
MiniMax          Yes (2-3 day approval)  Email + phone            Credit card
Baidu ERNIE      No direct access        Chinese phone required   Alipay/bank transfer

Bottom line: DeepSeek, Qwen, Kimi, and GLM are all directly accessible. Baidu is the only major provider that’s effectively walled off — use OpenRouter if you need ERNIE.

DeepSeek: The Fastest Setup

DeepSeek has the simplest onboarding of any Chinese provider. Five minutes, no tricks.

Step 1: Create Account

Go to platform.deepseek.com. Click Sign Up. Enter your email. Done.

No Chinese phone number. No identity verification. No waiting period.

Step 2: Get API Key

Dashboard → API Keys → Create. Copy it. That’s your key.

Step 3: Add Credits

Top up → Credit card. Minimum $5. Your card’s international transaction fee may apply, but the charge itself goes through cleanly on Visa/Mastercard.

Step 4: Make Your First Call

curl https://api.deepseek.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

That’s it. The API is OpenAI-compatible — any library that talks to OpenAI works with DeepSeek by changing the base URL.
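Because the API is OpenAI-compatible, the same call can be assembled in any language. A minimal Python sketch of the request the curl command above sends — pure construction, no network call, and `YOUR_API_KEY` stays a placeholder:

```python
import json

def chat_request(base_url: str, api_key: str, model: str, prompt: str) -> dict:
    """Build the URL, headers, and JSON body for an OpenAI-compatible
    /chat/completions call. Hand the result to any HTTP client."""
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = chat_request("https://api.deepseek.com/v1", "YOUR_API_KEY",
                   "deepseek-chat", "Hello")
```

Swap `base_url` and `model` and the identical shape works for every provider in this guide.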

Available Models

  • deepseek-chat — General purpose (V3.2). $0.28/$0.42 per M tokens.
  • deepseek-reasoner — Chain-of-thought reasoning (R1). $0.55/$2.19 per M tokens.
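At those rates, budgeting is simple arithmetic. A quick sketch using the V3.2 list prices above:

```python
# DeepSeek V3.2 (deepseek-chat) list prices, USD per million tokens
INPUT_PER_M, OUTPUT_PER_M = 0.28, 0.42

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated bill for one workload at deepseek-chat rates."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# 10M input + 2M output tokens comes to about $3.64
monthly = cost_usd(10_000_000, 2_000_000)
```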

Qwen (Alibaba Cloud): Best for International Developers

Alibaba has invested the most in international developer experience. Multiple regional endpoints, English docs, and a familiar cloud console.

Step 1: Create Alibaba Cloud Account

Go to alibabacloud.com. Sign up with email. You’ll land on the cloud console.

Step 2: Enable Model Studio

Navigate to Model Studio (or search for it). Accept the terms of service. This activates your API access.

Step 3: Get API Key

Model Studio → API Keys → Create. Save it.

Step 4: Choose Your Endpoint

Alibaba offers regional endpoints for lower latency:

Endpoint                       Region
dashscope.aliyuncs.com         China (default)
dashscope-intl.aliyuncs.com    International
dashscope-us.aliyuncs.com      US

Use the international or US endpoint from outside China.

Step 5: Make Your First Call

curl https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-plus",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Key Models

  • qwen3.5-plus — Latest flagship. 1M context. $0.11/$0.66 per M tokens.
  • qwen3.5-flash — Budget multimodal. 1M context. $0.028/$0.275 per M tokens.
  • qwen3-max — Previous top-tier. 262K context. $0.34/$1.37 per M tokens.

The pricing is staggeringly low. Qwen3.5-Flash at $0.028/M input is 60x cheaper than GPT-5.2.

Kimi (Moonshot AI): Agent Specialist

Step 1: Create Account

Go to platform.moonshot.ai. Sign up with email.

Step 2: Get API Key and Top Up

Dashboard → API Keys → Create. Add credits via credit card.

Step 3: Use It

curl https://api.moonshot.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Key Models

  • kimi-k2.5 — Latest flagship with agent swarm. 256K context. $0.60/$2.50 per M tokens.
  • kimi-k2-thinking — Reasoning variant. $0.47/$2.00 per M tokens.

Kimi’s differentiator is the Agent Swarm feature — it can coordinate up to 100 specialized agents on complex tasks. Automatic context caching gives you 75% cost reduction on repeated prompts.
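The caching claim is easy to model. A sketch of effective input price under the 75%-off assumption — the exact discount mechanics are Moonshot's to define, so treat this as an estimate, not billing truth:

```python
LIST_INPUT = 0.60                 # kimi-k2.5 input, $/M tokens
CACHED_INPUT = LIST_INPUT * 0.25  # 75% reduction on cache hits

def effective_input_price(cache_hit_ratio: float) -> float:
    """Blended $/M input tokens for a workload with this cache hit rate."""
    return cache_hit_ratio * CACHED_INPUT + (1 - cache_hit_ratio) * LIST_INPUT

# An agent loop that re-sends 80% of its context each turn
# pays roughly $0.24/M input instead of $0.60/M
blended = effective_input_price(0.8)
```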

GLM (Zhipu AI): Free Tier Available

Step 1: Create Account

International users: go to z.ai. Chinese users: open.bigmodel.cn.

Step 2: Get API Key

Dashboard → API Keys → Create.

Step 3: Start Free

GLM-4.7-Flash is completely free. No credit card, no daily limits. Good for prototyping.

curl https://api.z.ai/api/paas/v4/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Key Models

  • glm-5 — 744B MoE flagship. Near-Opus coding. $1.00/$3.20 per M tokens.
  • glm-5-code — Coding specialist. $1.20/$5.00 per M tokens.
  • glm-4.7-flash — Free tier. 128K context. $0/$0.

IDE Setup: Cursor and Cline

Every model above is OpenAI-compatible. Setting up your IDE takes 30 seconds.

Cursor

Settings → Models → Add Model:

Provider   Base URL                                                Model
DeepSeek   https://api.deepseek.com/v1                             deepseek-chat
Qwen       https://dashscope-intl.aliyuncs.com/compatible-mode/v1  qwen3.5-plus
Kimi       https://api.moonshot.ai/v1                              kimi-k2.5
GLM        https://api.z.ai/api/paas/v4/                           glm-5

Cline

Same base URLs and model names. Settings → Custom API Provider → OpenAI Compatible.
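The same settings work for scripts. A config-dict sketch for swapping providers behind one OpenAI-compatible client — the default model choices here are mine, mirroring the table; pick whatever fits your workload:

```python
# (base_url, default model) per provider
PROVIDERS = {
    "deepseek": ("https://api.deepseek.com/v1", "deepseek-chat"),
    "qwen": ("https://dashscope-intl.aliyuncs.com/compatible-mode/v1", "qwen3.5-plus"),
    "kimi": ("https://api.moonshot.ai/v1", "kimi-k2.5"),
    "glm": ("https://api.z.ai/api/paas/v4/", "glm-5"),
}

def client_config(provider: str) -> dict:
    """Keyword arguments for an OpenAI-style client constructor."""
    base_url, model = PROVIDERS[provider]
    return {"base_url": base_url, "model": model}
```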

The Gotchas

Content Filtering

Every Chinese model filters politically sensitive content. Prompts touching Taiwan sovereignty, Tiananmen, Xinjiang, or certain political figures will be refused or answered with sanitized responses.

For most developer use cases (coding, analysis, data processing), this never matters. If your application involves political content generation, use Western models for those specific requests.

Latency

Expect 150-400ms latency from the US to Chinese API endpoints. This is fine for most applications but noticeable compared to US-hosted models.

Qwen’s US endpoint (dashscope-us.aliyuncs.com) helps if latency is critical.

Rate Limits

Free tiers and new accounts often have strict rate limits. DeepSeek’s free tier can be as low as 2 RPM. Paid tiers are more generous but check the docs for your specific plan.
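When you do hit a 429, back off exponentially instead of hammering the endpoint. A minimal retry-schedule sketch — the base and cap values are arbitrary starting points, tune them per provider:

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Seconds to sleep before each retry: 1, 2, 4, ... capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]

# Sleep schedule for five retries after a 429 response
schedule = backoff_delays(5)  # [1.0, 2.0, 4.0, 8.0, 16.0]
```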

Token Counting

Most Chinese providers tokenize CJK text differently from OpenAI, and a Chinese character typically consumes more tokens than an English character does. This rarely matters if your workload is English-heavy.

Billing Quirks

Some providers (Qwen) use tiered pricing where longer prompts cost more per token. Others (DeepSeek) have flat pricing. Always check the pricing page for your specific model.
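Tiered pricing is worth modeling before you commit to long prompts. A sketch with made-up tier boundaries — illustrative only, not Qwen's actual tiers; check the pricing page for real numbers:

```python
# (max prompt tokens, $/M input) -- illustrative tiers, not real prices
TIERS = [(128_000, 0.11), (1_000_000, 0.22)]

def tiered_input_price(prompt_tokens: int) -> float:
    """Per-million input price for a prompt of this length."""
    for limit, price in TIERS:
        if prompt_tokens <= limit:
            return price
    raise ValueError("prompt exceeds largest tier")
```

The takeaway: under tiered pricing, a prompt one token over a boundary can cost more per token for the whole request, so measure your real prompt lengths.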

The Third-Party Alternative: OpenRouter

If you don’t want to manage multiple accounts, OpenRouter aggregates most Chinese models behind a single API key. You pay a small markup but get:

  • One account for all providers
  • Automatic failover
  • No Chinese provider registration
  • US-based billing

The tradeoff: slightly higher prices and you’re adding a middleman to your latency.

What to Use When

Use Case            Recommended      Why
General coding      DeepSeek V3.2    Best price/performance, battle-tested
Long documents      Qwen3.5-Plus     1M context at $0.11/M input
Budget workloads    Qwen3.5-Flash    $0.028/M input, almost free
Complex reasoning   DeepSeek R1      97.3% MATH-500, transparent CoT
Prototyping         GLM-4.7-Flash    Literally free
Agent workflows     Kimi K2.5        Agent Swarm, auto-caching
Coding-heavy        GLM-5            Near-Opus coding at $1/M input
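If you route requests programmatically, the recommendations above collapse into a lookup. A sketch — the use-case keys are mine, the model IDs come from each provider's section earlier:

```python
# Use case -> model ID, per the recommendations above
ROUTES = {
    "general coding": "deepseek-chat",
    "long documents": "qwen3.5-plus",
    "budget workloads": "qwen3.5-flash",
    "complex reasoning": "deepseek-reasoner",
    "prototyping": "glm-4.7-flash",
    "agent workflows": "kimi-k2.5",
    "coding-heavy": "glm-5",
}

def pick_model(use_case: str) -> str:
    """Fall back to deepseek-chat when the use case isn't listed."""
    return ROUTES.get(use_case.lower(), "deepseek-chat")
```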

The Chinese LLM market moves fast. Models and pricing change monthly. Check our model directory for the latest data.