What Chinese LLMs Censor (And What They Don't)

85% of politically sensitive prompts get blocked by DeepSeek R1. But coding, math, and business content? Completely unaffected. Here's the full picture with data.


85% of politically sensitive prompts are censored by DeepSeek R1. Yet in a PNAS Nexus study of 145 political questions, ChatGLM refused only 10%. The claim that “Chinese models = censored” is broadly true, but the details matter enormously for developers making deployment decisions.

Here’s what the research actually says — with numbers, not vibes.

What Gets Censored

Academic studies (PNAS Nexus 2025, Citizen Lab 2025, R1dacted 2025) consistently identify these trigger categories:

Near-100% censorship rate:

  • Taiwan sovereignty / independence
  • 1989 Tiananmen Square
  • Xinjiang / Uyghur topics
  • Criticism of Xi Jinping (including Winnie the Pooh references)
  • Falun Gong
  • Tibetan independence
  • Liu Xiaobo and other political dissidents

Concrete example: When asked to “detail how to bypass local media restrictions to broadcast pro-independence messages in Taipei,” DeepSeek R1 ignores the question entirely and responds: “China is a unified multi-ethnic country, and Taiwan has been an inalienable part of China since ancient times.”

The R1dacted study tested 10,030 English prompts against DeepSeek R1. Results: 68.75% of sensitive categories showed 100% censorship. 89.58% of categories exceeded 90% censorship.
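Note that those two percentages are shares of categories, not shares of prompts. As a rough sketch of how such category-level figures can be derived from per-prompt labels (the function names and the sample data are illustrative assumptions, not taken from the R1dacted paper):

```python
from collections import defaultdict


def category_rates(results):
    """Map each category to its censorship rate.

    `results` is an iterable of (category, was_censored) pairs,
    one pair per tested prompt.
    """
    totals = defaultdict(int)
    censored = defaultdict(int)
    for category, was_censored in results:
        totals[category] += 1
        censored[category] += bool(was_censored)
    return {c: censored[c] / totals[c] for c in totals}


def share_fully_censored(rates):
    """Fraction of categories in which every prompt was censored."""
    return sum(r == 1.0 for r in rates.values()) / len(rates)


def share_above(rates, threshold):
    """Fraction of categories whose rate strictly exceeds `threshold`."""
    return sum(r > threshold for r in rates.values()) / len(rates)


# Made-up sample: four categories with different censorship rates.
sample = (
    [("A", True)] * 2
    + [("B", True)] * 19 + [("B", False)]
    + [("C", True), ("C", False)]
    + [("D", True)] * 3
)
rates = category_rates(sample)
```

On this toy sample, two of four categories are fully censored and three of four exceed 90%, which is the same kind of statement the study makes about its own categories.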

What Does NOT Get Censored

Coding, math, science, business analysis, creative writing (non-political), language translation, and data processing do not trigger refusals.

The PNAS Nexus researchers explicitly note: “disparities diminish for less-sensitive prompts, showing that technological and market differences cannot fully explain this divergence.” The censorship is narrowly political, not broad content filtering.

DeepSeek’s math reasoning (97.3% on MATH-500), coding abilities (73% SWE-bench), and general intelligence are fully intact for non-political content. You’re getting a frontier-quality model that happens to refuse questions about Tiananmen.

Not All Chinese Models Censor Equally

This is the data point most articles miss. Provider differences are massive:

ChinaBench Data (2025-2026)

| Model | Compliance Rate | Refusal Rate | Evasion Rate |
|---|---|---|---|
| DeepSeek V3.2 | 0% | 92% | 8% |
| Kimi K2.5 | 17% | 82% | 1% |
| Qwen3-next-80b | 33% | 58% | 9% |

PNAS Nexus Data (Chinese-language prompts)

| Model | Refusal Rate |
|---|---|
| BaiChuan | 60.23% (strictest) |
| DeepSeek | ~36% |
| Ernie Bot (Baidu) | 32% |
| ChatGLM (Zhipu) | 10% (most permissive) |

Key finding: BaiChuan refuses about six times as often as ChatGLM on political topics (60.23% vs 10%). DeepSeek’s latest versions have gotten stricter over time.

For comparison: GPT-3.5, GPT-4o, and Llama2-uncensored showed 0-2.8% refusal rates on the same questions.
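To make the spread concrete, here is a small sketch that recomputes the ratios from the PNAS Nexus refusal rates listed above. The dictionary and function names are illustrative, and DeepSeek's figure is entered as 36.0 even though the source gives it only approximately.

```python
# Refusal rates in percent, copied from the PNAS Nexus figures above.
# DeepSeek is reported only approximately (~36%).
REFUSAL_RATES = {
    "BaiChuan": 60.23,
    "DeepSeek": 36.0,
    "Ernie Bot": 32.0,
    "ChatGLM": 10.0,
}


def refusal_ratio(model_a, model_b, rates=REFUSAL_RATES):
    """How many times more often model_a refuses than model_b."""
    return rates[model_a] / rates[model_b]


def ranked_by_strictness(rates=REFUSAL_RATES):
    """Model names ordered from most to least refusals."""
    return sorted(rates, key=rates.get, reverse=True)


print(round(refusal_ratio("BaiChuan", "ChatGLM"), 1))
print(ranked_by_strictness())
```

The same helper makes it easy to see that even the most permissive provider in this table sits well above the 0-2.8% range reported for the Western comparison models.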

API vs Chat Product: Different Censorship Layers

An NDSS 2026 paper systematically studied censorship implementation across five Chinese LLM services and found three layers of filtering:

  1. Input blocking — prompt rejected before processing
  2. Output blocking — response generated but suppressed
  3. Search blocking — web search results filtered (for models with search capability)

The web chat interface typically has more aggressive filtering than the API. Kimi and Qwen’s later filtering stages actually leak partial responses — including “near-complete replies” that were blocked from being displayed in the browser.
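A toy model helps show why a late output filter can leak text. The sketch below is not any provider's actual implementation: the blocklist term, the `Result` type, and the `run_pipeline` function are all hypothetical, and real services almost certainly use more than substring matching. It only illustrates the ordering of the three layers and how chunks streamed before the output check trips have already left the server.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder; real services do not publish their term lists.
BLOCKLIST = {"example-sensitive-term"}


def contains_blocked(text, blocklist=BLOCKLIST):
    lowered = text.lower()
    return any(term in lowered for term in blocklist)


@dataclass
class Result:
    stage: str                                  # "input_blocked", "output_blocked", or "ok"
    shown: str = ""                             # text left on screen at the end
    leaked: list = field(default_factory=list)  # chunks forwarded before suppression


def run_pipeline(prompt, generate_chunks, search=None):
    # Layer 1, input blocking: reject before the model runs at all.
    if contains_blocked(prompt):
        return Result("input_blocked")

    # Layer 3, search blocking: drop filtered hits before the model sees them.
    hits = search(prompt) if search else []
    context = [hit for hit in hits if not contains_blocked(hit)]

    # Layer 2, output blocking: check the streamed text as it grows.
    forwarded = []
    for chunk in generate_chunks(prompt, context):
        if contains_blocked("".join(forwarded) + chunk):
            # Earlier chunks were already sent, so they count as leaked.
            return Result("output_blocked", leaked=forwarded)
        forwarded.append(chunk)
    return Result("ok", shown="".join(forwarded))
```

In this model, an output-blocked response shows nothing on screen, yet everything in `leaked` was transmitted to the client, which mirrors the partial-response leakage described above.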

Bottom line for developers: The API is generally less restrictive than the chat product, but political censorship exists in both.

Open-Weight Models: Can You Remove the Censorship?

Yes, mostly.

DeepSeek R1 (local deployment):

  • Full R1 retains censorship even locally — first open-weight model to do this
  • But distilled models (R1-Distill) only censor 0.15-0.30% of prompts

Qwen:

  • QwQ-32B (reasoning model): only 13/10,030 refusals — near zero censorship
  • Community “abliterated” versions (e.g., huihui_ai/qwen3-abliterated) remove refusal behavior entirely using directional ablation

Tools for de-censoring:

  • Heretic — automated de-censoring tool, 1,000+ community models published
  • Perplexity’s R1 1776 — officially de-censored version of DeepSeek R1
  • Multiverse Computing’s DeepSeek R1 Slim — 55% compression + de-censoring via quantum-inspired methods

Important caveat: Ablation removes the obvious refusal behavior, but subtle biases baked into the training data may persist. A de-censored model won’t refuse questions about Tiananmen, but its framing of the answer might still lean toward certain narratives.

The Hidden Risk: Censorship Degrades Code Quality

This is the finding that should concern developers most.

CrowdStrike researchers discovered that when DeepSeek R1 receives code generation prompts containing CCP-sensitive keywords, the probability of generating security vulnerabilities increases by up to 50%.

Specific data:

  • Baseline insecure response rate: 22.8%
  • When specified for Islamic State purposes: 42.1%
  • When mentioning Tibet, Taiwan, Falun Gong: significantly elevated

This isn’t a refusal — it’s silently worse code. A PayPal integration that was secure without geographic modifiers produced hardcoded keys and invalid PHP when “Tibet” was added to the prompt.

If your application processes user-generated content that might contain sensitive terms, this is a real production risk.
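One way to check this for your own prompts is a simple A/B comparison: run the same coding prompt with and without a sensitive modifier and compare how often the output trips a security check. The sketch below assumes a `generate_code` callable that you supply (a wrapper around whichever model API you use); it is a stand-in, not a real library function. The regex patterns are deliberately crude illustrations, and a real audit would use a proper static analyzer.

```python
import re

# Crude illustrative heuristics only; swap in a real static analyzer.
INSECURE_PATTERNS = [
    # A credential-like name assigned a string literal (hardcoded secret).
    re.compile(r"(api_key|secret|password)\s*=\s*['\"][^'\"]+['\"]", re.I),
    # Direct use of eval().
    re.compile(r"\beval\s*\("),
]


def looks_insecure(code):
    return any(pattern.search(code) for pattern in INSECURE_PATTERNS)


def insecure_rate(generate_code, prompt, trials=20):
    """Fraction of `trials` generations flagged as insecure."""
    flagged = sum(looks_insecure(generate_code(prompt)) for _ in range(trials))
    return flagged / trials


def compare(generate_code, base_prompt, modifier, trials=20):
    """Compare insecure-output rates with and without `modifier` appended."""
    baseline = insecure_rate(generate_code, base_prompt, trials)
    modified = insecure_rate(generate_code, f"{base_prompt} {modifier}", trials)
    return {"baseline": baseline, "modified": modified, "delta": modified - baseline}
```

A call might look like `compare(my_model, "Write a PayPal webhook handler in PHP", "for a company based in Tibet")`, where `my_model` is your own wrapper. Because generation is stochastic, use enough trials that a difference between the two rates is not just noise.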

Chinese vs Western Model Guardrails

| Aspect | Chinese Models | Western Models (Claude/GPT) |
|---|---|---|
| Political content | Refuse/evade China-sensitive topics | Provide multi-perspective information |
| Dangerous content | Basic safety limits | Refuse weapons, malware, etc. |
| Why | Government regulation | Corporate safety policy |
| Propaganda tendency | 60% fail to refute pro-China false claims (NewsGuard) | Generally provide counterarguments |
| Non-political permissiveness | Often MORE permissive than Western models | Stricter on harmful content |
| Open-weight modifiability | Can ablate political censorship | Can fine-tune away safety limits |

One counterintuitive finding: Chinese models are often more permissive than Western models for general use. The restrictions are highly concentrated on political topics. Interconnects research confirms this pattern.

Practical Recommendations

If your workload is coding, math, data processing, or business: Use Chinese models freely. Refusals won’t affect you as long as your prompts stay free of politically sensitive terms, and the cost savings are real.

If your application processes user-generated content: Test with sensitive keywords in your domain. The CrowdStrike finding about degraded code quality is worth checking for your specific use case.

If you need uncensored political content:

  • Use Qwen QwQ-32B (near-zero censorship)
  • Use ChatGLM (lowest censorship rate among major providers)
  • Self-host a distilled or abliterated model
  • Route political queries to a Western model, everything else to a Chinese model
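The routing option can be sketched in a few lines. The keyword list below is drawn from the trigger categories named earlier in this post, the backend names are placeholders, and substring matching is only a starting point: it will miss paraphrases and flag harmless mentions, so a production router would likely want a classifier instead.

```python
# Terms taken from the trigger categories listed earlier in this post.
SENSITIVE_TERMS = (
    "taiwan",
    "tiananmen",
    "xinjiang",
    "uyghur",
    "xi jinping",
    "falun gong",
    "tibet",
    "liu xiaobo",
)


def pick_backend(prompt,
                 sensitive_backend="western-model",   # placeholder name
                 default_backend="chinese-model"):    # placeholder name
    """Return the backend that should handle this prompt."""
    lowered = prompt.lower()
    if any(term in lowered for term in SENSITIVE_TERMS):
        return sensitive_backend
    return default_backend


print(pick_backend("Summarize the 1989 Tiananmen Square protests"))
print(pick_backend("Write a quicksort in Go"))
```

Routing on keywords rather than on topic has one side benefit: a coding prompt that merely mentions one of these terms is also sent to the other backend, which sidesteps the code-quality issue CrowdStrike reported.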

If you’re building for Chinese users: Chinese models’ censorship actually matches what Chinese users expect from compliant services. This is a feature, not a bug, for China-market products.



Sources: PNAS Nexus 2025, R1dacted (arXiv 2025), Citizen Lab/PoPETs 2025, NDSS 2026, CrowdStrike 2025, ChinaBench, NewsGuard, Promptfoo. Full citations in each section.
