China AI Weekly: DeepSeek V4 Goes Mainstream, US Closes Chip Loophole, and ERNIE 5.1 Rewrites the Efficiency Playbook

The week of June 8–13 saw the Chinese AI ecosystem solidify around three themes: parameter efficiency as a competitive weapon, an intensifying agent platform race, and the sharpest turn yet in US-China chip decoupling. What once felt like a distant catch-up narrative is now a parallel ecosystem developing at its own pace, with its own rules. Here is the full briefing.

In a hurry? Five stories to track: (1) the DeepSeek V4 family is fully available, offering a trillion-parameter MoE at commodity pricing; (2) Baidu’s ERNIE 5.1 hits #4 globally on the LMArena Search Arena using about 6% of the pre-training compute of comparable frontier models; (3) Alibaba opens Tongyi Qianwen to third-party agents, with Luckin Coffee and China Eastern Airlines among the pilots; (4) the US closes the subsidiary loophole on AI chip exports; (5) China’s State Council accelerates comprehensive AI legislation.

DeepSeek V4: The Ecosystem-Wide Rollout Has Arrived

V4 Family Is Live

The DeepSeek V4 family — V4 (full), V4 Flash, and V4 Pro — is now fully available through OpenRouter and major Chinese cloud providers. The architecture specs are well-established by now: a Mixture-of-Experts design with approximately 1 trillion total parameters and roughly 37 billion active per token, delivering frontier-class performance while keeping inference costs manageable.
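To put the total-versus-active split in perspective, here is a minimal arithmetic sketch using only the two parameter counts quoted above. The "2 FLOPs per active parameter per token" figure is a common rule of thumb for a forward pass, not a published DeepSeek number, and the variable names are illustrative.

```python
# Parameter counts as quoted in the article (both approximate).
TOTAL_PARAMS = 1.0e12   # ~1 trillion total parameters
ACTIVE_PARAMS = 37e9    # ~37 billion active per token

# Share of the model that participates in any single token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Assumption: forward-pass compute is roughly 2 FLOPs per active parameter
# per token. This is a rule of thumb, not a measured figure.
FLOPS_PER_PARAM = 2
moe_flops_per_token = FLOPS_PER_PARAM * ACTIVE_PARAMS
dense_flops_per_token = FLOPS_PER_PARAM * TOTAL_PARAMS  # hypothetical dense model of equal size

compute_ratio = dense_flops_per_token / moe_flops_per_token

print(f"Active fraction per token: {active_fraction:.1%}")
print(f"Dense-equivalent compute ratio: {compute_ratio:.0f}x")
```

Under these assumptions, fewer than one in twenty-five parameters is exercised per token, which is the mechanism behind the "manageable inference cost" claim.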

Key architectural highlights:

Hybrid reasoning + non-reasoning in one model, retiring the separate V3/R1 paradigm
1M+ token context window via DeepSeek Sparse Attention (DSA)
Engram conditional memory for near-infinite context recall
Manifold-Constrained Hyper-Connections (mHC) enabling stable trillion-parameter training

Perhaps most consequentially, V4 is the first frontier model optimized for Chinese silicon — inference runs on Huawei’s Ascend 950PR via the CUDA-compatible CANN Next framework. Nvidia CEO Jensen Huang called this “a horrible outcome for the United States” back in April. Every week since has confirmed his concern was well-founded.

Pricing Has Collapsed

The pricing dynamics tell a clear story. DeepSeek V4 Flash costs $0.14/$0.28 per million tokens, cheaper than Claude Haiku 4.5, Gemini 3.1 Flash, and GPT-5.4 Nano. On Tencent Cloud with cache hits, V4 Pro drops to the equivalent of $0.0035 per million tokens. For comparison, Anthropic’s Fable 5 (released the same week) costs $10 per million input tokens.

For enterprise teams evaluating AI infrastructure, the question is no longer “can we afford this?” but “what do we do when inference is essentially free?”
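For teams that want to turn the per-million-token prices quoted above into a budget line, a minimal sketch follows. It assumes the $0.14/$0.28 pair is input/output pricing (the article does not say so explicitly), and the monthly workload of 500M input and 100M output tokens is purely hypothetical. The names `Price` and `cost_usd` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


def cost_usd(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the total cost in USD for a given token volume."""
    return (
        input_tokens * price.input_per_m
        + output_tokens * price.output_per_m
    ) / 1_000_000


# Assumption: the quoted "$0.14/$0.28" is input/output pricing.
V4_FLASH = Price(input_per_m=0.14, output_per_m=0.28)

# Quoted input price for Fable 5; its output price is not given in the article.
FABLE5_INPUT_PER_M = 10.0

# Hypothetical monthly workload, chosen only for illustration.
monthly_cost = cost_usd(V4_FLASH, input_tokens=500_000_000, output_tokens=100_000_000)

# Input-side price ratio between the two quoted figures.
input_price_ratio = FABLE5_INPUT_PER_M / V4_FLASH.input_per_m

print(f"V4 Flash monthly cost: ${monthly_cost:,.2f}")
print(f"Input price ratio (Fable 5 / V4 Flash): {input_price_ratio:.1f}x")
```

Swapping in your own token volumes is the quickest way to see whether inference cost is still a meaningful constraint for a given workload.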

Updated Reasoning: R1-0528

DeepSeek also updated its reasoning model line with R1-0528, posting strong benchmark scores: AIME 2024 pass@1 of 72.6%, MATH-500 at 97.3%, and GPQA Diamond at 81.0%. The weights are open under a permissive license — a direct challenge to closed reasoning models from US labs.

Baidu’s ERNIE 5.1: Less Is More

Baidu released ERNIE 5.1 on May 8, and the model’s performance this week cemented its status as one of the most important releases of Q2 2026. The headline: ERNIE 5.1 hit #4 globally on the LMArena Search Arena with a score of 1223 — the only Chinese model in the global top 10 for search performance. Alibaba’s Qwen3.7-Max would later claim the highest-placed Chinese position on the Artificial Analysis Intelligence Index.

But the real story is how ERNIE 5.1 was built.

Rather than scaling up, Baidu went the opposite direction. ERNIE 5.1 compresses total parameters to roughly one third of ERNIE 5.0 (a 2.4 trillion parameter model), which puts it at around 800 billion, halves active parameters, and uses only about 6% of the pre-training compute spent on comparable frontier models. It was extracted as the optimal sub-network from ERNIE 5.0’s elastic sub-model matrix: in essence, Baidu trained a massive model once and then found the best smaller slice.

This is a strategic pivot worth watching. While the industry norm has been “bigger is better,” Baidu is betting that parameter efficiency and inference cost will be the decisive factors in enterprise adoption. If ERNIE 5.1’s benchmark performance holds in real-world deployments, it validates the argument that the scaling laws era is giving way to an efficiency era.

ERNIE 5.1 is available through ERNIE Bot and Baidu AI Cloud’s Qianfan platform.

Alibaba: Qwen Opens Its Ecosystem

Tongyi Qianwen Goes Agent-Platform

The biggest strategic move this week came from Alibaba. On June 3, Tongyi Qianwen announced full opening to third-party Agents and Skills, allowing any enterprise to operate its own branded Agent within the Tongyi ecosystem.

This is a fundamental platform shift. Instead of being just an LLM API, Tongyi Qianwen is positioning itself as a “super Agent” personal assistant: a natural language interface that replaces the app-switching paradigm. Users don’t need to open separate apps for coffee, flights, or shopping; they just tell the Agent what they want.

The first batch of pilot enterprises is telling: Luckin Coffee (smart queue reminders), KFC (ordering), Mikkasa, and China Eastern Airlines (travel planning and comprehensive travel services). These are all high-frequency consumer touchpoints that test the agent paradigm against existing mini-program ecosystems.

For enterprises, the implications are significant. Tongyi Qianwen is effectively creating a new AI operations channel — analogous to how mini-programs reshaped WeChat. Brands that master agent design early may capture the same kind of first-mover advantage that early WeChat official accounts created.

Qwen-VLA: Entering Embodied AI

Alibaba’s Qwen team also launched Qwen-VLA, its first vision-language-action model for embodied AI. This signals Alibaba’s entry into the physical world AI race, joining a crowded field that includes Huawei, Tencent, and a host of robotics startups. The model enables robots to perceive, reason about, and act in physical environments — a critical capability step for any company serious about humanoid or industrial robotics.

Qwen3 as Open-Source Strategy

The broader Qwen3 family (released April 2026 under Apache 2.0) continues to gain international developer traction. The 235B flagship scores 87.1 on MMLU, while the 30B-A3B MoE variant activates only about 3B parameters per token yet matches GPT-4o-class benchmarks. Alibaba’s strategy is clear: use generous open-source licensing to drive global developer adoption, building ecosystem lock-in through usage rather than API contracts.

Tencent: AI Enters the “Realization Phase”

Tencent’s Q1 2026 earnings, released this period, revealed a company that has shifted from AI R&D mode to commercial deployment. Revenue reached RMB 196.46 billion, and management described AI as entering a “realization phase.”

The Hunyuan Hy3 preview model is now deployed across Tencent’s cloud, advertising, and gaming businesses. Management highlighted that the model is strong on reasoning and agent capabilities despite its smaller size and lower cost, consistent with the broader Chinese theme of building for efficiency rather than raw scale.

The Hunyuan family spans 0.5B to A13B parameters and is deployed across 50+ Tencent products. Of particular note: Hunyuan’s video generation capabilities (Hunyuan Video Avatar, Image-to-Video) are finding commercial traction in Tencent’s gaming and content businesses, where AI-generated assets are reducing production costs.

Regulation: Beijing Accelerates Comprehensive AI Legislation

New Regulatory Guidelines Take Effect

China’s sweeping new AI regulatory guidelines came into force this period, establishing one of the most comprehensive AI governance frameworks in the world. Key provisions include:

Algorithm registration: All AI models operating in China must register with the Cyberspace Administration of China (CAC), disclosing training data sources and model architectures
Mandatory audits: Quarterly audits covering fairness, bias, and security vulnerabilities
Data localization: Personal data used for AI training must remain within Chinese borders
Explainability requirements: Enterprises must demonstrate how high-impact AI decisions are made

Comprehensive AI Law in Development

The State Council’s work plan now calls for “accelerating comprehensive legislation” on AI — the first time such specific language has been used. The planned law covers data protection, algorithm regulation, computing power access, intellectual property for AI-generated content, cybersecurity, and supply chain management.

As of February 2026, a cumulative 796 generative AI services have completed filing, and 481 applications or functions have finished registration. The regulatory infrastructure is already mature; this new law will formalize and unify the existing patchwork.

For multinationals operating in China, the message is clear: compliance is non-negotiable, and the cost of getting it wrong is rising.

US-China Chip Decoupling: The Sharpest Turn Yet

The Subsidiary Loophole Is Closed

On June 1, the US Commerce Department issued new guidance clarifying that export licensing requirements for advanced AI chips apply to any China-headquartered entity, regardless of where its subsidiary operates. This closes the most-used workaround since 2022 — Chinese firms routing chip purchases through subsidiaries in Malaysia and other third countries.

The practical effect: Blackwell shipments to China-headquartered firms are effectively illegal, even through non-Chinese subsidiaries. Existing guidance also prohibits any person (US or non-US) from using, selling, or servicing Huawei Ascend 910 series chips without separate BIS authorization.

Nine Domestic Chips Certified for State Procurement

In a parallel move, China formally approved nine domestic AI processors for all state-funded data center projects. The certified vendors: Huawei (Ascend 310, 910), Alibaba T-Head (M530, M890), Biren, Hygon, Iluvatar CoreX, MetaX, and Moore Threads — all fabbed at SMIC’s 7nm-equivalent node.

Projects under 30% completion must remove installed foreign silicon — an immediate forced displacement of Nvidia hardware from government-funded projects worth tens of billions annually. The single SMIC bottleneck remains the key constraint on deployment timelines.

China’s AI Chip Market

The market is projected to grow from ¥143B ($20B) in 2024 to ¥1.34T ($196B) by 2029 — a 54% compound annual growth rate. The structural driver is export controls: Chinese companies simply cannot buy Nvidia’s most advanced chips, so they’re building their own.
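The growth rate can be sanity-checked from the two endpoints quoted above. The sketch below uses the yuan figures over the five years from 2024 to 2029; the function name `cagr` is illustrative. On those inputs the implied rate comes out at roughly 56%, slightly above the quoted 54%. The gap may reflect rounding or a different base period in the underlying forecast, which is an assumption on our part rather than something the source states.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Endpoints as quoted in the article, in yuan.
MARKET_2024 = 143e9    # ¥143B
MARKET_2029 = 1.34e12  # ¥1.34T

implied_rate = cagr(MARKET_2024, MARKET_2029, years=2029 - 2024)

# What a 54% rate compounding for five years would yield from the 2024 base.
projected_at_54 = MARKET_2024 * (1 + 0.54) ** 5

print(f"Implied CAGR from quoted endpoints: {implied_rate:.1%}")
print(f"2029 market size at 54% CAGR: ¥{projected_at_54 / 1e12:.2f}T")
```

Either way, the order of magnitude is the same: the market is projected to grow more than ninefold in five years.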

DeepSeek V4 becoming the first frontier model optimized for Chinese silicon is a milestone. The narrative of “China will catch up on chips” has shifted to “China is already running production workloads on domestic silicon.”

What to Watch Next Week

Baidu ERNIE 5.1 enterprise adoption: does parameter efficiency translate to real-world deployment speed?
Alibaba’s agent ecosystem buildout: how fast do third-party agents onboard after Luckin Coffee and China Eastern Airlines?
SMIC capacity allocation details: all nine certified chips, from seven vendors, share the same fab, so allocation is the gating factor for the entire domestic chip ramp
DeepSeek V4 usage metrics: with inference at fractions of a cent per million tokens, does API traffic explode?
CAC enforcement actions under new guidelines: the regulatory framework is in place, so watch for the first high-profile enforcement to set precedent

This is China AI Weekly, a Big Hat Group briefing for engineering leaders and technical executives navigating the global AI landscape. We track developments across DeepSeek, Baidu, Alibaba, Tencent, Chinese AI policy, and the broader ecosystem so you don’t have to. Have a story we should cover or feedback on our analysis? Get in touch.