DeepSeek's concession

The Chinese lab that trained AI on the cheap just shipped its first model at full price — and it shows

IN THE TECHNICAL MATERIALS accompanying V4's preview release this week, DeepSeek made a concession no frontier lab includes in its own marketing: the model, the company's paper allowed, trails the state of the art by "approximately 3 to 6 months."

V4 Flash and V4 Pro are now live. Pro is a 1.6-trillion-parameter mixture-of-experts system (49 billion active) that beats GPT-5.2 and Gemini 3.0 Pro on some reasoning benchmarks and lags GPT-5.4 and Gemini 3.1 Pro on knowledge ones. Pricing is aggressive: $0.145 per million input tokens and $3.48 per million output tokens for Pro, below the rates for Opus 4.7 and GPT-5.5. Both V4 models ship text-only, in an era when every frontier competitor is natively multimodal.

The launch arrived alongside a separate DeepSeek first. The lab, 99% owned by the Chinese quantitative hedge fund High-Flyer Capital Management, is raising $300 million at a valuation north of $10 billion, its first outside capital ever. Tencent and Alibaba are reportedly in the mix. Founder Liang Wenfeng, a technology idealist who had spent two years rebuffing every Chinese VC on principle, has apparently changed his mind about something. Both events are readable as concessions. Both are admissions that what R1 cost to train and what V4 costs to train turn out not to be the same number.

Hard ledger

In January 2025, DeepSeek's R1 model moved global markets. A peer-reviewed Nature paper co-authored by Liang last September put R1's final training run at $294,000: 512 Nvidia H800 chips, 80 hours of wall-clock time, nothing else. That figure excluded R&D salaries, prior experiments, preparatory training on smaller models, and the capital cost of the compute itself. SemiAnalysis put DeepSeek's actual infrastructure spend closer to $1.6 billion in servers and $944 million in operating costs, against an inventory of roughly 50,000 Hopper-generation GPUs. The foundation beneath that stack, an original cluster of 10,000 A100s from the earlier Ampere generation, was bought by High-Flyer in 2021 for the hedge fund's quantitative trading algorithms, before American export controls on advanced AI chips existed. That stockpile was, in effect, a pre-paid asset transferred from quantitative finance to frontier AI at zero marginal cost. R1 rode on it. V4 cannot.
