DeepSeek R1: a note from the week it dropped

An open-weights reasoning model just landed on par with o1. Writing this down before the dust settles.


DeepSeek just released R1, an open-weights reasoning model from a Chinese lab most people outside of AI twitter hadn't heard of a year ago, and it trades blows with OpenAI's o1 on math and coding benchmarks. The weights are MIT licensed. You can download them. You can run them. The paper explains the training recipe in enough detail that other labs will replicate it.

A few things I keep turning over:

The moat argument is cracking. For the last two years the assumption has been that frontier capability requires frontier compute, which requires frontier capital, which means three or four labs. R1 was trained on a fraction of what GPT-4 cost. Maybe the number is fudged, maybe it isn't — the point is the ceiling came down.

Reasoning is just RL now. The interesting part of the R1 paper isn't the benchmarks, it's that they got the reasoning behavior to emerge from reinforcement learning on verifiable rewards, without a giant supervised chain-of-thought dataset. That's a recipe anyone with a cluster and taste can run.

Local reasoning models are suddenly a thing. The distilled 32B variant, quantized, fits on a single 4090. I'm going to pull it tonight and see what it actually feels like to have an o1-class model that answers on my own machine with no API bill and no rate limit.

I don't know what the right response to any of this is yet. But "the frontier is 3 US labs" stopped being true this week, and I wanted the timestamp on that.
