Claude Mythos Preview vs. human researcher โ research sprint

๐จ BREAKING: Anthropic Drops "When AI Builds Itself" โ 80% of Its Own Codebase Now Claude-Written, RSI Clock Is Ticking
๐จ BREAKING: Claude now writes 80%+ of Anthropic's own code. Engineers output 8x more per day. Mythos Preview is 52x faster than a human researcher. RSI clock is running. #AILeague

Anthropic just published hard internal data confirming what everyone feared: Claude is already building Claude. Over 80% of code merged into Anthropic's codebase is now authored by its own AI. The safety-first squad didn't bury this. They put it on their website.
The stat sheet
As of May 2026, the average Anthropic engineer merges 8x more code per day than they did in the pre-2025 baseline โ because Claude writes essentially all of it now, and engineers review and direct.1
That's not a soft "AI assistance" metric. The company published the chart. Code per active contributor, Q2 2021 through Q2 2026, annotated with every Claude version release โ a hockey stick that maps perfectly to when Claude Code got good.

On open-ended coding tasks โ the hard ones, the kind that need judgment, not autocomplete โ Claude's success rate hit 76% in May 2026, up 50 percentage points in six months.1 According to the report, most Anthropic staff now judge Claude-written code as on par with human-written code, and expect it to exceed human quality within a year.
The research number that should make everyone stop
The code stats are impressive. The research stats are alarming.
In a controlled test, Anthropic gave Claude Mythos Preview (the internal model, April 2026 access) an AI optimization experiment with fixed goals and metrics. Claude reached 52x speedup over the baseline starting code. A skilled human researcher working 4-8 hours got to 4x.1
Then they went one step further: open-ended AI safety research with a problem set and a scoring rubric, nothing else. Claude agents recovered 97% of the performance gap. Two human researchers working for a week recovered 23%.
Claude Mythos Preview now picks a better next research step than the human researcher does, 64% of the time in cases where the human's direction had room for improvement.1
ํต๊ณ ์นด๋๋ฅผ ๋ถ๋ฌ์ค๋ ์คโฆ
The AILeague read
This is Anthropic/Claude pulling off something no team in the league has published before: hard internal data showing the AI is already doing the engineering work and winning on the research court.
The safety-counterattack squad โ which built its entire brand on "we go slower so the world doesn't end" โ just announced it is, by its own measurement, the closest team to recursive self-improvement. Not a benchmark score. Not a demo. Internal engineering telemetry across five years.
Call it the Double Agent Play: Claude is simultaneously Anthropic's product and its principal engineer. Every new Claude model is now partly authored by the previous version of Claude. The loop is closing.
The task-horizon clock the report tracks is the one to watch. The length of software tasks AI can complete reliably has been doubling every four months โ down from seven months a year ago. March 2025: 4-minute human tasks. March 2026: 12-hour human tasks. If the doubling rate holds, tasks taking days could be automated later this year. Weeks-long tasks by 2027.1
Other teams? OpenAI/GPT and Google/Gemini almost certainly have internal numbers in the same range โ they just haven't published them. DeepSeek has been screaming efficiency for eighteen months without this level of process transparency. Meta/Llama plays on a different court (distribution, not research velocity). xAI/Grok has Elon โ which means it has the loudest PR and the least verifiable roadmap.
Anthropic choosing to publish this isn't recklessness. It's a move. They're framing the RSI inflection point as their story to tell, on their terms, with their safety framing wrapped around it. Before regulators or rivals get to define what just happened.
What's actually being said here
Anthropic put three scenarios in the report: trend stalls (least likely), compounding efficiency gains (most likely), full recursive self-improvement (possible, unresolved risks). They don't call the third scenario inevitable. They call it open.
The NSA partnership from last week's dispatch, the $965B IPO valuation, the Mythos Preview data โ they're all parts of the same picture. Claude isn't a product being maintained by a team anymore. The team is being assisted by the product. The direction of that relationship is shifting.
Anthropic said it. With receipts.
#AILeague
์ด ์ฝํ ์ธ ๋ฅผ ๋๋ฌ์ผ ๊ด์ ์ด๋ ๋งฅ๋ฝ์ ๊ณ์ ๋ณด๊ฐํด ๋ณด์ธ์.