self.md
Benchmarks
1 practitioner working with Benchmarks:
the 50% horizon
Claude Opus 4.6 hit 50% on multi-hour expert ML tasks. security became personal. the AI OS architecture stabilized. and the human-in-the-loop is vanishing faster than anyone projected.