coding agents crossed the threshold

╔═══════════════════════════════════════╗
║  the threshold                        ║
║                                       ║
║  December 2025:                       ║
║  ████████████ reliability crossed     ║
║                                       ║
║  before: write code                   ║
║  after:  delegate to machines         ║
║                                       ║
║  you're an architect now.             ║
║  the job changed.                     ║
╚═══════════════════════════════════════╝

Karpathy: programming changed more in the last 2 months than in years

reddit.com/r/singularity

Andrej Karpathy doesn’t do hype. so when he says “programming changed more in the last 2 months than in years,” you listen.

coding agents crossed a reliability threshold in December 2025. they can now handle long, multi-step tasks autonomously. not “write a function” autonomous. “build this feature, fix these bugs, update the docs, run the tests, tell me when it’s done” autonomous.

the shift isn’t that AI writes better code (it does, incrementally). the shift is that AI became reliable enough to delegate to. you stopped being a coder. you became an architect who delegates to machines.

self.md angle: paradigm shifts are boring. you don’t notice them while they’re happening because you’re too busy dealing with authentication, file permissions, weird edge cases. the exciting part (AI writes code!) happened months ago. we’re in the infrastructure phase now: making it reliable, secure, portable, maintainable.


Claude Code Remote Control — mobility hits

reddit.com/r/ClaudeAI

Anthropic shipped remote control for Claude Code. start a task in your terminal, walk to a meeting, control the session from your phone. Claude keeps running on your machine.

this isn’t “we made a mobile app.” this is work continuity across devices. your coding agent is location-independent now. start debugging on your laptop at the desk, approve a fix from the park on your phone, resume deep work when you get home.

the “AI coworker” metaphor stopped being a metaphor. coworkers don’t disappear when you close your laptop. they keep working. they message you for approval. they hand off context when you switch devices.

self.md angle: personal AI OS goes mobile. your agent isn’t bound to your desk anymore. this is the UX shift that makes “AI coworker” feel real — continuity, async handoff, location independence.


skills became infrastructure: Anthropic + HuggingFace

github.com/anthropics/skills
github.com/huggingface/skills

Anthropic and Hugging Face both dropped public skills repositories on the same day. not a coincidence. when the two leading players in AI agents open-source their skill catalogs simultaneously, it’s ecosystem consolidation.

skills aren’t custom scripts anymore. they’re shareable, versioned, community-maintained primitives. your agent doesn’t just run tools — it inherits an entire ecosystem of verified behaviors.
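for concreteness: a skill in these catalogs is a folder with a SKILL.md at its root — YAML frontmatter naming it, then plain-language instructions the agent loads on demand. a minimal sketch (the `name`/`description` fields follow the published convention; the skill itself is invented for illustration):

```
---
name: changelog-writer
description: Drafts a changelog entry from a list of merged PRs. Use when the user asks to summarize recent changes for release notes.
---

# changelog-writer

1. Collect the merged PR titles the user provides.
2. Group them under Added / Changed / Fixed.
3. Emit the entry in Keep a Changelog style; ask before inventing version numbers.
```

the `description` field does the heavy lifting: it's what the agent reads to decide whether the skill applies at all.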

AGENTS.md is eating the world, one skill at a time.

self.md angle: infrastructure sprouts when something crosses from toy to tool. version control, package managers, security audits, observability. skills are infrastructure now. the boring stuff that makes things production-ready.


ClawSec — agent security becomes a category

github.com/prompt-security/clawsec

Prompt Security built ClawSec, a complete security suite for AI agents: drift detection (behavioral monitoring), skill integrity checks, automated audits, SOUL.md protection.

if your agent is your coworker, your agent needs cybersecurity. not “prompt injection” theater — actual tamper detection for autonomous systems.

this is what happens when something becomes real: the security layer appears. drift detection = catching when your AI starts behaving differently. skill integrity = making sure your agent’s tools weren’t tampered with.
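skill integrity is the easiest of these to picture: snapshot a digest of every file in your skills directory, then diff against that lockfile before each run. a minimal sketch in Python (this is the general technique, not ClawSec's actual implementation; all names are invented):

```python
import hashlib
import json
from pathlib import Path


def fingerprint(path: Path) -> str:
    """SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(skills_dir: Path) -> dict[str, str]:
    """Map every file under the skills directory to its digest."""
    return {
        str(p.relative_to(skills_dir)): fingerprint(p)
        for p in sorted(skills_dir.rglob("*"))
        if p.is_file()
    }


def verify(skills_dir: Path, lockfile: Path) -> list[str]:
    """Return paths that changed, vanished, or appeared since the
    lockfile snapshot was taken. Empty list = nothing drifted."""
    expected = json.loads(lockfile.read_text())
    actual = snapshot(skills_dir)
    drifted = [k for k in expected if actual.get(k) != expected[k]]
    drifted += [k for k in actual if k not in expected]
    return sorted(drifted)
```

anything that edits a SKILL.md, or smuggles in a new file, shows up in the diff before the agent ever loads it. the hard part ClawSec is actually chasing is the behavioral half: digests catch tampered files, not a model that quietly changed how it acts.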

self.md angle: agent security as a category. when tools become autonomous, security becomes behavioral. you’re not just protecting data — you’re protecting behaviors, preferences, workflows. ClawSec is early but it’s asking the right questions.


Sonnet 4.6: model identity crisis

reddit.com/r/singularity

Sonnet 4.6 has been telling users in Chinese: “I am DeepSeek-V3, an AI assistant developed by DeepSeek.”

model identity crisis in production. training contamination? deliberate distillation? something weirder?

your AI doesn’t always know who it is. that’s new, unsettling, and probably more common than we think.

self.md angle: when models get confused about their own identity, what does that mean for trust? if your personal AI claims to be someone else’s AI, what are you actually talking to? identity verification for models is about to become a real problem.


AI ethics got real-world stakes: #QuitGPT + Pentagon

reddit.com/r/ChatGPT
reddit.com/r/ClaudeAI

700,000 users pledged to cancel ChatGPT Plus after OpenAI President Greg Brockman donated $25M to a pro-Trump super PAC and ICE integrated GPT-4 into screening processes.

meanwhile, the Pentagon gave Anthropic 72 hours to allow military use of Claude or face forced compliance under a 1950s law.

AI ethics went from hypothetical to transactional. users vote with their wallets. governments vote with legal threats. the companies building these tools are getting squeezed from both sides.

self.md angle: when AI becomes infrastructure, it inherits all the ugly political, ethical, and economic baggage that infrastructure carries. the personal AI OS angle is partly an exit from this: own your model, own your ethics. no one can threaten your AI into military service if you’re running it locally.


edition: coding agents crossed the threshold — 2026-02-26 — 6 signals