Yohei Nakajima's Build-in-Public AI Agent System
Yohei Nakajima didn’t set out to create one of the most influential AI projects of 2023. He was prototyping an “autonomous founder”—an AI that could run a startup—when he realized the task-planning loop he’d built could be useful for more general purposes. He shared the code on Twitter, named it BabyAGI, and watched it go viral: millions of impressions, 22,000+ GitHub stars, coverage in Fortune, Forbes, Fast Company, and a TED AI talk in San Francisco.
But BabyAGI isn’t the system. It’s an artifact of the system. Nakajima’s real methodology is “build-in-public”—a workflow where he treats building AI experiments as a way to learn, meet founders, and contribute to the ecosystem. He’s a venture capitalist by day (general partner at Untapped Capital), builder by night, and his personal AI operating system runs on this dual identity.
Background
- Born 1987 in Tokyo, now based in the US
- General Partner at Untapped Capital (early-stage VC, founded 2020)
- Previously: Techstars (Director of Pipeline, helped spin up Disney Accelerator), Scrum Ventures (led Nintendo Switch+Tech engagement)
- 15 years working with early-stage startups
- Ranked among the top 20 thought leaders in the 2023 Data Driven VC Landscape report
- 68k X/Twitter followers
Personal site | GitHub | Twitter: @yoheinakajima | Build Log
The System: Build-in-Public for AI
Nakajima’s core insight is simple: building publicly is a superior learning strategy. Instead of reading papers or taking courses, he builds prototypes, shares them on Twitter, and learns from the feedback loop. This compounds in three ways:
- Learn faster: Shipping forces you to understand something deeply enough to make it work
- Meet founders: Building AI tools attracts AI founders—exactly who a VC wants to meet
- Contribute to the ecosystem: Open-source experiments help others learn and build
The build cycle:
| Phase | What Happens |
|---|---|
| Spark | See an interesting AI capability or problem |
| Prototype | Build a minimal working version (often in a single file) |
| Share | Post on Twitter with explanation thread |
| Iterate | Respond to feedback, build variants (BabyBeeAGI, BabyCatAGI, BabyFoxAGI) |
| Document | Write a blog post or “paper” explaining the architecture |
| Archive | Move to archive when superseded, start fresh |
This is why BabyAGI has multiple versions: each iteration is a learning experiment, not a product. The original BabyAGI from March 2023 is now archived. The current BabyAGI is a self-building framework—an agent that can write its own functions.
The BabyAGI Architecture
BabyAGI introduced a pattern that became foundational for autonomous agents: the task-planning loop. The original system had three components:
- Execution Agent: Completes the current task using GPT-4
- Task Creation Agent: Generates new tasks based on results
- Prioritization Agent: Reorders the task queue based on objectives
The loop runs continuously: execute → create new tasks → prioritize → repeat. Vector storage (Pinecone) provides memory across iterations.
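The loop described above can be sketched in a few lines. This is a minimal illustration, not the original BabyAGI source: the function names (`run_task_loop`, `stub_execute`, and so on) are hypothetical, the three agents are plain stub functions rather than LLM calls, and an in-memory list stands in for the Pinecone vector store.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_task_loop(
    objective: str,
    first_task: str,
    execute: Callable[[str, str], str],
    create_tasks: Callable[[str, str, str], List[str]],
    prioritize: Callable[[str, List[str]], List[str]],
    max_iterations: int = 5,
) -> List[Tuple[str, str]]:
    """Run execute -> create -> prioritize until the queue empties or the cap is hit."""
    queue: Deque[str] = deque([first_task])
    history: List[Tuple[str, str]] = []  # stand-in for vector-store memory
    for _ in range(max_iterations):
        if not queue:
            break
        task = queue.popleft()
        result = execute(objective, task)                   # execution agent
        history.append((task, result))
        new_tasks = create_tasks(objective, task, result)   # task creation agent
        queue = deque(prioritize(objective, list(queue) + new_tasks))  # prioritization agent
    return history


# Stub agents (assumptions) so the sketch runs without an API key.
def stub_execute(objective: str, task: str) -> str:
    return f"done: {task}"


def stub_create(objective: str, task: str, result: str) -> List[str]:
    # Only the first task spawns follow-ups, so the loop terminates.
    return ["draft outline", "collect sources"] if task == "plan research" else []


def stub_prioritize(objective: str, tasks: List[str]) -> List[str]:
    return sorted(tasks)


history = run_task_loop(
    "write a report", "plan research", stub_execute, stub_create, stub_prioritize
)
```

The `max_iterations` cap is a design choice worth keeping in any real version: without it, a task creation agent that always returns new tasks would keep the loop running indefinitely.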
What made this significant wasn’t the complexity—the original was ~100 lines of Python. It was the demonstration that LLMs could plan, execute, and adapt without human intervention. Nakajima released it with a “paper” (actually GPT-4-generated based on the codebase) explaining the architecture, which helped researchers and builders understand the pattern.
Key variants:
| Version | Innovation |
|---|---|
| BabyBeeAGI | Task management + function expansion |
| BabyCatAGI | Faster execution through parallelization |
| BabyFoxAGI | Skill library + more sophisticated task chaining |
| BabyAGI 2.0 | Self-building: agent writes its own functions via “functionz” framework |
AI-Augmented VC Work
Nakajima doesn’t just build AI experiments; he also uses AI to run his VC firm. Untapped Capital is known for using AI and automation across its investment process:
GPT VC Associate: A custom GPT that founders can use to practice their startup pitch. It asks questions, expands on answers, and generates a downloadable investment memo. This is publicly available, which means:
- Founders get free pitch practice
- Nakajima learns what founders are building
- The interaction surfaces potential investments
AI-assisted startup research: Automated pipelines for sourcing, screening, and analyzing startups. He’s presented on “How to run your VC firm using AI” at events like the a16z GP-LP Mixer.
Build log: A public Softr-powered dashboard tracking every experiment he builds, creating a living portfolio of work.
The Self-Improving Agent Philosophy
Nakajima’s recent work (December 2025) focuses on self-improving agents—systems that get better through their own experience rather than human labels. In his synthesis of NeurIPS 2025 research, he identifies six mechanisms:
- Self-reflection: Agents critique their own outputs and try again (Reflexion, Self-Refine)
- Self-generated curricula: Agents create the tasks they learn from
- Self-adapting models: Agents fine-tune themselves based on feedback
- Self-improving code agents: Agents modify their own source code
- Embodied self-improvement: Agents learn by acting in environments
- Verification and safety: Keeping self-improvement from going off the rails
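The first mechanism, self-reflection, is the easiest to illustrate. The sketch below is one possible shape for a generate, critique, retry loop; it is not taken from Reflexion, Self-Refine, or Nakajima's code. The names `reflect_and_retry`, `stub_generate`, and `stub_critique` are hypothetical, and the stubs stand in for model calls.

```python
from typing import Callable, Optional, Tuple


def reflect_and_retry(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    critique: Callable[[str, str], Optional[str]],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Generate an answer, ask a critic for feedback, and retry until it passes."""
    feedback: Optional[str] = None
    answer = ""
    for attempt in range(1, max_attempts + 1):
        answer = generate(task, feedback)
        feedback = critique(task, answer)  # None means the critic has no complaints
        if feedback is None:
            return answer, attempt
    return answer, max_attempts  # best effort after exhausting attempts


# Stubs (assumptions) standing in for model calls.
def stub_generate(task: str, feedback: Optional[str]) -> str:
    # The first draft is wrong; the draft written with feedback is right.
    return "4" if feedback else "5"


def stub_critique(task: str, answer: str) -> Optional[str]:
    return None if answer == "4" else "Recheck the arithmetic."


answer, attempts = reflect_and_retry("What is 2 + 2?", stub_generate, stub_critique)
```

In a real agent the critic would typically be a second model call or an external check such as a test suite; the attempt cap plays the role of the "verification and safety" item in the list above.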
The current BabyAGI puts self-improvement into practice through “functionz”, a framework where functions are stored in a database with metadata about dependencies, imports, and relationships. The agent can register new functions, manage them via a dashboard, and build on its own capabilities over time. Registering a function looks like this:
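```python
import babyagi

# Register cylinder_volume along with its imports, dependencies, and a description.
@babyagi.register_function(
    imports=["math"],
    dependencies=["circle_area"],
    metadata={"description": "Calculates cylinder volume"}
)
def cylinder_volume(radius, height):
    import math
    # circle_area is not defined in this snippet; it is declared as a dependency above.
    area = circle_area(radius)
    return area * height
```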
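To make the registry idea concrete without depending on BabyAGI's internals, here is a minimal sketch of the same pattern. Everything in it is an assumption for illustration: a plain dictionary stands in for the database, and `register`, `call`, and `REGISTRY` are hypothetical names, not part of the `babyagi` package. Unlike the example above, dependencies are looked up explicitly through `call` rather than being made available by name.

```python
import math
from typing import Any, Callable, Dict, Optional, Sequence

REGISTRY: Dict[str, Dict[str, Any]] = {}  # stand-in for the function database


def register(
    dependencies: Sequence[str] = (),
    metadata: Optional[Dict[str, str]] = None,
) -> Callable[[Callable], Callable]:
    """Decorator that stores a function plus its metadata under its name."""
    def wrap(func: Callable) -> Callable:
        REGISTRY[func.__name__] = {
            "func": func,
            "dependencies": list(dependencies),
            "metadata": metadata or {},
        }
        return func
    return wrap


def call(name: str, *args: Any) -> Any:
    """Look up a function, check its dependencies are registered, then run it."""
    entry = REGISTRY[name]
    missing = [d for d in entry["dependencies"] if d not in REGISTRY]
    if missing:
        raise LookupError(f"{name} needs unregistered functions: {missing}")
    return entry["func"](*args)


@register(metadata={"description": "Area of a circle"})
def circle_area(radius: float) -> float:
    return math.pi * radius ** 2


@register(dependencies=["circle_area"], metadata={"description": "Calculates cylinder volume"})
def cylinder_volume(radius: float, height: float) -> float:
    return call("circle_area", radius) * height


volume = call("cylinder_volume", 1.0, 2.0)
```

Because capabilities live as data rather than as imports, an agent can inspect the registry, see what already exists, and add new entries that build on earlier ones.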
Knowledge Graphs + LLMs
Beyond task planning, Nakajima builds tools for knowledge representation:
- Instagraph (3.5k stars): Converts text or URLs into visual knowledge graphs
- MindGraph: An ever-expanding knowledge graph built through AI queries
- PrettyGraph: UI for text-to-knowledge-graph generation
These tools reflect a consistent philosophy: AI should help structure and retrieve information, not just generate text.
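A rough sketch of the text-to-graph step follows. It is not how Instagraph, MindGraph, or PrettyGraph are implemented; it assumes the model has been prompted to return a JSON list of objects with `source`, `relation`, and `target` keys (a hypothetical schema), and it uses a hard-coded string in place of a real model response.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def parse_triples(llm_output: str) -> List[Triple]:
    """Parse a JSON list of {"source", "relation", "target"} objects into triples."""
    return [
        (item["source"], item["relation"], item["target"])
        for item in json.loads(llm_output)
    ]


def build_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Group triples into an adjacency list: node -> [(relation, neighbor), ...]."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for source, relation, target in triples:
        graph[source].append((relation, target))
    return dict(graph)


# Hard-coded stand-in for a model response (assumption, not real output).
sample_output = json.dumps([
    {"source": "Yohei Nakajima", "relation": "created", "target": "BabyAGI"},
    {"source": "Yohei Nakajima", "relation": "general partner at", "target": "Untapped Capital"},
    {"source": "BabyAGI", "relation": "uses", "target": "task-planning loop"},
])

graph = build_graph(parse_triples(sample_output))
```

Once the text is in this form, retrieval becomes a lookup (for example, `graph["Yohei Nakajima"]`) instead of another round of text generation.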
Why This Matters
Nakajima represents a specific archetype: the practitioner-theorist who builds first and explains second. His influence on autonomous agents comes not from papers or credentials (he’s quick to note he’s never held a job as a developer), but from shipping working code that others can learn from.
The build-in-public methodology has broader implications:
For VCs: Building publicly creates deal flow. Founders working on AI seek out investors who understand the technology deeply enough to build it themselves.
For learning: Building beats reading. A working BabyAGI teaches more about autonomous agents than any number of blog posts about autonomous agents.
For the ecosystem: Open-source experiments compound. BabyAGI influenced dozens of other agent frameworks, got cited in research papers, and appeared in Sequoia and a16z’s AI Canon—not because it was the most sophisticated, but because it was available to learn from.
The warning signs are also instructive. Nakajima’s BabyAGI README includes disclaimers about paperclip maximizers and AI safety risks. He’s not naive about the dangers of self-improving systems—the code comes with notes about “worst case scenarios” and why you shouldn’t run autonomous agents without constraints.
What You Can Steal
| Technique | How to Apply |
|---|---|
| Task-planning loop | Implement: execute task → generate new tasks from result → prioritize → repeat |
| Build-in-public workflow | Share prototypes early, iterate publicly, document what you learn |
| Self-generated curricula | Let your agents create their own training tasks with verifiable outcomes |
| Function registry pattern | Store agent capabilities as database entries with metadata |
| Knowledge graph extraction | Use LLMs to convert unstructured text into structured graph representations |
| GPT as VC Associate | Build domain-specific GPTs that surface information while providing value |
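The "self-generated curricula" row is the least obvious to apply, so here is a toy sketch of the idea under simplifying assumptions: tasks are addition problems whose answers can be checked exactly, difficulty rises after a pass and falls after a failure, and a stub solver stands in for the agent. The names `propose_task`, `run_curriculum`, and `stub_solver` are hypothetical.

```python
import random
from typing import Callable, List, Tuple


def propose_task(level: int, rng: random.Random) -> Tuple[str, int]:
    """Generate an addition problem whose answer can be verified exactly."""
    a = rng.randint(0, 10 ** level)
    b = rng.randint(0, 10 ** level)
    return f"{a}+{b}", a + b


def run_curriculum(
    solve: Callable[[str], int], rounds: int = 6, seed: int = 0
) -> List[Tuple[int, bool]]:
    """Propose tasks, verify the solver's answers, and adjust difficulty."""
    rng = random.Random(seed)
    level = 1
    log: List[Tuple[int, bool]] = []
    for _ in range(rounds):
        prompt, expected = propose_task(level, rng)
        passed = solve(prompt) == expected  # verifiable outcome, no human label
        log.append((level, passed))
        level = level + 1 if passed else max(1, level - 1)
    return log


# Stub solver (assumption) standing in for an agent; it always answers correctly.
def stub_solver(prompt: str) -> int:
    a, b = map(int, prompt.split("+"))
    return a + b


log = run_curriculum(stub_solver)
```

The key property is that each generated task carries its own check, so the log of passes and failures can serve as a training signal without anyone labeling the results.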
Quick start with BabyAGI:
```shell
pip install babyagi
```

```python
import babyagi

if __name__ == "__main__":
    app = babyagi.create_app('/dashboard')
    app.run(host='0.0.0.0', port=8080)
```
Navigate to http://localhost:8080/dashboard to see the function registry and logs.
Resources:
- BabyAGI GitHub - Current self-building version
- BabyAGI Archive - Original March 2023 version
- Task-driven Autonomous Agent paper - Architecture explanation
- Build Log - All public experiments
- Blog - Detailed writeups
Related
- ACE Framework - Multi-layer autonomous agent architecture
- Parallel Sessions - Running multiple agents simultaneously
- Browser Agents - AI automation through browser control