Yohei Nakajima's Build-in-Public AI Agent System

Yohei Nakajima didn’t set out to create one of the most influential AI projects of 2023. He was prototyping an “autonomous founder”—an AI that could run a startup—when he realized the task-planning loop he’d built could be useful for more general purposes. He shared the code on Twitter, named it BabyAGI, and watched it go viral: millions of impressions, more than 22,000 GitHub stars, coverage in Fortune, Forbes, and Fast Company, and a TED AI talk in San Francisco.

But BabyAGI isn’t the system. It’s an artifact of the system. Nakajima’s real methodology is “build-in-public”—a workflow where he treats building AI experiments as a way to learn, meet founders, and contribute to the ecosystem. He’s a venture capitalist by day (general partner at Untapped Capital), builder by night, and his personal AI operating system runs on this dual identity.

Background

Personal site | GitHub | Twitter: @yoheinakajima | Build Log

The System: Build-in-Public for AI

Nakajima’s core insight is simple: building publicly is a superior learning strategy. Instead of reading papers or taking courses, he builds prototypes, shares them on Twitter, and learns from the feedback loop. This compounds in three ways:

  1. Learn faster: Shipping forces you to understand something deeply enough to make it work
  2. Meet founders: Building AI tools attracts AI founders—exactly who a VC wants to meet
  3. Contribute to the ecosystem: Open-source experiments help others learn and build

The build cycle:

| Phase | What Happens |
| --- | --- |
| Spark | See an interesting AI capability or problem |
| Prototype | Build a minimal working version (often in a single file) |
| Share | Post on Twitter with an explanation thread |
| Iterate | Respond to feedback, build variants (BabyBeeAGI, BabyCatAGI, BabyFoxAGI) |
| Document | Write a blog post or “paper” explaining the architecture |
| Archive | Move to the archive when superseded, start fresh |

This is why BabyAGI has multiple versions: each iteration is a learning experiment, not a product. The original BabyAGI from March 2023 is now archived. The current BabyAGI is a self-building framework—an agent that can write its own functions.

The BabyAGI Architecture

BabyAGI introduced a pattern that became foundational for autonomous agents: the task-planning loop. The original system had three components:

  1. Execution Agent: Completes the current task using GPT-4
  2. Task Creation Agent: Generates new tasks based on results
  3. Prioritization Agent: Reorders the task queue based on objectives

The loop runs continuously: execute → create new tasks → prioritize → repeat. Vector storage (Pinecone) provides memory across iterations.
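
To make the pattern concrete, here is a stripped-down sketch of that loop in Python. It assumes the OpenAI Python SDK, uses illustrative prompts and a placeholder model name rather than Nakajima’s originals, and omits the Pinecone memory layer entirely:

# Simplified BabyAGI-style task-planning loop (illustrative sketch, no vector memory)
from collections import deque
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
OBJECTIVE = "Write a short market overview of open-source AI agents"

def llm(prompt: str) -> str:
    # One chat-completion call; stands in for the GPT-4 calls in the original
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

tasks = deque(["Develop an initial task list for the objective"])

for _ in range(5):  # cap iterations; the original loops until stopped
    if not tasks:
        break
    task = tasks.popleft()

    # 1. Execution agent: complete the current task
    result = llm(f"Objective: {OBJECTIVE}\nComplete this task: {task}")
    print(f"TASK: {task}\nRESULT: {result[:200]}\n")

    # 2. Task creation agent: propose follow-up tasks based on the result
    new_tasks = llm(
        f"Objective: {OBJECTIVE}\nLast result: {result}\n"
        f"Existing tasks: {list(tasks)}\n"
        "Return new tasks, one per line, that move the objective forward."
    ).splitlines()
    tasks.extend(t.strip("- ").strip() for t in new_tasks if t.strip())

    # 3. Prioritization agent: reorder the queue against the objective
    reordered = llm(
        f"Objective: {OBJECTIVE}\nReorder these tasks by priority, one per line:\n"
        + "\n".join(tasks)
    ).splitlines()
    tasks = deque(t.strip("- ").strip() for t in reordered if t.strip())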

What made this significant wasn’t the complexity—the original was ~100 lines of Python. It was the demonstration that LLMs could plan, execute, and adapt without human intervention. Nakajima released it with a “paper” (actually GPT-4-generated based on the codebase) explaining the architecture, which helped researchers and builders understand the pattern.

Key variants:

| Version | Innovation |
| --- | --- |
| BabyBeeAGI | Task management + function expansion |
| BabyCatAGI | Faster execution through parallelization |
| BabyFoxAGI | Skill library + more sophisticated task chaining |
| BabyAGI 2.0 | Self-building: the agent writes its own functions via the “functionz” framework |

AI-Augmented VC Work

Nakajima doesn’t just build AI experiments—he uses AI to run his VC firm. Untapped Capital is known for leveraging AI and automation across its investment process:

GPT VC Associate: A custom GPT that founders can use to practice their startup pitch. It asks questions, expands on answers, and generates a downloadable investment memo. It’s publicly available to any founder.

AI-assisted startup research: Automated pipelines for sourcing, screening, and analyzing startups. He’s presented on “How to run your VC firm using AI” at events like the a16z GP-LP Mixer.

Build log: A public Softr-powered dashboard tracking every experiment he builds, creating a living portfolio of work.

The Self-Improving Agent Philosophy

Nakajima’s recent work (December 2025) focuses on self-improving agents—systems that get better through their own experience rather than human labels. In his synthesis of NeurIPS 2025 research, he identifies six mechanisms:

  1. Self-reflection: Agents critique their own outputs and try again (Reflexion, Self-Refine); a minimal sketch follows this list
  2. Self-generated curricula: Agents create the tasks they learn from
  3. Self-adapting models: Agents fine-tune themselves based on feedback
  4. Self-improving code agents: Agents modify their own source code
  5. Embodied self-improvement: Agents learn by acting in environments
  6. Verification and safety: Keeping self-improvement from going off the rails
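
To ground the first mechanism, here is a minimal generate-critique-revise loop in the spirit of Reflexion and Self-Refine. The prompts, the helper function, and the three-round cap are illustrative choices, not taken from either paper:

# Minimal self-reflection loop: generate, critique, revise (illustrative sketch)
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def llm(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def self_refine(task: str, rounds: int = 3) -> str:
    draft = llm(f"Task: {task}\nProduce your best attempt.")
    for _ in range(rounds):
        critique = llm(
            f"Task: {task}\nAttempt:\n{draft}\n"
            "Critique this attempt. If it fully satisfies the task, reply with only OK."
        )
        if critique.strip() == "OK":
            break  # the agent judges its own output good enough
        draft = llm(
            f"Task: {task}\nAttempt:\n{draft}\nCritique:\n{critique}\n"
            "Rewrite the attempt so it addresses the critique."
        )
    return draft

print(self_refine("Summarize the BabyAGI task-planning loop in three sentences."))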

The current BabyAGI implements self-improvement through “functionz”—a framework where functions are stored in a database with metadata about dependencies, imports, and relationships. The agent can register new functions, manage them via a dashboard, and build on its own capabilities over time.

import babyagi

# Register a new capability; functionz stores the function together with its
# metadata (imports, dependencies, description) in a function database.
@babyagi.register_function(
    imports=["math"],
    dependencies=["circle_area"],
    metadata={"description": "Calculates cylinder volume"}
)
def cylinder_volume(radius, height):
    import math
    # circle_area is assumed to be registered separately; declaring it as a
    # dependency lets the framework track and resolve the relationship.
    area = circle_area(radius)
    return area * height

Knowledge Graphs + LLMs

Beyond task planning, Nakajima builds tools for knowledge representation: projects that use LLMs to turn unstructured text into structured graphs of entities and relationships.

These tools reflect a consistent philosophy: AI should help structure and retrieve information, not just generate text.
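
As a rough illustration of that idea (not code from any specific Nakajima project), the sketch below asks a model to return subject-relation-object triples as JSON, assuming the OpenAI SDK and its JSON-mode response format:

# Extract (subject, relation, object) triples from free text (illustrative sketch)
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def extract_triples(text: str) -> list:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        response_format={"type": "json_object"},  # constrain the reply to valid JSON
        messages=[{
            "role": "user",
            "content": (
                "Extract entity relationships from the text below as JSON with a "
                '"triples" key holding a list of {"subject", "relation", "object"} objects.\n\n'
                + text
            ),
        }],
    )
    return json.loads(resp.choices[0].message.content).get("triples", [])

triples = extract_triples(
    "Yohei Nakajima, a general partner at Untapped Capital, released BabyAGI in March 2023."
)
for t in triples:
    print(t["subject"], "->", t["relation"], "->", t["object"])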

Why This Matters

Nakajima represents a specific archetype: the practitioner-theorist who builds first and explains second. His influence on autonomous agents comes not from papers or credentials (he’s quick to note he’s never held a job as a developer), but from shipping working code that others can learn from.

The build-in-public methodology has broader implications:

For VCs: Building publicly creates deal flow. Founders working on AI seek out investors who understand the technology deeply enough to build it themselves.

For learning: Building beats reading. A working BabyAGI teaches more about autonomous agents than any number of blog posts about autonomous agents.

For the ecosystem: Open-source experiments compound. BabyAGI influenced dozens of other agent frameworks, got cited in research papers, and appeared in Sequoia and a16z’s AI Canon—not because it was the most sophisticated, but because it was available to learn from.

The warning signs are also instructive. Nakajima’s BabyAGI README includes disclaimers about paperclip maximizers and AI safety risks. He’s not naive about the dangers of self-improving systems—the code comes with notes about “worst case scenarios” and why you shouldn’t run autonomous agents without constraints.

What You Can Steal

| Technique | How to Apply |
| --- | --- |
| Task-planning loop | Implement: execute task → generate new tasks from the result → prioritize → repeat |
| Build-in-public workflow | Share prototypes early, iterate publicly, document what you learn |
| Self-generated curricula | Let your agents create their own training tasks with verifiable outcomes |
| Function registry pattern | Store agent capabilities as database entries with metadata |
| Knowledge graph extraction | Use LLMs to convert unstructured text into structured graph representations |
| GPT as VC Associate | Build domain-specific GPTs that surface information while providing value |

Quick start with BabyAGI:

# Install first: pip install babyagi
import babyagi

if __name__ == "__main__":
    # Serve the functionz dashboard (function registry and logs) on port 8080
    app = babyagi.create_app('/dashboard')
    app.run(host='0.0.0.0', port=8080)

Navigate to http://localhost:8080/dashboard to see the function registry and logs.

Next: Steve Krshakov’s Local AI System

Topics: autonomous-agents babyagi build-in-public vc open-source