Steve Korshakov's Build-Your-Own-AI-Stack Approach

Steve Korshakov doesn’t just use AI tools. He builds the ones he needs, runs them on his own hardware, and open-sources them. His VS Code extension Llama Coder replaces GitHub Copilot with local LLMs. His Supervoice projects train text-to-speech models from scratch. And he wears Bee, an AI pendant that captures his conversations and extracts todos automatically.

Background

Find him on GitHub (@ex3ndr), X (@Ex3NDR), and his blog.

The System: Build What You Need

Steve’s approach: if an AI tool doesn’t exist the way you want it, build it yourself and run it locally.

Core stack:

| Tool | What It Does For Him |
| --- | --- |
| Llama Coder | Code completion without sending code to the cloud |
| Ollama + CodeLlama | Local LLM runtime on his hardware |
| Bee wearable | Passive conversation capture → automatic todos |
| Supervoice models | TTS trained “in his basement” for understanding, not just using |

Daily workflow:

He wears Bee throughout the day. When he says something like “need to remember to do X” in conversation, the device extracts it as a todo. No manual input required. The AI builds context over time from ambient audio.
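
Bee’s extraction pipeline isn’t public, so the following is only a minimal sketch of the pattern: scan a transcript for intent phrases and turn the rest of the sentence into a todo. The trigger phrases and the extract_todos function are illustrative assumptions; a production system would lean on an LLM rather than regexes.

import re

# Hypothetical trigger phrases; a real system would use an LLM, not regexes.
TRIGGER = re.compile(
    r"\b(?:need to remember to|remind me to|don't forget to)\s+(.+?)(?:[.?!]|$)",
    re.IGNORECASE,
)

def extract_todos(transcript: str) -> list[str]:
    """Pull todo-like statements out of an ambient transcript."""
    return [m.group(1).strip() for m in TRIGGER.finditer(transcript)]

print(extract_todos("We shipped the build. Oh, need to remember to rotate the API keys tomorrow."))
# -> ['rotate the API keys tomorrow']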

For coding, Llama Coder runs entirely on his machine. No telemetry, no cloud sync, no subscription.

# His setup for local code completion
brew install ollama
ollama pull codellama:7b-code
# Install Llama Coder extension in VS Code
# Done. Copilot replacement without the cloud.
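
Once the model is pulled, it is worth sanity-checking the local endpoint before wiring up the editor. Here is a minimal sketch against Ollama’s /api/generate API on its default port; the <PRE>/<SUF>/<MID> prompt is CodeLlama’s fill-in-the-middle format, not necessarily the exact request Llama Coder sends.

import json
import urllib.request

# Ask the local CodeLlama model to fill in the middle of a function.
payload = {
    "model": "codellama:7b-code",
    "prompt": "<PRE> def fib(n):\n    <SUF>\n    return a <MID>",
    "stream": False,
    "options": {"num_predict": 64, "temperature": 0.1},
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])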

Chaotic Good Engineering

Steve calls his philosophy “Chaotic Good Engineering”: ship fast, but think deliberately. From his blog:

“Good developers ship. To ship, they think. That’s the only thing that matters in the end.”

He positions himself “10% from the ship fast side”: accept small slowdowns to avoid technical debt, and write bug-free code through habit rather than process.

On AI specifically, he’s both a skeptic and a maximalist: skeptical of “vibecoded” AI-generated submissions flooding open source, and maximalist about building always-on wearables and working toward AGI at Bee.

From his post on agency:

“When you choose experiences over agency, you ironically choose a path with fewer experiences.”

His rule: create tools, don’t just consume them.

Reproducing SOTA in Your Basement

Most practitioners use AI APIs. Steve reproduces the models himself.

His Supervoice projects are open-source reproductions of state-of-the-art text-to-speech architectures.

Why rebuild what exists? From his GitHub bio:

“Reproducing and training text-to-speech SOTA neural networks in my basement.”

Understanding how models work requires building them. This differs from Georgi Gerganov’s approach of optimizing inference — Steve focuses on training and reproducing architectures.
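
None of the Supervoice code appears here; as a generic illustration of the “build to understand” loop, this is the kind of skeleton you end up writing when reproducing a model yourself: a toy network regressing mel-spectrogram frames from phoneme tokens on dummy data. All shapes, layer sizes, and the loss are illustrative assumptions.

import torch
from torch import nn

# Toy "reproduce it yourself" skeleton, not Supervoice code: phoneme tokens in,
# mel-spectrogram frames out, trained on random dummy batches.
N_PHONEMES, N_MELS, HIDDEN = 128, 80, 256

model = nn.Sequential(
    nn.Embedding(N_PHONEMES, HIDDEN),
    nn.TransformerEncoder(
        nn.TransformerEncoderLayer(HIDDEN, nhead=4, batch_first=True), num_layers=2
    ),
    nn.Linear(HIDDEN, N_MELS),
)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(100):
    phonemes = torch.randint(0, N_PHONEMES, (8, 64))  # 8 dummy utterances, 64 tokens each
    target_mels = torch.randn(8, 64, N_MELS)          # dummy aligned mel frames
    loss = nn.functional.l1_loss(model(phonemes), target_mels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 20 == 0:
        print(f"step {step}: L1 loss {loss.item():.3f}")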

What You Can Steal

| Technique | How to Apply |
| --- | --- |
| Local LLM for code | Install Ollama + Llama Coder, run CodeLlama locally. Works on M1+ Macs or RTX GPUs with 16GB+ RAM. |
| Ambient capture | Wear a device like Bee. Use natural speech (“need to remember…”) to capture todos without typing. |
| Build to understand | Pick one AI technique you use. Implement it from scratch. You’ll understand it 10x better than using an API. |
| 90/10 speed split | Ship fast, but invest 10% extra time in thoughtful design. Prevents months of technical debt. |
| Agency over consumption | When you need a tool, build it. When it works, open-source it. Stop waiting for perfect solutions. |

Next: Georgi Gerganov’s llama.cpp

Topics: personal-ai local-llm workflow wearable-ai voice-ai