Steve Korshakov's Build-Your-Own-AI-Stack Approach

Steve Korshakov doesn’t reach for off-the-shelf AI tools. He builds the ones he needs, runs them on his own hardware, and open-sources them. His VS Code extension Llama Coder replaces GitHub Copilot with local LLMs. His Supervoice projects train text-to-speech models from scratch. And he wears Bee, an AI pendant that captures his conversations and automatically extracts todos.
Background
- Early Telegram engineer (assembler-level image preprocessing, custom encryption)
- Co-founded Openland (YC W18), TON Whales ($20M+ revenue)
- Created Tact programming language for TON blockchain
- Now founding engineer at Bee, building wearable AI infrastructure
Find him on GitHub (@ex3ndr), X (@Ex3NDR), and his blog.
The System: Build What You Need
Steve’s approach: if an AI tool doesn’t exist the way you want it, build it yourself and run it locally.
Core stack:
| Tool | What It Does For Him |
|---|---|
| Llama Coder | Code completion without sending code to the cloud |
| Ollama + CodeLlama | Local LLM runtime on his hardware |
| Bee wearable | Passive conversation capture → automatic todos |
| Supervoice models | TTS trained “in his basement” for understanding, not just using |
Daily workflow:
He wears Bee throughout the day. When he says something like “need to remember to do X” in conversation, the device extracts it as a todo. No manual input required. The AI builds context over time from ambient audio.
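Bee's real extraction runs through an LLM, but the trigger-phrase idea itself is easy to sketch. The toy script below (file names and the regex are hypothetical, not Bee's actual pipeline) scans a transcript for "need to remember to…" clauses and turns them into a todo list:

```shell
# Toy sketch of the trigger-phrase idea: speech in, todos out.
# (Assumption: Bee uses an LLM, not regex; this only illustrates the concept.)

# Sample captured utterances, one per line
cat > transcript.txt <<'EOF'
We should ship the beta on Friday.
Need to remember to email the designer about icons.
Also, need to remember to renew the domain.
EOF

# Pull out every "need to remember to ..." clause and format it as a todo
grep -io 'need to remember to [^.]*' transcript.txt \
  | sed -E 's/^[Nn]eed to remember to /- [ ] /' > todos.md

cat todos.md
# - [ ] email the designer about icons
# - [ ] renew the domain
```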
For coding, Llama Coder runs entirely on his machine. No telemetry, no cloud sync, no subscription.
```shell
# His setup for local code completion
brew install ollama
ollama pull codellama:7b-code
# Install Llama Coder extension in VS Code
# Done. Copilot replacement without the cloud.
```
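To sanity-check that the stack is actually serving completions, you can hit Ollama's HTTP API directly, the same local endpoint editor extensions talk to. The prompt below is an arbitrary example; 11434 is Ollama's default port.

```shell
# Ask the local model for a completion over Ollama's HTTP API.
# No cloud involved: the request never leaves localhost.
cat > request.json <<'EOF'
{
  "model": "codellama:7b-code",
  "prompt": "def fibonacci(n):",
  "stream": false
}
EOF

curl -s http://localhost:11434/api/generate -d @request.json \
  || echo "Ollama is not running -- start it with: ollama serve"
```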
Chaotic Good Engineering
Steve calls his philosophy “Chaotic Good Engineering”: ship fast, but think deliberately. From his blog:
“Good developers ship. To ship, they think. That’s the only thing that matters in the end.”
He positions himself “10% from the ship fast side”: accept small slowdowns to avoid technical debt, and write bug-free code through habit, not process.
On AI specifically, he’s both skeptic and maximalist. Skeptical of “vibecoded” AI-generated submissions flooding open source. Maximalist about building always-on wearables and working toward AGI at Bee.
From his post on agency:
“When you choose experiences over agency, you ironically choose a path with fewer experiences.”
His rule: create tools, don’t just consume them.
Reproducing SOTA in Your Basement
Most practitioners use AI APIs. Steve reproduces the models himself.
His Supervoice projects are open-source implementations of:
- VALL-E 2 (neural codec language model for TTS)
- VoiceBox (Meta’s speech generation model)
- Diffusion-based audio enhancement
Why rebuild what exists? From his GitHub bio:
“Reproducing and training text-to-speech SOTA neural networks in my basement.”
Understanding how models work requires building them. This differs from Georgi Gerganov’s approach of optimizing inference — Steve focuses on training and reproducing architectures.
What You Can Steal
| Technique | How to Apply |
|---|---|
| Local LLM for code | Install Ollama + Llama Coder and run CodeLlama locally. Works on Apple Silicon Macs (M1 or later) or NVIDIA RTX GPUs; 16GB+ of RAM/VRAM recommended. |
| Ambient capture | Wear a device like Bee. Use natural speech (“need to remember…”) to capture todos without typing. |
| Build to understand | Pick one AI technique you use. Implement it from scratch. You’ll understand it 10x better than using an API. |
| 90/10 speed split | Ship fast, but invest 10% extra time in thoughtful design. Prevents months of technical debt. |
| Agency over consumption | When you need a tool, build it. When it works, open-source it. Stop waiting for perfect solutions. |