data portability for AI


by Ray Svitla


every AI tool you use is accumulating context about you. your preferences, your projects, your writing style, your code patterns, your thinking habits. this context is what makes the tool useful over time. it’s also what makes switching tools painful.

the question nobody asks during the honeymoon phase of a new AI tool: can I leave? and if I leave, does my data come with me?


the lock-in trap

here’s how it works. you start using an AI coding assistant. it learns your codebase, your conventions, your architecture decisions. after six months, it’s genuinely faster because it has context you never need to re-explain.

then the pricing changes. or a better tool launches. or the company gets acquired and the product direction shifts. you want to switch.

but your context — the thing that makes the tool valuable — lives inside the platform. your conversation history, your custom instructions, your memory data, your fine-tuned behaviors. none of it exports cleanly. switching means starting from zero with a new tool while the old one holds your data hostage.

this isn’t hypothetical. it’s happening right now across every AI platform.


what portable looks like

your instructions are files. CLAUDE.md, .cursorrules, custom instructions — these should live in your project repos, not in a platform’s cloud. if your configuration is a local file, it works with any tool that reads that file format. if it’s a settings page on a web dashboard, you’re locked in.
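to make this concrete, here's a minimal sketch of what "config as files" buys you: a few lines of code can discover instruction files at a repo root, and any tool could do the same. the filenames are just the common conventions; nothing here is tied to one vendor.

```python
from pathlib import Path

# instruction files that travel with the repo -- any tool that
# understands the format can pick them up (common conventions)
PORTABLE_CONFIGS = ["CLAUDE.md", ".cursorrules", "AGENTS.md"]

def find_portable_configs(repo_root: str) -> list[str]:
    """Return the instruction files present at the repo root."""
    root = Path(repo_root)
    return [name for name in PORTABLE_CONFIGS if (root / name).is_file()]
```

a settings page on a dashboard can't be enumerated like this. a directory can.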

your memory is exportable. conversation history, learned preferences, accumulated context — you should be able to export this as structured data (JSON, markdown, whatever). not “download a zip of your data” that gives you an unreadable blob, but actually structured, reusable data.
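"structured and reusable" means you can round-trip it. a sketch, assuming a hypothetical export shape of `{"role", "content", "date"}` per turn: the same data renders as machine-readable JSON and human-readable markdown.

```python
import json
from datetime import date

def export_memory(turns: list[dict]) -> tuple[str, str]:
    """Render conversation turns as structured JSON and readable markdown.

    `turns` uses a hypothetical export shape: {"role", "content", "date"}.
    """
    as_json = json.dumps(turns, indent=2)
    lines = [f"# conversation export ({date.today().isoformat()})", ""]
    for t in turns:
        lines.append(f"**{t['role']}** ({t['date']}): {t['content']}")
    return as_json, "\n".join(lines)
```

if a platform's export can't be transformed like this, it isn't an export. it's a receipt.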

your integrations are standard. MCP is good here — it’s a protocol, not a platform. an MCP server you build for Claude Code works with any MCP-compatible client. a proprietary plugin for a specific tool locks you to that tool.
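the protocol point is worth seeing. MCP messages are JSON-RPC 2.0, so a tool invocation is just a well-known JSON shape — a sketch of building one (the tool name and arguments here are illustrative):

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP-style JSON-RPC 2.0 `tools/call` request.

    Any MCP-compatible client or server speaks this same shape --
    the protocol, not the platform, defines the contract.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })
```

a proprietary plugin API has no equivalent you can write down from memory. that's the difference.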

your workflows are reproducible. the way you work with the AI should be describable in a platform-agnostic way. “I give it a spec, it generates tests, then code, then I review” works anywhere. “I use the three-dot menu, then click Custom Workflow, then select Template #4” works in one product.


the self.md philosophy

this is why self.md exists as a concept. your AI context — who you are, how you work, what you know — should live in files you control. markdown files in a git repo. plain text that any tool can read.

~/
├── .claude/
│   └── CLAUDE.md          ← user-level context
├── self.md/
│   ├── identity.md        ← who you are
│   ├── preferences.md     ← how you work
│   ├── knowledge/         ← what you know
│   └── memory/            ← what happened

this structure is tool-agnostic. Claude Code reads it. Cursor could read it. a hypothetical future tool you haven’t heard of yet could read it. because it’s just files.
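"just files" means the loading logic is trivial. a sketch, assuming the hypothetical self.md layout above: any tool can concatenate those files into prompt-ready context with no SDK and no API key.

```python
from pathlib import Path

# hypothetical layout mirroring the self.md tree above
SELF_MD_FILES = ["identity.md", "preferences.md"]

def load_context(self_md_dir: str) -> str:
    """Concatenate self.md files into one prompt-ready context string.

    Because it's just files, any tool -- current or future -- can do this.
    """
    root = Path(self_md_dir)
    parts = []
    for name in SELF_MD_FILES:
        path = root / name
        if path.is_file():
            parts.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(parts)
```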

the opposite: your identity lives in OpenAI’s custom instructions, your memory lives in Claude’s project knowledge, your preferences live in Cursor’s settings, and your knowledge lives in Notion’s API. four platforms holding pieces of your context, none of them talking to each other, all of them charging you monthly.


the practical steps

audit your context. right now, where does your AI context live? list every platform that knows something about you that makes it useful. can you export that knowledge?

centralize to files. anything stored in platform settings that could be a file, make it a file. custom instructions → CLAUDE.md. project knowledge → markdown docs in your repo. personal preferences → a dotfile.
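the migration itself is mechanical. a sketch, assuming you've already copied your settings out of each dashboard: map each piece of platform state to a target filename and write it into your repo.

```python
from pathlib import Path

def settings_to_files(settings: dict[str, str], repo_root: str) -> list[str]:
    """Write platform-style settings out as plain files you control.

    `settings` maps a target filename to its content -- e.g. custom
    instructions -> CLAUDE.md. The mapping here is illustrative.
    """
    root = Path(repo_root)
    written = []
    for filename, content in settings.items():
        (root / filename).write_text(content)
        written.append(filename)
    return written
```

the hard part isn't the code. it's deciding that the file, not the dashboard, is the source of truth from now on.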

use standard protocols. MCP over proprietary plugins. file-based config over web dashboards. git repos over cloud-only storage. every standard you adopt is a lock-in you avoid.

export regularly. for platforms where you can’t avoid cloud storage, export your data periodically. conversation histories, memory data, generated artifacts. don’t wait until you want to leave.
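"periodically" means automated, not remembered. a sketch of the snapshot half, assuming you've already pulled the data down from the platform: write a dated JSON file so a copy exists before you ever need to leave. run it from cron or CI.

```python
import json
from datetime import date
from pathlib import Path

def snapshot_export(data: dict, export_dir: str) -> str:
    """Write a dated JSON snapshot of platform data you pulled down.

    The directory layout and filename scheme are assumptions;
    the point is a timestamped copy outside the platform.
    """
    out = Path(export_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"export-{date.today().isoformat()}.json"
    path.write_text(json.dumps(data, indent=2))
    return str(path)
```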

test portability. try using a different AI tool for a week with only your local files as context. if the experience is terrible, you have a portability problem. if it’s roughly equivalent, your context is actually portable.


the hard truth about AI memory

most AI “memory” systems are proprietary black boxes. Claude’s memory, ChatGPT’s memory, Copilot’s context — they store information about you in formats you can’t access, in locations you don’t control, subject to policies you didn’t write.

Mem0 and similar projects are working on open memory layers — memory that you own, that exports cleanly, that works across platforms. this is the direction things need to move. memory as infrastructure you control, not as a feature a platform owns.

until that’s standard, the best defense is keeping your most important context in files you control. let the platform memory handle ephemeral session context. keep durable knowledge — your identity, your preferences, your project context — in your own git repos.


platform risk isn’t theoretical

Google killed Reader. Twitter became X. Heroku killed its free tier. every platform you depend on will eventually change in ways you don’t like. the question isn’t whether — it’s whether you’ll be ready when it happens.

for AI tools specifically, the risk is compounded. you’re not just losing a product — you’re losing accumulated context that took months to build. the switching cost isn’t learning a new interface. it’s re-teaching a new AI everything the old one knew about you.

data portability isn’t a feature to evaluate. it’s a requirement to demand. any AI tool that makes leaving expensive is telling you exactly how it plans to keep you.

what would you lose if your primary AI tool disappeared tomorrow?


related: context engineering (building portable context) → ai second brain (owning your knowledge) → CLAUDE.md guide (file-based AI configuration)



Topics: data-portability lock-in platform-risk personal-ai ownership