Stop Renting Your AI
Every token you spend should leave a deposit. If it doesn't, you're renting. And rent goes up.
You know the ritual. Open your AI tool. Re-explain the project. Re-establish the architecture. Re-describe the naming convention you settled on last week. Twenty minutes of setup for forty minutes of work. Tomorrow, same ritual. Same cost. Same starting line.
That's rent. You're paying full price for context you already own.
The Repetition Tax
Someone tracked every token their AI coding agent consumed for a week. 70% was waste. Not bad output, not hallucinations. Waste. Tokens spent re-reading files, re-processing conversation history, re-deriving context that hadn't changed since yesterday.
A deeper breakdown tells the same story. Of all the tokens flowing through a typical AI coding session, 35-45% go to file reading and code search. Another 15-20% is pure context re-sending. Only 5-15% produces actual code. The deliverable is a sliver. The rest is overhead.
And it compounds in the wrong direction. A session that starts at 5,000 tokens per call can reach 200,000 by the end. Each turn re-processes everything before it. By turn 15, you're paying rent on every decision you already made.
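To see why the per-call cost climbs, here's a toy model. All numbers are illustrative assumptions, not measurements: if each turn adds roughly 5,000 new tokens and every call re-sends the full history, per-call cost grows linearly and the session total grows quadratically.

```python
# Toy model of token thrashing: every turn re-processes the entire
# conversation history. Numbers are assumptions for illustration.

def session_cost(turns: int, new_tokens_per_turn: int) -> list[int]:
    """Tokens processed on each call when the whole history is re-sent."""
    costs = []
    history = 0
    for _ in range(turns):
        history += new_tokens_per_turn  # history grows linearly...
        costs.append(history)           # ...and each call re-pays all of it
    return costs

costs = session_cost(turns=15, new_tokens_per_turn=5_000)
print(costs[0])    # 5000  - the first call is cheap
print(costs[-1])   # 75000 - turn 15 re-pays every earlier decision
print(sum(costs))  # 600000 - the session total, mostly re-reads
```

Only the last 5,000 tokens of any call are new work; everything else is rent on context that hasn't changed.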
In operating systems, there's a failure mode called thrashing: the system spends so much time swapping between memory pages that it can't do real work. AI sessions have the same problem. Token thrashing is what happens when your tools burn more tokens re-loading context than producing anything new. Most sessions are thrashing by default. And nobody's named it until now.
"Getting the context to the model at the right time is at least half the performance." - Dan Shipper
Half the performance. Not half the tokens. Half the quality of what you get back. Most AI interactions spend the majority of their budget on context that produces nothing new. That's not a compute problem. That's an economics problem.
Bigger Windows Won't Save You
The industry's answer is bigger context windows. Gemini offers 2 million tokens. Claude handles 200K. The pitch: stuff more in, get more out.
The data says the opposite. Claude's accuracy on recall tasks drops from 29% at 10,000 tokens to 3% at 1 million. Chroma Research tested 18 frontier models. Every single one degraded as input length grew. Even a single distractor document reduced performance versus baseline.
Microsoft and Salesforce found that model performance dropped 39% when benchmarks were converted from single-turn to multi-turn conversations. More context. Worse answers.
The context window is not memory. One engineer put it perfectly: "It's a re-feed pipeline." Every session, the same information gets re-injected, re-processed, re-attended to. A bigger window doesn't change what happens when the session ends. Everything still disappears.
A bigger apartment is still rented. You don't build equity by getting more square footage. You build equity by owning something that appreciates.
What a Deposit Looks Like
Meta faced this problem at scale. Engineers had 50+ patterns that existed only in people's heads. AI agents would "guess, explore, guess again, and produce code that compiled but was subtly wrong." Every new developer and every new AI session started the same expensive exploration.
Their fix: 59 compass files. Each one roughly 1,000 tokens of crystallized knowledge. What used to take two days of research dropped to 30 minutes. Token consumption fell 40% per task.
Those compass files are deposits. Written once, they pay returns every session after.
This is what it looks like at a personal scale too. You write a pull request. You notice you structured the error handling the same way you did last week. You capture the pattern. Next session, your AI already knows how you handle errors. The token cost of that decision drops from 60 tokens of prose to 3 tokens of reference. Session after session, the vocabulary grows richer and the cost grows cheaper.
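The arithmetic above can be made concrete with a toy comparison. The per-session figures come from the paragraph; the one-time capture cost is an assumption for the sketch:

```python
# Renting vs. depositing, in tokens. The 60-token and 3-token figures
# are from the example above; the 200-token capture cost is assumed.

PROSE_TOKENS = 60      # re-describing the error-handling style each session
REFERENCE_TOKENS = 3   # pointing at the named, captured pattern
CAPTURE_TOKENS = 200   # one-time cost of writing the pattern down

def total_cost(sessions: int, per_session: int, setup: int = 0) -> int:
    """Cumulative token cost over a number of sessions."""
    return setup + sessions * per_session

renting = total_cost(50, PROSE_TOKENS)                     # 3000
owning = total_cost(50, REFERENCE_TOKENS, CAPTURE_TOKENS)  # 350
print(renting, owning)
```

Under these assumptions the deposit pays for itself in about four sessions, and every session after that is nearly free.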
"We already did the thinking once. We don't have to do the thinking every time. And that's how we compound."
We've written about how the best frameworks are discovered, not chosen. This is the economic proof. Every captured pattern is a deposit. Every re-explained convention is rent.
The Window Is Open
AI costs are subsidized right now. Everyone in the industry knows this. OpenAI, Anthropic, Google - they're pricing below cost to capture market share. Enterprise AI budgets have grown from $1.2 million to $7 million in two years. The tokens are flowing.
Here's the question most people aren't asking: what are those tokens building?
If the answer is "nothing durable," then you're consuming a subsidy without capturing value from it. When the subsidy normalizes, and it will, the person who spent cheap tokens crystallizing their context pays pennies per session. The person who spent cheap tokens re-explaining the same project pays full price. Again. Every session.
Andrej Karpathy put a name to the shift: context engineering over prompt engineering. Not "what do I type right now" but "what does my AI already know before I type anything?" That's the difference between a lease payment and a mortgage payment. One builds nothing. The other builds equity.
The models will keep improving. Kevin Weil at OpenAI: "The AI model you're using today is the worst AI model you will ever use for the rest of your life." True. And the crystallized context you build today compounds on top of every future model. Your deposits appreciate. Your rent just resets.
What Compounds
RedMonk surveyed developers and found persistent context is their second most-wanted feature. Not speed. Not accuracy. Remembering. Developers don't want bigger windows. They want their AI to know what it knew yesterday.
One team measured the difference: 23% better outcomes after three months of accumulated context versus starting fresh each time. Same model. Same prompts. The only variable was whether knowledge persisted.
Here's the math that makes this irreversible. A junior developer with a base of 1, multiplied by AI, produces 10. A senior developer with a base of 5 produces 50. A senior developer with two years of compounded patterns, a base of 15, produces 150. Same AI. Same multiplier. 15x difference in outcomes.
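The multiplier math from the paragraph above, spelled out. AI is the same factor for everyone; only the base differs:

```python
# Same AI multiplier, different personal bases (figures from the text).

AI_MULTIPLIER = 10  # the tool everyone shares

junior_base = 1      # no accumulated patterns
senior_base = 5      # experience, little of it crystallized
compounded_base = 15 # two years of captured patterns

for base in (junior_base, senior_base, compounded_base):
    print(base * AI_MULTIPLIER)  # 10, then 50, then 150
```

The tool is a constant. The base is the only variable you control, and it's the one that compounds.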
We've written about what happens when judgment runs dry. Your judgment is the expensive input. When you crystallize a pattern, you convert expensive evaluation (slow, metabolically costly, depleting) into cheap recognition (fast, automatic, sustainable). That's the real deposit. Not just saving tokens. Saving the cognitive cost of re-deciding what you already decided.
Frameworks compound. Vocabulary compounds. Judgment compounds. Tokens don't.
The Invitation
Your workbench is where deposits accumulate. Every session that captures a pattern, names a term, records a decision - that's equity. Portable equity. It works with any AI tool. It follows you to the next project, the next job, the next model.
The rent goes up. The deposits appreciate.
Build anything with AI. Keep everything. Evolve forever.
Read more: What Stays When the Session Ends ->