
TeaRAGs 🍵
Your coding agent copies the first code it finds, not the right one.
TeaRAGs is an MCP server for code search that enriches every retrieved chunk with git history: authorship, churn, bug-fix rate, ownership. Your agent stops learning from hotspots and starts learning from stable, owned, battle-tested code.
→ Quickstart (15 min) · Core Concepts
The Problem
1. Understanding a monorepo is expensive, for humans AND agents
Every new developer pays in hours. Every fresh agent session pays in tokens. Naming conventions, domain logic, local idioms: all of it has to be rebuilt from scratch, every time.
2. Bad code hygiene is a tax on your agent
Confusing names mean the agent reads more files. More files mean more tokens, slower responses, and a higher chance of picking the wrong example. Your codebase's technical debt is now your AI bill.
3. Agents can't tell stable code from a hotspot
Standard code search ranks by embedding similarity alone. It doesn't know which function gets bug-fixed every sprint, which module hasn't been touched in two years, or whose name is on the commits. So the agent copies whatever looks similar, including the broken examples.
The Solution
TeaRAGs gives your agent two things it can't get from vanilla code search.
1. Every chunk carries its own history
Retrieved code comes with signals about who wrote it, how stable it is, how often it gets bug-fixed, and how impactful a change would be. Semantic similarity stops being the whole answer and becomes the floor.
2. Pre-built skills, not just raw tools
TeaRAGs ships agent skills: ready-made playbooks that tell your agent when and how to use the signals. No prompt engineering required:
- explore: orient in an unfamiliar codebase
- data-driven-generation: write code backed by stable, owned templates
- risk-assessment: know what you'd break before you break it
- refactoring-scan · bug-hunt · pattern-search, and more
Install the plugin and your agent learns the workflow. See all skills →
Use Cases
🛡️ Safe code generation
Your agent writes new code backed by stable, canonical templates: modules
with a low bug-fix rate, long stability, and a clear owner. No more copying from
last sprint's hotspot. Skill: data-driven-generation ·
Why stable code is safer →
🔧 Refactoring planning & problem-pattern discovery
Find the 5% of code responsible for 80% of incidents. High churn + high
bug-fix rate + concentrated ownership = your next production issue, and your
next refactoring candidate. Skills: refactoring-scan, bug-hunt
🎯 Risk assessment before changes
Before modifying a function, the agent checks who depends on it, how often it
breaks, and what its ticket history says. Know the blast radius before you
blast. Skill: risk-assessment ·
Coupling & blast radius theory →
🗺️ Learning an unfamiliar codebase
Ask questions instead of reading directory trees. "How does auth work?"
returns the stable, canonical implementation with its history attached, not
a random similar-looking snippet. Skill: explore
How It Works
You talk to your agent. The agent runs a TeaRAGs skill. TeaRAGs searches your code, enriches each result with git history, and ranks by what the skill needs: stability, ownership, risk, or pure relevance.
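To make the ranking step more concrete, the following Python sketch reranks search hits by blending embedding similarity with git-derived signals. The `Hit` fields, the similarity floor, the two-year age cap, and the weights are hypothetical choices for illustration, not the actual TeaRAGs reranker.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical search result; field names are illustrative only."""
    path: str
    similarity: float    # embedding similarity in [0, 1]
    bug_fix_rate: float  # share of commits that were bug fixes, in [0, 1]
    age_days: int        # days since the chunk was last modified


def stability_score(hit: Hit, max_age_days: int = 730) -> float:
    """Higher for code that is rarely bug-fixed and has been left alone for a long time."""
    age = min(hit.age_days, max_age_days) / max_age_days
    return 0.5 * (1.0 - hit.bug_fix_rate) + 0.5 * age


def rerank(hits: list[Hit], floor: float = 0.6, weight: float = 0.5) -> list[Hit]:
    """Drop hits below the similarity floor, then sort by a blended score."""
    relevant = [h for h in hits if h.similarity >= floor]
    return sorted(
        relevant,
        key=lambda h: (1 - weight) * h.similarity + weight * stability_score(h),
        reverse=True,
    )
```

In this sketch similarity acts as the floor: a hit must be relevant enough to be considered at all, and the git signals then decide the order among the relevant hits. A skill that needs pure relevance could set `weight` to 0.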
What You Get
- Trajectory-aware retrieval: the only open-source code RAG that scores results by git history, not just embedding similarity
- Ships with agent skills: 6 ready-made playbooks for exploration, generation, risk assessment, and index management (plus 2 internal strategies)
- Local-first, privacy-first: works fully offline with Ollama; your code never leaves your machine (cloud providers optional)
- Built for monorepos: AST-aware chunking across 10+ languages, incremental reindexing, parallel pipelines, tested on millions of LOC
Who It's For
- Developers in large monorepos, where "find similar code" returns a dozen near-duplicates and you need the canonical one
- Solo devs doing agentic development: agent-driven workflows produce bursts of micro-commits that wreck churn metrics. TeaRAGs ships a GIT SESSIONS mode (TRAJECTORY_GIT_SQUASH_AWARE_SESSIONS=true) that groups commits by (author, time gap) so a 20-commit refactor session counts as one. Churn, bug-fix rate, and ownership stay meaningful even with a single human and an agent as the only contributors.
- Tech leads worried about AI code quality, who want their team's agents to learn from stable modules, not from last sprint's hotspot
- Privacy-sensitive teams: finance, healthcare, defense, or anyone who can't send source code to a cloud API
Not for: repos without git history (no signal to enrich) or teams that only need autocomplete (use Copilot).
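The session grouping mentioned in the list above can be sketched in a few lines of Python. This only illustrates the idea of grouping by (author, time gap): the 30-minute gap, the tuple layout, and the function name are assumptions, not the actual behavior behind TRAJECTORY_GIT_SQUASH_AWARE_SESSIONS.

```python
from datetime import datetime, timedelta


def group_sessions(commits, max_gap=timedelta(minutes=30)):
    """Group (author, timestamp) commits into sessions.

    A commit joins its author's current session when it lands within
    `max_gap` of that author's previous commit; otherwise it starts a new one.
    """
    sessions = []  # each session is a list of (author, timestamp) tuples
    current = {}   # author -> that author's most recent session
    for author, ts in sorted(commits, key=lambda c: c[1]):
        session = current.get(author)
        if session is not None and ts - session[-1][1] <= max_gap:
            session.append((author, ts))
        else:
            session = [(author, ts)]
            sessions.append(session)
            current[author] = session
    return sessions
```

Counting sessions instead of raw commits means a burst of 20 micro-commits two minutes apart contributes one unit of churn, while a commit made hours later starts a new session.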
Next Steps
| I want to... | Start here |
|---|---|
| Get it running | Quickstart (15 min): install, index, first query |
| Understand the concept | Core Concepts: vectorization, trajectory enrichment, reranking |
| See what my agent can do | Skills: 6 ready-made agent playbooks for exploration, generation, and risk assessment |
| Look under the hood | Architecture: pipelines, data model, reranker internals |
| Learn the theory | Knowledge Base: RAG, code search, software evolution research |
Acknowledgments
Thanks to Martin Halder / qdrant-mcp-server for the foundation this forks from, and to qdrant/mcp-server-qdrant, the ancestor of all forks. Built with Docusaurus.