TeaRAGs 🦖🍵

Your coding agent copies the first code it finds, not the right one.

TeaRAGs is an MCP server for code search that enriches every retrieved chunk with git history: authorship, churn, bug-fix rate, ownership. Your agent stops learning from hotspots and starts learning from stable, owned, battle-tested code.

→ Quickstart (15 min) · Core Concepts

The Problem​

1. Understanding a monorepo is expensive, for humans AND agents

Every new developer pays in hours. Every fresh agent session pays in tokens. Naming conventions, domain logic, local idioms: all of it has to be rebuilt from scratch, every time.

2. Bad code hygiene is a tax on your agent​

Confusing names mean the agent reads more files. More files mean more tokens, slower responses, and a higher chance of picking the wrong example. Your codebase's technical debt is now your AI bill.

3. Agents can't tell stable code from a hotspot​

Standard code search ranks by embedding similarity alone. It doesn't know which function gets bug-fixed every sprint, which module hasn't been touched in two years, or whose name is on the commits. So the agent copies whatever looks similar, including the broken examples.

The Solution​

TeaRAGs gives your agent two things it can't get from vanilla code search.

1. Every chunk carries its own history​

Retrieved code comes with signals about who wrote it, how stable it is, how often it gets bug-fixed, and how impactful a change would be. Semantic similarity stops being the whole answer; it becomes the floor.
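To make those signals concrete, here is a minimal sketch of how per-chunk history could be summarized from a list of commits. It is an illustration under stated assumptions, not TeaRAGs' actual implementation: the `Commit` and `ChunkSignals` types, the `chunk_signals` function, and the keyword heuristic for spotting bug-fix commits are all hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Assumption: a commit counts as a bug fix if its message contains one of
# these words. Real tooling may use ticket links or labels instead.
FIX_PATTERN = re.compile(r"\b(fix|fixes|fixed|bug|hotfix)\b", re.IGNORECASE)


@dataclass(frozen=True)
class Commit:
    author: str
    message: str


@dataclass(frozen=True)
class ChunkSignals:
    commit_count: int      # churn proxy: how often the chunk changed
    bug_fix_rate: float    # share of commits that look like fixes (0..1)
    dominant_author: str   # author with the most commits on the chunk
    ownership: float       # dominant author's share of commits (0..1)


def chunk_signals(commits: list[Commit]) -> ChunkSignals:
    """Summarize the commits that touched one chunk into a few signals."""
    if not commits:
        return ChunkSignals(0, 0.0, "", 0.0)
    fixes = sum(1 for c in commits if FIX_PATTERN.search(c.message))
    author, top = Counter(c.author for c in commits).most_common(1)[0]
    total = len(commits)
    return ChunkSignals(total, fixes / total, author, top / total)


history = [
    Commit("ana", "Add parser"),
    Commit("ana", "Fix null check"),
    Commit("bo", "Refactor tokenizer"),
    Commit("ana", "hotfix: crash on empty input"),
]
signals = chunk_signals(history)
```

For this sample history, half of the commits look like fixes and `ana` owns three of the four, which is the kind of metadata an agent can weigh alongside similarity.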

2. Pre-built skills, not just raw tools​

TeaRAGs ships agent skills: ready-made playbooks that tell your agent when and how to use the signals. No prompt engineering required:

  • explore: orient in an unfamiliar codebase
  • data-driven-generation: write code backed by stable, owned templates
  • risk-assessment: know what you'd break before you break it
  • refactoring-scan · bug-hunt · pattern-search, and more

Install the plugin and your agent learns the workflow. See all skills →

Use Cases​

🛡️ Safe code generation

Your agent writes new code backed by stable, canonical templates: modules with a low bug-fix rate, a long stable history, and a clear owner. No more copying from last sprint's hotspot. Skill: data-driven-generation · Why stable code is safer →

🔧 Refactoring planning & problem-pattern discovery

Find the 5% of code responsible for 80% of incidents. High churn + high bug-fix rate + concentrated ownership = your next production issue, and your next refactoring candidate. Skills: refactoring-scan, bug-hunt

🎯 Risk assessment before changes​

Before modifying a function, the agent checks who depends on it, how often it breaks, and what its ticket history says. Know the blast radius before you blast. Skill: risk-assessment · Coupling & blast radius theory →

🗺️ Learning an unfamiliar codebase

Ask questions instead of reading directory trees. "How does auth work?" returns the stable, canonical implementation with its history attached, not a random similar-looking snippet. Skill: explore

How It Works​

You talk to your agent. The agent runs a TeaRAGs skill. TeaRAGs searches your code, enriches each result with git history, and ranks by what the skill needs: stability, ownership, risk, or pure relevance.
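The search, enrich, and rank flow above can be pictured as a reranking step that blends embedding similarity with a stability score. The sketch below is an assumption-laden illustration rather than the actual reranker: the `Hit` type, the `stability` formula, the one-year horizon, and the `stability_weight` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    path: str
    similarity: float        # embedding similarity from vector search (0..1)
    bug_fix_rate: float      # share of commits that were fixes (0..1)
    days_since_change: int   # how long the chunk has been left untouched


def stability(hit: Hit, horizon_days: int = 365) -> float:
    """Assumed formula: few fixes and a long quiet period mean stable code."""
    calm = min(hit.days_since_change / horizon_days, 1.0)
    return (1.0 - hit.bug_fix_rate) * calm


def rerank(hits: list[Hit], stability_weight: float = 0.4) -> list[Hit]:
    """Blend similarity with stability; a weight of 0 means pure relevance."""
    def score(h: Hit) -> float:
        return (1.0 - stability_weight) * h.similarity + stability_weight * stability(h)
    return sorted(hits, key=score, reverse=True)


hits = [
    Hit("legacy/auth_v2.py", similarity=0.92, bug_fix_rate=0.60, days_since_change=10),
    Hit("core/auth.py", similarity=0.85, bug_fix_rate=0.05, days_since_change=700),
]
by_stability = [h.path for h in rerank(hits)]
by_relevance = [h.path for h in rerank(hits, stability_weight=0.0)]
```

With the default weight, the slightly less similar but long-stable `core/auth.py` ranks first; with `stability_weight=0.0` the order falls back to similarity alone. A skill that cares about ownership or risk would plug in a different score in the same place.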

What You Get​

  • 🧬 Trajectory-aware retrieval: the only open-source code RAG that scores results by git history, not just embedding similarity
  • 📚 Ships with agent skills: 6 ready-made playbooks for exploration, generation, risk assessment, and index management (plus 2 internal strategies)
  • 🔒 Local-first, privacy-first: works fully offline with Ollama; your code never leaves your machine (cloud providers optional)
  • 🚀 Built for monorepos: AST-aware chunking across 10+ languages, incremental reindexing, parallel pipelines, tested on millions of LOC

Who It's For​

  • Developers in large monorepos: where "find similar code" returns a dozen near-duplicates and you need the canonical one
  • Solo devs doing agentic development: agent-driven workflows produce bursts of micro-commits that wreck churn metrics. TeaRAGs ships a GIT SESSIONS mode (TRAJECTORY_GIT_SQUASH_AWARE_SESSIONS=true) that groups commits by (author, time gap) so a 20-commit refactor session counts as one. Churn, bug-fix rate, and ownership stay meaningful even with a single human + an agent as the only contributors.
  • Tech leads worried about AI code quality: who want their team's agents to learn from stable modules, not from last sprint's hotspot
  • Privacy-sensitive teams: finance, healthcare, defense, or anyone who can't send source code to a cloud API
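The session grouping mentioned for solo devs, where commits are grouped by (author, time gap), can be sketched as follows. This is a minimal illustration, not the GIT SESSIONS implementation: the `Commit` type, the `group_sessions` function, and the 30-minute gap are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Commit:
    author: str
    timestamp: datetime


def group_sessions(
    commits: list[Commit],
    max_gap: timedelta = timedelta(minutes=30),  # assumed session gap
) -> list[list[Commit]]:
    """Group commits so that one author's burst counts as a single session.

    A commit joins the author's current session if it lands within
    max_gap of that author's previous commit; otherwise it starts a new one.
    """
    sessions: list[list[Commit]] = []
    current_by_author: dict[str, list[Commit]] = {}
    for commit in sorted(commits, key=lambda c: c.timestamp):
        current = current_by_author.get(commit.author)
        if current and commit.timestamp - current[-1].timestamp <= max_gap:
            current.append(commit)
        else:
            current = [commit]
            sessions.append(current)
            current_by_author[commit.author] = current
    return sessions


start = datetime(2024, 1, 1, 9, 0)
burst = [Commit("dev", start + timedelta(minutes=5 * i)) for i in range(20)]
later = Commit("dev", start + timedelta(hours=5))
other = Commit("agent", start + timedelta(minutes=2))
sessions = group_sessions(burst + [later, other])
```

Here the 20-commit burst collapses into one session, so churn computed over sessions sees three events instead of 22 commits.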

Not for: repos without git history (no signal to enrich) or teams that only need autocomplete (use Copilot).

Next Steps​

| I want to... | Start here |
| --- | --- |
| Get it running | Quickstart (15 min): install, index, first query |
| Understand the concept | Core Concepts: vectorization, trajectory enrichment, reranking |
| See what my agent can do | Skills: 6 ready-made agent playbooks for exploration, generation, risk |
| Look under the hood | Architecture: pipelines, data model, reranker internals |
| Learn the theory | Knowledge Base: RAG, code search, software evolution research |

Acknowledgments​

Thanks to Martin Halder / qdrant-mcp-server for the foundation this project is forked from, and to qdrant/mcp-server-qdrant, the ancestor of all forks. Built with Docusaurus 📚.