Tools

Collection Management

Tool	Description
`create_collection`	Create collection with specified distance metric (Cosine/Euclid/Dot)
`list_collections`	List all collections
`get_collection_info`	Get collection details and statistics
`delete_collection`	Delete collection and all documents
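
As a rough illustration of what the three distance metrics measure, the following pure-Python sketch computes each one for a pair of vectors. Qdrant performs these computations internally; the function names here are made up for this example and are not part of any API.

```python
import math

def dot(a, b):
    """Dot product: higher means more similar; depends on vector length."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: the dot product of the length-normalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclid(a, b):
    """Euclidean distance: lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Same direction, different length: cosine ignores the length difference,
# while dot product and Euclidean distance do not.
a, b = [1.0, 0.0], [2.0, 0.0]
print(dot(a, b), cosine(a, b), euclid(a, b))
```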

Document Operations

Tool	Description
`add_documents`	Add documents with automatic embedding (supports string/number IDs, metadata)
`semantic_search`	Natural language search with optional metadata filtering
`hybrid_search`	Hybrid search combining semantic and keyword (BM25) search with RRF
`delete_documents`	Delete specific documents by ID
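
For readers unfamiliar with RRF (Reciprocal Rank Fusion), here is a minimal sketch of how two ranked lists can be fused. It assumes the textbook formula with the commonly used constant k = 60; the constant and tie-breaking that `hybrid_search` actually applies are not stated in this reference, and `rrf_fuse` is a hypothetical helper name.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

semantic = ["a", "b", "c"]   # order returned by the semantic search
keyword = ["b", "d", "a"]    # order returned by the BM25 search
fused = rrf_fuse([semantic, keyword])
print(fused)
```

Documents that rank well in both lists rise to the top even if neither list ranks them first.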

Code Vectorization

Tool	Description
`index_codebase`	Index a codebase for semantic code search with AST-aware chunking
`search_code`	Search indexed codebase using natural language queries
`reindex_changes`	Incrementally re-index only changed files (detects added/modified/deleted)
`get_index_status`	Get indexing status and statistics for a codebase
`clear_index`	Delete all indexed data for a codebase
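
The added/modified/deleted detection behind `reindex_changes` can be pictured as a diff between two snapshots of the codebase. The sketch below assumes content hashes are compared; this reference does not say how tea-rags detects changes, so treat `snapshot` and `diff_snapshots` as hypothetical helpers for illustration only.

```python
import hashlib

def snapshot(files):
    """Map each path to a SHA-256 digest of its content (assumed mechanism)."""
    return {
        path: hashlib.sha256(content.encode("utf-8")).hexdigest()
        for path, content in files.items()
    }

def diff_snapshots(old, new):
    """Classify paths as added, modified, or deleted between two snapshots."""
    return {
        "added": sorted(p for p in new if p not in old),
        "modified": sorted(p for p in new if p in old and old[p] != new[p]),
        "deleted": sorted(p for p in old if p not in new),
    }

before = snapshot({"a.ts": "one", "b.ts": "two", "c.ts": "three"})
after = snapshot({"a.ts": "one", "b.ts": "two, edited", "d.ts": "four"})
changes = diff_snapshots(before, after)
print(changes)
```

Only the files in `added` and `modified` would need re-embedding, and chunks for `deleted` files would be dropped.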

Project Registry

Tool	Description
`register_project`	Register a short alias for a project path. Persists in `~/.tea-rags/registry.json`
`list_projects`	List all registered projects with collection metadata (embedding model, chunks count, etc.)
`unregister_project`	Remove a project alias by name (idempotent, does NOT delete the Qdrant collection)

Once registered, the alias can be passed as `project` to any project-aware tool instead of `path` or `collection`. Resolution priority: `collection` > `project` > `path`. See Project Registry for the full guide, including the `tea-rags doctor` command, `--recover-registry`, orphan collection inspection, and the `--purge` flag for destructive cleanup.
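
The resolution priority can be sketched as a small function. This is an illustration, not the server's code: `resolve_target` is a hypothetical name, and the registry is modeled as a plain alias-to-path dictionary, which is an assumption about the contents of `registry.json`.

```python
def resolve_target(args, registry):
    """Pick the target using the priority collection > project > path."""
    if args.get("collection"):
        return ("collection", args["collection"])
    if args.get("project"):
        alias = args["project"]
        if alias not in registry:
            raise KeyError(f"unknown project alias: {alias}")
        # Assumption: an alias resolves to the registered project path.
        return ("path", registry[alias])
    if args.get("path"):
        return ("path", args["path"])
    raise ValueError("one of collection, project, or path is required")

registry = {"api": "/home/dev/api"}  # hypothetical registry contents
print(resolve_target({"collection": "code_abc123", "project": "api"}, registry))
print(resolve_target({"project": "api", "path": "/tmp/other"}, registry))
```

When several of the three arguments are supplied, the higher-priority one wins and the others are ignored.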

No purge on MCP unregister_project

The MCP `unregister_project` tool removes the registry entry only; it has no purge parameter. Destructive removal of the underlying Qdrant collection is exposed only via the CLI (`tea-rags projects unregister --name <alias> --purge`). From an MCP client, follow `unregister_project` with `clear_index` or `delete_collection` if you need to remove the chunks.

Similarity Search

Tool	Description
`find_similar`	Find code similar to given chunks or code blocks using Qdrant recommend API

`find_similar` — Find Similar Code

Find chunks similar to given examples (positive) and dissimilar to others (negative). Works with chunk IDs from previous search results and/or raw code blocks.

Parameters:

Parameter	Type	Required	Default	Description
`positiveIds`	`string[]`	no*	`[]`	Chunk IDs from previous search results to find similar code
`positiveCode`	`string[]`	no*	`[]`	Raw code blocks to find similar code (embedded on-the-fly)
`negativeIds`	`string[]`	no	`[]`	Chunk IDs to push results away from
`negativeCode`	`string[]`	no	`[]`	Raw code blocks to push results away from
`strategy`	`string`	no	`best_score`	Recommend strategy: `best_score`, `average_vector`, `sum_scores`
`filter`	`object`	no	—	Qdrant native filter (must/should/must_not)
`pathPattern`	`string`	no	—	Glob pattern to filter results by file path
`fileExtensions`	`string[]`	no	—	Filter by file extensions (e.g. `[".ts", ".js"]`)
`rerank`	`string | object`	no	`relevance`	Rerank preset or custom weights
`limit`	`number`	no	`10`	Max results to return
`offset`	`number`	no	`0`	Offset for pagination
`level`	`string`	no	by preset	`chunk` (alpha-blended) or `file` (file signals only, one per file)
`metaOnly`	`boolean`	no	`false`	Return metadata only (no content)
`collection` / `path`	`string`	yes	—	Project path or collection name

*At least one positive or negative input is required. average_vector and sum_scores strategies require at least one positive input.
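
The input rules in the footnote can be checked client-side before sending a request. The helper below is a sketch under that reading of the rules; `validate_find_similar` is a hypothetical name and the server's own validation and error messages may differ.

```python
STRATEGIES = {"best_score", "average_vector", "sum_scores"}

def validate_find_similar(args):
    """Check the positive/negative input rules and return the effective strategy."""
    positives = list(args.get("positiveIds", [])) + list(args.get("positiveCode", []))
    negatives = list(args.get("negativeIds", [])) + list(args.get("negativeCode", []))
    strategy = args.get("strategy", "best_score")  # documented default

    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    if not positives and not negatives:
        raise ValueError("at least one positive or negative input is required")
    if strategy in ("average_vector", "sum_scores") and not positives:
        raise ValueError(f"{strategy} requires at least one positive input")
    return strategy

# Negative-only input is accepted with the default strategy.
print(validate_find_similar({"negativeIds": ["chunk-uuid"]}))
```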

Strategy descriptions:

best_score (default) — Scores each candidate against every example independently. Most flexible, supports negative-only queries.
average_vector — Averages all positive vectors, subtracts negative. Fastest. Requires at least one positive.
sum_scores — Sums scores across all examples. Middle ground. Requires at least one positive.
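
To make the differences concrete, here is a toy scoring sketch for a single candidate vector. It follows the formulas given in Qdrant's recommendation documentation as best understood here, uses cosine similarity throughout, and is not the code tea-rags or Qdrant runs; all function names are illustrative.

```python
import math

def cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def score_average_vector(candidate, positives, negatives):
    """One query vector: avg(positives) + (avg(positives) - avg(negatives))."""
    if not positives:
        raise ValueError("average_vector requires at least one positive")
    query = mean(positives)
    if negatives:
        query = [2 * p - n for p, n in zip(query, mean(negatives))]
    return cos(query, candidate)

def score_best_score(candidate, positives, negatives):
    """Compare the closest positive with the closest negative example."""
    best_pos = max((cos(p, candidate) for p in positives), default=-math.inf)
    best_neg = max((cos(n, candidate) for n in negatives), default=-math.inf)
    # Candidates nearer to a negative are penalized by the squared similarity.
    return best_pos if best_pos > best_neg else -(best_neg ** 2)

def score_sum_scores(candidate, positives, negatives):
    """Sum of positive similarities minus sum of negative similarities."""
    if not positives:
        raise ValueError("sum_scores requires at least one positive")
    return sum(cos(p, candidate) for p in positives) - sum(cos(n, candidate) for n in negatives)

pos, neg = [[1.0, 0.0]], [[0.0, 1.0]]
like_pos, like_neg = [1.0, 0.0], [0.0, 1.0]
print(score_best_score(like_pos, pos, neg), score_best_score(like_neg, pos, neg))
```

Because `best_score` never builds a combined query vector, it still produces a ranking when only negative examples are supplied.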

Example — find more code like a previous result:

{
  "collection": "code_abc123",
  "positiveIds": ["chunk-uuid-from-search-result"],
  "limit": 10
}
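
To page through results for a request like the one above, `limit` and `offset` can be advanced together. The generator below only builds argument dictionaries; how they are sent depends on your MCP client, and `paged_requests` is a hypothetical helper.

```python
def paged_requests(base, page_size, pages):
    """Yield find_similar argument dicts for consecutive result pages."""
    for page in range(pages):
        yield {**base, "limit": page_size, "offset": page * page_size}

base = {
    "collection": "code_abc123",
    "positiveIds": ["chunk-uuid-from-search-result"],
}
requests = list(paged_requests(base, page_size=10, pages=3))
for request in requests:
    print(request["limit"], request["offset"])
```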

Example — find code similar to a snippet but not like another:

{
  "collection": "code_abc123",
  "positiveCode": ["async function handleAuth(req, res) { ... }"],
  "negativeCode": ["function legacyAuth(callback) { ... }"],
  "fileExtensions": [".ts"],
  "rerank": "recent"
}

Search Parameters

`level` — Analysis Level

Controls scoring granularity and result grouping. Available on semantic_search, hybrid_search, find_similar, and rank_chunks.

Value	Scoring	Grouping
`chunk`	Alpha-blended file + chunk signals	All chunks returned
`file`	Pure file signals only (alpha forced = 0)	One best chunk per file
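
The two rows of the table can be pictured with the sketch below. The blending formula is an assumption chosen so that alpha = 0 leaves only the file signal, matching the `file` row; the real formula is not given in this reference. The grouping helper keeps the highest-scoring chunk per `relativePath`, and chunk names other than `handleLogin` are invented sample data.

```python
def blend(file_signal, chunk_signal, alpha):
    """Assumed blend: alpha weights the chunk signal, (1 - alpha) the file signal."""
    return alpha * chunk_signal + (1.0 - alpha) * file_signal

def best_chunk_per_file(results):
    """Keep one best-scoring chunk per file, ordered by score descending."""
    best = {}
    for result in results:
        path = result["relativePath"]
        if path not in best or result["score"] > best[path]["score"]:
            best[path] = result
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)

results = [
    {"relativePath": "src/auth/login.ts", "name": "handleLogin", "score": 0.87},
    {"relativePath": "src/auth/login.ts", "name": "validateToken", "score": 0.91},
    {"relativePath": "src/db/pool.ts", "name": "connect", "score": 0.60},
]
grouped = best_chunk_per_file(results)
print([r["name"] for r in grouped])
```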

Default: determined by the preset's `signalLevel`. An explicit value overrides the preset. If neither a preset nor an explicit value is given, the level defaults to `chunk`.

File-level presets (default to level: "file"): securityAudit, ownership, onboarding.

{
  "collection": "code_abc123",
  "query": "authentication",
  "rerank": "securityAudit",
  "level": "file"
}
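
The defaulting rule for `level` can be summarized in a few lines. This sketch assumes that only the three file-level presets listed above carry a file `signalLevel` and that custom weight objects count as "no preset"; `resolve_level` is a hypothetical helper, not part of the server.

```python
FILE_LEVEL_PRESETS = {"securityAudit", "ownership", "onboarding"}

def resolve_level(rerank=None, level=None):
    """Explicit level wins, then the preset's level, then 'chunk'."""
    if level is not None:
        return level
    if isinstance(rerank, str) and rerank in FILE_LEVEL_PRESETS:
        return "file"
    # Assumption: other presets and custom weights fall back to chunk level.
    return "chunk"

print(resolve_level(rerank="securityAudit"))                  # preset default
print(resolve_level(rerank="securityAudit", level="chunk"))   # explicit override
print(resolve_level())                                        # no preset, no level
```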

`rerank` — Result Reranking

Reorder search results based on git metadata signals.

Development-focused (available on search_code, semantic_search, hybrid_search, find_similar):

Preset	Use Case	Signals
`relevance`	Default semantic similarity	similarity only
`recent`	Find recently modified code	similarity + recency + burstActivity + density + churn
`stable`	Find stable implementation examples	stability + similarity
`proven`	Battle-tested low-bug code	stability + ownership + low bugFix + similarity

Analytical (available on semantic_search, hybrid_search, rank_chunks):

Preset	Use Case	Signals
`techDebt`	Legacy problematic code	age + churn + bugFix + volatility
`hotspots`	Bug-prone areas	chunkChurn + chunkRelativeChurn + burstActivity + bugFix + volatility
`dangerous`	High-risk modification targets	bugFix + volatility + pathRisk + imports
`bugHunt`	Historically buggy code	bugFix + chunkChurn + volatility
`codeReview`	Review recent changes	recency + burstActivity + density + chunkChurn
`refactoring`	Refactoring candidates	chunkChurn + relativeChurnNorm + chunkSize + volatility + bugFix + age
`onboarding`	Entry points for new devs	documentation + stability
`securityAudit`	Old code in critical paths	age + pathRisk + bugFix + ownership + volatility
`ownership`	Knowledge transfer / silos (live-line, blame-based)	ownership + knowledgeSilo (flags single-author code, sourced from `git.file.blame*`)

recentActivityConcentration is a weight key, not a preset — use it via custom weights (see Available weight keys below) when you want to surface recent committer concentration rather than live-line authority.

Documentation / structural (automatic or explicit):

Preset	Use Case	Signals
`documentationRelevance`	Heading-weighted doc search	similarity + headingDepth + density
`decomposition`	Find components to split	chunkSize + imports + density

Note: `impactAnalysis` was removed. Use custom weights with `imports` or the `dangerous` preset for blast-radius scoring. Full preset catalog with weight formulas: Rerank Presets.

Custom weights:

{ "custom": { "similarity": 0.7, "recency": 0.3 } }

Available weight keys: similarity, recency, stability, churn, age, ownership, recentActivityConcentration, chunkSize, chunkDensity, documentation, headingRelevance, imports, bugFix, volatility, density, chunkChurn, relativeChurnNorm, burstActivity, pathRisk, knowledgeSilo, chunkRelativeChurn, blockPenalty
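
A client can catch misspelled keys before sending a request by checking them against this list. The sketch below hard-codes the keys exactly as listed above, so it is only as current as that list; `custom_rerank` is a hypothetical helper, and whether the server rejects or ignores unknown keys is not stated here.

```python
WEIGHT_KEYS = {
    "similarity", "recency", "stability", "churn", "age", "ownership",
    "recentActivityConcentration", "chunkSize", "chunkDensity", "documentation",
    "headingRelevance", "imports", "bugFix", "volatility", "density",
    "chunkChurn", "relativeChurnNorm", "burstActivity", "pathRisk",
    "knowledgeSilo", "chunkRelativeChurn", "blockPenalty",
}

def custom_rerank(weights):
    """Build a custom rerank value, rejecting keys missing from the list above."""
    unknown = sorted(set(weights) - WEIGHT_KEYS)
    if unknown:
        raise ValueError(f"unknown weight keys: {', '.join(unknown)}")
    return {"custom": dict(weights)}

rerank = custom_rerank({"similarity": 0.7, "recency": 0.3})
print(rerank)
```

The returned dictionary has the same shape as the custom weights example and can be passed as the `rerank` argument.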

Live catalog

This list is hand-maintained and may drift from the live registry. For the canonical, build-current weight-key catalog (with descriptions), read the tea-rags://schema/signals MCP resource — it is generated from the running server's signal registry.

Two ownership weight keys

ownership and knowledgeSilo derive from the live-line family (git.file.blame* — who currently owns the lines in the file), and answer authority and bus-factor questions. recentActivityConcentration derives from the commit-window family (git.file.recent* — who's been actively committing lately) and is the right key for review routing or "who's mentally loaded into this area right now". There is no authors weight key — use ownership (negate to favor diffuse ownership) or recentActivityConcentration.

`metaOnly` — Metadata Only Response

Available on `semantic_search` and `hybrid_search`; `find_similar` also lists it in its parameter table. Returns metadata without content:

{
  "score": 0.87,
  "relativePath": "src/auth/login.ts",
  "startLine": 45,
  "endLine": 89,
  "language": "typescript",
  "chunkType": "function",
  "name": "handleLogin",
  "imports": ["express", "jsonwebtoken", "./utils"],
  "git": {
    "ageDays": 5,
    "commitCount": 12,
    "recentDominantAuthor": "alice",
    "blameDominantAuthor": "alice"
  }
}

imports contains file-level imports (inherited by all chunks from that file). Used by the dangerous preset and custom weights (imports key) to boost files with many dependencies (blast radius).

Use for file discovery, analytics, or reducing response size.
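
As an example of the file-discovery use case, the sketch below groups unique file paths by `git.blameDominantAuthor` from a list of metadata-only results. The field names come from the sample response above; the second record and the helper name `files_by_blame_author` are invented for illustration.

```python
from collections import defaultdict

def files_by_blame_author(results):
    """Group unique file paths by the dominant blame author of each result."""
    files = defaultdict(set)
    for result in results:
        author = result.get("git", {}).get("blameDominantAuthor", "unknown")
        files[author].add(result["relativePath"])
    return {author: sorted(paths) for author, paths in files.items()}

results = [
    {"relativePath": "src/auth/login.ts", "name": "handleLogin",
     "git": {"blameDominantAuthor": "alice"}},
    {"relativePath": "src/auth/login.ts", "name": "handleLogout",
     "git": {"blameDominantAuthor": "alice"}},
    {"relativePath": "src/db/pool.ts", "name": "connect"},  # no git metadata
]
owners = files_by_blame_author(results)
print(owners)
```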

Resources

qdrant://collections — list all collections
qdrant://collection/{name} — collection details
