
Query Modes

TeaRAGs provides three search modes, each backed by a different MCP tool and optimized for a different workflow.

| Mode | MCP Tool | Output | Best for |
| --- | --- | --- | --- |
| Code Search | search_code | Human-readable text | Day-to-day development |
| Semantic Search | semantic_search | Structured JSON with full metadata | Analytics, reports, advanced filtering |
| Hybrid Search | hybrid_search | Structured JSON with full metadata | Queries mixing natural language with exact identifiers |

MCP tool: search_code

The primary interface for everyday code search. It wraps semantic search with sensible defaults, accepts shorthand parameters for common filters, and returns human-readable text — ready to paste into a conversation.

What you get

  • File path and line numbers
  • Code snippets with syntax context
  • Relevance scores
  • Language information

What you don't get (use semantic_search instead)

  • Chunk IDs
  • Structured git metadata (authors array, taskIds, timestamps)
  • metaOnly mode
  • Full Qdrant filter syntax
  • Advanced rerank presets (techDebt, hotspots, ownership, etc.)

Rerank presets

| Preset | Boost |
| --- | --- |
| relevance | Default semantic similarity |
| recent | Recently modified code |
| stable | Low-churn, stable implementations |

Custom weights are also supported:

{ "custom": { "similarity": 0.7, "recency": 0.3 } }
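
To make the effect of custom weights concrete, here is a minimal sketch of one plausible way such weights could combine signals: a weighted average of values normalized to the range 0 to 1. The function name `combined_score` and the normalization are assumptions for illustration, not a description of how TeaRAGs computes its rerank score.

```python
def combined_score(signals, weights):
    """Weighted average of signals; every signal is assumed to lie in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    # Signals missing from a chunk contribute 0 (an assumption of this sketch).
    weighted = sum(w * signals.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


# Weights taken from the example above.
weights = {"similarity": 0.7, "recency": 0.3}

# Two hypothetical chunks: one closer in meaning, one more recently modified.
old_but_close = {"similarity": 0.9, "recency": 0.2}
new_but_looser = {"similarity": 0.8, "recency": 0.9}

score_old = combined_score(old_but_close, weights)
score_new = combined_score(new_but_looser, weights)
```

Under these assumptions the more recent chunk ends up ranked first (about 0.83 versus about 0.69), even though its raw similarity is lower.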

Examples

How does user authentication work?

Find error handling in TypeScript files only

Search for request validation in the API directory

Find recent changes by Alice

Workflow

1. Index a codebase

Index this codebase for semantic search

2. Search with natural language

How does user authentication work?

3. Incremental updates

After making changes to your codebase:

Update the search index with my recent changes

4. Check index status

Show me stats for the current index

5. Force re-index

If you need a complete re-index (for example, after changing chunking settings):

Reindex the entire codebase from scratch

Multi-language projects

The indexer automatically detects languages across a full-stack project (TypeScript, React, Vue, Python, Go, Java, JSON, YAML, Bash, etc.):

Search for database connection pooling across all languages

Custom ignore patterns

Create a .contextignore file in your project root to exclude files from indexing:

# .contextignore
**/test/**
**/*.test.ts
**/*.spec.ts
**/fixtures/**
**/mocks/**
**/__tests__/**
**/coverage/**
*.generated.ts
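
To check which files a set of patterns would exclude, the following sketch translates glob patterns into regular expressions. It assumes .contextignore follows gitignore-style semantics (`**` spans directories, `*` stays within one path segment, and a pattern without a slash matches at any depth); the actual matcher used by the indexer may differ. The helpers `glob_to_regex` and `is_ignored` are hypothetical.

```python
import re


def glob_to_regex(pattern):
    """Translate a gitignore-style glob into a compiled regular expression."""
    # Assumption: a pattern without a slash matches at any directory depth.
    if "/" not in pattern:
        pattern = "**/" + pattern
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")     # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def is_ignored(path, patterns):
    """Return True if the POSIX-style relative path matches any pattern."""
    return any(glob_to_regex(p).match(path) for p in patterns)


PATTERNS = ["**/test/**", "**/*.test.ts", "**/coverage/**", "*.generated.ts"]

ignored = [
    p
    for p in [
        "src/auth/login.ts",
        "src/auth/login.test.ts",
        "coverage/lcov.info",
        "src/api/types.generated.ts",
        "src/contest/entry.ts",
    ]
    if is_ignored(p, PATTERNS)
]
```

Note that `src/contest/entry.ts` is not excluded: `**/test/**` only matches a directory named exactly `test`.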

MCP tool: semantic_search

Converts your query into a vector embedding and finds code chunks with the closest meaning. Unlike search_code, it returns structured JSON with full metadata (chunk IDs, git signals, import lists) and supports advanced features such as metaOnly, Qdrant native filters, and all rerank presets.

| Feature | search_code | semantic_search |
| --- | --- | --- |
| Human-readable output | yes | no |
| Structured JSON output | no | yes |
| Chunk ID (UUID) | no | yes |
| Full git metadata in response | no | yes (authors[], taskIds[], timestamps) |
| metaOnly mode | no | yes |
| Qdrant native filter syntax | no | yes |
| pathPattern (glob) | yes | yes |
| Rerank presets | 3 (relevance, recent, stable) | 9+ (techDebt, hotspots, ownership, ...) |
| Requires collection name | no (uses path) | no (path or collection) |

When to use

  • Analytics — tech debt reports, hotspot detection, ownership analysis
  • Need structured data — chunk IDs for downstream processing, git metadata for reports
  • Complex filtering — Qdrant filter syntax with range, match, boolean logic
  • Metadata only — file discovery, codebase structure scans without reading content
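
As an illustration of the complex filtering case, this sketch builds a filter in Qdrant's native syntax (`must` conditions with `match` and `range`). The payload keys `language` and `git.commitCount` are assumptions based on the field names shown in the metaOnly response on this page; check the Filters reference for the keys that are actually filterable. The helper `build_filter` is hypothetical.

```python
import json


def build_filter(language, min_commits):
    """Build a Qdrant-style filter: exact language match AND a commit-count floor."""
    return {
        "must": [
            # Exact match on a keyword payload field (assumed key name).
            {"key": "language", "match": {"value": language}},
            # Numeric range on a nested payload field, addressed with dot notation
            # (assumed key name).
            {"key": "git.commitCount", "range": {"gte": min_commits}},
        ]
    }


high_churn_typescript = build_filter("typescript", 10)
filter_json = json.dumps(high_churn_typescript, indent=2)
```

The resulting JSON is what would be passed as the filter argument of a semantic_search call.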

Rerank presets

All reranking presets are available:

| Preset | Use case |
| --- | --- |
| relevance | Default semantic similarity |
| techDebt | Legacy code assessment |
| hotspots | Bug-prone, high-churn areas |
| codeReview | Recent changes |
| onboarding | Docs + stable code |
| securityAudit | Old critical code |
| refactoring | Refactor candidates |
| ownership | Knowledge silos |
| impactAnalysis | Dependency analysis |

metaOnly mode

Returns metadata without the code content, which significantly reduces response size. A single result looks like this:

{
  "score": 0.87,
  "relativePath": "src/auth/login.ts",
  "startLine": 45,
  "endLine": 89,
  "language": "typescript",
  "chunkType": "function",
  "name": "handleLogin",
  "imports": ["express", "jsonwebtoken", "./utils"],
  "git": { "ageDays": 5, "commitCount": 12, "dominantAuthor": "alice" }
}

Use for file discovery, analytics, codebase structure scans, or ownership reports.
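
Because metaOnly results are plain structured records, an ownership report can be built with a few lines of post-processing. The sketch below assumes the response is a list of objects shaped like the example above; the second and third records and the helper `ownership_report` are hypothetical.

```python
from collections import Counter

# The first record mirrors the example above; the others are made up for illustration.
results = [
    {"relativePath": "src/auth/login.ts", "name": "handleLogin",
     "git": {"ageDays": 5, "commitCount": 12, "dominantAuthor": "alice"}},
    {"relativePath": "src/auth/session.ts", "name": "refreshSession",
     "git": {"ageDays": 40, "commitCount": 3, "dominantAuthor": "alice"}},
    {"relativePath": "src/db/pool.ts", "name": "createPool",
     "git": {"ageDays": 200, "commitCount": 1, "dominantAuthor": "bob"}},
]


def ownership_report(records):
    """Count chunks per dominant author, most frequent first."""
    counts = Counter()
    for record in records:
        # Records without git data are grouped under "unknown".
        author = record.get("git", {}).get("dominantAuthor", "unknown")
        counts[author] += 1
    return counts.most_common()


report = ownership_report(results)
```

Here `report` lists alice with two chunks and bob with one; the same pattern works for grouping by `language` or `chunkType`.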

Characteristics

| Aspect | Detail |
| --- | --- |
| Input | Natural language query |
| Matching | Vector cosine similarity |
| Strengths | Understands synonyms, intent, and cross-language patterns |
| Weakness | May miss exact keywords like function names or acronyms |
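
For reference, cosine similarity compares the direction of two embedding vectors and ignores their length. The sketch below is a plain-Python version for illustration only; in practice Qdrant computes this over the stored vectors, and the tiny two- and three-dimensional vectors here stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing the same way score close to 1 regardless of magnitude, and unrelated (orthogonal) vectors score 0.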

MCP tool: hybrid_search

Combines semantic (vector) similarity with keyword (BM25) matching and merges rankings via Reciprocal Rank Fusion (RRF). Returns the same structured JSON as semantic_search, with the same features (metaOnly, filters, all rerank presets).

When to use

Hybrid search is ideal when your query mixes natural language with technical identifiers:

  • Function or variable names — "getUserById authentication"
  • Acronyms and technical terms — "JWT token validation"
  • Error messages — "ECONNREFUSED database connection"
  • Mixed queries — "OAuth2 authorization flow"

How it works

  1. Dense vector generation — your query is embedded using the configured provider (Ollama, OpenAI, etc.)
  2. Sparse vector generation — the query is tokenized and BM25 scores are calculated
  3. Parallel search — both vector types are searched simultaneously in Qdrant
  4. Result fusion — RRF combines rankings from both searches
  5. Final ranking — merged results with combined relevance scores

RRF formula

Rankings from the semantic and keyword searches are fused using Reciprocal Rank Fusion:

score(d) = sum over i of 1 / (k + rank_i(d))   where rank_i(d) is the position of result d in ranking i, and k = 60 (default)

RRF does not require score normalization and is robust to differences in score scales between the two retrieval methods.
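
The fusion step can be sketched in a few lines. This is a minimal illustration of the formula above, assuming ranks start at 1 and that a result absent from one ranking simply contributes nothing for it; the file names are hypothetical and the function `rrf_fuse` is not part of the TeaRAGs API.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the two retrieval methods.
semantic = ["auth.ts", "login.ts", "session.ts"]
keyword = ["jwt.ts", "auth.ts", "login.ts"]

fused = rrf_fuse([semantic, keyword])
fused_ids = [doc_id for doc_id, _ in fused]
```

Results that appear in both lists (`auth.ts`, `login.ts`) rise above results that appear in only one, even when the single-list result was ranked first there (`jwt.ts`).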

BM25 sparse vectors

The server uses a lightweight BM25 implementation for sparse vectors:

  • Tokenization: lowercase + whitespace splitting
  • IDF scoring: inverse document frequency
  • Parameters: k1 = 1.2, b = 0.75
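
The scoring behind these parameters can be sketched as follows. This is a textbook BM25 implementation using the tokenization and parameters listed above; the exact IDF variant is an assumption (the smoothed form ln(1 + (N - n + 0.5) / (n + 0.5)) is used here), and the server stores the resulting term weights as sparse vectors rather than scoring documents directly like this. The sample documents are hypothetical.

```python
import math
from collections import Counter

K1 = 1.2
B = 0.75


def tokenize(text):
    """Lowercase and split on whitespace, as described above."""
    return text.lower().split()


def bm25_scores(query, documents):
    """Return one BM25 score per document for the given query."""
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF (assumed variant; always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document-length normalization.
            norm = tf[term] + K1 * (1 - B + B * len(d) / avg_len)
            score += idf * tf[term] * (K1 + 1) / norm
        scores.append(score)
    return scores


documents = [
    "JWT token validation middleware",
    "user login form",
    "validate the jwt before login",
]
scores = bm25_scores("JWT validation", documents)
```

The first document matches both query terms and scores highest, the third matches only "jwt", and the second matches nothing and scores 0. Because tokenization is whitespace-only, "validate" does not match "validation"; that gap is what the semantic half of hybrid search covers.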

Comparison: Semantic vs Hybrid

| Query | Semantic Search | Hybrid Search |
| --- | --- | --- |
| "JWT authentication" | Finds authentication concepts; may miss exact "JWT" matches | Finds both semantically related auth docs and exact "JWT" keyword matches |
| "authenticateUser function" | Understands the concept but may miss the exact function name | Keyword match catches authenticateUser precisely; semantic match catches related auth code |
| "OAuth2 authentification" (typo) | Understands "authentification" as "authentication" | Keyword match catches "OAuth2" exactly; semantic match handles the typo gracefully |

Hybrid search must be enabled at collection creation time:

Create a collection with hybrid search enabled

Existing collections cannot be converted to hybrid after creation.

Performance considerations

  • Storage: hybrid collections require more space (dense + sparse vectors)
  • Indexing: slightly slower due to dual vector generation
  • Query time: two parallel searches plus RRF fusion
  • Scalability: Qdrant optimizes both vector types efficiently

Filters

Filters narrow search results by metadata: language, file path, code structure, git churn metrics, and more. Filters work with all three search modes.

See Filters for the complete reference: filter syntax, operators, filterable fields, git churn filters, path patterns, and examples.

Find error handling in TypeScript files only

Show me high-churn code in the auth directory

Choosing the Right Mode

| Situation | Recommended Mode |
| --- | --- |
| Day-to-day development: search, index, reindex | search_code |
| Exploring unfamiliar code by concept | search_code or semantic_search |
| Analytics: tech debt, hotspots, ownership reports | semantic_search with rerank presets |
| Need chunk IDs or structured git metadata | semantic_search |
| Metadata-only scans (file discovery, structure) | semantic_search with metaOnly |
| Searching for exact function names mixed with concepts | hybrid_search |
| Narrowing results to a specific language, author, or directory | Any mode + filters |

For complete tool parameters, response formats, and rerank weight keys, see the Tools Schema.

Parts of this page are adapted from examples originally written by Martin Halder in qdrant-mcp-server.