Reranking

Reranking re-scores search results using git metadata signals combined with task-specific weight profiles. Default semantic search returns results ranked by vector similarity only — but an agent investigating a production bug needs different results than an agent onboarding a new developer, even for the same query.

How It Works

  1. Vector search returns candidates ranked by embedding similarity
  2. Reranking re-scores each result using a weighted combination of trajectory signals
  3. Final ranking reflects both semantic relevance and code quality/history signals
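
The three steps above can be sketched as follows. This is a minimal illustration, not the tool's implementation: the Candidate shape, the rerank helper, the example weights, and the assumption that every signal is already normalized to [0, 1] are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    path: str
    # Signal values keyed by signal name, assumed normalized to [0, 1].
    signals: dict[str, float] = field(default_factory=dict)


def rerank(candidates: list[Candidate], weights: dict[str, float]) -> list[Candidate]:
    """Re-score candidates as a weighted sum of signals, highest score first."""

    def score(c: Candidate) -> float:
        # A signal missing from a candidate contributes nothing.
        return sum(w * c.signals.get(name, 0.0) for name, w in weights.items())

    return sorted(candidates, key=score, reverse=True)


# Step 1: candidates as returned by vector search (ordered by similarity).
candidates = [
    Candidate("src/legacy/parser.ts", {"similarity": 0.90, "recency": 0.10}),
    Candidate("src/api/parser.ts", {"similarity": 0.80, "recency": 0.90}),
]

# Steps 2-3: re-score with a hypothetical similarity + recency profile.
reranked = rerank(candidates, {"similarity": 0.6, "recency": 0.4})
print([c.path for c in reranked])
```

With similarity alone the legacy file stays first; once recency carries weight, the recently touched file moves ahead of it.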

Presets

Each preset declares the tools it supports via a tools: [] field. Most presets work across search_code, semantic_search, and hybrid_search; development-focused presets are narrower.

Development-focused

| Preset | Signals | Best for |
| --- | --- | --- |
| relevance | similarity only | General code lookup (default) |
| recent | similarity + recency | Sprint review, incident response |
| stable | similarity + stability | Finding reliable implementations |
| proven | similarity + stability + low bug-fix rate | Templates that survived production |

Analytical (risk, review, audit)

| Preset | Key signals | Best for |
| --- | --- | --- |
| techDebt | age + churn + bugFix + volatility | Legacy code assessment |
| hotspots | chunkChurn + burstActivity + bugFix + volatility | Bug-prone areas, risk assessment |
| dangerous | bugFix + volatility + pathRisk | High-risk code in security-sensitive paths |
| bug-hunt | bugFix + chunkChurn + recency | Actively bug-fixed, recently touched code |
| codeReview | recency + burstActivity + density + chunkChurn | Recent changes review |
| onboarding | documentation + stability | New developer entry points |
| securityAudit | age + ownership + bugFix + pathRisk + volatility | Old critical code, security review |
| refactoring | chunkChurn + chunkSize + volatility + bugFix + age | Refactor candidates |
| ownership | ownership + knowledgeSilo | Knowledge silos, bus factor |

Note: Blast-radius-only reranking (previously impactAnalysis) was removed in favour of custom weight combinations. For dependency/blast-radius queries, use rerank: { custom: { similarity: 0.5, imports: 0.5 } }.

Custom Weights

Any signal can be combined with arbitrary weights:

{
  "rerank": {
    "custom": {
      "similarity": 0.4,
      "burstActivity": 0.3,
      "bugFix": 0.2,
      "pathRisk": 0.1
    }
  }
}
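
To make the effect of these weights concrete, here is a worked breakdown for a single chunk. The signal values are invented for illustration, and the sketch assumes signals are normalized to [0, 1] and combined as a plain weighted sum.

```python
# Weights copied from the custom profile above.
weights = {"similarity": 0.4, "burstActivity": 0.3, "bugFix": 0.2, "pathRisk": 0.1}

# Hypothetical normalized signal values for one chunk.
signals = {"similarity": 0.8, "burstActivity": 0.5, "bugFix": 0.25, "pathRisk": 1.0}

# Per-signal contribution = weight * signal value.
contributions = {name: weights[name] * signals[name] for name in weights}
score = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name:>14}: {value:.2f}")
print(f"{'total':>14}: {score:.2f}")
```

Under these assumptions the chunk scores 0.62 in total, with similarity contributing 0.32 and pathRisk 0.10 even though pathRisk is at its maximum value.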

Available Scoring Signals

The table below lists 19 signals available for composition (14 derived from git, 5 from vector search or chunk/file metadata):

| Key | Signal source | Description |
| --- | --- | --- |
| similarity | Vector search | Embedding similarity score |
| recency | git (prefers chunk-level) | Inverse of ageDays |
| stability | git (prefers chunk-level) | Inverse of commitCount |
| churn | git (prefers chunk-level) | Direct commitCount |
| age | git (prefers chunk-level) | Direct ageDays |
| ownership | git | Live-line owner concentration (blameDominantAuthorPct, git blame HEAD) |
| recentActivityConcentration | git | Recent-commit concentration (recentDominantAuthorPct, recent commit window) |
| chunkSize | chunk metadata | Lines of code in chunk |
| documentation | chunk metadata | Is documentation file |
| imports | file metadata | Import/dependency count |
| bugFix | git (prefers chunk-level) | bugFixRate |
| volatility | git | churnVolatility (stddev of commit gaps) |
| density | git | changeDensity (commits/month) |
| chunkChurn | git chunk-level | chunkCommitCount |
| relativeChurnNorm | git | Churn relative to file size |
| burstActivity | git | recencyWeightedFreq: recent burst of changes |
| pathRisk | file metadata | Security-sensitive path pattern (0 or 1) |
| knowledgeSilo | git | Single-contributor flag (1 / 0.5 / 0) |
| chunkRelativeChurn | git chunk-level | Chunk's share of file churn |

All presets automatically prefer chunk-level data when available (e.g., chunkCommitCount over commitCount for churn signals).
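
The chunk-level preference can be pictured as a simple fallback. The sketch below assumes a flat dictionary holding both counts; the real payload layout and the helper name effective_commit_count are hypothetical.

```python
from typing import Optional


def effective_commit_count(git: dict) -> Optional[int]:
    """Return chunkCommitCount when present, else fall back to commitCount."""
    chunk_level = git.get("chunkCommitCount")
    # Compare against None so a legitimate chunk-level value of 0 is kept.
    if chunk_level is not None:
        return chunk_level
    return git.get("commitCount")


print(effective_commit_count({"commitCount": 40, "chunkCommitCount": 3}))
print(effective_commit_count({"commitCount": 40}))
```

A chunk that changed 3 times inside a file with 40 commits is then scored on its own history rather than the file's.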

Two ownership families

ownership and knowledgeSilo derive from git.file.blame* (live-line ownership via git blame HEAD) — best for authority and bus-factor questions. recentActivityConcentration derives from git.file.recent* (recent commit window) — best for review routing and activity hotspots. When the long-time owner has stopped committing, blame* and recent* disagree, and the divergence itself signals a knowledge handoff in progress.
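
One way an agent might act on that divergence is sketched below. Only blameDominantAuthorPct and recentDominantAuthorPct are named in this document; the author-name fields (blameDominantAuthor, recentDominantAuthor), the 50% threshold, and the helper itself are assumptions made for illustration.

```python
def handoff_in_progress(file_git: dict, min_pct: float = 50.0) -> bool:
    """Flag files whose live-line owner differs from the recent dominant committer."""
    # Hypothetical fields: the names of the dominant authors in each family.
    blame_author = file_git.get("blameDominantAuthor")
    recent_author = file_git.get("recentDominantAuthor")
    if blame_author is None or recent_author is None:
        return False  # not enough data to compare the two families

    # Only compare when both families show a clearly dominant author.
    both_concentrated = (
        file_git.get("blameDominantAuthorPct", 0.0) >= min_pct
        and file_git.get("recentDominantAuthorPct", 0.0) >= min_pct
    )
    return both_concentrated and blame_author != recent_author


example = {
    "blameDominantAuthor": "alice",
    "blameDominantAuthorPct": 80.0,
    "recentDominantAuthor": "bob",
    "recentDominantAuthorPct": 90.0,
}
print(handoff_in_progress(example))
```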

Confidence-Aware Signals

Some signals are ratios or aggregates whose value is statistically unreliable when the underlying sample is small. Reading bugFixRate=67% as "this file is dangerously buggy" makes sense when there were 200 commits; with 3 commits, 2/3 produces the same 67% but is just noise.

To avoid agents over-reading small-sample noise, confidence-aware signals declare a support sibling (typically commitCount) that gates how much the signal contributes to ranking and which overlay label it surfaces.

Two consumers, one declaration

A confidence-aware signal declares ONE block that drives both:

  • Score dampening: the derived signal's contribution is multiplied by min(1, (supportValue / k)²). Small samples score lower; large samples pass through unchanged. k is read adaptively from the support signal's percentile distribution at query time, so the dampening curve scales with the codebase rather than relying on a hardcoded constant.
  • Label clamp: the overlay label is capped at a less severe ceiling when support falls below a per-rule threshold. The raw value in the overlay is preserved; only the bin (label) shifts. Clamp thresholds can be adaptive (whenSupportBelow: "pN" reads that percentile of the support signal from collection stats), with a static fallback for stale-index cases.
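
The score-dampening rule can be sketched as below, assuming k is the 25th percentile of commitCount across the collection. The helper names and the use of Python's statistics.quantiles to obtain the percentile are illustrative choices, not the tool's actual code.

```python
import statistics


def adaptive_k(support_values: list[int], percentile: int = 25) -> float:
    """Read k from the support signal's distribution (default: 25th percentile)."""
    # With n=100, statistics.quantiles returns the 99 cut points p1..p99.
    cut_points = statistics.quantiles(support_values, n=100, method="inclusive")
    return cut_points[percentile - 1]


def dampen(contribution: float, support: float, k: float) -> float:
    """Multiply a signal's contribution by (support / k)^2, capped at 1."""
    if k <= 0:
        return contribution  # degenerate distribution: leave the score untouched
    factor = min(1.0, (support / k) ** 2)
    return contribution * factor


# Hypothetical commitCount values across a collection.
commit_counts = [2, 4, 6, 10, 12, 20, 40, 80, 200]
k = adaptive_k(commit_counts)

print(dampen(0.67, support=3, k=k))    # small sample: contribution shrinks
print(dampen(0.67, support=200, k=k))  # large sample: passes through unchanged
```

Because the factor is quadratic, a file with half the reference support keeps only a quarter of the signal's contribution.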

The result: an agent reading bugFixRate: { value: 67, label: "healthy" } in overlay knows the sample size doesn't support a higher-severity bin, without having to cross-reference commitCount manually.

Current confidence-aware signals

| Raw signal | Support | Score dampening | Label clamp |
| --- | --- | --- | --- |
| bugFixRate | commitCount | adaptive p25 | <p10 → healthy, <p25 → concerning (per-codebase percentiles) |
| churnVolatility | commitCount | adaptive p25 | none |
| relativeChurn | commitCount | adaptive p25 | none |
| changeDensity | commitCount | adaptive p25 | none |
| recentDominantAuthorPct | commitCount | adaptive p25 | none |
| blameDominantAuthorPct | commitCount | adaptive p25 | none |
| blameContributorCount | commitCount | adaptive p25 | none |

Score dampening applies to all 7 confidence-aware signals; label clamp is currently only declared for bugFixRate (the signal most prone to agent misread on small-N).
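
The bugFixRate clamp rule (<p10 → healthy, <p25 → concerning) might look like the sketch below. Only the "healthy" and "concerning" labels come from this page; the "critical" bin, the severity ordering, and the function names are assumptions.

```python
# Hypothetical severity order, least severe first.
SEVERITY = ["healthy", "concerning", "critical"]


def clamp_label(label: str, support: float, p10: float, p25: float) -> str:
    """Cap the overlay label when the support sample is small."""
    if support < p10:
        ceiling = "healthy"
    elif support < p25:
        ceiling = "concerning"
    else:
        return label  # enough support: the label passes through unchanged
    # Keep whichever of (label, ceiling) is less severe.
    return min(label, ceiling, key=SEVERITY.index)


def overlay(value: float, raw_label: str, support: float, p10: float, p25: float) -> dict:
    # The raw value is preserved; only the label (bin) shifts.
    return {"value": value, "label": clamp_label(raw_label, support, p10, p25)}


# 3 commits with a 67% bug-fix rate, in a codebase where p10=4 and p25=6.
print(overlay(67, "critical", support=3, p10=4, p25=6))
```

The clamp only ever lowers severity: a label that is already at or below the ceiling is returned unchanged.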

Reading confidence-clamped overlays

When you see { value, label } in overlay, the label has ALREADY been clamped if the support sibling was low. Trust the label. If you read raw value directly, pair it with the support's overlay label (commitCount.label) — at "low" or below, treat the value as suggestive, not diagnostic. See signal scoring methods for the full confidence-dampening rule.

Combining Filters with Reranking

Filters (Qdrant conditions) narrow the candidate set. Reranking re-orders the filtered results. Use both for precise queries:

| Goal | Filter | Rerank |
| --- | --- | --- |
| Recent bugs in auth | git.ageDays <= 14 + pathPattern: **/auth/** | hotspots |
| Old single-owner code | git.ageDays >= 90 + git.commitCount >= 5 | ownership (live-line) |
| Sole recent driver | git.ageDays <= 30 + git.file.recentContributorCount == 1 | recentActivityConcentration |
| Recently active TypeScript | language: typescript + git.ageDays <= 30 | codeReview |
| Large stable functions | chunkType: function + git.commitCount <= 3 | onboarding |
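
The filter-then-rerank order can be illustrated with an in-memory stand-in for the first row of the table. This sketch does not call Qdrant: the chunk layout, the weights standing in for the hotspots preset, and the use of fnmatch for the path pattern are all assumptions.

```python
from fnmatch import fnmatch


def filter_then_rerank(chunks, max_age_days, path_pattern, weights):
    """Narrow the candidate set first, then re-order what is left."""
    # Filter: drop anything too old or outside the path pattern.
    kept = [
        c for c in chunks
        if c["git"]["ageDays"] <= max_age_days and fnmatch(c["path"], path_pattern)
    ]

    # Rerank: weighted sum over the surviving candidates only.
    def score(c):
        return sum(w * c["signals"].get(name, 0.0) for name, w in weights.items())

    return sorted(kept, key=score, reverse=True)


chunks = [
    {"path": "src/auth/session.ts", "git": {"ageDays": 3},
     "signals": {"similarity": 0.70, "bugFix": 0.8}},
    {"path": "src/auth/login.ts", "git": {"ageDays": 10},
     "signals": {"similarity": 0.90, "bugFix": 0.1}},
    {"path": "src/auth/legacy.ts", "git": {"ageDays": 400},
     "signals": {"similarity": 0.95, "bugFix": 0.9}},
    {"path": "src/billing/invoice.ts", "git": {"ageDays": 2},
     "signals": {"similarity": 0.85, "bugFix": 0.7}},
]

# Hypothetical weights; the real hotspots preset combines more signals.
result = filter_then_rerank(chunks, 14, "**/auth/**", {"similarity": 0.5, "bugFix": 0.5})
print([c["path"] for c in result])
```

The old auth file and the billing file never reach the reranker, so their high similarity or bugFix values cannot crowd out the results the filter was meant to isolate.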

👉 Full agentic reranking workflows — how agents chain presets for bug investigation, code review, refactoring, and more.