Custom Rerank Strategies
Preset reranking covers common scenarios, but custom weights unlock precise, task-specific analysis. The key is combining orthogonal signals — each weight should add unique information that the others don't have.
Strategy 1: Multi-signal risk scoring
Combine orthogonal signals to create a composite risk score:
Find code ranked by a combination of churn, bug fixes, blast radius, and volatility
Tool parameters — "What should we test more?"
{
"rerank": {
"custom": {
"chunkChurn": 0.25,
"bugFix": 0.3,
"imports": 0.25,
"volatility": 0.2
}
}
}
Why this works: chunkChurn identifies hot functions, bugFix adds quality signal, imports adds blast radius, volatility adds unpredictability. Each signal adds information the others don't have.
Anti-pattern: Don't combine churn + chunkChurn + relativeChurnNorm — these are all churn variants and will just triple-weight the same underlying signal.
Strategy 2: Inverse scoring for safe code
Sometimes you need the opposite of a hotspot — stable, well-owned, low-bug code:
Find stable, battle-tested implementations with distributed ownership to use as reference
Tool parameters — "Safe reference code"
{
"rerank": {
"custom": {
"stability": 0.3,
"age": 0.2,
"similarity": 0.3,
"ownership": 0.2
}
}
}
Why this works: stability (inverse of commitCount) boosts low-churn code. age (direct ageDays) boosts old code — old + stable = battle-tested. ownership boosts distributed authorship — multiple people reviewed this code.
Strategy 3: Activity pulse
Track where active development is happening right now:
Find code with recent burst activity and high change density
Tool parameters — "Active development zones"
{
"rerank": {
"custom": {
"burstActivity": 0.4,
"density": 0.3,
"recency": 0.3
}
}
}
Why this works: burstActivity (recencyWeightedFreq) uses exponential decay — a commit from today counts ~10x more than one from 3 weeks ago. Combined with density (commits/month) and recency, this surfaces the most actively worked-on code.
Use case: Sprint planning — see where engineering effort is concentrated. Compare against roadmap priorities.
Strategy 4: Cross-signal anomaly detection
Find unusual patterns that individual presets miss:
Find code that is both a knowledge silo and a hotspot in security-sensitive paths
Tool parameters — "Dangerous silos"
{
"rerank": {
"custom": {
"knowledgeSilo": 0.3,
"chunkChurn": 0.25,
"bugFix": 0.25,
"pathRisk": 0.2
}
},
"pathPattern": "{**/auth/**,**/payment/**,**/crypto/**}"
}
Why this works: No single preset combines silo risk + churn + security path. Custom weights let you search for this specific intersection — the most dangerous code in the codebase.
Guidelines for building custom weights
| Guideline | Explanation |
|---|---|
| Keep weights summing to ~1.0 | Weights are normalized internally, but 1.0 makes intent clearer |
| Use 3-5 signals maximum | More signals dilute each other — focus on what matters for this specific question |
| Don't overlap signals | churn + chunkChurn + density all measure change frequency — pick one |
Include similarity at 0.2-0.4 | Unless doing pure metadata analysis, some semantic relevance prevents nonsensical matches |
Test with metaOnly: true first | See the scoring before downloading full code content |
Signal Overlap Reference
Signals that measure similar things — avoid combining within the same custom rerank:
| Signal group | Members | Pick one |
|---|---|---|
| Churn frequency | churn, chunkChurn, density, burstActivity | chunkChurn for function-level, burstActivity for recency-weighted |
| Churn magnitude | relativeChurnNorm, chunkRelativeChurn | chunkRelativeChurn for function-level |
| Age/freshness | age, recency | recency if you want recent code, age if you want old code |
| Ownership | ownership, knowledgeSilo | knowledgeSilo for binary silo detection, ownership for gradient |
Signals that are orthogonal and combine well:
| Signal A | + Signal B | Combined meaning |
|---|---|---|
chunkChurn | bugFix | Frequently changed + mostly bug fixes = quality problem |
knowledgeSilo | imports | Single owner + many dependents = dangerous silo |
stability | age | Low churn + old = battle-tested code |
burstActivity | pathRisk | Recent activity in security paths = needs review |
chunkRelativeChurn | volatility | Function absorbs disproportionate churn + irregular pattern = structural problem |
Known Limitations
-
Schema gap: The
ScoringWeightsSchemain the MCP tool definitions does not yet expose the newer weight keys (relativeChurnNorm,burstActivity,pathRisk,knowledgeSilo,chunkRelativeChurn). Agents using preset strings are unaffected; agents constructing custom weights for these signals will need the schema updated. -
No cross-search chaining: Each search is independent. The agent must manually chain results from one search into filters for the next. There is no built-in "find all files that import results from my previous search."
-
Git metadata required: All reranking presets except
relevancerequireCODE_ENABLE_GIT_METADATA=trueduring indexing. Without git enrichment, non-relevance presets silently degrade to similarity-only scoring. -
Chunk-level data is partial: Chunk-level metrics (chunkCommitCount, chunkBugFixRate, etc.) are only available for files with multiple chunks and recent commits within the
GIT_CHUNK_MAX_AGE_MONTHSwindow. Single-chunk files and old-only commits fall back to file-level metrics. -
No fan-in (importedBy) data yet: The current
impactAnalysispreset uses only fan-out (imports count). Fan-in metrics and theblastRadiuspreset are planned.