Risk Assessment

Identifying risky, volatile, and security-sensitive code using TeaRAGs git-enriched signals.

Hotspot Detection: The Two-Stage Approach

The Hotspot Model (Tornhill 2024) defines a hotspot as Complexity × Change Frequency. TeaRAGs approximates this with churn + volatility + bug-fix rate at function granularity.

Stage 1: File-level scan

Start broad — find files with high churn:

Find hotspot areas in core business logic, show metadata only

Tool parameters
{
  "query": "core business logic",
  "rerank": "hotspots",
  "metaOnly": true,
  "limit": 30
}

In the results, look at file-level signals:

Signal | Threshold | Meaning
commitCount | >= 15 | High churn: file changes frequently
relativeChurn | >= 3.0 | High relative churn: code has been substantially rewritten (Nagappan threshold)
bugFixRate | >= 40% | High fix rate: many commits are patches, not features
churnVolatility | > 30 | Irregular bursts: changes come in unpredictable waves
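The thresholds above can be applied mechanically to a search result. The following is a minimal sketch, assuming each result exposes the four file-level signals as a plain dictionary keyed by the names in the table. The response shape is not shown in this guide, so `flag_file_signals` and the dictionary layout are hypothetical, and bugFixRate is assumed to be a percentage.

```python
# Hypothetical helper: field names follow the table above, but the
# dictionary layout is an assumption, not the documented TeaRAGs response.
FILE_THRESHOLDS = [
    ("commitCount", lambda v: v >= 15, "High churn"),
    ("relativeChurn", lambda v: v >= 3.0, "High relative churn"),
    ("bugFixRate", lambda v: v >= 40, "High fix rate"),  # assumed percentage
    ("churnVolatility", lambda v: v > 30, "Irregular bursts"),
]


def flag_file_signals(signals: dict) -> list:
    """Return the label of every threshold the file crosses."""
    flags = []
    for name, crossed, label in FILE_THRESHOLDS:
        value = signals.get(name)
        if value is not None and crossed(value):
            flags.append(label)
    return flags
```

A file that crosses two or more thresholds is a reasonable candidate for the Stage 2 drill-down.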

Stage 2: Chunk-level drill-down

For each hotspot file, search with chunk-level focus:

Find functions with the highest churn and bug-fix rate in the payment module

Tool parameters
{
  "query": "error handling in payment processor",
  "rerank": {
    "custom": {
      "similarity": 0.2,
      "chunkChurn": 0.3,
      "bugFix": 0.3,
      "chunkRelativeChurn": 0.2
    }
  },
  "pathPattern": "**/payment/**"
}
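The custom weights in this example sum to 1.0, with similarity deliberately weighted below the churn and bug-fix signals. One way to build intuition is to read them as a weighted sum over normalized signals. This guide does not describe how TeaRAGs combines the weights internally, so `combined_score` and the assumption that every signal is already normalized to [0, 1] are illustrative only.

```python
# Weights copied from the tool parameters above.
WEIGHTS = {
    "similarity": 0.2,
    "chunkChurn": 0.3,
    "bugFix": 0.3,
    "chunkRelativeChurn": 0.2,
}


def combined_score(signals: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of signals; missing signals count as 0.

    Assumes each signal is already normalized to [0, 1]. This is an
    illustrative model, not TeaRAGs' documented scoring.
    """
    return sum(w * signals.get(name, 0.0) for name, w in weights.items())


# Hypothetical chunks: one churny and bug-prone, one a close textual match.
churny = {"similarity": 0.5, "chunkChurn": 0.9, "bugFix": 0.8, "chunkRelativeChurn": 0.6}
stable = {"similarity": 0.9, "chunkChurn": 0.1, "bugFix": 0.1, "chunkRelativeChurn": 0.1}
```

Under this model the churny chunk outranks the closer textual match, which is the point of lowering the similarity weight for a risk-focused search.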

Now interpret chunk-level results:

Pattern | Diagnosis | Action
chunkChurnRatio > 0.8 + chunkBugFixRate > 50% | This one function absorbs most of the file's churn and it's mostly bug fixes | Redesign this function; it's the root cause
chunkChurnRatio > 0.8 + chunkBugFixRate < 20% | Function changes a lot but rarely for bug fixes | Active feature work: not a problem, just busy
chunkChurnRatio < 0.1 + file commitCount > 20 | Function is stable inside a churny file | Good template: reliable, battle-tested code
chunkBugFixRate > 60% + chunkCommitCount < 5 | Few commits, but most are fixes | Fragile code: breaks easily; not churny enough to be a hotspot but still a quality problem
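The four patterns above translate directly into a small rule function. This is a minimal sketch: `diagnose_chunk` is a hypothetical helper, rates are assumed to be percentages, and checking the rules in table order with the first match winning is a choice of this sketch rather than documented behavior.

```python
def diagnose_chunk(chunk_churn_ratio: float,
                   chunk_bug_fix_rate: float,
                   chunk_commit_count: int,
                   file_commit_count: int) -> str:
    """Map chunk-level signals to a diagnosis from the table above.

    Rates are percentages (0-100). Rules are checked in table order and
    the first match wins; that ordering is an assumption of this sketch.
    """
    if chunk_churn_ratio > 0.8 and chunk_bug_fix_rate > 50:
        return "redesign"             # absorbs most churn, mostly bug fixes
    if chunk_churn_ratio > 0.8 and chunk_bug_fix_rate < 20:
        return "active feature work"  # busy, but not because of bugs
    if chunk_churn_ratio < 0.1 and file_commit_count > 20:
        return "good template"        # stable inside a churny file
    if chunk_bug_fix_rate > 60 and chunk_commit_count < 5:
        return "fragile"              # few commits, mostly fixes
    return "no pattern matched"
```

Chunks that fall between the thresholds return "no pattern matched" and need a manual look.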

Real-world interpretation

A file src/payment/processor.ts with commitCount = 28, bugFixRate = 35% might contain:

  • processPayment(): chunkCommitCount = 22, chunkBugFixRate = 55%, chunkChurnRatio = 0.79 (the hotspot)
  • validateCard(): chunkCommitCount = 4, chunkBugFixRate = 25%, chunkChurnRatio = 0.14 (normal maintenance)
  • formatReceipt(): chunkCommitCount = 1, chunkBugFixRate = 0%, chunkChurnRatio = 0.04 (stable, good template)

Without chunk-level metrics, the entire file looks like a hotspot. With them, you know exactly which function to fix.
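The ratios in this example are consistent with chunkChurnRatio being the chunk's commit count divided by the file's commit count. That reading is inferred from the numbers above rather than from a stated definition; a minimal sketch under that assumption:

```python
def chunk_churn_ratio(chunk_commit_count: int, file_commit_count: int) -> float:
    """Share of the file's commits that touched this chunk (assumed definition)."""
    if file_commit_count == 0:
        return 0.0
    return chunk_commit_count / file_commit_count


FILE_COMMIT_COUNT = 28  # src/payment/processor.ts in the example above
chunk_commits = {"processPayment": 22, "validateCard": 4, "formatReceipt": 1}

ratios = {
    name: round(chunk_churn_ratio(count, FILE_COMMIT_COUNT), 2)
    for name, count in chunk_commits.items()
}
# ratios == {"processPayment": 0.79, "validateCard": 0.14, "formatReceipt": 0.04}
```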

Churn Volatility: Healthy vs Pathological Change

Not all churn is bad. churnVolatility (standard deviation of days between commits) distinguishes regular development from erratic patching.
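To make the definition concrete, the sketch below computes the standard deviation of the gaps between consecutive commit dates. It uses the population form (`statistics.pstdev`); whether TeaRAGs uses the population or sample form is not stated here, so treat that choice, and the helper name `churn_volatility`, as assumptions.

```python
from datetime import date
from statistics import pstdev


def churn_volatility(commit_dates: list) -> float:
    """Standard deviation of the gaps, in days, between consecutive commits.

    Uses the population standard deviation, which is an assumption.
    """
    ordered = sorted(commit_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return 0.0  # not enough gaps to measure spread
    return pstdev(gaps)


# Weekly commits: every gap is 7 days, so the volatility is 0.
regular = [date(2024, 1, day) for day in (1, 8, 15, 22, 29)]

# Two bursts separated by a long quiet period: gaps of 1, 90 and 1 days.
bursty = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 4, 1), date(2024, 4, 2)]
```

The regular history scores 0, while the bursty one scores above 30 even though both contain only a handful of commits.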

Interpreting churnVolatility

Volatility | Commit pattern | Interpretation
< 5 | Commits every few days, very regular | CI/CD-driven updates, scheduled maintenance
5 - 15 | Mostly regular with occasional gaps | Normal sprint-based development
15 - 30 | Mix of active periods and quiet periods | Feature-driven development: bursts during sprints
> 30 | Long quiet periods interrupted by frantic patching | Reactive maintenance: "we only touch it when it breaks"
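The bands above can be encoded as a lookup. The table leaves the boundary values ambiguous (15 appears in two rows), so this sketch assigns 15 and 30 to the lower band; that choice and the helper name `volatility_band` are assumptions.

```python
def volatility_band(churn_volatility: float) -> str:
    """Map a churnVolatility value to its interpretation band.

    Boundary values 15 and 30 go to the lower band (an assumption; the
    table does not say which row owns them).
    """
    if churn_volatility < 5:
        return "very regular"
    if churn_volatility <= 15:
        return "normal sprint-based development"
    if churn_volatility <= 30:
        return "feature-driven bursts"
    return "reactive maintenance"
```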

Combining volatility with other signals

Find code with high volatility and high bug-fix rate in error handling

Tool parameters
{
  "query": "error handling retry logic",
  "rerank": {
    "custom": {
      "similarity": 0.3,
      "volatility": 0.3,
      "bugFix": 0.2,
      "chunkChurn": 0.2
    }
  }
}

Diagnosis matrix:

Volatility | Bug Fix Rate | Diagnosis
Low | Low | Stable, well-maintained: leave it alone
Low | High | Regular bug fixes, predictable cadence: systematic quality issue that needs redesign, not more patches
High | Low | Irregular changes, few fixes: feature bursts, probably fine
High | High | Erratic patching, mostly bug fixes: pathological churn, where code breaks, gets patched, and breaks again
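The matrix does not say where "Low" ends and "High" begins. The sketch below borrows the file-level thresholds from earlier in this guide (churnVolatility > 30, bugFixRate >= 40%) as default cutoffs; those defaults, and the helper name `diagnose_churn`, are assumptions that should be tuned per codebase.

```python
def diagnose_churn(churn_volatility: float,
                   bug_fix_rate: float,
                   volatility_cutoff: float = 30.0,
                   bug_fix_cutoff: float = 40.0) -> str:
    """Classify a chunk or file using the volatility x bug-fix matrix.

    bug_fix_rate is a percentage (0-100). The default cutoffs reuse the
    file-level thresholds; the matrix itself does not define them.
    """
    high_volatility = churn_volatility > volatility_cutoff
    high_bug_fix = bug_fix_rate >= bug_fix_cutoff

    if high_volatility and high_bug_fix:
        return "pathological churn"
    if high_volatility:
        return "feature bursts"
    if high_bug_fix:
        return "systematic quality issue"
    return "stable"
```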

Security Audit Surface

Strategy: old + sensitive + under-owned

Security vulnerabilities tend to concentrate in old code on sensitive paths that few people have reviewed:

Find old authentication and encryption code in security-sensitive paths using security audit ranking

Tool parameters
{
  "query": "authentication password encryption token session",
  "rerank": "securityAudit",
  "pathPattern": "{**/auth/**,**/security/**,**/crypto/**,**/session/**}",
  "limit": 20
}

Prioritizing audit targets

Signal combination | Audit priority | Reason
ageDays > 365 + dominantAuthorPct > 80% + security path | Critical | Old code, one owner, security-sensitive: highest vulnerability risk
ageDays > 180 + bugFixRate > 30% + security path | High | Old security code that keeps getting patched
ageDays > 180 + churnVolatility > 30 + security path | High | Irregular patching of old security code: reactive maintenance
ageDays < 30 + multiple contributors + security path | Low | Recently written by multiple people, so likely already reviewed
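The priority rules above can be sketched as a single function for triaging results. `audit_priority` is a hypothetical helper: the `contributor_count` parameter stands in for "multiple contributors", the `on_security_path` flag stands in for the path filter, and the two fallback labels for cases the table does not cover are assumptions of this sketch.

```python
def audit_priority(age_days: int,
                   dominant_author_pct: float,
                   bug_fix_rate: float,
                   churn_volatility: float,
                   contributor_count: int,
                   on_security_path: bool) -> str:
    """Assign an audit priority from the signal combinations in the table.

    Percentages are on a 0-100 scale. Rules are checked from most to
    least severe; the fallback labels are not part of the table.
    """
    if not on_security_path:
        return "Out of scope"
    if age_days > 365 and dominant_author_pct > 80:
        return "Critical"
    if age_days > 180 and (bug_fix_rate > 30 or churn_volatility > 30):
        return "High"
    if age_days < 30 and contributor_count > 1:
        return "Low"
    return "Unclassified"
```

Checking the most severe rule first means old, single-owner code is reported as Critical even when it also matches one of the High rows.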

Find old authentication code with a single owner