Diagnosis Engine
After the rule engine produces a set of findings, the diagnosis engine applies a deterministic heuristic graph to map those findings to their most likely root cause, identify contributing factors, and generate actionable recommendations.
How diagnosis works
The diagnosis engine reads the complete findings set from an analysis run and evaluates it against a heuristic graph that encodes relationships between rules, severity levels, and known architectural failure patterns in RAG systems.
The graph is acyclic and deterministic: given the same findings, the same diagnosis is always produced. There is no model inference, no external API call, and no randomness.
Diagnosis structure
```json
{
  "diagnosis": {
    "primaryCause": {
      "id": "low-retrieval-score",
      "label": "Poor retrieval quality",
      "description": "...",
      "severity": "critical"
    },
    "contributingCauses": [
      {
        "id": "duplicate-chunks",
        "label": "Duplicate context",
        "description": "Near-identical chunks are consuming context space..."
      }
    ],
    "evidence": [
      {
        "rule": "low-retrieval-score",
        "severity": "error",
        "message": "Chunk score below minimum threshold (0.41 < 0.72)"
      }
    ],
    "recommendations": [
      "Review your embedding model — low scores often indicate embedding mismatch",
      "Consider re-ranking retrieved chunks before passing to the LLM",
      "Increase the retrieval score threshold to filter out low-quality results",
      "Deduplicate your document corpus or increase chunk diversity in retrieval"
    ],
    "confidence": "high"
  }
}
```
Primary root cause
The primary cause is the single most impactful finding, determined by a combination of:
- Finding severity (errors outweigh warnings)
- Known impact on downstream LLM output quality
- Priority weights defined in the heuristic graph
If no findings are present, the primary cause is null and the diagnosis result indicates a healthy trace.
Contributing causes
Contributing causes are secondary findings that amplify or worsen the primary issue but are not the root cause on their own. For example, duplicate-chunks combined with low-retrieval-score creates a compounding problem — the LLM receives both irrelevant and redundant context.
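The split between a primary cause and its contributing causes can be sketched as follows. This is a minimal illustration, not the library's implementation: the `Finding` and `CauseSplit` types, the `RULE_PRECEDENCE` list, and the `splitCauses` helper are hypothetical names. The sketch assumes the precedence follows the row order of the heuristic priority map, and it leaves out severity and impact weighting for brevity.

```typescript
// Hypothetical shapes for illustration; the real types in the library may differ.
type Severity = "error" | "warning";

interface Finding {
  rule: string;
  severity: Severity;
  message: string;
}

interface CauseSplit {
  primaryCause: string | null;
  contributingCauses: string[];
}

// Assumed precedence, taken from the row order of the heuristic priority map.
const RULE_PRECEDENCE: string[] = [
  "low-retrieval-score",
  "context-overload",
  "duplicate-chunks",
  "oversized-chunk",
];

function splitCauses(findings: Finding[]): CauseSplit {
  // Collect the distinct rules that produced at least one finding.
  const fired = new Set(findings.map((f) => f.rule));

  // Keep only the known rules that fired, in precedence order.
  const ranked = RULE_PRECEDENCE.filter((rule) => fired.has(rule));

  return {
    // null when no known rule fired, which corresponds to a healthy trace.
    primaryCause: ranked[0] ?? null,
    // Every other rule that fired is treated as a contributing cause.
    contributingCauses: ranked.slice(1),
  };
}

const split = splitCauses([
  { rule: "duplicate-chunks", severity: "warning", message: "Near-identical chunks found" },
  { rule: "low-retrieval-score", severity: "error", message: "Chunk score below minimum threshold" },
]);
console.log(split);
```

With these two findings, `split.primaryCause` is `"low-retrieval-score"` and `split.contributingCauses` is `["duplicate-chunks"]`, which matches the shape of the example diagnosis.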
Recommendations
Recommendations are generated from the primary cause and contributing causes. They are ordered by expected impact and tailored to the specific combination of findings present. Recommendations are practical engineering actions, not vague suggestions.
Heuristic priority map
| Primary cause | Triggered when | Hallucination risk |
|---|---|---|
| low-retrieval-score | Any chunk score < minScore | High |
| context-overload | Utilization > 90% and no score issue | Medium |
| duplicate-chunks | Duplicate pairs found, no score/overload issue | Medium |
| oversized-chunk | Only oversized finding present | Low |
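One way to read the table is as an ordered decision list in which the first matching row wins. The sketch below encodes it that way. The `TraceSignals` input, the `PRIORITY_MAP` constant, and the `resolvePrimary` helper are hypothetical names chosen for illustration, and expressing utilization as a fraction between 0 and 1 is an assumption.

```typescript
// Hypothetical summary signals for one trace; not a type from the library.
interface TraceSignals {
  chunkScores: number[];
  minScore: number;
  contextUtilization: number; // assumed to be a fraction of the context window, 0..1
  duplicatePairs: number;
  oversizedChunks: number;
}

type Risk = "High" | "Medium" | "Low";

interface PriorityRow {
  cause: string;
  risk: Risk;
  triggered: (s: TraceSignals) => boolean;
}

// Rows are checked top to bottom, so a later row is only reached when no
// earlier row matched ("no score issue", "no score/overload issue").
const PRIORITY_MAP: PriorityRow[] = [
  {
    cause: "low-retrieval-score",
    risk: "High",
    triggered: (s) => s.chunkScores.some((score) => score < s.minScore),
  },
  { cause: "context-overload", risk: "Medium", triggered: (s) => s.contextUtilization > 0.9 },
  { cause: "duplicate-chunks", risk: "Medium", triggered: (s) => s.duplicatePairs > 0 },
  { cause: "oversized-chunk", risk: "Low", triggered: (s) => s.oversizedChunks > 0 },
];

function resolvePrimary(signals: TraceSignals): { cause: string; risk: Risk } | null {
  // First matching row wins; null means no row was triggered.
  const row = PRIORITY_MAP.find((r) => r.triggered(signals));
  return row ? { cause: row.cause, risk: row.risk } : null;
}

const overloaded = resolvePrimary({
  chunkScores: [0.81, 0.77],
  minScore: 0.72,
  contextUtilization: 0.95,
  duplicatePairs: 2,
  oversizedChunks: 0,
});
console.log(overloaded);
```

In this example no chunk score falls below `minScore`, so the first row is skipped and the result is `context-overload` with a `Medium` risk, even though duplicate pairs are also present.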
Confidence field
The diagnosis includes a confidence field with three possible values:
- high — Multiple corroborating findings or a high-severity critical error. The root cause is well-supported.
- medium — Single finding or mild evidence. Likely correct, but consider further investigation.
- low — Edge case or ambiguous findings. The diagnosis is a best-effort interpretation.
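The three levels above could be derived from the evidence roughly as shown below. The exact rules are not specified here, so this is only a sketch under stated assumptions: "corroborating" is taken to mean more than one finding from the primary rule, a single `error` finding is treated as sufficient for `high`, and "ambiguous" is taken to mean that no finding directly backs the primary rule. The `estimateConfidence` helper and the `Finding` type are hypothetical.

```typescript
type Confidence = "high" | "medium" | "low";

// Hypothetical finding shape, reduced to the fields this sketch needs.
interface Finding {
  rule: string;
  severity: "error" | "warning";
}

function estimateConfidence(findings: Finding[], primaryRule: string): Confidence {
  // Evidence is assumed to be the findings produced by the primary rule.
  const supporting = findings.filter((f) => f.rule === primaryRule);

  // Assumption: with no direct evidence, the diagnosis is best-effort only.
  if (supporting.length === 0) return "low";

  // Assumption: one error, or several findings from the same rule, counts as strong support.
  const hasError = supporting.some((f) => f.severity === "error");
  if (hasError || supporting.length > 1) return "high";

  // A single warning-level finding.
  return "medium";
}

const confidence = estimateConfidence(
  [{ rule: "low-retrieval-score", severity: "error" }],
  "low-retrieval-score",
);
console.log(confidence);
```

Under these assumptions, the single `error` finding from the example diagnosis yields `"high"`, which agrees with the `confidence` value shown in the diagnosis structure.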
Determinism guarantee
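Because the same findings always produce the same diagnosis, with no model inference, external API call, or randomness involved, the property can be checked by running diagnosis repeatedly on one input and comparing the serialized results. The sketch below shows such a check. `diagnoseStub` is a hypothetical stand-in for the real diagnosis function, and `allIdentical` is an illustrative helper, not part of the library.

```typescript
// Hypothetical finding shape, reduced to the fields this sketch needs.
interface Finding {
  rule: string;
  severity: "error" | "warning";
}

// Stand-in for a diagnosis function: pure, no I/O, no randomness.
function diagnoseStub(findings: Finding[]): { primaryCause: string | null; evidenceCount: number } {
  const errors = findings.filter((f) => f.severity === "error");
  // Prefer the first error finding, otherwise fall back to the first finding.
  const first = errors[0] ?? findings[0];
  return { primaryCause: first ? first.rule : null, evidenceCount: findings.length };
}

// Returns true when every result serializes to the same JSON string.
function allIdentical(results: unknown[]): boolean {
  const serialized = results.map((r) => JSON.stringify(r));
  return serialized.every((s) => s === serialized[0]);
}

const findings: Finding[] = [
  { rule: "duplicate-chunks", severity: "warning" },
  { rule: "low-retrieval-score", severity: "error" },
];

// Run the same input several times and compare the outputs.
const runs = Array.from({ length: 5 }, () => diagnoseStub(findings));
const stable = allIdentical(runs);
console.log(stable);
```

The same comparison can be applied to results collected from an asynchronous diagnosis call, for example as a regression test in CI.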
Running diagnosis programmatically
```typescript
import { diagnose } from "@rag-doctor/core";

const result = await diagnose(trace, { pack: "recommended" });

console.log(result.diagnosis.primaryCause?.label);
// → "Poor retrieval quality"

console.log(result.diagnosis.recommendations[0]);
// → "Review your embedding model..."

if (result.diagnosis.confidence === "high") {
  // Safe to act on this diagnosis in automated systems
}
```