Eval Taxonomy Viz

A six-leaf decision tree that helps a visitor pick the right evaluation strategy for an LLM task. They press Start, answer three or four Yes / No questions, and the tree on the left lights up the active path while a detail panel on the right surfaces the destination eval type's Works for list, Fails when list, and setup-effort badge. After two or more completed paths, a comparison table appears that puts the trade-offs side by side.

The six leaves: Exact Match (low effort, "is the answer exactly 42?"), Fuzzy Match (low effort, normalize-then-compare for JSON / SQL / code), Semantic Similarity (medium effort, embed-and-cosine), Rubric-Based (medium effort, multi-axis scoring), LLM-as-Judge (medium effort, stronger model evaluates weaker), and Human Preference (high effort, A/B by real users).

Decision Tree
[Interactive demo: a tree diagram beside a panel with a Start button (press Start or Enter to begin). Root: "What are you evaluating?". Yes / No questions: "Is there a single correct answer?", "Is the answer structured (JSON, code, SQL)?", "Does meaning matter more than wording?", "Can you define scoring criteria?", "Do you need scale?". Leaves: Exact Match, Fuzzy Match, Semantic Similarity, Rubric-Based, LLM-as-Judge, Human Preference. Each question narrows the options until the visitor reaches the best match.]

Installation

npx shadcn@latest add https://craftbits.dev/r/eval-taxonomy-viz.json

Usage

import { EvalTaxonomyViz } from "@craft-bits/viz/eval-taxonomy-viz";
 
<EvalTaxonomyViz />

Subscribe to the visitor's chosen eval type to wire up follow-up content:

<EvalTaxonomyViz
  onLandedOnLeaf={(evalType, answers) => {
    /* gate a follow-up section based on the chosen eval type */
  }}
/>

Override the eval-type catalog to brand or translate the taxonomy:

<EvalTaxonomyViz
  evalTypes={{ /* full six-entry catalog, taglines + colour overrides */ }}
/>
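
Because evalTypes is typed as a full Record, a small merge helper can save retyping all six entries when only one or two need rebranding. The sketch below is illustrative only: the field names (name, tagline, worksFor, failsWhen, effort, colour), the leaf IDs, and mergeCatalog itself are assumptions, not the library's confirmed types or exports.

```typescript
// Hypothetical shapes: the real EvalTaxonomyVizEvalType fields may differ.
type EvalTypeId =
  | "exact-match"
  | "fuzzy-match"
  | "semantic-similarity"
  | "rubric-based"
  | "llm-as-judge"
  | "human-preference";

interface EvalTypeEntry {
  name: string;
  tagline: string;
  worksFor: string[];
  failsWhen: string[];
  effort: "low" | "medium" | "high";
  colour: string;
}

type Catalog = Record<EvalTypeId, EvalTypeEntry>;
type CatalogOverrides = Partial<Record<EvalTypeId, Partial<EvalTypeEntry>>>;

// Shallow-merge per-entry overrides onto a complete default catalog.
// Returns a new object; the defaults are left untouched.
function mergeCatalog(defaults: Catalog, overrides: CatalogOverrides): Catalog {
  const merged: Catalog = { ...defaults };
  for (const id of Object.keys(defaults) as EvalTypeId[]) {
    merged[id] = { ...defaults[id], ...overrides[id] };
  }
  return merged;
}
```

The result could then be passed as the evalTypes prop, assuming a default catalog object is available to merge onto.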

Understanding the component

  1. The tree. A static SVG tree with one root, five interior decision nodes (q-single, q-structured, q-meaning, q-criteria, q-scale) and six leaves. The topology is hard-coded — the eval-type catalog is what consumers customize.
  2. Active path. Each Yes / No answer is pushed onto an answer stack. The visited node IDs drive the node fills and edge strokes: visited nodes get the route-coloured fill, eliminated nodes fade to 30% opacity, and the active edge connecting two consecutive visited nodes tints to var(--cb-accent).
  3. Phase machine. Four phases drive the right-hand panel: idle (Start button), deciding (current question + Yes/No + path-so-far list), landed (eval-type detail card with Works for / Fails when), comparing (cards + comparison table below the tree).
  4. Comparison table. Once two or more leaves are completed, the visitor can press Compare from the detail card. The summary lists every unique eval type with its top Best for, top Watch out, and effort badge — quick at-a-glance trade-offs.
  5. Reduced motion. Under prefers-reduced-motion: reduce, every spring entrance / stagger / scale collapses to an instant change. The current-position marker still snaps to its new node; it just does so without spring overshoot.
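
The answer stack, visited path, and phase logic described above can be sketched as a few pure functions. This is a minimal sketch under stated assumptions: the node IDs and question text come from this page, but the Yes / No wiring between nodes, the leaf IDs, and the helper names (walk, phaseFor, uniqueLeaves) are guesses for illustration, not the component's actual internals.

```typescript
type Answer = "yes" | "no";
type QuestionId = "q-single" | "q-structured" | "q-meaning" | "q-criteria" | "q-scale";
type LeafId =
  | "exact-match"
  | "fuzzy-match"
  | "semantic-similarity"
  | "rubric-based"
  | "llm-as-judge"
  | "human-preference";
type NodeId = QuestionId | LeafId;
type Phase = "idle" | "deciding" | "landed" | "comparing";

// ASSUMED wiring: which branch leads where is not documented on this page.
const TREE: Record<QuestionId, { question: string; yes: NodeId; no: NodeId }> = {
  "q-single": { question: "Is there a single correct answer?", yes: "q-structured", no: "q-meaning" },
  "q-structured": { question: "Is the answer structured (JSON, code, SQL)?", yes: "fuzzy-match", no: "exact-match" },
  "q-meaning": { question: "Does meaning matter more than wording?", yes: "semantic-similarity", no: "q-criteria" },
  "q-criteria": { question: "Can you define scoring criteria?", yes: "q-scale", no: "human-preference" },
  "q-scale": { question: "Do you need scale?", yes: "llm-as-judge", no: "rubric-based" },
};

function isQuestion(id: NodeId): id is QuestionId {
  return id in TREE;
}

// Replay the answer stack from the first question and collect visited node IDs.
function walk(answers: Answer[]): { visited: NodeId[]; current: NodeId } {
  let current: NodeId = "q-single";
  const visited: NodeId[] = [current];
  for (const answer of answers) {
    if (!isQuestion(current)) break; // already on a leaf; ignore extra answers
    current = TREE[current][answer];
    visited.push(current);
  }
  return { visited, current };
}

// Derive the right-hand panel phase from the interaction state.
function phaseFor(started: boolean, answers: Answer[], comparing: boolean): Phase {
  if (!started) return "idle";
  if (comparing) return "comparing";
  return isQuestion(walk(answers).current) ? "deciding" : "landed";
}

// Comparison rows: each completed leaf once, in first-seen order.
function uniqueLeaves(completed: LeafId[]): LeafId[] {
  return completed.filter((id, i) => completed.indexOf(id) === i);
}
```

Deriving the visited list from the answer stack, rather than storing both, keeps the two from drifting apart when an answer is undone or the tree is reset.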

Props

Prop | Type | Default | Description
evalTypes | Record<EvalTaxonomyVizEvalTypeId, EvalTaxonomyVizEvalType> | six-leaf default catalog | Override eval-type names, taglines, lists, effort badges, and colours.
transition | Transition | SPRINGS.snap | Override the spring used for node-state transitions on the tree. Reduced-motion users still snap instantly.
onLandedOnLeaf | (evalType, answers) => void | - | Fires whenever the visitor reaches a leaf, with the eval type and the full answer trace.
className | string | - | Merged onto the root via cn().
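
One way to picture how the transition prop interacts with reduced motion is a small resolver that ignores any spring override when the visitor prefers reduced motion. This is a sketch, not the component's code: the SpringTransition shape, the ASSUMED_SNAP values, and resolveTransition are hypothetical, since this page does not give the real SPRINGS.snap numbers or the Transition type.

```typescript
// Hypothetical spring shape; the library's real Transition type may differ.
interface SpringTransition {
  type: "spring";
  stiffness: number;
  damping: number;
}

interface InstantTransition {
  duration: 0;
}

type ResolvedTransition = SpringTransition | InstantTransition;

// Placeholder numbers only; the real SPRINGS.snap values are not documented here.
const ASSUMED_SNAP: SpringTransition = { type: "spring", stiffness: 400, damping: 30 };

// Reduced motion wins over any override; otherwise fall back to the default spring.
function resolveTransition(
  prefersReducedMotion: boolean,
  override?: SpringTransition,
): ResolvedTransition {
  if (prefersReducedMotion) return { duration: 0 };
  return override ?? ASSUMED_SNAP;
}
```

In a browser, the boolean could come from window.matchMedia("(prefers-reduced-motion: reduce)").matches.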

Accessibility

  • The root is role="figure" with an aria-label summarising the interaction model, and the SVG inside has its own role="img" + aria-label for the tree diagram.
  • The decider lives in a tabIndex={0} container with full keyboard support: Enter / Space to start, Y / N / / to answer, Enter / Space to advance from a landed state, R to reset.
  • A polite sr-only live region announces the current phase — ready-to-start, the current question text with Y / N instructions, the landed eval type and tagline, or the comparison count — so a screen-reader user always knows what just changed.
  • Every Yes / No / Try-another / Compare / Reset button has a distinct aria-label and a ≥ 36px hit target. The Yes / No buttons additionally encode the keyboard shortcut in visible mono-text ((Y) / (N)) so sighted-keyboard users discover the shortcut without reading docs.
  • Colour is never the only signal: every leaf node renders its eval-type name in the SVG, the path-so-far list pairs each colour dot with the answer word, and the comparison cards lead with the eval-type name + effort badge text.
  • Motion respects prefers-reduced-motion: reduce — every spring entrance / stagger / scale collapses to an instant change.
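
The keyboard model in the list above can be sketched as a pure mapping from phase and key to an action. Treat this as an illustration under assumptions: the Action type and actionForKey are hypothetical names, only the Enter / Space, Y / N, and R bindings named above are covered, and allowing R in every phase except idle is a guess.

```typescript
type Phase = "idle" | "deciding" | "landed" | "comparing";

type Action =
  | { kind: "start" }
  | { kind: "answer"; value: "yes" | "no" }
  | { kind: "advance" }
  | { kind: "reset" }
  | null;

// `key` is expected to be a KeyboardEvent.key value ("Enter", " ", "y", "R", ...).
function actionForKey(phase: Phase, key: string): Action {
  // Normalise single letters so "Y" and "y" behave the same.
  const k = key.length === 1 ? key.toLowerCase() : key;
  const isActivate = k === "Enter" || k === " ";

  // Assumption: R resets from any phase once the tree has been started.
  if (k === "r" && phase !== "idle") return { kind: "reset" };

  switch (phase) {
    case "idle":
      return isActivate ? { kind: "start" } : null;
    case "deciding":
      if (k === "y") return { kind: "answer", value: "yes" };
      if (k === "n") return { kind: "answer", value: "no" };
      return null;
    case "landed":
      return isActivate ? { kind: "advance" } : null;
    default:
      return null;
  }
}
```

Keeping the mapping free of DOM types makes it straightforward to unit-test separately from the focusable container's onKeyDown handler.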

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/systems/EvalTaxonomyViz.tsx). The source was a lesson primitive that bundled lesson narration framing (the ca-narration class) and consumed the lesson's per-track palette tokens (--color-accent-500, --color-success-400, --color-fail-500, --color-ink-*). The viz extract remaps every palette token to the canonical var(--cb-accent) / var(--cb-success) / var(--cb-error) / var(--cb-warning) / var(--cb-info) / var(--cb-fg-*) so consumer themes repaint freely, rewrites the source's inline SPRINGS.snappy / SPRINGS.gentle references to the canonical SPRINGS.snap / SPRINGS.smooth from @craft-bits/core/motion, and switches STAGGER.tight / STAGGER.normal to the canonical scalar STAGGER. The infinite glow-ring pulse animation on the current node and the looping offsetDistance flow-dot on the active edge were dropped — both violate the craft-bits/duration-max-300ms rule, and the route-coloured fill plus the snap-tracked position marker already carry the active-state signal.