Dataset Stratifier Viz
An interactive visualisation of stratified eval-dataset composition. Three difficulty tiers (Easy / Medium / Hard) hold fixed per-tier model accuracies; the viewer rebalances the dataset mix and watches the aggregate accuracy shift — even though the model itself never changes. The default 80/10/10 split mirrors most scraped benchmarks and inflates the aggregate to ~87%; flipping to 33/34/33 collapses it to ~68% and exposes the model's hard-tier weakness.
A single aggregate accuracy hides the failure modes you most need to surface; stratifying makes them visible — and the weighting makes the headline.
Aggregate 87.0% hides a weakness. One tier scores under 50%, but the easy-heavy mix inflates the headline number — classic Simpson's paradox.
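The headline figures above follow from a plain weighted average. The sketch below reproduces them from the default tier accuracies (Easy 95% / Medium 72% / Hard 38%). The names Mix and weightedAggregate are illustrative assumptions, not part of the component's exported API.

```typescript
// Hypothetical helper: names and shapes are illustrative, not the component's API.
type Mix = [number, number, number];

// Weighted-average accuracy in percent: sum(share_i * accuracy_i) / 100,
// where shares are percentages that sum to 100.
function weightedAggregate(mix: Mix, accuracies: Mix): number {
  const weightedSum =
    mix[0] * accuracies[0] +
    mix[1] * accuracies[1] +
    mix[2] * accuracies[2];
  return weightedSum / 100;
}

// Default Easy / Medium / Hard accuracies from the Props table.
const defaultAccuracies: Mix = [95, 72, 38];

console.log(weightedAggregate([80, 10, 10], defaultAccuracies)); // 87
console.log(weightedAggregate([33, 34, 33], defaultAccuracies)); // 68.37
```

The per-tier accuracies are identical in both calls; only the weights differ, which is the whole point of the visualisation.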
Installation
npx shadcn@latest add https://craftbits.dev/r/dataset-stratifier-viz.json
Usage
import { DatasetStratifierViz } from "@craft-bits/viz/dataset-stratifier-viz";
<DatasetStratifierViz />
Open with a balanced mix so the viewer skips straight to the honest readout:
<DatasetStratifierViz defaultDistribution={[33, 34, 33]} />
Surface the per-tier 95% confidence intervals to teach the sample-size story:
<DatasetStratifierViz showCI defaultTotalSize={150} />
Lift changes into your own readout:
<DatasetStratifierViz
  onChange={({ distribution, aggregate }) => {
    /* feed the snapshot into a downstream chart */
  }}
/>
Understanding the component
- Per-tier bars. Each tier renders an accuracy bar in its own accent (--cb-success / --cb-warning / --cb-error). The fill animates via scaleX on a left transform-origin so it respects the transform-and-opacity-only rule. When showCI is on, a translucent CI band overlays the bar at accuracy ± 1.96 · √(p(1−p)/n).
- Distribution bar. A three-column grid renders the mix proportionally. The columns animate via layout motion so a slider change slides one column wider while the others narrow.
- Aggregate readout. The right-hand panel surfaces the weighted-average accuracy as a tabular-numerals figure that interpolates colour between --cb-success (>85%), --cb-warning (>65%), and --cb-error.
- Weighted calculation. When showCalculation is on (default), an equation row spells out the weighted sum so the viewer can connect the picture to the math.
- Slider panel. When interactive is on (default), three range inputs and a total-examples number input let the viewer reshape the dataset. The two non-touched sliders redistribute proportionally to their previous shares; the total always sums to 100.
- Paradox callout. Whenever the aggregate is above 85% and any tier is below 50%, a warning paragraph announces the Simpson's-paradox shape so the headline number can't lie to a fast reader.
- Reduced motion. Under prefers-reduced-motion: reduce, every entrance, bar grow, and layout transition snaps instantly.
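The slider redistribution and the paradox-callout condition described in the list above can be sketched as two pure functions. This is a minimal sketch under stated assumptions, not the component's actual source. The names redistribute and hasParadoxShape are hypothetical, integer rounding is assumed, and the even split used when both untouched tiers were at 0 is a guess at a sensible fallback.

```typescript
// Hypothetical helpers: illustrative only, not the component's internals.
type Mix = [number, number, number];
type TierIndex = 0 | 1 | 2;

// Set one tier's share, then spread the remainder over the other two tiers
// in proportion to their previous shares, so the mix always sums to 100.
function redistribute(mix: Mix, index: TierIndex, value: number): Mix {
  const touched = Math.min(100, Math.max(0, Math.round(value)));
  const remaining = 100 - touched;
  const [a, b]: [TierIndex, TierIndex] =
    index === 0 ? [1, 2] : index === 1 ? [0, 2] : [0, 1];
  const previousSum = mix[a] + mix[b];
  // Assumption: if both untouched tiers were at 0, split the remainder evenly.
  const shareA =
    previousSum === 0 ? remaining / 2 : (remaining * mix[a]) / previousSum;
  const roundedA = Math.round(shareA);
  const next: Mix = [0, 0, 0];
  next[index] = touched;
  next[a] = roundedA;
  next[b] = remaining - roundedA; // absorbs the rounding error
  return next;
}

// Callout condition from the text: aggregate above 85% and any tier below 50%.
function hasParadoxShape(aggregate: number, accuracies: number[]): boolean {
  return aggregate > 85 && accuracies.some((accuracy) => accuracy < 50);
}

console.log(redistribute([80, 10, 10], 0, 40)); // [ 40, 30, 30 ]
console.log(hasParadoxShape(87, [95, 72, 38])); // true
```

Assigning the last share as the remainder, rather than rounding both untouched tiers independently, is one way to guarantee the total never drifts from 100.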
Props
| Prop | Type | Default | Description |
|---|---|---|---|
| tiers | [Tier, Tier, Tier] | Easy 95% / Medium 72% / Hard 38% | Tier definitions in render order. Accuracies stay fixed across mix changes. |
| distribution | [number, number, number] | — | Controlled mix percentages (sum 100). Pair with onChange. |
| defaultDistribution | [number, number, number] | [80, 10, 10] | Uncontrolled initial mix. |
| totalSize | number | — | Controlled total dataset size. Clamped to [10, 10000]. |
| defaultTotalSize | number | 500 | Uncontrolled initial total. |
| showCI | boolean | false | Overlay each tier bar with a 95% Wald CI band. |
| showCalculation | boolean | true | Render the weighted-average equation row. |
| interactive | boolean | true | Render the slider panel. Set false for a static figure. |
| transition | Transition | SPRINGS.snap | Override the spring used for bar grows and the aggregate readout. |
| onChange | (next) => void | — | Fires with { distribution, aggregate, totalSize } on every adjustment. |
| className | string | — | Merged onto the root via cn(). |
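To make the showCI and totalSize rows concrete, here is a hedged sketch of a 95% Wald interval for one tier. It assumes each tier's sample size n is the clamped total multiplied by that tier's share, and that the band is clipped to the 0–100% range. Neither detail is confirmed by this page, and the function names are hypothetical.

```typescript
// Hypothetical helpers: a sketch of the Wald band, not the component's source.
const Z_95 = 1.96; // z-score for a two-sided 95% interval

// The Props table states totalSize is clamped to [10, 10000].
function clampTotalSize(totalSize: number): number {
  return Math.min(10000, Math.max(10, totalSize));
}

// Assumption: a tier's example count is its share of the clamped total.
function tierCount(totalSize: number, sharePercent: number): number {
  return Math.round((clampTotalSize(totalSize) * sharePercent) / 100);
}

// Wald interval: p ± 1.96 · sqrt(p(1 − p) / n), returned in percent.
// Returns null when the tier has no examples, since the band is undefined.
function waldInterval(
  accuracyPercent: number,
  n: number
): { lower: number; upper: number } | null {
  if (n <= 0) return null;
  const p = accuracyPercent / 100;
  const halfWidth = Z_95 * Math.sqrt((p * (1 - p)) / n);
  return {
    lower: Math.max(0, p - halfWidth) * 100, // clipping is an assumption
    upper: Math.min(1, p + halfWidth) * 100,
  };
}

// Hard tier (38%) at a 10% share of 150 examples, so n = 15.
const hardBand = waldInterval(38, tierCount(150, 10));
console.log(hardBand); // roughly 13% to 63%: a very wide band
```

With only 15 hard examples the band spans about 50 percentage points, which is the sample-size story the showCI option is meant to teach.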
Accessibility
- The root is role="figure" with a descriptive aria-label; the tier bars, the distribution bar, and the per-tier accuracy bars each carry role="img" labels summarising their numbers.
- A polite live region announces the current mix, total examples, and aggregate accuracy whenever the viewer reshapes the dataset.
- Slider inputs carry per-tier aria-labels ("Easy tier percentage of dataset") and reach a 36px hit area; the total-examples input is paired with a <label htmlFor> and the same 36px minimum height.
- All sliders show a :focus-visible ring on --cb-accent; the range track encodes percentage twice (numeric badge + visual fill).
- The Simpson's-paradox warning is a separate paragraph, so colour and the warning accent are never the only signal.
- Motion respects prefers-reduced-motion: reduce, so every entrance, bar grow, and layout transition snaps instantly.
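As an illustration of the live-region behaviour, the sketch below turns a snapshot shaped like the onChange payload into a single announcement string. The wording, the interface name, and the assumption that aggregate is expressed in percent are all hypothetical; the component's real announcement text is not documented here.

```typescript
// Hypothetical sketch: the actual announcement wording is not documented.
interface StratifierSnapshot {
  distribution: [number, number, number]; // percentages summing to 100
  aggregate: number; // assumed to be in percent, e.g. 87
  totalSize: number;
}

// Build one sentence group covering mix, total examples, and aggregate,
// the three facts the polite live region is said to announce.
function describeSnapshot(
  snapshot: StratifierSnapshot,
  tierNames: [string, string, string] = ["Easy", "Medium", "Hard"]
): string {
  const mix = tierNames
    .map((name, i) => `${name} ${snapshot.distribution[i]}%`)
    .join(", ");
  return (
    `Mix: ${mix}. ${snapshot.totalSize} examples. ` +
    `Aggregate accuracy ${snapshot.aggregate.toFixed(1)}%.`
  );
}

console.log(
  describeSnapshot({ distribution: [80, 10, 10], aggregate: 87, totalSize: 500 })
);
// Mix: Easy 80%, Medium 10%, Hard 10%. 500 examples. Aggregate accuracy 87.0%.
```

A string like this could be written into an element with aria-live="polite" from an onChange handler, so assistive technology hears the same numbers a sighted viewer reads.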
Credits
- Extracted from: craftingattention (app/src/lessons/primitives/systems/DatasetStratifierViz.tsx). The source was a three-mode lesson primitive (explore / predict / challenge) layered on Widget, ModeStrip, ChallengeBtn, FeedbackBadge, and ScoreDots, none of which belong in the library. The extract drops the quiz scaffolding and lifts the canonical interactive (tier bars + distribution bar + aggregate readout + slider panel) into a Radix-style controlled API. Per-track palette tokens are remapped to var(--cb-*) semantic tokens so consumer themes repaint freely; inline spring values are re-keyed to SPRINGS.snap / SPRINGS.smooth from @craft-bits/core/motion.