Fashion-MNIST Confusion Explorer

A cross-checkpoint confusion-matrix viewer for tracking which class pairs the model still mixes up after each successive training run. By default it ships three synthetic Fashion-MNIST checkpoints at 75%, 87%, and 91% accuracy, enough to surface the persistent Shirt ↔ T-shirt and Coat ↔ Pullover failure modes that survive into the strongest model.

Headline accuracy hides the structure of the remaining errors; a confusion matrix shows it directly.

[Interactive demo: three confusion matrices, Unit 3 (75%), Unit 4 (87%), and Unit 5 (91%). Rows and columns use the class labels T-sh, Trou, Pull, Dres, Coat, Sand, Shir, Snea, Bag, Boot. Initial status: "Unit 3, accuracy 75 percent. No cell selected."]

Installation

npx shadcn@latest add https://craftbits.dev/r/fashion-mnist-confusion-explorer.json

Usage

import { FashionMNISTConfusionExplorer } from "@craft-bits/viz/fashion-mnist-confusion-explorer";
 
<FashionMNISTConfusionExplorer />

Wire real evaluation matrices instead of the synthetic defaults:

<FashionMNISTConfusionExplorer
  models={[
    { id: "baseline", label: "Baseline", matrix: baselineCm },
    { id: "v2", label: "v2", matrix: v2Cm },
  ]}
/>
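
If you only have per-sample labels and predictions from an evaluation run, a matrix such as baselineCm can be tallied with a few lines. The following is a minimal sketch, not part of the package: buildConfusionMatrix and the ConfusionMatrix alias are hypothetical names, and it assumes the component expects a square number[][] indexed as matrix[actual][predicted], which this page does not state explicitly.

```typescript
// Hypothetical alias: the component's real matrix type is not shown here.
// Assumption: matrix[actual][predicted] holds a sample count.
type ConfusionMatrix = number[][];

function buildConfusionMatrix(
  actual: readonly number[],
  predicted: readonly number[],
  numClasses: number,
): ConfusionMatrix {
  if (actual.length !== predicted.length) {
    throw new Error("actual and predicted must have the same length");
  }

  // Start from an all-zero numClasses x numClasses grid.
  const matrix: ConfusionMatrix = Array.from({ length: numClasses }, () =>
    new Array<number>(numClasses).fill(0),
  );

  for (let i = 0; i < actual.length; i++) {
    const a = actual[i];
    const p = predicted[i];
    const inRange = (k: number) =>
      Number.isInteger(k) && k >= 0 && k < numClasses;
    if (!inRange(a) || !inRange(p)) {
      throw new RangeError(`class index out of range at sample ${i}`);
    }
    // Row = true class, column = predicted class.
    matrix[a][p] += 1;
  }
  return matrix;
}
```

For Fashion-MNIST, numClasses would be 10, so something like buildConfusionMatrix(yTrue, yPred, 10) could produce each matrix passed to models.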

Focus on a single matrix and react to cell clicks:

<FashionMNISTConfusionExplorer
  showAllMatrices={false}
  onSelectionChange={(cell) => {
    /* lift the selection into your own narration */
  }}
/>
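
One way to turn the selection into narration is a small pure function called from the onSelectionChange handler. This is only a sketch: describeCell is a hypothetical helper, and it assumes the cell arrives as [row, column] with rows as the actual class and columns as the predicted class, matching the "Shirt predicted as T-shirt: 12" wording used for the cell labels.

```typescript
// The selectedCell prop is typed as [number, number] | null on this page.
type Cell = [number, number] | null;

// Hypothetical helper: builds a one-line description of the picked cell.
// Assumption: cell = [row, column], rows = actual class, columns = predicted.
function describeCell(
  cell: Cell,
  labels: readonly string[],
  matrix: number[][],
): string {
  if (cell === null) return "No cell selected.";

  const [row, col] = cell;
  const count = matrix[row][col];
  // Share of this row's samples that landed in the picked column.
  const rowTotal = matrix[row].reduce((sum, n) => sum + n, 0);
  const percent = rowTotal === 0 ? 0 : Math.round((count / rowTotal) * 100);

  return `${labels[row]} predicted as ${labels[col]}: ${count} of ${rowTotal} (${percent}%)`;
}
```

Inside the handler, the result could be stored in your own state and rendered next to the explorer.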

Understanding the component

  1. Model strip. A role="tablist" row of checkpoints. Each tab shows the label and its diagonal accuracy (Σ diag / Σ all). Selecting a tab swaps the active matrix and clears the cell selection — a cell only carries meaning within its own matrix.
  2. Matrix grid. A 1 + N column CSS grid: the first column is the row-header label, the rest are square <button> cells. Diagonal cells use --cb-success; off-diagonals use --cb-error. Colour intensity scales with count / globalMax so the heatmap stays comparable across models.
  3. Multi-matrix mode. When showAllMatrices is true (default), all models render side-by-side at full size; inactive matrices fade to opacity-60 and act as model-switcher click targets. Set showAllMatrices={false} for a single-matrix layout.
  4. Cell detail. The detail panel surfaces count / row-total (percent) for the picked cell and animates a scaleX bar showing the percentage. Clicking the same cell twice deselects it.
  5. Reduced motion. Under prefers-reduced-motion: reduce, the matrix entrance, the cell hover-/tap-scale, and the detail panel transition all snap to instant.
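
The two ratios named in items 1 and 2 can be sketched as follows. This is an illustration of the arithmetic, not the component's source: the function names are hypothetical, and globalMax is assumed here to be the largest count across all models, which is what keeping the heatmap comparable across models suggests.

```typescript
type ConfusionMatrix = number[][];

// Diagonal accuracy: sum of the diagonal over the sum of every cell.
function diagonalAccuracy(matrix: ConfusionMatrix): number {
  let diag = 0;
  let total = 0;
  matrix.forEach((row, r) => {
    row.forEach((count, c) => {
      total += count;
      if (r === c) diag += count;
    });
  });
  return total === 0 ? 0 : diag / total;
}

// Assumption: one shared maximum across all matrices, so the same count
// gets the same colour intensity in every model.
function globalMax(matrices: readonly ConfusionMatrix[]): number {
  let max = 0;
  for (const matrix of matrices) {
    for (const row of matrix) {
      for (const count of row) {
        if (count > max) max = count;
      }
    }
  }
  return max;
}

// Colour intensity in [0, 1] for a single cell.
function cellIntensity(count: number, max: number): number {
  return max === 0 ? 0 : count / max;
}
```

With a per-matrix maximum instead, a weak model's largest error cell would look as saturated as a strong model's diagonal, which is the distortion a shared scale avoids.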

Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| models | FashionMNISTConfusionExplorerModel[] | three Fashion-MNIST checkpoints (75 / 87 / 91 %) | Models to compare. Each carries its own confusion matrix. |
| labels | string[] | FASHION_MNIST_LABELS | Class labels in row / column order. Length must match every matrix's dimension. |
| modelIndex | number | - | Controlled active model index. Pair with onModelChange. |
| defaultModelIndex | number | 0 | Uncontrolled initial model index. |
| selectedCell | [number, number] \| null | - | Controlled selected cell. Pair with onSelectionChange. |
| defaultSelectedCell | [number, number] \| null | null | Uncontrolled initial selected cell. |
| showAllMatrices | boolean | true | Render all models side-by-side as click-to-switch grids. |
| showCellDetail | boolean | true | Surface the cell-detail panel under the matrix grid. |
| transition | Transition | SPRINGS.snap | Override the spring used for entrance, cell tap, and the detail panel. |
| onModelChange | (index) => void | - | Fires when the active model index changes. |
| onSelectionChange | (cell) => void | - | Fires when the selected cell changes (including clears). |
| className | string | - | Merged onto the root via cn(). |
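
Because labels must match every matrix's dimension, it can help to check that before rendering. The sketch below is a hypothetical pre-flight check, not something the package is known to export: ModelLike mirrors the { id, label, matrix } shape from the usage example, and validateModels is an illustrative name.

```typescript
// Hypothetical stand-in for the model shape shown in the usage example.
interface ModelLike {
  id: string;
  label: string;
  matrix: number[][];
}

// Returns a list of human-readable problems; an empty list means every
// matrix is square and matches labels.length.
function validateModels(
  labels: readonly string[],
  models: readonly ModelLike[],
): string[] {
  const n = labels.length;
  const problems: string[] = [];

  for (const model of models) {
    if (model.matrix.length !== n) {
      problems.push(`${model.id}: expected ${n} rows, got ${model.matrix.length}`);
    }
    model.matrix.forEach((row, r) => {
      if (row.length !== n) {
        problems.push(`${model.id}: row ${r} has ${row.length} columns, expected ${n}`);
      }
    });
  }
  return problems;
}
```

Running this once where the matrices are loaded, and logging or throwing on a non-empty result, catches a mismatched label list before it shows up as a misaligned grid.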

Accessibility

  • The root is role="figure" with a descriptive aria-label; the matrix carries role="grid" with role="columnheader" / role="rowheader" / role="gridcell" children.
  • The model strip is a role="tablist" with aria-selected on the active tab. Each tab targets its grid via aria-controls.
  • Every cell button carries an aria-label of the form "Shirt predicted as T-shirt: 12" and an aria-pressed flag while selected — so colour and intensity are never the only signal.
  • A polite live region announces the active model, accuracy, and selected cell breakdown whenever either changes.
  • All interactive elements show a :focus-visible ring in --cb-accent; cell hit areas meet the 24px minimum, and the model-tab targets exceed 32×32.
  • Motion respects prefers-reduced-motion: reduce — every entrance, cell hover/tap, and detail-panel transition snaps instantly.

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/viz/FashionMNISTConfusionExplorer.tsx). The source was a two-mode lesson primitive (explore + a predict quiz layered on ModeStrip, ChallengeBtn, FeedbackBadge, ScoreDots, DoneCard) with hardcoded checkpoints and a per-cell narration string. The extract drops the quiz scaffolding and the narration generator entirely, lifts the canonical interactive (model strip + matrix grid + cell detail) into a Radix-style controlled API, and exposes models / labels so consumers can wire real evaluation matrices. The seeded synthetic generator survives as generateConfusion for demo use.