Fashion-MNIST Confusion Explorer

A cross-checkpoint confusion-matrix viewer for tracking which class pairs a model still mixes up after each successive training run. By default it ships three synthetic Fashion-MNIST checkpoints at 75 / 87 / 91 % accuracy, enough to surface the persistent Shirt ↔ T-shirt and Coat ↔ Pullover failure modes that survive into the strongest model.

Headline accuracy hides the structure of the remaining errors; a confusion matrix shows it directly.
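
To make that concrete, here is a minimal sketch (not part of the component) that ranks class pairs by their combined off-diagonal count. It assumes a `matrix[actual][predicted]` orientation, which this page does not state explicitly; `topConfusedPairs` and the toy numbers are illustrative only.

```typescript
// Hypothetical helper, not part of the component's API.
// Assumed layout: matrix[actual][predicted].
type ConfusionMatrix = number[][];

interface ConfusedPair {
  a: number; // lower class index of the pair
  b: number; // higher class index of the pair
  count: number; // errors in both directions: a→b plus b→a
}

// Rank unordered class pairs by how often they are mistaken for each other.
function topConfusedPairs(matrix: ConfusionMatrix, k: number): ConfusedPair[] {
  const pairs: ConfusedPair[] = [];
  for (let a = 0; a < matrix.length; a++) {
    for (let b = a + 1; b < matrix.length; b++) {
      const count = matrix[a][b] + matrix[b][a];
      if (count > 0) pairs.push({ a, b, count });
    }
  }
  return pairs.sort((x, y) => y.count - x.count).slice(0, k);
}

// Toy 3-class matrix: 252 of 300 samples sit on the diagonal (84 % accuracy),
// yet 40 of the 48 errors come from a single pair (classes 0 and 2).
const toyMatrix: ConfusionMatrix = [
  [80, 2, 18],
  [1, 97, 2],
  [22, 3, 75],
];
const worstPairs = topConfusedPairs(toyMatrix, 2);
```

The accuracy figure alone would not reveal that almost all remaining errors are concentrated in one pair; the ranked list does.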

[Interactive demo: three confusion matrices, Unit 3 (75%), Unit 4 (87%), and Unit 5 (91%). Rows and columns each list the ten classes in order: T-shirt, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, Boot.]

Installation

npx shadcn@latest add https://craftbits.dev/r/fashion-mnist-confusion-explorer.json

Usage

import { FashionMNISTConfusionExplorer } from "@craft-bits/viz/fashion-mnist-confusion-explorer";
 
<FashionMNISTConfusionExplorer />

Wire real evaluation matrices instead of the synthetic defaults:

<FashionMNISTConfusionExplorer
  models={[
    { id: "baseline", label: "Baseline", matrix: baselineCm },
    { id: "v2", label: "v2", matrix: v2Cm },
  ]}
/>
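
A value such as `baselineCm` has to come from your own evaluation pipeline. The following sketch shows one way to tally it from per-sample class indices. It assumes rows are the true class and columns the predicted class; `buildConfusionMatrix` is a hypothetical helper, not something the package is known to export.

```typescript
// Hypothetical helper: tally a square confusion matrix from class indices.
// Assumed layout: matrix[actual][predicted].
function buildConfusionMatrix(
  actual: number[],
  predicted: number[],
  numClasses: number,
): number[][] {
  if (actual.length !== predicted.length) {
    throw new Error("actual and predicted must have the same length");
  }
  // One independent zero-filled row per class.
  const matrix = Array.from({ length: numClasses }, () =>
    new Array<number>(numClasses).fill(0),
  );
  for (let i = 0; i < actual.length; i++) {
    const t = actual[i];
    const p = predicted[i];
    const valid =
      Number.isInteger(t) && Number.isInteger(p) &&
      t >= 0 && p >= 0 && t < numClasses && p < numClasses;
    if (!valid) {
      throw new RangeError(`sample ${i}: class index out of range`);
    }
    matrix[t][p] += 1;
  }
  return matrix;
}

// Five samples over three classes; Fashion-MNIST itself would use numClasses = 10.
const exampleCm = buildConfusionMatrix([0, 0, 1, 2, 2], [0, 2, 1, 2, 0], 3);
// exampleCm is [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
```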

Focus on a single matrix and react to cell clicks:

<FashionMNISTConfusionExplorer
  showAllMatrices={false}
  onSelectionChange={(cell) => {
    /* lift the selection into your own narration */
  }}
/>
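
If you lift both the active model and the selected cell into a parent, the parent's state can follow the same rules this page describes: switching models clears the selection, and clicking the selected cell again deselects it. The reducer below is a sketch under those assumptions; `ExplorerState`, `ExplorerAction`, and `explorerReducer` are hypothetical names, and re-selecting the already active model is assumed to be a no-op.

```typescript
type Cell = [number, number];

interface ExplorerState {
  modelIndex: number;
  selectedCell: Cell | null;
}

type ExplorerAction =
  | { type: "selectModel"; index: number }
  | { type: "clickCell"; cell: Cell };

// Hypothetical reducer mirroring the documented selection rules.
function explorerReducer(
  state: ExplorerState,
  action: ExplorerAction,
): ExplorerState {
  switch (action.type) {
    case "selectModel":
      // Assumption: picking the active model again changes nothing.
      if (action.index === state.modelIndex) return state;
      // A cell only carries meaning within its own matrix, so clear it.
      return { modelIndex: action.index, selectedCell: null };
    case "clickCell": {
      const current = state.selectedCell;
      const sameCell =
        current !== null &&
        current[0] === action.cell[0] &&
        current[1] === action.cell[1];
      // Clicking the selected cell a second time toggles it off.
      return { ...state, selectedCell: sameCell ? null : action.cell };
    }
  }
}
```

Keeping these rules in a pure function makes them easy to unit-test without rendering the component.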

Understanding the component

Model strip. A role="tablist" row of checkpoints. Each tab shows the label and its diagonal accuracy (Σ diag / Σ all). Selecting a tab swaps the active matrix and clears the cell selection — a cell only carries meaning within its own matrix.
Matrix grid. A 1 + N column CSS grid: the first column is the row-header label, the rest are square <button> cells. Diagonal cells use --cb-success; off-diagonals use --cb-error. Colour intensity scales with count / globalMax so the heatmap stays comparable across models.
Multi-matrix mode. When showAllMatrices is true (default), all models render side-by-side at full size; inactive matrices fade to opacity-60 and act as model-switcher click targets. Set showAllMatrices={false} for a single-matrix layout.
Cell detail. The detail panel shows count / row-total (as a percentage) for the selected cell and animates a scaleX bar to that percentage. Clicking an already-selected cell deselects it.
Reduced motion. Under prefers-reduced-motion: reduce, the matrix entrance, the cell hover-/tap-scale, and the detail panel transition all snap to instant.
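
The three ratios mentioned above (diagonal accuracy, colour intensity, and the cell-detail share) can be sketched as plain functions. This is an illustration of the arithmetic, not the component's source; the function names are hypothetical, and `globalMax` is assumed to be the largest count across all supplied models.

```typescript
type Matrix = number[][];

// Diagonal accuracy: Σ diag / Σ all.
function diagonalAccuracy(m: Matrix): number {
  let diag = 0;
  let total = 0;
  m.forEach((row, i) =>
    row.forEach((count, j) => {
      total += count;
      if (i === j) diag += count;
    }),
  );
  return total === 0 ? 0 : diag / total;
}

// Assumption: the maximum is taken over every cell of every model,
// so the same count maps to the same intensity in each matrix.
function globalMax(models: Matrix[]): number {
  let max = 0;
  for (const m of models) {
    for (const row of m) {
      for (const count of row) {
        if (count > max) max = count;
      }
    }
  }
  return max;
}

// Colour intensity in [0, 1]: count / globalMax.
function cellIntensity(count: number, max: number): number {
  return max === 0 ? 0 : count / max;
}

// Cell-detail share: count / row-total.
function rowShare(m: Matrix, row: number, col: number): number {
  const rowTotal = m[row].reduce((sum, count) => sum + count, 0);
  return rowTotal === 0 ? 0 : m[row][col] / rowTotal;
}

const early: Matrix = [[8, 2], [1, 9]];
const late: Matrix = [[5, 5], [0, 10]];
const sharedMax = globalMax([early, late]); // 10
```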

Props

Prop	Type	Default	Description
`models`	`FashionMNISTConfusionExplorerModel[]`	three Fashion-MNIST checkpoints (75 / 87 / 91 %)	Models to compare. Each carries its own confusion matrix.
`labels`	`string[]`	`FASHION_MNIST_LABELS`	Class labels in row / column order. Length must match every matrix's dimension.
`modelIndex`	`number`	—	Controlled active model index. Pair with `onModelChange`.
`defaultModelIndex`	`number`	`0`	Uncontrolled initial model index.
`selectedCell`	`[number, number] | null`	—	Controlled selected cell. Pair with `onSelectionChange`.
`defaultSelectedCell`	`[number, number] | null`	`null`	Uncontrolled initial selected cell.
`showAllMatrices`	`boolean`	`true`	Render all models side-by-side as click-to-switch grids.
`showCellDetail`	`boolean`	`true`	Surface the cell-detail panel under the matrix grid.
`transition`	`Transition`	`SPRINGS.snap`	Override the spring used for entrance, cell tap, and the detail panel.
`onModelChange`	`(index) => void`	—	Fires when the active model index changes.
`onSelectionChange`	`(cell) => void`	—	Fires when the selected cell changes (including clears).
`className`	`string`	—	Merged onto the root via `cn()`.
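
Because `labels` must match every matrix's dimension, it can help to check your data before passing it in. The sketch below is one possible pre-flight check. `ModelLike` only mirrors the `id` / `label` / `matrix` fields seen in the usage example (the real `FashionMNISTConfusionExplorerModel` type may carry more), and `validateModels` is a hypothetical helper.

```typescript
// Shape inferred from the usage example; an assumption, not the exported type.
interface ModelLike {
  id: string;
  label: string;
  matrix: number[][];
}

// Return a list of human-readable problems; an empty list means the data fits.
function validateModels(models: ModelLike[], labels: string[]): string[] {
  const n = labels.length;
  const problems: string[] = [];
  for (const model of models) {
    if (model.matrix.length !== n) {
      problems.push(`${model.id}: expected ${n} rows, got ${model.matrix.length}`);
      continue;
    }
    model.matrix.forEach((row, i) => {
      if (row.length !== n) {
        problems.push(`${model.id}: row ${i} has ${row.length} columns, expected ${n}`);
      } else if (row.some((count) => !Number.isFinite(count) || count < 0)) {
        problems.push(`${model.id}: row ${i} contains a negative or non-finite count`);
      }
    });
  }
  return problems;
}
```

Returning a list rather than throwing on the first mismatch lets you report every malformed checkpoint at once.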

Accessibility

The root is role="figure" with a descriptive aria-label; the matrix carries role="grid" with role="columnheader" / role="rowheader" / role="gridcell" children.
The model strip is a role="tablist" with aria-selected on the active tab. Each tab targets its grid via aria-controls.
Every cell button carries an aria-label of the form "Shirt predicted as T-shirt: 12" and an aria-pressed flag while selected — so colour and intensity are never the only signal.
A polite live region announces the active model, accuracy, and selected cell breakdown whenever either changes.
All interactive elements show a :focus-visible ring in --cb-accent; cell hit areas meet the 24px minimum target size, and the model-tab targets exceed 32×32.
Motion respects prefers-reduced-motion: reduce — every entrance, cell hover/tap, and detail-panel transition snaps instantly.
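
As an illustration of the cell labelling described above, the sketch below derives the `aria-label` and `aria-pressed` values for one cell. It assumes rows hold the actual class and columns the predicted class, and that diagonal cells use the same wording; `cellA11yProps` is a hypothetical helper, not the component's internal code.

```typescript
type SelectedCell = [number, number] | null;

// Hypothetical helper: accessibility attributes for a single matrix cell.
// Assumed layout: matrix[actual][predicted].
function cellA11yProps(
  labels: string[],
  matrix: number[][],
  row: number,
  col: number,
  selected: SelectedCell,
): { "aria-label": string; "aria-pressed": boolean } {
  const isSelected =
    selected !== null && selected[0] === row && selected[1] === col;
  return {
    // Text alternative, so colour and intensity are not the only signal.
    "aria-label": `${labels[row]} predicted as ${labels[col]}: ${matrix[row][col]}`,
    "aria-pressed": isSelected,
  };
}

const demoLabels = ["T-shirt", "Shirt"];
const demoMatrix = [
  [90, 10],
  [12, 88],
];
const demoProps = cellA11yProps(demoLabels, demoMatrix, 1, 0, [1, 0]);
// demoProps["aria-label"] is "Shirt predicted as T-shirt: 12"
```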

Credits

Extracted from: craftingattention (app/src/lessons/primitives/viz/FashionMNISTConfusionExplorer.tsx). The source was a two-mode lesson primitive (explore + a predict quiz layered on ModeStrip, ChallengeBtn, FeedbackBadge, ScoreDots, DoneCard) with hardcoded checkpoints and a per-cell narration string. The extract drops the quiz scaffolding and the narration generator entirely, lifts the canonical interactive (model strip + matrix grid + cell detail) into a Radix-style controlled API, and exposes models / labels so consumers can wire real evaluation matrices. The seeded synthetic generator survives as generateConfusion for demo use.