Attention Stepper Viz

A step-through visualisation of transformer attention. The cursor walks the output tokens one at a time; for the live output, curved connection lines reach up to every input token, with opacity and stroke width encoding the attention weight. The top-two attended inputs are tinted with the accent color, and the strongest connections are labelled with their numeric weight.

Where the Attention Heatmap shows the full N×N matrix at a glance, the stepper walks the matrix row by row so learners can see "what is this output token attending to right now?"

Step 1 of 4: output 'A' attending most to input 'the' and 'cat'.

Installation

npx shadcn@latest add https://craftbits.dev/r/attention-stepper-viz.json

Usage

import { AttentionStepperViz } from "@craft-bits/core";
 
const inputTokens = ["the", "cat", "sat"];
const outputTokens = ["A", "fluffy", "cat", "reclines"];
 
const attentionWeights = [
  [0.55, 0.25, 0.20],
  [0.70, 0.18, 0.12],
  [0.15, 0.75, 0.10],
  [0.10, 0.25, 0.65],
];
 
<AttentionStepperViz
  inputTokens={inputTokens}
  outputTokens={outputTokens}
  attentionWeights={attentionWeights}
/>

Drive playback from outside the component, e.g. synced to a scrollytelling step:

import { useState } from "react";

// Inside your component:
const [step, setStep] = useState(0);
 
<AttentionStepperViz
  inputTokens={inputTokens}
  outputTokens={outputTokens}
  attentionWeights={attentionWeights}
  currentStep={step}
  onCurrentStepChange={setStep}
  playing={false}
/>
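For the scrollytelling case, the caller needs some way to turn scroll position into a step index. The sketch below is one possible mapping, assuming you already have a scroll progress value between 0 and 1; `progressToStep` is a hypothetical helper and not part of the library.

```typescript
// Hypothetical helper (not part of the library): map a scroll progress
// value in [0, 1] to a step index in [0, stepCount - 1].
function progressToStep(progress: number, stepCount: number): number {
  if (stepCount <= 0 || !Number.isFinite(progress)) return 0;
  // Clamp so overscroll in either direction stays on a valid step.
  const clamped = Math.min(1, Math.max(0, progress));
  // Split [0, 1] into stepCount equal bands; progress === 1 lands in the last band.
  return Math.min(stepCount - 1, Math.floor(clamped * stepCount));
}
```

The result can be passed to `setStep`, so that `currentStep` follows the reader's scroll position while `playing` stays `false`.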

Understanding the component

  1. Two rows, one cursor. Inputs sit across the top, outputs across the bottom. A single currentStep cursor picks which output is "live" — its chip is filled with --cb-accent, every other output dims.
  2. Curves carry weight. Each connection is a cubic curve from the current output up to one input. Its opacity is 0.08 + weight * 0.85 (so even tiny weights are faintly visible) and its stroke width interpolates between the thin and bold tiers from SVG_TOKENS.edge. The top-two attended inputs draw in the accent color; the rest in muted foreground.
  3. Top-k labels. Up to three numeric weight badges float at the midpoint of the strongest connections so the raw matrix value is readable inside the figure.
  4. Autoplay with cleanup. When playing is true, a setInterval advances the cursor every playSpeed milliseconds and wraps at the end. The effect cleans up on unmount and on cursor changes.
  5. SPRINGS.smooth for lines, SPRINGS.snap for chips. Line opacity transitions feel scroll-like; chip highlight flips are crisp.
  6. Reduced-motion fallback. With prefers-reduced-motion: reduce, all transitions collapse to duration: 0 and autoplay is forced off.
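The arithmetic behind points 2 to 4 can be sketched as a few pure functions. This is an illustration, not the component's source: only the opacity formula is quoted from the text above. The function names, the linear stroke interpolation, and the default thin/bold widths are assumptions standing in for whatever `SVG_TOKENS.edge` actually defines.

```typescript
// Opacity rule quoted in the text: 0.08 + weight * 0.85.
// The floor of 0.08 keeps near-zero weights faintly visible.
function edgeOpacity(weight: number): number {
  const w = Math.min(1, Math.max(0, weight));
  return 0.08 + w * 0.85;
}

// Assumed linear interpolation between a thin and a bold stroke width.
// The defaults (1 and 3) are placeholders, not the real SVG_TOKENS.edge values.
function edgeStrokeWidth(weight: number, thin = 1, bold = 3): number {
  const w = Math.min(1, Math.max(0, weight));
  return thin + (bold - thin) * w;
}

// Indices of the k largest weights in one matrix row, strongest first.
// Ties resolve to the earlier input so the result is deterministic.
function topKIndices(row: readonly number[], k: number): number[] {
  return row
    .map((weight, index) => ({ weight, index }))
    .sort((a, b) => b.weight - a.weight || a.index - b.index)
    .slice(0, Math.max(0, k))
    .map((entry) => entry.index);
}

// Autoplay tick: advance by one and wrap back to 0 after the last output.
function nextStep(current: number, stepCount: number): number {
  return stepCount > 0 ? (current + 1) % stepCount : 0;
}
```

For the first row of the usage example, `topKIndices([0.55, 0.25, 0.20], 2)` selects inputs 0 and 1 ("the" and "cat"), which are the two that draw in the accent color.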

Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| inputTokens | readonly string[] | - | Tokens drawn on the top row. Required. |
| outputTokens | readonly string[] | - | Tokens drawn on the bottom row. Required. |
| attentionWeights | readonly (readonly number[])[] | - | outputTokens.length × inputTokens.length matrix in [0, 1]. Required. |
| currentStep | number | - | Controlled output cursor. Pair with onCurrentStepChange. |
| defaultCurrentStep | number | 0 | Uncontrolled initial cursor. |
| onCurrentStepChange | (step) => void | - | Fires on tick and on manual scrub. |
| playing | boolean | - | Controlled play state. Pair with onPlayingChange. |
| defaultPlaying | boolean | true | Uncontrolled initial play state. |
| onPlayingChange | (playing) => void | - | Fires when play / pause flips. |
| playSpeed | number | 800 | Milliseconds between step advances. |
| transition | Transition | SPRINGS.smooth | Spring for connection-line transitions. |
| className | string | - | Merged onto the root <div> via cn(). |
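Because `attentionWeights` must line up with both token arrays, it can help to check the matrix before handing it to the component. The following is a minimal sketch of such a check; `validateAttentionWeights` is a hypothetical helper, not something the library is known to export.

```typescript
// Hypothetical pre-render check (not part of the library): returns a list of
// human-readable problems, or an empty array when the matrix is well-formed.
function validateAttentionWeights(
  inputTokens: readonly string[],
  outputTokens: readonly string[],
  weights: readonly (readonly number[])[],
): string[] {
  const problems: string[] = [];
  // One row per output token.
  if (weights.length !== outputTokens.length) {
    problems.push(`expected ${outputTokens.length} rows, got ${weights.length}`);
  }
  weights.forEach((row, r) => {
    // One column per input token.
    if (row.length !== inputTokens.length) {
      problems.push(`row ${r}: expected ${inputTokens.length} columns, got ${row.length}`);
    }
    // Every weight must be a finite number in [0, 1].
    row.forEach((w, c) => {
      if (!Number.isFinite(w) || w < 0 || w > 1) {
        problems.push(`row ${r}, column ${c}: ${w} is outside [0, 1]`);
      }
    });
  });
  return problems;
}
```

The check only enforces the documented shape and range. It does not require each row to sum to 1, since the props table does not state that constraint.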

Accessibility

  • The figure is role="figure" with aria-labelledby pointing at the heading and an aria-live="polite" summary that announces the current output token and its top-two attended inputs as the cursor advances.
  • The play / pause button uses aria-pressed; previous / next buttons carry explicit aria-label values.
  • The scrubber is a native <input type="range"> with aria-label="Output token cursor", so keyboard arrows nudge the step and screen readers narrate the value.
  • Color is never the only signal — top-k connections carry numeric weight badges and the current output uses both color and bold weight.
  • prefers-reduced-motion: reduce collapses every transition to an instant swap and disables autoplay.
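The live-region summary described in the first bullet can be derived directly from the current matrix row. The sketch below shows one way to build such a string, modelled on the demo caption near the top of this page; `describeStep` is a hypothetical name and the component's actual wording may differ.

```typescript
// Hypothetical helper (not part of the library): build the text that an
// aria-live region could announce for the current step.
function describeStep(
  step: number,
  inputTokens: readonly string[],
  outputTokens: readonly string[],
  weights: readonly (readonly number[])[],
): string {
  const row = weights[step] ?? [];
  // Pick the two most attended inputs, strongest first; ties keep input order.
  const topTwo = row
    .map((weight, index) => ({ weight, index }))
    .sort((a, b) => b.weight - a.weight || a.index - b.index)
    .slice(0, 2)
    .map((entry) => `'${inputTokens[entry.index]}'`);
  return (
    `Step ${step + 1} of ${outputTokens.length}: ` +
    `output '${outputTokens[step]}' attending most to input ${topTwo.join(" and ")}.`
  );
}
```

Naming the tokens in text gives screen-reader users the same information that sighted users get from the accent-colored curves.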

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/viz/AttentionStepperViz.tsx). The original was a 4-stage softmax-pipeline narration tied to a specific lesson; the library extract is the row-by-row stepper primitive — the caller supplies the input / output tokens and the precomputed attention matrix.