Online Softmax Stepper

Steps through the online (streaming) softmax, the recurrence Flash Attention uses to compute attention tile by tile without ever materialising the full score matrix. A numerically stable softmax normally has to see every logit before it can normalise any of them: one pass finds the maximum, and only then can the shifted exponentials be summed and divided. Online softmax processes one logit at a time, keeping a running maximum m and a running normaliser l, and rescaling the accumulated sum by exp(oldM - newM) whenever a fresh maximum arrives.
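To make the contrast concrete, here is a minimal sketch of both variants. The function names softmaxTwoPass and softmaxOnline are illustrative and not part of the library, and the initial state (m = -Infinity, l = 0) is an assumption about how the recurrence is seeded.

```typescript
// Reference: conventional stable softmax, which needs the global max up front.
function softmaxTwoPass(xs: readonly number[]): number[] {
  const m = Math.max(...xs); // first pass: global maximum
  const exps = xs.map((x) => Math.exp(x - m)); // second pass: shifted exponentials
  const l = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / l);
}

// Streaming variant: one logit at a time, carrying only (m, l).
function softmaxOnline(xs: readonly number[]): number[] {
  let m = -Infinity; // running maximum (assumed initial state)
  let l = 0; // running normaliser (assumed initial state)
  for (const x of xs) {
    const mNew = Math.max(m, x);
    // Rescale the old sum to the new maximum, then add the new term.
    l = l * Math.exp(m - mNew) + Math.exp(x - mNew);
    m = mNew;
  }
  // Probabilities are recovered from the final (m, l).
  return xs.map((x) => Math.exp(x - m) / l);
}
```

Both functions should agree to floating-point precision. With logits such as [1000, 999, 1001], a naive Math.exp(1000) evaluates to Infinity in double precision, whereas the shifted exponentials in either variant stay finite.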

[Interactive demo: online softmax stepper over the values 2.0, 1.0, 3.0, 0.0, 1.0 (indices 0 to 4). Vertical axis: probability, 0% to 100%. Starts idle at step 0 of 5.]

Installation

npx shadcn@latest add https://craftbits.dev/r/online-softmax-stepper.json

Usage

import { OnlineSoftmaxStepper } from "@craft-bits/core";
 
<OnlineSoftmaxStepper values={[2, 1, 3, 0, 1]} />

Drive playback from outside (parent scrubber, animation, narration sync):

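import { useState } from "react";

// Illustrative wrapper (the name is hypothetical): hooks must run inside a component.
function SteppedDemo() {
  // The parent owns the step, so a scrubber or narration track can set it.
  const [step, setStep] = useState(0);
  return (
    <OnlineSoftmaxStepper
      values={[2, 1, 3, 0, 1]}
      currentStep={step}
      onCurrentStepChange={setStep}
      playing
      playSpeed={600}
    />
  );
}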

Understanding the component

  1. Precomputed snapshots. When values changes, the entire walk is recomputed inside useMemo. Each snapshot carries m, l, the per-index probability vector so far, plus a rescaled flag and the rescale factor exp(oldM - newM). Scrubbing or stepping is O(1).
  2. The update rule. On each new value v at index k: m_new = max(m_old, v), r = exp(m_old - m_new) (= 1 when no new max), l_new = l_old * r + exp(v - m_new). Recovered probabilities are p_j = exp(x_j - m_new) / l_new for j ≤ k.
  3. Why rescaling matters. When a fresh maximum arrives, every previously accumulated term is stale, because it was computed relative to the old maximum. Since exp(x_j - m_new) = exp(x_j - m_old) * exp(m_old - m_new), multiplying the running sum by the single factor exp(oldM - newM) corrects all of those terms at once, without touching them individually. This is the trick that turns softmax into a streamable, tile-friendly operation in Flash Attention.
  4. Numerical stability. Every exp(...) sees a non-positive argument (x - m_new ≤ 0 for each term, m_old - m_new ≤ 0 for the rescale factor), so each exponential is at most 1 and never overflows, no matter how large the input logits get. This is the standard "subtract the max" softmax trick, maintained incrementally.
  5. Spring transitions on bar heights. Bar y and height animate with SPRINGS.smooth; the state readout fades with SPRINGS.snap. prefers-reduced-motion: reduce collapses everything to instant, suppresses autoplay, and parks the timeline at the final snapshot on mount.
  6. Controlled + uncontrolled. Both currentStep and playing accept controlled props paired with onCurrentStepChange / onPlayingChange, or the matching defaultCurrentStep / defaultPlaying uncontrolled variants.
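The snapshot precomputation and update rule described in items 1 and 2 can be sketched as follows. This is not the library's source: the Snapshot field names, the buildSnapshots function, the initial state (m = -Infinity, l = 0), and the choice not to flag the very first value as a rescale are all assumptions made for illustration.

```typescript
// Hypothetical snapshot shape; field names are illustrative.
interface Snapshot {
  step: number; // how many values have been consumed (0..values.length)
  m: number; // running maximum
  l: number; // running normaliser
  rescaled: boolean; // true when this step raised the maximum
  rescaleFactor: number; // exp(oldM - newM); 1 when there is no new max
  probs: number[]; // p_j for every index j consumed so far
}

function buildSnapshots(values: readonly number[]): Snapshot[] {
  let m = -Infinity;
  let l = 0;
  const snapshots: Snapshot[] = [
    { step: 0, m, l, rescaled: false, rescaleFactor: 1, probs: [] },
  ];
  values.forEach((v, k) => {
    const mNew = Math.max(m, v);
    // Assumption: the first value has nothing accumulated, so it is not a rescale.
    const rescaled = k > 0 && mNew > m;
    const r = rescaled ? Math.exp(m - mNew) : 1;
    l = l * r + Math.exp(v - mNew);
    m = mNew;
    // Recover p_j = exp(x_j - m) / l for every index seen so far.
    const probs = values.slice(0, k + 1).map((x) => Math.exp(x - m) / l);
    snapshots.push({ step: k + 1, m, l, rescaled, rescaleFactor: r, probs });
  });
  return snapshots;
}
```

For values [2, 1, 3, 0, 1], only step 3 (the value 3) raises the maximum, from 2 to 3, so it is the only snapshot flagged as rescaled, with a factor of exp(-1) ≈ 0.368. The final normaliser is approximately 1.688. Once the array is built, moving to any step is a single index lookup, which is what makes scrubbing cheap.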

Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| values | readonly number[] | - | Input logits processed one at a time. |
| labels | readonly string[] | indices | Bar labels; falls back to the index when missing. |
| currentStep | number | - | Controlled active step (0..values.length). |
| defaultCurrentStep | number | 0 | Uncontrolled initial step. |
| onCurrentStepChange | (step: number) => void | - | Fires whenever the active step changes. |
| playing | boolean | - | Controlled play state. |
| defaultPlaying | boolean | false | Uncontrolled initial play state. |
| onPlayingChange | (playing: boolean) => void | - | Fires when play / pause flips. |
| playSpeed | number | 600 | Milliseconds between auto-played steps. |
| showState | boolean | true | Render the running max / sum / rescale readout panel. |
| transition | Transition | SPRINGS.smooth | Override the spring used for bar transitions. |
| className | string | - | Merged onto the root <div> via cn(). |
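For readers who prefer types to tables, the props listed above might be written roughly as the interface below. This is a reconstruction from the documentation, not the library's actual declaration: treating values as the only required prop is an assumption, the Transition alias is a placeholder for whatever the animation library exports, and resolveLabels and clampStep are hypothetical helpers that only illustrate the documented label fallback and the 0..values.length step range.

```typescript
// Placeholder for the animation library's transition type (assumption).
type Transition = Record<string, unknown>;

// Prop shape reconstructed from the documentation; not copied from the library.
interface OnlineSoftmaxStepperProps {
  values: readonly number[];
  labels?: readonly string[];
  currentStep?: number;
  defaultCurrentStep?: number;
  onCurrentStepChange?: (step: number) => void;
  playing?: boolean;
  defaultPlaying?: boolean;
  onPlayingChange?: (playing: boolean) => void;
  playSpeed?: number;
  showState?: boolean;
  transition?: Transition;
  className?: string;
}

// Hypothetical helper: bar labels fall back to the index when missing.
function resolveLabels(
  values: readonly number[],
  labels?: readonly string[],
): string[] {
  return values.map((_, i) => labels?.[i] ?? String(i));
}

// Hypothetical helper: keep a requested step inside 0..values.length.
function clampStep(step: number, valueCount: number): number {
  return Math.min(Math.max(Math.trunc(step), 0), valueCount);
}

// Example: a partially labelled five-value input.
const exampleProps: OnlineSoftmaxStepperProps = {
  values: [2, 1, 3, 0, 1],
  labels: ["q·k0", "q·k1"],
};
```

Note that the step range has values.length + 1 positions, because step 0 is the state before any value has been consumed.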

Accessibility

  • The root is role="figure" with an aria-labelledby heading and a visually hidden aria-live="polite" summary — screen readers hear the step count, last value, running max, running sum, and rescale factor on every change.
  • The visualisation is read-only — no draggable bars or pointer-driven controls — so there is no keyboard interaction to enumerate.
  • The fresh-max highlight is reinforced by the state readout text, so users not relying on colour still know when a rescale has just happened.
  • Animation respects prefers-reduced-motion: reduce: springs collapse to instant, autoplay is suppressed, and the component parks at the final snapshot on mount.

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/nn/OnlineSoftmaxStepper.tsx). The source paired the visualisation with a four-phase narration system, a Process-Block / Reset button strip, block grouping (4 blocks of 2 scores), ChallengeBtn / SvgLabel chrome, and inline var(--color-accent-400) references. The library extract is the pure stepthrough primitive — a precomputed online-softmax timeline parameterised by values. Per-element processing replaces block grouping; chrome and narration belong in the consuming lesson.