Weight Decay Toggle
A pocket-sized AdamW demonstrator. Four parameter bars sit on a baseline. The learner taps Step to advance one training step, then flips Weight decay on or off to see the difference. Under decay, every weight is multiplied by (1 − lr·λ) after the gradient step — a shrinkage that pulls each bar a tiny fraction closer to zero, regardless of which direction the loss gradient is pointing. The narration walks through four phases (observe, growing, toggle, insight) as the learner accumulates steps.
The teaching point sits in the insight phase: weight decay is a regulariser, not an optimiser. It doesn't care about the loss; it just keeps the weights small. AdamW applies the shrinkage after the Adam step, directly on the weights. That bypasses the per-parameter adaptive scaling which, in plain Adam, rescales an L2 penalty along with the loss gradient and so weakens it for parameters with a history of large gradients.
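To make the contrast concrete, the two update rules can be written side by side. This is a sketch in standard Adam notation, which the page itself does not spell out: the bias-corrected moment estimates m̂ and v̂ and the stabiliser ε are assumptions brought in from the usual formulation, and any learning-rate schedule is omitted.

```latex
% Plain Adam with an L2 penalty: the decay term is folded into the gradient,
% so it is divided by the adaptive denominator together with the loss gradient.
\[
g_t = \nabla L(w_{t-1}) + \lambda\, w_{t-1}, \qquad
w_t = w_{t-1} - \mathrm{lr}\,\frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon}
\]

% Decoupled decay: the Adam step sees the loss gradient alone, and the
% shrinkage is applied to the weights afterwards, untouched by \hat v_t.
\[
g_t = \nabla L(w_{t-1}), \qquad
\tilde w_t = w_{t-1} - \mathrm{lr}\,\frac{\hat m_t}{\sqrt{\hat v_t} + \epsilon}, \qquad
w_t = (1 - \mathrm{lr}\cdot\lambda)\,\tilde w_t
\]
```

In the first form, a parameter with a large v̂ sees its effective decay shrink; in the second, every weight is multiplied by the same factor (1 − lr·λ).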
Four parameters training. Without weight decay, they grow as large as the loss landscape demands.
Installation
npx shadcn@latest add https://craftbits.dev/r/weight-decay-toggle.json
Usage
import { WeightDecayToggle } from "@craft-bits/viz/weight-decay-toggle";
<WeightDecayToggle />
Override the starting weights so a specific bar dominates:
<WeightDecayToggle defaultWeights={[1, 3, 9, 2]} />
Crank λ to make the decay visible after a single step:
<WeightDecayToggle lambda={5} />
Subscribe to step events for an external trace:
<WeightDecayToggle
  onStep={(s) => {
    /* read s.weights, s.loss, s.step, s.decayOn */
  }}
/>
Understanding the component
- Four bars. A 480 × 320 SVG plots `|w|` against bar index. Each bar's height is `(|w| / 10) × plot_height`, so bars rise straight from the baseline.
- Step. The action button runs one gradient-descent step on every parameter, `w ← w − lr × ∂L/∂w`, using constant simulated gradients so the magnitude of motion is identical on every step.
- Decay overlay. When weight decay is on, the step also multiplies each weight by `(1 − lr·λ)`. The reduction is small per step but accumulates: the dashed ghost bar inside each rect shows the post-decay target, and a dashed arrow under the baseline marks the direction of the pull.
- Phase machine. `observe` while idle. `growing` after a handful of un-decayed steps. `toggle` the first time the learner activates decay. `insight` after enough decayed steps for the conceptual punchline to land.
- Arithmetic annotation. Whenever decay is on, the largest-magnitude weight shows its own decay arithmetic so the maths stays visible instead of hiding behind the animation.
- Imperative animation. Bars and the loss readout animate via `motion`'s `animate()` driving raw SVG attributes, with no re-render per frame.
- Reduced motion. Under `prefers-reduced-motion: reduce`, every animation snaps to its end state.
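The step, bar-height, and phase rules in the list above can be sketched in a few lines. This is an illustration, not the component's source: `stepWeights`, `barHeight`, and `phaseFor` are hypothetical names, the default numbers are copied from the Props table, and the exact threshold comparisons (`>=`) and phase precedence are assumptions.

```typescript
// Hypothetical helpers illustrating the rules described above.
interface StepConfig {
  learningRate: number;
  lambda: number;
  gradients: readonly number[];
}

// Defaults copied from the Props table.
const DEFAULTS: StepConfig = {
  learningRate: 0.001,
  lambda: 0.1,
  gradients: [0.3, -0.2, 0.15, -0.35],
};

/** One step: gradient descent first, then the optional multiplicative shrinkage. */
function stepWeights(
  weights: readonly number[],
  decayOn: boolean,
  cfg: StepConfig = DEFAULTS,
): number[] {
  const shrink = decayOn ? 1 - cfg.learningRate * cfg.lambda : 1;
  return weights.map((w, i) => (w - cfg.learningRate * (cfg.gradients[i] ?? 0)) * shrink);
}

/** Bar height in pixels: (|w| / 10) × plotHeight. */
function barHeight(w: number, plotHeight: number): number {
  return (Math.abs(w) / 10) * plotHeight;
}

type Phase = "observe" | "growing" | "toggle" | "insight";

interface PhaseInput {
  plainSteps: number;    // steps taken with decay off
  decayedSteps: number;  // steps taken with decay on
  decayEverOn: boolean;  // has the learner activated decay at least once?
}

/** Assumed precedence: insight > toggle > growing > observe. */
function phaseFor(
  s: PhaseInput,
  growingAfterSteps = 5,
  insightAfterDecayedSteps = 10,
): Phase {
  if (s.decayedSteps >= insightAfterDecayedSteps) return "insight";
  if (s.decayEverOn) return "toggle";
  if (s.plainSteps >= growingAfterSteps) return "growing";
  return "observe";
}
```

With the default props, one decayed step moves the third weight from 8 to roughly 7.99905: the gradient step subtracts 0.001 × 0.15, and the shrinkage then multiplies by 0.9999.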
Props
| Prop | Type | Default | Description |
|---|---|---|---|
| `defaultWeights` | `readonly number[]` | `[2, 5, 8, 3]` | Starting magnitudes. Reset returns here. |
| `paramNames` | `readonly string[]` | `["w₁", "w₂", "w₃", "w₄"]` | Display labels under each bar. |
| `learningRate` | `number` | `0.001` | The lr term in the update. |
| `lambda` | `number` | `0.1` | Weight-decay coefficient λ. Per-step shrinkage is (1 − lr·λ). |
| `gradients` | `readonly number[]` | `[0.3, -0.2, 0.15, -0.35]` | Simulated ∂L/∂wᵢ. Kept constant for teaching clarity. |
| `insightAfterDecayedSteps` | `number` | `10` | Steps after which the narration flips to insight. |
| `growingAfterSteps` | `number` | `5` | Un-decayed steps after which the narration flips to growing. |
| `transition` | `Transition` | `SPRINGS.snap` | Override the per-step bar transition. |
| `onStep` | `(step) => void` | - | Fires after each step. |
| `onDecayToggle` | `(on) => void` | - | Fires when the user toggles weight decay. |
| `onReset` | `() => void` | - | Fires when the user clicks Reset. |
| `className` | `string` | - | Merged onto the root via cn(). |
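As one way to use `onStep`, the sketch below collects each step into an in-memory trace that can be exported as CSV. The `StepSnapshot` shape is an assumption inferred from the fields named in the Usage comment (`weights`, `loss`, `step`, `decayOn`); the component's actual payload type may differ, and `createTrace` is a hypothetical helper.

```typescript
// Assumed payload shape, inferred from the fields named in the Usage section.
interface StepSnapshot {
  weights: readonly number[];
  loss: number;
  step: number;
  decayOn: boolean;
}

/** Hypothetical trace collector; `record` can be passed directly as the onStep handler. */
function createTrace() {
  const rows: StepSnapshot[] = [];

  // Copy the weights so later mutation by the caller cannot alter the trace.
  const record = (s: StepSnapshot): void => {
    rows.push({ ...s, weights: [...s.weights] });
  };

  // One header line, then one line per recorded step.
  const toCsv = (): string => {
    const width = rows[0]?.weights.length ?? 0;
    const header = [
      "step",
      "decayOn",
      "loss",
      ...Array.from({ length: width }, (_, i) => `w${i + 1}`),
    ];
    const lines = rows.map((s) => [s.step, s.decayOn, s.loss, ...s.weights].join(","));
    return [header.join(","), ...lines].join("\n");
  };

  return { rows, record, toCsv };
}
```

Wiring it up would look like `<WeightDecayToggle onStep={trace.record} />`, with `trace.toCsv()` called whenever the log is needed.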
Accessibility
- The plot SVG is `role="img"` with an `aria-label` summarising the parameter count, step, decay state, and loss.
- The decay toggle reports state via `aria-pressed`, so screen readers announce the change immediately.
- A live region (`aria-live="polite"`) below the buttons announces the step number, each parameter's magnitude, the loss, and whether decay is on.
- The narration paragraph is also `aria-live="polite"` and reads as plain prose; it is the canonical explanation for each phase.
- Colour is never the only signal: the decay state shows as text inside its button, in the badge, in the narration, and in the live region.
- Motion respects `prefers-reduced-motion: reduce`: bars, labels, the loss readout, and reset all snap to their end states.
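Two of the behaviours above lend themselves to a short sketch: detecting the reduced-motion preference and composing the live-region text. Both helpers are hypothetical, and the message wording and number formatting are assumptions rather than the component's actual strings. The only platform call used is `matchMedia` with the `(prefers-reduced-motion: reduce)` query.

```typescript
// Minimal view of the environment so the check can also run outside a browser.
interface MotionEnv {
  matchMedia?: (query: string) => { matches: boolean };
}

/** True when the user asks for reduced motion; false where matchMedia is unavailable. */
function prefersReducedMotion(env: MotionEnv = globalThis as unknown as MotionEnv): boolean {
  return typeof env.matchMedia === "function"
    ? env.matchMedia("(prefers-reduced-motion: reduce)").matches
    : false;
}

/** Snap to the end state (zero duration) under reduced motion. */
function animationDurationMs(reduced: boolean, normalMs: number): number {
  return reduced ? 0 : normalMs;
}

interface Announced {
  step: number;
  weights: readonly number[];
  loss: number;
  decayOn: boolean;
}

/** Hypothetical wording for the polite live region: step, magnitudes, loss, decay state. */
function liveRegionMessage(s: Announced, names: readonly string[]): string {
  const params = s.weights
    .map((w, i) => `${names[i] ?? `w${i + 1}`} ${Math.abs(w).toFixed(2)}`)
    .join(", ");
  const decay = s.decayOn ? "on" : "off";
  return `Step ${s.step}. ${params}. Loss ${s.loss.toFixed(3)}. Weight decay ${decay}.`;
}
```

Stating the decay state as a word in the message keeps it available to readers who cannot rely on colour.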
Credits
- Extracted from: `craftingattention` (`app/src/lessons/primitives/math/WeightDecayToggle.tsx`). The source pulled `SvgLabel` and `ChallengeBtn` from the lesson chrome, ran on the per-track lesson palette tokens, and inlined ad-hoc spring names into the imperative animations. The viz extract drops the lesson chrome, remaps every colour to `var(--cb-*)` semantic tokens so consumer themes repaint freely, re-keys the bar transition to the canonical `SPRINGS.snap`, and exposes the previously hard-coded weights, gradients, learning rate, and λ as props.