L2 Weight Decay Viz

A teaching visualisation for the L2 regularisation term that most modern optimisers offer. With the penalty written as (λ/2)·‖w‖², its gradient is λ·w, so each gradient-descent step adds −λ·η·w to the update and multiplies every weight by the shrink factor (1 − λ·η). Because the cut is proportional to each weight, the same fractional cut means a much larger absolute hit for big weights, while small ones barely notice.

w_t = w_0 · (1 - λ·η)^t
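
For readers who want the algebra behind the formula above, here is a short derivation. It is a sketch under two assumptions not spelled out in the text: plain gradient descent acting on the penalty term alone (no data loss), and the common convention of a 1/2 factor in the penalty so that the gradient is exactly λ·w.

```latex
\begin{aligned}
R(w)    &= \tfrac{\lambda}{2}\,\lVert w \rVert^2,
\qquad \nabla R(w) = \lambda\, w \\
w_{t+1} &= w_t - \eta\,\nabla R(w_t) = (1 - \lambda\eta)\, w_t \\
w_t     &= w_0\,(1 - \lambda\eta)^t
\end{aligned}
```

If the penalty were written without the 1/2, the per-step factor would become (1 − 2·λ·η); the shape of the decay is the same, only the effective λ doubles.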

The component plots the weight vector as a signed histogram (positive bars in cb-accent, negative bars in cb-warning) with dashed ghost outlines showing the original magnitudes. A λ slider and a step scrubber drive a closed-form decay, so the bars at any cursor position are a pure function of the inputs: the output is safe for SSR and hydration, and the cursor can jump to any step instantly.

Installation

npx shadcn@latest add https://craftbits.dev/r/l2-weight-decay-viz.json

Usage

import { L2WeightDecayViz } from "@craft-bits/core";
 
<L2WeightDecayViz
  weights={[-3.0, 2.5, -1.0, 0.5, 4.0, -2.0, 1.5, -0.3]}
  defaultLambda={0.05}
  learningRate={0.1}
/>

Drive the cursor externally from a narration step and run an autoplay loop:

import { useState } from "react";
 
const [step, setStep] = useState(0);
 
<L2WeightDecayViz
  weights={initialWeights}
  lambda={0.1}
  learningRate={0.1}
  currentStep={step}
  onCurrentStepChange={setStep}
  playing
  playSpeed={180}
/>

Hide the readout band and ghost outlines for a stripped-down figure:

<L2WeightDecayViz
  weights={initialWeights}
  defaultLambda={0.02}
  showReadout={false}
  showGhostOriginal={false}
/>

Anatomy

Multiplicative shrinkage, not subtraction. Every step applies w[i] *= (1 - λ·η). The factor is global, so each weight loses the same fraction per step. But a 10% cut to 4.0 removes 0.40; a 10% cut to 0.3 removes only 0.03. Big bars melt; small bars barely shift — exactly what L2 regularisation looks like.
Closed-form, not iterative. The current weight vector is computed directly as w_0 · (1 - λ·η)^t. Scrubbing the step slider doesn't replay an integration loop; it jumps to the analytic answer. The same (weights, lambda, learningRate, step) quadruple always produces the same bars, so SSR output and external scrubbers stay aligned.
Sign is preserved. As long as λ·η stays in (0, 1), the shrink factor stays positive, so multiplying by it never flips a sign — a property the bar colours (accent for w >= 0, warning for w < 0) make visible. If λ·η >= 1 the component clamps the factor at zero rather than flipping sign, because past that point the gradient-descent step is bigger than the weight itself and the L2 story breaks.
Asymptotic, not finite. Because each step multiplies by a fixed factor between 0 and 1, the weights shrink by the same fraction again and again: always closer to zero, never exactly there. The ~0 label appears once a bar is within 0.01 of zero, so the chart reads cleanly without ever claiming a weight "reached zero."
Ghost outlines anchor the eye. Dashed rectangles in cb-border-strong mark w_0[i] at low opacity — the learner can compare current vs. original without doing arithmetic. Toggle off via showGhostOriginal={false} for a still-frame.
Controlled or uncontrolled everywhere. lambda and currentStep each have controlled (value + on*Change) and uncontrolled (default*) forms (the Radix pattern). playing and playSpeed are simple props, so consumers own the transport; to keep the primitive small, there is no built-in play button.
Reduced motion. prefers-reduced-motion: reduce collapses every bar spring to an instant swap and disables autoplay; the slider still scrubs.
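
The closed-form, sign-preserving, and near-zero behaviour described in the list above can be sketched in a few lines. This is a minimal illustration, not the library's source: the helper names (shrinkFactor, decayedWeights, barLabel), the NEAR_ZERO constant, and the two-decimal label format are assumptions made for the example.

```typescript
// Hypothetical helpers; names and formatting are illustrative assumptions,
// not the component's internals.
const NEAR_ZERO = 0.01; // magnitude below which a bar is labelled "~0"

/** Per-step shrink factor (1 - λ·η), clamped at 0 so a sign can never flip. */
function shrinkFactor(lambda: number, learningRate: number): number {
  return Math.max(0, 1 - lambda * learningRate);
}

/** Closed-form weights at a given step: w_0 · (1 - λ·η)^t, with no replayed loop. */
function decayedWeights(
  w0: readonly number[],
  lambda: number,
  learningRate: number,
  step: number,
): number[] {
  const scale = Math.pow(shrinkFactor(lambda, learningRate), step);
  return w0.map((w) => w * scale);
}

/** Bar label: "~0" once the magnitude is within NEAR_ZERO of zero. */
function barLabel(w: number): string {
  return Math.abs(w) < NEAR_ZERO ? "~0" : w.toFixed(2);
}
```

Because decayedWeights depends only on its four arguments, calling it on the server and on the client with the same inputs gives identical bars, which is the property the closed-form design is after.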

Props

Prop	Type	Default	Description
`weights`	`readonly number[]`	—	Initial weight vector. Non-finite entries are dropped.
`lambda`	`number`	—	Controlled `λ`. Pair with `onLambdaChange`.
`defaultLambda`	`number`	`0.01`	Uncontrolled initial `λ`.
`onLambdaChange`	`(lambda) => void`	—	Fires when the slider moves.
`learningRate`	`number`	`0.1`	Gradient-descent step size `η`.
`currentStep`	`number`	—	Controlled cursor step. Pair with `onCurrentStepChange`.
`defaultCurrentStep`	`number`	`0`	Uncontrolled initial cursor.
`onCurrentStepChange`	`(step) => void`	—	Fires on autoplay tick and scrub.
`playing`	`boolean`	`false`	Whether autoplay is running.
`playSpeed`	`number`	`220`	Milliseconds per autoplay tick.
`maxStep`	`number`	`60`	Maximum cursor step.
`showReadout`	`boolean`	`true`	Show the `Σw²` / shrink / peak readout band.
`showGhostOriginal`	`boolean`	`true`	Dashed outlines of the original weights.
`transition`	`Transition`	`SPRINGS.smooth`	Spring used for bar transitions.
`className`	`string`	—	Merged onto the root via `cn()`.

Accessibility

The outer element is role="figure" with a hidden title and an aria-live="polite" summary — screen readers hear Step X of N. Lambda …, learning rate …. Shrink factor …. Penalty Σw² …. Peak |w| … whenever the cursor or λ changes.
Positive bars are cb-accent, negative bars are cb-warning, ghost outlines are dashed cb-border-strong — three distinct shape / colour signals.
Both range inputs carry an explicit aria-label and a visible value readout — arrow keys scrub with screen-reader narration.
The zero baseline is rendered as a thicker cb-border-strong line to distinguish it from grid ticks.
prefers-reduced-motion: reduce collapses every spring to an instant swap and disables autoplay; manual scrubbing still works.
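
As a rough illustration of how the live-region summary in the first item above could be assembled, the sketch below computes the three readout values and formats them into one sentence. The function names, the Readout shape, and the number of decimal places are assumptions for the example; only the wording pattern (Step X of N, Lambda, learning rate, Shrink factor, Penalty Σw², Peak |w|) comes from the description above.

```typescript
// Hypothetical readout helpers; the library's actual formatting may differ.
interface Readout {
  shrinkFactor: number; // per-step factor (1 - λ·η), clamped at 0
  penalty: number;      // Σw² over the current (decayed) weights
  peak: number;         // largest |w| among the current weights
}

function computeReadout(
  weights: readonly number[],
  lambda: number,
  learningRate: number,
  step: number,
): Readout {
  // Non-finite entries are dropped, as the props table describes.
  const finite = weights.filter((w) => Number.isFinite(w));
  const shrinkFactor = Math.max(0, 1 - lambda * learningRate);
  const scale = Math.pow(shrinkFactor, step);
  const current = finite.map((w) => w * scale);
  const penalty = current.reduce((sum, w) => sum + w * w, 0);
  const peak = current.reduce((max, w) => Math.max(max, Math.abs(w)), 0);
  return { shrinkFactor, penalty, peak };
}

/** Builds the sentence a screen reader would hear (decimal places are assumed). */
function summarise(
  r: Readout,
  lambda: number,
  learningRate: number,
  step: number,
  maxStep: number,
): string {
  return (
    `Step ${step} of ${maxStep}. ` +
    `Lambda ${lambda.toFixed(3)}, learning rate ${learningRate.toFixed(2)}. ` +
    `Shrink factor ${r.shrinkFactor.toFixed(3)}. ` +
    `Penalty Σw² ${r.penalty.toFixed(2)}. ` +
    `Peak |w| ${r.peak.toFixed(2)}.`
  );
}
```

Keeping the summary a pure function of the same inputs as the bars means the announced values cannot drift from what is drawn.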

Credits

Extracted from: craftingattention (app/src/lessons/primitives/viz/L2WeightDecayViz.tsx). The source was a Widget-chrome lesson primitive with useWidgetHistory (undo / redo, bookmarks), a four-bookmark preset row (no-reg / light / strong / 20-steps), a ModeStrip toggle between explore and a five-round binary usePredictRounds quiz, a heuristic narration block, a setInterval auto-runner with stop-when-converged logic, and a custom BAR_SPRING inline transition. The library version drops the widget chrome, the history / bookmarks, the predict mode, the narration heuristics, and the inline spring. What remains is the underlying primitive every regularisation lesson needs: a signed histogram of w_0 with a closed-form (1 - λ·η)^t shrink, controlled / uncontrolled lambda and currentStep (Radix pattern), a consumer-owned transport via playing and playSpeed, and a SPRINGS.smooth bar transition with an honest prefers-reduced-motion snap. Sits in ML Viz → Regularization alongside OverfittingGapViz, RunningStatsViz, and VarianceCompoundViz.