Gate Debugger
Modern transformer FFN blocks run a GLU variant: GLU(x) = (Wx) · σ(Vx) with an elementwise product, or SwiGLU / GeGLU in the Llama / PaLM / Gemma families. The gate is the σ branch: a vector of values in [0, 1] that decides, per cell, how much signal flows forward. When too many entries saturate at ~0 (always closed) or ~1 (always open), the sigmoid's gradient vanishes for those cells, the gate stops learning, and the model silently loses capacity. GateDebugger makes that pathology visible with a per-cell bar chart, a distribution histogram, and dead-count / mean / entropy readouts.
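To make the formula concrete, here is a minimal sketch of a sigmoid-gated GLU for a single token in plain TypeScript. The helper names (`matVec`, `gluForward`) and the toy weights are assumptions for illustration only; they are not part of `@craft-bits/core`, and a real model would read the gate from its own framework.

```typescript
// Hypothetical helpers for illustration; not part of @craft-bits/core.
type Matrix = readonly (readonly number[])[];

const sigmoid = (z: number): number => 1 / (1 + Math.exp(-z));

// Plain matrix-vector product: one output entry per matrix row.
function matVec(m: Matrix, x: readonly number[]): number[] {
  return m.map((row) => row.reduce((acc, w, i) => acc + w * x[i], 0));
}

// GLU(x) = (Wx) · σ(Vx). The gate is returned alongside the output
// so it can be inspected or passed to a visualiser.
function gluForward(W: Matrix, V: Matrix, x: readonly number[]) {
  const gate = matVec(V, x).map(sigmoid); // each entry lies in (0, 1)
  const output = matVec(W, x).map((v, i) => v * gate[i]);
  return { gate, output };
}

// Toy weights chosen so the three cells land mid-range, near 1, and near 0.
const tokenInput = [1, -2];
const toyW: Matrix = [[1, 0], [0, 1], [1, 1]];
const toyV: Matrix = [[0, 0], [10, 0], [0, 10]];

const gluResult = gluForward(toyW, toyV, tokenInput);
console.log(gluResult.gate);
```

The `gate` array is the kind of vector the component expects as `gateValues`: the first cell sits mid-range, while the other two are saturated and would fall inside the dead band at the default threshold.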
Gate debugger over 50 cells. 10 dead (g < 0.05 or g > 0.95). Mean 0.494. Entropy 3.30 of 4.00 max.
Installation
npx shadcn@latest add https://craftbits.dev/r/gate-debugger.json
Usage
import { GateDebugger } from "@craft-bits/core";
<GateDebugger gateValues={gateActivations} />
Tune the dead-band threshold and hide the histogram for compact embeds:
<GateDebugger
  gateValues={gateActivations}
  deadThreshold={0.02}
  showHistogram={false}
/>
Pass a denser binning for large activation vectors:
<GateDebugger
  gateValues={gateActivations}
  binCount={32}
/>
Understanding the component
- Bar chart, one column per cell. Each column's height is the cell's gate value g ∈ [0, 1]. Healthy cells paint `cb-accent`; cells inside the dead band (`g < deadThreshold` or `g > 1 - deadThreshold`) paint `cb-warning`. The dead band shades the top + bottom strips in muted warning so the threshold is legible without reading numbers.
- Histogram over `binCount` bins. The distribution shape tells the deeper story: a healthy gate spreads across the unit interval (look for a broad, low-skew mound); a dying gate clumps near `0` or `1`. A dashed vertical line marks the running mean.
- Three readouts. dead counts cells in the saturation band (with a percentage of the total). mean is the average gate value; networks that train cleanly hover near `0.5`, and drift towards an extreme is an early warning. entropy is the Shannon entropy of the binned distribution; uniform spread approaches `log₂(binCount)`, full collapse approaches `0`. A thin accent bar across the top of the histogram tracks `H / H_max` as a quick visual proxy.
- Threshold range. `deadThreshold` is clamped to `[0, 0.5]` on read. `0.05` is the production-defensible default: tight enough to ignore noisy near-saturated cells, loose enough to catch the truly dead ones. Push it to `0.01` for stricter audits.
- Pure presentational. No internal state. Re-render with new `gateValues` (pass a layer's σ activation per batch step, or aggregate across a window) and the bars and bins re-spring with `SPRINGS.smooth` from `@craft-bits/core/motion`.
- Clamp on read. Values outside `[0, 1]` are clamped silently so an unnormalised tensor (someone passing logits) still renders something legible instead of crashing the chart layout.
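The readouts described in the list above can be reproduced outside the component. The following is a rough sketch under stated assumptions, not the component's actual source: `summarizeGates` and `GateSummary` are hypothetical names, the binning rule (equal-width bins with `g = 1` falling into the last bin) is an assumption, and so is returning a mean of `0` for an empty input.

```typescript
// Hypothetical helper mirroring the documented readouts; not the library's code.
interface GateSummary {
  deadCount: number;
  mean: number;
  entropy: number;    // Shannon entropy of the binned distribution, in bits
  maxEntropy: number; // log2(binCount)
  bins: number[];     // per-bin cell counts
}

const clampTo = (v: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, v));

function summarizeGates(
  gateValues: readonly number[],
  deadThreshold = 0.05,
  binCount = 16,
): GateSummary {
  // Clamp on read, as the docs describe for out-of-range inputs.
  const g = gateValues.map((v) => clampTo(v, 0, 1));

  // A cell is dead when it sits in either saturation band.
  const deadCount = g.filter(
    (v) => v < deadThreshold || v > 1 - deadThreshold,
  ).length;

  // Assumption: an empty input reports a mean of 0 rather than NaN.
  const mean = g.length > 0 ? g.reduce((a, v) => a + v, 0) / g.length : 0;

  // Equal-width bins over [0, 1]; g = 1 is folded into the last bin.
  const bins = new Array<number>(binCount).fill(0);
  for (const v of g) {
    const idx = Math.min(binCount - 1, Math.floor(v * binCount));
    bins[idx] += 1;
  }

  // H = -sum(p * log2(p)) over non-empty bins.
  let entropy = 0;
  for (const count of bins) {
    if (count > 0) {
      const p = count / g.length;
      entropy -= p * Math.log2(p);
    }
  }

  return { deadCount, mean, entropy, maxEntropy: Math.log2(binCount), bins };
}

// One value at the centre of each of 16 bins: a perfectly uniform gate.
const uniformGate = Array.from({ length: 16 }, (_, i) => (i + 0.5) / 16);
const uniformSummary = summarizeGates(uniformGate);
// All cells closed: a fully collapsed gate.
const collapsedSummary = summarizeGates(new Array<number>(8).fill(0));
console.log(uniformSummary.entropy, collapsedSummary.entropy);
```

Computing the same numbers in a training loop can be handy for logging or alerting on `entropy / maxEntropy` without rendering the chart at all.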
Props
| Prop | Type | Default | Description |
|---|---|---|---|
| `gateValues` | `readonly number[]` | - | Per-cell gate activations from a GLU σ branch. Clamped to [0, 1] on read. |
| `deadThreshold` | `number` | `0.05` | A cell counts as dead when g < t or g > 1 - t. Clamped to [0, 0.5]. |
| `showHistogram` | `boolean` | `true` | Render the distribution histogram below the bar chart. |
| `binCount` | `number` | `16` | Equal-width bins for the histogram + entropy. Clamped to [2, 64]. The entropy ceiling is log₂(binCount). |
| `transition` | `Transition` | `SPRINGS.smooth` | Spring used for bar / bin height transitions. |
| `className` | `string` | - | Merged onto the root `<div>` via `cn()`. |
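The defaults and clamping ranges in the table can be summarised as a small normalisation step. This is only a sketch of how such resolution might look: `resolveGateOptions` and `GateDebuggerOptions` are hypothetical names, and rounding `binCount` to an integer is an assumption the table does not state.

```typescript
// Hypothetical option resolver based on the props table; not the component's code.
interface GateDebuggerOptions {
  deadThreshold?: number;
  showHistogram?: boolean;
  binCount?: number;
}

const clampRange = (v: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, v));

function resolveGateOptions(opts: GateDebuggerOptions = {}) {
  // Defaults and ranges are taken from the props table.
  const deadThreshold = clampRange(opts.deadThreshold ?? 0.05, 0, 0.5);
  // Assumption: fractional bin counts are rounded before clamping.
  const binCount = clampRange(Math.round(opts.binCount ?? 16), 2, 64);
  return {
    deadThreshold,
    showHistogram: opts.showHistogram ?? true,
    binCount,
    // The entropy ceiling follows directly from the bin count.
    maxEntropy: Math.log2(binCount),
  };
}

const defaultOptions = resolveGateOptions();
const extremeOptions = resolveGateOptions({ deadThreshold: 0.9, binCount: 500 });
console.log(defaultOptions, extremeOptions);
```

Note that at the upper clamp of `deadThreshold = 0.5` the two dead bands meet, so every cell except one at exactly `0.5` would count as dead.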
Accessibility
- The figure is `role="figure"` with an `aria-labelledby` heading naming the cell count and an `aria-describedby` summary reporting the dead count, mean, and entropy / max, so screen-reader users get the same readout sighted users do.
- Both SVGs carry `role="img"` with descriptive labels; the dead-band shading and the per-bar fills are mirrored by the `data-state="dead" | "alive"` attribute on each `rect`, available for custom CSS hooks.
- Color carries information (warning vs accent), but the dead-band shading and the explicit `dead` readout supply a redundant non-color channel.
- Bars and bins animate with `SPRINGS.smooth`; reduced-motion users skip the spring and snap to the new heights.
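As an illustration of the `aria-describedby` summary, the sketch below builds a sentence in the same shape as the demo summary near the top of this page. The function name `describeGates`, its input shape, and the number formatting are assumptions; the component's real string builder may differ.

```typescript
// Hypothetical summary builder; the wording follows the demo summary on this page.
interface GateReadout {
  cellCount: number;
  deadCount: number;
  mean: number;
  entropy: number;
  maxEntropy: number;
  deadThreshold: number;
}

// Trim floating-point noise from thresholds such as 1 - 0.05.
const tidy = (v: number): string => String(Number(v.toFixed(4)));

function describeGates(r: GateReadout): string {
  const lower = tidy(r.deadThreshold);
  const upper = tidy(1 - r.deadThreshold);
  return [
    `Gate debugger over ${r.cellCount} cells.`,
    `${r.deadCount} dead (g < ${lower} or g > ${upper}).`,
    `Mean ${r.mean.toFixed(3)}.`,
    `Entropy ${r.entropy.toFixed(2)} of ${r.maxEntropy.toFixed(2)} max.`,
  ].join(" ");
}

// Values taken from the demo readout on this page.
const demoSummary = describeGates({
  cellCount: 50,
  deadCount: 10,
  mean: 0.494,
  entropy: 3.3,
  maxEntropy: 4,
  deadThreshold: 0.05,
});
console.log(demoSummary);
```

Keeping the text summary derived from the same numbers as the visual readouts is what lets the non-visual channel stay in sync with the chart.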
Credits
- Extracted from: `craftingattention` (`app/src/lessons/primitives/nn/GateDebugger.tsx`). Re-architected from the lesson-source LSTM gate-mechanics challenge (forget / input / output slider scenarios with target cell + hidden state values) into a generic per-cell gate-distribution inspector for modern GLU-style activations. Stripped the LSTM cell diagram, the challenge state machine, the pointer-drag slider handling, the phase / narration system, the `SvgLabel` and `ChallengeBtn` lesson primitives, and the inline `animate(..., SPRINGS.bouncy)` imperative pulses. Replaced project palette (`--color-warn-400` / `--color-accent-400` / `--color-success-400`) with `cb-*` semantic tokens and inline transitions with `SPRINGS.smooth` from `@craft-bits/core/motion`.