Gate Debugger

Modern transformer FFN blocks run a GLU variant: GLU(x) = (Wx) · σ(Vx), or SwiGLU / GeGLU in the Llama / PaLM / Gemma families. The gate is the σ branch: a vector of values in [0, 1] that decides, per cell, how much signal flows forward. When too many entries collapse to ~0 (always closed) or ~1 (always open), the sigmoid saturates, those cells stop carrying gradient through the gate, and the model silently loses capacity. GateDebugger makes that pathology visible with a per-cell bar chart, a distribution histogram, and dead-count / mean / entropy readouts.
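
To make the gate vector concrete, here is a minimal sketch of the plain GLU formula on toy weights. The helpers sigmoid, matVec, and glu are hypothetical names for illustration, not part of @craft-bits/core, and the weights are chosen only so that one cell sits mid-range, one saturates open, and one saturates closed.

```typescript
// Illustrative only: sigmoid, matVec, and glu are hypothetical helpers,
// not exports of @craft-bits/core.
type Matrix = readonly (readonly number[])[];

const sigmoid = (z: number): number => 1 / (1 + Math.exp(-z));

// Row-major matrix times vector.
function matVec(m: Matrix, x: readonly number[]): number[] {
  return m.map((row) => row.reduce((acc, w, i) => acc + w * x[i], 0));
}

// GLU(x) = (Wx) · σ(Vx), with "·" taken elementwise.
function glu(W: Matrix, V: Matrix, x: readonly number[]) {
  const gate = matVec(V, x).map(sigmoid); // each entry lies in [0, 1]
  const output = matVec(W, x).map((v, i) => v * gate[i]);
  return { output, gate };
}

// Toy weights: Vx = [0, 10, -20], so the gates land mid-range, near 1, near 0.
const x = [1, -2];
const W: Matrix = [[1, 0], [0, 1], [1, 1]];
const V: Matrix = [[0, 0], [10, 0], [0, 10]];
const { output, gate } = glu(W, V, x);
console.log(gate, output);
```

The gate array is the kind of vector the component expects as gateValues; the second and third cells here would fall inside the default dead band.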

[Interactive demo: gate debugger over 50 cells. 10 dead (20%; g < 0.05 or g > 0.95), mean 0.494, entropy 3.30 of 4.00 max. Bar chart: gate value (0.0 to 1.0) by cell index. Histogram: gate values from 0.0 to 1.0.]

Installation

npx shadcn@latest add https://craftbits.dev/r/gate-debugger.json

Usage

import { GateDebugger } from "@craft-bits/core";
 
<GateDebugger gateValues={gateActivations} />

Tune the dead-band threshold and hide the histogram for compact embeds:

<GateDebugger
  gateValues={gateActivations}
  deadThreshold={0.02}
  showHistogram={false}
/>

Pass a denser binning for large activation vectors:

<GateDebugger
  gateValues={gateActivations}
  binCount={32}
/>
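
The examples above assume gateActivations is already a flat array with one value per cell. If the gate values arrive as one row per batch step, one possible way to get there is to average each cell over the window. This is a sketch under that assumption; meanGatePerCell is a hypothetical helper, not something the package is known to export.

```typescript
// Hypothetical helper: average each cell's gate value over a window of steps.
// Each inner array holds one gate value per cell for a single batch step.
function meanGatePerCell(steps: readonly (readonly number[])[]): number[] {
  if (steps.length === 0) return [];
  const cells = steps[0].length;
  const sums = new Array<number>(cells).fill(0);
  for (const row of steps) {
    if (row.length !== cells) {
      throw new Error("every step must report the same number of cells");
    }
    row.forEach((g, i) => {
      sums[i] += g;
    });
  }
  return sums.map((s) => s / steps.length);
}

// Two steps, three cells.
const gateActivations = meanGatePerCell([
  [0, 1, 0.5],
  [0, 1, 0.25],
]);
console.log(gateActivations);
```

Averaging keeps a cell that is closed at every step near 0, but a cell that flips between the two extremes averages toward the middle, so a mean is only one of several reasonable aggregations.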

Understanding the component

  1. Bar chart, one column per cell. Each column's height is the cell's gate value g ∈ [0, 1]. Healthy cells paint cb-accent; cells inside the dead band (g < deadThreshold or g > 1 - deadThreshold) paint cb-warning. The dead band is shaded as muted-warning strips along the top and bottom of the chart, so the threshold is legible without reading numbers.
  2. Histogram over binCount bins. The distribution shape tells the deeper story: a healthy gate spreads across the unit interval (look for a broad, low-skew mound); a dying gate clumps near 0 or 1. A dashed vertical line marks the mean.
  3. Three readouts. dead counts cells in the saturation band (with a percentage of the total). mean is the average gate value: networks that train cleanly hover near 0.5, and drift towards an extreme is an early warning. entropy is the Shannon entropy of the binned distribution; uniform spread approaches log₂(binCount), full collapse approaches 0. A thin accent bar across the top of the histogram tracks H / H_max as a quick visual proxy.
  4. Threshold range. deadThreshold is clamped to [0, 0.5] on read. 0.05 is a sensible default: tight enough to ignore noisy near-saturated cells, loose enough to catch the truly dead ones. Lower it to 0.01 to flag only the most saturated cells.
  5. Purely presentational. No internal state. Re-render with new gateValues (pass a layer's σ activation per batch step, or aggregate across a window) and the bars and bins re-animate with SPRINGS.smooth from @craft-bits/core/motion.
  6. Clamp on read. Values outside [0, 1] are clamped silently so an unnormalised tensor (someone passing logits) still renders something legible instead of crashing the chart layout.
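
The readouts described in the list above can be sketched as a single pure function. This is an approximation under stated assumptions, not the component's actual source: gateStats and GateStats are hypothetical names, flooring a fractional binCount is a guess, and inputs are assumed to be finite numbers.

```typescript
// Sketch under assumptions: gateStats and GateStats are hypothetical names,
// and the binning details are a guess at what the component does.
interface GateStats {
  dead: number;
  mean: number;
  entropy: number;
  maxEntropy: number;
  bins: number[];
}

const clamp = (v: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, v));

function gateStats(
  gateValues: readonly number[],
  deadThreshold = 0.05,
  binCount = 16,
): GateStats {
  const t = clamp(deadThreshold, 0, 0.5);
  const n = clamp(Math.floor(binCount), 2, 64); // flooring is an assumption
  const g = gateValues.map((v) => clamp(v, 0, 1)); // clamp on read
  const total = g.length;

  // A cell is dead when it sits in either saturation band.
  const dead = g.filter((v) => v < t || v > 1 - t).length;
  const mean = total > 0 ? g.reduce((a, b) => a + b, 0) / total : 0;

  // Equal-width bins over [0, 1]; g = 1 falls into the last bin.
  const bins = new Array<number>(n).fill(0);
  for (const v of g) bins[Math.min(n - 1, Math.floor(v * n))] += 1;

  // Shannon entropy in bits; empty bins contribute nothing.
  let entropy = 0;
  for (const count of bins) {
    if (count === 0) continue;
    const p = count / total;
    entropy -= p * Math.log2(p);
  }
  return { dead, mean, entropy, maxEntropy: Math.log2(n), bins };
}

// One value per bin (uniform spread) versus every cell stuck at 0 (collapse).
const spread = Array.from({ length: 16 }, (_, i) => (i + 0.5) / 16);
const collapsed = new Array<number>(16).fill(0);
console.log(gateStats(spread), gateStats(collapsed));
```

With one value in each of the 16 bins the entropy reaches its ceiling of log₂(16) = 4 bits, and with every cell at 0 it drops to 0, which matches the two limits described in the readouts item.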

Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| gateValues | readonly number[] | - | Per-cell gate activations from a GLU σ branch. Clamped to [0, 1] on read. |
| deadThreshold | number | 0.05 | A cell counts as dead when g < t or g > 1 - t. Clamped to [0, 0.5]. |
| showHistogram | boolean | true | Render the distribution histogram below the bar chart. |
| binCount | number | 16 | Equal-width bins for the histogram and entropy. Clamped to [2, 64]. The entropy ceiling is log₂(binCount). |
| transition | Transition | SPRINGS.smooth | Spring used for bar / bin height transitions. |
| className | string | - | Merged onto the root <div> via cn(). |

Accessibility

  • The root element has role="figure". Its aria-labelledby points at a heading naming the cell count, and its aria-describedby points at a summary reporting the dead count, mean, and entropy / max, so screen-reader users get the same readout sighted users do.
  • Both SVGs carry role="img" with descriptive labels; the dead-band shading and the per-bar fills are mirrored by the data-state="dead" | "alive" attribute on each rect, available for custom CSS hooks.
  • Color carries information (warning vs accent), but the dead-band shading and the explicit dead readout supply a redundant non-color channel.
  • Bars and bins animate with SPRINGS.smooth; reduced-motion users skip the spring and snap to the new heights.
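
As an illustration of the text summary exposed to assistive technology, the sketch below builds a sentence in the same shape as the demo's summary ("Gate debugger over 50 cells. 10 dead (g < 0.05 or g > 0.95). Mean 0.494. Entropy 3.30 of 4.00 max."). The GateSummary shape, the describeGates name, and the number formatting are assumptions; the component's real implementation may differ.

```typescript
// Hypothetical shape and helper; the component's real internals may differ.
interface GateSummary {
  cells: number;
  dead: number;
  deadThreshold: number;
  mean: number;
  entropy: number;
  maxEntropy: number;
}

// Build a plain-text description suitable for an aria-describedby target.
function describeGates(s: GateSummary): string {
  const lo = s.deadThreshold.toFixed(2);
  const hi = (1 - s.deadThreshold).toFixed(2);
  return (
    `Gate debugger over ${s.cells} cells. ` +
    `${s.dead} dead (g < ${lo} or g > ${hi}). ` +
    `Mean ${s.mean.toFixed(3)}. ` +
    `Entropy ${s.entropy.toFixed(2)} of ${s.maxEntropy.toFixed(2)} max.`
  );
}

const summary = describeGates({
  cells: 50,
  dead: 10,
  deadThreshold: 0.05,
  mean: 0.494,
  entropy: 3.3,
  maxEntropy: 4,
});
console.log(summary);
```

Keeping the summary as plain text means the dead count, mean, and entropy reach screen-reader users without depending on the bar colors.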

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/nn/GateDebugger.tsx). Re-architected from the lesson-source LSTM gate-mechanics challenge (forget / input / output slider scenarios with target cell + hidden state values) into a generic per-cell gate-distribution inspector for modern GLU-style activations. Stripped the LSTM cell diagram, the challenge state machine, the pointer-drag slider handling, the phase / narration system, the SvgLabel and ChallengeBtn lesson primitives, and the inline animate(..., SPRINGS.bouncy) imperative pulses. Replaced project palette (--color-warn-400 / --color-accent-400 / --color-success-400) with cb-* semantic tokens and inline transitions with SPRINGS.smooth from @craft-bits/core/motion.