Weight Tying Toggle Viz
A one-switch picture of the weight tying trick that many language models use. Untied: two big matrices, the input embedding (V × d) and the output projection (d × V, the shape of Eᵀ), sit side by side with a dashed link between them. Tied: they collapse into a single shared block with dual-use arrows (token → embed on the left, → logits on the right). A live parameter readout halves when the weights are tied. Useful for teaching why transformer LMs so often do this.
Installation
npx shadcn@latest add https://craftbits.dev/r/weight-tying-toggle-viz.json
Usage
import { WeightTyingToggleViz } from "@craft-bits/core";
<WeightTyingToggleViz vocabSize={32000} embedDim={4096} />
Drive the tied / untied state externally from a narration step:
import { useState } from "react";
const [tied, setTied] = useState(false);
<WeightTyingToggleViz
vocabSize={50257}
embedDim={768}
tied={tied}
onTiedChange={setTied}
/>
Hide the parameter-count readout for a purely conceptual diagram:
<WeightTyingToggleViz
vocabSize={32000}
embedDim={4096}
defaultTied
showParameterCount={false}
/>
Understanding the component
- Two roles, one tensor. The input embedding turns a token id into a vector (`V × d`). The output projection turns a vector back into logits over the vocabulary (`d × V`). They have the same shape transposed, so a single matrix `E` can fulfil both roles via `E` on the way in and `Eᵀ` on the way out, which is exactly what "weight tying" means.
- `tied` is a simple toggle. Internally it is a single boolean; there is no in-between "partially tied" state. The animation interpolates between the two layouts so the eye tracks the merge / split visually, but the state itself is binary. Controlled (`tied` + `onTiedChange`) and uncontrolled (`defaultTied`) follow the standard Radix-style pair.
- Geometry is fixed; numbers are real. The SVG matrix boxes stay the same size regardless of `vocabSize` and `embedDim`, because the lesson is about sharing, not about scale. The parameter-count readout, however, uses the real numbers: `V·d` per matrix, doubled when untied, with a "saved N params" line that appears only when tied.
- Spring transitions. Merge / split, the dashed link, and the parameter readout all spring with `SPRINGS.smooth` from `@craft-bits/core/motion`. `prefers-reduced-motion: reduce` collapses every spring to an instant swap.
- No presets in the library. The source had a three-preset cycle (GPT-2 / LLaMA-7B / GPT-4-class) that drove a narration block. The library version drops the presets and the narration; consumers wire their own `vocabSize` / `embedDim` from a lesson step or a slider, and own their own copy if they want a model-picker.
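The "two roles, one tensor" idea can be sketched in a few lines of plain TypeScript. This is a toy illustration, not code from the component: the 3 × 2 matrix and the helper names `embed` and `logits` are made up for the example.

```typescript
// Toy illustration: one matrix E (V = 3 tokens, d = 2 dims) serves both roles.
type Matrix = number[][];

const E: Matrix = [
  [1, 0], // token 0
  [0, 1], // token 1
  [1, 1], // token 2
];

// Input role: look up row `tokenId` of the V × d matrix to get a d-vector.
function embed(matrix: Matrix, tokenId: number): number[] {
  return matrix[tokenId];
}

// Output role: multiply a d-vector by the transpose (d × V) to get V logits.
// h · Eᵀ is the dot product of h with every row of E, so no second matrix is stored.
function logits(matrix: Matrix, h: number[]): number[] {
  return matrix.map((row) => row.reduce((sum, w, j) => sum + w * h[j], 0));
}

const h = embed(E, 2);       // [1, 1]
const scores = logits(E, h); // [1, 1, 2]
```

In a real model a stack of transformer layers sits between the two calls; the point here is only that both ends read the same `E`.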
Props
| Prop | Type | Default | Description |
|---|---|---|---|
| tied | boolean | - | Controlled tied / untied state. |
| defaultTied | boolean | false | Uncontrolled initial state. |
| onTiedChange | (tied: boolean) => void | - | Fires when the toggle flips. |
| vocabSize | number | 32000 | Vocabulary size V; drives the parameter readout. |
| embedDim | number | 4096 | Embedding dimension d; drives the parameter readout. |
| showParameterCount | boolean | true | Whether to render the V·d parameter readout band. |
| transition | Transition | SPRINGS.smooth | Spring for the merge / split transitions. |
| className | string | - | Merged onto the root via cn(). |
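The arithmetic behind the readout that `vocabSize` and `embedDim` drive can be sketched as below. The function name `paramReadout` and its return shape are hypothetical, and the sketch assumes that the "saved" figure equals one matrix worth of parameters (`V·d`); the component's internals may differ.

```typescript
// Hypothetical helper: mirrors the documented readout rule (V·d per matrix, doubled when untied).
interface ParamReadout {
  perMatrix: number; // V·d
  total: number;     // V·d when tied, 2·V·d when untied
  saved: number;     // assumed: V·d when tied, 0 when untied
}

function paramReadout(vocabSize: number, embedDim: number, tied: boolean): ParamReadout {
  const perMatrix = vocabSize * embedDim;
  return {
    perMatrix,
    total: tied ? perMatrix : 2 * perMatrix,
    saved: tied ? perMatrix : 0,
  };
}

// Default props (V = 32000, d = 4096):
const untiedCounts = paramReadout(32000, 4096, false); // total: 262_144_000
const tiedCounts = paramReadout(32000, 4096, true);    // total: 131_072_000, saved: 131_072_000

// GPT-2-sized shape from the usage example (V = 50257, d = 768):
const gpt2Tied = paramReadout(50257, 768, true);       // total: 38_597_376
```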
Accessibility
- The outer element is `role="figure"` with an `aria-label` and a visually hidden `aria-live="polite"` summary, so screen readers hear the tied / untied state, the vocab and embedding dim, and the live parameter count whenever the toggle flips.
- The toggle button carries `aria-pressed` reflecting `tied` and a descriptive `aria-label` (`"Tie weights"` / `"Untie weights"`).
- Colour is never the only signal: the tied state also gets a thicker stroke, a soft glow, and explicit `"shared embedding"` / `"saved {N} params"` text labels.
- `prefers-reduced-motion: reduce` collapses every spring to an instant swap.
Credits
- Extracted from: `craftingattention` (`app/src/lessons/primitives/nn/WeightTyingToggleViz.tsx`). The source was a phase-machine lesson widget (`observe` / `tied` / `insight`) with a hard-coded three-preset cycle (GPT-2 / LLaMA-7B / GPT-4-class), a narration block, a breathing-pulse sonar overlay, and a custom `ChallengeBtn` toggle. The library version drops the phases, the presets, the narration, and the lesson chrome, and ships the primitive every transformer-LM lesson actually needs: a clean tied / untied toggle parametrised by `vocabSize` and `embedDim`, with a live `V·d` parameter readout and an honest reduced-motion fallback.