Dictionary Lookup Viz
A visualisation of the token-ID-to-vector lookup that every embedding layer in a neural net performs. The vocabulary runs down the left column; each row carries an embedding strip on the right whose values are synthesised deterministically from (token, dim, seed). An arrow connects the currently selected token to its row, which is exactly the picture you want when teaching what a lookup table is before any training math enters the room.
Installation
npx shadcn@latest add https://craftbits.dev/r/dictionary-lookup-viz.json
Usage
import { DictionaryLookupViz } from "@craft-bits/core";
<DictionaryLookupViz
  vocab={["the", "cat", "sat", "on", "mat", "dog", "ran", "fast"]}
  embeddingDim={8}
  defaultCurrentToken="cat"
/>
Drive the current token externally as part of a narration step:
import { useState } from "react";
const [token, setToken] = useState("the");
<DictionaryLookupViz
  vocab={["the", "cat", "sat", "on", "mat"]}
  currentToken={token}
  onCurrentTokenChange={setToken}
/>
Swap to the magnitude-only colour scheme and tweak the hash seed for a different value pattern:
<DictionaryLookupViz
  vocab={["x", "y", "z"]}
  embeddingDim={16}
  colorScheme="accent"
  seed={42}
/>
Understanding the component
- Two columns, one bridge. The left column is plain text: the vocabulary as it would appear in a tokenizer's `id_to_token` table. The right column is the embedding matrix, one row per vocab entry. The arrow makes the "index in, vector out" mapping concrete.
- Deterministic synthesis. Values come from a hand-rolled `FNV-1a` hash mixed with `seed`, fed into a `mulberry32` PRNG, then mapped into `[-1, 1]`. The hash is hand-rolled (no crypto, no third-party dep) so the matrix is bit-for-bit identical on the server and in the browser; hydration is safe and snapshot tests are stable.
- Two colour schemes. `diverging` (default) preserves sign: positive values tint `var(--cb-accent)`, negative values tint `var(--cb-warning)`, and magnitude drives alpha. Zero is near-transparent. `accent` drops the sign and tints `var(--cb-accent)` by absolute value only, which is useful when you want to teach magnitude alone, without the sign distinction.
- Controlled and uncontrolled. `currentToken` + `onCurrentTokenChange` lets a parent drive the highlighted row (typical when narration steps step through the vocabulary). Omit `currentToken` and pass `defaultCurrentToken` for the uncontrolled mode; clicking a vocabulary row then updates the highlight internally.
- Spring transitions. When the current token changes, the row outline and arrow both spring with `SPRINGS.smooth` from `@craft-bits/core/motion`. `prefers-reduced-motion: reduce` collapses every spring to an instant swap.
- Pure primitive. The component computes the embedding matrix internally from `vocab`, `embeddingDim`, and `seed`; there is no externally-provided matrix prop. Cells themselves are static rectangles for fast renders even at vocabularies of a few hundred entries; only the highlight outline and arrow animate.
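The deterministic synthesis and the two colour schemes described above can be sketched in plain TypeScript. This is a minimal illustration under stated assumptions, not the component's source: the `token:dim` key format, the XOR used to mix in `seed`, and the use of CSS `color-mix()` to apply alpha are all guesses, and the helper names `embeddingValue`, `buildMatrix`, and `cellColor` are hypothetical.

```typescript
type ColorScheme = "diverging" | "accent";

// 32-bit FNV-1a. Hashes UTF-16 code units rather than bytes, a simplification.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, wrapped to 32 bits
  }
  return hash >>> 0;
}

// mulberry32: a small 32-bit PRNG that returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// One embedding cell. ASSUMPTION: key format and XOR seed mixing are illustrative.
function embeddingValue(token: string, dim: number, seed: number): number {
  const key = (fnv1a(`${token}:${dim}`) ^ seed) >>> 0;
  return mulberry32(key)() * 2 - 1; // map [0, 1) onto [-1, 1)
}

// Full matrix: one row per vocab entry, embeddingDim cells per row.
function buildMatrix(
  vocab: readonly string[],
  embeddingDim: number,
  seed: number,
): number[][] {
  return vocab.map((token) =>
    Array.from({ length: embeddingDim }, (_, dim) => embeddingValue(token, dim, seed)),
  );
}

// Cell colour. "diverging" picks the hue by sign, "accent" ignores the sign.
// ASSUMPTION: alpha is applied with color-mix(); the component may do this differently.
function cellColor(value: number, scheme: ColorScheme): string {
  const alphaPercent = Math.round(Math.min(1, Math.abs(value)) * 100);
  const hue = scheme === "diverging" && value < 0 ? "--cb-warning" : "--cb-accent";
  return `color-mix(in srgb, var(${hue}) ${alphaPercent}%, transparent)`;
}
```

Because every step uses only 32-bit integer arithmetic, calling `buildMatrix(["the", "cat"], 8, 1)` twice yields the same two rows of eight values, which is the property that keeps server and client renders in agreement.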
Props
| Prop | Type | Default | Description |
|---|---|---|---|
| `vocab` | `readonly string[]` | required | Vocabulary indexed by the table. |
| `embeddingDim` | `number` | `8` | Number of cells per embedding row (clamped to `[1, 64]`). |
| `currentToken` | `string` | - | Controlled current token (must be a member of `vocab`). |
| `defaultCurrentToken` | `string` | first of `vocab` | Uncontrolled initial token. |
| `onCurrentTokenChange` | `(token: string) => void` | - | Fires when a vocabulary row is clicked. |
| `colorScheme` | `"diverging" \| "accent"` | `"diverging"` | Sign-preserving (accent / warning) vs magnitude-only (accent). |
| `seed` | `number` | `1` | Seed for the deterministic value hash. |
| `transition` | `Transition` | `SPRINGS.smooth` | Spring for arrow + outline transitions. |
| `className` | `string` | - | Merged onto the root via `cn()`. |
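The defaults and constraints in the table can be read as a small normalisation step. The sketch below is one plausible way to express it; the helper names `clampEmbeddingDim` and `resolveInitialToken` are hypothetical, and the handling of non-finite dimensions, fractional dimensions, and tokens that are not in `vocab` is an assumption, since the table only states the clamp range and the membership requirement.

```typescript
// Clamp the requested dimension into [1, 64], defaulting to 8.
// ASSUMPTION: fractional values are floored and non-finite values fall back to 8.
function clampEmbeddingDim(embeddingDim: number = 8): number {
  if (!Number.isFinite(embeddingDim)) return 8;
  return Math.min(64, Math.max(1, Math.floor(embeddingDim)));
}

// Pick the token to highlight on first render.
// A controlled `currentToken` wins, then `defaultCurrentToken`, then the first vocab entry.
// ASSUMPTION: a token that is not in `vocab` falls back to the first entry.
function resolveInitialToken(
  vocab: readonly string[],
  currentToken?: string,
  defaultCurrentToken?: string,
): string | undefined {
  const candidate = currentToken ?? defaultCurrentToken;
  if (candidate !== undefined && vocab.includes(candidate)) return candidate;
  return vocab[0]; // undefined when the vocabulary is empty
}
```

With these helpers, `clampEmbeddingDim(100)` gives `64` and `resolveInitialToken(["the", "cat"], undefined, "cat")` gives `"cat"`.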
Accessibility
- The outer element is `role="figure"` with an `aria-label` and a visually hidden `aria-live="polite"` summary; screen readers hear the vocabulary size, embedding dimension, and the current token whenever the selection changes.
- Each vocabulary row carries an invisible click target that is at least 28px tall, so a pointer user can reliably hit the right entry on a phone.
- Colour is never the only signal: the diverging scheme distinguishes sign with two distinct hues (accent and warning), the row outline doubles as a non-colour highlight, and the arrow points the eye unambiguously to the right row.
- Animations respect `prefers-reduced-motion: reduce`; the arrow + highlight springs collapse to an instant swap.
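To make the live-region behaviour concrete, here is a small sketch of how the announced summary could be assembled from the vocabulary size, embedding dimension, and current token. The function name `lookupSummary` is hypothetical, and the wording is modelled on the summary the hosted demo announces; the component's actual string may differ.

```typescript
// Build the text a visually hidden aria-live="polite" region would announce.
// ASSUMPTION: the phrasing mirrors the demo; it is not taken from the component's source.
function lookupSummary(
  vocab: readonly string[],
  embeddingDim: number,
  currentToken: string,
): string {
  return (
    `Token-to-embedding lookup table with ${vocab.length} entries ` +
    `and ${embeddingDim}-dimensional embeddings. Current token: ${currentToken}.`
  );
}
```

Re-rendering the region with a new string whenever the selection changes is what prompts a screen reader to read the update without moving focus.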
Credits
- Extracted from: `craftingattention` (`app/src/lessons/primitives/nn/DictionaryLookupViz.tsx`). The source was an attention-as-soft-dictionary-lookup widget: a draggable 2D query / key / value scene with softmax over scaled dot products, a temperature knob, a phase machine, narration copy, and a `ChallengeBtn` "hard lookup" comparison. The library version drops the soft-lookup conceit entirely and ships the prior primitive every embedding lesson actually needs: a plain token → vector lookup table with vocabulary on one side, synthesised embedding rows on the other, and an arrow connecting the current token to its row. Soft-attention math belongs in `AttentionHeatmap` / `AttentionStepperViz`; this primitive teaches the lookup that comes before attention.