Dictionary Lookup Viz

A visualisation of the token-ID-to-vector lookup that every embedding layer in a neural net performs. The vocabulary runs down the left column; each row carries an embedding strip on the right whose values are synthesised deterministically from (token, dim, seed). An arrow connects the currently selected token to its row, which is exactly the picture you want when teaching what a lookup table is before any training math enters the room.

[Interactive demo: token-to-embedding lookup table with 8 entries and 8-dimensional embeddings. Current token: cat. Colour scheme: diverging.]

Installation

npx shadcn@latest add https://craftbits.dev/r/dictionary-lookup-viz.json

Usage

import { DictionaryLookupViz } from "@craft-bits/core";
 
<DictionaryLookupViz
  vocab={["the", "cat", "sat", "on", "mat", "dog", "ran", "fast"]}
  embeddingDim={8}
  defaultCurrentToken="cat"
/>

Drive the current token externally as part of a narration step:

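import { useState } from "react";
import { DictionaryLookupViz } from "@craft-bits/core";

export function NarratedLookup() {
  // The parent owns the current token, so a narration step can set it.
  const [token, setToken] = useState("the");

  return (
    <DictionaryLookupViz
      vocab={["the", "cat", "sat", "on", "mat"]}
      currentToken={token}
      onCurrentTokenChange={setToken}
    />
  );
}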

Swap to the magnitude-only colour scheme and tweak the hash seed for a different value pattern:

<DictionaryLookupViz
  vocab={["x", "y", "z"]}
  embeddingDim={16}
  colorScheme="accent"
  seed={42}
/>
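
To make the difference between the two schemes concrete, here is a minimal sketch of how a single value in [-1, 1] could be turned into a cell tint. It is not the component's actual code: the names `cellTint`, `toBackground`, and `MIN_ALPHA`, the 0.05 alpha floor, and the use of CSS `color-mix()` are all assumptions made for illustration. Only the two CSS variables and the sign/magnitude behaviour come from this page.

```typescript
type ColorScheme = "diverging" | "accent";

interface CellTint {
  color: string; // CSS colour reference
  alpha: number; // opacity in [MIN_ALPHA, 1]
}

// Assumed floor so that a value of zero is near-transparent rather than invisible.
const MIN_ALPHA = 0.05;

// Map one embedding value to a tint.
// "diverging": sign picks the hue, magnitude drives alpha.
// "accent": the sign is dropped, magnitude drives alpha.
function cellTint(value: number, scheme: ColorScheme): CellTint {
  const alpha = Math.max(MIN_ALPHA, Math.min(1, Math.abs(value)));
  if (scheme === "diverging" && value < 0) {
    return { color: "var(--cb-warning)", alpha };
  }
  return { color: "var(--cb-accent)", alpha };
}

// One possible way to express the tint as a CSS background value.
function toBackground(tint: CellTint): string {
  const percent = Math.round(tint.alpha * 100);
  return `color-mix(in srgb, ${tint.color} ${percent}%, transparent)`;
}
```

Under these assumptions, `cellTint(-0.5, "diverging")` picks the warning hue while `cellTint(-0.5, "accent")` picks the accent hue, both at the same alpha.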

Understanding the component

  1. Two columns, one bridge. The left column is plain text — the vocabulary as it would appear in a tokenizer's id_to_token table. The right column is the embedding matrix, one row per vocab entry. The arrow makes the "index in, vector out" mapping concrete.
  2. Deterministic synthesis. Each value comes from an FNV-1a hash of the cell's inputs mixed with seed, fed into a mulberry32 PRNG, then mapped into [-1, 1]. The hash is hand-rolled (no crypto, no third-party dependency), so the matrix is bit-for-bit identical on the server and in the browser: hydration is safe and snapshot tests are stable.
  3. Two colour schemes. diverging (default) preserves sign — positive values tint var(--cb-accent), negative values tint var(--cb-warning), magnitude drives alpha. Zero is near-transparent. accent drops the sign and tints var(--cb-accent) by absolute value only — useful when you want to teach magnitude alone, without the sign distinction.
  4. Controlled and uncontrolled. Passing currentToken + onCurrentTokenChange lets a parent drive the highlighted row (typical when narration steps walk through the vocabulary). Omit currentToken and pass defaultCurrentToken for uncontrolled mode; clicking a vocabulary row then updates the highlight internally.
  5. Spring transitions. When the current token changes, the row outline and arrow both spring with SPRINGS.smooth from @craft-bits/core/motion. prefers-reduced-motion: reduce collapses every spring to an instant swap.
  6. Pure primitive. The component computes the embedding matrix internally from vocab, embeddingDim, and seed — there is no externally-provided matrix prop. Cells themselves are static rectangles for fast renders even at vocabularies of a few hundred entries; only the highlight outline and arrow animate.
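
The deterministic synthesis in point 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the component's source: the function names are hypothetical, the way seed is mixed in (XOR into the FNV offset basis), the `token:dim` hash key, and hashing UTF-16 code units rather than bytes are all guesses. The FNV-1a constants and the mulberry32 step are the commonly published ones.

```typescript
// 32-bit FNV-1a over the UTF-16 code units of `text`.
// Assumption: the seed is XOR-ed into the standard offset basis.
function fnv1a(text: string, seed: number): number {
  let hash = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits
  }
  return hash;
}

// mulberry32: a small 32-bit PRNG returning floats in [0, 1).
function mulberry32(state: number): () => number {
  let a = state >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// One cell: hash (token, dim, seed), draw once, rescale.
// Note this yields [-1, 1), a hair short of the closed interval.
function embeddingValue(token: string, dim: number, seed: number): number {
  const next = mulberry32(fnv1a(`${token}:${dim}`, seed));
  return next() * 2 - 1;
}

// One row per vocabulary entry, `embeddingDim` cells per row.
function embeddingMatrix(
  vocab: readonly string[],
  embeddingDim: number,
  seed: number,
): number[][] {
  const rows: number[][] = [];
  for (const token of vocab) {
    const row: number[] = [];
    for (let dim = 0; dim < embeddingDim; dim++) {
      row.push(embeddingValue(token, dim, seed));
    }
    rows.push(row);
  }
  return rows;
}
```

Because nothing here depends on `Math.random`, the clock, or platform crypto, calling `embeddingMatrix` with the same arguments gives the same numbers in Node and in the browser, which is the property the hydration and snapshot claims rely on.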

Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| vocab | readonly string[] | required | Vocabulary indexed by the table. |
| embeddingDim | number | 8 | Number of cells per embedding row (clamped to [1, 64]). |
| currentToken | string | - | Controlled current token (must be a member of vocab). |
| defaultCurrentToken | string | first of vocab | Uncontrolled initial token. |
| onCurrentTokenChange | (token: string) => void | - | Fires when a vocabulary row is clicked. |
| colorScheme | "diverging" \| "accent" | "diverging" | Sign-preserving (accent / warning) vs magnitude-only (accent). |
| seed | number | 1 | Seed for the deterministic value hash. |
| transition | Transition | SPRINGS.smooth | Spring for arrow + outline transitions. |
| className | string | - | Merged onto the root via cn(). |
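
The defaults and the clamp in the table can be read as a small normalisation step. The sketch below shows one way to express it; `resolveEmbeddingDim` and `resolveCurrentToken` are hypothetical helpers, and both the rounding of fractional dimensions and the fallback to the first entry when a token is not in vocab are assumptions (the table only says the token must be a member of vocab).

```typescript
interface LookupOptions {
  vocab: readonly string[];
  currentToken?: string;
  defaultCurrentToken?: string;
}

// Default of 8, clamped to [1, 64].
// Assumption: fractional inputs are rounded to the nearest integer.
function resolveEmbeddingDim(embeddingDim: number = 8): number {
  return Math.min(64, Math.max(1, Math.round(embeddingDim)));
}

// Controlled value wins, then the uncontrolled default, then the first entry.
// Assumption: an unknown token falls back to the first vocabulary entry.
function resolveCurrentToken(options: LookupOptions): string | undefined {
  const { vocab, currentToken, defaultCurrentToken } = options;
  const candidate = currentToken ?? defaultCurrentToken ?? vocab[0];
  if (candidate !== undefined && vocab.indexOf(candidate) !== -1) {
    return candidate;
  }
  return vocab[0];
}
```

With `vocab` set to `["the", "cat"]` and no token props, `resolveCurrentToken` returns `"the"`, matching the "first of vocab" default in the table.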

Accessibility

  • The outer element is role="figure" with an aria-label and a visually hidden aria-live="polite" summary — screen readers hear the vocabulary size, embedding dimension, and the current token whenever the selection changes.
  • Each vocabulary row carries an invisible click target that is at least 28px tall, so touch and pointer users can reliably hit the intended entry on a phone.
  • Colour is never the only signal — the diverging scheme distinguishes sign with two distinct hues (accent and warning), the row outline doubles as a non-colour highlight, and the arrow points the eye unambiguously to the right row.
  • Animations respect prefers-reduced-motion: reduce — the arrow + highlight springs collapse to an instant swap.
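
As a rough sketch of the first and last points, the live-region text and the reduced-motion fallback could look like this. The summary wording mirrors the demo's own announcement; everything else is an assumption: `lookupSummary`, `effectiveTransition`, and the `TransitionLike` type are hypothetical, and the caller is assumed to have read the `prefers-reduced-motion` media query already.

```typescript
// Text for the visually hidden aria-live="polite" region.
function lookupSummary(
  vocabSize: number,
  embeddingDim: number,
  currentToken: string,
): string {
  return (
    `Token-to-embedding lookup table with ${vocabSize} entries and ` +
    `${embeddingDim}-dimensional embeddings. Current token: ${currentToken}.`
  );
}

// Hypothetical stand-in for the library's Transition type.
type TransitionLike =
  | { type: "spring"; stiffness: number; damping: number }
  | { duration: 0 };

// Collapse the spring to an instant swap when the user prefers reduced motion.
function effectiveTransition(
  prefersReducedMotion: boolean,
  spring: TransitionLike,
): TransitionLike {
  return prefersReducedMotion ? { duration: 0 } : spring;
}
```

In a browser, the boolean would typically come from `window.matchMedia("(prefers-reduced-motion: reduce)").matches`; the spring numbers passed in are whatever the chosen transition defines.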

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/nn/DictionaryLookupViz.tsx). The source was an attention-as-soft-dictionary-lookup widget — a draggable 2D query / key / value scene with softmax over scaled dot products, a temperature knob, a phase machine, narration copy, and a ChallengeBtn "hard lookup" comparison. The library version drops the soft-lookup conceit entirely and ships the prior primitive every embedding lesson actually needs: a plain token → vector lookup table with vocabulary on one side, synthesised embedding rows on the other, and an arrow connecting the current token to its row. Soft-attention math belongs in AttentionHeatmap / AttentionStepperViz; this primitive teaches the lookup that comes before attention.