Dictionary Lookup Viz

A visualisation of the step every embedding layer in a neural network performs: mapping token IDs to vectors. The vocabulary is listed down the left column; each row carries an embedding strip on the right whose values are synthesised deterministically from (token, dim, seed). An arrow connects the currently selected token to its row, which is the picture you want when teaching what a lookup table is, before any training math is introduced.

Installation

npx shadcn@latest add https://craftbits.dev/r/dictionary-lookup-viz.json

Usage

import { DictionaryLookupViz } from "@craft-bits/core";
 
<DictionaryLookupViz
  vocab={["the", "cat", "sat", "on", "mat", "dog", "ran", "fast"]}
  embeddingDim={8}
  defaultCurrentToken="cat"
/>

Drive the current token externally as part of a narration step:

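import { useState } from "react";
import { DictionaryLookupViz } from "@craft-bits/core";

function NarratedLookup() {
  // The parent owns the selection, so a narration step can call setToken directly.
  const [token, setToken] = useState("the");
  return (
    <DictionaryLookupViz
      vocab={["the", "cat", "sat", "on", "mat"]}
      currentToken={token}
      onCurrentTokenChange={setToken}
    />
  );
}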

Swap to the magnitude-only colour scheme and tweak the hash seed for a different value pattern:

<DictionaryLookupViz
  vocab={["x", "y", "z"]}
  embeddingDim={16}
  colorScheme="accent"
  seed={42}
/>
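
To make the difference between the two colorScheme values concrete, here is a minimal sketch of how a single cell value in [-1, 1] might be turned into a CSS colour. The function name cellColor, the use of color-mix(), and the 8% alpha floor are assumptions made for illustration; only the --cb-accent and --cb-warning custom properties and the "sign vs magnitude" behaviour come from this page.

```typescript
type ColorScheme = "diverging" | "accent";

/**
 * Hypothetical helper: maps one embedding value to a CSS colour string.
 * Magnitude drives opacity; under "diverging" the sign also picks the hue.
 */
function cellColor(value: number, scheme: ColorScheme): string {
  const magnitude = Math.min(1, Math.abs(value));
  // Assumed 8% floor so that zero is near-transparent rather than invisible.
  const percent = Math.round((0.08 + 0.92 * magnitude) * 100);
  // "accent" ignores the sign; "diverging" switches hue for negative values.
  const hue =
    scheme === "diverging" && value < 0 ? "--cb-warning" : "--cb-accent";
  return `color-mix(in srgb, var(${hue}) ${percent}%, transparent)`;
}
```

Under this sketch the same value of -0.5 renders in the warning hue with diverging and in the accent hue with accent, at the same opacity in both cases.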

Understanding the component

Two columns, one bridge. The left column is plain text — the vocabulary as it would appear in a tokenizer's id_to_token table. The right column is the embedding matrix, one row per vocab entry. The arrow makes the "index in, vector out" mapping concrete.
Deterministic synthesis. Values come from a hand-rolled FNV-1a hash mixed with seed, fed into a mulberry32 PRNG, then mapped into [-1, 1]. The hash is hand-rolled (no crypto, no third-party dep) so the matrix is bit-for-bit identical on the server and in the browser — hydration is safe and snapshot tests are stable.
Two colour schemes. diverging (default) preserves sign — positive values tint var(--cb-accent), negative values tint var(--cb-warning), magnitude drives alpha. Zero is near-transparent. accent drops the sign and tints var(--cb-accent) by absolute value only — useful when you want to teach magnitude alone, without the sign distinction.
Controlled and uncontrolled. currentToken + onCurrentTokenChange lets a parent drive the highlighted row (typical when a narration walks through the vocabulary one token at a time). For uncontrolled mode, omit currentToken and pass defaultCurrentToken; clicking a vocabulary row then updates the highlight internally.
Spring transitions. When the current token changes, the row outline and arrow both spring with SPRINGS.smooth from @craft-bits/core/motion. prefers-reduced-motion: reduce collapses every spring to an instant swap.
Pure primitive. The component computes the embedding matrix internally from vocab, embeddingDim, and seed — there is no externally-provided matrix prop. Cells themselves are static rectangles for fast renders even at vocabularies of a few hundred entries; only the highlight outline and arrow animate.
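
The deterministic synthesis described above can be sketched as follows. This is an illustration under stated assumptions, not the library's source: the helper names (fnv1a, mulberry32, embeddingValue, buildEmbeddingMatrix) and the way seed, token, and dim are combined into one hash key are guesses, so the actual values the component renders will likely differ. The FNV-1a and mulberry32 routines themselves follow their commonly published 32-bit forms.

```typescript
/** 32-bit FNV-1a. Hashes UTF-16 code units, which matches the byte-wise form for ASCII. */
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, mod 2^32
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

/** mulberry32 PRNG: returns a generator of floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) | 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Assumed mixing: hash a "seed:token:dim" key, then take the first PRNG draw. */
function embeddingValue(token: string, dim: number, seed: number): number {
  const unit = mulberry32(fnv1a(`${seed}:${token}:${dim}`))();
  return unit * 2 - 1; // rescale [0, 1) to [-1, 1)
}

/** One row per vocabulary entry, embeddingDim values per row. */
function buildEmbeddingMatrix(
  vocab: readonly string[],
  embeddingDim: number,
  seed: number,
): number[][] {
  return vocab.map((token) =>
    Array.from({ length: embeddingDim }, (_, dim) =>
      embeddingValue(token, dim, seed),
    ),
  );
}
```

Because the sketch uses only 32-bit integer arithmetic (Math.imul and bitwise operators), calling buildEmbeddingMatrix with the same arguments yields the same numbers in Node and in a browser, which is the property that keeps hydration and snapshot tests stable.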

Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `vocab` | `readonly string[]` | required | Vocabulary indexed by the table. |
| `embeddingDim` | `number` | `8` | Number of cells per embedding row (clamped to `[1, 64]`). |
| `currentToken` | `string` | — | Controlled current token (must be a member of `vocab`). |
| `defaultCurrentToken` | `string` | first of `vocab` | Uncontrolled initial token. |
| `onCurrentTokenChange` | `(token: string) => void` | — | Fires when a vocabulary row is clicked. |
| `colorScheme` | `"diverging" \| "accent"` | `"diverging"` | Sign-preserving (accent / warning) vs magnitude-only (accent). |
| `seed` | `number` | `1` | Seed for the deterministic value hash. |
| `transition` | `Transition` | `SPRINGS.smooth` | Spring for arrow + outline transitions. |
| `className` | `string` | — | Merged onto the root via `cn()`. |
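
The defaults and the clamp in the table can be read as a small normalisation step. The sketch below shows one way that step could look; the names LookupInputs, clampEmbeddingDim, and resolveInitialToken are hypothetical, and two behaviours are assumptions the table does not state: non-integer dimensions are floored, and a token that is not in vocab falls back to the first entry.

```typescript
interface LookupInputs {
  vocab: readonly string[];
  currentToken?: string;
  defaultCurrentToken?: string;
}

/** Applies the documented default of 8 and the documented [1, 64] clamp. */
function clampEmbeddingDim(dim: number = 8): number {
  if (!Number.isFinite(dim)) return 8; // assumption: fall back to the default
  return Math.min(64, Math.max(1, Math.floor(dim))); // assumption: floor fractions
}

/**
 * Picks the token to highlight first: the controlled value if present,
 * otherwise the uncontrolled default, otherwise the first vocabulary entry.
 */
function resolveInitialToken(inputs: LookupInputs): string | undefined {
  const { vocab, currentToken, defaultCurrentToken } = inputs;
  const requested = currentToken ?? defaultCurrentToken;
  if (requested !== undefined && vocab.includes(requested)) return requested;
  return vocab[0]; // undefined only when vocab is empty
}
```

For example, resolveInitialToken({ vocab: ["the", "cat"], defaultCurrentToken: "cat" }) returns "cat", while omitting both token props returns "the".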

Accessibility

The outer element is role="figure" with an aria-label and a visually hidden aria-live="polite" summary — screen readers hear the vocabulary size, embedding dimension, and the current token whenever the selection changes.
Each vocabulary row carries an invisible click target that is at least 28px tall, so touch and pointer users can reliably hit the intended entry, including on a phone.
Colour is never the only signal — the diverging scheme distinguishes sign with two distinct hues (accent and warning), the row outline doubles as a non-colour highlight, and the arrow points the eye unambiguously to the right row.
Animations respect prefers-reduced-motion: reduce — the arrow + highlight springs collapse to an instant swap.
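
As a rough sketch of the two behaviours above, the code below builds a live-region summary string and collapses a spring to an instant swap when reduced motion is requested. Everything named here is hypothetical (SpringSketch, effectiveTransition, liveSummary, and the summary wording); the only real API used is the standard matchMedia query for prefers-reduced-motion, and the component's own implementation may differ.

```typescript
type SpringSketch = { type: "spring"; stiffness: number; damping: number };
type InstantSketch = { duration: 0 };

/** Reads the user's motion preference; returns false where matchMedia is unavailable. */
function prefersReducedMotion(): boolean {
  const g = globalThis as unknown as {
    matchMedia?: (query: string) => { matches: boolean };
  };
  return (
    typeof g.matchMedia === "function" &&
    g.matchMedia("(prefers-reduced-motion: reduce)").matches
  );
}

/** Keeps the spring by default, or swaps in a zero-duration transition. */
function effectiveTransition(
  spring: SpringSketch,
  reducedMotion: boolean,
): SpringSketch | InstantSketch {
  return reducedMotion ? { duration: 0 } : spring;
}

/** Assumed wording for the visually hidden aria-live summary. */
function liveSummary(
  vocabSize: number,
  embeddingDim: number,
  currentToken: string,
): string {
  return `Vocabulary of ${vocabSize} tokens, ${embeddingDim} values per embedding. Current token: ${currentToken}.`;
}
```

A caller would typically pass prefersReducedMotion() as the second argument to effectiveTransition and re-render the summary text whenever the selection changes.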

Credits

Extracted from: craftingattention (app/src/lessons/primitives/nn/DictionaryLookupViz.tsx). The source was an attention-as-soft-dictionary-lookup widget — a draggable 2D query / key / value scene with softmax over scaled dot products, a temperature knob, a phase machine, narration copy, and a ChallengeBtn "hard lookup" comparison. The library version drops the soft-lookup conceit entirely and ships the prior primitive every embedding lesson actually needs: a plain token → vector lookup table with vocabulary on one side, synthesised embedding rows on the other, and an arrow connecting the current token to its row. Soft-attention math belongs in AttentionHeatmap / AttentionStepperViz; this primitive teaches the lookup that comes before attention.