Positional Encoding Viz

A heatmap visualisation of the sinusoidal positional-encoding matrix used by transformers since "Attention Is All You Need." Each row is a position in the sequence; each column is an embedding dimension. Column 2k holds sin(pos / base^(2k/d)) and column 2k+1 holds cos(pos / base^(2k/d)), where d is the model dimension (dModel). Low columns therefore oscillate fast across positions and high columns oscillate slowly, so each position gets a distinct pattern that an attention layer can recover.
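As a rough illustration of the formula above, the matrix could be computed as follows. This is a minimal sketch, not the component's actual source: the function name positionalEncoding is hypothetical, and the parameters simply mirror the seqLen, dModel, and base props.

```typescript
// Hypothetical helper (not part of @craft-bits/core): builds the
// seqLen x dModel sinusoidal encoding matrix described above.
function positionalEncoding(
  seqLen: number,
  dModel: number,
  base: number = 10000,
): number[][] {
  const matrix: number[][] = [];
  for (let pos = 0; pos < seqLen; pos++) {
    const row: number[] = [];
    for (let col = 0; col < dModel; col++) {
      // Columns 2k and 2k+1 share the same frequency index k.
      const k = Math.floor(col / 2);
      const angle = pos / Math.pow(base, (2 * k) / dModel);
      // Even columns hold the sine, odd columns the cosine.
      row.push(col % 2 === 0 ? Math.sin(angle) : Math.cos(angle));
    }
    matrix.push(row);
  }
  return matrix;
}
```

Calling positionalEncoding(32, 64) yields 32 rows of 64 values, each in [-1, 1]. Row 0 alternates 0 and 1, because sin(0) = 0 and cos(0) = 1.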

Positional encoding heatmap. Sinusoidal positional encoding, 32 positions by 64 dimensions, base 10000.

Installation

npx shadcn@latest add https://craftbits.dev/r/positional-encoding-viz.json

Usage

import { PositionalEncodingViz } from "@craft-bits/core";
 
<PositionalEncodingViz seqLen={32} dModel={64} />

Pass the standard transformer parameters explicitly and pick the diverging color scheme so positive and negative values separate visually:

<PositionalEncodingViz
  seqLen={64}
  dModel={128}
  base={10000}
  colorScheme="diverging"
/>
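To make "separate visually" concrete, here is a hedged sketch of how a cell value might map to a fill under each scheme. The helper cellFill and the CellFill shape are hypothetical. The token names come from the component description further down this page, and the linear magnitude-to-opacity mapping is an assumption, since the real component may apply a minimum opacity or a different curve.

```typescript
type ColorScheme = "diverging" | "accent";

interface CellFill {
  color: string;   // CSS custom property reference
  opacity: number; // 0..1, driven by |value|
}

// Hypothetical mapping from an encoding value in [-1, 1] to a fill.
function cellFill(value: number, scheme: ColorScheme = "diverging"): CellFill {
  // Assumption: opacity scales linearly with magnitude, clamped to 1.
  const opacity = Math.min(1, Math.abs(value));
  if (scheme === "accent") {
    // Magnitude-only: the sign is discarded.
    return { color: "var(--cb-accent)", opacity };
  }
  // Sign-preserving: accent for positive values, warning for negative.
  const color = value >= 0 ? "var(--cb-accent)" : "var(--cb-warning)";
  return { color, opacity };
}
```

Under this sketch, 0.5 and -0.5 share an opacity but differ in hue with the diverging scheme, and become indistinguishable with the accent scheme.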

Highlight one row and one column to teach how a single position interacts with each dimension:

<PositionalEncodingViz
  seqLen={32}
  dModel={64}
  highlightRow={8}
  highlightCol={4}
/>

Understanding the component

  1. The formula. For position pos and column col, the encoding is sin(pos / base^(2k/d)) when col = 2k and cos(pos / base^(2k/d)) when col = 2k+1, where d is dModel. The base is 10000 in the original transformer paper. The base^(2k/d) denominator makes the wavelength, 2π · base^(2k/d), grow geometrically with the column index: low dims have short wavelengths (2π at k = 0) and high dims have long ones (approaching 2π · base).
  2. Why this matters. Because each sin/cos pair oscillates at a different rate, every position gets a distinct vector. Shifting a position by a fixed offset rotates each sin/cos pair by an angle that depends only on that offset, so the dot product of two encodings depends only on how far apart the positions are. That is the kind of relative-position signal attention layers can pick up via dot products.
  3. Two color schemes. diverging (default) preserves sign — positive values tint var(--cb-accent), negative values tint var(--cb-warning), and magnitude drives alpha. Zero is near-transparent. accent drops the sign and tints var(--cb-accent) by absolute value only — useful when teaching magnitude alone.
  4. Optional highlights. Pass highlightRow to outline one position row with a solid accent stroke; pass highlightCol to outline one dimension column with a dashed accent stroke. Both can be set simultaneously; either can be omitted.
  5. Spring transitions. When a highlight index changes, the outline animates with SPRINGS.smooth from @craft-bits/core/motion. prefers-reduced-motion: reduce collapses the spring to an instant swap.
  6. Pure primitive. The component computes the matrix internally from seqLen, dModel, and base — there is no externally-provided matrix prop. Animations are limited to highlight outlines; the cells themselves are static rectangles for fast renders even at large seqLen and dModel.
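The rotation claim in item 2 can be checked numerically. The sketch below is illustrative only: encodeRow, rotateRow, and dot are hypothetical helpers, and an even dModel is assumed so that every sine column has a cosine partner.

```typescript
// Hypothetical helper: one row of the encoding matrix.
function encodeRow(pos: number, dModel: number, base: number = 10000): number[] {
  const row: number[] = [];
  for (let col = 0; col < dModel; col++) {
    const k = Math.floor(col / 2);
    const angle = pos / Math.pow(base, (2 * k) / dModel);
    row.push(col % 2 === 0 ? Math.sin(angle) : Math.cos(angle));
  }
  return row;
}

// Rotates each (sin, cos) pair by the angle that an offset of `delta`
// positions adds at that pair's frequency. Assumes row.length is even.
function rotateRow(row: number[], delta: number, base: number = 10000): number[] {
  const dModel = row.length;
  const out = new Array<number>(dModel).fill(0);
  for (let k = 0; 2 * k + 1 < dModel; k++) {
    const theta = delta / Math.pow(base, (2 * k) / dModel);
    const s = row[2 * k];
    const c = row[2 * k + 1];
    out[2 * k] = s * Math.cos(theta) + c * Math.sin(theta);     // sin(a + theta)
    out[2 * k + 1] = c * Math.cos(theta) - s * Math.sin(theta); // cos(a + theta)
  }
  return out;
}

// Plain dot product of two equal-length vectors.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}
```

Up to floating-point error, rotateRow(encodeRow(5, 64), 3) matches encodeRow(8, 64). Likewise dot(encodeRow(2, 64), encodeRow(7, 64)) matches dot(encodeRow(10, 64), encodeRow(15, 64)), since both pairs are five positions apart.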

Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| seqLen | number | 32 | Sequence length (number of rows). |
| dModel | number | 64 | Model dimension (number of columns). |
| base | number | 10000 | Wavelength base. The standard transformer value is 10000. |
| colorScheme | "diverging" \| "accent" | "diverging" | Sign-preserving (accent / warning) vs magnitude-only (accent). |
| highlightRow | number | | Highlight one position row with an accent outline. |
| highlightCol | number | | Highlight one dimension column with an accent outline. |
| size | number | 360 | SVG side length in pixels. |
| transition | Transition | SPRINGS.smooth | Spring for highlight transitions. |
| className | string | | Merged onto the root via cn(). |

Accessibility

  • The outer element is role="figure" with an aria-label and a visually hidden aria-live="polite" summary — screen readers hear the matrix dimensions, base, and active highlights whenever they change.
  • Color is never the only signal — the diverging scheme distinguishes sign with two distinct hues (accent and warning), and the accent scheme is unambiguous because there is no sign to lose.
  • The component is read-only — no interactive handles, so no keyboard or pointer behaviour to enumerate.
  • Animations respect prefers-reduced-motion: reduce — highlight springs collapse to an instant swap.
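For illustration, the live-region summary mentioned in the first bullet could be assembled along these lines. This is a sketch under assumptions: describeEncoding and SummaryInput are hypothetical names, the first sentence reuses the figure caption from this page, and the highlight phrasing is invented for the example rather than taken from the component.

```typescript
interface SummaryInput {
  seqLen: number;
  dModel: number;
  base: number;
  highlightRow?: number;
  highlightCol?: number;
}

// Hypothetical builder for the visually hidden aria-live text.
function describeEncoding(p: SummaryInput): string {
  const parts: string[] = [
    `Sinusoidal positional encoding, ${p.seqLen} positions by ${p.dModel} dimensions, base ${p.base}.`,
  ];
  // Highlights are only announced when the corresponding prop is set.
  if (p.highlightRow !== undefined) {
    parts.push(`Position ${p.highlightRow} highlighted.`);
  }
  if (p.highlightCol !== undefined) {
    parts.push(`Dimension ${p.highlightCol} highlighted.`);
  }
  return parts.join(" ");
}
```

Because the string is rebuilt from props, a polite live region would re-announce it whenever the dimensions, base, or highlights change.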

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/viz/PositionalEncodingViz.tsx). The source was a full lesson widget bundled with Explore / Predict / Challenge mode strips, dot-product similarity charts, line waveforms across positions and dimensions, history-undo state, and preset bookmarks. The library extract is the pure visualisation primitive — the heatmap of the encoding matrix itself, plus optional row + column highlights. Mode strips, similarity charts, and narration belong in the consuming lesson.