Perplexity Viz

Per-position probability bars plus the headline language-model eval metric: mean log p and PP = exp(-mean log p). Each bar's height is the model's probability on the true token; an optional row of log p_i underneath each bar exposes the value actually being summed. A model that nails every token sits at PP = 1; a uniform-over-V guesser sits at PP = V.

Demo readout: PP = exp(-mean log p), N = 8, mean log p = -0.299, perplexity = 1.35


Installation

npx shadcn@latest add https://craftbits.dev/r/perplexity-viz.json

Usage

import { PerplexityViz } from "@craft-bits/core";
 
<PerplexityViz
  tokens={[
    { word: "the", probability: 0.92 },
    { word: "cat", probability: 0.78 },
    { word: "sat", probability: 0.65 },
    { word: "on",  probability: 0.88 },
    { word: "the", probability: 0.95 },
    { word: "mat", probability: 0.55 },
  ]}
/>

Hide the per-token log p_i row for a more compact placement:

<PerplexityViz tokens={tokens} showLogProb={false} />
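Many inference APIs return per-token log-probabilities rather than linear probabilities. A minimal sketch of the conversion into the `tokens` shape follows. It assumes a hypothetical `{ token, logprob }` input record holding natural-log values; the `ScoredToken` type and the `toVizTokens` helper are illustrative names, not part of @craft-bits/core.

```typescript
// Hypothetical input shape: one record per position, natural-log probability.
interface ScoredToken {
  token: string;
  logprob: number; // ln p of the true token, so <= 0
}

// Shape taken from the `tokens` prop documented on this page.
interface VizToken {
  word: string;
  probability: number;
}

// Convert log-probabilities to linear probabilities for the `tokens` prop.
function toVizTokens(scored: readonly ScoredToken[]): VizToken[] {
  return scored.map(({ token, logprob }) => ({
    word: token,
    // exp undoes the natural log; Math.min guards against rounding slightly above 1.
    probability: Math.min(1, Math.exp(logprob)),
  }));
}

// Example input (made-up values for illustration).
const tokens = toVizTokens([
  { token: "the", logprob: -0.083 },
  { token: "cat", logprob: -0.248 },
]);
```

The result can be passed straight to `<PerplexityViz tokens={tokens} />`. If the upstream log-probs are base 2 or base 10, convert them to natural log before calling the helper.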

Understanding the component

Perplexity, one line. PP = exp(-(1/N) Σ log p_i). The negated mean of the log-probs is the cross-entropy of the model on the true sequence (in nats); exponentiating turns it back into an effective "vocabulary size": the number of equally likely options the model is effectively choosing between at each position.
Two readouts, one number. The strip under the chart shows both the raw mean log-prob (handy when you want to talk in cross-entropy / nats, since cross-entropy is just its negation) and the perplexity itself (the number you'd quote in a paper). The mean log-prob is never positive; perplexity is the exponential of its negation, so it is always at least 1.
Bar height = p_i, not log p_i. The bars use the linear probability so a "high" bar matches the user's intuition for "the model was confident". The optional log p_i row underneath each bar is what's actually being summed — useful when teaching why a single low-probability token can crater the whole sentence's perplexity.
Numerical floor. Every probability is clamped at 1e-12 before the log so a stray 0 doesn't poison the mean with -Infinity. The floor is well below anything a calibrated LM ever emits, so it does not distort realistic inputs; a true zero still shows up as a large negative log p_i (ln 1e-12 ≈ -27.6) and pulls perplexity up sharply, exactly as the maths demands.
Spring transitions. Bar heights animate with SPRINGS.smooth from @craft-bits/core/motion so swapping tokens (e.g. driving from a scrubber or another model's output) produces a continuous reshape rather than a hard cut. Reduced-motion users snap.
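The formula and the numerical floor described above can be sketched as a small pure function. This is an illustrative reimplementation, not the component's source: the names `summarize`, `PROB_FLOOR`, and `PerplexitySummary` are hypothetical, and throwing on an empty array is an assumption about a case this page does not describe.

```typescript
// Floor value taken from the "Numerical floor" note on this page.
const PROB_FLOOR = 1e-12;

interface VizToken {
  word: string;
  probability: number;
}

interface PerplexitySummary {
  logProbs: number[];  // per-token ln p_i (the optional row under the bars)
  meanLogProb: number; // (1/N) * sum of ln p_i, never positive
  perplexity: number;  // exp(-meanLogProb), never below 1
}

function summarize(tokens: readonly VizToken[]): PerplexitySummary {
  if (tokens.length === 0) {
    // Assumption: the empty case is undefined here, so fail loudly.
    throw new Error("summarize needs at least one token");
  }
  // Clamp each probability into [PROB_FLOOR, 1] before taking the natural log.
  const logProbs = tokens.map(({ probability }) =>
    Math.log(Math.min(1, Math.max(PROB_FLOOR, probability)))
  );
  const meanLogProb =
    logProbs.reduce((sum, lp) => sum + lp, 0) / logProbs.length;
  return { logProbs, meanLogProb, perplexity: Math.exp(-meanLogProb) };
}

// The six tokens from the Usage section.
const sentence: VizToken[] = [
  { word: "the", probability: 0.92 },
  { word: "cat", probability: 0.78 },
  { word: "sat", probability: 0.65 },
  { word: "on", probability: 0.88 },
  { word: "the", probability: 0.95 },
  { word: "mat", probability: 0.55 },
];
const confident = summarize(sentence);
// confident.meanLogProb ≈ -0.257, confident.perplexity ≈ 1.29

// One badly missed token dominates the mean: swap "mat" for p = 0.001.
const cratered = summarize([
  ...sentence.slice(0, 5),
  { word: "mat", probability: 0.001 },
]);
// cratered.perplexity ≈ 3.70
```

Working in log space and exponentiating once at the end avoids multiplying many small probabilities together, which would underflow on long sequences.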

Props

Prop	Type	Default	Description
`tokens`	`readonly { word: string; probability: number }[]`	—	One entry per position. `probability` is clamped to `[1e-12, 1]` for the maths.
`showLogProb`	`boolean`	`true`	Render the per-token `log p_i` row underneath each bar.
`transition`	`Transition`	`SPRINGS.smooth`	Spring used when `tokens` changes.
`className`	`string`	—	Merged onto the root `<div>` via `cn()`.

Accessibility

The chart is wrapped in role="figure" with an aria-describedby summary that reads, for example, "Perplexity 2.34 over 6 tokens. Mean log-probability -0.853." Screen-reader users get the same headline metric a sighted user sees.
The summary uses aria-live="polite" so changes are announced when tokens is replaced.
All colors come from the --cb-fg / --cb-accent / --cb-fg-muted token family — contrast is governed by the theme and meets WCAG AA against the default --cb-bg-elevated.
Bar transitions use SPRINGS.smooth and respect prefers-reduced-motion: reduce — bar heights snap instantly when the user opts out of motion.
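The summary sentence quoted above could be assembled along these lines. This is a sketch only: the helper name `describePerplexity` is hypothetical, and the rounding (two decimals for perplexity, three for the mean log-probability) is inferred from the example string rather than confirmed from the component source.

```typescript
// Build the screen-reader summary from the already-computed headline numbers.
// Rounding widths are an assumption based on the example string on this page.
function describePerplexity(
  perplexity: number,
  meanLogProb: number,
  tokenCount: number
): string {
  return (
    `Perplexity ${perplexity.toFixed(2)} over ${tokenCount} tokens. ` +
    `Mean log-probability ${meanLogProb.toFixed(3)}.`
  );
}

const summaryText = describePerplexity(2.34, -0.853, 6);
```

Keeping the string in one pure helper makes it easy to point both the visible readout and the aria-describedby element at the same numbers, so the two cannot drift apart.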

Credits

Extracted from: craftingattention (app/src/lessons/primitives/viz/PerplexityViz.tsx). Stripped the lesson-specific Explore / Predict mode strip, the ModeStrip / ChallengeBtn / FeedbackBadge / ScoreDots / DoneCard chrome, the synthetic makeDist(confidence) distribution, the "effective choices" 10×10 grid (a different metric), and the TogglePill / LabeledSlider controls — generalised to a pure visualisation primitive that takes a real tokens array and shows the headline metric. Swapped inline SPRINGS.snappy for SPRINGS.smooth from @craft-bits/core/motion so bar reshapes feel continuous when streaming new token sequences in.
Reference: Perplexity (information theory).