Continuous Batching Viz

A timeline of continuous batching in LLM serving. Every request occupies one horizontal track of token cells starting at its arrivalTime; as the cursor advances tick-by-tick, cells fill left-to-right. The moment a request's last cell fills, the row dims — the slot is free and the next arrival takes over without waiting for the rest of the batch.

Demo: six requests (r0 to r5, with 4, 7, 3, 6, 5, and 4 tokens) on a 13-tick trace, shown at tick 0 with 0 in-flight and 0 completed; playback is set to 320 ms per tick.

Installation

npx shadcn@latest add https://craftbits.dev/r/continuous-batching-viz.json

Usage

import { ContinuousBatchingViz } from "@craft-bits/core";

// arrivalTime is the tick where a request's track starts; numTokens is how many token cells it fills.
const requests = [
  { id: "r0", arrivalTime: 0, numTokens: 4 },
  { id: "r1", arrivalTime: 0, numTokens: 7 },
  { id: "r2", arrivalTime: 2, numTokens: 3 },
  { id: "r3", arrivalTime: 4, numTokens: 6 },
];

// Autoplay, advancing one tick every 320 ms.
<ContinuousBatchingViz requests={requests} playing playSpeed={320} />

Drive the cursor from outside (scrubbing, narration, MDX prose hover):

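// t and setT are owned by the caller, for example const [t, setT] = useState(0) inside your component.
<ContinuousBatchingViz
  requests={requests}
  currentTime={t}
  onCurrentTimeChange={setT}
/>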

Render a static snapshot at a specific tick:

<ContinuousBatchingViz
  requests={requests}
  defaultCurrentTime={6}
  playing={false}
/>

Understanding the component

  1. One row per request. Each row is a flex line with a label, a grid of maxTime cells, and a filled / total readout. Rows preserve their position across the trace — the component never reorders them when a request finishes.
  2. Cells fill as the cursor passes. A cell at column i belongs to a request when arrivalTime <= i < arrivalTime + numTokens, and it becomes filled once currentTime > i. Each fill is a motion.span whose opacity and scaleY animate with SPRINGS.snap.
  3. Completed rows dim. As soon as filled === numTokens, the row's accent fill drops to bg-cb-accent/40 and the label desaturates — visually freeing that slot. Static batching would keep the row at full opacity until the longest request in the batch finishes; continuous batching releases each row independently.
  4. Time axis underneath. The component renders sparse tick labels every ceil(maxTime / 10) cells in tabular-nums monospace, so the column positions of arrivals are readable at a glance.
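  5. Controlled, uncontrolled, or autoplay. currentTime and defaultCurrentTime follow the Radix value / defaultValue pattern: pass currentTime together with onCurrentTimeChange to own the cursor yourself, or pass defaultCurrentTime and let the component keep the state. When playing is true, an interval advances the cursor by one tick every playSpeed ms and loops back to 0 at the end of the trace.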
  6. Reduced motion. prefers-reduced-motion: reduce collapses every spring to duration: 0 and clamps the cursor to maxTime, so users who have asked for less motion see the final state immediately instead of an animated playback.
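
The rules in the list above can be sketched as a few pure functions. This is a minimal illustration, not the component's source: the names VizRequest, traceEnd, cellState, filledCount, nextTick, and tickLabelColumns are hypothetical, and two details are assumptions, namely that maxTime is the tick at which the last request finishes and that the cursor rests on that final tick once before looping.

```typescript
// Hypothetical request shape, mirroring the { id, arrivalTime, numTokens } items.
interface VizRequest {
  id: string;
  arrivalTime: number;
  numTokens: number;
}

type CellState = "outside" | "pending" | "filled";

// Assumed definition of maxTime: the tick at which the last request finishes.
function traceEnd(requests: readonly VizRequest[]): number {
  return requests.reduce(
    (end, r) => Math.max(end, r.arrivalTime + r.numTokens),
    0,
  );
}

// A column belongs to a request when arrivalTime <= column < arrivalTime + numTokens,
// and an owned cell is filled once currentTime > column.
function cellState(req: VizRequest, column: number, currentTime: number): CellState {
  const owned =
    req.arrivalTime <= column && column < req.arrivalTime + req.numTokens;
  if (!owned) return "outside";
  return currentTime > column ? "filled" : "pending";
}

// Number of filled cells in a row; the row dims when this equals numTokens.
function filledCount(req: VizRequest, currentTime: number): number {
  return Math.min(req.numTokens, Math.max(0, currentTime - req.arrivalTime));
}

// Autoplay step: advance one tick, looping back to 0 after the end of the trace.
// Assumption: the cursor shows the final tick once before wrapping.
function nextTick(currentTime: number, end: number): number {
  return currentTime >= end ? 0 : currentTime + 1;
}

// Sparse axis labels: one label every ceil(end / 10) columns.
function tickLabelColumns(end: number): number[] {
  const step = Math.max(1, Math.ceil(end / 10));
  const columns: number[] = [];
  for (let i = 0; i < end; i += step) columns.push(i);
  return columns;
}
```

Under these assumptions, the four sample requests from Usage give a trace that ends at tick 10, and at tick 6 the rows read 4/4, 6/7, 3/3, and 2/6.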

Props

Prop | Type | Default | Description
requests | readonly ContinuousBatchingVizRequest[] | - | Required. Each item has { id, arrivalTime, numTokens }.
currentTime | number | - | Controlled cursor tick.
defaultCurrentTime | number | 0 | Uncontrolled initial tick.
onCurrentTimeChange | (t: number) => void | - | Fires on autoplay tick or scrub.
playing | boolean | false | When true, autoplay advances the cursor at playSpeed.
playSpeed | number | 400 | Milliseconds per tick during autoplay. Minimum 40 ms.
transition | Transition | SPRINGS.snap | Spring used for cell-fill transitions.
className | string | - | Merged onto the root via cn().
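
Read as TypeScript, the table above might correspond to something like the following. This is a sketch rather than the library's published types: the Transition alias is a stand-in for the animation library's own type, and effectivePlaySpeed and resolveCurrentTime are hypothetical helpers. How the 40 ms minimum is enforced is not stated here, so clamping is an assumption.

```typescript
// Stand-in for the animation library's Transition type (assumption).
type Transition = Record<string, unknown>;

interface ContinuousBatchingVizRequest {
  id: string;
  arrivalTime: number;
  numTokens: number;
}

interface ContinuousBatchingVizProps {
  requests: readonly ContinuousBatchingVizRequest[];
  currentTime?: number; // controlled cursor tick
  defaultCurrentTime?: number; // uncontrolled initial tick, default 0
  onCurrentTimeChange?: (t: number) => void;
  playing?: boolean; // default false
  playSpeed?: number; // ms per tick, default 400
  transition?: Transition;
  className?: string;
}

const DEFAULT_PLAY_SPEED_MS = 400;
const MIN_PLAY_SPEED_MS = 40;

// Assumption: values below the documented minimum are clamped up to 40 ms.
function effectivePlaySpeed(playSpeed: number = DEFAULT_PLAY_SPEED_MS): number {
  return Math.max(MIN_PLAY_SPEED_MS, playSpeed);
}

// Controlled vs. uncontrolled: a defined currentTime prop wins over internal state.
function resolveCurrentTime(
  props: Pick<ContinuousBatchingVizProps, "currentTime">,
  internalTime: number,
): number {
  return props.currentTime !== undefined ? props.currentTime : internalTime;
}
```

Checking currentTime against undefined rather than for truthiness matters because tick 0 is a valid controlled value.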

Accessibility

  • The root element has role="figure" and contains a visually hidden aria-live="polite" summary that announces t, maxTime, and the in-flight, completed, and total request counts, so screen readers hear the progression on every tick.
  • Each row is a <li> with its own aria-label of the form "Request <id>: <filled> of <total> tokens generated." so individual request progress is accessible without the visual fills.
  • Color is never the only signal — every row carries a textual filled / total readout, and completed rows desaturate both their label and the readout.
  • Motion respects prefers-reduced-motion: reduce: every spring collapses to instant and the cursor jumps to the end of the trace.
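
The text alternatives described above can be sketched as two string builders. The row label follows the form quoted in the list; the exact wording of the live summary and the definition of "in-flight" (started but not finished) are assumptions, and rowAriaLabel, liveSummary, and RowProgress are hypothetical names.

```typescript
// Hypothetical per-row progress record.
interface RowProgress {
  id: string;
  filled: number;
  total: number;
}

// Row label in the documented form:
// "Request <id>: <filled> of <total> tokens generated."
function rowAriaLabel(row: RowProgress): string {
  return `Request ${row.id}: ${row.filled} of ${row.total} tokens generated.`;
}

// Live-region summary covering t, maxTime, and the three counts.
// Assumption: a request is in-flight once it has at least one filled cell
// and has not yet filled all of them.
function liveSummary(
  t: number,
  maxTime: number,
  rows: readonly RowProgress[],
): string {
  const completed = rows.filter((r) => r.filled >= r.total).length;
  const inFlight = rows.filter((r) => r.filled > 0 && r.filled < r.total).length;
  return (
    `Continuous batching at tick ${t} of ${maxTime}. ` +
    `${inFlight} in-flight, ${completed} completed, ${rows.length} total.`
  );
}
```

Because the live region is polite, each update waits for the screen reader to finish its current utterance, which keeps fast autoplay from interrupting other speech.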

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/viz/ContinuousBatchingViz.tsx). The source paired the timeline with a side-by-side static-vs-continuous comparison, a per-mode harness (Explore / Predict / Challenge) with bookmarks and undo / redo via useWidgetHistory, queue chips for waiting requests, GPU-utilization badges, score dots, and reveal-style narration. The library extract is the pure timeline primitive — requests in, animated token cells out — driven entirely by props with controlled / uncontrolled and play / pause APIs, so callers compose the comparison, the modes, and the narration around it.