HBM Traffic Viz
A per-operation bar chart of memory traffic between HBM and SRAM on a GPU. Each operation row gets two bars at the same scale (reads on the left, writes on the right), so the user reads volume by length. A footer totals reads, writes, and (optionally) the bandwidth-bound time t = (read + write) / bandwidth, which is the latency lower bound for a memory-bound kernel. Pairs naturally with FlashAttentionViz to teach why tiling wins: it minimises this exact number.
Example render (4 operations at 1555 GB/s):
- weights load: 10.3 ms
- KV load: 772 µs
- output write: 25.7 µs
- KV write: 25.7 µs
Total: 17.2 GB read, 81.9 MB written, bandwidth-bound time 11.1 ms.
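The times in the example above follow from t = (read + write) / bandwidth. A minimal sketch of that arithmetic, assuming the row values from the usage example and a hypothetical helper name `bandwidthBoundSeconds` (it is not part of the library's API):

```typescript
// Hypothetical helper, not exported by @craft-bits/core.
// Returns the bandwidth-bound lower bound, in seconds, for one operation.
function bandwidthBoundSeconds(
  readGb: number,
  writeGb: number,
  bandwidthGbPerSec: number,
): number {
  if (bandwidthGbPerSec <= 0) {
    throw new RangeError("bandwidthGbPerSec must be positive");
  }
  return (readGb + writeGb) / bandwidthGbPerSec;
}

// The four rows from the example, evaluated at 1555 GB/s.
const exampleRows = [
  { label: "weights load", readGb: 16, writeGb: 0 },
  { label: "KV load", readGb: 1.2, writeGb: 0 },
  { label: "output write", readGb: 0, writeGb: 0.04 },
  { label: "KV write", readGb: 0, writeGb: 0.04 },
];

// Summing the per-row bounds gives the kernel-level bound shown in the footer.
const totalSeconds = exampleRows
  .map((row) => bandwidthBoundSeconds(row.readGb, row.writeGb, 1555))
  .reduce((sum, seconds) => sum + seconds, 0);

console.log((bandwidthBoundSeconds(16, 0, 1555) * 1e3).toFixed(1)); // "10.3" (ms, weights load)
console.log((totalSeconds * 1e3).toFixed(1)); // "11.1" (ms, total)
```

Because the bound is linear in bytes moved, summing per-row times and dividing the total volume by the bandwidth give the same result.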
Installation
npx shadcn@latest add https://craftbits.dev/r/hbm-traffic-viz.json
Usage
import { HBMTrafficViz } from "@craft-bits/core";

<HBMTrafficViz
  operations={[
    { id: "weights", label: "weights load", readGb: 16, writeGb: 0 },
    { id: "kv-load", label: "KV load", readGb: 1.2, writeGb: 0 },
    { id: "output", label: "output write", readGb: 0, writeGb: 0.04 },
    { id: "kv-write", label: "KV write", readGb: 0, writeGb: 0.04 },
  ]}
  bandwidthGbPerSec={1555}
  showTime
/>

Hide the time column when teaching pure volume:

<HBMTrafficViz operations={ops} showTime={false} />

Compare two kernels side-by-side by feeding two different operations arrays:

<div className="grid grid-cols-2 gap-4">
  <HBMTrafficViz operations={standardOps} bandwidthGbPerSec={1555} />
  <HBMTrafficViz operations={flashOps} bandwidthGbPerSec={1555} />
</div>

Understanding the component
- One bar pair per operation. Each row has two bars at the same scale: reads on the left (anchored to the right edge so the bar grows toward the central gutter) and writes on the right (anchored to the left edge so the bar grows away). The pairing makes asymmetric rows (load-only or store-only) read at a glance.
- Shared normaliser. Every bar width is divided by the largest readGb or writeGb value across all operations. A write row never balloons to 100% just because no read row exists; comparison preserves real proportions across rows and across the read / write axis.
- Bandwidth-bound time. When showTime is on, each row shows (readGb + writeGb) / bandwidthGbPerSec in ms (µs for sub-millisecond rows). The footer sums the rows to give the kernel-level lower bound. The default 1555 GB/s matches an A100 40 GB; pass H100 (3350), MI300X (5300), or your own measured peak as needed.
- SPRINGS.smooth everywhere. Bar-width changes animate with the canonical smooth spring; prefers-reduced-motion: reduce collapses every spring to an instant swap.
- Pure presentation. The component does no GPU spec lookups, no operation inference, no formula display: it plots what you feed it. Build dataset transforms (Llama-3-8B at fp16 → ops list) in your lesson, then hand the result here.
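The shared normaliser described above can be sketched as follows. This is an illustration under assumptions, not the component's source: `barFractions` and the `TrafficRow` shape are hypothetical names, and the returned fractions stand in for whatever width unit the component actually uses.

```typescript
// Hypothetical row shape for this sketch only.
interface TrafficRow {
  id: string;
  readGb: number;
  writeGb: number;
}

// Divide every bar by the single largest value found on either axis,
// so reads and writes stay comparable across all rows.
function barFractions(rows: TrafficRow[]) {
  let max = 0;
  for (const row of rows) {
    max = Math.max(max, row.readGb, row.writeGb);
  }
  return rows.map((row) => ({
    id: row.id,
    // Guard against an all-zero input to avoid dividing by zero.
    read: max > 0 ? row.readGb / max : 0,
    write: max > 0 ? row.writeGb / max : 0,
  }));
}

const fractions = barFractions([
  { id: "weights", readGb: 16, writeGb: 0 },
  { id: "output", readGb: 0, writeGb: 0.04 },
]);
console.log(fractions);
```

With these two rows the weights read bar fills its track, while the output write bar is 0.04 / 16 of a track. A per-axis normaliser would instead stretch that write bar to full width and hide the imbalance.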
Props
| Prop | Type | Default | Description |
|---|---|---|---|
| operations | HBMTrafficVizOperation[] | - | Rows to render, in display order. |
| bandwidthGbPerSec | number | 1555 | Peak HBM bandwidth in GB/s. A100 40 GB by default. |
| showTime | boolean | true | Show per-row and total bandwidth-bound times. |
| transition | Transition | SPRINGS.smooth | Spring used for bar-width transitions. |
| className | string | - | Merged onto the root via cn(). |
HBMTrafficVizOperation
| Field | Type | Description |
|---|---|---|
| id | string | Stable React key. |
| label | string | Short label rendered to the left of the bars. |
| readGb | number | Gigabytes read from HBM by this operation. |
| writeGb | number | Gigabytes written to HBM by this operation. |
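The field table maps onto a small TypeScript shape. The sketch below restates it as a local interface (the library's exported type may differ in detail) and adds a hypothetical `weightsLoadOp` helper of the kind a lesson might write to turn model facts into a row. It assumes decimal gigabytes (1 GB = 1e9 bytes), a round parameter count, and that every parameter is read from HBM exactly once.

```typescript
// Local restatement of the field table; the real exported type may differ.
interface HBMTrafficVizOperation {
  id: string;
  label: string;
  readGb: number;
  writeGb: number;
}

// Hypothetical lesson-side helper: one full pass over the model weights.
function weightsLoadOp(
  paramCount: number,
  bytesPerParam: number,
): HBMTrafficVizOperation {
  const bytes = paramCount * bytesPerParam;
  return {
    id: "weights",
    label: "weights load",
    readGb: bytes / 1e9, // assumption: decimal gigabytes
    writeGb: 0,
  };
}

// Roughly 8e9 parameters at fp16 (2 bytes each) yields readGb = 16.
const lessonOps: HBMTrafficVizOperation[] = [weightsLoadOp(8e9, 2)];
console.log(lessonOps);
```

Keeping this transform in the lesson rather than in the component matches the component's plot-only contract: the chart only ever sees finished readGb and writeGb numbers.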
Accessibility
- The figure is role="figure" with a hidden summary listing operation count, total reads, total writes, and the bandwidth-bound time; screen readers hear the headline whenever props change.
- The chart axis is labelled textually ("read" / "write" / "time") above the rows; colour is never the only signal.
- Bar widths animate via motion/react; prefers-reduced-motion: reduce collapses every transition to an instant swap.
- All numeric readouts use font-variant-numeric: tabular-nums so values do not reflow when they update.
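As an illustration of the hidden summary, the sketch below builds a sentence in the same shape as the example readout. It is not the component's source: `describeTraffic`, `formatGb`, `formatSeconds`, and `sig3` are hypothetical names, and both the three-significant-digit rounding and the 1024 MB-per-GB step are assumptions inferred from the sample figures (0.08 GB of writes shown as 81.9 MB).

```typescript
// Round to three significant digits; Number() drops the exponent form that
// toPrecision() emits for values of 1000 and above.
function sig3(value: number): string {
  return String(Number(value.toPrecision(3)));
}

// Assumed unit step: values under 1 GB are shown in MB using a factor of 1024.
function formatGb(gb: number): string {
  return gb >= 1 ? `${sig3(gb)} GB` : `${sig3(gb * 1024)} MB`;
}

// Milliseconds for values of 1 ms and above, microseconds below that.
function formatSeconds(seconds: number): string {
  return seconds >= 1e-3
    ? `${sig3(seconds * 1e3)} ms`
    : `${sig3(seconds * 1e6)} µs`;
}

function describeTraffic(
  ops: { readGb: number; writeGb: number }[],
  bandwidthGbPerSec: number,
): string {
  let read = 0;
  let write = 0;
  for (const op of ops) {
    read += op.readGb;
    write += op.writeGb;
  }
  const seconds = (read + write) / bandwidthGbPerSec;
  return (
    `HBM traffic. ${ops.length} operations. ` +
    `Total read ${formatGb(read)}, total write ${formatGb(write)}, ` +
    `bandwidth-bound time ${formatSeconds(seconds)} at ${bandwidthGbPerSec} GB per second.`
  );
}

const summary = describeTraffic(
  [
    { readGb: 16, writeGb: 0 },
    { readGb: 1.2, writeGb: 0 },
    { readGb: 0, writeGb: 0.04 },
    { readGb: 0, writeGb: 0.04 },
  ],
  1555,
);
console.log(summary);
```

Spelling out "GB per second" rather than "GB/s" keeps the sentence readable when a screen reader speaks it aloud.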
Credits
- Extracted from: craftingattention (app/src/lessons/primitives/nn/HBMTrafficViz.tsx). Stripped the four-phase narration state machine (observe / standard-done / flash-done / insight), the hard-coded Standard-vs-Flash two-column SVG comparison with animated shuttle rects, the SEQ_LENGTHS = [512, 1024, 2048, 4096, 8192] slider, the Play / Replay button driving imperative motion/animate timelines, and the head_dim = 64, fp16 hard-coded formulae. The library extract is the pure plotting primitive (an operations array in, a normalised read / write bar chart with optional bandwidth time out), driven entirely by props.