KV Head Sharing Viz

A two-row arrow diagram of the three attention-sharing schemes — MHA, GQA, MQA. The top row holds numQueryHeads query heads; the bottom row holds the active number of KV heads (numQueryHeads for MHA, gqaGroups for GQA, 1 for MQA). One arrow per query head points to the KV head it shares. Flipping the active scheme retweens the KV row and retargets the arrows.

[Interactive demo: GQA with 8 query heads sharing 4 KV heads, a 2:1 ratio (group size of 2 query heads per KV head).]

Installation

npx shadcn@latest add https://craftbits.dev/r/kv-head-sharing-viz.json

Usage

import { KVHeadSharingViz } from "@craft-bits/core";
 
<KVHeadSharingViz
  numQueryHeads={8}
  gqaGroups={4}
  defaultAttentionType="gqa"
/>

Drive the active type from outside (Radix-style controlled mode):

<KVHeadSharingViz
  numQueryHeads={8}
  gqaGroups={4}
  attentionType={type}
  onAttentionTypeChange={setType}
/>

Hide the picker pills when the parent already owns the toggle UI:

<KVHeadSharingViz showPicker={false} attentionType={type} />
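
The difference between the controlled and uncontrolled modes can be sketched without React. This is a minimal model of the pattern, not the component's actual source: createAttentionTypeState and ControllableOptions are hypothetical names, and the "gqa" fallback is an assumption, since the real default of defaultAttentionType is not stated here.

```typescript
type AttentionType = "mha" | "gqa" | "mqa";

interface ControllableOptions {
  value?: AttentionType;                 // maps to attentionType (controlled)
  defaultValue?: AttentionType;          // maps to defaultAttentionType (uncontrolled)
  onChange?: (t: AttentionType) => void; // maps to onAttentionTypeChange
}

// Hypothetical, framework-free model of the pattern. A React version
// would hold `internal` in useState instead of a closure variable.
function createAttentionTypeState(options: ControllableOptions) {
  // Assumption: fall back to "gqa" when no default is given.
  let internal: AttentionType = options.defaultValue ?? "gqa";
  const isControlled = options.value !== undefined;

  return {
    // Controlled: always mirror the parent's value.
    // Uncontrolled: read the internally stored value.
    get current(): AttentionType {
      return isControlled ? (options.value as AttentionType) : internal;
    },
    // Called when the picker commits a selection.
    select(next: AttentionType): void {
      if (!isControlled) internal = next; // only own the state when uncontrolled
      options.onChange?.(next);           // always notify the parent
    },
  };
}
```

In controlled mode, select() only notifies the parent, and the displayed value changes once the parent passes the new attentionType back down. That is why showPicker={false} with a bare attentionType works when the parent owns the toggle.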

Understanding the component

  1. Two rows, one arrow per query. Top row: numQueryHeads evenly-spaced query-head circles. Bottom row: the active KV-head count, also evenly spaced. Each query head emits a single arrow downward to the KV head it reads from.
  2. Three sharing schemes. MHA keeps a KV head per query head (1:1, no sharing). GQA collapses the KV row to gqaGroups — query heads form contiguous groups of size numQueryHeads / gqaGroups, all sharing one KV head. MQA collapses the KV row to a single head shared by every query.
  3. Group colouring. KV heads alternate between an accent tone and a muted tone; every query head inherits the colour of its KV head. This makes the grouping structure readable at a glance: MHA looks like a striped fan, GQA looks like a small set of coloured clusters, and MQA is a single coloured starburst.
  4. Controlled or uncontrolled. attentionType supports the Radix pattern — pass attentionType plus onAttentionTypeChange for controlled mode, defaultAttentionType for uncontrolled. The built-in picker is a role="radiogroup" of role="radio" pills with aria-checked. Suppress it via showPicker={false} when the parent owns the toggle UI.
  5. SPRINGS.smooth everywhere. KV-row position changes and arrow re-targets animate with the canonical smooth spring; prefers-reduced-motion: reduce collapses every spring to an instant swap.
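
The row sizes, arrow targets, and group colouring described above can be sketched as a few pure functions. This is an illustrative sketch rather than the component's internals. All helper names are hypothetical, the choice of accent for even KV indices is an assumption, and so is the handling of head counts that do not divide evenly.

```typescript
type AttentionType = "mha" | "gqa" | "mqa";
type Tone = "accent" | "muted";

// Number of KV heads on the bottom row for the active scheme.
function kvHeadCount(type: AttentionType, numQueryHeads: number, gqaGroups: number): number {
  if (type === "mha") return numQueryHeads; // 1:1, no sharing
  if (type === "mqa") return 1;             // one KV head shared by every query
  // GQA: clamp gqaGroups to [1, numQueryHeads], as the props table describes.
  return Math.min(Math.max(1, Math.floor(gqaGroups)), numQueryHeads);
}

// 0-based index of the KV head that query head q points to.
// Assumption: groups are contiguous; when numQueryHeads is not a multiple
// of kvHeads, this proportional floor spreads the remainder across groups.
function kvIndexForQuery(q: number, numQueryHeads: number, kvHeads: number): number {
  return Math.floor((q * kvHeads) / numQueryHeads);
}

// Assumption: even KV indices take the accent tone, odd ones the muted tone.
function toneForKv(kvIndex: number): Tone {
  return kvIndex % 2 === 0 ? "accent" : "muted";
}

// One arrow per query head, coloured like the KV head it targets.
function buildArrows(type: AttentionType, numQueryHeads: number, gqaGroups: number) {
  const kvHeads = kvHeadCount(type, numQueryHeads, gqaGroups);
  return Array.from({ length: numQueryHeads }, (_, q) => {
    const kv = kvIndexForQuery(q, numQueryHeads, kvHeads);
    return { query: q, kv, tone: toneForKv(kv) };
  });
}
```

With numQueryHeads = 8 and gqaGroups = 4, the arrows target KV indices 0, 0, 1, 1, 2, 2, 3, 3, which matches the 2:1 grouping in the demo.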

Props

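| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| numQueryHeads | number | 8 | Number of query heads on the top row. |
| attentionType | `'mha' \| 'gqa' \| 'mqa'` | | Active scheme in controlled mode; pair with onAttentionTypeChange. |
| defaultAttentionType | `'mha' \| 'gqa' \| 'mqa'` | | Initial scheme in uncontrolled mode. |
| onAttentionTypeChange | (t) => void | | Fires when the picker commits a new value. |
| gqaGroups | number | 4 | KV heads when attentionType === "gqa". Clamped to [1, numQueryHeads]. |
| showPicker | boolean | true | Render the built-in MHA / GQA / MQA picker pills. |
| transition | Transition | SPRINGS.smooth | Spring used for the KV-row and arrow tweens. |
| className | string | | Merged onto the root via cn(). |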

Accessibility

  • The figure is role="figure" with a hidden, aria-live="polite" summary describing the active scheme, the query-to-KV head count, and the group size — screen readers hear the new structure on every change.
  • The attention-type picker is a role="radiogroup" of role="radio" pills with aria-checked. Tab focuses the group; Space and Enter commit a selection.
  • Each pill carries a textual label (MHA, GQA, MQA) plus the resulting KV-head count so colour is never the only signal.
  • The SVG itself is aria-hidden="true" — the summary text carries the semantics. Node labels (Q1...Q8, KV1...KVk) are visual aids only.
  • Motion respects prefers-reduced-motion: reduce.
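
The hidden summary could be produced by a small pure function like the one below. This is a sketch only: describeSharing and SCHEME_LABEL are hypothetical names, and only the GQA wording is modelled on the demo's summary, so the MHA and MQA phrasings and the exact punctuation are assumptions.

```typescript
type AttentionType = "mha" | "gqa" | "mqa";

// Hypothetical labels. Only the GQA wording follows the demo summary.
const SCHEME_LABEL: Record<AttentionType, string> = {
  mha: "MHA: every query head has its own KV head",
  gqa: "GQA: query heads grouped onto shared KV heads",
  mqa: "MQA: all query heads share a single KV head",
};

// Pluralise a noun for a count, e.g. "1 KV head" vs "4 KV heads".
function countOf(n: number, noun: string): string {
  return `${n} ${noun}${n === 1 ? "" : "s"}`;
}

// Text for the aria-live region: scheme, head counts, and group size.
function describeSharing(type: AttentionType, numQueryHeads: number, kvHeads: number): string {
  const groupSize = numQueryHeads / kvHeads;
  // Assumption: show fractional group sizes with two decimals.
  const shown = groupSize % 1 === 0 ? groupSize : Number(groupSize.toFixed(2));
  return (
    `${SCHEME_LABEL[type]}. ` +
    `${countOf(numQueryHeads, "query head")} share ${countOf(kvHeads, "KV head")}. ` +
    `Group size: ${countOf(shown, "query head")} per KV head.`
  );
}
```

Because the SVG is aria-hidden, a string like this is the only description assistive technology receives, so it should name the scheme and both counts rather than rely on the drawing.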

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/viz/KVHeadSharingViz.tsx).
  • Stripped from the original: the Widget chrome (useWidgetHistory undo / redo, bookmarks, eyebrow plus premise plus caption plus formula); the Explore / Predict / Challenge ModeStrip; the four-question predict-round generator (kv heads / group size / mem percent / mem savings); the five-challenge progression with FeedbackBadge plus ScoreDots; the gqa-2 and gqa-4 enumerated SharingMode union; the CompareView of all four modes side-by-side; and the memory-percentage bar at the bottom of the SVG.
  • The library extract is the pure two-row arrow diagram (query heads, KV heads, arrows, picker), driven entirely by props with SPRINGS.smooth on the row reflow.