Reranker Pipeline Viz

An interactive walkthrough of a cross-encoder reranking pipeline — the standard recipe for turning a noisy nearest-neighbour shortlist into the small, precise context the LLM actually reads. The viewer picks a natural-language query, then steps through four phases: retrieval (the bi-encoder shortlist), reranking (cross-encoder scores), filtering (everything below the threshold drops out), and packing (top-k into the LLM's context window).

Two extra modes build intuition. Predict asks the viewer to guess, from the bi-encoder ranking alone, which three documents make it into the LLM context, then reveals the cross-encoder scores in a staged sequence. Challenge runs multiple-choice rounds about when negation, fact mismatch, and temporal cues move documents up or down: the failure modes the cross-encoder is there to fix.

Installation

npx shadcn@latest add https://craftbits.dev/r/reranker-pipeline-viz.json

Usage

import { RerankerPipelineViz } from "@craft-bits/viz/reranker-pipeline-viz";
 
<RerankerPipelineViz />

Swap in your own docs and queries:

<RerankerPipelineViz
  docs={[
    { id: "d01", name: "Lecture transcript: Inflation" },
    { id: "d02", name: "Federal Reserve press release" },
    /* … */
  ]}
  queries={[
    {
      label: "What did the Fed do in 2022?",
      biEncoder: [
        { docId: "d01", rank: 1 },
        { docId: "d02", rank: 2 },
      ],
      crossEncoder: [
        { docId: "d02", score: 0.94, rank: 1 },
        { docId: "d01", score: 0.41, rank: 2, failurePattern: "topic drift" },
      ],
      threshold: 0.5,
      packCount: 3,
    },
  ]}
/>
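
Because every ranking entry points at a doc by docId, a typo in a custom fixture would reference a doc that is not in the pool. The sketch below is one way to catch that before rendering. The interfaces are inferred from the example above and may not match the exported RerankerPipelineVizDoc / RerankerPipelineVizQuery types exactly; findMissingDocIds is a hypothetical helper, not part of the component.

```typescript
// Shapes inferred from the usage example; the component's exported types may differ.
interface Doc { id: string; name: string; }
interface BiEncoderEntry { docId: string; rank: number; }
interface CrossEncoderEntry {
  docId: string;
  score: number;
  rank: number;
  failurePattern?: string;
}
interface Query {
  label: string;
  biEncoder: BiEncoderEntry[];
  crossEncoder: CrossEncoderEntry[];
  threshold: number;
  packCount: number;
}

// Hypothetical helper: returns every docId that a query ranks
// but that is absent from the doc pool (each id reported once).
function findMissingDocIds(docs: readonly Doc[], queries: readonly Query[]): string[] {
  const known: Record<string, true> = {};
  for (const d of docs) known[d.id] = true;

  const missing: string[] = [];
  for (const q of queries) {
    const entries: { docId: string }[] = [...q.biEncoder, ...q.crossEncoder];
    for (const entry of entries) {
      if (!known[entry.docId] && missing.indexOf(entry.docId) === -1) {
        missing.push(entry.docId);
      }
    }
  }
  return missing;
}

// Fixture copied from the usage example above.
const exampleDocs: Doc[] = [
  { id: "d01", name: "Lecture transcript: Inflation" },
  { id: "d02", name: "Federal Reserve press release" },
];
const exampleQueries: Query[] = [
  {
    label: "What did the Fed do in 2022?",
    biEncoder: [
      { docId: "d01", rank: 1 },
      { docId: "d02", rank: 2 },
    ],
    crossEncoder: [
      { docId: "d02", score: 0.94, rank: 1 },
      { docId: "d01", score: 0.41, rank: 2, failurePattern: "topic drift" },
    ],
    threshold: 0.5,
    packCount: 3,
  },
];

const missingIds = findMissingDocIds(exampleDocs, exampleQueries);
```

An empty result means every ranking entry resolves to a doc in the pool.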

Customise the Challenge round set:

<RerankerPipelineViz
  challengeRounds={[
    {
      prompt: "Why does the cross-encoder catch negation that the bi-encoder misses?",
      options: ["More parameters", "Joint query+doc attention", "Random luck"],
      correctIdx: 1,
      explanation:
        "The cross-encoder lets the query and doc attend to each other token-by-token, so 'NOT' can flip the meaning of a nearby phrase. The bi-encoder produces independent embeddings that never see each other.",
    },
  ]}
/>
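
To illustrate how a round's correctIdx can drive scoring, here is a minimal sketch. The ChallengeRound shape is inferred from the example above, and summariseRounds is a hypothetical helper; the component's internal scoring and rounding may differ.

```typescript
// Shape inferred from the challengeRounds example; the exported type may differ.
interface ChallengeRound {
  prompt: string;
  options: string[];
  correctIdx: number;
  explanation: string;
}

interface ChallengeSummary {
  correct: number;  // rounds answered correctly
  percent: number;  // rounded percentage, 0 when there are no rounds
  perfect: boolean; // true only if every round was answered correctly
}

// Hypothetical helper: answers[i] is the option index chosen for rounds[i].
// An unanswered round (undefined) counts as incorrect.
function summariseRounds(
  rounds: readonly ChallengeRound[],
  answers: readonly (number | undefined)[],
): ChallengeSummary {
  let correct = 0;
  rounds.forEach((round, i) => {
    if (answers[i] === round.correctIdx) correct += 1;
  });
  const total = rounds.length;
  const percent = total === 0 ? 0 : Math.round((100 * correct) / total);
  return { correct, percent, perfect: total > 0 && correct === total };
}

// Round copied from the example above (explanation shortened).
const exampleRounds: ChallengeRound[] = [
  {
    prompt: "Why does the cross-encoder catch negation that the bi-encoder misses?",
    options: ["More parameters", "Joint query+doc attention", "Random luck"],
    correctIdx: 1,
    explanation: "The cross-encoder lets the query and doc attend to each other token-by-token.",
  },
];

const rightAnswer = summariseRounds(exampleRounds, [1]);
const wrongAnswer = summariseRounds(exampleRounds, [0]);
```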

Understanding the component

  1. Mode strip. A controlled role="tablist" switches between Explore / Predict / Challenge. The strip uses aria-selected so screen-reader users hear the active tab.
  2. Query pills. Each query is a pill button with aria-pressed. Selecting in Explore mode jumps to the retrieval phase; selecting in Predict mode resets the prediction state.
  3. Phase strip (Explore). Four phase pills, gated forward. The viewer advances one phase at a time. Past / current / next / future phases each carry distinct visual treatment; the pulsing next phase is the affordance.
  4. Retrieval. Bi-encoder shortlist by rank, with a trailing dot previewing the cross-encoder relevance, so the viewer can see which apparently strong candidates are about to fall before the scores are revealed.
  5. Reranking. Cross-encoder scores rendered as a subtle horizontal score bar inside each row, a coloured numeric badge, and a ranking-delta indicator (↑ / ↓). Items with a failurePattern show a tiny annotation pill (e.g. negation).
  6. Filtering. Same list, dimmed below the threshold. A dashed separator at the threshold makes the cut visible. Filtered items don't reorder, so the viewer can see exactly which documents fell and where.
  7. Packing. A success-tinted panel showing the top-k passing documents that will be packed into the LLM context window.
  8. Predict mode. The bi-encoder list shows without scores. The viewer taps to select up to 3 docs. Pressing Check runs a staged reveal: cross-encoder scores fade in, the list re-orders to cross-encoder rank, pass / filtered labels appear, then the score callout lands.
  9. Challenge mode. Multiple-choice rounds about reranking failure modes: negation, fact mismatch, temporal cues, threshold counting. A ScoreDots row tracks correctness. The final summary shows the percentage score, with a bouncy entrance on a perfect run.
  10. Reduced motion. Every entrance collapses to { duration: 0 } under prefers-reduced-motion: reduce; the staged Predict reveal still happens, just faster (50/100/150/200 ms instead of 400/800/1400/1800 ms).
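
The data flow behind the reranking, filtering, packing, and Predict steps above can be sketched as plain functions. This is an illustration under stated assumptions, not the component's source: the function names are hypothetical, a score equal to the threshold is assumed to pass (only documents below it drop out), and the Predict score is assumed to be the number of selected docs that end up packed.

```typescript
interface BiRanked { docId: string; rank: number; }
interface Scored { docId: string; score: number; rank: number; }

// Filtering + packing: sort by cross-encoder rank, split at the threshold,
// then keep the top packCount passing documents.
function filterAndPack(crossEncoder: readonly Scored[], threshold: number, packCount: number) {
  const byRank = crossEncoder.slice().sort((a, b) => a.rank - b.rank);
  const passing = byRank.filter((e) => e.score >= threshold); // assumption: ties pass
  const filtered = byRank.filter((e) => e.score < threshold);
  return { passing, filtered, packed: passing.slice(0, packCount) };
}

// Ranking delta per doc: bi-encoder rank minus cross-encoder rank.
// Positive means the doc moved up (↑), negative means it moved down (↓).
function rankDeltas(bi: readonly BiRanked[], cross: readonly Scored[]): Record<string, number> {
  const biRank: Record<string, number> = {};
  for (const b of bi) biRank[b.docId] = b.rank;

  const deltas: Record<string, number> = {};
  for (const c of cross) {
    const before = biRank[c.docId];
    if (before !== undefined) deltas[c.docId] = before - c.rank;
  }
  return deltas;
}

// Predict mode (assumed scoring): how many selected docs were actually packed.
function countPredictionHits(selected: readonly string[], packed: readonly Scored[]): number {
  return selected.filter((id) => packed.some((p) => p.docId === id)).length;
}

// Values taken from the Usage example: threshold 0.5, packCount 3.
const bi: BiRanked[] = [
  { docId: "d01", rank: 1 },
  { docId: "d02", rank: 2 },
];
const cross: Scored[] = [
  { docId: "d02", score: 0.94, rank: 1 },
  { docId: "d01", score: 0.41, rank: 2 },
];

const result = filterAndPack(cross, 0.5, 3); // d02 passes and is packed, d01 is filtered
const deltas = rankDeltas(bi, cross);        // { d02: 1, d01: -1 }
const hits = countPredictionHits(["d01", "d02"], result.packed); // 1
```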

Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| docs | readonly RerankerPipelineVizDoc[] | 8 Eiffel-Tower docs | Doc pool referenced by each query's rankings. |
| queries | readonly RerankerPipelineVizQuery[] | 3 default queries | Query presets with bi-encoder + cross-encoder rankings, threshold, and pack count. |
| challengeRounds | readonly RerankerPipelineVizChallengeRound[] | 4 default rounds | Multiple-choice rounds for Challenge mode. |
| transition | Transition | SPRINGS.snap | Override entrance spring for rows and panels. |
| className | string | | Merged onto the root via cn(). |

Accessibility

  • The root is role="figure" with an aria-label summarising the visualisation.
  • The mode strip is a real role="tablist" with aria-selected on each tab; the query pills carry aria-pressed to mirror the active query.
  • An sr-only polite live region announces phase / reveal / round transitions via the narration string.
  • Every interactive control has a visible focus ring and ≥ 36px hit area.
  • The ScoreDots row in Challenge mode is role="status" with an aria-label that reports the current correct count.
  • Phase, rank, and pass / fail signals are paired with a non-colour cue (numeric badge, pass / filtered label, ↑ / ↓ delta).
  • Motion respects prefers-reduced-motion: reduce — every entrance collapses to instant.
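
As a rough illustration of the reduced-motion behaviour, the sketch below selects the staged Predict reveal delays and collapses an entrance transition to instant. The delay values come from the component description above; the function and constant names are hypothetical, and the real component may wire this differently.

```typescript
// Delay values (ms) as documented for the staged Predict reveal.
const FULL_REVEAL_DELAYS_MS: readonly number[] = [400, 800, 1400, 1800];
const REDUCED_REVEAL_DELAYS_MS: readonly number[] = [50, 100, 150, 200];

// Hypothetical helper: the reveal keeps its stages, only the timings shrink.
function revealDelays(reducedMotion: boolean): readonly number[] {
  return reducedMotion ? REDUCED_REVEAL_DELAYS_MS : FULL_REVEAL_DELAYS_MS;
}

// Hypothetical helper: any entrance transition collapses to { duration: 0 }
// when the user prefers reduced motion; otherwise the base transition is kept.
function entranceTransition(
  base: Record<string, unknown>,
  reducedMotion: boolean,
): Record<string, unknown> {
  return reducedMotion ? { duration: 0 } : base;
}

const reducedDelays = revealDelays(true);
const instant = entranceTransition({ type: "spring" }, true);
```

In a browser, the reducedMotion flag would typically come from window.matchMedia("(prefers-reduced-motion: reduce)").matches.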

Credits

  • Extracted from: craftingattention (app/src/lessons/primitives/systems/RerankerPipelineViz.tsx). The source was a lesson primitive built on the Widget / useWidgetHistory / ModeStrip / ChallengeBtn / FeedbackBadge / ScoreDots lesson-chrome stack and carried the CA palette tokens (--color-ink-*, --color-surface-raised, --color-accent-400, --color-warn-400, --color-success-400, --color-fail-400) plus the ca-narration banner class. All lesson chrome is stripped and the shell is rebuilt self-contained (mode tablist + query pills + phase strip + body + narration). Every palette reference remaps to semantic cb-* tokens (var(--cb-accent) / var(--cb-warning) / var(--cb-success) / var(--cb-error) / var(--cb-fg-*) / var(--cb-bg-*)). The hard-coded 8-doc fixture, 3 default queries, and 4 challenge rounds are lifted into docs / queries / challengeRounds props (defaults preserved as DEFAULT_RERANKER_DOCS / DEFAULT_RERANKER_QUERIES / DEFAULT_RERANKER_CHALLENGE_ROUNDS). Re-architected to forwardRef + cn() + ...props spread. Inline SPRINGS.snappy / SPRINGS.gentle re-key to canonical SPRINGS.snap / SPRINGS.smooth / SPRINGS.bouncy from @craft-bits/core/motion; STAGGER.tight collapses to the canonical STAGGER scalar. lessonId and the undo/redo useWidgetHistory were stripped.