---
name: query-reformulation-expander
description: Expand a user query into 3-5 rewrites plus a HyDE-style synthetic passage, then fuse the retrieval results to improve recall.
title: Query Reformulation Expander
category: search-retrieval
difficulty: intermediate
author: admin
icon: 🔄
input: text
output: structured-json
phase: enhance
domain: research
tags: query-expansion,hyde-retrieval,reciprocal-rank-fusion,rag-enhancement,paraphrase-generation,retrieval-fusion,document-matching,vocabulary-mismatch,synthetic-passage,rank-aggregation,search-augmentation,information-retrieval
best_for:
  - RAG systems with short user queries
  - Legal or technical document retrieval
  - Multi-vocabulary knowledge bases
  - Improving recall in semantic search
  - QA systems with domain-specific corpora
---

## Description

Takes a terse user query and expands it into multiple retrieval signals — paraphrases, a broader version, a more specific version, and a hypothetical answer passage — runs retrieval against each, and merges results with reciprocal rank fusion. Designed as a drop-in layer over any existing RAG pipeline.

## Why it works

Users write short queries; documents are written in long form. Single-query retrieval often misses relevant passages whose vocabulary differs from the query's. Expanding the query into multiple phrasings covers more of the corpus's vocabulary. HyDE (Hypothetical Document Embeddings) adds a complementary signal: it retrieves by similarity to a synthesized answer rather than to the question, so it tends to surface passages that read like an answer instead of ones that merely restate the question.

## How it works

1. Generate 3 paraphrases of the user query via LLM.
2. Generate one broader and one more-specific variant.
3. Generate a 100-word HyDE passage that pretends to answer the query.
4. Retrieve top-K against each variant and the HyDE passage.
5. Merge via reciprocal rank fusion with a small boost for passages retrieved by multiple variants.
6. Return the top N deduped chunks plus a provenance breakdown showing which variant retrieved each chunk.
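
The merge in steps 5 and 6 can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `fuse`, the smoothing constant `k=60` (a commonly used RRF default), the multiplicative form of the multi-variant boost, and the output field names are all assumptions made for the example. It expects the per-variant ranked lists from step 4 as plain lists of chunk IDs.

```python
from collections import defaultdict


def fuse(rankings, k=60, boost=0.1, top_n=5):
    """Merge per-variant ranked lists of chunk IDs with reciprocal rank fusion.

    rankings: dict mapping a variant label to chunk IDs, best match first.
    k:        RRF smoothing constant (assumed default; tune for your corpus).
    boost:    assumed bonus per additional variant that retrieved a chunk.
    top_n:    number of fused chunks to return.
    """
    scores = defaultdict(float)
    provenance = defaultdict(list)

    for variant, chunk_ids in rankings.items():
        seen = set()
        for rank, chunk_id in enumerate(chunk_ids, start=1):
            if chunk_id in seen:
                continue  # count each chunk once per variant (first position)
            seen.add(chunk_id)
            scores[chunk_id] += 1.0 / (k + rank)  # standard RRF contribution
            provenance[chunk_id].append(variant)

    fused = []
    for chunk_id, score in scores.items():
        n_variants = len(provenance[chunk_id])
        fused.append({
            "chunk_id": chunk_id,
            # small multiplicative boost for agreement across variants
            "score": score * (1 + boost * (n_variants - 1)),
            "retrieved_by": provenance[chunk_id],
        })

    # Highest score first; chunk ID breaks ties deterministically.
    fused.sort(key=lambda r: (-r["score"], r["chunk_id"]))
    return fused[:top_n]


# Hypothetical retrieval output for three of the variants.
rankings = {
    "paraphrase_1": ["c1", "c2", "c3"],
    "broader": ["c2", "c4"],
    "hyde": ["c2", "c1"],
}
results = fuse(rankings, top_n=3)
for row in results:
    print(row["chunk_id"], round(row["score"], 4), row["retrieved_by"])
```

Keying scores by chunk ID deduplicates across variants as a side effect, and the `retrieved_by` list doubles as the provenance breakdown. An additive boost would work as well; the multiplicative form is used here only because it keeps the boost proportional to the fused score.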
