Version: 0.11.0

Reranking and retrieval grounding

After initial retrieval, Lucenia's search pipeline processors can rerank results using LLM-powered analysis and enrich them with provenance information for RAG citation. These processors transform raw search hits into grounded, citation-ready results.

Multimodal reranking

The multimodal rerank response processor uses LLM inference to rerank the top-N search results by judging each result's content against the query. Unlike reranking that relies only on retrieval scores, it can assess relevance across text, images, and structured data.

How it works:

  1. Initial search returns top-N candidates
  2. The reranker sends each candidate (with its content and metadata) to an LLM inference provider
  3. The LLM scores each result for relevance to the original query
  4. Results are reordered by the LLM relevance score

This is especially useful for multimodal content, where a text query must be matched against images, charts, or mixed-content documents.
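
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not Lucenia's implementation: `rerank_top_n`, `keyword_overlap_score`, and the `rerank_score` key are hypothetical names, and a simple keyword-overlap function stands in for the call to an LLM inference provider. Leaving hits beyond the top N in their original order is also an assumption.

```python
from typing import Callable, Dict, List

Hit = Dict[str, object]


def rerank_top_n(
    query: str,
    hits: List[Hit],
    score_fn: Callable[[str, Hit], float],
    top_n: int = 20,
) -> List[Hit]:
    """Re-score the first top_n hits with score_fn and reorder them.

    Hits beyond top_n keep their original order after the reranked head.
    """
    head, tail = hits[:top_n], hits[top_n:]
    # Copy each hit and attach the new relevance score (hypothetical key).
    scored = [dict(hit, rerank_score=score_fn(query, hit)) for hit in head]
    # list.sort is stable, so ties keep their original retrieval order.
    scored.sort(key=lambda h: h["rerank_score"], reverse=True)
    return scored + tail


def keyword_overlap_score(query: str, hit: Hit) -> float:
    """Stand-in for the LLM call: fraction of query words found in the text."""
    words = set(query.lower().split())
    text = str(hit.get("text", "")).lower()
    return sum(w in text for w in words) / max(len(words), 1)


hits = [
    {"_id": "a", "text": "Index settings overview"},
    {"_id": "b", "text": "Coordinate reprojection between reference systems"},
]
reranked = rerank_top_n("coordinate reprojection", hits, keyword_overlap_score, top_n=2)
reranked_ids = [h["_id"] for h in reranked]
```

In a real deployment `score_fn` would wrap the inference provider call, so the cost of reranking grows with `top_n` rather than with the total number of matches.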

Retrieval grounding

The retrieval grounding response processor enriches search results with provenance and context information needed for reliable RAG applications. Each hit is annotated with:

  • Source provenance: Which document and section the result came from
  • Spatial context: Geographic coordinates and bounding boxes for geospatial content
  • Chunk context: Surrounding text and position within the original document
  • Confidence metadata: Extraction confidence and relevance signals

This enables downstream LLM applications to generate answers with proper citations, linking back to the exact source material.
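
To make the annotation idea concrete, here is a rough sketch of attaching source provenance and chunk context to a hit and turning it into a citation string. The `grounding` key and every field inside it are hypothetical and do not reflect Lucenia's actual response format. Spatial context and confidence metadata are left out for brevity.

```python
from typing import Dict, List


def ground_hit(
    hit: Dict[str, object],
    chunks: List[str],
    chunk_index: int,
    source_field: str = "title",
    window: int = 1,
) -> Dict[str, object]:
    """Return a copy of hit with a hypothetical 'grounding' annotation."""
    # Surrounding text: up to `window` chunks on each side of the matched chunk.
    before = chunks[max(0, chunk_index - window):chunk_index]
    after = chunks[chunk_index + 1:chunk_index + 1 + window]
    grounding = {
        "source": hit.get(source_field),
        "chunk_position": {"index": chunk_index, "total": len(chunks)},
        "context_before": before,
        "context_after": after,
    }
    return {**hit, "grounding": grounding}


def format_citation(grounded: Dict[str, object]) -> str:
    """Build a human-readable citation from the grounding annotation."""
    g = grounded["grounding"]
    pos = g["chunk_position"]
    return f'{g["source"]} (chunk {pos["index"] + 1} of {pos["total"]})'


chunks = ["Intro to CRS.", "Reprojection converts coordinates.", "Supported EPSG codes."]
hit = {"title": "Reprojection guide", "text": chunks[1]}
grounded = ground_hit(hit, chunks, chunk_index=1)
citation = format_citation(grounded)
```

A downstream prompt could then place `citation` next to each passage so the generated answer can point back to its source.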

End-to-end search pipeline

Combine query embedding, reranking, and grounding in a single search pipeline:

PUT _search/pipeline/ai-retrieval-pipeline
{
  "request_processors": [
    {
      "query_embedding": {
        "model_id": "amazon.titan-embed-text-v2:0",
        "provider": "bedrock",
        "embedding_field": "chunks.embedding",
        "query_text_field": "query.neural.chunks.embedding.query_text"
      }
    }
  ],
  "response_processors": [
    {
      "multimodal_rerank": {
        "top_n": 20,
        "context_fields": ["chunks.text", "title"],
        "provider": "bedrock",
        "model_id": "anthropic.claude-sonnet-4-20250514"
      }
    },
    {
      "retrieval_grounding": {
        "context_fields": ["chunks.text"],
        "source_field": "title"
      }
    }
  ]
}
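
Once the pipeline exists, a search request has to reference it. The sketch below only builds the request, it does not send it. It assumes Lucenia accepts a `search_pipeline` query parameter the way OpenSearch does, and it infers the `neural` query shape from the `query_text_field` path in the pipeline definition. The index name `docs` and the helper `build_search_request` are hypothetical.

```python
import json
from typing import Dict, Tuple
from urllib.parse import quote, urlencode


def build_search_request(index: str, pipeline: str, question: str) -> Tuple[str, str, Dict]:
    """Return (method, path, body) for a search that runs through a pipeline."""
    # Assumption: the pipeline is selected with ?search_pipeline=<name>.
    path = f"/{quote(index)}/_search?" + urlencode({"search_pipeline": pipeline})
    # Assumption: the plain-text question sits where query_text_field points,
    # i.e. query -> neural -> "chunks.embedding" -> query_text.
    body = {"query": {"neural": {"chunks.embedding": {"query_text": question}}}}
    return "GET", path, body


method, path, body = build_search_request(
    "docs",
    "ai-retrieval-pipeline",
    "How does Lucenia handle coordinate reprojection?",
)
print(method, path)
print(json.dumps(body, indent=2))
```

The client sends only text. Under these assumptions, the `query_embedding` request processor is what turns `query_text` into a vector before the search runs.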

The flow:

User query: "How does Lucenia handle coordinate reprojection?"
           │
           ▼
┌─────────────────────┐
│   Query embedding   │  Convert text query to vector
│   (request proc.)   │
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ k-NN + BM25 search  │  Retrieve top candidates
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│  Multimodal rerank  │  LLM re-scores for relevance
│  (response proc.)   │
└──────────┬──────────┘
           ▼
┌─────────────────────┐
│ Retrieval grounding │  Add citations and provenance
│  (response proc.)   │
└──────────┬──────────┘
           ▼
Grounded results with citations

The result is a search pipeline that takes a plain-text question, embeds it automatically, retrieves the most relevant content, reranks it using LLM reasoning, and returns results annotated with the provenance and context an AI application needs to generate cited answers.