Version: 0.11.1

Multimodal rerank processor

Introduced 0.11.0

The multimodal_rerank processor is a search response processor that reranks the top-N search results using a multimodal inference model (such as Claude 3 on Amazon Bedrock). It sends each query-passage pair to a vision-capable LLM, which scores the hit's relevance to the query, and then reorders the results by the returned relevance scores.

This is particularly useful for cross-modal search (text query against image tiles) and for improving precision on the first page of results.

Syntax

PUT /_search/pipeline/reranked-search
{
  "response_processors": [
    {
      "multimodal_rerank": {
        "model_id": "anthropic.claude-3-haiku-20240307-v1:0",
        "provider": "bedrock",
        "content_field": "text",
        "top_n": 10,
        "provider_config": {
          "region": "us-east-2"
        }
      }
    }
  ]
}

How it works

Initial search results (by vector similarity):
┌─────────────────────────────────────┐
│ Hit 1: score 0.92 "Budget report"   │  ← vector match but not relevant
│ Hit 2: score 0.89 "Revenue growth"  │  ← highly relevant
│ Hit 3: score 0.87 "Staff directory" │  ← not relevant
│ Hit 4: score 0.85 "Q4 performance"  │  ← highly relevant
│ Hit 5: score 0.83 "Office memo"     │  ← not relevant
│ ...top_n hits sent to reranker...   │
└─────────────────────┬───────────────┘
                      │
         multimodal_rerank processor
                      │
    For each hit, sends to Claude:
    "Query: quarterly financial results
     Passage: [hit text/image]
     Score relevance 0.0-1.0"
                      │
                      ▼
Reranked results:
┌─────────────────────────────────────┐
│ Hit 1: score 0.95 "Revenue growth"  │  ← promoted
│ Hit 2: score 0.91 "Q4 performance"  │  ← promoted
│ Hit 3: score 0.42 "Budget report"   │  ← demoted
│ Hit 4: score 0.15 "Staff directory" │  ← demoted
│ Hit 5: score 0.08 "Office memo"     │  ← demoted
└─────────────────────────────────────┘
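
The per-hit exchange in the diagram can be sketched in Python. This is a minimal illustration, not the processor's implementation: the prompt layout is copied from the diagram, while the helper names `build_rerank_prompt` and `parse_score`, the number-extraction regex, and the fallback to `0.0` for unparseable replies are assumptions made for this sketch.

```python
import re


def build_rerank_prompt(query: str, passage: str) -> str:
    """Format one query-passage pair using the layout shown in the diagram."""
    return (
        f"Query: {query}\n"
        f"Passage: {passage}\n"
        "Score relevance 0.0-1.0"
    )


# Matches the first unsigned integer or decimal number in a reply.
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_score(reply: str, default: float = 0.0) -> float:
    """Extract the first number from a model reply and clamp it to [0.0, 1.0].

    Assumption: returning `default` when no number is found is a choice made
    for this sketch, not documented processor behavior.
    """
    match = _NUMBER.search(reply)
    if match is None:
        return default
    return max(0.0, min(1.0, float(match.group())))


prompt = build_rerank_prompt("quarterly financial results", "Revenue growth")
score = parse_score("0.95")
```

Clamping guards against replies that fall outside the 0.0-1.0 range the prompt asks for.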

Configuration parameters

Parameter	Data type	Required/Optional	Description
`model_id`	String	Required	Inference model identifier (e.g., `anthropic.claude-3-haiku-20240307-v1:0`).
`provider`	String	Required	Inference provider: `bedrock` or `http`.
`content_field`	String	Optional	Source field in each hit containing passage text. Default is `text`.
`image_field`	String	Optional	Source field in each hit containing base64 image data. Enables cross-modal reranking.
`top_n`	Integer	Optional	Number of top results to rerank. Results beyond `top_n` keep their original order. Default is `10`.
`provider_config`	Object	Optional	Provider-specific configuration (region, credentials).
`tag`	String	Optional	The processor's identifier.
`description`	String	Optional	A description of the processor.
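
As a client-side convenience, the table above could be mirrored in a small Python helper that applies the documented defaults and emits the processor body. This is only a sketch: the class name `MultimodalRerankConfig`, the `to_processor` method, and the `top_n >= 1` check are hypothetical and not part of the processor's API.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional

# Provider values listed in the parameter table.
ALLOWED_PROVIDERS = {"bedrock", "http"}


@dataclass
class MultimodalRerankConfig:
    model_id: str                      # required
    provider: str                      # required: "bedrock" or "http"
    content_field: str = "text"        # documented default
    image_field: Optional[str] = None  # enables cross-modal reranking when set
    top_n: int = 10                    # documented default
    provider_config: dict = field(default_factory=dict)
    tag: Optional[str] = None
    description: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.model_id:
            raise ValueError("model_id is required")
        if self.provider not in ALLOWED_PROVIDERS:
            raise ValueError(f"provider must be one of {sorted(ALLOWED_PROVIDERS)}")
        # Assumption: the documentation does not state a lower bound for top_n.
        if self.top_n < 1:
            raise ValueError("top_n must be at least 1")

    def to_processor(self) -> dict:
        """Build the response-processor entry, omitting unset optional fields."""
        body = {k: v for k, v in asdict(self).items() if v is not None and v != {}}
        return {"multimodal_rerank": body}


config = MultimodalRerankConfig(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    provider="bedrock",
    provider_config={"region": "us-east-2"},
)
pipeline_body = {"response_processors": [config.to_processor()]}
```

With these inputs, `pipeline_body` has the same content as the request body in the Syntax section.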

Query input

The processor reads the query content from ext.query_embedding in the search request, the same extension that the query_embedding request processor uses. As a result, the reranker has access to the original query text, the query image, or both.
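
To make the lookup concrete, the following Python sketch reads the ext.query_embedding object from a search request body. The helper name `read_query_input` is hypothetical, and the `text` key inside the extension is an assumed field name used only for illustration; this page does not document the extension's schema.

```python
from typing import Optional


def read_query_input(search_request: dict) -> Optional[dict]:
    """Return the ext.query_embedding object from a request body, or None."""
    ext = search_request.get("ext") or {}
    query_input = ext.get("query_embedding")
    return query_input if isinstance(query_input, dict) else None


# Hypothetical request body: the "text" key is an assumed field name.
request = {
    "query": {"match_all": {}},
    "ext": {"query_embedding": {"text": "quarterly financial results"}},
}

query_input = read_query_input(request)
```

A request without the extension yields `None`, in which case a reranker would have no query to score against.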

Scoring

The processor sends each query-hit pair to the inference model with the rerank task type. The model returns a relevance score between 0.0 and 1.0:

1.0 = perfectly relevant
0.0 = completely irrelevant

The top_n hits are stable-sorted by descending score, so hits with equal scores keep their original relative order. Each reranked hit's score is replaced in place with the model's relevance score.
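
The reordering step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the processor's code: `rerank` and `score_fn` are hypothetical names, `score_fn` stands in for the inference call, the `_score` key is an assumed name for the hit score, and the sketch returns copies of the hits instead of mutating them.

```python
from typing import Callable


def rerank(
    hits: list[dict],
    score_fn: Callable[[dict], float],
    top_n: int = 10,
) -> list[dict]:
    """Rescore the first top_n hits and sort them by descending score.

    Hits beyond top_n are appended unchanged, in their original order.
    """
    head, tail = hits[:top_n], hits[top_n:]
    # Replace each hit's score with the model's relevance score.
    rescored = [{**hit, "_score": score_fn(hit)} for hit in head]
    # list.sort is stable, even with reverse=True, so ties keep original order.
    rescored.sort(key=lambda hit: hit["_score"], reverse=True)
    return rescored + tail


# Scores taken from the diagram in "How it works".
hits = [
    {"title": "Budget report", "_score": 0.92},
    {"title": "Revenue growth", "_score": 0.89},
    {"title": "Staff directory", "_score": 0.87},
    {"title": "Q4 performance", "_score": 0.85},
    {"title": "Office memo", "_score": 0.83},
]
model_scores = {
    "Budget report": 0.42,
    "Revenue growth": 0.95,
    "Staff directory": 0.15,
    "Q4 performance": 0.91,
    "Office memo": 0.08,
}

reranked = rerank(hits, lambda hit: model_scores[hit["title"]], top_n=5)
```

Because the sort is stable, two hits that receive the same relevance score stay in the order the initial search returned them.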

Using the processor

Example 1: Text reranking

PUT /_search/pipeline/reranked-semantic
{
  "request_processors": [
    {
      "query_embedding": {
        "model_id": "amazon.titan-embed-text-v2:0",
        "provider": "bedrock",
        "dimensions": 1024,
        "mode": "template",
        "query_template": "{\"query\":{\"nested\":{\"path\":\"chunks\",\"query\":{\"knn\":{\"chunks.embedding\":{\"vector\":${embedding},\"k\":20}}},\"inner_hits\":{\"_source\":[\"chunks.text\"],\"size\":3}}}}",
        "provider_config": { "region": "us-east-2" }
      }
    }
  ],
  "response_processors": [
    {
      "multimodal_rerank": {
        "model_id": "anthropic.claude-3-haiku-20240307-v1:0",
        "provider": "bedrock",
        "content_field": "text",
        "top_n": 10,
        "provider_config": { "region": "us-east-2" }
      }
    }
  ]
}

Example 2: Cross-modal reranking (text query vs image tiles)

PUT /_search/pipeline/reranked-geo
{
  "request_processors": [
    {
      "query_embedding": {
        "model_id": "amazon.titan-embed-image-v1",
        "provider": "bedrock",
        "dimensions": 1024,
        "mode": "template",
        "query_template": "{\"query\":{\"nested\":{\"path\":\"chunks\",\"query\":{\"knn\":{\"chunks.embedding\":{\"vector\":${embedding},\"k\":20}}}}}}",
        "provider_config": { "region": "us-east-1" }
      }
    }
  ],
  "response_processors": [
    {
      "multimodal_rerank": {
        "model_id": "anthropic.claude-3-haiku-20240307-v1:0",
        "provider": "bedrock",
        "image_field": "image_data",
        "top_n": 5,
        "provider_config": { "region": "us-east-2" }
      }
    }
  ]
}

Tip: Set top_n to a small number (5-10) to balance reranking quality against latency. Each hit requires one inference call to the LLM.

Example 3: Full pipeline (embed + rerank + grounding)

PUT /_search/pipeline/full-search
{
  "request_processors": [
    {
      "query_embedding": {
        "model_id": "amazon.titan-embed-text-v2:0",
        "provider": "bedrock",
        "dimensions": 1024,
        "mode": "template",
        "query_template": "{\"query\":{\"nested\":{\"path\":\"chunks\",\"query\":{\"knn\":{\"chunks.embedding\":{\"vector\":${embedding},\"k\":20}}},\"inner_hits\":{\"_source\":[\"chunks.text\"],\"size\":3}}}}",
        "provider_config": { "region": "us-east-2" }
      }
    }
  ],
  "response_processors": [
    {
      "multimodal_rerank": {
        "model_id": "anthropic.claude-3-haiku-20240307-v1:0",
        "provider": "bedrock",
        "content_field": "text",
        "top_n": 10,
        "provider_config": { "region": "us-east-2" }
      }
    },
    {
      "retrieval_grounding": {
        "include_provenance": true,
        "include_chunk_context": true
      }
    }
  ]
}
