Skip to main content
Version: 0.11.0

Extensibility and custom integrations

Lucenia's AI retrieval pipeline is built on a service provider interface (SPI) architecture that lets you extend every stage of the pipeline with custom implementations. Whether you need to extract content from a proprietary format, connect to an in-house embedding model, tile specialized imagery, or reproject coordinates, you can plug into the pipeline without modifying core code.

Extension points

The following SPI interfaces are available for customization:

InterfacePurposeExample
ContentExtractorExtract content from custom document formatsNITF news format, proprietary internal formats
RasterSourceRead and decode custom raster/image formatsSpecialized satellite imagery, medical imaging
EmbeddingProviderGenerate embeddings from custom model endpointsPrivate fine-tuned models, on-premise GPU clusters
InferenceProviderRun inference against custom LLM endpointsPrivate LLMs for OCR, reranking, or classification
CRSProviderInterfaceCustom coordinate reference system transformationsCustom geodetic datums, local engineering grids

For detailed interface definitions and registration, see Extending ingest-content.

How it works

Each extension point follows the same pattern:

  1. Implement the interface -- Create a Java class that implements the SPI interface
  2. Register the provider -- Add a META-INF/services file to your JAR with your class name
  3. Package as a plugin -- Build a Lucenia plugin ZIP with your JAR and dependencies
  4. Install and configure -- Install the plugin and reference your provider in pipeline configurations

Priority-based selection

When multiple implementations support the same MIME type or provider name, the one with the highest priority() value wins:

PriorityMeaning
0Built-in default
1--9Community extensions
10--19Vendor-specific overrides
20+Customer-specific customizations

Plugin project structure

All custom SPI implementations follow the same project layout. Here is the recommended structure using Gradle:

my-lucenia-plugin/
├── build.gradle
├── src/
│ └── main/
│ ├── java/
│ │ └── com/example/myplugin/
│ │ └── MyProvider.java
│ └── resources/
│ └── META-INF/
│ └── services/
│ └── io.lucenia.ingest.content.enrich.raster.RasterSource
└── plugin-descriptor.properties

build.gradle

plugins {
id 'java'
}

group = 'com.example'
version = '1.0.0'

java {
sourceCompatibility = JavaVersion.VERSION_21
targetCompatibility = JavaVersion.VERSION_21
}

repositories {
mavenCentral()
// Add Lucenia Maven repository
maven { url 'https://maven.lucenia.io/releases' }
}

dependencies {
compileOnly "io.lucenia:lucenia-ingest-content:0.11.0"

// Add your implementation-specific dependencies here
// implementation "org.gdal:gdal:3.9.0"
}

// Package as a Lucenia plugin ZIP
task buildPlugin(type: Zip) {
archiveBaseName = 'my-lucenia-plugin'
from(jar)
from(configurations.runtimeClasspath)
from('plugin-descriptor.properties')
}

plugin-descriptor.properties

name=my-lucenia-plugin
version=1.0.0
description=Custom SPI provider for Lucenia
lucenia.version=0.11.1
java.version=21
classname=com.example.myplugin.MyPlugin

Install your plugin

bin/lucenia-plugin install file:///path/to/my-lucenia-plugin-1.0.0.zip

Example: Custom image tiling with RasterSource

Organizations processing proprietary raster formats -- such as NITF imagery, MrSID compressed tiles, or ECW wavelet-compressed imagery -- can implement a custom RasterSource to integrate with the image tiling processor.

RasterSource interface

public interface RasterSource {

/** Whether this source handles the given MIME type. */
boolean supports(String mimeType, Map<String, Object> metadata);

/** Extract metadata: dimensions, CRS, geo-transform, bands. */
RasterMetadata metadata(InputStream stream, Map<String, Object> context)
throws IOException;

/** Read a pixel window from the raster. */
BufferedImage readWindow(InputStream stream, TileWindow window)
throws IOException;

/** Priority for MIME type conflicts. Higher values win. */
int priority();

/**
* Open a stateful session for efficient multi-tile reads.
* Return null to use the per-tile stream approach.
*/
default RasterReadSession openSession(
InputStream stream, Map<String, Object> context) throws IOException {
return null;
}
}

Implementation: NITF raster source via GDAL

This example uses GDAL JNI bindings to read NITF imagery with full geospatial metadata extraction and efficient session-based tiling:

package com.example.nitf;

import io.lucenia.ingest.content.enrich.raster.*;
import org.gdal.gdal.Band;
import org.gdal.gdal.Dataset;
import org.gdal.gdal.gdal;
import org.gdal.osr.SpatialReference;

import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Set;

public class NitfRasterSource implements RasterSource {

private static final Set<String> SUPPORTED_TYPES = Set.of(
"image/nitf", "application/vnd.nitf",
"image/x-mrsid", "image/ecw"
);

static {
gdal.AllRegister();
}

@Override
public boolean supports(String mimeType, Map<String, Object> metadata) {
return SUPPORTED_TYPES.contains(mimeType);
}

@Override
public RasterMetadata metadata(InputStream stream, Map<String, Object> context)
throws IOException {
// Use URI if available (for remote/large files), otherwise spool to temp
String path = resolveDatasetPath(stream, context);
Dataset ds = gdal.Open(path);
if (ds == null) {
throw new IOException("GDAL cannot open: " + path);
}
try {
return extractMetadata(ds);
} finally {
ds.delete();
}
}

@Override
public BufferedImage readWindow(InputStream stream, TileWindow window)
throws IOException {
// For one-off reads without a session
String path = resolveDatasetPath(stream, Map.of());
Dataset ds = gdal.Open(path);
if (ds == null) {
throw new IOException("GDAL cannot open dataset");
}
try {
return readPixels(ds, window);
} finally {
ds.delete();
}
}

@Override
public int priority() {
return 20; // Customer-level override
}

@Override
public RasterReadSession openSession(InputStream stream, Map<String, Object> context)
throws IOException {
String path = resolveDatasetPath(stream, context);
return new GdalReadSession(path);
}

// --- Helper methods ---

private RasterMetadata extractMetadata(Dataset ds) {
double[] gt = ds.GetGeoTransform();
SpatialReference srs = new SpatialReference(ds.GetProjection());

return new RasterMetadata.Builder()
.width(ds.getRasterXSize())
.height(ds.getRasterYSize())
.bands(ds.getRasterCount())
.geoTransform(gt)
.crs(srs.ExportToWkt())
.extra(Map.of(
"driver", ds.GetDriver().getShortName(),
"interleave", ds.GetMetadataItem("INTERLEAVE", "IMAGE_STRUCTURE")
))
.build();
}

private BufferedImage readPixels(Dataset ds, TileWindow window) {
int w = window.width();
int h = window.height();
int bands = Math.min(ds.getRasterCount(), 3);
int[] pixels = new int[w * h];

// Read RGB bands and pack into ARGB int array
for (int b = 1; b <= bands; b++) {
int[] bandData = new int[w * h];
ds.GetRasterBand(b).ReadRaster(
window.x(), window.y(), w, h, bandData
);
int shift = (3 - b) * 8; // R=16, G=8, B=0
for (int i = 0; i < pixels.length; i++) {
pixels[i] |= (bandData[i] & 0xFF) << shift;
pixels[i] |= 0xFF000000; // opaque alpha
}
}

BufferedImage img = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
System.arraycopy(pixels, 0,
((DataBufferInt) img.getRaster().getDataBuffer()).getData(), 0,
pixels.length);
return img;
}

private String resolveDatasetPath(InputStream stream, Map<String, Object> context)
throws IOException {
// Prefer URI for GDAL's virtual filesystem (VSICURL, etc.)
Object uri = context.get("uri");
if (uri != null) {
String uriStr = uri.toString();
if (uriStr.startsWith("http")) {
return "/vsicurl/" + uriStr;
}
return uriStr;
}
// Fall back to spooling the stream to a temp file
Path tmp = Files.createTempFile("lucenia-gdal-", ".ntf");
try (OutputStream out = Files.newOutputStream(tmp)) {
stream.transferTo(out);
}
return tmp.toString();
}

// --- Stateful session for efficient multi-tile reads ---

static class GdalReadSession implements RasterReadSession {
private final Dataset dataset;
private final RasterMetadata meta;

GdalReadSession(String path) throws IOException {
this.dataset = gdal.Open(path);
if (this.dataset == null) {
throw new IOException("GDAL cannot open: " + path);
}
// Cache metadata once for the session
double[] gt = dataset.GetGeoTransform();
SpatialReference srs = new SpatialReference(dataset.GetProjection());
this.meta = new RasterMetadata.Builder()
.width(dataset.getRasterXSize())
.height(dataset.getRasterYSize())
.bands(dataset.getRasterCount())
.geoTransform(gt)
.crs(srs.ExportToWkt())
.build();
}

@Override
public RasterMetadata metadata() {
return meta;
}

@Override
public BufferedImage readWindow(TileWindow window) throws IOException {
int w = window.width();
int h = window.height();
int bands = Math.min(dataset.getRasterCount(), 3);
int[] pixels = new int[w * h];

for (int b = 1; b <= bands; b++) {
int[] bandData = new int[w * h];
dataset.GetRasterBand(b).ReadRaster(
window.x(), window.y(), w, h, bandData
);
int shift = (3 - b) * 8;
for (int i = 0; i < pixels.length; i++) {
pixels[i] |= (bandData[i] & 0xFF) << shift;
pixels[i] |= 0xFF000000;
}
}

BufferedImage img = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
System.arraycopy(pixels, 0,
((DataBufferInt) img.getRaster().getDataBuffer()).getData(), 0,
pixels.length);
return img;
}

@Override
public void close() throws IOException {
if (dataset != null) {
dataset.delete();
}
}
}
}

META-INF/services registration

Create META-INF/services/io.lucenia.ingest.content.enrich.raster.RasterSource:

com.example.nitf.NitfRasterSource

Pipeline configuration

Once installed, use your custom raster source with the image tiling processor. Lucenia automatically selects your provider based on MIME type and priority:

PUT /_ingest/pipeline/tile_nitf_imagery
{
"description": "Tile NITF imagery with geospatial metadata",
"processors": [
{
"content_extract": {
"field": "attachment",
"input_mode": "reference"
}
},
{
"image_tiling": {
"field": "attachment",
"tile_size": 512,
"target_field": "tiles",
"include_metadata": true
}
},
{
"embed": {
"field": "tiles.*.description",
"target_field": "tiles.*.embedding",
"provider": "bedrock",
"model_id": "amazon.titan-embed-image-v1",
"dimensions": 1024
}
}
]
}

Index NITF imagery by reference (no upload required for S3 or HTTPS sources):

PUT /satellite_imagery/_doc/recon_001
{
"attachment": {
"uri": "s3://my-bucket/imagery/20250601_recon.ntf",
"content_type": "image/nitf"
},
"mission": "aerial_survey",
"captured": "2025-06-01T14:30:00Z"
}

Example: Custom embedding provider

Organizations running fine-tuned models on private GPU clusters or specialized hardware can implement the EmbeddingProvider interface to connect the embed processor and query embedding processor to any model endpoint.

EmbeddingProvider interface

public interface EmbeddingProvider {

/** Provider name used in pipeline configuration. */
String name();

/** Embed a batch of inputs (text, image, or multimodal). */
List<float[]> embedInputs(
List<EmbeddingInput> inputs,
String modelId,
int dimensions,
Map<String, String> providerConfig
) throws EmbeddingException;

/** Maximum inputs per batch call. */
int maxBatchSize();

/** Validate configuration at pipeline creation time. */
void validate(
String modelId,
int dimensions,
String contentType,
Map<String, String> providerConfig
);
}

Implementation: vLLM-hosted embedding model

This example connects to a privately deployed vLLM server running a fine-tuned embedding model:

package com.example.vllm;

import io.lucenia.ingest.content.enrich.provider.*;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.*;
import com.fasterxml.jackson.databind.ObjectMapper;

public class VllmEmbeddingProvider implements EmbeddingProvider {

private static final ObjectMapper MAPPER = new ObjectMapper();
private final HttpClient client = HttpClient.newHttpClient();

@Override
public String name() {
return "vllm";
}

@Override
public List<float[]> embedInputs(
List<EmbeddingInput> inputs,
String modelId,
int dimensions,
Map<String, String> providerConfig) throws EmbeddingException {

String endpoint = providerConfig.get("endpoint");
String apiKey = providerConfig.get("api_key");

// Build request body (OpenAI-compatible API)
List<String> texts = new ArrayList<>();
for (EmbeddingInput input : inputs) {
texts.add(input.getText());
}

try {
Map<String, Object> body = Map.of(
"model", modelId,
"input", texts,
"encoding_format", "float"
);

HttpRequest request = HttpRequest.newBuilder()
.uri(URI.create(endpoint + "/v1/embeddings"))
.header("Content-Type", "application/json")
.header("Authorization", "Bearer " + apiKey)
.POST(HttpRequest.BodyPublishers.ofString(MAPPER.writeValueAsString(body)))
.build();

HttpResponse<String> response = client.send(
request, HttpResponse.BodyHandlers.ofString()
);

if (response.statusCode() != 200) {
throw new EmbeddingException(
"vLLM returned " + response.statusCode() + ": " + response.body()
);
}

// Parse response
Map<String, Object> result = MAPPER.readValue(
response.body(), Map.class
);
List<Map<String, Object>> data = (List<Map<String, Object>>) result.get("data");

List<float[]> embeddings = new ArrayList<>();
for (Map<String, Object> item : data) {
List<Number> values = (List<Number>) item.get("embedding");
float[] vec = new float[dimensions];
for (int i = 0; i < Math.min(values.size(), dimensions); i++) {
vec[i] = values.get(i).floatValue();
}
embeddings.add(vec);
}
return embeddings;

} catch (EmbeddingException e) {
throw e;
} catch (Exception e) {
throw new EmbeddingException("Failed to call vLLM endpoint", e);
}
}

@Override
public int maxBatchSize() {
return 64;
}

@Override
public void validate(String modelId, int dimensions,
String contentType, Map<String, String> providerConfig) {
if (providerConfig.get("endpoint") == null) {
throw new IllegalArgumentException(
"vLLM provider requires 'endpoint' in provider_config"
);
}
}
}

META-INF/services registration

Create META-INF/services/io.lucenia.ingest.content.enrich.provider.EmbeddingProvider:

com.example.vllm.VllmEmbeddingProvider

Secure API key configuration

Store credentials in the Lucenia keystore instead of pipeline configuration:

bin/lucenia-keystore add vllm.api_key
# Enter your API key when prompted

Pipeline configuration

PUT /_ingest/pipeline/embed_with_vllm
{
"description": "Embed documents using a private vLLM model",
"processors": [
{
"content_extract": {
"field": "attachment",
"target_field": "content"
}
},
{
"chunk": {
"field": "content",
"algorithm": "recursive",
"max_tokens": 512,
"overlap": 50,
"target_field": "chunks"
}
},
{
"embed": {
"field": "chunks.*.text",
"target_field": "chunks.*.embedding",
"provider": "vllm",
"model_id": "intfloat/e5-mistral-7b-instruct",
"dimensions": 4096,
"provider_config": {
"endpoint": "https://gpu-cluster.internal:8000",
"api_key_setting": "vllm.api_key"
}
}
}
]
}

The same provider automatically works at search time with the query embedding processor, ensuring consistent embeddings for both indexing and retrieval.


Example: Custom inference provider

Implement the InferenceProvider interface to connect the OCR processor and multimodal rerank processor to privately deployed LLMs.

InferenceProvider interface

public interface InferenceProvider {

/** Provider name used in pipeline configuration. */
String name();

/** Run inference on a batch of inputs. */
InferenceResult infer(
List<InferenceInput> inputs,
String modelId,
String taskType,
Map<String, String> providerConfig
) throws InferenceException;

/** Validate configuration at pipeline creation time. */
void validate(
String modelId,
String taskType,
Map<String, String> providerConfig
);
}

Supported task types

Task typeInputOutputUsed by
ocrImageText + bounding boxes + confidenceocr processor
rerankText (query + passage)Relevance score [0.0, 1.0]multimodal_rerank
captionImageDescriptive textCustom pipelines
summarizeTextSummary textCustom pipelines
classifyText or imageLabels + scoresCustom pipelines

Implementation: Self-hosted reranking model

This example connects to a privately deployed cross-encoder reranking model for use with the multimodal rerank search processor:

package com.example.rerank;

import io.lucenia.ingest.content.enrich.inference.*;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.*;
import com.fasterxml.jackson.databind.ObjectMapper;

public class CrossEncoderInferenceProvider implements InferenceProvider {

private static final ObjectMapper MAPPER = new ObjectMapper();
private final HttpClient client = HttpClient.newHttpClient();

@Override
public String name() {
return "cross-encoder";
}

@Override
public InferenceResult infer(
List<InferenceInput> inputs,
String modelId,
String taskType,
Map<String, String> providerConfig) throws InferenceException {

String endpoint = providerConfig.get("endpoint");
String apiKey = providerConfig.get("api_key");

try {
if ("rerank".equals(taskType)) {
return handleRerank(inputs, modelId, endpoint, apiKey);
} else if ("ocr".equals(taskType)) {
return handleOcr(inputs, modelId, endpoint, apiKey);
}
throw new InferenceException("Unsupported task type: " + taskType);
} catch (InferenceException e) {
throw e;
} catch (Exception e) {
throw new InferenceException("Inference call failed", e);
}
}

private InferenceResult handleRerank(
List<InferenceInput> inputs, String modelId,
String endpoint, String apiKey) throws Exception {

// Build pairs for cross-encoder scoring
List<Map<String, String>> pairs = new ArrayList<>();
for (InferenceInput input : inputs) {
pairs.add(Map.of(
"query", input.getPrompt(),
"passage", input.getText()
));
}

Map<String, Object> body = Map.of(
"model", modelId,
"pairs", pairs
);

HttpResponse<String> response = post(endpoint + "/v1/rerank", body, apiKey);
Map<String, Object> result = MAPPER.readValue(response.body(), Map.class);
List<Number> scores = (List<Number>) result.get("scores");

List<InferenceOutput> outputs = new ArrayList<>();
for (Number score : scores) {
outputs.add(InferenceOutput.score(
Math.max(0.0, Math.min(1.0, score.doubleValue()))
));
}
return new InferenceResult(outputs);
}

private InferenceResult handleOcr(
List<InferenceInput> inputs, String modelId,
String endpoint, String apiKey) throws Exception {

List<InferenceOutput> outputs = new ArrayList<>();
for (InferenceInput input : inputs) {
// Send image to vision model endpoint
Map<String, Object> body = Map.of(
"model", modelId,
"image", Base64.getEncoder().encodeToString(input.getImageData()),
"image_type", input.getImageMimeType(),
"prompt", "Extract all text from this image."
);

HttpResponse<String> response = post(endpoint + "/v1/ocr", body, apiKey);
Map<String, Object> result = MAPPER.readValue(response.body(), Map.class);
outputs.add(InferenceOutput.text((String) result.get("text")));
}
return new InferenceResult(outputs);
}

private HttpResponse<String> post(String url, Map<String, Object> body, String apiKey)
throws Exception {
HttpRequest request = HttpRequest.newBuilder()
.uri(URI.create(url))
.header("Content-Type", "application/json")
.header("Authorization", "Bearer " + apiKey)
.POST(HttpRequest.BodyPublishers.ofString(MAPPER.writeValueAsString(body)))
.build();
return client.send(request, HttpResponse.BodyHandlers.ofString());
}

@Override
public void validate(String modelId, String taskType,
Map<String, String> providerConfig) {
if (providerConfig.get("endpoint") == null) {
throw new IllegalArgumentException(
"cross-encoder provider requires 'endpoint' in provider_config"
);
}
if (!Set.of("rerank", "ocr").contains(taskType)) {
throw new IllegalArgumentException(
"cross-encoder provider supports 'rerank' and 'ocr' task types"
);
}
}
}

Pipeline configuration: Reranking with a private model

PUT /_search/pipeline/rerank_with_cross_encoder
{
"response_processors": [
{
"multimodal_rerank": {
"provider": "cross-encoder",
"model_id": "cross-encoder/ms-marco-MiniLM-L-12-v2",
"provider_config": {
"endpoint": "https://ml-cluster.internal:9000",
"api_key_setting": "cross_encoder.api_key"
}
}
}
]
}

Example: Custom content extractor

Implement ContentExtractor to add support for proprietary or specialized document formats. The extracted content flows seamlessly into downstream chunking, embedding, and OCR processors.

ContentExtractor interface

public interface ContentExtractor {

/** Whether this extractor handles the given MIME type. */
boolean supports(String mimeType, Map<String, Object> metadata);

/**
* Extract content from the input stream.
* Do NOT close the stream -- the caller manages its lifecycle.
*/
ExtractedContent extract(InputStream stream, ExtractionContext context)
throws ExtractionException;

/** Priority for MIME type conflicts. Higher values win. */
default int priority() {
return 0;
}
}

Implementation: DICOM medical imaging extractor

package com.example.dicom;

import io.lucenia.ingest.content.extract.*;

import java.io.InputStream;
import java.util.Map;
import java.util.Set;

public class DicomExtractor implements ContentExtractor {

private static final Set<String> SUPPORTED = Set.of(
"application/dicom", "application/vnd.dicom"
);

@Override
public boolean supports(String mimeType, Map<String, Object> metadata) {
return SUPPORTED.contains(mimeType);
}

@Override
public ExtractedContent extract(InputStream stream, ExtractionContext context)
throws ExtractionException {
try {
DicomFile dicom = DicomParser.parse(stream);

ExtractedContent content = new ExtractedContent();

// Extract patient and study metadata as text
content.addTextBlock(String.format(
"Study: %s\nModality: %s\nBody Part: %s\nStudy Date: %s",
dicom.getStudyDescription(),
dicom.getModality(),
dicom.getBodyPartExamined(),
dicom.getStudyDate()
));

// Extract pixel data as image blocks for downstream OCR/tiling
for (DicomFrame frame : dicom.getFrames()) {
content.addImageBlock(
frame.getPixelData(),
"image/png",
Map.of(
"frame_number", String.valueOf(frame.getNumber()),
"window_center", String.valueOf(frame.getWindowCenter()),
"window_width", String.valueOf(frame.getWindowWidth())
)
);
}

// Attach DICOM-specific metadata
content.setMetadata(Map.of(
"patient_id", dicom.getPatientId(),
"modality", dicom.getModality(),
"study_uid", dicom.getStudyInstanceUID(),
"series_uid", dicom.getSeriesInstanceUID()
));

return content;
} catch (Exception e) {
throw new ExtractionException("Failed to parse DICOM file", e);
}
}

@Override
public int priority() {
return 20;
}
}

Pipeline configuration

PUT /_ingest/pipeline/ingest_dicom
{
"description": "Extract, chunk, and embed DICOM medical images",
"processors": [
{
"content_extract": {
"field": "attachment",
"target_field": "content",
"input_mode": "reference"
}
},
{
"chunk": {
"field": "content",
"algorithm": "fixed",
"max_tokens": 256,
"target_field": "chunks"
}
},
{
"embed": {
"field": "chunks.*.text",
"target_field": "chunks.*.embedding",
"provider": "bedrock",
"model_id": "amazon.titan-embed-text-v2:0",
"dimensions": 1024
}
}
]
}

Example: Custom reprojection provider

For organizations using custom geodetic datums, local engineering grids, or specialized coordinate systems not covered by the EPSG registry, implement the CRSProviderInterface to add custom coordinate transformations to the spatial reprojection processor.

See the spatial reprojection processor documentation for the full CRSProviderInterface, a complete implementation example, and pipeline configuration.


End-to-end example: Complete AI retrieval pipeline with custom providers

This example combines multiple custom providers into a single pipeline that extracts NITF imagery, tiles it with a custom raster source, embeds the tiles, and prepares them for reranking at search time:

Ingest pipeline

PUT /_ingest/pipeline/full_ai_pipeline
{
"description": "End-to-end AI pipeline with custom NITF + vLLM providers",
"processors": [
{
"content_extract": {
"field": "attachment",
"input_mode": "reference",
"target_field": "content"
}
},
{
"image_tiling": {
"field": "attachment",
"tile_size": 512,
"target_field": "tiles",
"include_metadata": true
}
},
{
"ocr": {
"field": "tiles.*.image",
"target_field": "tiles.*.ocr_text",
"provider": "bedrock",
"model_id": "anthropic.claude-3-haiku-20240307-v1:0"
}
},
{
"chunk": {
"field": "content",
"algorithm": "recursive",
"max_tokens": 512,
"overlap": 50,
"target_field": "chunks"
}
},
{
"embed": {
"field": "chunks.*.text",
"target_field": "chunks.*.embedding",
"provider": "vllm",
"model_id": "intfloat/e5-mistral-7b-instruct",
"dimensions": 4096,
"provider_config": {
"endpoint": "https://gpu-cluster.internal:8000",
"api_key_setting": "vllm.api_key"
}
}
},
{
"rerank_prepare": {
"field": "chunks",
"context_fields": ["content", "tiles.*.ocr_text"]
}
}
]
}

Search pipeline with custom reranking

PUT /_search/pipeline/ai_search_pipeline
{
"request_processors": [
{
"query_embedding": {
"provider": "vllm",
"model_id": "intfloat/e5-mistral-7b-instruct",
"dimensions": 4096,
"provider_config": {
"endpoint": "https://gpu-cluster.internal:8000",
"api_key_setting": "vllm.api_key"
}
}
}
],
"response_processors": [
{
"multimodal_rerank": {
"provider": "cross-encoder",
"model_id": "cross-encoder/ms-marco-MiniLM-L-12-v2",
"provider_config": {
"endpoint": "https://ml-cluster.internal:9000",
"api_key_setting": "cross_encoder.api_key"
}
}
},
{
"retrieval_grounding": {
"include_provenance": true,
"include_spatial_context": true
}
}
]
}

Query

POST /satellite_imagery/_search?search_pipeline=ai_search_pipeline
{
"query": {
"knn": {
"chunks.embedding": {
"vector": [],
"k": 20
}
}
},
"_source": ["content", "tiles.*.ocr_text", "mission", "captured"]
}

The query embedding processor automatically vectorizes the query using the same vLLM model, the results are reranked by the cross-encoder, and retrieval grounding attaches provenance and spatial context for downstream RAG applications.