Vector and semantic search
Lucenia provides several search methods that go beyond traditional keyword matching to capture the meaning and intent behind a query. They can be used on their own or combined, as in hybrid search, to improve result relevance for AI-powered applications.
Vector search (k-NN)
The k-NN plugin enables nearest-neighbor search across vector fields. It returns the documents whose vectors are closest to the query vector, so semantically similar documents match even when they share no keywords with the query.
Search modes:
| Mode | Description | Use case |
|---|---|---|
| Approximate k-NN | Uses HNSW or IVF indexes for fast, approximate results | Large-scale vector search (millions+ vectors) |
| Exact k-NN | Brute-force scoring via Painless scripts | Small datasets or when precision is critical |
| Filtered k-NN | Pre-filter documents before vector search | Combining metadata filters with semantic search |
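To make the exact and filtered modes concrete, here is a minimal Python sketch of brute-force scoring with an optional pre-filter. It is an illustration of the idea only, not Lucenia's implementation: the `exact_knn` helper, the `lang` metadata field, and the sample vectors are all hypothetical, and cosine similarity is assumed as the distance measure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def exact_knn(query, docs, k, predicate=None):
    """Score every candidate against the query and keep the k best.

    If a predicate is given, it is applied first (pre-filtering), so only
    matching documents are scored at all.
    """
    candidates = [d for d in docs if predicate is None or predicate(d)]
    ranked = sorted(candidates, key=lambda d: cosine(query, d["vector"]), reverse=True)
    return [d["id"] for d in ranked[:k]]

# Hypothetical documents: an id, one metadata field, and a 2-d vector.
docs = [
    {"id": "a", "lang": "en", "vector": [1.0, 0.0]},
    {"id": "b", "lang": "en", "vector": [0.8, 0.6]},
    {"id": "c", "lang": "de", "vector": [0.9, 0.1]},
    {"id": "d", "lang": "en", "vector": [0.0, 1.0]},
]
query = [1.0, 0.0]

top = exact_knn(query, docs, k=2)
top_en = exact_knn(query, docs, k=2, predicate=lambda d: d["lang"] == "en")
print(top, top_en)
```

Brute-force scoring touches every candidate, which is why it suits small datasets. Approximate indexes such as HNSW or IVF avoid the full scan at the cost of occasionally missing a true neighbor.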
Performance optimizations:
- Vector quantization reduces memory usage while preserving search quality
- Maximal marginal relevance (MMR) promotes diversity in results, reducing redundancy
- Late interaction search enables token-level matching for fine-grained information retrieval
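As an illustration of the MMR item above, the following Python sketch shows the usual greedy formulation: each step picks the candidate that maximizes `lam * relevance - (1 - lam) * redundancy`. This is the textbook algorithm under stated assumptions, not necessarily how Lucenia implements it. The `mmr` function, the `lam` parameter name, and the sample vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr(query, candidates, k, lam=0.5):
    """Greedy maximal marginal relevance over a {doc_id: vector} map.

    lam=1.0 ranks purely by relevance; lower values penalize candidates
    that resemble documents already selected.
    """
    selected = []
    remaining = set(candidates)
    while remaining and len(selected) < k:
        def mmr_score(doc_id):
            relevance = cosine(query, candidates[doc_id])
            # Similarity to the closest already-selected document (0 if none yet).
            redundancy = max(
                (cosine(candidates[doc_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(remaining), key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

# "a" and "b" are near-duplicates; "c" is less relevant but different.
candidates = {"a": [0.9, 0.1], "b": [0.9, 0.12], "c": [0.7, -0.7]}
query = [1.0, 0.0]

diverse = mmr(query, candidates, k=2, lam=0.5)
relevance_only = mmr(query, candidates, k=2, lam=1.0)
print(diverse, relevance_only)
```

With `lam=0.5` the near-duplicate `b` is skipped in favor of `c`; with `lam=1.0` the ranking falls back to plain relevance.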
Query-time embedding
The query embedding search request processor converts text or image queries into vector embeddings at search time, so clients do not need to compute embeddings themselves. It uses the same embedding providers that are available for indexing (Bedrock, OpenAI, self-hosted HTTP). Configure it with the provider and model used at index time so that query vectors are compatible with your indexed vectors.
```json
PUT _search/pipeline/semantic-search
{
  "request_processors": [
    {
      "query_embedding": {
        "model_id": "amazon.titan-embed-text-v2:0",
        "provider": "bedrock",
        "embedding_field": "chunks.embedding",
        "query_text_field": "query.neural.chunks.embedding.query_text"
      }
    }
  ]
}
```
With this pipeline, users send plain text queries and Lucenia handles the embedding transparently.
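For orientation, a client-side request that relies on this pipeline might be assembled as in the Python sketch below. Everything about the request shape is an assumption: the body layout is inferred from the `query_text_field` path in the pipeline definition, the `search_pipeline` URL parameter and the `k` field follow OpenSearch-style conventions, and `build_text_search` and the index name are hypothetical. Check the Lucenia API reference before relying on any of it.

```python
import json

PIPELINE = "semantic-search"  # pipeline name from the definition above

def build_text_search(index, text, k=10):
    """Return (path, body) for a plain-text query routed through the pipeline.

    Assumed shape: the text sits at query.neural.<embedding field>.query_text,
    mirroring the pipeline's query_text_field setting.
    """
    path = f"/{index}/_search?search_pipeline={PIPELINE}"  # assumed URL parameter
    body = {
        "query": {
            "neural": {
                "chunks.embedding": {
                    "query_text": text,  # plain text; no vector supplied by the client
                    "k": k,              # assumed parameter name
                }
            }
        }
    }
    return path, body

path, body = build_text_search("knowledge-base", "coordinate reference systems")
print("GET", path)
print(json.dumps(body, indent=2))
```

The point of the sketch is that the client sends only text; the processor is expected to read it from the configured path and supply the vector.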
Hybrid search
Combine lexical (BM25) and vector search in a single query to get both exact keyword matches and semantic similarity:
```json
GET /knowledge-base/_search
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "chunks.text": "geospatial coordinate reference systems"
          }
        },
        {
          "knn": {
            "chunks.embedding": {
              "vector": [0.12, -0.34, ...],
              "k": 10
            }
          }
        }
      ]
    }
  }
}
```
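BM25 scores and vector similarity scores live on different scales, so the two result lists have to be brought onto a common scale before they can be merged. The Python sketch below shows one common approach, min-max normalization followed by a weighted sum. It illustrates the general technique only; the source does not say which normalization or combination method Lucenia applies, and the function names, weights, and scores here are made up.

```python
def min_max(scores):
    """Rescale a {doc_id: score} map to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as a full match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def combine(lexical, vector, weights=(0.5, 0.5)):
    """Merge two score maps into one ranking by weighted sum of normalized scores.

    A document missing from one list contributes 0 for that sub-query.
    """
    lex, vec = min_max(lexical), min_max(vector)
    merged = {
        doc: weights[0] * lex.get(doc, 0.0) + weights[1] * vec.get(doc, 0.0)
        for doc in set(lex) | set(vec)
    }
    # Highest combined score first; doc id breaks ties deterministically.
    return sorted(merged, key=lambda doc: (-merged[doc], doc))

# Hypothetical raw scores from the two sub-queries.
bm25 = {"doc1": 12.0, "doc2": 7.0, "doc3": 2.0}
knn = {"doc2": 0.92, "doc3": 0.90, "doc4": 0.60}

ranking = combine(bm25, knn)
print(ranking)
```

Here `doc2` ranks first because it scores well in both lists, ahead of `doc1`, which has the best keyword score but no vector match.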
Neural sparse search
For use cases where dense vectors aren't ideal, Lucenia supports neural sparse retrieval through the neural sparse search tool. Sparse representations map content to a learned vocabulary of terms with weights, combining the interpretability of keyword search with the semantic awareness of neural models.
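The scoring idea behind sparse retrieval can be sketched in a few lines of Python: both query and document are term-to-weight maps, and the score is the dot product over the terms they share. The weights and document names below are invented for illustration, and the sketch does not reflect how Lucenia stores or scores sparse vectors internally.

```python
def sparse_score(query_terms, doc_terms):
    """Dot product over the terms present in both sparse vectors."""
    return sum(w * doc_terms[t] for t, w in query_terms.items() if t in doc_terms)

def explain(query_terms, doc_terms):
    """Per-term contributions, which is what makes sparse scores inspectable."""
    return {t: w * doc_terms[t] for t, w in query_terms.items() if t in doc_terms}

# Hypothetical learned term weights. A real model would also expand text with
# related vocabulary terms that never appear in it literally.
docs = {
    "gis-guide": {"geospatial": 1.8, "coordinate": 1.2, "projection": 0.9, "map": 0.4},
    "db-guide": {"database": 1.5, "index": 1.1, "coordinate": 0.2},
}
query = {"coordinate": 1.0, "projection": 0.7, "crs": 0.5}

scores = {doc_id: sparse_score(query, terms) for doc_id, terms in docs.items()}
best = max(scores, key=scores.get)
contributions = explain(query, docs[best])
print(best, scores, contributions)
```

Because each matched term contributes a visible amount to the score, a result can be explained term by term, much like a keyword match.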
Model management
All search-time models are managed through the ML Commons plugin, which provides:
- Pretrained models ready to deploy
- Custom local model deployment
- Remote model connectors (Bedrock, SageMaker, OpenAI, custom)
- GPU acceleration for local inference
- Model access control for multi-tenant environments