Version: 0.11.1

AI search and retrieval

Lucenia is the search and retrieval engine purpose-built for contextual AI workloads in private clouds. From content extraction to retrieval grounding, Lucenia provides a complete, end-to-end AI retrieval pipeline — all running on your own infrastructure, with enterprise-grade security and cutting-edge geospatial intelligence.

The AI retrieval pipeline

Lucenia's AI retrieval capabilities span the full lifecycle of content — from raw documents to grounded, citation-ready search results:

┌─────────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌────────────┐
│   Extract   │──►│  Chunk  │──►│  Embed  │──►│  Index  │──►│ Search  │──►│ Rerank  │──►│   Ground   │
│             │   │         │   │         │   │         │   │         │   │         │   │            │
│ PDF, DOCX,  │   │ 4 smart │   │ Bedrock │   │ k-NN    │   │ Vector  │   │ LLM     │   │ Citation   │
│ HTML, images│   │ algos   │   │ OpenAI  │   │ vectors │   │ hybrid  │   │ powered │   │ provenance │
│ GeoTIFF     │   │         │   │ custom  │   │         │   │ sparse  │   │         │   │ spatial    │
└─────────────┘   └─────────┘   └─────────┘   └─────────┘   └─────────┘   └─────────┘   └────────────┘
     Ingest pipeline processors                    ▲            Search pipeline processors
                                                   │
                                              ┌─────────┐
                                              │   OCR   │
                                              │ Image   │
                                              │ Tiling  │
                                              └─────────┘
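
To make the stages in the diagram concrete, here is a minimal, self-contained sketch of the chunk, embed, index, search, and ground steps. It is an illustration only and does not use Lucenia's API: the `Chunk` class, the function names, the sample documents, and the bag-of-words "embedding" are all hypothetical stand-ins. A real deployment would rely on the ingest and search pipeline processors and a real embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str   # source document, kept so results can cite where text came from
    start: int    # word offset of the chunk within the document
    text: str


def chunk_words(doc_id: str, text: str, size: int = 8, overlap: int = 2) -> list[Chunk]:
    """Fixed-size word windows with overlap (one simple chunking strategy)."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(Chunk(doc_id, start, " ".join(words[start:start + size])))
        if start + size >= len(words):
            break
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call an embedding model here."""
    return Counter(word.strip(".,").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(index: list[tuple[Chunk, Counter]], query: str, k: int = 2) -> list[dict]:
    """Rank chunks by similarity and return hits that carry their provenance."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [
        {"doc_id": c.doc_id, "offset": c.start, "text": c.text, "score": cosine(q, vec)}
        for c, vec in ranked[:k]
    ]


# Hypothetical sample documents standing in for extracted content.
docs = {
    "geo-guide": "GeoTIFF rasters store georeferenced imagery. Tiling splits large rasters into indexed tiles.",
    "sec-guide": "Field redaction hides sensitive fields. Policies decide which attributes grant access.",
}

# Index step: store each chunk next to its vector.
index = [(c, embed(c.text)) for doc_id, text in docs.items() for c in chunk_words(doc_id, text)]

# Search and ground: every hit keeps the document ID and offset needed for a citation.
hits = search(index, "how are rasters tiled")
for hit in hits:
    print(f'{hit["doc_id"]}@{hit["offset"]}: {hit["text"]}')
```

The design point the sketch tries to show is that provenance (document ID and offset) is attached at chunking time and travels with each chunk, so grounded citations at the end of the pipeline need no extra lookup.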

Capabilities

Capability	Description	Learn more
Content processing	Extract, chunk, embed, and OCR documents in 10+ formats including PDF, DOCX, HTML, and GeoTIFF	Content processing
Vector and semantic search	k-NN vector search, neural sparse search, hybrid scoring, and query-time embedding	Vector and semantic search
Reranking and grounding	LLM-powered multimodal reranking with retrieval grounding for RAG citations	Reranking and grounding
Geospatial intelligence	Geo AI capabilities unmatched by any other search engine — GeoTIFF extraction, spatial reprojection, image tiling with spatial indexing	Geospatial intelligence
Security and privacy	100% private deployment, ABAC with policy-driven field redaction, document-level and field-level security	Security and privacy
Extensibility	SPI extension points for custom extractors, embedding providers, inference providers, and raster sources	Extensibility
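
As an illustration of what hybrid scoring can look like, the sketch below combines lexical and vector scores with min-max normalization and a weighted sum. This is one common approach and is assumed here for illustration; the source does not state which normalization or combination method Lucenia uses. The function names, the `vector_weight` parameter, and the sample scores are hypothetical.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so lexical and vector scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    lexical: dict[str, float],
    vector: dict[str, float],
    vector_weight: float = 0.5,
) -> list[tuple[str, float]]:
    """Weighted sum of normalized scores; a document missing from one list scores 0 there."""
    lex, vec = min_max(lexical), min_max(vector)
    combined = {
        doc: (1 - vector_weight) * lex.get(doc, 0.0) + vector_weight * vec.get(doc, 0.0)
        for doc in lex.keys() | vec.keys()
    }
    # Sort by score descending, then by ID so ties are deterministic.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores from two retrievers over overlapping result sets.
lexical_scores = {"doc-a": 12.0, "doc-b": 7.0, "doc-c": 2.0}   # e.g. BM25-style scores
vector_scores = {"doc-b": 0.92, "doc-c": 0.88, "doc-d": 0.40}  # e.g. cosine similarities

ranking = hybrid_rank(lexical_scores, vector_scores, vector_weight=0.5)
for doc, score in ranking:
    print(f"{doc}: {score:.3f}")
```

Normalizing first matters because raw lexical scores are unbounded while cosine similarities are not. Without it, the weighted sum would be dominated by whichever retriever produces larger numbers.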

Key differentiators

100% Private: Every component runs on your infrastructure. Your data never leaves your control. Connect to private embedding and inference models in your own VPC.
End-to-end pipeline: No external services required for content extraction, chunking, embedding, or search — it's all built in.
Cutting-edge Geo AI: Geospatial intelligence capabilities — including GeoTIFF extraction, spatial reprojection, and spatially-aware image tiling — that no other search engine can match.
ABAC security: Attribute-based access control with policy-driven field redaction ensures sensitive data is protected even in AI retrieval workflows.
Extensible architecture: SPI-based extension points let you plug in custom content extractors, embedding models, inference providers, and raster sources without modifying core code.
Private model integration: Connect to AWS Bedrock, OpenAI, or any self-hosted model via HTTP — all within your private network.
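
The idea behind policy-driven field redaction can be sketched in a few lines. The example below is a conceptual illustration, not Lucenia's policy format or API: `FieldPolicy`, `redact`, the attribute names, and the sample document are all hypothetical. It assumes a simple rule where a field is visible only if the user holds every attribute the policy requires.

```python
from dataclasses import dataclass

REDACTED = "[REDACTED]"


@dataclass
class FieldPolicy:
    field: str                 # document field the policy protects
    required: dict[str, str]   # user attributes needed to see the field


def is_permitted(user_attrs: dict[str, str], policy: FieldPolicy) -> bool:
    """A user passes a policy only if every required attribute matches."""
    return all(user_attrs.get(key) == value for key, value in policy.required.items())


def redact(doc: dict, user_attrs: dict[str, str], policies: list[FieldPolicy]) -> dict:
    """Return a copy of the document with protected fields masked for this user."""
    visible = dict(doc)
    for policy in policies:
        if policy.field in visible and not is_permitted(user_attrs, policy):
            visible[policy.field] = REDACTED
    return visible


# Hypothetical policies and search hit.
policies = [
    FieldPolicy(field="location", required={"clearance": "secret"}),
    FieldPolicy(field="analyst_notes", required={"department": "intel"}),
]
hit = {
    "title": "Survey flight 12",
    "location": "34.05,-118.25",
    "analyst_notes": "Follow-up pass recommended.",
}

# This user has the clearance for "location" but not the department for "analyst_notes".
user = {"clearance": "secret", "department": "ops"}
safe_hit = redact(hit, user, policies)
print(safe_hit)
```

In a retrieval workflow, the redacted copy is what would be handed to any downstream reranking or grounding model, so a field the user cannot see never reaches the prompt.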
