Version: 0.11.0

AI search and retrieval

Lucenia is the search and retrieval engine purpose-built for contextual AI workloads in private clouds. From content extraction to retrieval grounding, Lucenia provides an end-to-end AI retrieval pipeline that runs entirely on your own infrastructure, with enterprise-grade security and cutting-edge geospatial intelligence.

The AI retrieval pipeline

Lucenia's AI retrieval capabilities span the full lifecycle of content — from raw documents to grounded, citation-ready search results:

┌─────────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌────────────┐
│   Extract   │──►│  Chunk  │──►│  Embed  │──►│  Index  │──►│ Search  │──►│ Rerank  │──►│   Ground   │
│             │   │         │   │         │   │         │   │         │   │         │   │            │
│ PDF, DOCX,  │   │ 4 smart │   │ Bedrock │   │ k-NN    │   │ Vector  │   │ LLM     │   │ Citation   │
│ HTML, images│   │ algos   │   │ OpenAI  │   │ vectors │   │ hybrid  │   │ powered │   │ provenance │
│ GeoTIFF     │   │         │   │ custom  │   │         │   │ sparse  │   │         │   │ spatial    │
└─────────────┘   └─────────┘   └─────────┘   └─────────┘   └─────────┘   └─────────┘   └────────────┘
Ingest pipeline processors ▲ Search pipeline processors

┌─────────┐
│   OCR   │
│  Image  │
│ Tiling  │
└─────────┘
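
The flow in the diagram can be sketched in a few lines of plain Python. This is a toy, in-memory illustration of the chunk, embed, index, search, and ground stages, not Lucenia's API: the names (`chunk`, `embed`, `build_index`, `search`), the fixed-size word chunker, and the term-count "embedding" are all assumptions made for the example. Extraction and reranking are left out.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class IndexedChunk:
    doc_id: str      # source document, kept so results can cite it
    start_word: int  # word offset of the chunk within the document
    text: str
    vector: Counter  # toy "embedding": term counts (assumption, not a real model)


def chunk(text: str, size: int = 8) -> list[tuple[int, str]]:
    """Split text into fixed-size word windows (stand-in for a real chunker)."""
    words = text.split()
    return [(i, " ".join(words[i:i + size])) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Map text to a sparse term-count vector (stand-in for an embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(docs: dict[str, str]) -> list[IndexedChunk]:
    """Chunk and embed every document, keeping provenance with each chunk."""
    return [
        IndexedChunk(doc_id, start, text, embed(text))
        for doc_id, body in docs.items()
        for start, text in chunk(body)
    ]


def search(index: list[IndexedChunk], query: str, k: int = 3) -> list[dict]:
    """Return the k most similar chunks, each grounded with a citation."""
    q = embed(query)
    ranked = sorted(index, key=lambda c: cosine(q, c.vector), reverse=True)
    return [
        {
            "text": c.text,
            "score": cosine(q, c.vector),
            "citation": {"doc_id": c.doc_id, "start_word": c.start_word},
        }
        for c in ranked[:k]
    ]


docs = {
    "geo.txt": "GeoTIFF rasters store georeferenced imagery for mapping",
    "sec.txt": "access control policies redact sensitive fields",
}
index = build_index(docs)
hits = search(index, "georeferenced imagery", k=1)
print(hits[0]["citation"])  # {'doc_id': 'geo.txt', 'start_word': 0}
```

The point of the sketch is that provenance (`doc_id`, `start_word`) travels with every chunk from indexing through to the result, which is what makes a hit citable at the grounding stage.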

Capabilities

| Capability | Description | Learn more |
|---|---|---|
| Content processing | Extract, chunk, embed, and OCR documents in 10+ formats including PDF, DOCX, HTML, and GeoTIFF | Content processing |
| Vector and semantic search | k-NN vector search, neural sparse search, hybrid scoring, and query-time embedding | Vector and semantic search |
| Reranking and grounding | LLM-powered multimodal reranking with retrieval grounding for RAG citations | Reranking and grounding |
| Geospatial intelligence | Geo AI capabilities unmatched by any other search engine: GeoTIFF extraction, spatial reprojection, image tiling with spatial indexing | Geospatial intelligence |
| Security and privacy | 100% private deployment, ABAC with policy-driven field redaction, document-level and field-level security | Security and privacy |
| Extensibility | SPI extension points for custom extractors, embedding providers, inference providers, and raster sources | Extensibility |
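
Hybrid scoring, mentioned under vector and semantic search, generally means combining a lexical relevance score with a vector-similarity score for the same document. One common approach is min-max normalization followed by a weighted sum, sketched below. This page does not say which normalization or combination Lucenia uses, so the technique, the function names, and the 0.5 weight are assumptions for illustration only.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so different scoring scales become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as a full match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid(
    lexical: dict[str, float],
    vector: dict[str, float],
    vector_weight: float = 0.5,  # hypothetical default, tune per workload
) -> list[tuple[str, float]]:
    """Combine normalized lexical and vector scores with a weighted sum."""
    lex, vec = min_max(lexical), min_max(vector)
    combined = {
        doc: (1 - vector_weight) * lex.get(doc, 0.0) + vector_weight * vec.get(doc, 0.0)
        for doc in lex.keys() | vec.keys()
    }
    # Highest score first; ties are broken by document id for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


lexical = {"doc1": 12.0, "doc2": 4.0, "doc3": 8.0}   # e.g. keyword-match scores
vector = {"doc2": 0.91, "doc3": 0.89, "doc4": 0.40}  # e.g. cosine similarities
ranked = hybrid(lexical, vector)
print([doc for doc, _ in ranked])  # ['doc3', 'doc1', 'doc2', 'doc4']
```

Here `doc3` ranks first because it scores reasonably well on both signals, while `doc1` and `doc2` each lead on only one. A document missing from one result set contributes 0.0 for that signal in this sketch.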

Key differentiators

  • 100% Private: Every component runs on your infrastructure. Your data never leaves your control. Connect to private embedding and inference models in your own VPC.
  • End-to-end pipeline: No external services required for content extraction, chunking, embedding, or search — it's all built in.
  • Cutting-edge Geo AI: Geospatial intelligence capabilities, including GeoTIFF extraction, spatial reprojection, and spatially aware image tiling, that no other search engine can match.
  • ABAC security: Attribute-based access control with policy-driven field redaction ensures sensitive data is protected even in AI retrieval workflows.
  • Extensible architecture: SPI-based extension points let you plug in custom content extractors, embedding models, inference providers, and raster sources without modifying core code.
  • Private model integration: Connect to Amazon Bedrock, OpenAI, or any self-hosted model over HTTP, all within your private network.
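
To make the ABAC bullet concrete, here is a minimal sketch of policy-driven field redaction. Each policy names a field and the user attributes required to see it, and any field whose requirements are not met is masked before a hit is returned. The policy shape, the attribute names, and the `redact` function are hypothetical and do not reflect Lucenia's actual policy format.

```python
from dataclasses import dataclass

REDACTED = "[REDACTED]"


@dataclass
class FieldPolicy:
    field_name: str  # document field the policy protects
    required: dict   # attribute name -> set of values allowed to see the field


def is_allowed(policy: FieldPolicy, user_attrs: dict) -> bool:
    """A user may see the field only if every required attribute matches."""
    return all(
        user_attrs.get(attr) in allowed
        for attr, allowed in policy.required.items()
    )


def redact(doc: dict, policies: list[FieldPolicy], user_attrs: dict) -> dict:
    """Return a copy of the document with disallowed fields masked."""
    out = dict(doc)
    for policy in policies:
        if policy.field_name in out and not is_allowed(policy, user_attrs):
            out[policy.field_name] = REDACTED
    return out


policies = [
    FieldPolicy("ssn", {"clearance": {"secret"}}),
    FieldPolicy("location", {"department": {"geo", "ops"}}),
]
doc = {"title": "Site survey", "ssn": "123-45-6789", "location": "38.9,-77.0"}
analyst = {"clearance": "public", "department": "geo"}

print(redact(doc, policies, analyst))
# {'title': 'Site survey', 'ssn': '[REDACTED]', 'location': '38.9,-77.0'}
```

Applying redaction to each hit before it reaches a reranker or an LLM prompt is what keeps protected fields out of the retrieval workflow in this sketch. Where a real deployment enforces it is a design decision not covered here.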