Skip to main content
Version: 0.3.1

Oversample processor

The oversample request processor multiplies the size parameter of the search request by a specified sample_factor (>= 1.0), saving the original value in the original_size pipeline variable. The oversample processor is designed to work with the truncate_hits response processor but may be used on its own.

Request fields

The following table lists all request fields.

FieldData typeDescription
sample_factorFloatThe multiplicative factor (>= 1.0) that will be applied to the size parameter before processing the search request. Required.
context_prefixStringMay be used to scope the original_size variable in order to avoid collisions. Optional.
tagStringThe processor's identifier. Optional.
descriptionStringA description of the processor. Optional.
ignore_failureBooleanIf true, Lucenia ignores any failure of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is false.

Example

The following example demonstrates using a search pipeline with an oversample processor.

Setup

Create an index named my_index containing many documents:

POST /_bulk
{ "create":{"_index":"my_index","_id":1}}
{ "doc": { "title" : "document 1" }}
{ "create":{"_index":"my_index","_id":2}}
{ "doc": { "title" : "document 2" }}
{ "create":{"_index":"my_index","_id":3}}
{ "doc": { "title" : "document 3" }}
{ "create":{"_index":"my_index","_id":4}}
{ "doc": { "title" : "document 4" }}
{ "create":{"_index":"my_index","_id":5}}
{ "doc": { "title" : "document 5" }}
{ "create":{"_index":"my_index","_id":6}}
{ "doc": { "title" : "document 6" }}
{ "create":{"_index":"my_index","_id":7}}
{ "doc": { "title" : "document 7" }}
{ "create":{"_index":"my_index","_id":8}}
{ "doc": { "title" : "document 8" }}
{ "create":{"_index":"my_index","_id":9}}
{ "doc": { "title" : "document 9" }}
{ "create":{"_index":"my_index","_id":10}}
{ "doc": { "title" : "document 10" }}

Creating a search pipeline

The following request creates a search pipeline named my_pipeline with an oversample request processor that requests 50% more hits than specified in size:

PUT /_search/pipeline/my_pipeline 
{
"request_processors": [
{
"oversample" : {
"tag" : "oversample_1",
"description" : "This processor will multiply `size` by 1.5.",
"sample_factor" : 1.5
}
}
]
}

Using a search pipeline

Search for documents in my_index without a search pipeline:

POST /my_index/_search
{
"size": 5
}

The response contains five hits:

Response

{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 1"
}
}
},
{
"_index" : "my_index",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 2"
}
}
},
{
"_index" : "my_index",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 3"
}
}
},
{
"_index" : "my_index",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 4"
}
}
},
{
"_index" : "my_index",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 5"
}
}
}
]
}
}

To search with a pipeline, specify the pipeline name in the search_pipeline query parameter:

POST /my_index/_search?search_pipeline=my_pipeline
{
"size": 5
}

The response contains 8 documents (5 * 1.5 = 7.5, rounded up to 8):

Response

{
"took" : 13,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 1"
}
}
},
{
"_index" : "my_index",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 2"
}
}
},
{
"_index" : "my_index",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 3"
}
}
},
{
"_index" : "my_index",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 4"
}
}
},
{
"_index" : "my_index",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 5"
}
}
},
{
"_index" : "my_index",
"_id" : "6",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 6"
}
}
},
{
"_index" : "my_index",
"_id" : "7",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 7"
}
}
},
{
"_index" : "my_index",
"_id" : "8",
"_score" : 1.0,
"_source" : {
"doc" : {
"title" : "document 8"
}
}
}
]
}
}