This section describes the Java API that elasticsearch provides. All elasticsearch operations are executed using a Client object. All operations are completely asynchronous in nature (they either accept a listener or return a future).
Additionally, operations on a client may be accumulated and executed in Bulk.
Note, all the APIs are exposed through the Java API (actually, the Java API is used internally to execute them).
Elasticsearch is hosted on Maven Central.
For example, you can define the latest version in your pom.xml file:
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${es.version}</version>
</dependency>
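The es.version property referenced above is not defined by the dependency snippet itself. A minimal sketch of pinning it in the same pom.xml (the version shown is only an example; use the one matching your cluster):
<properties>
    <!-- example only: pick the version matching your cluster -->
    <es.version>1.4.4</es.version>
</properties>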
You can use the Java client in multiple ways. Obtaining an elasticsearch Client is simple: the most common way is by instantiating an embedded Node that joins the cluster and asking it for a client. Another manner is by creating a TransportClient that connects to a cluster remotely.
Important:
Please note that you are encouraged to use the same version on client and cluster sides. You may hit some incompatibility issues when mixing major versions.
Instantiating a node based client is the simplest way to get a Client that can execute operations against elasticsearch.
import static org.elasticsearch.node.NodeBuilder.*;
// on startup
Node node = nodeBuilder().node();
Client client = node.client();
// on shutdown
node.close();
When you start a Node, it joins an elasticsearch cluster. You can have different clusters by simply setting the cluster.name setting, or explicitly using the clusterName method on the builder.
You can define cluster.name in the /src/main/resources/elasticsearch.yml file in your project. As long as elasticsearch.yml is present in the classpath, it will be used when you start your node.
cluster.name: yourclustername
Or in Java:
Node node = nodeBuilder().clusterName("yourclustername").node();
Client client = node.client();
The benefit of using the Client is the fact that operations are automatically routed to the node(s) the operations need to be executed on, without performing a “double hop”. For example, the index operation will automatically be executed on the shard where the document will end up being stored.
When you start a Node, the most important decision is whether it should hold data or not. In other words, should indices and shards be allocated to it? Many times we would like to have the clients just be clients, without shards being allocated to them. This is simple to configure by setting either the node.data setting to false or node.client to true (or the respective helper methods on NodeBuilder):
import static org.elasticsearch.node.NodeBuilder.*;
// on startup
Node node = nodeBuilder().client(true).node();
Client client = node.client();
// on shutdown
node.close();
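Alternatively, assuming NodeBuilder's data(boolean) helper (the counterpart of client(true) for the node.data setting), a data-less node can be requested like this sketch:
import static org.elasticsearch.node.NodeBuilder.*;
// sketch: start a node that will not hold data (node.data=false)
Node node = nodeBuilder().data(false).node();
Client client = node.client();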
Another common usage is to start the Node and use the Client in unit/integration tests. In such a case, we would like to start a “local” Node (with a “local” discovery and transport). Again, this is just a matter of a simple setting when starting the Node. Note, “local” here means local on the JVM (well, actually class loader) level, meaning that two local servers started within the same JVM will discover each other and form a cluster.
import static org.elasticsearch.node.NodeBuilder.*;
// on startup
Node node = nodeBuilder().local(true).node();
Client client = node.client();
// on shutdown
node.close();
The TransportClient connects remotely to an elasticsearch cluster using the transport module. It does not join the cluster, but simply gets one or more initial transport addresses and communicates with them in round robin fashion on each action (though most actions will probably be “two hop” operations).
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
// on startup
Client client = new TransportClient()
    .addTransportAddress(new InetSocketTransportAddress("host1", 9300))
    .addTransportAddress(new InetSocketTransportAddress("host2", 9300));
// on shutdown
client.close();
Note that you have to set the cluster name if you use one different from “elasticsearch”:
Settings settings = ImmutableSettings.settingsBuilder()
.put("cluster.name", "myClusterName").build();
Client client = new TransportClient(settings);
//Add transport addresses and do something with the client...
Or using the elasticsearch.yml file, as shown above for the node client.
The client allows you to sniff the rest of the cluster and add those nodes into its list of machines to use. In this case, note that the IP addresses used will be the ones the other nodes were started with (the “publish” address). In order to enable it, set client.transport.sniff to true:
Settings settings = ImmutableSettings.settingsBuilder()
.put("client.transport.sniff", true).build();
TransportClient client = new TransportClient(settings);
Other transport client level settings include:
Parameter | Description
---|---
client.transport.ignore_cluster_name | Set to true to ignore cluster name validation of connected nodes. (since 0.19.4)
client.transport.ping_timeout | The time to wait for a ping response from a node. Defaults to 5s.
client.transport.nodes_sampler_interval | How often to sample / ping the nodes listed and connected. Defaults to 5s.
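These settings are passed to the client the same way as cluster.name above. A short sketch combining them with the ImmutableSettings builder already shown (the 10s values are arbitrary examples):
Settings settings = ImmutableSettings.settingsBuilder()
    .put("cluster.name", "myClusterName")
    .put("client.transport.sniff", true)                   // discover the rest of the cluster
    .put("client.transport.ping_timeout", "10s")           // wait longer for ping responses
    .put("client.transport.nodes_sampler_interval", "10s") // sample connected nodes less often
    .build();
TransportClient client = new TransportClient(settings);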
The index API allows one to index a typed JSON document into a specific index and make it searchable.
There are several different ways of generating a JSON document:
- manually (aka do it yourself), using native byte[] or as a String
- using a Map that will be automatically converted to its JSON equivalent
- using a third party library to serialize your beans, such as Jackson
- using built-in helpers (XContentBuilder)
Internally, each type is converted to byte[] (so a String is converted to a byte[]). Therefore, if the object is in this form already, then use it. The jsonBuilder is a highly optimized JSON generator that directly constructs a byte[].
Nothing really difficult here but note that you will have to encode dates according to the Date Format.
String json = "{" +
"\"user\":\"kimchy\"," +
"\"postDate\":\"2013-01-30\"," +
"\"message\":\"trying out Elasticsearch\"" +
"}";
A Map is a key/value pair collection. It represents a JSON structure very well:
Map<String, Object> json = new HashMap<String, Object>();
json.put("user","kimchy");
json.put("postDate",new Date());
json.put("message","trying out Elasticsearch");
Elasticsearch already uses Jackson but shades it under the org.elasticsearch.common.jackson package, so you can add your own Jackson version to your pom.xml file or classpath. See the Jackson Download Page.
For example:
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.1.3</version>
</dependency>
Then, you can start serializing your beans to JSON:
import com.fasterxml.jackson.databind.*;
// instance a json mapper
ObjectMapper mapper = new ObjectMapper(); // create once, reuse
// generate json
String json = mapper.writeValueAsString(yourbeaninstance);
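The resulting JSON can then be used directly as a document source. A sketch reusing the mapper and bean from above (writeValueAsBytes avoids the intermediate String):
IndexResponse response = client.prepareIndex("twitter", "tweet")
    .setSource(mapper.writeValueAsBytes(yourbeaninstance)) // the byte[] is used as-is
    .execute()
    .actionGet();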
Elasticsearch provides built-in helpers to generate JSON content.
import static org.elasticsearch.common.xcontent.XContentFactory.*;
XContentBuilder builder = jsonBuilder()
.startObject()
.field("user", "kimchy")
.field("postDate", new Date())
.field("message", "trying out Elasticsearch")
.endObject();
Note that you can also add arrays with the startArray(String) and endArray() methods. By the way, the field method accepts many object types. You can directly pass numbers, dates and even other XContentBuilder objects.
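For instance, a short sketch building a hypothetical tags array field:
XContentBuilder builder = jsonBuilder()
    .startObject()
        .field("user", "kimchy")
        .startArray("tags")    // open the array field
            .value("search")
            .value("java")
        .endArray()            // close it before closing the enclosing object
    .endObject();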
If you need to see the generated JSON content, you can use the string() method.
String json = builder.string();
The following example indexes a JSON document into an index called twitter, under a type called tweet, with id valued 1:
import static org.elasticsearch.common.xcontent.XContentFactory.*;
IndexResponse response = client.prepareIndex("twitter", "tweet", "1")
.setSource(jsonBuilder()
.startObject()
.field("user", "kimchy")
.field("postDate", new Date())
.field("message", "trying out Elasticsearch")
.endObject()
)
.execute()
.actionGet();
Note that you can also index your documents as a JSON String and that you don’t have to give an ID:
String json = "{" +
"\"user\":\"kimchy\"," +
"\"postDate\":\"2013-01-30\"," +
"\"message\":\"trying out Elasticsearch\"" +
"}";
IndexResponse response = client.prepareIndex("twitter", "tweet")
.setSource(json)
.execute()
.actionGet();
The IndexResponse object will give you a report:
// Index name
String _index = response.getIndex();
// Type name
String _type = response.getType();
// Document ID (generated or not)
String _id = response.getId();
// Version (if it's the first time you index this document, you will get: 1)
long _version = response.getVersion();
// isCreated() is true if the document is a new one, false if it has been updated
boolean created = response.isCreated();
For more information on the index operation, check out the REST index docs.
The index API allows one to set the threading model the operation will be performed with, when the actual execution of the API happens on the same node (i.e. the API is executed on a shard that is allocated on the same server).
The options are to execute the operation on a different thread, or to execute it on the calling thread (note that the API is still asynchronous). By default, operationThreaded is set to true, which means the operation is executed on a different thread.
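For example, assuming the index request builder exposes the same setOperationThreaded(boolean) setter shown below for the get API, a sketch:
IndexResponse response = client.prepareIndex("twitter", "tweet", "1")
    .setSource(json)
    .setOperationThreaded(false) // run on the calling thread when the shard is local
    .execute()
    .actionGet();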
The get API allows one to get a typed JSON document from the index based on its id. The following example gets a JSON document from an index called twitter, under a type called tweet, with id valued 1:
GetResponse response = client.prepareGet("twitter", "tweet", "1")
.execute()
.actionGet();
For more information on the get operation, check out the REST get docs.
The get API allows one to set the threading model the operation will be performed with, when the actual execution of the API happens on the same node (the API is executed on a shard that is allocated on the same server).
The options are to execute the operation on a different thread, or to execute it on the calling thread (note that the API is still asynchronous). By default, operationThreaded is set to true, which means the operation is executed on a different thread. Here is an example that sets it to false:
GetResponse response = client.prepareGet("twitter", "tweet", "1")
.setOperationThreaded(false)
.execute()
.actionGet();
The delete API allows one to delete a typed JSON document from a specific index based on its id. The following example deletes the JSON document from an index called twitter, under a type called tweet, with id valued 1:
DeleteResponse response = client.prepareDelete("twitter", "tweet", "1")
.execute()
.actionGet();
For more information on the delete operation, check out the delete API docs.
The delete API allows one to set the threading model the operation will be performed with, when the actual execution of the API happens on the same node (the API is executed on a shard that is allocated on the same server).
The options are to execute the operation on a different thread, or to execute it on the calling thread (note that the API is still asynchronous). By default, operationThreaded is set to true, which means the operation is executed on a different thread. Here is an example that sets it to false:
DeleteResponse response = client.prepareDelete("twitter", "tweet", "1")
.setOperationThreaded(false)
.execute()
.actionGet();
The bulk API allows one to index and delete several documents in a single request. Here is a sample usage:
import static org.elasticsearch.common.xcontent.XContentFactory.*;
BulkRequestBuilder bulkRequest = client.prepareBulk();
// either use client#prepare, or use Requests# to directly build index/delete requests
bulkRequest.add(client.prepareIndex("twitter", "tweet", "1")
.setSource(jsonBuilder()
.startObject()
.field("user", "kimchy")
.field("postDate", new Date())
.field("message", "trying out Elasticsearch")
.endObject()
)
);
bulkRequest.add(client.prepareIndex("twitter", "tweet", "2")
.setSource(jsonBuilder()
.startObject()
.field("user", "kimchy")
.field("postDate", new Date())
.field("message", "another post")
.endObject()
)
);
BulkResponse bulkResponse = bulkRequest.execute().actionGet();
if (bulkResponse.hasFailures()) {
// process failures by iterating through each bulk response item
}
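For example, a sketch of inspecting each item of the response (BulkResponse is iterable over its BulkItemResponse items):
for (BulkItemResponse item : bulkResponse) {
    if (item.isFailed()) {
        // log the id and failure message of the request that failed
        System.err.println(item.getId() + " failed: " + item.getFailureMessage());
    }
}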
Using Bulk Processor
The BulkProcessor class offers a simple interface to flush bulk operations automatically based on the number or size of requests, or after a given period.
To use it, first create a BulkProcessor instance:
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.common.unit.ByteSizeUnit;
import org.elasticsearch.common.unit.ByteSizeValue;
import org.elasticsearch.common.unit.TimeValue;
BulkProcessor bulkProcessor = BulkProcessor.builder(
client,
new BulkProcessor.Listener() {
@Override
public void beforeBulk(long executionId,
BulkRequest request) { ... }
@Override
public void afterBulk(long executionId,
BulkRequest request,
BulkResponse response) { ... }
@Override
public void afterBulk(long executionId,
BulkRequest request,
Throwable failure) { ... }
})
.setBulkActions(10000)
.setBulkSize(new ByteSizeValue(1, ByteSizeUnit.GB))
.setFlushInterval(TimeValue.timeValueSeconds(5))
.setConcurrentRequests(1)
.build();
In this example:
- client is your elasticsearch client.
- beforeBulk is called just before the bulk is executed; you can for example see the numberOfActions with request.numberOfActions().
- the first afterBulk is called after bulk execution; you can for example check if there were some failing requests with response.hasFailures().
- the second afterBulk is called when the bulk failed and raised a Throwable.
- setBulkActions(10000): we want to execute the bulk every 10,000 requests.
- setBulkSize(...): we want to flush the bulk every 1gb.
- setFlushInterval(...): we want to flush the bulk every 5 seconds whatever the number of requests.
- setConcurrentRequests(1): sets the number of concurrent requests. A value of 0 means that only a single request will be allowed to execute; a value of 1 means 1 concurrent request is allowed to execute while accumulating new bulk requests.
Then you can simply add your requests to the BulkProcessor:
bulkProcessor.add(new IndexRequest("twitter", "tweet", "1").source(/* your doc here */));
bulkProcessor.add(new DeleteRequest("twitter", "tweet", "2"));
By default, BulkProcessor sets bulkActions to 1000, sets bulkSize to 5mb, does not set a flushInterval, and sets concurrentRequests to 1.
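When you have finished feeding documents, close the processor so any buffered requests are flushed. A minimal sketch (newer releases also offer an awaitClose(timeout, unit) variant that waits for in-flight concurrent bulks; check your version):
// flush any buffered requests and release resources
bulkProcessor.close();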
The search API allows one to execute a search query and get back search hits that match the query. It can be executed across one or more indices and across one or more types. The query can either be provided using the query Java API or the filter Java API. The body of the search request is built using the SearchSourceBuilder. Here is an example:
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.index.query.FilterBuilders;
import org.elasticsearch.index.query.QueryBuilders;
SearchResponse response = client.prepareSearch("index1", "index2")
.setTypes("type1", "type2")
.setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
.setQuery(QueryBuilders.termQuery("multi", "test")) // Query
.setPostFilter(FilterBuilders.rangeFilter("age").from(12).to(18)) // Filter
.setFrom(0).setSize(60).setExplain(true)
.execute()
.actionGet();
Note that all parameters are optional. Here is the smallest search call you can write:
// MatchAll on the whole cluster with all default options
SearchResponse response = client.prepareSearch().execute().actionGet();
For more information on the search operation, check out the REST search docs.
Read the scroll documentation first!
import static org.elasticsearch.index.query.FilterBuilders.*;
import static org.elasticsearch.index.query.QueryBuilders.*;
QueryBuilder qb = termQuery("multi", "test");
SearchResponse scrollResp = client.prepareSearch("test")
.setSearchType(SearchType.SCAN)
.setScroll(new TimeValue(60000))
.setQuery(qb)
.setSize(100).execute().actionGet(); //100 hits per shard will be returned for each scroll
//Scroll until no hits are returned
while (true) {
for (SearchHit hit : scrollResp.getHits()) {
//Handle the hit...
}
scrollResp = client.prepareSearchScroll(scrollResp.getScrollId()).setScroll(new TimeValue(600000)).execute().actionGet();
//Break condition: No hits are returned
if (scrollResp.getHits().getHits().length == 0) {
break;
}
}
The search API allows one to set the threading model the operation will be performed with, when the actual execution of the API happens on the same node (the API is executed on a shard that is allocated on the same server).
There are three threading modes. The NO_THREADS mode means that the search operation will be executed on the calling thread. The SINGLE_THREAD mode means that the search operation will be executed on a single different thread for all local shards. The THREAD_PER_SHARD mode means that the search operation will be executed on a different thread for each local shard.
The default mode is THREAD_PER_SHARD.
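As a sketch, assuming this version still ships the SearchOperationThreading enum and the corresponding setter on SearchRequestBuilder (both were removed in later releases):
import org.elasticsearch.action.search.SearchOperationThreading;
SearchResponse response = client.prepareSearch("index1")
    .setQuery(QueryBuilders.termQuery("multi", "test"))
    .setOperationThreading(SearchOperationThreading.NO_THREADS) // run on the calling thread
    .execute()
    .actionGet();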
See the MultiSearch API documentation.
SearchRequestBuilder srb1 = node.client()
.prepareSearch().setQuery(QueryBuilders.queryString("elasticsearch")).setSize(1);
SearchRequestBuilder srb2 = node.client()
.prepareSearch().setQuery(QueryBuilders.matchQuery("name", "kimchy")).setSize(1);
MultiSearchResponse sr = node.client().prepareMultiSearch()
.add(srb1)
.add(srb2)
.execute().actionGet();
// You will get all individual responses from MultiSearchResponse#getResponses()
long nbHits = 0;
for (MultiSearchResponse.Item item : sr.getResponses()) {
SearchResponse response = item.getResponse();
nbHits += response.getHits().getTotalHits();
}
The following code shows how to add two facets within your search:
SearchResponse sr = node.client().prepareSearch()
.setQuery(QueryBuilders.matchAllQuery())
.addFacet(FacetBuilders.termsFacet("f1").field("field"))
.addFacet(FacetBuilders.dateHistogramFacet("f2").field("birth").interval("year"))
.execute().actionGet();
// Get your facet results
TermsFacet f1 = (TermsFacet) sr.getFacets().facetsAsMap().get("f1");
DateHistogramFacet f2 = (DateHistogramFacet) sr.getFacets().facetsAsMap().get("f2");
See Facets Java API documentation for details.
The count API allows one to easily execute a query and get the number of matches for that query. It can be executed across one or more indices and across one or more types. The query can be provided using the Query DSL.
import static org.elasticsearch.index.query.FilterBuilders.*;
import static org.elasticsearch.index.query.QueryBuilders.*;
CountResponse response = client.prepareCount("test")
.setQuery(termQuery("_type", "type1"))
.execute()
.actionGet();
For more information on the count operation, check out the REST count docs.
The count API allows one to set the threading model the operation will be performed with, when the actual execution of the API happens on the same node (the API is executed on a shard that is allocated on the same server).
There are three threading modes. The NO_THREADS mode means that the count operation will be executed on the calling thread. The SINGLE_THREAD mode means that the count operation will be executed on a single different thread for all local shards. The THREAD_PER_SHARD mode means that the count operation will be executed on a different thread for each local shard.
The default mode is SINGLE_THREAD.
The delete by query API allows one to delete documents from one or more indices and one or more types based on a query. Here is an example:
import static org.elasticsearch.index.query.FilterBuilders.*;
import static org.elasticsearch.index.query.QueryBuilders.*;
DeleteByQueryResponse response = client.prepareDeleteByQuery("test")
.setQuery(termQuery("_type", "type1"))
.execute()
.actionGet();
For more information on the delete by query operation, check out the delete_by_query API docs.
Elasticsearch provides a full Java API to play with facets. See the Facets guide.
Use the factory for facet builders (FacetBuilders) to build each facet you want to compute, and add it to your search request:
SearchResponse sr = node.client().prepareSearch()
.setQuery( /* your query */ )
.addFacet( /* add a facet */ )
.execute().actionGet();
Note that you can add more than one facet. See Search Java API for details.
To build facet requests, use FacetBuilders helpers. Just import them in your class:
import org.elasticsearch.search.facet.FacetBuilders;
Here is how you can use Terms Facet with Java API.
Here is an example on how to create the facet request:
FacetBuilders.termsFacet("f")
.field("brand")
.size(10);
Import Facet definition classes:
import org.elasticsearch.search.facet.terms.*;
// sr is here your SearchResponse object
TermsFacet f = (TermsFacet) sr.getFacets().facetsAsMap().get("f");
f.getTotalCount(); // Total terms doc count
f.getOtherCount(); // Not shown terms doc count
f.getMissingCount(); // Without term doc count
// For each entry
for (TermsFacet.Entry entry : f) {
entry.getTerm(); // Term
entry.getCount(); // Doc count
}
Here is how you can use Range Facet with Java API.
Here is an example on how to create the facet request:
FacetBuilders.rangeFacet("f")
.field("price") // Field to compute on
.addUnboundedFrom(3) // from -infinity to 3 (excluded)
.addRange(3, 6) // from 3 to 6 (excluded)
.addUnboundedTo(6); // from 6 to +infinity
Import Facet definition classes:
import org.elasticsearch.search.facet.range.*;
// sr is here your SearchResponse object
RangeFacet f = (RangeFacet) sr.getFacets().facetsAsMap().get("f");
// For each entry
for (RangeFacet.Entry entry : f) {
entry.getFrom(); // Range from requested
entry.getTo(); // Range to requested
entry.getCount(); // Doc count
entry.getMin(); // Min value
entry.getMax(); // Max value
entry.getMean(); // Mean
entry.getTotal(); // Sum of values
}
Here is how you can use Histogram Facet with Java API.
Here is an example on how to create the facet request:
HistogramFacetBuilder facet = FacetBuilders.histogramFacet("f")
.field("price")
.interval(1);
Import Facet definition classes:
import org.elasticsearch.search.facet.histogram.*;
// sr is here your SearchResponse object
HistogramFacet f = (HistogramFacet) sr.getFacets().facetsAsMap().get("f");
// For each entry
for (HistogramFacet.Entry entry : f) {
entry.getKey(); // Key (X-Axis)
entry.getCount(); // Doc count (Y-Axis)
}
Here is how you can use Date Histogram Facet with Java API.
Here is an example on how to create the facet request:
FacetBuilders.dateHistogramFacet("f")
.field("date") // Your date field
.interval("year"); // You can also use "quarter", "month", "week", "day",
// "hour" and "minute" or notation like "1.5h" or "2w"
Import Facet definition classes:
import org.elasticsearch.search.facet.datehistogram.*;
// sr is here your SearchResponse object
DateHistogramFacet f = (DateHistogramFacet) sr.getFacets().facetsAsMap().get("f");
// For each entry
for (DateHistogramFacet.Entry entry : f) {
entry.getTime(); // Date in ms since epoch (X-Axis)
entry.getCount(); // Doc count (Y-Axis)
}
Here is how you can use Filter Facet with Java API.
If you are looking for how to apply a filter to a facet, have a look at facet filter using the Java API.
Here is an example on how to create the facet request:
FacetBuilders.filterFacet("f",
FilterBuilders.termFilter("brand", "heineken")); // Your Filter here
See Filters to learn how to build filters using Java.
Import Facet definition classes:
import org.elasticsearch.search.facet.filter.*;
// sr is here your SearchResponse object
FilterFacet f = (FilterFacet) sr.getFacets().facetsAsMap().get("f");
f.getCount(); // Number of docs that matched
Here is how you can use Query Facet with Java API.
Here is an example on how to create the facet request:
FacetBuilders.queryFacet("f",
QueryBuilders.matchQuery("brand", "heineken"));
Import Facet definition classes:
import org.elasticsearch.search.facet.query.*;
// sr is here your SearchResponse object
QueryFacet f = (QueryFacet) sr.getFacets().facetsAsMap().get("f");
f.getCount(); // Number of docs that matched
See Queries to learn how to build queries using Java.
Here is how you can use the Statistical Facet with Java API.
Here is an example on how to create the facet request:
FacetBuilders.statisticalFacet("f")
.field("price");
Import Facet definition classes:
import org.elasticsearch.search.facet.statistical.*;
// sr is here your SearchResponse object
StatisticalFacet f = (StatisticalFacet) sr.getFacets().facetsAsMap().get("f");
f.getCount(); // Doc count
f.getMin(); // Min value
f.getMax(); // Max value
f.getMean(); // Mean
f.getTotal(); // Sum of values
f.getStdDeviation(); // Standard Deviation
f.getSumOfSquares(); // Sum of Squares
f.getVariance(); // Variance
Here is how you can use the Terms Stats Facet with Java API.
Here is an example on how to create the facet request:
FacetBuilders.termsStatsFacet("f")
.keyField("brand")
.valueField("price");
Import Facet definition classes:
import org.elasticsearch.search.facet.termsstats.*;
// sr is here your SearchResponse object
TermsStatsFacet f = (TermsStatsFacet) sr.getFacets().facetsAsMap().get("f");
f.getTotalCount(); // Total terms doc count
f.getOtherCount(); // Not shown terms doc count
f.getMissingCount(); // Without term doc count
// For each entry
for (TermsStatsFacet.Entry entry : f) {
entry.getTerm(); // Term
entry.getCount(); // Doc count
entry.getMin(); // Min value
entry.getMax(); // Max value
entry.getMean(); // Mean
entry.getTotal(); // Sum of values
}
Here is how you can use the Geo Distance Facet with Java API.
Here is an example on how to create the facet request:
FacetBuilders.geoDistanceFacet("f")
.field("pin.location") // Field containing coordinates we want to compare with
.point(40, -70) // Point from where we start (0)
.addUnboundedFrom(10) // 0 to 10 km (excluded)
.addRange(10, 20) // 10 to 20 km (excluded)
.addRange(20, 100) // 20 to 100 km (excluded)
.addUnboundedTo(100) // from 100 km to infinity (and beyond ;-) )
.unit(DistanceUnit.KILOMETERS); // All distances are in kilometers. Can be MILES
Import Facet definition classes:
import org.elasticsearch.search.facet.geodistance.*;
// sr is here your SearchResponse object
GeoDistanceFacet f = (GeoDistanceFacet) sr.getFacets().facetsAsMap().get("f");
// For each entry
for (GeoDistanceFacet.Entry entry : f) {
entry.getFrom(); // Distance from requested
entry.getTo(); // Distance to requested
entry.getCount(); // Doc count
entry.getMin(); // Min value
entry.getMax(); // Max value
entry.getTotal(); // Sum of values
entry.getMean(); // Mean
}
By default, facets are applied on the query resultset, regardless of any filters.
If you need to compute facets with the same filters or even with other filters, you can add the filter to any facet using AbstractFacetBuilder#facetFilter(FilterBuilder) method:
FacetBuilders
.termsFacet("f").field("brand") // Your facet
.facetFilter( // Your filter here
FilterBuilders.termFilter("colour", "pale")
);
For example, you can reuse the same filter you created for your query:
// A common filter
FilterBuilder filter = FilterBuilders.termFilter("colour", "pale");
TermsFacetBuilder facet = FacetBuilders.termsFacet("f")
.field("brand")
.facetFilter(filter); // We apply it to the facet
SearchResponse sr = node.client().prepareSearch()
.setQuery(QueryBuilders.matchAllQuery())
.setFilter(filter) // We apply it to the query
.addFacet(facet)
.execute().actionGet();
See documentation on how to build Filters.
By default, facets are computed within the query resultset. But you can compute facets from all documents in the index, whatever the query is, using the global parameter:
TermsFacetBuilder facet = FacetBuilders.termsFacet("f")
.field("brand")
.global(true);
The percolator allows one to register queries against an index, and then send percolate requests which include a doc, getting back the queries that match on that doc out of the set of registered queries.
Read the main percolate documentation before reading this guide.
import static org.elasticsearch.common.xcontent.XContentFactory.*;
import static org.elasticsearch.index.query.QueryBuilders.*;
//This is the query we're registering in the percolator
QueryBuilder qb = termQuery("content", "amazing");
//Index the query = register it in the percolator
client.prepareIndex("myIndexName", ".percolator", "myDesignatedQueryName")
.setSource(jsonBuilder()
.startObject()
.field("query", qb) // Register the query
.endObject())
.setRefresh(true) // Needed when the query shall be available immediately
.execute().actionGet();
This indexes the above term query under the name myDesignatedQueryName.
In order to check a document against the registered queries, use this code:
//Build a document to check against the percolator
XContentBuilder docBuilder = XContentFactory.jsonBuilder().startObject();
docBuilder.field("doc").startObject(); //This is needed to designate the document
docBuilder.field("content", "This is amazing!");
docBuilder.endObject(); //End of the doc field
docBuilder.endObject(); //End of the JSON root object
//Percolate
PercolateResponse response = client.preparePercolate()
.setIndices("myIndexName")
.setDocumentType("myDocumentType")
.setSource(docBuilder).execute().actionGet();
//Iterate over the results
for(PercolateResponse.Match match : response) {
//Handle the result which is the name of
//the query in the percolator
}
elasticsearch provides a full Java query DSL in a similar manner to the REST Query DSL. The factory for query builders is QueryBuilders. Once your query is ready, you can use the Search API.
See also how to build Filters
To use QueryBuilders or FilterBuilders just import them in your class:
import static org.elasticsearch.index.query.QueryBuilders.*;
import static org.elasticsearch.index.query.FilterBuilders.*;
Note that you can easily print (aka debug) the generated JSON query using the toString() method on the QueryBuilder object.
The QueryBuilder can then be used with any API that accepts a query, such as count and search.
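For example, a minimal sketch of dumping the generated JSON while developing:
QueryBuilder qb = matchQuery("name", "kimchy elasticsearch");
// toString() renders the JSON that would be sent to the cluster
System.out.println(qb.toString());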
See Match Query
QueryBuilder qb = matchQuery(
    "name",                 // field
    "kimchy elasticsearch"  // text
);
See MultiMatch Query
QueryBuilder qb = multiMatchQuery(
    "kimchy elasticsearch", // text
    "user", "message"       // fields
);
See Boolean Query
QueryBuilder qb = boolQuery()
    .must(termQuery("content", "test1"))     // must query
    .must(termQuery("content", "test4"))     // must query
    .mustNot(termQuery("content", "test2"))  // must not query
    .should(termQuery("content", "test3"));  // should query
See Boosting Query
QueryBuilder qb = boostingQuery()
    .positive(termQuery("name","kimchy"))    // query that will promote documents
    .negative(termQuery("name","dadoonet"))  // query that will demote documents
    .negativeBoost(0.2f);                    // negative boost
QueryBuilder qb = constantScoreQuery(
    termFilter("name","kimchy")  // you can use a filter
)
.boost(2.0f);                    // filter score
QueryBuilder qb = constantScoreQuery(
    termQuery("name","kimchy")  // you can use a query
)
.boost(2.0f);                   // query score
QueryBuilder qb = disMaxQuery()
    .add(termQuery("name","kimchy"))         // add your queries
    .add(termQuery("name","elasticsearch"))  // add your queries
    .boost(1.2f)                             // boost factor
    .tieBreaker(0.7f);                       // tie breaker
See Fuzzy Like This Query and Fuzzy Like This Field Query.
QueryBuilder qb = fuzzyLikeThisQuery("name.first", "name.last")  // fields
    .likeText("text like this one")                              // text
    .maxQueryTerms(12);                                          // max num of terms in generated queries
QueryBuilder qb = fuzzyLikeThisFieldQuery("name.first")  // field
    .likeText("text like this one")                      // text
    .maxQueryTerms(12);                                  // max num of terms in generated queries
See Has Child Query and Has Parent Query.
// Has Child
QueryBuilder qb = hasChildQuery(
    "blog_tag",                   // child type to query against
    termQuery("tag","something")  // query (could also be a filter)
);
// Has Parent
QueryBuilder qb = hasParentQuery(
    "blog",                       // parent type to query against
    termQuery("tag","something")  // query (could also be a filter)
);
See More Like This Query.
// mlt Query
QueryBuilder qb = moreLikeThisQuery("name.first", "name.last")  // fields
    .likeText("text like this one")                             // text
    .minTermFreq(1)                                             // ignore threshold
    .maxQueryTerms(12);                                         // max num of terms in generated queries
See Range Query
QueryBuilder qb = rangeQuery("price")  // field
    .from(5)                           // from
    .to(10)                            // to
    .includeLower(true)                // include lower value: from is gte when true, gt when false
    .includeUpper(false);              // include upper value: to is lte when true, lt when false
See Span Term Query, Span First Query, Span Near Query, Span Not Query and Span Or Query.
// Span Term
QueryBuilder qb = spanTermQuery(
    "user",   // field
    "kimchy"  // value
);
// Span First
QueryBuilder qb = spanFirstQuery(
    spanTermQuery("user", "kimchy"),  // query
    3                                 // max end position
);
// Span Near
QueryBuilder qb = spanNearQuery()
    .clause(spanTermQuery("field","value1"))  // span term queries
    .clause(spanTermQuery("field","value2"))
    .clause(spanTermQuery("field","value3"))
    .slop(12)                // slop factor: the maximum number of intervening unmatched positions
    .inOrder(false)          // whether matches are required to be in-order
    .collectPayloads(false); // collect payloads or not
// Span Not
QueryBuilder qb = spanNotQuery()
    .include(spanTermQuery("field","value1"))   // span query whose matches are filtered
    .exclude(spanTermQuery("field","value2"));  // span query whose matches must not overlap those returned
// Span Or
QueryBuilder qb = spanOrQuery()
    .clause(spanTermQuery("field","value1"))   // span term queries
    .clause(spanTermQuery("field","value2"))
    .clause(spanTermQuery("field","value3"));
See Terms Query
QueryBuilder qb = termsQuery("tags",  // field
    "blue", "pill")                   // values
    .minimumMatch(1);                 // how many terms must match at least
QueryBuilder qb = topChildrenQuery(
    "blog_tag",                    // child type to query against
    termQuery("tag", "something")  // query
)
.score("max")           // max, sum or avg
.factor(5)              // how many hits are asked for in the first child query run (defaults to 5)
.incrementalFactor(2);  // if not enough parents are found, and there are still more child docs to query, then the child search hits are expanded by multiplying by the incremental_factor (defaults to 2)
See Nested Query
QueryBuilder qb = nestedQuery(
    "obj1",                                    // path to nested document
    boolQuery()                                // your query; any fields referenced inside must use the complete (fully qualified) path
        .must(matchQuery("obj1.name", "blue"))
        .must(rangeQuery("obj1.count").gt(5))
)
.scoreMode("avg");                             // score mode: max, total, avg (default) or none
See Indices Query
// Using another query when no match for the main one
QueryBuilder qb = indicesQuery(
    termQuery("tag", "wow"),                // query to be executed on selected indices
    "index1", "index2"                      // selected indices
)
.noMatchQuery(termQuery("tag", "kow"));     // query to be executed on non matching indices
// Using all (match all) or none (match no documents)
QueryBuilder qb = indicesQuery(
    termQuery("tag", "wow"),  // query to be executed on selected indices
    "index1", "index2"        // selected indices
)
.noMatchQuery("all");         // "none" (match no documents) or "all" (match all documents); defaults to "all"
See GeoShape Query
Note: the geo_shape type uses Spatial4J and JTS, both of which are optional dependencies. Consequently you must add Spatial4J and JTS to your classpath in order to use this type:
<dependency>
    <groupId>com.spatial4j</groupId>
    <artifactId>spatial4j</artifactId>
    <version>0.4.1</version> <!-- check for updates in Maven Central -->
</dependency>
<dependency>
    <groupId>com.vividsolutions</groupId>
    <artifactId>jts</artifactId>
    <version>1.13</version> <!-- check for updates in Maven Central -->
    <exclusions>
        <exclusion>
            <groupId>xerces</groupId>
            <artifactId>xercesImpl</artifactId>
        </exclusion>
    </exclusions>
</dependency>
// Import Spatial4J shapes
import com.spatial4j.core.context.SpatialContext;
import com.spatial4j.core.shape.Shape;
import com.spatial4j.core.shape.impl.RectangleImpl;
// Also import ShapeRelation
import org.elasticsearch.common.geo.ShapeRelation;
// Shape within another
QueryBuilder qb = geoShapeQuery(
    "location",                                          // field
    new RectangleImpl(0, 10, 0, 10, SpatialContext.GEO)  // shape
)
.relation(ShapeRelation.WITHIN);                         // relation
// Intersect shapes (also requires: import com.spatial4j.core.shape.impl.PointImpl;)
QueryBuilder qb = geoShapeQuery(
    "location",                              // field
    new PointImpl(0, 0, SpatialContext.GEO)  // shape
)
.relation(ShapeRelation.INTERSECTS);         // relation
// Using pre-indexed shapes
QueryBuilder qb = geoShapeQuery(
    "location",     // field
    "New Zealand",  // indexed shape id
    "countries")    // index type of the indexed shapes
    .relation(ShapeRelation.DISJOINT);  // relation
elasticsearch provides a full Java filter DSL in a similar manner to the REST Query DSL. The factory for filter builders is FilterBuilders.
Once your query is ready, you can use the Search API.
See also how to build Queries.
To use QueryBuilders or FilterBuilders just import them in your class:
import static org.elasticsearch.index.query.QueryBuilders.*;
import static org.elasticsearch.index.query.FilterBuilders.*;
Note that you can easily print (aka debug) the generated JSON filter using the toString() method on the FilterBuilder object.
See And Filter
FilterBuilder filter = andFilter(
    rangeFilter("postDate").from("2010-03-01").to("2010-04-01"),  // filters
    prefixFilter("name.second", "ba"));
Note that you can cache the result using the AndFilterBuilder#cache(boolean) method. See the filter caching section below.
See Bool Filter
FilterBuilder filter = boolFilter()
    .must(termFilter("tag", "wow"))                   // must filter
    .mustNot(rangeFilter("age").from("10").to("20"))  // must not filter
    .should(termFilter("tag", "sometag"))             // should filter
    .should(termFilter("tag", "sometagtag"));         // should filter
Note that you can cache the result using the BoolFilterBuilder#cache(boolean) method. See the filter caching section below.
See IDs Filter
// type is optional
FilterBuilder filter = idsFilter("my_type", "type2")
    .addIds("1", "4", "100");
// or without type
FilterBuilder filter = idsFilter()
    .addIds("1", "4", "100");
See Limit Filter
FilterBuilder filter = limitFilter(100);  // number of documents per shard
FilterBuilder filter = geoBoundingBoxFilter("pin.location")  // field
    .topLeft(40.73, -74.1)                                   // bounding box top left point
    .bottomRight(40.717, -73.99);                            // bounding box bottom right point
Note that you can cache the result using the GeoBoundingBoxFilterBuilder#cache(boolean) method. See the filter caching section below.
FilterBuilder filter = geoDistanceFilter("pin.location")  // field
    .point(40, -70)                                       // center point
    .distance(200, DistanceUnit.KILOMETERS)               // distance from center point
    .optimizeBbox("memory")                               // optimize bounding box: memory, indexed or none
    .geoDistance(GeoDistance.ARC);                        // distance computation mode: GeoDistance.SLOPPY_ARC (default), GeoDistance.ARC (slightly more precise but significantly slower) or GeoDistance.PLANE (faster, but inaccurate on long distances and close to the poles)
Note that you can cache the result using the GeoDistanceFilterBuilder#cache(boolean) method. See the filter caching section below.
FilterBuilder filter = geoDistanceRangeFilter("pin.location")  // field
    .point(40, -70)                 // center point
    .from("200km")                  // starting distance from center point
    .to("400km")                    // ending distance from center point
    .includeLower(true)             // include lower value: from is gte when true, gt when false
    .includeUpper(false)            // include upper value: to is lte when true, lt when false
    .optimizeBbox("memory")         // optimize bounding box: memory, indexed or none
    .geoDistance(GeoDistance.ARC);  // distance computation mode: GeoDistance.SLOPPY_ARC (default), GeoDistance.ARC (slightly more precise but significantly slower) or GeoDistance.PLANE (faster, but inaccurate on long distances and close to the poles)
Note that you can cache the result using the GeoDistanceRangeFilterBuilder#cache(boolean) method. See the filter caching section below.
FilterBuilder filter = geoPolygonFilter("pin.location")  // field
    .addPoint(40, -70)  // add your polygon of points a document should fall within
    .addPoint(30, -80)
    .addPoint(20, -90);
Note that you can cache the result using the GeoPolygonFilterBuilder#cache(boolean) method. See the filter caching section below.
See Geo Shape Filter
Note: the geo_shape type uses Spatial4J and JTS, both of which are optional dependencies. Consequently you must add Spatial4J and JTS to your classpath in order to use this type; see the GeoShape Query section above for the required Maven dependencies.
// Import Spatial4J shapes
import com.spatial4j.core.context.SpatialContext;
import com.spatial4j.core.shape.Shape;
import com.spatial4j.core.shape.impl.RectangleImpl;
// Also import ShapeRelation
import org.elasticsearch.common.geo.ShapeRelation;
// Shape within another
FilterBuilder filter = geoShapeFilter(
    "location",                                          // field
    new RectangleImpl(0, 10, 0, 10, SpatialContext.GEO)  // shape
)
.relation(ShapeRelation.WITHIN);                         // relation
// Intersect shapes (also requires: import com.spatial4j.core.shape.impl.PointImpl;)
FilterBuilder filter = geoShapeFilter(
    "location",                              // field
    new PointImpl(0, 0, SpatialContext.GEO)  // shape
)
.relation(ShapeRelation.INTERSECTS);         // relation
// Using pre-indexed shapes
FilterBuilder filter = geoShapeFilter(
    "location",     // field
    "New Zealand",  // indexed shape id
    "countries")    // index type of the indexed shapes
    .relation(ShapeRelation.DISJOINT);  // relation
See Has Child Filter and Has Parent Filter.
// Has Child
FilterBuilder filter = hasChildFilter(
    "blog_tag",                    // child type to query against
    termFilter("tag","something")  // filter (could also be a query)
);
// Has Parent
FilterBuilder filter = hasParentFilter(
    "blog",                        // parent type to query against
    termFilter("tag","something")  // filter (could also be a query)
);
See Missing Filter
FilterBuilder filter = missingFilter("user")  // field
    .existence(true)   // find missing field that doesn't exist
    .nullValue(true);  // find missing field with an explicit null value
See Not Filter
FilterBuilder filter = notFilter(
    rangeFilter("price").from("1").to("2")  // filter
);
See Or Filter
FilterBuilder filter = orFilter(
    termFilter("name.second", "banon"),  // filters
    termFilter("name.nick", "kimchy")
);
Note that you can cache the result using the OrFilterBuilder#cache(boolean) method. See the filter caching section below.
See Prefix Filter
FilterBuilder filter = prefixFilter(
    "user",  // field
    "ki"     // prefix
);
Note that you can cache the result using the PrefixFilterBuilder#cache(boolean) method. See the filter caching section below.
See Query Filter
FilterBuilder filter = queryFilter(
    queryString("this AND that OR thus")  // query you want to wrap as a filter
);
Note that you can cache the result using the QueryFilterBuilder#cache(boolean) method. See the filter caching section below.
See Range Filter
FilterBuilder filter = rangeFilter("age")  // field
    .from("10")            // from
    .to("20")              // to
    .includeLower(true)    // include lower value: from is gte when true, gt when false
    .includeUpper(false);  // include upper value: to is lte when true, lt when false
// A simplified form using gte, gt, lt or lte
FilterBuilder filter = rangeFilter("age")  // field
    .gte("10")   // set from to 10 and includeLower to true
    .lt("20");   // set to to 20 and includeUpper to false
Note that you can ask not to cache the result using the RangeFilterBuilder#cache(boolean) method. See the filter caching section below.
See Script Filter
FilterBuilder filter = scriptFilter(
    "doc['age'].value > param1"  // script to execute
).addParam("param1", 10);        // parameters
Note that you can cache the result using the ScriptFilterBuilder#cache(boolean) method. See the filter caching section below.
See Term Filter
FilterBuilder filter = termFilter(
    "user",   // field
    "kimchy"  // value
);
Note that you can ask not to cache the result using the TermFilterBuilder#cache(boolean) method. See the filter caching section below.
See Terms Filter
FilterBuilder filter = termsFilter(
    "user",          // field
    "kimchy",        // terms
    "elasticsearch"
)
.execution("plain"); // execution mode: plain, fielddata, bool, and, or, bool_nocache, and_nocache or or_nocache
Note that you can ask not to cache the result using the TermsFilterBuilder#cache(boolean) method. See the filter caching section below.
See Nested Filter
FilterBuilder filter = nestedFilter("obj1",  // path to nested document
    boolFilter()                             // filter
        .must(termFilter("obj1.name", "blue"))
        .must(rangeFilter("obj1.count").gt(5))
);
Note that you can ask not to cache the result using the NestedFilterBuilder#cache(boolean) method. See the filter caching section below.
By default, some filters are cached and others are not. You can fine-tune this behavior using the cache(boolean) method where it exists. For example:
FilterBuilder filter = andFilter(
    rangeFilter("postDate").from("2010-03-01").to("2010-04-01"),
    prefixFilter("name.second", "ba")
)
.cache(true);  // force caching the filter