There are several languages for querying Elasticsearch (e.g. KQL, EQL, DSL, and so on). In this blog I will talk about the Query DSL (Domain Specific Language), which is the most flexible and gives access to all Elasticsearch options!
Elasticsearch provides a full Query DSL (Domain Specific Language) based on JSON to define queries. Think of the Query DSL as an AST (Abstract Syntax Tree) of queries, consisting of two types of clauses:
- Leaf query clauses
Leaf query clauses look for a particular value in a particular field, such as the match, term or range queries. These queries can be used by themselves.
- Compound query clauses
Compound query clauses wrap other leaf or compound queries and are used to combine multiple queries in a logical fashion (such as the bool or dis_max query), or to alter their behaviour (such as the constant_score query).
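As a small illustration of the two clause types, the sketch below builds one leaf clause and one compound clause as plain Python dictionaries and serializes them to the JSON that the search API expects in a request body. The field names (status, title) and their values are placeholders chosen for illustration, not fields of any real index.

```python
import json

# Leaf clause: looks for one value in one field and can be used on its own.
leaf_query = {"term": {"status": "published"}}

# Compound clause: a bool query that wraps leaf clauses and combines them.
compound_query = {
    "bool": {
        "must": [
            {"match": {"title": "blog"}},  # a leaf clause
            leaf_query,                    # the leaf clause defined above, reused
        ]
    }
}

# Either clause is a valid value for the top-level "query" key of a request body.
leaf_body = json.dumps({"query": leaf_query}, indent=2)
compound_body = json.dumps({"query": compound_query}, indent=2)

print(leaf_body)
print(compound_body)
```

The printed JSON can be pasted after GET /_search in the Console, or sent with any HTTP client.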
Query clauses behave differently depending on whether they are used in query context or filter context.
Query and filter context
Let's look at both contexts, but first we need to understand how Elasticsearch sorts matching results.
Relevance scores
By default, Elasticsearch sorts matching search results by relevance score, which measures how well each document matches a query.
The relevance score is a positive floating point number, returned in the _score metadata field of the search API. The higher the _score, the more relevant the document. While each query type can calculate relevance scores differently, score calculation also depends on whether the query clause is run in a query or filter context.
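To make the _score field concrete, here is a minimal sketch that reads the score of each hit from a search response. The response below is a hand-written stand-in that only mimics the shape of a search API response (hits.hits[] with _id, _score and _source); the ids, titles and score values are made up for illustration, and scores_by_id is a hypothetical helper name.

```python
# Hand-written stand-in for a search API response; every value here is made up.
sample_response = {
    "hits": {
        "max_score": 1.3,
        "hits": [
            {"_id": "1", "_score": 1.3, "_source": {"title": "Elasticsearch blog"}},
            {"_id": "2", "_score": 0.4, "_source": {"title": "Another blog"}},
        ],
    }
}


def scores_by_id(response):
    """Return (document id, relevance score) pairs in the order the hits arrive."""
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]


ranked = scores_by_id(sample_response)
for doc_id, score in ranked:
    print(doc_id, score)
```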
Query context
In the query context, a query clause answers the question “How well does this document match this query clause?” Besides deciding whether or not the document matches, the query clause also calculates a relevance score in the _score metadata field.
Query context is in effect whenever a query clause is passed to a query parameter, such as the query parameter in the search API.
Filter context
In a filter context, a query clause answers the question “Does this document match this query clause?” The answer is a simple Yes or No - no scores are calculated. Filter context is mostly used for filtering structured data, e.g.
- Does this timestamp fall into the range 2022 to 2023?
- Is the status field of a blog set to “published”?
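The two questions above could be expressed as clauses in filter context roughly as follows. This is only a sketch: the field names timestamp and status are assumed, and "2022 to 2023" is read here as "from the start of 2022 up to, but not including, the start of 2024".

```python
import json

# Both clauses sit under "filter", so they only answer yes or no.
filter_only_query = {
    "bool": {
        "filter": [
            # Does this timestamp fall into the range 2022 to 2023? (assumed field name)
            {"range": {"timestamp": {"gte": "2022-01-01", "lt": "2024-01-01"}}},
            # Is the status field set to "published"? (assumed field name)
            {"term": {"status": "published"}},
        ]
    }
}

request_body = json.dumps({"query": filter_only_query}, indent=2)
print(request_body)
```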
Example of query and filter contexts
Below is an example of query clauses being used in query and filter context in the search API. This query will match documents where all of the following conditions are met:
- The title field contains the word blog.
- The content field contains the word elasticsearch.
- The status field contains the exact word published.
- The publish_date field contains a date from 1 June 2023 onwards.
GET /_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Blog" }},
        { "match": { "content": "Elasticsearch" }}
      ],
      "filter": [
        { "term": { "status": "published" }},
        { "range": { "publish_date": { "gte": "2023-06-01" }}}
      ]
    }
  }
}
The query parameter indicates query context.
The bool and two match clauses are used in query context, which means that they are used to score how well each document matches.
The filter parameter indicates filter context. Its term and range clauses are used in filter context. They will filter out documents which do not match, but they will not affect the score for matching documents.
You got it: use query clauses in query context for conditions that should affect the score of matching documents, and use all other query clauses in filter context 😉
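This rule of thumb can be captured in a tiny helper. The sketch below is not part of any Elasticsearch client; build_bool_query is a hypothetical function that simply places the clauses that should influence the score under must and the yes/no clauses under filter.

```python
import json


def build_bool_query(scoring_clauses, filtering_clauses):
    """Place scoring clauses under "must" and yes/no clauses under "filter"."""
    bool_body = {}
    if scoring_clauses:
        bool_body["must"] = list(scoring_clauses)      # query context
    if filtering_clauses:
        bool_body["filter"] = list(filtering_clauses)  # filter context
    return {"query": {"bool": bool_body}}


body = build_bool_query(
    scoring_clauses=[
        {"match": {"title": "Blog"}},
        {"match": {"content": "Elasticsearch"}},
    ],
    filtering_clauses=[{"term": {"status": "published"}}],
)
print(json.dumps(body, indent=2))
```

Moving a clause from one list to the other is all it takes to change whether it contributes to the score.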
Query groups
As you can imagine, there are a lot of queries, which is why they are grouped by “type”. One blog is really not enough to go through all the query groups; below is the list of important groups that will be detailed in upcoming blogs:
Full Text queries
The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing.
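As a quick taste, a match query on an email body might look like the sketch below. The field name body and the search text are assumptions for illustration. The operator parameter is set to and so that every term of the analyzed query string must be present; when it is omitted, match uses or.

```python
import json

full_text_query = {
    "match": {
        "body": {                       # assumed name of an analyzed text field
            "query": "quarterly report",
            "operator": "and",          # require all terms instead of the default "or"
        }
    }
}

request_body = json.dumps({"query": full_text_query}, indent=2)
print(request_body)
```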
Compound queries
Compound queries wrap other compound or leaf queries, either to combine their results and scores, to change their behaviour, or to switch from query to filter context.
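Two short sketches of compound queries, again as Python dictionaries with placeholder field names and values: constant_score wraps a filter clause and gives every matching document the same score (the boost value), while dis_max wraps several queries and scores a document by its best-matching clause, with tie_breaker adding a fraction of the scores of the other matching clauses.

```python
import json

# Switches the wrapped clause to filter context and assigns a fixed score.
constant_score_query = {
    "constant_score": {
        "filter": {"term": {"status": "published"}},
        "boost": 1.2,
    }
}

# Combines the scores of several queries, favouring the best-matching one.
dis_max_query = {
    "dis_max": {
        "queries": [
            {"match": {"title": "elasticsearch"}},
            {"match": {"content": "elasticsearch"}},
        ],
        "tie_breaker": 0.3,
    }
}

for query in (constant_score_query, dis_max_query):
    print(json.dumps({"query": query}, indent=2))
```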
Geo queries
Elasticsearch supports two types of geo data: geo_point fields which support lat/lon pairs, and geo_shape fields, which support points, lines, circles, polygons, multi-polygons, etc…
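For a first impression, here is one query per geo data type, sketched as Python dictionaries. The field names location (assumed to be mapped as geo_point) and area (assumed to be mapped as geo_shape), as well as the coordinates, are made up for illustration.

```python
import json

# geo_point: documents whose point lies within 10 km of the given lat/lon.
geo_distance_query = {
    "geo_distance": {
        "distance": "10km",
        "location": {"lat": 48.8566, "lon": 2.3522},
    }
}

# geo_shape: documents whose shape lies within a rectangular envelope,
# given as [[min lon, max lat], [max lon, min lat]].
geo_shape_query = {
    "geo_shape": {
        "area": {
            "shape": {
                "type": "envelope",
                "coordinates": [[13.0, 53.0], [14.0, 52.0]],
            },
            "relation": "within",
        }
    }
}

for query in (geo_distance_query, geo_shape_query):
    print(json.dumps({"query": query}, indent=2))
```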
In the next blogs, I will go through all these groups, showing you some examples of each one 🙂