After a global overview of the Elastic Stack and a deeper look at Elasticsearch terminology, this third blog in the Elastic Stack series shows you how to set up Elasticsearch: downloading, installing, configuring, and starting it.

Supported platforms

Before any installation, check the compatibility matrix for operating systems and JVMs, which is available here! Elasticsearch is tested on the listed platforms, but it may work on other platforms too.

Host Elasticsearch

It is recommended to run Elasticsearch on a dedicated host or as a primary service. Several Elasticsearch features, such as automatic JVM heap sizing, assume it’s the only resource-intensive application on the host or container.
You can run Elasticsearch on your own hardware or use the hosted Elasticsearch Service, which is available on AWS, GCP, and Azure!

Download Elasticsearch

As I will install Elasticsearch myself, I need to download it. Elasticsearch is provided in different package formats depending on the OS.
In my case, I will download the latest stable version as a tar.gz archive, a format available for any Linux distribution and macOS.
The Linux archive for Elasticsearch v7.16.2 (the latest version at the time of writing) can be downloaded as follows:

[elastic@vmelastic app]$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.16.2-linux-x86_64.tar.gz
--2021-12-28 13:49:54--  https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.16.2-linux-x86_64.tar.gz
... connected.
Proxy request sent, awaiting response... 200 OK
Length: 343664171 (328M) [application/x-gzip]
Saving to: `elasticsearch-7.16.2-linux-x86_64.tar.gz'

100%[===================================================================================================================================================>] 343,664,171 35.0M/s   in 9.7s

2021-12-28 13:50:04 (33.8 MB/s) - `elasticsearch-7.16.2-linux-x86_64.tar.gz' saved [343664171/343664171]

[elastic@vmelastic app]$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.16.2-linux-x86_64.tar.gz.sha512
--2021-12-28 13:50:04--  https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.16.2-linux-x86_64.tar.gz.sha512
... connected.
Proxy request sent, awaiting response... 200 OK
Length: 171 [binary/octet-stream]
Saving to: `elasticsearch-7.16.2-linux-x86_64.tar.gz.sha512'

100%[===================================================================================================================================================>] 171         --.-K/s   in 0s

2021-12-28 13:50:04 (23.5 MB/s) - `elasticsearch-7.16.2-linux-x86_64.tar.gz.sha512' saved [171/171]

[elastic@vmelastic app]$ shasum -a 512 -c elasticsearch-7.16.2-linux-x86_64.tar.gz.sha512
elasticsearch-7.16.2-linux-x86_64.tar.gz: OK
[elastic@vmelastic app]$

The last command compares the SHA-512 checksum of the downloaded .tar.gz archive with the published checksum, and should output elasticsearch-{version}-linux-x86_64.tar.gz: OK, as shown above.
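
If shasum is not installed on your host, the same check can usually be done with sha512sum. Here is a minimal sketch of a wrapper that tries one and falls back to the other; the function name verify_archive is hypothetical and only used for illustration.

```shell
# verify_archive is a hypothetical helper, not part of Elasticsearch.
# It checks an archive against the .sha512 file stored next to it.
verify_archive() {
  archive="$1"
  dir=$(dirname "$archive")
  file=$(basename "$archive")
  # The checksum file lists a bare file name, so run the check from its directory.
  (
    cd "$dir" || exit 1
    if command -v sha512sum >/dev/null 2>&1; then
      sha512sum -c "$file.sha512"
    else
      shasum -a 512 -c "$file.sha512"
    fi
  )
}
```

It could be called as verify_archive ./elasticsearch-7.16.2-linux-x86_64.tar.gz, and it returns a non-zero status when the checksum does not match.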

Install Elasticsearch

Simply extract the tar.gz file:

[elastic@vmelastic app]$ tar -xzf elasticsearch-7.16.2-linux-x86_64.tar.gz
[elastic@vmelastic app]$ ls -rtl
total 335624
drwxr-x---. 9 elastic elastic      4096 Dec 18 19:48 elasticsearch-7.16.2
-rw-r-----. 1 elastic elastic 343664171 Dec 19 11:01 elasticsearch-7.16.2-linux-x86_64.tar.gz
-rw-r-----. 1 elastic elastic       171 Dec 19 11:01 elasticsearch-7.16.2-linux-x86_64.tar.gz.sha512
[elastic@vmelastic app]$ cd elasticsearch-7.16.2

The path to the elasticsearch-7.16.2 directory is known as $ES_HOME.
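
To avoid retyping that path, the extraction and the $ES_HOME definition can be wrapped together. This is only a sketch: extract_es is a hypothetical helper name, and it assumes the archive contains a single top-level directory (as the listing above suggests). Exporting ES_HOME is a convenience for your own shell session.

```shell
# extract_es is a hypothetical helper: unpack the archive into a target
# directory and export ES_HOME as the path of the extracted folder.
extract_es() {
  archive="$1"
  dest="$2"
  mkdir -p "$dest"
  tar -xzf "$archive" -C "$dest" || return 1
  # Take the first path component of the first entry in the archive.
  top=$(tar -tzf "$archive" | sed -n '1s|/.*||p')
  export ES_HOME="$dest/$top"
  echo "$ES_HOME"
}
```

Call it directly (not inside a command substitution) so that the exported variable stays in your shell, for example extract_es elasticsearch-7.16.2-linux-x86_64.tar.gz /path/to/app.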

The content of $ES_HOME:

[elastic@vmelastic elasticsearch-7.16.2]$ ls -rtl
total 652
-rw-r-----.  1 elastic elastic   2710 Dec 18 19:40 README.asciidoc
-rw-r-----.  1 elastic elastic   3860 Dec 18 19:40 LICENSE.txt
drwxr-x---.  2 elastic elastic   4096 Dec 18 19:45 plugins
drwxr-x---.  2 elastic elastic   4096 Dec 18 19:45 logs
-rw-r-----.  1 elastic elastic 627787 Dec 18 19:45 NOTICE.txt
drwxr-x---.  3 elastic elastic   4096 Dec 18 19:48 lib
drwxr-x---.  2 elastic elastic   4096 Dec 18 19:48 bin
drwxr-x---.  9 elastic elastic   4096 Dec 18 19:48 jdk
drwxr-x---. 61 elastic elastic   4096 Dec 18 19:48 modules
drwxr-x---.  3 elastic elastic   4096 Dec 28 13:58 config

Elasticsearch writes the data you index into indices and data streams to a data directory, which will be created, and writes its own application logs, which contain information about cluster health and operations, to the logs directory.
It is highly recommended to set path.data and path.logs in elasticsearch.yml (see below) to locations outside of $ES_HOME, because files in $ES_HOME risk deletion during an upgrade.

Configure Elasticsearch

In fact, Elasticsearch ships with good defaults and requires very little configuration. However, the configuration files should contain settings which are node-specific, such as node.name and paths, or settings which a node requires in order to be able to join a cluster, such as cluster.name and network.host.

Basically, you have to know about three configuration files:

  • elasticsearch.yml to configure Elasticsearch
  • jvm.options to configure Elasticsearch JVM settings
  • log4j2.properties to configure Elasticsearch logging

These files are located by default in $ES_HOME/config; elasticsearch.yml is in YAML format, while jvm.options and log4j2.properties use their own formats. You can change the location with the ES_PATH_CONF environment variable, which is recommended for the same reason as for the data and logs directories!
Please note that environment variables referenced with the ${VARIABLE} notation within elasticsearch.yml will be replaced with the value of the environment variable. This is really helpful 😉
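
Moving the configuration out of $ES_HOME can be sketched as below. The helper name relocate_config and the target path are my assumptions; the idea is simply to copy the shipped config directory elsewhere and point ES_PATH_CONF at the copy.

```shell
# relocate_config is a hypothetical helper: copy the shipped config directory
# to a location outside $ES_HOME and point ES_PATH_CONF at the copy.
relocate_config() {
  es_home="$1"
  target="$2"
  mkdir -p "$target"
  # "config/." copies the directory content, including sub-directories.
  cp -R "$es_home/config/." "$target/"
  export ES_PATH_CONF="$target"
}
```

For example, relocate_config "$ES_HOME" /path/to/config, after which the vi commands in the next sections edit the copied files.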

Set the JVM heap size

To override the default heap size, set the minimum and maximum heap size settings, Xms and Xmx.
The two values must be the same and no more than 50% of your total memory, because Elasticsearch requires memory for purposes other than the JVM heap!

To do so, update jvm.options in the custom config directory.

[elastic@vmelastic ~]$ vi $ES_PATH_CONF/jvm.options

Add the two lines below, adapted to your environment and your needs:

-Xms2g
-Xmx2g
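
The 50% rule is simple arithmetic, so here is a small sketch of it. The helper name heap_options and the 31 GB ceiling are my assumptions (the ceiling reflects the common advice to keep the heap below the compressed object pointers threshold); adjust both to your environment.

```shell
# heap_options is a hypothetical helper: given the total RAM in MB, print
# identical -Xms/-Xmx lines set to half of it, capped at 31 GB (assumed ceiling).
heap_options() {
  total_mb="$1"
  heap_mb=$((total_mb / 2))
  cap_mb=$((31 * 1024))
  if [ "$heap_mb" -gt "$cap_mb" ]; then
    heap_mb="$cap_mb"
  fi
  printf '%s\n' "-Xms${heap_mb}m" "-Xmx${heap_mb}m"
}

# A host with 4096 MB of RAM gets a 2048 MB heap.
heap_options 4096
```

Recent 7.x releases also read files ending in .options from a jvm.options.d directory inside the config directory, so the output could be redirected to such a file instead of editing jvm.options by hand.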

Configure elasticsearch.yml

To configure Elasticsearch, update elasticsearch.yml in the custom config directory:

[elastic@vmelastic ~]$ vi $ES_PATH_CONF/elasticsearch.yml

Cluster name
A node can only join a cluster when it shares its cluster.name with all the other nodes in the cluster. The default name is elasticsearch, but you should change it to an appropriate name which describes the purpose of the cluster, especially if you run more than one cluster.

cluster.name: elasticsearch-logging

Node name
It is worth giving each node a meaningful name, which also has the advantage of persisting after the node restarts:

node.name: Master1

If you have a dedicated host per node, it makes sense to set it to the server’s HOSTNAME as follows:

node.name: ${HOSTNAME}
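
The ${HOSTNAME} substitution only works if HOSTNAME is really present in the environment of the Elasticsearch process. Some shells set it without exporting it, so a small guard before starting the node can help. This is a sketch; ensure_hostname_exported is a hypothetical helper name.

```shell
# ensure_hostname_exported is a hypothetical helper: make sure HOSTNAME is set
# and exported, so that child processes such as Elasticsearch can read it.
ensure_hostname_exported() {
  if [ -z "${HOSTNAME:-}" ]; then
    # uname -n prints the node name of the host.
    HOSTNAME=$(uname -n)
  fi
  export HOSTNAME
}

ensure_hostname_exported
```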

Data and Log path
If you are using the .zip or .tar.gz archives, the data and logs directories are sub-folders of $ES_HOME as we saw above. Please be careful: if these important folders are left in their default locations, there is a high risk of them being deleted while upgrading Elasticsearch to a new version (I know I am repeating myself 😉)

So, to change the locations of the data and log folders:

path.logs: /data/log/elasticsearch
path.data: /data/elasticsearch
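
These directories should be writable by the user that runs Elasticsearch, so you may want to create them before the first start. A minimal sketch, where prepare_dirs is a hypothetical helper name and the 750 permissions are my assumption:

```shell
# prepare_dirs is a hypothetical helper: create the data and log directories
# under a base directory, using the same layout as the paths above.
prepare_dirs() {
  base="$1"
  mkdir -p "$base/elasticsearch" "$base/log/elasticsearch" || return 1
  # Keep the directories private to the owner and group (assumed permissions).
  chmod 750 "$base/elasticsearch" "$base/log/elasticsearch"
}
```

With the paths used here it would be called as prepare_dirs /data, from an account that is allowed to write under /data and that owns the resulting directories.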

Network Host
By default, Elasticsearch binds to loopback addresses only (127.0.0.1 and [::1]). This may be sufficient to run a single development node on a server, but not to build a cluster with multiple nodes.

I recommend setting this value in all your installations:

network.host: XXX.XXX.X.XX

On the other hand, be aware that when you set network.host, Elasticsearch assumes that you are moving from development mode to production mode, and upgrades a number of system startup checks from warnings to exceptions!
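
Two of the checks that commonly fail on Linux are the vm.max_map_count kernel setting (at least 262144) and the open file descriptor limit (at least 65535). Here is a sketch of a pre-flight test; check_min is a hypothetical helper name, and the sample values passed at the end are for illustration only.

```shell
# check_min is a hypothetical helper: report whether a value meets a minimum.
check_min() {
  name="$1"
  actual="$2"
  required="$3"
  case "$actual" in
    # Values such as "unlimited" cannot be compared numerically.
    ''|*[!0-9]*) echo "SKIP $name=$actual (not a number)"; return 0 ;;
  esac
  if [ "$actual" -ge "$required" ]; then
    echo "OK   $name=$actual"
  else
    echo "FAIL $name=$actual (need at least $required)"
    return 1
  fi
}

# On Linux, the real values could be read from the running system:
#   check_min vm.max_map_count "$(sysctl -n vm.max_map_count)" 262144
#   check_min max_file_descriptors "$(ulimit -n)" 65535
# Illustration with a sample value:
check_min vm.max_map_count 262144 262144
```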

Discovery
The discovery.seed_hosts setting provides a list of the addresses of the master-eligible nodes in the cluster. Each address has the format host:port or host. If the port is not given, then it is determined by checking the following settings in order:

transport.profiles.default.port
transport.port

If neither of these is set then the default port is 9300. The default value for discovery.seed_hosts is:

discovery.seed_hosts: ["127.0.0.1", "[::1]"]

In our case, as we set up only one node, we will list it as the only master-eligible node:

discovery.seed_hosts: ["XXX.XXX.X.X"]
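
Putting the settings of this section together, the file could be generated as in the sketch below. write_es_config is a hypothetical helper name, the host address is passed as a parameter, and the function overwrites any existing elasticsearch.yml in the given directory, so keep a backup if you have already edited it.

```shell
# write_es_config is a hypothetical helper: write the settings discussed in
# this section into elasticsearch.yml under the given config directory.
write_es_config() {
  conf_dir="$1"
  host="$2"
  mkdir -p "$conf_dir"
  # \${HOSTNAME} is escaped so that Elasticsearch, not the shell, expands it.
  cat > "$conf_dir/elasticsearch.yml" <<EOF
cluster.name: elasticsearch-logging
node.name: \${HOSTNAME}
path.data: /data/elasticsearch
path.logs: /data/log/elasticsearch
network.host: $host
discovery.seed_hosts: ["$host"]
EOF
}
```

On a brand-new single node you may also need cluster.initial_master_nodes or discovery.type: single-node before a master is elected; this is my understanding of the 7.x behavior and is not covered in this blog.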

Now that the configuration is done, we can start Elasticsearch.

Start Elasticsearch

As said before, it is a best practice to change the location of the config, data, and logs directories.

Define ES_PATH_CONF and then Elasticsearch can be started from the command line as follows:

ES_PATH_CONF=/path/to/config $ES_HOME/bin/elasticsearch

Once Elasticsearch has started, you can check it:

curl -X GET "XXX.XXX.X.XX:9200/?pretty"

With X-Pack security enabled, you will need to pass a username and password to curl:

curl -u username:password -X GET "XXX.XXX.X.XX:9200/?pretty"
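
Elasticsearch can take a little while to start, so the first curl may fail simply because the node is not listening yet. A retry loop such as the sketch below can help; wait_for_es is a hypothetical helper name, and the default number of tries and delay are arbitrary.

```shell
# wait_for_es is a hypothetical helper: poll a URL until it answers
# successfully or the number of tries is exhausted.
wait_for_es() {
  url="$1"
  tries="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl return a non-zero status on HTTP errors such as 401.
    if curl -s -f --max-time 5 -o /dev/null "$url"; then
      echo "Elasticsearch answered at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "No answer from $url after $tries tries" >&2
  return 1
}
```

For example, wait_for_es "http://XXX.XXX.X.XX:9200/" with your own address. With security enabled, the request returns an authentication error, so the curl call inside the loop would need the -u option as well.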

The response should look like:

{
  "name" : "Master1",
  "cluster_name" : "elasticsearch-logging",
  "cluster_uuid" : "7rniLCvFRIGrsDqzJsoo6A",
  "version" : {
    "number" : "7.16.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "65f6e357953a5bc21073d89aa29",
    "build_date" : "2021-28-12T12:55:29.143308416Z",
    "build_snapshot" : false,
    "lucene_version" : "8.10.1",
    "minimum_wire_compatibility_version" : "1.2.3",
    "minimum_index_compatibility_version" : "1.2.3"
  },
  "tagline" : "You Know, for Search"
}
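
If you only want the version number out of this response, for example in a script, a small filter is enough. This is a sketch using sed rather than a real JSON parser, and es_version is a hypothetical helper name; it relies on the pretty-printed layout shown above.

```shell
# es_version is a hypothetical helper: read the pretty-printed response on
# standard input and print the value of the "number" field.
es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Illustration with a shortened copy of the response:
printf '%s\n' '{' '  "version" : {' '    "number" : "7.16.2",' '  }' '}' | es_version
```

Against a running node it could be used as curl -s "XXX.XXX.X.XX:9200/?pretty" | es_version.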

Elasticsearch has been downloaded, configured, and started successfully. I hope that this blog helps you get started with Elasticsearch. Don’t hesitate to ask questions; I will try to reply as soon as possible 🙂