Srikanth Jeeva's technical Blog: May 2016

I followed these steps to set up Elasticsearch in production.

# OS requirements: CentOS 6+ and Java 1.8+

Step 1:
-------
-------

Installing Java
---------------

Download the JDK from: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
tar xzvf jdk.tar.gz
sudo mkdir -p /usr/local/java
sudo mv jdk1.8.0_45 /usr/local/java/
sudo ln -s /usr/local/java/jdk1.8.0_45 /usr/local/java/jdk
export PATH="$PATH:/usr/local/java/jdk/bin"
# Point JAVA_HOME at the symlink so that it always matches the installed JDK version
export JAVA_HOME=/usr/local/java/jdk/jre
sudo sh -c "echo export JAVA_HOME=/usr/local/java/jdk/jre >> /etc/environment"
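
To confirm that the JDK is wired up as intended, a quick check like the one below can help. This is only a sketch: `check_java_home` is a hypothetical helper name, and it does nothing more than test that the given directory contains an executable `bin/java`.

```shell
# check_java_home DIR: succeed if DIR contains an executable bin/java.
# (Hypothetical helper, not part of the JDK or Elasticsearch.)
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    # Show which Java version JAVA_HOME actually points to
    "$JAVA_HOME/bin/java" -version
else
    echo "JAVA_HOME is unset or has no bin/java: ${JAVA_HOME:-<unset>}"
fi
```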


Step 2:
-------
-------

Installing Elasticsearch
------------------------

wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/rpm/elasticsearch/2.3.3/elasticsearch-2.3.3.rpm
sudo rpm -ivh elasticsearch-2.3.3.rpm


Step 3:
-------
-------

Configure Elasticsearch
-----------------------

sudo vi /etc/elasticsearch/elasticsearch.yml
# Cluster name: a common name shared by all Elasticsearch servers that should join the same cluster
cluster.name: cluster_name
node.name: hostname-31 # server name
node.master: false # set to true only for master nodes
node.data: true # set to true for all nodes except dedicated master nodes

# This is the path where all data is stored. Give one or multiple paths.
# The directories must be owned by the user elasticsearch:
# chown elasticsearch:elasticsearch /home/es/
path.data: ["/home/es/data/es"] 

# This is the path where all log files are stored.
# The directory must be owned by the user elasticsearch:
# chown elasticsearch:elasticsearch /var/logs/es
path.logs: /var/logs/es 

path.work: /home/es/work


bootstrap.mlockall: true # required in production: locks the process memory so that it is not swapped out

network.host: "hostname.com"
#http.port: 9200 # 9200 is the default port; uncomment and change it if needed

# List all master nodes
discovery.zen.ping.unicast.hosts: ["hostname.com"]

# calculate by the formula (number of master-eligible nodes / 2) + 1
discovery.zen.minimum_master_nodes: 3

# required for production
node.max_local_storage_nodes: 1

# -------------------------------- Threads ------------------
threadpool.index.type: fixed
threadpool.index.size: 40
threadpool.index.queue_size: 10000

threadpool.search.type: fixed
threadpool.search.size: 40
threadpool.search.queue_size: 10000

threadpool.bulk.type: fixed
threadpool.bulk.size: 40
threadpool.bulk.queue_size: 30000


#------------------ allocate memory for write --------------
indices.memory.index_buffer_size: 30%

#------------------ shard --------------
cluster.routing.allocation.same_shard.host: true
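
The `discovery.zen.minimum_master_nodes` value in the configuration above comes from a quorum calculation over the nodes that can become master. As a small sketch (the function name `quorum` is made up for illustration), shell integer arithmetic gives it directly:

```shell
# quorum N: print N / 2 + 1 using integer division.
# N is the number of nodes that are allowed to become master.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

quorum 5   # prints 3
quorum 3   # prints 2
```

With 4 or 5 such nodes the result is 3, which matches the value used in the configuration.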

Step 4:
-------
-------

Configure Elasticsearch System Level Setting
--------------------------------------------
# (Important setting)

sudo vi /etc/sysconfig/elasticsearch

# Set the heap size to half of the RAM, but keep it just under 32 GB
# (above that the JVM can no longer use compressed object pointers)
ES_HEAP_SIZE=31g
MAX_OPEN_FILES=65535
MAX_LOCKED_MEMORY=unlimited
MAX_MAP_COUNT=262144
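
The heap rule can be turned into a small calculation. The sketch below assumes a Linux host (it reads `/proc/meminfo`) and uses a cap of 31 GB, which is an assumption chosen here as a safe margin below 32 GB. `heap_size_gb` is a hypothetical helper name.

```shell
# heap_size_gb TOTAL_RAM_GB [CAP_GB]: print half of the RAM, capped at CAP_GB.
# The default cap of 31 is an assumption, chosen to stay below 32 GB.
heap_size_gb() {
    half=$(( $1 / 2 ))
    cap=${2:-31}
    if [ "$half" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$half"
    fi
}

# Total RAM in whole GB, read from /proc/meminfo (the value there is in kB).
if [ -r /proc/meminfo ]; then
    total_gb=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo)
    echo "ES_HEAP_SIZE=$(heap_size_gb "$total_gb")g"
fi
```

The RAM size is rounded down to whole gigabytes, so the suggestion is slightly conservative on machines with an odd amount of memory.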

Step 5:
-------
-------

Change number of open files
---------------------------
Change the ulimit setting (see http://stackoverflow.com/a/36142698/453486):

sudo vi /etc/security/limits.conf

#add line: 
elasticsearch - nofile 65535
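
When this step is scripted instead of done by hand in `vi`, it helps to add the line only if it is not already there. A minimal sketch follows; `ensure_line` is a hypothetical helper, demonstrated on a scratch file rather than on the real `/etc/security/limits.conf` (which needs root).

```shell
# ensure_line LINE FILE: append LINE to FILE unless an identical line exists.
ensure_line() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstration on a scratch file
scratch=$(mktemp)
ensure_line 'elasticsearch - nofile 65535' "$scratch"
ensure_line 'elasticsearch - nofile 65535' "$scratch"   # second call is a no-op
cat "$scratch"
rm -f "$scratch"
```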

Step 6:
-------
-------

Disable swap
------------


sudo vi /etc/sysctl.conf
#add lines: 
vm.swappiness=0
vm.max_map_count=262144
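
Running `sudo sysctl -p` reloads `/etc/sysctl.conf`. To see whether the running kernel has picked up the values, the configured and live values can be compared. The sketch below assumes a Linux host with `/proc/sys` available; `conf_value` is a hypothetical helper name.

```shell
# conf_value KEY FILE: print the value assigned to KEY in a sysctl.conf-style file.
# If KEY appears more than once, the last assignment is printed.
conf_value() {
    key=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for the regex
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Compare the configured value with the live kernel value.
for name in vm.swappiness vm.max_map_count; do
    live_file="/proc/sys/$(printf '%s' "$name" | tr . /)"
    if [ -r /etc/sysctl.conf ] && [ -r "$live_file" ]; then
        echo "$name: configured=$(conf_value "$name" /etc/sysctl.conf) live=$(cat "$live_file")"
    fi
done
```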

Step 7:
-------
-------

Installing plugins for monitoring cluster
-----------------------------------------

sudo /usr/share/elasticsearch/bin/plugin install mobz/elasticsearch-head
sudo /usr/share/elasticsearch/bin/plugin install lmenezes/elasticsearch-kopf

# Reboot the system so that the system-level settings take effect: /sbin/reboot

Step 8:
-------
-------

To Start/Stop/Restart Elasticsearch
------------------------------------

sudo /etc/init.d/elasticsearch start
sudo /etc/init.d/elasticsearch stop
sudo /etc/init.d/elasticsearch restart
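
After a start or restart, the cluster health endpoint shows whether the node has joined the cluster. The sketch below extracts the `status` field from a `_cluster/health` response. `cluster_status` is a hypothetical helper, the sample JSON is a trimmed, made-up response, and the host name in the commented `curl` line is assumed to be the one configured in Step 3.

```shell
# cluster_status: read a _cluster/health JSON response on stdin and print
# the value of its "status" field (green, yellow or red).
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Against a running node (assumed host and default port):
# curl -s 'http://hostname.com:9200/_cluster/health' | cluster_status

# Offline demonstration with a made-up response:
echo '{"cluster_name":"cluster_name","status":"green","number_of_nodes":3}' | cluster_status
```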

By default, Elasticsearch indexes every field and every word within a value.

For Example:

Document 1 has : "text": "Hello World"

Document 2 has : "text": "Hello Srikanth"

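By default, Elasticsearch analyzes the text into individual terms and adds them to its inverted index. For these two documents the three terms would be ["hello", "world", "srikanth"] (the default analyzer also lowercases them).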
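
The effect can be imitated locally with standard shell tools. This is only a rough approximation of what the default analyzer does (split on non-alphanumeric characters, lowercase, deduplicate), not the real analyzer, and `rough_terms` is a made-up name.

```shell
# rough_terms: split stdin on non-alphanumeric characters, lowercase the
# pieces and print each distinct term once.
rough_terms() {
    tr -cs '[:alnum:]' '\n' | tr '[:upper:]' '[:lower:]' | sed '/^$/d' | sort -u
}

printf '%s\n' 'Hello World' 'Hello Srikanth' | rough_terms
# prints hello, srikanth and world, one per line
```

On a running node, the `_analyze` API reports the terms that Elasticsearch itself produces for a given text.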

In some cases we want to disable full-text search on a field, so that we can aggregate by its exact value.

For Example:

Document 1 has : "filepath": "/home/srikanth/1.c"
Document 2 has : "filepath": "/home/srikanth/2.c"

By default, Elasticsearch splits these values into terms such as "home" and "srikanth". When aggregating by path, the buckets are then built from those terms rather than from the full paths, which distorts the aggregated document counts.

So we have to tell Elasticsearch not to split the data in this field into separate terms.

This tells Elasticsearch that we will always search by the full string and not by sub-strings.
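
A minimal sketch of such a mapping, assuming Elasticsearch 2.x (the version installed in the cluster setup post), where a string field is kept whole with `"index": "not_analyzed"`. The index name `myindex`, the type name `mytype` and the host are placeholders, and `filepath_mapping` is a hypothetical helper that only prints the request body.

```shell
# filepath_mapping: print a create-index body that maps "filepath" as a
# string that is indexed as a single, unanalyzed term.
# "mytype" is a placeholder type name.
filepath_mapping() {
    cat <<'EOF'
{
  "mappings": {
    "mytype": {
      "properties": {
        "filepath": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}
EOF
}

filepath_mapping
# To create the index with this mapping (placeholder host and index name):
# filepath_mapping | curl -XPUT 'http://localhost:9200/myindex' -d @-
```

The mapping has to be in place before documents are indexed; an existing analyzed field cannot be switched in place and needs a reindex.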

Tuesday, May 31, 2016

How to set up an Elasticsearch cluster on CentOS

Tuesday, May 24, 2016

How to disable Full text search in ElasticSearch