Tuesday, May 31, 2016

How to set up an Elasticsearch cluster on CentOS

I followed these steps to set up Elasticsearch in production.

# OS requirements: CentOS 6+ and Java 1.8+

Step 1:
-------
Installing Java
---------------
Download the JDK from: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

# jdk1.8.0_45 is the directory extracted from the tarball; adjust it to the version you downloaded.
tar xzvf jdk.tar.gz
sudo mkdir /usr/local/java
sudo mv jdk1.8.0_45 /usr/local/java/
sudo ln -s /usr/local/java/jdk1.8.0_45 /usr/local/java/jdk
export PATH="$PATH:/usr/local/java/jdk/bin"
# Point JAVA_HOME at the symlink so it always matches the installed version.
export JAVA_HOME=/usr/local/java/jdk/jre
sudo sh -c "echo export JAVA_HOME=/usr/local/java/jdk/jre >> /etc/environment"

Step 2:
-------
Installing Elasticsearch
------------------------
wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/rpm/elasticsearch/2.3.3/elasticsearch-2.3.3.rpm
sudo rpm -ivh elasticsearch-2.3.3.rpm

Step 3:
-------
Configure Elasticsearch
-----------------------
sudo vi /etc/elasticsearch/elasticsearch.yml

# Cluster name: the common name given to all Elasticsearch servers that should join the same cluster
cluster.name: cluster_name
#
node.name: hostname-31    # [server name]
node.master: false        # [set true only for master nodes]
node.data: true           # [set true for all nodes, except dedicated master nodes]

# This is the path where all data is stored. Give one or multiple paths.
# Directories must be owned by the user elasticsearch:
# chown elasticsearch:elasticsearch /home/es/
path.data: ["/home/es/data/es"]

# This is the path where all log files are stored.
# Directories must be owned by the user elasticsearch:
# chown elasticsearch:elasticsearch /var/logs/es
path.logs: /var/logs/es
path.work: /home/es/work

bootstrap.mlockall: true    # [this setting is required for production]
network.host: "hostname.com"
#http.port: 9200            # [9200 is the default port; modify it if needed]

# Enter all master nodes
discovery.zen.ping.unicast.hosts: ["hostname.com"]
# Calculate by the formula (number of master-eligible nodes / 2) + 1
discovery.zen.minimum_master_nodes: 3

# required for production
node.max_local_storage_nodes: 1

# -------------------------------- Threads ------------------
threadpool.index.type: fixed
threadpool.index.size: 40
threadpool.index.queue_size: 10000
threadpool.search.type: fixed
threadpool.search.size: 40
threadpool.search.queue_size: 10000
threadpool.bulk.type: fixed
threadpool.bulk.size: 40
threadpool.bulk.queue_size: 30000

#------------------ allocate memory for write --------------
indices.memory.index_buffer_size: 30%

#------------------ shard --------------
cluster.routing.allocation.same_shard.host: true

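The formula given for discovery.zen.minimum_master_nodes can be checked with a little shell arithmetic. This is only a sketch: the function name min_master_nodes is made up for illustration, and it assumes the number passed in is the count of master-eligible nodes (those with node.master: true), not every node in the cluster.

```shell
# Hypothetical helper: quorum value for discovery.zen.minimum_master_nodes.
# $1 = number of master-eligible nodes in the cluster.
min_master_nodes() {
    # Shell integer division rounds down, so this is floor(n / 2) + 1.
    echo $(( $1 / 2 + 1 ))
}

min_master_nodes 3   # prints 2
min_master_nodes 5   # prints 3
```

The value 3 used in the configuration above would therefore correspond to a cluster with 4 or 5 master-eligible nodes.
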
Step 4:
-------
Configure Elasticsearch System Level Settings
---------------------------------------------
# (Important setting)
sudo vi /etc/sysconfig/elasticsearch

# Set the heap size to half of the RAM, but keep it below 32 GB
ES_HEAP_SIZE=31g
MAX_OPEN_FILES=65535
MAX_LOCKED_MEMORY=unlimited
MAX_MAP_COUNT=262144

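The heap size rule (half of the RAM, up to a ceiling) can be written down as a small calculation. The sketch below is not part of the original setup: suggest_heap_gb is a hypothetical name, and the default ceiling of 31 GB is an assumption based on the common advice to stay just under 32 GB so the JVM can keep using compressed object pointers. Pass a different ceiling as the second argument if you prefer.

```shell
# Hypothetical helper: suggest a value for ES_HEAP_SIZE, in GB.
# $1 = total RAM in GB
# $2 = optional ceiling in GB (assumed default: 31)
suggest_heap_gb() {
    ram_gb=$1
    cap_gb=${2:-31}
    half=$(( ram_gb / 2 ))
    if [ "$half" -gt "$cap_gb" ]; then
        echo "$cap_gb"
    else
        echo "$half"
    fi
}

suggest_heap_gb 16    # prints 8
suggest_heap_gb 128   # prints 31

# On Linux, the total RAM in GB can be read with, for example:
#   awk '/MemTotal/ {print int($2/1024/1024)}' /proc/meminfo
```
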
Step 5:
-------
Change number of open files
---------------------------
Change the ulimit setting: [http://stackoverflow.com/a/36142698/453486]

sudo vi /etc/security/limits.conf

# add line:
elasticsearch - nofile 65535

Step 6:
-------
Disable swap
------------
sudo vi /etc/sysctl.conf

# add lines:
vm.swappiness=0
vm.max_map_count=262144

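If you would rather script these file edits than make them by hand in vi, one possible approach is a helper that appends a line only when it is missing, so it can be run repeatedly without creating duplicates. This is a sketch, not what the steps above do: ensure_line is a hypothetical name, and the demonstration writes to a scratch file rather than to anything under /etc.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
# $1 = exact line to add, $2 = path of the file
ensure_line() {
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstration on a scratch file, so no system file is touched.
conf=$(mktemp)
ensure_line 'vm.swappiness=0' "$conf"
ensure_line 'vm.max_map_count=262144' "$conf"
ensure_line 'vm.swappiness=0' "$conf"   # repeated call adds nothing
cat "$conf"
rm -f "$conf"
```

For the real files you would run the same calls from a root shell against /etc/sysctl.conf or /etc/security/limits.conf. Running sudo sysctl -p afterwards loads the values from /etc/sysctl.conf without waiting for a reboot.
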
Step 7:
-------
Installing plugins for monitoring the cluster
---------------------------------------------
sudo /usr/share/elasticsearch/bin/plugin install mobz/elasticsearch-head
sudo /usr/share/elasticsearch/bin/plugin install lmenezes/elasticsearch-kopf

# Reboot the system for the system level settings to take effect.
/sbin/reboot

Step 8:
-------
To Start/Stop/Restart Elasticsearch
-----------------------------------
sudo /etc/init.d/elasticsearch start
sudo /etc/init.d/elasticsearch stop
sudo /etc/init.d/elasticsearch restart
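
After starting the service, it can be useful to confirm that the node is up and has joined the cluster. A minimal sketch using the cluster health endpoint is shown here. The function names are hypothetical, the host and port defaults are assumptions, and the second function needs curl and a running node.

```shell
# Hypothetical helper: build the cluster health URL.
# $1 = host (assumed default: localhost), $2 = HTTP port (assumed default: 9200)
es_health_url() {
    printf 'http://%s:%s/_cluster/health?pretty\n' "${1:-localhost}" "${2:-9200}"
}

# Hypothetical helper: query the cluster health of a running node.
es_cluster_health() {
    curl -s "$(es_health_url "$@")"
}

es_health_url   # prints http://localhost:9200/_cluster/health?pretty
```

Because network.host is set to a specific host name in the configuration, the node may not answer on localhost. In that case pass the host explicitly, for example es_cluster_health hostname.com.
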

Tuesday, May 24, 2016

How to disable full-text search in Elasticsearch

By default, Elasticsearch indexes every field and every word within a string value.

For Example:

Document 1 has : "text": "Hello World"

Document 2 has : "text": "Hello Srikanth"

By default, Elasticsearch splits these values into separate terms when indexing, so the index for this field would contain three terms: ["hello", "world", "srikanth"] (the default analyzer also lowercases them).

In some cases we want to disable full-text search on a field, so that we can aggregate by its whole value.

For Example:

Document 1 has : "filepath": "/home/srikanth/1.c"
Document 2 has : "filepath": "/home/srikanth/2.c"

By default, Elasticsearch will index these documents under terms such as ["home", "srikanth", ".c"], so when aggregating by path, these partial values distort the aggregated document counts.
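
To get a feel for how a value is broken up, here is a crude shell imitation of that splitting. It is only a rough stand-in and not the real Elasticsearch analyzer: rough_terms is a hypothetical helper that lowercases the input and splits it on every non-alphanumeric character, so the exact terms may differ from what Elasticsearch produces.

```shell
# Hypothetical helper: a crude stand-in for the default analysis step.
# It lowercases the input and splits it on every non-alphanumeric character.
rough_terms() {
    printf '%s\n' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | tr -cs '[:alnum:]' '\n' \
        | sed '/^$/d'
}

rough_terms 'Hello World'
rough_terms '/home/srikanth/1.c'
```

The first call prints hello and world on separate lines. The second prints home, srikanth, 1 and c, which shows why an aggregation on such a field counts path fragments instead of whole paths.
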

So we have to tell Elasticsearch not to analyze this field, by setting "index": "not_analyzed" for it in the index mapping.

This tells Elasticsearch that we will always search by the full string and never by substrings.
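
As an illustration, a mapping along these lines could be used on Elasticsearch 2.x, the version installed in the post above, where string fields accept "index": "not_analyzed" (later major versions use a separate keyword type instead). This is a sketch under assumptions: the type name doc, the index name passed in, the base URL default and both function names are placeholders chosen here, not taken from the original.

```shell
# Hypothetical helper: print a mapping that keeps "filepath" as one single term.
# The type name "doc" is a placeholder chosen for this sketch.
filepath_mapping_json() {
    cat <<'EOF'
{
  "mappings": {
    "doc": {
      "properties": {
        "filepath": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}
EOF
}

# Hypothetical helper: create an index with that mapping (needs a running node).
# $1 = index name, $2 = base URL (assumed default: http://localhost:9200)
create_files_index() {
    filepath_mapping_json | curl -s -XPUT "${2:-http://localhost:9200}/$1" -d @-
}

filepath_mapping_json
```

With a node running, create_files_index files would create an index named files whose filepath values are stored and aggregated as whole strings.
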