Install and configure SolrCloud
This is a work in progress!
To start with SolrCloud one needs to download Solr version that supports SolrCloud. The latest releases of Solr can be downloaded from the official Solr resource. It is likely that one will need an older release of Solr rather than the latest. Here’s where you can get older releases of Solr.
This guide is written using Solr 5.2.1 and Windows OS. For Linux based systems reverse slashes and use Linux commands for file management.
Follow the usual drill to install Solr which is unpack the archive of Solr distributive into a local drive. For the sake of simplisity E:\app\solr will be used as a <SolrRoot> folder in this article.
Prepare Solr configuration for Sitecore indexes
Every Solr collection is tied to a certain Solr configuration that is uploaded into ZooKeeper that orchestrates file management for SolrCloud instances. Solr distributive comes with ZooKeeper instance baked into it. When running SolrCloud instance locally, there is no need to configure ZooKeeper itself.
Standalone configuration of ZooKeeper (Zk) is described later in this article.
Before collections for Sitecore indexes could be created, one needs to upload Solr configuration with Solr index schema into ZooKeeper. The default Sitecore index schema could be generated using the UI interface in Sitecore application. Refer to Configuring Solr for use with Sitecore 8 for more details on that.
Here are basic steps to prepare Solr configuration for Sitecore indexes.
- Duplicate
<SolrRoot>\server\solr\configsets\basic_configsfolder and give it a configuration name (e.g.sitecore_configs)Next two steps are optional and should be used if one needs to configure language specific stopwords so that the content parsed properly for those langauges. These steps use
sitecore_core_indexas an example sinceCoredatabase by default uses several different languages. By default Sitecore usestext_generalfield type for all language specific fields inschema.xmlfile which uses commonstopwords.txtfile located in theconffolder of the Solr configuration directory (e.g.sitecore_configs). - [Optional] Add the following files to
sitecore_configs\conf\langfolderstoptags_ja.txtstopwords_da.txtstopwords_de.txtstopwords_ja.txtuserdict_ja.txtstopwords_en.txtthis file should already be in there.These files can be copied from
<SolrRoot>\server\solr\configsets\data_driven_schema_configs\conf\langfolder.
All these files are used by Solr to parse language specific content properly. SitecoreCoredatabase by default usesDE,DA,JAandENlanguages which can be configured to use language specific stopwords.
For solutions that have other languages defined, corresponding files must be added to thelangfolder and language field types configured inschema.xmlfile.
- [Optional] Configure language specific dynamic fields and field types. Add field types for
DA,DEandJAlanguages to the `schema.xml’ file generated thru Sitecore app.- add field types for each language. Here is the example for
DAfield type definition: ```XML
Repeat this step for every language that needs to have its own type and refernece proper `stopwords` file. > Field type definitions could be copied from `<SolrRoot>\server\solr\configsets\data_driven_schema_configs\conf\managed-schema` file. - ensure language specific dynamic fields use proper field types. For example: ```XML <dynamicField name="*_t_en" type="text_general" indexed="true" stored="true" /> <dynamicField name="*_t_da" type="text_da" indexed="true" stored="true" /> <dynamicField name="*_t_de" type="text_de" indexed="true" stored="true" /> <dynamicField name="*_t_ja" type="text_ja" indexed="true" stored="true" /> - add field types for each language. Here is the example for
- Replace
schema.xmlfile insitecore_configs\conffolder with the one generated thru Sitecore or modified on steps 2 and 3.
Run SolrCloud instance locally
The easiest way to stand up SolrCloud instance is to run it locally. To do that open any command-line interface, navigate to Solr root folder and run the following command:
bin\solr -e cloud
This will launch an interactive command-line based process to get SolrCloud configured. Simply hitting the <Enter> thru the process steps will stand up 2 SolrCloud nodes on different ports running on the local machine. There will be a default collection called gettingstarted that is split into 2 shards with replication factor set to 2.
Local instance of SolrCloud by default gets placed into
<SolrRoot>\example\cloudfolder. Examine that forlder to see how SolrCloud distributes the collection(s) among its nodes.
Upload index configuration into ZooKeeper
Solr 5.2.1 provides ZooKeeper Command-line interface (a.k.a ZkCli) to work with ZooKeeper file system. The CLI is located at <SolrRoot>\server\scripts\cloud-scripts folder.
Run the following command to upload Solr configuration into ZooKeeper:
zkcli -zkhost localhost:9973 -cmd upconfig -confdir E:\app\solr-5.2.1\server\solr\configsets\sitecore_configs\conf -confname scbasic
Where
localhost:9973is the server and port Zk runs on which is local in this case.-cmd upconfigis command to upload configuration into Zk.Run
zkcli --helpto see all Zk commands-confdir <dirPath>should point to the folder that holdsconfdirectory with Solr configuration (e.g.<SolrRoot>\server\solr\configsets\sitecore_configs\conf).-confname <configurationName>specifies configuration that will be.
Once configuration is uploaded navigate tohttp://localhost:<port>/solr/#/~cloud?view=treeURL and expandconfigsnode to see available configurations.
Create collection
Each index collection must be linked to either existing configuration or should provide a path to configuration folder that will be uploaded into Zk and linked to the collection. Here are a few examples how one can create a collection:
- Run this command to create a collection based on existing configuration
bin\solr create -c scindex -n scbasic -shards 2 -replicationFactor 2 -p 8973createis command that instructs Solr to create a collection-cis collection name parameter-nis configuration name parameter. Thescbasicconfiguration was used in this example created in previous paragraph-shardsis the number of shards for the collection-replicationFactoris the number of replicas for each shard of the collection-pis the port that SolrCloud instance runs on
- Run this command to upload configuration into Zk and create a collection based on it
bin\solr create -c scitems2 -d E:\app\solr-5.2.1\server\solr\configsets\scbasic_configs -n scitems2 -shards 1 -replicationFactor 1 -p 8973Where
-dis path to the folder that holds configuration directory- rest of the parameters are the same as in step 1
Note, when uploading index configuration into Zk using ZkCli, the path in
-confdirparameter must point to the root of the folder that holdssolrconfig.xmlfile (i.e.sitecore_configs\conf).
When uploading index configuration usingbin\solr create -d <path>command, the path should either point to the root of directory that containssolrconfig.xmlfile (e.g.sitecore_configs\conf) or to its parent container that contansconfdirectory which holdssolrconfig.xmlfile (i.e.sitecore_configs).
Run disdributed SolrCloud instance
Distributed SolrCloud environment consists of ZooKeeper (Zk) ensemble and SolrCloud cluster. Zk is used to orchestrate file management among all SolrCloud nodes. SolrCloud nodes will hold all collections and manage indexing/searching operations. Both Zk and SolrCloud need to have redundancy to provide higher fault tolerance.
It’s not uncommon to share machines to run SolrCloud nodes and Zk instances side by side.
This section describes how to configure Zk ensemble and stand up a SolrCloud cluster.
Configuring ZooKeeper ensemble
Download ZooKeeper from Zk distributive resource.
ZooKeeper 3.4.6 was used for this guide.
Unpack Zk distributive into a folder. For the sake of simplisity E:\app\zk-3.4.6 will be used as a <ZkRoot> folder in this article.
Follow these steps to stand up Zk ensemble:
- Create
zkDatafolder that Zk will use to run its operations (e.g.e:\app\zkData).If Zk instances share machines with SolrCloud nodes, it’s highly recommended to place Zk and SolrCloud data folders on separate disks for the best performance.
- Rename
zoo_sample.cfgtozoo.cfgthen modify these configuration settings in the file:dataDir=e:/app/zkData- Add
serverentries that list all servers that run Zk instances. For example:server1=<server1.IP>:2888:3888server2=<server2.IP>:2888:3888server3=<server3.IP>:2888:3888
Rest of settings feel free to keep at default values or tweak as needed.
-
Create
myidfile (no extension) in thezkDatafolder (i.e.e:\app\zkData) and set server id number in that file. For example, onserver1the file should have1. Onserver2it should have2and so on. - [Optional] Use NSSM tool to configure Zk as a Windows service. Here are the steps to create ZooKeeper as a Win service:
- Download NSSM tool
- Run
nssm installin WinCMD - Configure service to use
<ZkRoot>\bin\zkServer.cmd
- Run the following command to start Zk instance manually
<ZkRoot>\bin\zkServer.cmdOn Linux based systems it will be
bin\zkServer.sh - Repeat steps 1-5 on all machines that run Zk instances.
Configuring SolrCloud with ZooKeeper ensemble
Once Zk ensemble is configured and running, one can add SolrCloud nodes to it. To start SolrCloud instance and link it to existing Zk ensemble, execute the following command:
bin\solr -c -f -z "<zkHost1>:<zkPort>,<zkHost2>:<zkPort>,<zkHost3>:<zkPort>" -m 1g -d <solrNodeHome> -p 8983
Where
-cstarts Solr in SolrCloud mode-fruns SolrCloud process in foreground. By default it starts in background-z "<zkHost1>:<zkPort>,<zkHost2>:<zkPort>,<zkHost3>:<zkPort>"specifies ZooKeeper connection string<zkHost>is the IP address of each Zk instance<zkPort>is the port number for each Zk instance
-mmemory cap for the SolrCloud instance. For example:-m 500mallocates 500MB-m 1gallocates 1GB
-pport number for SolrCloud instance
Run the command on every server that hosts SolrCloud node to join the cluster.
Once everything up and running one can start uploading configuration into Zk and creating Solr collections.