A key question in the minds of most Elasticsearch users when they create an index is “How many shards should I use?”
In this article, we explain the design tradeoffs and performance consequences of choosing different values for the number of shards. Read on to demystify and optimize your sharding strategy.
This is an important topic, and many users are apprehensive as they approach it — and for good reason. A major mistake in shard allocation could cause scaling problems in a production environment that maintains an ever-growing dataset.
On the other hand, we know that there is little Elasticsearch documentation on this topic. Most users just want answers — and they want specific answers, not vague number ranges and warnings for arbitrarily large numbers.
Well, we have some answers. After covering a few definitions and some clarifications, we present several common use cases and provide our recommendations for each.
If you’re fairly new to Elasticsearch, it’s important to understand the basic jargon and grasp the elemental concepts. If you already have some expertise with ES, you might want to skip to the next section. If you’re an ES beginner, consider this simple diagram of an Elasticsearch cluster:
Keep these definitions in mind while referring to the diagram:
- cluster – An Elasticsearch cluster consists of one or more nodes and is identifiable by its cluster name.
- node – A single Elasticsearch instance. In most environments, each node runs on a separate box or virtual machine.
- index – In Elasticsearch, an index is a collection of documents.
- shard – Because Elasticsearch is a distributed search engine, an index is usually split into elements known as shards that are distributed across multiple nodes. Elasticsearch automatically manages the arrangement of these shards. It also rebalances the shards as necessary, so users need not worry about the details.
- replica – In recent versions (ES 7.x), Elasticsearch creates 1 primary shard and 1 replica for each index by default. In earlier versions, the default was 5 primary shards and 1 replica per index.
Allocating multiple shards and replicas is the essence of the design for distributed search capability, providing for high availability and quick access in searches against the documents within an index. The main difference between a primary and a replica shard is that only the primary shard can accept indexing requests. Both replica and primary shards can serve querying requests.
In the diagram above, we have an Elasticsearch cluster consisting of two nodes using the pre-7.x default shard configuration. Elasticsearch automatically distributes the five primary shards across the two nodes. One replica shard corresponds to each primary shard, but the arrangement of these replicas is altogether different from that of the primary shards.
Again, think distribution.
Allow us to clarify: the number_of_shards value pertains to individual indexes, not to the cluster as a whole. It specifies the number of primary shards for each index, not the total number of primary shards in the cluster.
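To make that concrete, here is a minimal sketch of the settings body sent when creating an index. The setting names `index.number_of_shards` and `index.number_of_replicas` are the real Elasticsearch settings; the helper function and the example values are our own illustration.

```python
def index_settings(number_of_shards: int, number_of_replicas: int) -> dict:
    """Build the settings body for an index-creation request (PUT /<index>)."""
    return {
        "settings": {
            "index": {
                # Primary shard count: fixed once the index is created.
                "number_of_shards": number_of_shards,
                # Replica count: a dynamic setting, adjustable later.
                "number_of_replicas": number_of_replicas,
            }
        }
    }

body = index_settings(3, 1)  # 3 primaries, 1 replica for this index only
```

Note that the request applies to one index; creating ten indexes with this body yields thirty primary shards in total.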
A Word about Replicas
Replicas are primarily for search performance, and you can add or remove them at any time. They give you additional capacity, higher throughput, and stronger failover. We always recommend that a production cluster have at least 2 replicas for failover. As we said, this setting is flexible: you can modify the number of replica shards later, for example after running benchmark tests.
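Because `index.number_of_replicas` is a dynamic setting, changing it is a one-line settings update rather than a reindex. A minimal sketch of the request body for the update-settings endpoint (PUT /<index>/_settings), with the helper function as our own illustration:

```python
def replica_update(number_of_replicas: int) -> dict:
    """Build the body for PUT /<index>/_settings to change the replica count.

    Unlike number_of_shards, number_of_replicas can be changed on a live index.
    """
    return {"index": {"number_of_replicas": number_of_replicas}}

body = replica_update(2)  # raise an index to 2 replicas after benchmarking
```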
Allocate Shards Carefully
After you configure an Elasticsearch cluster, it’s critically important to realize that the number of primary shards cannot be changed on a live index. If you later find it necessary to change the number of shards, you have two options:
- Reindex all the source documents. Although reindexing is a long process, it can be done without downtime.
- Use the Shrink Index API. This API allows you to reduce the number of primary shards in an existing index.
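A key constraint of the Shrink Index API is that the target primary count must be a factor of the source primary count (for example, 8 shards can shrink to 4, 2, or 1, but not to 3). A small sketch of that rule and of the shrink request body (the helper functions are our own illustration, not part of the Elasticsearch client):

```python
def can_shrink(source_shards: int, target_shards: int) -> bool:
    """The target primary count must evenly divide the source primary count."""
    return 0 < target_shards < source_shards and source_shards % target_shards == 0

def shrink_body(target_shards: int, replicas: int = 1) -> dict:
    """Body for POST /<source-index>/_shrink/<target-index>."""
    return {
        "settings": {
            "index.number_of_shards": target_shards,
            "index.number_of_replicas": replicas,
        }
    }
```

Before shrinking, the source index must also be made read-only and have a copy of every shard on a single node, which is why planning the count up front is so much cheaper.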
Although these options are always available, selecting the right shard count from the start will save you a lot of time.
The primary shard configuration is quite analogous to a hard disk partition in which a repartition of raw disk space requires a user to back up, configure a new partition, and rewrite data onto the new partition.
Small Static Dataset, 2-3 GB
The key consideration as you allocate shards is your expectation for the growth of your dataset.
We quite often see a tendency to unnecessarily overallocate on shard count. (By overallocation, we simply mean specifying more shards per index than the current size and document count of a particular dataset requires.) Since shard count is such a hot topic within the ES community, users may assume that overallocation is a safe bet.
Elastic was promoting this idea in the early days, but then many users began taking it too far—such as allocating 1,000 shards. Elastic now provides a bit more cautious rationale:
“A little overallocation is good. A kagillion shards is bad. It is difficult to define what constitutes too many shards, as it depends on their size and how they are being used. A hundred shards that are seldom used may be fine, while two shards experiencing very heavy usage could be too many.”
Shard overallocation should be especially avoided when we are dealing with small static datasets that are not expected to grow and for which new indexes are created regularly according to Index Lifecycle Management (ILM).
So, why can shard overallocation become a burden to your Elasticsearch cluster?
Remember that there is an additional cost for each shard that you allocate:
- Since a shard is essentially a Lucene index, it consumes file handles, memory, and CPU resources. Although many small shards can speed up processing per shard, they may also form query queues that compromise the cluster performance and decrease query throughput.
- Each search request will touch a copy of every shard in the index, which isn’t a problem when the shards are spread across several nodes. However, contention arises and performance decreases when the shards are competing for the same hardware resources.
- Elasticsearch uses term frequency statistics to calculate relevance, but these statistics correspond to individual shards. Maintaining only a small amount of data across many shards will tend to result in poor document relevance.
Now that you understand the dangers of shard overallocation, let’s discuss real-world best practices. In fact, there are several considerations to keep in mind when you select the shard count for your indexes.
Even though there is no fixed limit on shards imposed by Elasticsearch, the shard count should be proportional to the amount of JVM heap available. We know that the maximum JVM heap size recommendation for Elasticsearch is approximately 30-32GB.
No matter what actual JVM heap size you have, the upper bound on the shard count should be 20 shards per 1 GB of heap configured on the server. So, for example, a node with a heap size of 32 GB (close to the maximum) should hold at most 640 shards. This is an upper bound per node and should not be taken as a recommended value.
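The rule of thumb above is simple arithmetic; a quick sketch (the 20-shards-per-GB factor comes from the guideline above, and the function name is our own):

```python
def max_shards_per_node(heap_gb: float, shards_per_gb: int = 20) -> int:
    """Upper bound (not a recommendation) on shards for one node.

    Based on the guideline of at most 20 shards per GB of JVM heap.
    """
    return int(heap_gb * shards_per_gb)

limit = max_shards_per_node(32)  # 640 shards for a 32 GB heap
```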
By all means, keep the number of shards per node as reasonable as possible, especially in the case of small static indexes. In general, try to keep the shard size between 1 and 5 GB for such indexes. Further, don’t allocate for an inappropriately high goal, such as the 10 terabytes you might attain three years from now. It’s likely that you’ll see some performance strain sooner than you’d like.
Another very important practice that can help you determine the optimal shard size is benchmarking with realistic queries and data. Always benchmark with queries and index loads similar to what you expect in production.
Although we aren’t explaining replicas in detail here, we do recommend that you plan for a modest number of shards and consider increasing the number of replicas. Production clusters should always have at least 2 replicas for failover.
Large and Growing Dataset
Our customers expect their businesses to grow and their datasets to expand accordingly. There is therefore always a need for contingency planning. Many users convince themselves that they’ll encounter explosive growth (although most never actually see an unmanageable spike). In addition, we all want to minimize downtime and avoid resharding.
We strongly encourage you to rely on overallocation for large datasets but only modestly. Also, remember that having very large shards can compromise the performance of your cluster. According to the Elasticsearch blog article:
There is no fixed limit on how large shards can be, but a shard size of 50GB is often quoted as a limit that has been seen to work for a variety of use-cases.
In practice, even 50 GB per shard can be too big. Usually, you should keep the shard size under the recommended heap size limit of 32 GB per node. For example, if you think your index could realistically reach 200 GB (but not much further without other infrastructure changes), then we recommend an allocation of 7 shards, or 8 shards at most.
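That sizing step reduces to dividing the expected index size by the per-shard ceiling and rounding up. A minimal sketch, assuming a 32 GB ceiling as discussed above (the function is our own illustration):

```python
import math

def shards_for_size(expected_index_gb: float, max_shard_gb: float = 32.0) -> int:
    """Smallest primary shard count that keeps each shard under max_shard_gb."""
    return math.ceil(expected_index_gb / max_shard_gb)

shards_for_size(200)  # 7 shards: 200 GB / 32 GB per shard, rounded up
```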
We do, however, suggest that you continue to picture the ideal scenario as one shard per index, per node. A good starting point for capacity planning is to allocate shards at 1.5 to 3 times the number of nodes in your initial configuration. For example, if you’re starting with 3 nodes, then we recommend that you specify at most 3 x 3 = 9 shards. The exact factor may vary with the use case (e.g., static vs. dynamic indexes).
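The 1.5x-to-3x rule of thumb above can be sketched as a small helper (the function and its defaults are our own illustration of the guideline, not an Elasticsearch API):

```python
import math

def initial_shard_range(nodes: int, low: float = 1.5, high: float = 3.0) -> tuple:
    """Suggested (minimum, maximum) primary shard count for a new cluster,
    at 1.5x to 3x the initial node count."""
    return math.ceil(nodes * low), int(nodes * high)

initial_shard_range(3)  # (5, 9): between 5 and 9 shards for a 3-node cluster
```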
Remember that overly large shards can negatively affect the ability of the cluster to recover from failure. This is because it takes more time to rebalance shards to a new node after the failure.
Your shard size may be getting too large if you’re discovering issues through the cluster stats APIs or encountering minor performance degradations. If this is the case, simply add a node, and ES will rebalance the shards accordingly.
Once again, please note that we’re omitting the specification of replicas from our discussion here. The same ideal shard guideline of one shard per index per node also holds true for replica shards.
Do you accumulate daily indices and yet incur only small search loads? Perhaps these indices number in the hundreds but each index is 1GB or smaller. For these and similar problem spaces, our simple recommendation is that you choose one shard.
If you roll with the defaults for Logstash (daily indices) and ES 6.x (5 shards), you could generate up to 890 shards in 6 months. Further, your cluster will be hurting — unless you have 15 nodes or more.
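The arithmetic behind that 890 figure is one daily index times 5 primary shards per index; roughly 178 days corresponds to the article’s 6 months. A quick sketch (the function name is our own):

```python
def primary_shards_accumulated(days: int, shards_per_index: int = 5) -> int:
    """Primary shards accumulated with one Logstash index per day and the
    ES 6.x default of 5 primary shards per index."""
    return days * shards_per_index

primary_shards_accumulated(178)  # 890 primary shards, before counting replicas
```

With the default of 1 replica, the total shard count doubles again.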
Think about it: most Logstash users are infrequent searchers, performing fewer than one query per minute. Accordingly, we recommend a simple, economical setup. Since search performance isn’t a primary requirement for such cases, we don’t need multiple replicas; a single replica is enough for basic redundancy. The data-to-memory ratio can also be quite high.
If you go with a single shard per index, then you could probably run a Logstash configuration for 6 months on a three-node cluster. Ideally, you’d use at least 4GB of RAM per node, but we’d recommend 8GB, because 8GB is where network speed starts to get significantly better on most cloud platforms and much less resource sharing occurs.
We reiterate that shards consume resources and require processing overhead. To compile results from an index consisting of more than one shard, Elasticsearch must query each shard individually (although in parallel), and then it must perform operations on the aggregated results.
Therefore, a machine with more IO headroom (SSDs) and a multi-core processor can definitely benefit from sharding, but you must consider the size, volatility, and future states of your dataset.
Remember that benchmarking indexing load and understanding the dynamics of your index (whether it’s static or highly dynamic) will help you determine the optimal configuration. While there is no one-size-fits-all solution with respect to shard allocation, we hope that you can benefit from this discussion.
If you like this article, consider using Qbox hosted Elasticsearch service. It’s stable and more affordable — and we offer top-notch free 24/7 support. Sign up or launch your cluster here, or click “Get Started” in the header navigation. If you need help setting up, refer to “Provisioning a Qbox Elasticsearch Cluster.”
Give It a Whirl!
It’s easy to spin up a standard hosted Elasticsearch cluster on any of our 47 Rackspace, SoftLayer, or Amazon data centers.