Saturday, March 17, 2018

FW: Solr document routing using composite key

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: 16 March 2018 21:54
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Solr document routing using composite key

What Shawn said. 117 shards and 116 docs tells you absolutely nothing
useful. I've never seen the number of docs on various shards be off by more
than 2-3% when enough docs are indexed to be statistically valid.

Best,
Erick

On Fri, Mar 16, 2018 at 5:34 AM, Shawn Heisey <apache@elyograg.org> wrote:
> On 3/6/2018 11:53 AM, Nawab Zada Asad Iqbal wrote:
>>
>> I have 117 shards and I tried to use document ids from zero to 116. I
>> find that the distribution is very uneven, e.g., the largest bucket
>> receives a total of 5 documents, and around 38 shards are empty. Is this
>> expected?
>
>
> With such a small data set, this fits what I would expect.
>
> Choosing buckets by hashing (which is what compositeId does) is not
> perfect, but if you send it thousands or millions of documents, it
> will be *generally* balanced.
>
> Thanks,
> Shawn
>
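
Note: the effect described in this thread can be reproduced with a small simulation. The sketch below is not Solr's router. It assumes MD5 as a stand-in for the hash Solr actually uses, and it picks a shard with a simple modulo, whereas Solr assigns each shard a range of hash values. The function names (shard_for, distribution, summarize, expected_empty) are made up for illustration. The point is only that any reasonably uniform hash leaves roughly a third of the buckets empty when the number of documents equals the number of buckets, and evens out once there are many documents per bucket.

```python
import hashlib
from collections import Counter


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard index.

    Assumption: MD5 plus modulo is a stand-in for a uniform hash.
    It is NOT the hash function or range assignment Solr uses.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def distribution(num_docs: int, num_shards: int) -> list:
    """Return the number of documents landing on each shard for ids 0..num_docs-1."""
    counts = Counter(shard_for(str(i), num_shards) for i in range(num_docs))
    return [counts.get(s, 0) for s in range(num_shards)]


def summarize(num_docs: int, num_shards: int) -> dict:
    """Collect a few statistics about how evenly the documents spread."""
    per_shard = distribution(num_docs, num_shards)
    return {
        "empty": sum(1 for c in per_shard if c == 0),
        "min": min(per_shard),
        "max": max(per_shard),
        "mean": num_docs / num_shards,
    }


def expected_empty(num_docs: int, num_shards: int) -> float:
    """Expected number of empty shards if every document picks a shard uniformly at random."""
    return num_shards * (1 - 1 / num_shards) ** num_docs


if __name__ == "__main__":
    for num_docs in (117, 117_000):
        stats = summarize(num_docs, 117)
        print(num_docs, "docs:", stats,
              "expected empty ~", round(expected_empty(num_docs, 117), 1))
```

For 117 documents on 117 shards, the uniform-random model predicts about 43 empty shards (117 * (1 - 1/117)^117), which is in the same range as the roughly 38 empty shards reported in the question. With about a thousand documents per shard the same model predicts essentially no empty shards.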
