Saturday, March 17, 2018

FW: SolrCloud update and luceneMatchVersion

-----Original Message-----
From: Erick Erickson
Sent: 14 March 2018 23:57
To: solr-user
Subject: Re: SolrCloud update and luceneMatchVersion


There's one problem with IndexUpgraderTool. As Shawn points out, it does a
forceMerge, which by default creates one large segment. This has some
implications in terms of the number of deleted documents if the index has
updates afterwards, see:

and the associated JIRA:

My recommendation would be to _not_ run the IndexUpgraderTool and let
background merging do what's necessary over time. Or, as Shawn says,
re-index from scratch.

The exceptions, where the single large segment should not be a problem:
1> your index is less than 5G. Since that's the default max segment
size (see the article), it won't matter.
2> you optimize frequently anyway
3> you _might_ get away with a forceMerge where you specify the number
of segments to create as (index_size_in_gigabytes/5). But frankly I don't
know enough about the algorithm for how segments are chosen in that case to
know whether that'd do exactly what you want.
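
The arithmetic in option 3 could be sketched roughly as below. This is only an illustration, not a tested recipe: the helper names, the host and the collection name are made up, and rounding up is an assumption (the text above only says index size divided by 5G). `optimize=true` and `maxSegments` are parameters of Solr's update handler; whether the resulting segments come out evenly sized is exactly the open question raised above.

```python
import math
from urllib.parse import urlencode

DEFAULT_MAX_SEGMENT_GB = 5  # default max segment size mentioned above


def target_segment_count(index_size_gb, max_segment_gb=DEFAULT_MAX_SEGMENT_GB):
    """Index size divided by max segment size, rounded up (rounding is an assumption)."""
    if index_size_gb <= 0:
        raise ValueError("index size must be positive")
    return max(1, math.ceil(index_size_gb / max_segment_gb))


def optimize_url(base_url, collection, index_size_gb):
    """Build an update request that forceMerges down to the computed segment count."""
    params = urlencode({
        "optimize": "true",
        "maxSegments": target_segment_count(index_size_gb),
    })
    return f"{base_url}/{collection}/update?{params}"


if __name__ == "__main__":
    # Hypothetical 23G index on a hypothetical local node.
    print(optimize_url("http://localhost:8983/solr", "mycollection", 23))
```

For an index under 5G this yields a single segment, which matches case 1 above.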


On Wed, Mar 14, 2018 at 10:08 AM, Hendrik Haddorp wrote:
> Thanks for the detailed description!
> On 14.03.2018 16:11, Shawn Heisey wrote:
>> On 3/14/2018 5:56 AM, Hendrik Haddorp wrote:
>>> So you are saying that we do not need to run the IndexUpgrader tool
>>> if we move from 6 to 7. Will the index be then updated automatically
>>> or will we get a problem once we move to 8?
>> If you don't run IndexUpgrader, and the index version is one that the
>> new Solr can read, then existing index segments will remain in the
>> format they are. New segments will be written in the new format. If
>> any of the existing segments are merged, then the new larger segment
>> will be in the new format.
>> Summary: If an index starts out as 6.x, then is run for a while in
>> 7.x, but there are still 6.x segments left, then that index will not
>> work in 8.0.
>> IndexUpgrader is a Lucene tool. This tool just runs a forceMerge
>> process on the index, which will merge all of the existing segments
>> into a single segment. It's EXACTLY the same operation that Solr calls
>> optimize. (Lucene used to call it optimize too. Then they renamed it.)
>>> How would one use the IndexUpgrader at all with Solr? Would one need
>>> to run it against the index of every core?
>> The Solr server must be shut down during the IndexUpgrader run.
>> IndexUpgrader is a completely separate tool, part of Lucene. It has
>> zero knowledge of anything that you have configured in Solr, so you
>> must locate the index directory of any core you want to upgrade and
>> run the tool on that index directory.
>> Thanks,
>> Shawn
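
Shawn's description of how segments change format can be pictured with a small toy model. It is a sketch under stated assumptions: the `Segment` class, the segment names and the "version N reads N and N-1" rule are illustrative stand-ins for what the thread describes, not Lucene's actual compatibility check, which may impose further conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    major: int  # major version that wrote this segment


def merge(segments, running_major, name):
    """Merging rewrites its inputs as one segment in the running version's format."""
    if not segments:
        raise ValueError("nothing to merge")
    return Segment(name=name, major=running_major)


def unreadable_segments(segments, running_major):
    """Toy rule: a version reads its own and the previous major's segments only."""
    return [s for s in segments if s.major < running_major - 1]


if __name__ == "__main__":
    index = [Segment("_a", 6), Segment("_b", 6), Segment("_c", 7)]
    # Under 7.x only _b and _c happen to be merged; _a stays in 6.x format.
    index = [index[0], merge(index[1:], 7, "_d")]
    print([s.name for s in unreadable_segments(index, 8)])  # prints ['_a']
```

The leftover `_a` segment is the situation in the summary: the index has run under 7.x for a while, yet one untouched 6.x segment remains.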

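For anyone who does decide to run IndexUpgrader despite the caveats above, the "one run per core index directory" step might be scripted along these lines. This is a hedged sketch: the `<core>/data/index` layout, the Solr home path and the jar names are assumptions that need to be adapted to the actual installation, and the script only prints commands rather than running them. `org.apache.lucene.index.IndexUpgrader` is the tool's main class, and it needs both the Lucene core and backward-codecs jars on the classpath.

```python
from pathlib import Path


def index_dirs(solr_home):
    """Return <core>/data/index directories under solr_home (layout is an assumption)."""
    return sorted(p for p in Path(solr_home).glob("*/data/index") if p.is_dir())


def upgrader_command(index_dir, classpath):
    """Command line that runs Lucene's IndexUpgrader on one index directory."""
    return [
        "java", "-cp", classpath,
        "org.apache.lucene.index.IndexUpgrader",
        "-verbose",
        str(index_dir),
    ]


if __name__ == "__main__":
    # Hypothetical paths; Solr must be shut down before running the printed commands.
    classpath = "lucene-core.jar:lucene-backward-codecs.jar"
    for d in index_dirs("/var/solr/data"):
        print(" ".join(upgrader_command(d, classpath)))
```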