Sunday, April 1, 2018

FW: Problem accessing /solr/<collection-name>_shard1_replica_n1/get

-----Original Message-----
From: Hendrik Haddorp [mailto:hendrik.haddorp@gmx.net]
Sent: 25 March 2018 00:24
To: solr-user@lucene.apache.org
Subject: Re: Problem accessing /solr/<collection-name>_shard1_replica_n1/get

Ah, ok, that might then be related to the auto add replica feature.
Since trying Solr 7 I have noticed that Solr is moving my cores around on
its own. I did not see that happening in Solr 6. I believe Solr 6 could also
move replicas on HDFS around, but I never actually saw that happening.

According to CloudConfig.java the default auto replica failover time is 30s.
I used to wait 2min when restarting nodes, as otherwise I ran into problems
with the overseer queue, which got fixed in later Solr 6 releases.
I'm currently experimenting with increasing the failover time to 5min so
that my nodes can restart before the replicas get moved.
Maybe that will then also resolve this type of problem. Issue SOLR-12114
makes changing the config a bit trickier, but I got it updated.

thanks,
Hendrik

On 24.03.2018 18:31, Shawn Heisey wrote:
> On 3/24/2018 11:22 AM, Hendrik Haddorp wrote:
>> below is the full entry from the Solr log. I actually also found the
>> list of implicit request handlers later on. But that does make it
>> even more strange that Solr complains about a missing handler.
>
> The "not found" is rather generic, and might not be referring to the
> handler.  I wonder if we can improve those not found messages to
> indicate *what* wasn't found.
>
>> 2018-03-22 18:19:25.599 ERROR
>> (updateExecutor-3-thread-7-processing-n:search-agent3:9007_solr
>> x:collection-0005_shard1_replica_n2 s:shard1 c:collection-0005
>> r:core_node4) [c:collection-0005 s:shard1 r:core_node4
>> x:collection-0005_shard1_replica_n2] o.a.s.c.SyncStrategy
>> http://search-agent3:9007/solr/collection-0005_shard1_replica_n2/:
>> Could not tell a replica to
>> recover:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://search-agent3:9007/solr: Unable to locate
>> core collection-0005_shard1_replica_n1
>
> Based on the end of what I quoted here, I think that the issue here
> might be that the *core* doesn't exist, not that the handler doesn't
> exist.  Which may mean that the info in zookeeper doesn't match the
> cores that are actually present and working.
>
> If the core does exist on the disk, maybe Solr had a problem getting
> the core started.
>
> Thanks,
> Shawn
>
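
One way to check Shawn's suspicion that the replica info in ZooKeeper does not match the cores actually loaded is to compare the two views directly. The following is a minimal sketch, not something from the thread: it assumes you have already fetched and parsed the JSON output of the Collections API CLUSTERSTATUS action and of the CoreAdmin STATUS action for each node, and that those responses have the usual cluster/collections/shards/replicas and status layouts. The function name find_missing_cores and the dictionary keyed by node name are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def find_missing_cores(
    cluster_status: dict,
    core_status_by_node: Dict[str, dict],
) -> List[Tuple[str, str]]:
    """Return sorted (node_name, core) pairs that the cluster state lists
    but that the node itself does not report as a loaded core.

    cluster_status: parsed JSON from action=CLUSTERSTATUS (assumed layout).
    core_status_by_node: node_name -> parsed JSON from that node's
        CoreAdmin action=STATUS (assumed layout, hypothetical mapping).
    """
    missing = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for collection in collections.values():
        for shard in collection.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                node = replica.get("node_name", "")
                core = replica.get("core", "")
                # Cores the node reports; empty if the node was not queried.
                loaded = core_status_by_node.get(node, {}).get("status", {})
                if core not in loaded:
                    missing.append((node, core))
    return sorted(missing)
```

A non-empty result would point at replicas that ZooKeeper still lists but that the node cannot serve, which would be consistent with the "Unable to locate core" error quoted above. It does not by itself say why the core is missing, for example whether it was moved or failed to start.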
