Saturday, March 17, 2018

FW: The Impact of the Number of Collections on Indexing Performance in Solr 6.0

-----Original Message-----
From: spoonerk [mailto:john.spooner@gmail.com]
Sent: 12 March 2018 18:26
To: solr-user@lucene.apache.org
Subject: Re: The Impact of the Number of Collections on Indexing Performance in Solr 6.0

I have tried emailing to unsubscribe. I have tried disrupting threads
hoping to annoy the admin into taking me off the list. All I get is
arrogant emails about headers.

On Mar 12, 2018 1:15 AM, "苗海泉" <mseaspring@gmail.com> wrote:

> Thanks Erick and Shawn, and thank you for your patience. As I said
> above, the phenomenon is not caused by IO, CPU, memory, or network IO:
> swap is turned off and the machine's memory is sufficient. When indexing
> speed declines, QTime shows that it takes 3 to 4 seconds to reload the
> index, so it is more likely to be a Solr problem than a Jetty one. It is
> worth mentioning that when the indexing speed dropped sharply, Solr used
> only about 5% of the CPU, whereas when things were normal, Solr's CPU
> usage was around 200 percent and the overall system's CPU usage was
> about 20 percent.
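[Editor's note: the QTime figures above come from Solr's response headers. A minimal sketch of how one might flag slow requests from that header; the sample JSON below is illustrative, not from the poster's cluster:]

```python
import json

# Illustrative sample of a Solr JSON response header; real values come
# from your own cluster's query responses (e.g. /solr/<collection>/select?wt=json).
sample = '{"responseHeader": {"status": 0, "QTime": 3500}}'

def is_slow(raw_response: str, threshold_ms: int = 3000) -> bool:
    """Flag a response whose reported QTime (milliseconds) exceeds a threshold."""
    header = json.loads(raw_response)["responseHeader"]
    return header["QTime"] > threshold_ms

# A QTime of 3.5 s exceeds the 3 s threshold the thread describes.
print(is_slow(sample))
```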
>
> Basic information:
> 1) The data volume of each collection is between 2 billion and 3 billion.
> 2) Each machine has 24 CPUs and 128 GB of memory.
> 3) Disk usage per replica is about 10 GB.
>
> In addition, I noticed that ZooKeeper is working normally; there are no
> error or warning messages.
>
> So all these phenomena make me think that some internal mechanism of
> Solr may be causing the sharp drop in indexing speed. At present, it
> seems that our Solr machines' resources are sufficient.
>
> As for reducing the number of collections, as you suggested, we have
> that plan as well and are looking for ways to implement it. Are there
> any other suggestions?
>
>
> Best,
> miaohq
>
> 2018-03-11 10:15 GMT+08:00 spoonerk <john.spooner@gmail.com>:
>
> > Wow, thanks. Just trying to unsubscribe. Most email lists let you do
> > that.
> >
> > On Mar 10, 2018 2:36 PM, "Erick Erickson" <erickerickson@gmail.com>
> wrote:
> >
> > > Spoonerk:
> > >
> > > You say you've tried "many times", but you haven't provided the full
> > > headers as described in the "problems" link at the link below. You
> > > haven't e-mailed the list owner as suggested in the "problems" link.
> > > You haven't, in short, provided any of the information that's
> > > necessary to actually unsubscribe you.
> > >
> > > Please follow the instructions here:
> > > http://lucene.apache.org/solr/community.html#mailing-lists-irc. In
> > > particular look at the "problems" link.
> > >
> > > You must use the _exact_ same e-mail as you used to subscribe.
> > >
> > > If the initial try doesn't work and following the suggestions at
> > > the "problems" link doesn't work for you, let us know. But note
> > > you need to show us the _entire_ return header to allow anyone to
> > > diagnose the problem.
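[Editor's note: Apache mailing lists typically accept management commands at derived addresses of the form `<list>-<command>@<domain>`, sent from the subscribed address. A tiny sketch of how the unsubscribe address for this list is formed:]

```python
# Apache mailing lists accept commands at derived addresses of the form
# <list>-<command>@<domain>; the message must come from the exact address
# that was originally subscribed.
list_name = "solr-user"
domain = "lucene.apache.org"
unsubscribe_addr = f"{list_name}-unsubscribe@{domain}"
print(unsubscribe_addr)  # solr-user-unsubscribe@lucene.apache.org
```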
> > >
> > > Best,
> > > Erick
> > >
> > > On Sat, Mar 10, 2018 at 1:03 PM, spoonerk <john.spooner@gmail.com>
> > wrote:
> > > > I have manually unsubscribed many times, but I still get emails
> > > > from the list. Can some admin please unsubscribe me?
> > > >
> > > > On Mar 9, 2018 9:52 PM, "苗海泉" <mseaspring@gmail.com> wrote:
> > > >
> > > >> Hello, we have found a problem. In Solr 6.0, indexing speed is
> > > >> influenced by the number of collections. The speed is normal until
> > > >> a limit on the number of collections is reached; once that limit is
> > > >> exceeded, indexing speed decreases by a factor of about 50.
> > > >>
> > > >> In our environment there are 49 Solr nodes. If each collection has
> > > >> 25 shards, we can maintain high-speed indexing until the total
> > > >> number of collections reaches about 900; if we reduce the number of
> > > >> collections back below that limit, the speed goes back up. If each
> > > >> collection has 49 shards, the total number of collections can only
> > > >> be about 700; exceeding this value causes indexing speed to drop
> > > >> dramatically. Note that these figures are for single replicas;
> > > >> multiple replicas cause serious stability problems in a large Solr
> > > >> cluster environment.
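[Editor's note: the shard and collection counts above imply a very large number of cores per node, which by itself puts pressure on indexing. A back-of-the-envelope sketch, assuming shards are spread evenly across nodes and a single replica, as the poster states:]

```python
def cores_per_node(collections: int, shards_per_collection: int, nodes: int) -> float:
    """Single-replica cluster: total cores divided evenly across nodes."""
    return collections * shards_per_collection / nodes

# Numbers from the thread: 49 nodes, single replica.
print(cores_per_node(900, 25, 49))  # ~459 cores per node at the 25-shard limit
print(cores_per_node(700, 49, 49))  # 700 cores per node at the 49-shard limit
```

Several hundred cores per node is far beyond typical SolrCloud deployments, which is consistent with the cluster hitting an internal scaling limit rather than a hardware one.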
> > > >>
> > > >> At first I suspected that it was due to too many thread
> > > >> submissions, but that approach still has problems, so now I am
> > > >> inclined to suspect the searcherExecutor thread pool. This is just
> > > >> my guess; I want to know the real reason. Can someone help?
> > > >>
> > > >> Also, I noticed that the searcherExecutor threads basically
> > > >> correspond one-to-one with the shards of the Solr collections. How
> > > >> can I reduce the number of these threads, or even shut them down?
> > > >> Although there are many collections in our environment, there are
> > > >> few queries, and it is not necessary to keep the threads open to
> > > >> serve queries. This is too wasteful.
> > > >>
> > > >> Thank you.
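[Editor's note: one low-risk way to see how many searcher threads a node is carrying is to take a JVM thread dump (e.g. with `jstack <pid>`) and count thread names containing `searcherExecutor`. A minimal sketch; the sample dump below is illustrative, not output from a real cluster:]

```python
# Count searcher threads in a JVM thread dump (e.g. the output of `jstack <pid>`).
# The sample dump below is made up for illustration.
sample_dump = """\
"searcherExecutor-7-thread-1" #42 prio=5 WAITING
"searcherExecutor-9-thread-1" #57 prio=5 WAITING
"qtp12345-20" #20 prio=5 RUNNABLE
"""

def count_searcher_threads(dump: str) -> int:
    """Return how many lines of a thread dump mention searcherExecutor."""
    return sum(1 for line in dump.splitlines() if "searcherExecutor" in line)

print(count_searcher_threads(sample_dump))  # 2
```

On a node hosting hundreds of cores, this count would be correspondingly large, which supports the poster's observation that the threads track shards one-to-one.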
> > > >>
> > >
> >
>
>
>
> --
> ==============================
> 联创科技
> 知行如一 ("knowledge and action as one")
> ==============================
>
