Sunday, April 1, 2018

FW: InetAddressPoint support in Solr or other IP type?

-----Original Message-----
From: David Smiley [mailto:david.w.smiley@gmail.com]
Sent: 28 March 2018 10:01
To: solr-user@lucene.apache.org
Subject: Re: InetAddressPoint support in Solr or other IP type?

(I overlooked your reply; sorry to leave you hanging)

From a simplicity standpoint, just use InetAddressPoint. Solr has no
rules/restrictions as to which Lucene module a class lives in.
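
A minimal sketch of why a single point type can serve both address families, assuming (as the InetAddressPoint Javadoc describes) that every address is encoded as 16 bytes, with IPv4 mapped into IPv6 space as ::ffff:a.b.c.d, and that a subnet query is evaluated as a range over those bytes. The code uses only Python's standard ipaddress module; encode_16 and prefix_range are hypothetical helper names for illustration, not Lucene or Solr APIs.

```python
import ipaddress
from typing import Tuple


def encode_16(addr: str) -> bytes:
    """Encode an IPv4 or IPv6 address as 16 bytes.

    IPv4 is placed in the IPv4-mapped IPv6 range (::ffff:a.b.c.d), so both
    families share one sortable byte representation (assumed layout).
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return b"\x00" * 10 + b"\xff\xff" + ip.packed
    return ip.packed


def prefix_range(cidr: str) -> Tuple[bytes, bytes]:
    """Turn a CIDR block into inclusive (lower, upper) 16-byte bounds."""
    net = ipaddress.ip_network(cidr, strict=False)
    lower = encode_16(str(net.network_address))
    upper = encode_16(str(net.broadcast_address))
    return lower, upper


# A subnet search is then a plain byte-wise range check.
lo, hi = prefix_range("10.1.0.0/16")
print(lo <= encode_16("10.1.2.3") <= hi)
print(lo <= encode_16("10.2.0.1") <= hi)
```

In Lucene itself the corresponding entry points would be InetAddressPoint.newPrefixQuery and InetAddressPoint.newRangeQuery; the sketch only mirrors the range arithmetic behind them.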

That said, I *suspect* a Terms PrefixTree aligned to each byte would offer
better query performance, presuming that typical range queries are
byte-to-byte (as they would be for IPs?). The Points API internally makes
the splitting decision, and it's not customizable. It's blind to how people
will realistically query the data; it just wants a balanced tree.
For the same reason, I *suspect* (but have not benchmarked to see) that
DateRangeField has better query performance than DatePointField. That said,
a Points index is probably going to be leaner & faster to index.
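
The byte-alignment argument above can be sketched as follows. This is only an illustration of the idea, not Solr's PrefixTree implementation: it assumes each IPv4 address is indexed under one term per leading byte count, so a query aligned to a byte boundary needs a single term lookup, while an unaligned prefix expands into several terms. The names prefix_terms and query_terms are hypothetical.

```python
import ipaddress
from typing import List


def prefix_terms(addr: str) -> List[str]:
    """Terms an IPv4 address is indexed under: its 1-, 2-, 3- and 4-byte prefixes."""
    octets = addr.split(".")
    return [".".join(octets[:n]) for n in range(1, 5)]


def query_terms(cidr: str) -> List[str]:
    """Byte-aligned terms that together cover an IPv4 CIDR block exactly."""
    net = ipaddress.ip_network(cidr, strict=False)
    # Round the prefix length up to whole bytes (at least one byte).
    nbytes = max(1, -(-net.prefixlen // 8))
    step = 1 << (32 - 8 * nbytes)  # addresses covered by one term
    first = int(net.network_address)
    last = int(net.broadcast_address)
    terms = []
    for start in range(first, last + 1, step):
        octets = str(ipaddress.IPv4Address(start)).split(".")
        terms.append(".".join(octets[:nbytes]))
    return terms


print(prefix_terms("10.1.2.3"))
print(query_terms("10.1.0.0/16"))         # byte-aligned: a single term
print(len(query_terms("10.1.16.0/20")))   # unaligned: expands to several /24 terms
```

A balanced points tree has no reason to place its split boundaries on these byte edges, which is the intuition behind the suspicion; it would still need benchmarking to confirm.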

~ David

On Fri, Mar 23, 2018 at 7:51 PM Mike Cooper <mcooper@carbonblack.com> wrote:

> Thanks David. Is there a reason we wouldn't want to base the Solr
> implementation on the InetAddressPoint class?
>
>
> https://lucene.apache.org/core/7_2_1/misc/org/apache/lucene/document/InetAddressPoint.html
>
> I realize that it is in the "misc" package for now, so it's not part of
> core Lucene. But it is nice in that it has one class for both IPv4 and
> IPv6, and it's based on point numerics rather than trie numerics, which
> seem to be deprecated. I'm pretty familiar with the code base, so I could
> take a stab at implementing this. I just wanted to make sure there
> wasn't something I was missing, since I couldn't find any discussion of
> this.
>
> Michael Cooper
>
> -----Original Message-----
> From: David Smiley [mailto:david.w.smiley@gmail.com]
> Sent: Friday, March 23, 2018 5:14 PM
> To: solr-user@lucene.apache.org
> Subject: Re: InetAddressPoint support in Solr or other IP type?
>
> Hi,
>
> For IPv4, use TrieIntField with precisionStep=8.
>
> For IPv6, see https://issues.apache.org/jira/browse/SOLR-6741. There's
> nothing there yet; you could help out if you are familiar with the
> codebase. Or you might try something relatively simple involving edge
> ngrams.
>
> ~ David
>
> On Thu, Mar 22, 2018 at 1:09 PM Mike Cooper <mcooper@carbonblack.com>
> wrote:
>
> > I have scoured the web and cannot find any discussion of having the
> > Lucene InetAddressPoint type exposed in Solr. Is there a reason this
> > is omitted from the Solr supported types? Is it on the roadmap? Is
> > there an alternative recommended way to index and store IPv4 and
> > IPv6 addresses for optimal range searches and subnet searches?
> > Thanks for your help.
> >
> >
> >
> > *Michael Cooper*
> >
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
