Sunday, April 1, 2018

FW: Using Solr to build a product matcher, with learning to rank

-----Original Message-----
From: Rahul Singh [mailto:rahul.xavier.singh@gmail.com]
Sent: 29 March 2018 22:03
To: solr-user@lucene.apache.org
Subject: Re: Using Solr to build a product matcher, with learning to rank

You may be overthinking this. There is a "More Like This" feature that
basically does this. Give that a try before digging deeper into the LTR
methods. It may be good enough for rock and roll.

--
Rahul Singh
rahul.singh@anant.us

Anant Corporation

On Mar 28, 2018, 12:25 PM -0400, Xavier Schepler
<xavier.schepler@recommerce.com>, wrote:
> Hello,
>
> I'm considering using Solr with learning to rank to build a product
> matcher.
> For example, it should match the titles:
> - Apple iPhone 6 16 Gb,
> - iPhone 6 16 Gb,
> - Smartphone IPhone 6 16 Gb,
> - iPhone 6 black 16 Gb,
> to the same internal reference, a unique identifier.
>
> With Solr, each document would then have a field for the product title
> and one for its class, which is the unique identifier of the product.
> Solr would then be used to perform matching as follows.
>
> 1. A search is performed with a given product title.
> 2. The first three results are considered (this requires an initial
> product title database).
> 3. The most frequent identifier is returned.
>
> This method corresponds roughly to a k-Nearest Neighbor approach with
> the cosine metric, k = 3, and a TF-IDF model.
>
> I've done some preliminary tests with scikit-learn and the results
> are good, but not as good as those of more sophisticated learning
> algorithms.
>
> Then, I noticed that there exists learning to rank with Solr.
>
> First, do you think that such a use of Solr makes sense?
> Second, is there a relatively simple way to build a learning model
> using a sparse representation of the query TF-IDF vector?
>
> Kind regards,
>
> Xavier Schepler
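
The matching procedure described in the quoted message (search with a
title, keep the top three results, return the most frequent identifier)
can be sketched outside Solr in plain Python. This is only an
illustration under stated assumptions, not the poster's code: the
catalog, the identifiers such as "iphone-6-16gb", the whitespace
tokenizer, and the smoothed IDF formula are all made up for the example.
Vectors are stored as sparse dicts of term -> weight.

```python
import math
from collections import Counter

# Hypothetical catalog: (product title, internal identifier).
CATALOG = [
    ("Apple iPhone 6 16 Gb", "iphone-6-16gb"),
    ("iPhone 6 16 Gb", "iphone-6-16gb"),
    ("Smartphone IPhone 6 16 Gb", "iphone-6-16gb"),
    ("Apple iPhone 7 32 Gb", "iphone-7-32gb"),
    ("Samsung Galaxy S7 32 Gb", "galaxy-s7-32gb"),
]

def tokenize(title):
    # Assumption: lowercase + whitespace split is enough for this sketch.
    return title.lower().split()

def build_idf(titles):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(titles)
    df = Counter()
    for title in titles:
        df.update(set(tokenize(title)))
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

def tfidf_vector(title, idf):
    # Sparse TF-IDF vector; terms unseen in the catalog are dropped.
    tf = Counter(tokenize(title))
    return {term: count * idf[term] for term, count in tf.items() if term in idf}

def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def match(query, catalog, k=3):
    # 1. score every catalog title against the query,
    # 2. keep the k most similar titles,
    # 3. return the most frequent identifier among them
    #    (ties go to the identifier that ranked highest).
    idf = build_idf([title for title, _ in catalog])
    q = tfidf_vector(query, idf)
    scored = sorted(
        ((cosine(q, tfidf_vector(title, idf)), ident) for title, ident in catalog),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top = [ident for _, ident in scored[:k]]
    return Counter(top).most_common(1)[0][0]

if __name__ == "__main__":
    print(match("iPhone 6 black 16 Gb", CATALOG))
```

With Solr in place of the in-memory loop, step 1 would be a query against
the title field and steps 2 and 3 would run on the returned documents;
the scoring would then be whatever similarity the index is configured
with, not necessarily the TF-IDF cosine used here.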
