Sunday, April 1, 2018

FW: querying vs. highlighting: complete freedom?

-----Original Message-----
From: Arturas Mazeika [mailto:mazeika@gmail.com]
Sent: 22 March 2018 14:48
To: solr-user@lucene.apache.org
Subject: querying vs. highlighting: complete freedom?

Hi Solr-Users,

I've been experimenting with a German collection of documents, where I
searched for one word (q=Tag) and highlighted another (hl.q=Kundigung). Is
this a "legal" use case? My key question is: how can I tell Solr which query
analyzer to use for highlighting? Strictly speaking, I should use
hl.q=Kündigung to look for the relevant information, but in that case no
highlighting is returned (as all umlauts are stripped in the index).

Additional info:

Solr version: 7.2
URLs queried:

http://localhost:8983/solr/trans/select?q=trans:Zeit&hl=true&hl.fl=trans&hl.q=Kundigung&hl.snippets=3&wt=xml&rows=1


http://localhost:8983/solr/trans/select?q=trans:Zeit&hl=true&hl.fl=trans&hl.q=K%C3%BCndigung&hl.snippets=3&wt=xml&rows=1
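
The two requests above differ only in the value of hl.q, and the second one carries the umlaut as percent-encoded UTF-8. A minimal Python sketch, assuming the same host, collection (trans), and parameters as in the URLs above, that builds both URLs with the standard library. The helper name build_url is hypothetical and only serves the illustration:

```python
from urllib.parse import urlencode

# Host and collection are taken from the URLs quoted above.
BASE = "http://localhost:8983/solr/trans/select"


def build_url(hl_q: str) -> str:
    """Build the select URL, searching trans:Zeit and highlighting hl_q."""
    params = {
        "q": "trans:Zeit",
        "hl": "true",
        "hl.fl": "trans",
        "hl.q": hl_q,
        "hl.snippets": 3,
        "wt": "xml",
        "rows": 1,
    }
    # safe=":" keeps the colon in q=trans:Zeit readable; non-ASCII
    # characters such as "ü" are percent-encoded as UTF-8 bytes.
    return BASE + "?" + urlencode(params, safe=":")


print(build_url("Kundigung"))
print(build_url("Kündigung"))
```

Building the URL this way avoids hand-encoding mistakes, so an encoding problem can be ruled out when the umlaut variant returns no highlighting.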


Managed-schema:

<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" format="snowball" words="lang/stopwords_de.txt" ignoreCase="true"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.GermanLightStemFilterFactory"/>
  </analyzer>
</fieldType>
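
To illustrate why the index contains no umlauts, here is a rough Python sketch of the lowercasing and umlaut-folding steps in the analyzer chain above. It is a simplified approximation, not Solr's implementation: the function name fold_german is hypothetical, and the sketch leaves out tokenization, stop words, stemming, and the filter's handling of "ae", "oe", and "ue".

```python
# Simplified stand-in for LowerCaseFilter + part of GermanNormalizationFilter.
# Assumption: only the direct umlaut and sharp-s replacements are modelled.
_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})


def fold_german(token: str) -> str:
    """Lowercase a token, then fold umlauts and sharp s to ASCII."""
    return token.lower().translate(_FOLD)


# Both spellings end up as the same term after folding.
print(fold_german("Kündigung"))
print(fold_german("Kundigung"))
```

If the same analysis were applied to hl.q, both spellings would be expected to match the same indexed term, which is what makes the missing highlight for Kündigung surprising.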


Further details:
https://stackoverflow.com/questions/49276093/solr-highlighting-terms-with-umlaut-not-found-not-highlighted


Cheers,
Arturas
