Sunday, April 1, 2018

FW: Indexing multi level Nested JSON

-----Original Message-----
From: Zheng Lin Edwin Yeo [mailto:edwinyeozl@gmail.com]
Sent: 22 March 2018 21:10
To: solr-user@lucene.apache.org
Subject: Re: Indexing multi level Nested JSON

I'm trying to index the following JSON, which has two levels of children,
using the following curl command:

.\curl 'http://localhost:8983/solr/collection1/update/json/docs?split=/|/orgs'
-H 'Content-type:application/json' -d '
{
"id":"1",
"name_s": "JoeSmith",
"phone_s": 876876687,
"orgs": [
{
"name1_s" : "Microsoft",
"city_s" : "Seattle",
"zip_s" : 98052,

"orgs":[{"name2_ss":"alan","phone2_ss":"123"},{"name2_ss":"edwin","phone2_ss":"456"}]
},
{
"name1_s" : "Apple",
"city_s" : "Cupertino",
"zip_s" : 95014,

"orgs":[{"name2_ss":"alan","phone2_ss":"123"},{"name2_ss":"edwin","phone2_ss":"456"}]
}
]
}'

However, after indexing, this is what Solr shows. The second-level children
have been placed under the first-level child as multi-valued fields, which is
wrong:

{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":41,
"params":{
"q":"phone_s:876876687",
"fl":"*,[child parentFilter=phone_s:876876687]",
"sort":"id asc"}},
"response":{"numFound":1,"start":0,"docs":[
{
"id":"1",
"name_s":"JoeSmith",
"phone_s":"876876687",
"language_s":"en",
"_version_":1595632041779527680,
"_childDocuments_":[
{
"name1_s":"Microsoft",
"city_s":"Seattle",
"zip_s":"98052",
"orgs.name2_ss":["alan",
"edwin"],
"orgs.phone2_ss":["123",
"456"],
"_version_":1595632041779527680},
{
"name1_s":"Apple",
"city_s":"Cupertino",
"zip_s":"95014",
"orgs.name2_ss":["alan",
"edwin"],
"orgs.phone2_ss":["123",
"456"],
"_version_":1595632041779527680}]}]
}}
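
To make the result above concrete, the flattening can be mimicked in a few
lines of Python. This is only a sketch that reproduces the shape of the output
shown in this message; it is not Solr's implementation, and the function name
flatten_nested is made up for illustration. Note that Solr also returns zip_s
as a string, which this sketch does not imitate.

```python
def flatten_nested(doc, prefix=""):
    """Collapse nested objects into dotted, multi-valued fields.

    Mimics the observed result for a first-level child whose inner "orgs"
    array is not covered by the split paths. Not Solr code.
    """
    flat = {}
    for key, value in doc.items():
        name = prefix + key
        is_object_list = (
            isinstance(value, list)
            and value
            and all(isinstance(v, dict) for v in value)
        )
        if is_object_list:
            # Every nested object contributes its values to the same
            # dotted field name, so the grouping per object is lost.
            for item in value:
                for sub_key, sub_val in flatten_nested(item, name + ".").items():
                    bucket = flat.setdefault(sub_key, [])
                    bucket.extend(sub_val if isinstance(sub_val, list) else [sub_val])
        else:
            flat[name] = value
    return flat


# First-level child taken from the JSON in this message.
child = {
    "name1_s": "Microsoft",
    "city_s": "Seattle",
    "zip_s": 98052,
    "orgs": [
        {"name2_ss": "alan", "phone2_ss": "123"},
        {"name2_ss": "edwin", "phone2_ss": "456"},
    ],
}
flat = flatten_nested(child)
```

After flattening, nothing records that "alan" belongs with "123" except the
position in each list, which is why the second level can no longer be
addressed as documents of its own.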


How can we structure the curl command so that it accepts a child-of-child
relationship? We should not have to pre-process the JSON to achieve that.

Regards,
Edwin
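
For reference, Mikhail's suggestion quoted further down (add fields to the
children and grandchildren, then reconstruct the levels when post-processing
the results) could be sketched roughly as follows. The field names level_i and
parent_id_s, the generated "auto-N" ids, and all function names are
assumptions made for this example; nothing here is a ready-made Solr feature,
and it does involve touching the JSON before indexing.

```python
import itertools


def annotate(doc, parent_id=None, level=0, counter=None):
    """Copy doc, adding hypothetical level_i / parent_id_s fields to each node."""
    counter = counter if counter is not None else itertools.count(1)
    node = {k: v for k, v in doc.items() if k != "_childDocuments_"}
    if "id" not in node:
        # Grandchildren in the thread have no id, so generate one (assumption).
        node["id"] = f"auto-{next(counter)}"
    node["level_i"] = level
    if parent_id is not None:
        node["parent_id_s"] = parent_id
    children = [
        annotate(c, node["id"], level + 1, counter)
        for c in doc.get("_childDocuments_", [])
    ]
    if children:
        node["_childDocuments_"] = children
    return node


def flatten(node):
    """Simulate the flat descendant list seen in the thread (deepest docs first)."""
    out = []
    for child in node.get("_childDocuments_", []):
        out.extend(flatten(child))
        out.append({k: v for k, v in child.items() if k != "_childDocuments_"})
    return out


def rebuild(root, flat_descendants):
    """Reconstruct the tree from a flat list using parent_id_s."""
    nodes = {d["id"]: dict(d) for d in flat_descendants}
    root = {k: v for k, v in root.items() if k != "_childDocuments_"}
    nodes[root["id"]] = root
    for d in flat_descendants:
        parent = nodes[d["parent_id_s"]]
        parent.setdefault("_childDocuments_", []).append(nodes[d["id"]])
    return root


# First parent document from the original question in this thread.
doc = {
    "id": "1",
    "title_s": "Solr adds block join support",
    "contenttype_s": "parentDocument",
    "_childDocuments_": [
        {
            "id": "3",
            "comments_s": "SolrCloud supports it too!",
            "_childDocuments_": [
                {"name_s": "alan", "phone_s": "123"},
                {"name_s": "edwin", "phone_s": "456"},
            ],
        }
    ],
}
annotated = annotate(doc)      # what would be sent to Solr
flat = flatten(annotated)      # stand-in for the flat child list in a response
rebuilt = rebuild(annotated, flat)
```

With fields like these in place, a client can tell children (level_i 1) from
grandchildren (level_i 2) in the flat list and regroup them itself.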


On 20 March 2018 at 16:44, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:

> Hi Mikhail,
>
> Thanks for your reply.
> Does that mean the only way to identify them is to add fields, e.g.
> contentType, during indexing?
>
> Regards,
> Edwin
>
> On 20 March 2018 at 16:34, Mikhail Khludnev <mkhl@apache.org> wrote:
>
>> Edwin,
>> You need to add the necessary fields to the children/grandchildren to
>> keep multiple levels, and reconstruct them when post-processing the results.
>> There is nothing ready-made for it.
>>
>>
>> On Tue, Mar 20, 2018 at 7:02 AM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>>
>> > I have found that we can index multi-level nested JSON with a
>> > child-of-child relationship.
>> >
>> > However, how can we tell from the output that it is a child-of-child
>> > relationship? From what I have seen, all the results are tied to the
>> > parents, so they all look like parent-child relationships, and I
>> > can't identify which are child-of-child relationships.
>> >
>> > Regards,
>> > Edwin
>> >
>> > On 19 March 2018 at 11:16, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > I have this sample multi-level nested JSON, with 2 levels of child
>> > > documents.
>> > >
>> > > [
>> > > {
>> > > "id": "1",
>> > > "title_s": "Solr adds block join support",
>> > > "contenttype_s": "parentDocument",
>> > > "_childDocuments_": [
>> > > {
>> > > "id": "3",
>> > > "comments_s": "SolrCloud supports it too!",
>> > > "_childDocuments_":[{"name_s":"alan","phone_s":"123"},{"name_s":"edwin","phone_s":"456"}]
>> > > },
>> > > {
>> > > "id": "3a",
>> > > "comments_s": "SolrCloud supports it too 2!",
>> > > "_childDocuments_":[{"name_s":"alan","phone_s":"123"},{"name_s":"edwin","phone_s":"456"}]
>> > > }
>> > > ]
>> > > },
>> > > {
>> > > "id": "2",
>> > > "title_s": "New Lucene and Solr release is out",
>> > > "contenttype_s": "parentDocument",
>> > > "_childDocuments_": [
>> > > {
>> > > "id": "4",
>> > > "comments_s": "Lots of new features",
>> > > "_childDocuments_":[{"name_s":"alan","phone_s":"123"},{"name_s":"edwin","phone_s":"456"}]
>> > > }
>> > > ]
>> > > },
>> > > {
>> > > "id": "5",
>> > > "title_s": "Testing of Nested JSON",
>> > > "contenttype_s": "parentDocument",
>> > > "_childDocuments_": [
>> > > {
>> > > "id": "6",
>> > > "comments_s": "See if this is a child",
>> > > "_childDocuments_":[{"name_s":"alan","phone_s":"123"},{"name_s":"edwin","phone_s":"456"}]
>> > > }
>> > > ]
>> > > }
>> > > ]
>> > >
>> > >
>> > > However, when it is indexed into Solr, there is only one level, and
>> > > the output becomes like this.
>> > >
>> > > {
>> > > "responseHeader":{
>> > > "zkConnected":true,
>> > > "status":0,
>> > > "QTime":1,
>> > > "params":{
>> > > "q":"contenttype_s:parentDocument",
>> > > "fl":"*,[child parentFilter=contenttype_s:parentDocument]",
>> > > "sort":"id asc"}},
>> > > "response":{"numFound":3,"start":0,"docs":[
>> > > {
>> > > "id":"1",
>> > > "title_s":"Solr adds block join support",
>> > > "contenttype_s":"parentDocument",
>> > > "signature":"0000000000000000",
>> > > "_version_":1595334082096529408,
>> > > "_childDocuments_":[
>> > > {
>> > > "name_s":"alan",
>> > > "phone_s":"123",
>> > > "_version_":1595334082096529408},
>> > > {
>> > > "name_s":"edwin",
>> > > "phone_s":"456",
>> > > "_version_":1595334082096529408},
>> > > {
>> > > "id":"3",
>> > > "comments_s":"SolrCloud supports it too!",
>> > > "_version_":1595334082096529408},
>> > > {
>> > > "name_s":"alan",
>> > > "phone_s":"123",
>> > > "_version_":1595334082096529408},
>> > > {
>> > > "name_s":"edwin",
>> > > "phone_s":"456",
>> > > "_version_":1595334082096529408},
>> > > {
>> > > "id":"3a",
>> > > "comments_s":"SolrCloud supports it too 2!",
>> > > "_version_":1595334082096529408}]},
>> > > {
>> > > "id":"2",
>> > > "title_s":"New Lucene and Solr release is out",
>> > > "contenttype_s":"parentDocument",
>> > > "signature":"0000000000000000",
>> > > "_version_":1595334082099675136,
>> > > "_childDocuments_":[
>> > > {
>> > > "name_s":"alan",
>> > > "phone_s":"123",
>> > > "_version_":1595334082099675136},
>> > > {
>> > > "name_s":"edwin",
>> > > "phone_s":"456",
>> > > "_version_":1595334082099675136},
>> > > {
>> > > "id":"4",
>> > > "comments_s":"Lots of new features",
>> > > "_version_":1595334082099675136}]},
>> > > {
>> > > "id":"5",
>> > > "title_s":"Testing of Nested JSON",
>> > > "contenttype_s":"parentDocument",
>> > > "signature":"0000000000000000",
>> > > "_version_":1595334082101772288,
>> > > "_childDocuments_":[
>> > > {
>> > > "name_s":"alan",
>> > > "phone_s":"123",
>> > > "_version_":1595334082101772288},
>> > > {
>> > > "name_s":"edwin",
>> > > "phone_s":"456",
>> > > "_version_":1595334082101772288},
>> > > {
>> > > "id":"6",
>> > > "comments_s":"See if this is a child",
>> > > "_version_":1595334082101772288}]}]
>> > > }}
>> > >
>> > >
>> > > Is Solr able to support the indexing of multi level Nested JSON?
>> > >
>> > > I have tested this on Solr 6.5.1.
>> > >
>> > > Regards,
>> > > Edwin
>> > >
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
>
>
