
eZ Find distributed search


Monday 16 February 2015 5:39:06 pm - 10 replies

Hi,

Does anyone here have experience setting up eZ Find to search in two different eZ Publish installations?

We have the following situation:

  • Site A in one eZ Publish installation
  • Site B in another eZ Publish installation

Both sites reside on the same web server, but they are NOT just plain siteaccesses. They are totally separate installations.

Now, what we want is to make site A able to search with eZ Find in BOTH the Site A data AND the Site B data.

We have set up multiple cores, indexed the sites and everything so far looks good.

However, the site B data is never searched.

Looking in advancedsearch.tpl, we have added the distributed_search param as follows:

{set $search=fetch( ezfind, search,
                    hash( 'query', $search_text,
                          'offset', $view_parameters.offset,
                          'limit', $page_limit,
                          'sort_by', hash( 'score', 'desc' ),
                          'filter', $filterParameters,
                          'publish_date', $search_date,
                          'spell_check', array( true() ),
                          'class_id', $search_contentclass_id,
                          'section_id', $search_section_id,
                          'as_objects', false(),
                          'distributed_search', hash( 'shards', array( 'SiteA', 'SiteB' ) ) ) )}

But NOTHING happens.

Now, looking in classes/ezfezpsolrquerybuilder.php, there seems to be something missing.

This is the ONLY part of the code that handles shards.

        foreach ( $distributedSearch['shards'] as $shard )
        {
            $shardURLs[] = $iniShards[$shard];
        }
        $shardQuery = implode( ',', $shardURLs );
    }

 

So, there is a $shardQuery. But it never seems to get handed over to Solr?

Any ideas on what's going on here are very welcome. Is this functionality still not fully in place, or what?

 

Cheers,

Nicklas

Modified on Tuesday 17 February 2015 7:39:47 am by Nicklas Lundgren

Tuesday 17 February 2015 7:25:16 am

Hello Nicklas,

I don't know anything about this topic (sorry) but I did find this doc on the subject:

https://doc.ez.no/Extensions/eZ-P...guration/Distributed-Search-Features

I hope this helps!

Cheers,
Heath

Tuesday 17 February 2015 7:50:32 am

Thanks Heath,

I have been looking at that page, to no avail.

I wish this functionality was better documented. 

From what I see in the code, there seem to be things missing. And that's strange, as this has been a marketed feature of eZ Find for years now.

Tuesday 17 February 2015 9:09:33 am

Hello Nicklas,

You're welcome. I do what I can, when I can.

I too wish this functionality was better documented (this sadly was always an issue with ezfind, in my opinion). I remember reading about this advertised feature too but have never had the need to use it.

I see what you're talking about in the PHP code. For others reading this thread (trying to follow along), this is the part of the code Nicklas is talking about (the additional distributed search parameter code, $shardQuery, that is seemingly not being used).

https://github.com/ezsystems/ezfi...sses/ezfezpsolrquerybuilder.php#L153

I also just found this document which adds a little more detail on required fields in this use case.

https://doc.ez.no/Extensions/eZ-Publish-extensions/eZ-Find/eZ-Find-LS-5.2.0/Advanced-Configuration/Archiving/Archiving/How-to-use-the-archiving-function#eztoc135967_2

Still, the above doc link is mostly the same as the one I shared before.

Cheers,
Heath

Modified on Tuesday 17 February 2015 9:12:06 am by // Heath

Tuesday 17 February 2015 9:59:49 am

Hello Nicklas,

I think the trouble you're having must be configuration based, since it is very much unlike eZ Systems to include incomplete work in its products, especially since this has been advertised since eZ Find v2.2.

This might also help as it was modified with relation to the distributed search doc creation:

https://github.com/ezsystems/ezfind/blob/4c4ca9b7739781f4d72cfdec1356a807a103097c/doc/design/2.2/multi-lingual-improvements-with-multi-core.txt

https://github.com/ezsystems/ezfind/commit/cab12c84b7b625c051a69468b38ff7d7fef96087#diff-0028ec20f86e30dbee66c1c3b2d1205c

After doing a large amount of read-only research into the eZ Find code, I think your setup must be missing something, in particular in your ini settings configuration. Could you share the contents of your ezfind and solr.ini settings override files via gist? This might help (me at least) confirm that the setup would run the multi-core/shard features in question for distributed search.

At a glance, it seems that the tpl specification of which shards to use in the search is not actually used in the class you mentioned in your first post. I'm trying to do more research on how all this works; I'm working blind here, as I use ezfind like a big black box most times.

Here was a search I used to learn more about the use of shards, https://github.com/ezsystems/ezfind/search?utf8=%E2%9C%93&q=shards

Warning: I don't honestly understand what a core or a shard is or how they are different from each other. I welcome being educated (as needed).

It seems that shards are being injected into the query here (in the abstraction above the class you mention in your first post). https://github.com/ezsystems/ezfind/blob/a044741d1c72d3e6754ef3cb757adbfb63501e12/search/plugins/ezsolr/ezsolr.php#L987

If you read all of the code above and below, following the shard-related code execution, it seems that shards are being used, just differently than you seem to have been expecting.

https://github.com/ezsystems/ezfi...earch/plugins/ezsolr/ezsolr.php#L962

https://github.com/ezsystems/ezfi...arch/plugins/ezsolr/ezsolr.php#L1522

I think that the search results come from ezsolr class at a top level since that is how the fetch function is defined here: 

https://github.com/ezsystems/ezfi.../ezfmodulefunctioncollection.php#L82

https://github.com/ezsystems/ezfi...s/ezfind/function_definition.php#L12

Questions:

This was the best research I could do in such short time and with so little experience. I truly hope that you will study my findings in detail and be able to solve your issue.

I hope this helps!

Cheers,
Heath 

Tuesday 17 February 2015 4:32:37 pm

Hi,

Thanks Heath, I am really impressed by the effort you put into my question. Much appreciated!

I have spent parts of this day trying to get a hold on this issue. The links you provided have been of great help in grasping this quite complex piece of code.

From what I have come to understand, there is a rather unfortunate mixup between two concepts here, regarding shards/cores. I think the code and the documentation could have been much clearer and to the point.

Looking in ezfind.ini, there is a section [LanguageSearch].

In this section it is possible to set MultiCore to enabled or disabled. However, from what I can see, this only affects languages in different cores, not cores hosting data from different sites.

I have managed to tweak the code in ezsolr.php to have Solr also search site B. However, this search doesn't give any results.

What I did was change ezfind/search/plugins/ezsolr/ezsolr.php, around line 980.

I added the $shardQueryPart-line in the middle of the sample below.

This inserts the correct shard for site B into the query to Solr. There seems to be no need to insert the siteA shard here. Probably as it is defined as default shard.

else
{
    eZDebug::createAccumulator( 'Query build', 'eZ Find' );
    eZDebug::accumulatorStart( 'Query build' );
    $queryBuilder = new ezfeZPSolrQueryBuilder( $this );

    // Added line: hard-code the site B core as an extra shard
    $shardQueryPart = array( 'shards' => implode( ',', array( "localhost:8983/solr/siteB" ) ) );

    $queryParams = $queryBuilder->buildSearch( $searchText, $params, $searchTypes );

    // Merge the shards parameter into the parameters sent to Solr
    if ( !$shardQueryPart == null )
    {
        $queryParams = array_merge( $shardQueryPart, $queryParams );
    }

With this "hack", I can see in the output in the console that Solr runs the query on both shards. However, there are no hits in the SiteB shard.

I don't know, but I believe this proves that this part of the query string was missing in the code to start with.
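
For anyone following along, the idea behind the hack can be sketched as a small standalone function. This is only an illustration and not eZ Find code: buildShardParam() and the $iniShards map are hypothetical names, the addresses are the ones used in this thread, and the sketch assumes Solr's `shards` request parameter takes a comma-separated list of shard addresses.

```php
<?php
// Hypothetical helper, not part of eZ Find: turn a list of shard names
// into the 'shards' query parameter passed to Solr.
function buildShardParam( array $shardNames, array $iniShards )
{
    $shardURLs = array();
    foreach ( $shardNames as $name )
    {
        // Skip names with no configured address instead of emitting an empty entry.
        if ( isset( $iniShards[$name] ) )
        {
            $shardURLs[] = $iniShards[$name];
        }
    }
    // No known shards: return an empty array so array_merge() changes nothing.
    return $shardURLs ? array( 'shards' => implode( ',', $shardURLs ) ) : array();
}

// Assumed name => address map; in eZ Find this would come from the ini settings.
$iniShards = array(
    'SiteA' => 'localhost:8983/solr/siteA',
    'SiteB' => 'localhost:8983/solr/siteB',
);

// Stand-in for the parameters returned by the query builder.
$queryParams = array( 'q' => 'test', 'rows' => 10 );
$queryParams = array_merge( buildShardParam( array( 'SiteA', 'SiteB' ), $iniShards ), $queryParams );
```

Because the shard part comes first in array_merge(), any 'shards' key already present in the query parameters would win, which matches the order used in the hack.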

I will keep up my trial and error for a while, and get back with more info should I succeed.

/Nicklas

Modified on Tuesday 17 February 2015 4:33:16 pm by Nicklas Lundgren

Tuesday 17 February 2015 8:33:39 pm

Hello Nicklas,

I think we actually met (briefly) at the eZ Conference (long ago)!

In any case I'm happy to (try) to help as much as I can.

I'm just thinking out loud, filling time atm.

I think that you should not need to add a hack or change as you describe above to add the shards param into the query part IF you use the ini settings configuration so that this method can do what your hack does (as it truly is designed to do):

https://github.com/ezsystems/ezfind/blob/a044741d1c72d3e6754ef3cb757adbfb63501e12/search/plugins/ezsolr/ezsolr.php#L1522

Which is used later here:

https://github.com/ezsystems/ezfind/blob/a044741d1c72d3e6754ef3cb757adbfb63501e12/search/plugins/ezsolr/ezsolr.php#L962

This would mean that you would need the following settings in place:

  • ezfind.ini.append.php:[LanguageSearch]MultiCore=enabled
  • ezfind.ini.append.php:[LanguageSearch]SearchMainLanguageOnly=disabled
  • ezfind.ini.append.php:[LanguageSearch]LanguagesCoresMap[eng-GB]=eng-GB
  • ezfind.ini.append.php:[LanguageSearch]LanguagesCoresMap[eng-GB@siteB]=eng-GB@siteB
  • ezsolr.ini.append.php:[SolrBase]Shards[eng-GB]=http://localhost:8983/solr/siteA
  • ezsolr.ini.append.php:[SolrBase]Shards[eng-GB@siteB]=http://localhost:8983/solr/siteB
  • site.ini.append.php:[RegionalSettings]SiteLanguageList[]=eng-GB@siteB
  • site.ini.append.php:[RegionalSettings]SiteLanguageList[]=eng-GB
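
Laid out per override file, the same settings would look roughly like this. It is only a sketch of the list above: the three files are shown in one block for readability, the `#` lines are comments marking where each file starts, and the values are copied from the list, not tested.

```ini
# ezfind.ini.append.php
[LanguageSearch]
MultiCore=enabled
SearchMainLanguageOnly=disabled
LanguagesCoresMap[eng-GB]=eng-GB
LanguagesCoresMap[eng-GB@siteB]=eng-GB@siteB

# ezsolr.ini.append.php
[SolrBase]
Shards[eng-GB]=http://localhost:8983/solr/siteA
Shards[eng-GB@siteB]=http://localhost:8983/solr/siteB

# site.ini.append.php
[RegionalSettings]
SiteLanguageList[]=eng-GB@siteB
SiteLanguageList[]=eng-GB
```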

If you don't use the above ini configuration (order seems to matter btw, re: site.ini.append.php:[RegionalSettings]SiteLanguageList see $languages[0]) then the core/shard combination would not be used: https://github.com/ezsystems/ezfind/blob/master/search/plugins/ezsolr/ezsolr.php#L959

https://github.com/ezsystems/ezfind/blob/master/search/plugins/ezsolr/ezsolr.php#L957

If you follow the code execution you see that the above really is required by default in the ezsolr.php search plugin class. Another, more difficult aspect to follow / understand is that the implementation is written (in terms of the English words used for variable names) to focus on the multi-language implementation, so I think it's harder to read and to follow how it also works for distributed search across separate cores/shards.

Remember to clear all caches and I would even consider doing a clean re-indexing of the search data (at some point to test if that is required, not thinking it would be tho).

The 'initLanguageShards' method is key to all of this working, and it depends on the correct ini settings being in place before what you're trying to do will work with the default search plugin implementation (ezsolr in ezfind).

This all starts at the beginning in the constructor: https://github.com/ezsystems/ezfind/blob/master/search/plugins/ezsolr/ezsolr.php#L16

All this said, I'm not clear why you were not getting results with your hack alone.

BUT .. with my example configuration above it would actually force your external core to be used instead of the default which might help you get a focus on getting results (in testing) from it first before troubling with problems with priority of combined core results (unknown, again I don't know what I'm talking about here, I'm not paulb or kristof) I just read the code till my eyes hurt.

You could also enable debug settings so that you could fetch the entire solr raw query sent to the service and -might- be able to use the ezfind / solr backend to learn / refine this manually in search of a raw query that returns the results and transform that into code (as needed)... I dunno I've never done that before. https://github.com/ezsystems/ezfind/blob/master/search/plugins/ezsolr/ezsolr.php#L994
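
As a rough sketch of that manual approach, a raw distributed query URL can be assembled by hand and pasted into a browser or curl. Everything here is an assumption for illustration: buildDistributedSelectUrl() is a hypothetical helper, the host, port and core paths are the ones from this thread, and the query text is a placeholder. It relies only on Solr's standard /select handler with its `q` and `shards` parameters.

```php
<?php
// Sketch only: build a raw Solr select URL for testing a distributed query by hand.
// Not eZ Find code; the function name and all values are illustrative.
function buildDistributedSelectUrl( $baseUrl, $queryText, array $shards )
{
    $params = array(
        'q'      => $queryText,
        // Solr's distributed search parameter: comma-separated shard addresses.
        'shards' => implode( ',', $shards ),
    );
    // http_build_query() URL-encodes each value; '&' is passed explicitly
    // so the separator does not depend on the local php.ini.
    return rtrim( $baseUrl, '/' ) . '/select?' . http_build_query( $params, '', '&' );
}

echo buildDistributedSelectUrl(
    'http://localhost:8983/solr/siteA',
    'test',
    array( 'localhost:8983/solr/siteA', 'localhost:8983/solr/siteB' )
), "\n";
```

Running the same query with and without the second shard should show whether the site B core contributes any documents at all before any eZ Find filters are involved.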

I also wonder if you (doing what you're trying to do) technically need to add additional distributed_search parameters like searchfields or returnfields in order to get matching results returned.

But this is where the answers begin to become heavily buried into solr domain and I don't know if I can go that far at this time by myself.

Re: https://doc.ez.no/Extensions/eZ-Publish-extensions/eZ-Find/eZ-Find-LS-5.2.0/Advanced-Configuration/Distributed-Search-Features

I hope this helps!

Cheers,
Heath

Modified on Tuesday 17 February 2015 8:35:03 pm by // Heath

Tuesday 17 February 2015 9:37:05 pm

Hi again,

Yeah, I too am quite sure we have met, back in the old days! I remember meeting someone from Brookins, TX. Must have been you. I guess it was in Norway, right? Or maybe Germany? :)

And this is for sure one of the most amazing replies I have ever seen here! Thanks! 

I think you are on to something important, even though I guess that would prove that the code has some redundancy that should have been taken care of. I mean, the $shardQuery part of the code in ezfezpsolrquerybuilder.php can't be of any use?

And as you say, it's hard to follow the code and make the right settings when the purpose is distributed search. I wonder why no one has made the effort to document this properly.

But anyhow, I am eager to try your suggestion thoroughly and see where it leads.

Will take a few days, though, before I have the time.

Again - I really appreciate the help!

Wednesday 18 February 2015 12:27:40 pm

Hello Nicklas,

Yes :) If I remember correctly it was during the eZ Conference 2007 in Skien, Norway, just after the awards ceremony, on the bus on the way to the after party.

I'm happy to impress :) Though I agree with you that this entire area / use case really should be better documented and probably could have been implemented in a cleaner, more flexible and easier to understand way.

With regards to the $shardQuery part in ezfezpsolrquerybuilder.php, it absolutely looks like an incomplete implementation. It looks like they just got distracted in the middle of making the implementation and committed what they had at the moment without thinking. Very disappointing. Even more disappointing that it has been left this way for a very long time.

I think that once you get it working / find a way to get this working a clean up pull request might be in order or at least a blog post documenting how to setup and use this feature.

eZ Find was one of those extensions which started off as an advanced solution made for eZ Systems big customers, implemented by eZ Systems themselves, so the lack of documentation, while sad does not really surprise me.

Feel free to take your time in testing my suggestions. I wish you the very best.

I hope this helps!

Cheers,
Heath

Sunday 22 February 2015 4:59:05 pm

Hi again,

I have now tried to follow your advice above. But I am sorry to say, I have no luck.

I have set up a total of six cores.

  • siteA_swe-SE
  • siteA_eng-GB
  • siteA_deu-DE
  • siteB_swe-SE
  • siteB_eng-GB
  • siteB_deu-DE

I have done exactly as you suggested.

In the Solr Admin, I can see all six cores, and searching works well there.

For instance, I can select the core siteB_swe-SE and search for something. The search result is correct.

 

But now I want to search from site A's search box and get results from both siteA and siteB, and it doesn't work.

When searching in siteA from the website, I only get results from siteA.

However, I can see in the console output from Solr that all shards are used in the queries now. 

So you were right, Heath. Setting it up with LanguageShards worked so far.

 

However, I have one question for you.

You wrote that I should do like this:

  • ezfind.ini.append.php:[LanguageSearch]LanguagesCoresMap[eng-GB]=eng-GB
  • ezfind.ini.append.php:[LanguageSearch]LanguagesCoresMap[eng-GB@siteB]=eng-GB@siteB
  • ezsolr.ini.append.php:[SolrBase]Shards[eng-GB]=http://localhost:8983/solr/siteA
  • ezsolr.ini.append.php:[SolrBase]Shards[eng-GB@siteB]=http://localhost:8983/solr/siteB

I am a little confused here. In ezfind.ini.append.php, can I really name a language [eng-GB@siteB]? Don't I have to use the ordinary language codes? Is it possible to refer to a language like that?

Cheers

Nicklas

Sunday 22 February 2015 9:16:01 pm

Hello Nicklas,

I'm sorry if my suggestions did not work / help 100%.

With regards to the last question ... in the research I did for this forum thread question ...

I found that the 'LanguagesCoresMap' setting is only used to associate / include additional shards in a search.

I found that they were not used anywhere else within the limits of search (if you search they are used for other separate functions). Re: https://github.com/ezsystems/ezfi...mp;q=LanguagesCoresMap&type=Code

So yes, you really can name a LanguagesCoresMap setting index (language code) like eng-GB@siteB, as basically any string is supported here. I encourage you to use the language code naming convention I suggested, as it is both clear and verbose, and it is already used in other projects I've worked on that override default locales (not pertinent, but if a convention works, use it).

To be clear, the LanguagesCoresMap setting is not actually related in any way to actual content object languages or language codes, which I must admit can be confusing without proper usage instructions / documentation.

I hope this helps!

Cheers,
Heath
