All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 12 posts ] 
Author Message
 Post subject: Connecting Hibernate Search with Solr
Posted: Tue Aug 21, 2012 11:47 am
Newbie

Joined: Tue Aug 21, 2012 11:04 am
Posts: 2
Hi,

in an existing project (which uses Hibernate and Hibernate Search) I want to switch full-text search to Solr. As I want to reuse all the application code written for Hibernate Search, and ideally be able to switch back and forth between HS and Solr, I am considering implementing my own BackendQueueProcessor that forwards all indexing operations to Solr.

Is that the correct approach?


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Fri Aug 24, 2012 8:55 am
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

I am not so sure whether a custom BackendQueueProcessor is a good idea. We discussed a Solr / Search integration only recently on this forum and on the hibernate-dev mailing list. See https://forum.hibernate.org/viewtopic.php?f=9&t=1015540 and http://lists.jboss.org/pipermail/hibernate-dev/2012-May/008499.html

Obviously there are several ways to achieve an integration, but it will depend on your use case. What do you want to achieve and why? Why do you want to use Solr?

--Hardy


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Mon Aug 27, 2012 3:37 am
Newbie

Joined: Wed Sep 21, 2011 2:20 pm
Posts: 16
I am not Miroslav Stastny, but I, for example, am using bobobrowse in addition to HS. I think the faceting capabilities of HS are too limited for most real-world scenarios. And as faceting is fundamental for my application, I may have to get rid of HS at some point in the future.

But I would probably rather go for Elastic Search than for Solr. However, the best outcome would be for HS to get some serious improvements in this area, since apart from the faceting I really like HS. :-)


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Mon Aug 27, 2012 3:48 am
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

agreed, there is a lot of room for improvement regarding the faceting capabilities. We are definitely planning to improve on it. Much could be gained by switching to the faceting functionality offered by the latest Lucene releases. The current approach using a custom fieldcache could also be improved once we switch to Lucene 4 (which needs to be released first of course), because then the limitation to single-valued fields would be obsolete.

Bottom line, we want to improve in this area. It is just a matter of time. If you have some spare time, we are always looking for a helping hand, which makes the plans move along faster :-)

--Hardy


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Mon Aug 27, 2012 5:03 am
Newbie

Joined: Tue Aug 21, 2012 11:04 am
Posts: 2
hardy.ferentschik wrote:
Obviously there are several ways to achieve an integration, but it will depend on your use case. What do you want to achieve and why? Why do you want to use Solr?


We need to de-embed the searching functionality from the application. We expect the search indices to be really huge so we will need to shard, replicate, etc. Our Solr implementation is for now a pure evaluation. We will also try Elastic Search.

The proposed BackendQueueProcessor would allow us to easily switch between Hibernate Search and Solr without the need to change the code.

Miroslav


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Mon Aug 27, 2012 5:33 am
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Miroslav Stastny wrote:
We need to de-embed the searching functionality from the application. We expect the search indices to be really huge so we will need to shard, replicate, etc. Our Solr implementation is for now a pure evaluation. We will also try Elastic Search.

What is "really huge"? You are actually able to shard indexes in Search as well, and you can also have a master/slave setup. I guess my concern is the mixing of technologies. I think you will probably create more issues than you solve. Before I started mixing technologies I would try to use a single one, be it Solr, Elastic Search or Search.

Miroslav Stastny wrote:
The proposed BackendQueueProcessor would allow us to easily switch between Hibernate Search and Solr without the need to change the code.

How? And what is your exact goal with the custom BackendQueueProcessor? This processor is used for indexing. You still have the whole search side of things. What is the plan there? If you want to run Solr searches independently of Search, why not work with the Lucene index directly? For me that is the one common point between Search and Solr. Search does not do anything special to the Lucene index. Provided you create the right Solr schema, the index should be usable by Solr as well.

--Hardy


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Mon Aug 27, 2012 1:05 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
We need to de-embed the searching functionality from the application. We expect the search indices to be really huge so we will need to shard, replicate, etc. Our Solr implementation is for now a pure evaluation. We will also try Elastic Search.


Both Solr and Elastic Search are "de-embedded", but you don't need that to get sharing and replication: the Infinispan integration module can support that when paired with sharding with a multi-master configuration.
We would like to integrate both Elastic Search and Solr for the indexing+searching backend, but as Hardy said we'd need a hand with it. Adding the backend is easy and solves the indexing aspect; the distributed search aspect is a bit more complex *if* you expect to do it via the usual API (it would be easier to fall back to their native clients for searching, which could work as an intermediate step in this larger work).

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Thu Aug 30, 2012 2:36 pm
Newbie

Joined: Thu Apr 16, 2009 12:16 pm
Posts: 3
Location: Ottawa CND
I would like to see this kind of integration between Solr and HS. My motivation is based on the architecture of the application we have developed. We have a tool used internally that is a fat client. It makes heavy use of Hibernate and the JPA spec to provide some cool analysis capabilities over several databases (there are a lot of man-hours invested in this tool). As a result of this architecture I cannot see having each of the client machines mounting the directory where the indexes are stored. Also, our indexes are very large (billions of field values). So using the DirectoryProvider is not a strong enough abstraction in our case. We want to delegate all Lucene lookups to Solr and in the future SolrCloud.

The issue that I see with HS is that the architecture assumes that you will have access to the index files and can make use of the Lucene libraries. This is not possible in our architecture, as mentioned above. But I am wondering if there are hooks that we can use in the HS space to force queries off to Solr. If not, can you point me in a direction for further investigation? This integration between hibernate client -> Solr server seems more in line with the hibernate client -> database server model that is the foundation of Hibernate. Also, once SolrCloud is released you can benefit from the added flexibility and performance that indexing in the cloud can provide!

Thank you,
Alex


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Thu Aug 30, 2012 3:46 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi Alex,
there isn't a single hook to provide Solr integration; but if you're willing to help coding and beta test it we can give this a strong push forward.

I think that there are 3 relatively independent aspects to address, not all are mandatory to get it working, depending on your needs.

- Schema synchronization
From the entity mapping, annotations and configuration properties (for example the chosen analyzers) it should generate a compatible Solr schema.
This should be automated, as it would be tricky to get it right manually, but of course it's totally possible to make a first version in which we expect people to configure Solr "correctly" whatever that means.

- Index updates
This one should be relatively easy as we already did a good deal of abstraction. Technically, one should implement an org.hibernate.search.indexes.spi.IndexManager.
The role of such an implementation is "a way to forward index change elements to an indexing engine", so basically one has to adapt the incoming Hibernate Search operations, transform and forward them to a solrj client, and read some configuration properties to connect to the correct Solr server.
I don't think this is too hard at all, and could be considered optional if you are using Lucene "read only".

- Performing searches / queries
This is a bit harder as the internal component handling queries - the org.hibernate.search.query.engine.spi.HSQuery - would need some refactoring as it's not completely abstracting the notion of IndexReader / IndexSearcher, which are specific to a direct reference to the Lucene engine; of course I'd be glad to help in that area.
For your use case, you could consider this optional I guess, if you're fine with running queries only via the solrj client directly from your application's code, and take advantage of the index updates.
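
As a rough illustration of the "index updates" aspect above, the following sketch shows the general shape of adapting queued index changes and forwarding them to a remote indexing client. It is plain Java under stated assumptions: IndexChange, RemoteIndexGateway, ForwardingBackend and InMemoryGateway are hypothetical names invented for this example and are not part of the Hibernate Search SPI or of SolrJ. A real implementation would sit behind the IndexManager contract and delegate to an actual Solr client instead of the in-memory stand-in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are part of Hibernate Search or SolrJ.
class IndexForwardingSketch {
    public static void main(String[] args) {
        InMemoryGateway gateway = new InMemoryGateway();
        ForwardingBackend backend = new ForwardingBackend(gateway);

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "Connecting Hibernate Search with Solr");

        List<IndexChange> queue = new ArrayList<>();
        queue.add(IndexChange.addOrUpdate("Topic", "1", fields));
        queue.add(IndexChange.addOrUpdate("Topic", "2", fields));
        queue.add(IndexChange.delete("Topic", "1"));

        backend.apply(queue);
        System.out.println(gateway.documents.keySet() + " commits=" + gateway.commits);
    }
}

/** One pending index change (hypothetical model of what a backend receives). */
final class IndexChange {
    enum Kind { ADD_OR_UPDATE, DELETE }

    final Kind kind;
    final String entityName;
    final String id;
    final Map<String, String> fields;

    private IndexChange(Kind kind, String entityName, String id, Map<String, String> fields) {
        this.kind = kind;
        this.entityName = entityName;
        this.id = id;
        this.fields = fields;
    }

    static IndexChange addOrUpdate(String entityName, String id, Map<String, String> fields) {
        // Copy the fields so later changes by the caller do not leak into the queue.
        return new IndexChange(Kind.ADD_OR_UPDATE, entityName, id, new LinkedHashMap<>(fields));
    }

    static IndexChange delete(String entityName, String id) {
        return new IndexChange(Kind.DELETE, entityName, id, Collections.<String, String>emptyMap());
    }
}

/** Stand-in for a remote indexing client (hypothetical interface). */
interface RemoteIndexGateway {
    void addOrReplace(String documentKey, Map<String, String> fields);
    void deleteByKey(String documentKey);
    void commit();
}

/** Translates queued changes into gateway calls, then commits once per batch. */
final class ForwardingBackend {
    private final RemoteIndexGateway gateway;

    ForwardingBackend(RemoteIndexGateway gateway) {
        this.gateway = gateway;
    }

    void apply(List<IndexChange> changes) {
        for (IndexChange change : changes) {
            // Entity name plus id gives a key that is unique across entity types.
            String key = change.entityName + "#" + change.id;
            switch (change.kind) {
                case ADD_OR_UPDATE:
                    gateway.addOrReplace(key, change.fields);
                    break;
                case DELETE:
                    gateway.deleteByKey(key);
                    break;
                default:
                    throw new IllegalStateException("Unknown change kind: " + change.kind);
            }
        }
        gateway.commit();
    }
}

/** In-memory gateway used only to make the sketch runnable without a server. */
final class InMemoryGateway implements RemoteIndexGateway {
    final Map<String, Map<String, String>> documents = new LinkedHashMap<>();
    int commits;

    @Override
    public void addOrReplace(String documentKey, Map<String, String> fields) {
        documents.put(documentKey, fields);
    }

    @Override
    public void deleteByKey(String documentKey) {
        documents.remove(documentKey);
    }

    @Override
    public void commit() {
        commits++;
    }
}
```

Committing once per batch rather than once per change is a design choice of this sketch; whether that suits a given Solr setup depends on its commit configuration.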

That's all it takes; it would be very cool and I'm glad to help if you want to try it.
Starting point: https://community.jboss.org/wiki/ContributingToHibernateSearch
but feel free to ask! Either here, on the mailing list or on IRC
http://hibernate.org/community/irc.html

On a totally different note, I think Search *could* address your needs without any code changes with this different setup:
- you need a central server which is considered the reference index, the "master" node (like you would with Solr)
- each fat client connects to the master node with either JGroups or the JMS backend, to delegate writes to the master node
- each fat client has a copy of the index which it "downloads" on demand using either a network shared filesystem or the Infinispan Directory

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Fri Aug 31, 2012 4:11 am
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Right, I agree w/ Sanne. I would at least investigate the master/slave solution. It has the benefit of working w/ already existing components.


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Fri Aug 31, 2012 11:19 am
Newbie

Joined: Thu Apr 16, 2009 12:16 pm
Posts: 3
Location: Ottawa CND
Hi guys,

I will do some investigation and see if that is the direction that we want to go. I will try to remember to follow up on this forum. The other choice is to bake something generic to meet our needs in our product. Of course it is somewhat selfish but time constraints may force this solution.

The reason that we cannot do the master/slave solution is that the index has already grown to 60GB, with an intended size of somewhere around 200GB and potentially a lot more. This is the reason that I am already investigating SolrCloud and Elastic Search. So realistically, having each client copy the index is simply impossible.

Thank you for your feedback... off to go do some reading!
Alex


 
 Post subject: Re: Connecting Hibernate Search with Solr
Posted: Sun Sep 02, 2012 4:52 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Such sizes aren't easily digested by any of the tools you listed; although they all can manage it, you're at a level at which it should be worth spending some time to develop the most suitable components for your specific need.

I don't think the tasks I defined above would take an unreasonable amount of time to implement; still, I've got a different new idea to propose as well:

you could use the master/slave configuration to delegate the write operations from the fat clients to the central server, and develop a new component to run the queries remotely too, on the same remote server: so this server would be the only one needing a copy of the index (and maybe a hot standby for failover).
This would also let this server reuse its search caches.

You could make your own ad-hoc search reusing the JMS setup, or set up a REST API with something quick and easy such as RESTEasy; you would send the Lucene Query, serialized [1], together with some pagination information (like "results from 50 to 100"), and return the primary keys of the matches; then the fat client loads the list of matches. This query remoting step - which is all you need - could be contributed and then maintained by us, as we need to abstract the Query execution anyway to support Solr or ElasticSearch - and I'm actually liking this suggestion, so it could be a fourth option.

[1] we expose a serialization and a deserialization service for Lucene queries; since Apache Lucene dropped support for "Serializable" components, we implemented one using Apache Avro.
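
The query remoting exchange described above (serialized query plus pagination in, primary keys out) could be modelled roughly as follows. This is only a sketch under stated assumptions: RemoteQueryRequest, RemoteQueryResponse, SearchServer and FixedResultServer are hypothetical names, the serialized query is treated as opaque bytes rather than produced by the actual serialization service, and the transport (JMS or REST) is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these types are illustrative and not a Hibernate Search API.
class QueryRemotingSketch {
    public static void main(String[] args) {
        // Pretend the central server's index matched these keys, in relevance order.
        List<Long> matches = new ArrayList<>();
        for (long key = 1; key <= 120; key++) {
            matches.add(key);
        }
        SearchServer server = new FixedResultServer(matches);

        // "Results from 50 to 100": skip 50 matches, return at most 50.
        RemoteQueryRequest request = new RemoteQueryRequest(new byte[0], 50, 50);
        RemoteQueryResponse response = server.execute(request);

        // The fat client would now load these entities by primary key.
        System.out.println(response.totalHits + " hits, page size " + response.primaryKeys.size());
    }
}

/** What the fat client sends: an opaque serialized query plus pagination. */
final class RemoteQueryRequest {
    final byte[] serializedQuery;
    final int firstResult;
    final int maxResults;

    RemoteQueryRequest(byte[] serializedQuery, int firstResult, int maxResults) {
        if (firstResult < 0 || maxResults < 0) {
            throw new IllegalArgumentException("pagination values must not be negative");
        }
        this.serializedQuery = serializedQuery.clone();
        this.firstResult = firstResult;
        this.maxResults = maxResults;
    }
}

/** What the server returns: the total hit count and one page of primary keys. */
final class RemoteQueryResponse {
    final int totalHits;
    final List<Long> primaryKeys;

    RemoteQueryResponse(int totalHits, List<Long> primaryKeys) {
        this.totalHits = totalHits;
        this.primaryKeys = primaryKeys;
    }
}

/** Server-side contract; a real one would deserialize and run the query. */
interface SearchServer {
    RemoteQueryResponse execute(RemoteQueryRequest request);
}

/** Stand-in server that ignores the query and pages over a fixed match list. */
final class FixedResultServer implements SearchServer {
    private final List<Long> allMatches;

    FixedResultServer(List<Long> allMatches) {
        this.allMatches = new ArrayList<>(allMatches);
    }

    @Override
    public RemoteQueryResponse execute(RemoteQueryRequest request) {
        int size = allMatches.size();
        int from = Math.min(request.firstResult, size);
        // Use long arithmetic so a very large maxResults cannot overflow.
        int to = (int) Math.min((long) from + request.maxResults, (long) size);
        return new RemoteQueryResponse(size, new ArrayList<>(allMatches.subList(from, to)));
    }
}
```

Returning only primary keys keeps the response small; the client then loads the matching entities from the database through its existing Hibernate session.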

_________________
Sanne
http://in.relation.to/


 
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.