
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 6 posts ] 
 Post subject: Is it possible to specify per-entity backend strategy in HS?
PostPosted: Thu Aug 11, 2011 6:54 pm 
Newbie

Joined: Mon Jan 31, 2005 10:01 pm
Posts: 3
Location: Toronto, Canada
We are at a crossroads over how to cluster Hibernate Search (HS) indexing. From what I understand, there are only two realistic approaches: a shared SAN or JMS. We have no problem going the JMS route except for one thorny issue: one of the entities literally needs to be indexed in real time. The algorithm is the following:

1. We index a set of entities.
2. We run Lucene queries against them.
3. We purge the index of those entities.

This is one serial operation. Up until now I thought I could force this entity to be indexed differently (not through JMS but through standard Lucene). But now, looking through the Hibernate Search code (3.4.0), this approach looks like a non-starter, because FullTextSession#flushToIndexes relies on the BatchedQueueingProcessor, which cannot be specified on a per-entity basis.

So, I would like you either to confirm this or, perhaps, suggest a workaround. Also, if the answer is negative, would Hibernate Search 4.0 address this problem? It seems you're trying to do that: http://relation.to/20991.lace. If not, let me know if you want a Jira task.

I really appreciate your time and answer.


 
 Post subject: Re: Is it possible to specify per-entity backend strategy in HS?
PostPosted: Thu Aug 11, 2011 7:19 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Yes, I'm aware of the limitation, and it's going to be addressed in version 4.0. In fact it already has been: if you check out the current master branch you should be able to try it out. Alas, I haven't documented it yet, but basically all you have to do is provide different backend configurations "scoped" under the index name. So where it was:
Code:
hibernate.search.[backendoptions]

now all properties are relative to the index name they apply to:
Code:
hibernate.search.default.[backendoptions] //all indexes
hibernate.search.[indexName].[backendoptions] //overrides per index
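
To make the scoping rule concrete, here is a minimal Java sketch of the lookup it implies: an index-specific key overrides the `default` scope. This is only an illustration of the fallback idea, not Hibernate Search's actual resolution code. The class `ScopedConfig`, the index name `realtimeIndex`, and the use of `worker.backend` with the values `jms` and `lucene` as the scoped option are assumptions made for the example, since the 4.0 property names were not yet documented.

```java
import java.util.Properties;

public class ScopedConfig {
    private static final String PREFIX = "hibernate.search.";

    // Look up an option for one index: the index-specific key wins,
    // otherwise fall back to the "default" scope shared by all indexes.
    public static String resolve(Properties props, String indexName, String option) {
        String specific = props.getProperty(PREFIX + indexName + "." + option);
        if (specific != null) {
            return specific;
        }
        return props.getProperty(PREFIX + "default." + option);
    }

    // Hypothetical configuration: JMS for every index except one,
    // which writes directly through Lucene. Names and values are assumed.
    public static Properties sample() {
        Properties props = new Properties();
        props.setProperty("hibernate.search.default.worker.backend", "jms");
        props.setProperty("hibernate.search.realtimeIndex.worker.backend", "lucene");
        return props;
    }

    public static void main(String[] args) {
        Properties props = sample();
        System.out.println(resolve(props, "realtimeIndex", "worker.backend"));
        System.out.println(resolve(props, "otherIndex", "worker.backend"));
    }
}
```

With a layout like this, only the entity mapped to the overriding index would bypass JMS; every other index keeps the shared default.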

If you can't wait for Hibernate Search 4.0, there is a third alternative with Hibernate Search 3.4: use Infinispan as the clustering engine. It would still require a JMS backend to write the changes, but you can set up the backend and the JMS queue in synchronous mode, and as soon as the update is delivered all other nodes will be able to query it.

The main characteristic of using Infinispan as a DirectoryProvider is that state is transferred to all nodes in real time.

Another option which might be viable, depending on your model, is to use a different Hibernate instance for this specific entity.

In any case I would not recommend the SAN approach. That's going to open a can of worms: file locking *is* an issue with Lucene on distributed file systems.

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: Is it possible to specify per-entity backend strategy in HS?
PostPosted: Thu Aug 11, 2011 7:45 pm 
Newbie

Joined: Mon Jan 31, 2005 10:01 pm
Posts: 3
Location: Toronto, Canada
Hmm... I am willing to give HS 4.0 a shot even in its alpha state. Let me do an upgrade and see where it leads us. I had not thought of Infinispan. I suspect that, since it is still JMS-based, index propagation is not in fact real-time (meaning I cannot do a manual index and query it immediately on the same thread), but probably near real-time? Using a second Hibernate session factory is another intriguing option. I am going to give HS 4.0 a try first.

I really value your feedback on clustered SAN. Our systems guys feel very uneasy about this option. I guess file locking problems there are not specific to Lucene only.


 
 Post subject: Re: Is it possible to specify per-entity backend strategy in HS?
PostPosted: Thu Aug 11, 2011 7:56 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
I had not thought of Infinispan. I suspect that, since it is still JMS-based, index propagation is not in fact real-time (meaning I cannot do a manual index and query it immediately on the same thread), but probably near real-time?

No, it's truly real-time, provided, as I said, that you don't index asynchronously and that you configure JMS to send messages synchronously as well. This of course has an impact on write latency, but not on throughput, as it's all done in parallel, and you have a strong guarantee that after a commit has completed you'll see the values in the next transaction (even with zero milliseconds between the two).
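
The difference between the two modes can be sketched with a small self-contained Java simulation. It uses no Hibernate Search or JMS API: the class `SyncVisibility`, the in-memory map standing in for the index, and the single-threaded executor standing in for the remote writer are all hypothetical stand-ins chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SyncVisibility {
    // Stand-in for the shared index (a real deployment would use Lucene).
    private final Map<String, String> index = new ConcurrentHashMap<>();
    // Stand-in for the writer that consumes queued changes.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Synchronous mode: returns only after the writer has applied the change,
    // so the caller pays the write latency but can query immediately.
    public void commitSync(String id, String text) {
        Future<?> applied = writer.submit(() -> { index.put(id, text); });
        try {
            applied.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the write", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("write failed", e.getCause());
        }
    }

    // Asynchronous mode: returns at once; the change may not be visible yet.
    public Future<?> commitAsync(String id, String text) {
        return writer.submit(() -> { index.put(id, text); });
    }

    public boolean isVisible(String id) {
        return index.containsKey(id);
    }

    public void shutdown() {
        writer.shutdown();
    }

    public static void main(String[] args) {
        SyncVisibility demo = new SyncVisibility();
        demo.commitSync("1", "indexed in sync mode");
        System.out.println("visible after sync commit: " + demo.isVisible("1"));
        demo.commitAsync("2", "indexed in async mode");
        System.out.println("visible right after async commit: " + demo.isVisible("2"));
        demo.shutdown();
    }
}
```

In the synchronous case the entry is always visible when the call returns; in the asynchronous case the second line may print either value, which is the "near real-time" behaviour the question was worried about.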

Quote:
I really value your feedback on clustered SAN. Our systems guys feel very uneasy about this option. I guess file locking problems there are not specific to Lucene only.

Any network file system can have locking issues, but Lucene in particular relies heavily on proper POSIX file locking: it might delete files that are still being used by queries (or worse, by resource pools shared by many IndexReaders). This works fine on local mounts, since the file handles are not invalidated until the last handle is closed, but it doesn't work at all on remotely mounted file systems. You can work around it in several ways, basically by prohibiting Lucene from deleting segments or by postponing the cleanup, but it's a setup I wouldn't recommend. Not least, it's much slower to perform queries.
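
The local-mount behaviour described above can be observed with a short Java sketch. It assumes a local POSIX filesystem for the temp directory; the class and method names are hypothetical, and the temp file merely plays the role of an index segment. On a remotely mounted filesystem the read after the delete may fail instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteWhileOpen {
    // Writes a file, opens it, deletes it while the handle is still open,
    // and returns whatever can still be read through that handle.
    public static String readAfterDelete(String content) {
        try {
            Path segment = Files.createTempFile("segment", ".tmp");
            Files.write(segment, content.getBytes(StandardCharsets.UTF_8));
            try (InputStream in = Files.newInputStream(segment)) {
                // Plays the role of an index writer removing an old segment
                // while a reader still holds it open.
                Files.delete(segment);
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterDelete("segment data"));
    }
}
```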

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: Is it possible to specify per-entity backend strategy in HS?
PostPosted: Thu Aug 11, 2011 8:13 pm 
Newbie

Joined: Mon Jan 31, 2005 10:01 pm
Posts: 3
Location: Toronto, Canada
Interesting... it is my understanding that properly mounted fibre-channel SANs are treated by the OS as virtually normal file systems with proper lock distribution. I understand that the locking problem is obvious on NFS mounts... Please confirm that it pertains to fibre-channel SANs too.


 
 Post subject: Re: Is it possible to specify per-entity backend strategy in HS?
PostPosted: Fri Aug 12, 2011 4:45 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
I'm not saying that it won't work if properly configured, and I'm not an expert on distributed file systems.
What I have learned so far is that a SAN works at the block level and does not itself provide the file locks Lucene needs; these are provided by the filesystem you use to mount it. So much depends on how you configure it.

You could use NFS, which is NOT going to work properly with Lucene, or you could use GFS, which has been reported to work with Lucene successfully (http://en.wikipedia.org/wiki/Global_File_System). You should be able to find some resources on Lucene's mailing lists. Don't trust people's experience reports with Solr: Solr indexes are unlikely to see real-time change/search iterations, so issues are less likely to be noticed, but they are still possible. I've seen a Solr deployment in which they strongly wanted to use NFS on a SAN. I warned them to test carefully, as I'm doing now; they ran stress tests for three weeks certifying that locking was fine, and then in production it collapsed within an hour. Clearly, I would be careful.

_________________
Sanne
http://in.relation.to/


 
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.