-->
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 3 posts ] 
Author Message
 Post subject: How search ensures index is in sync with the transaction
Posted: Thu Jun 06, 2013 1:43 am
Newbie

Joined: Tue May 28, 2013 8:00 am
Posts: 3
Hi,

In the master/slave pattern (using JMS or JGroups), how does Hibernate Search ensure that indexing has completed by the time the transaction completes? From what I understand, after the transaction succeeds, the slaves send the index update request to the master. But there will be a delay while the master builds the index and propagates the changes back to the slaves. If a user runs a search during that window, they won't get the expected result. How do we solve this issue?

FYI, I've set up Hibernate Search in a master/slave configuration (tried both JMS and JGroups) using Infinispan as the directory. Everything works, and index building and propagation are fast under low load. But when the application is under moderate to high load, which will be the case for most apps in a production environment, I see a visible delay of up to 10 seconds with the JMS backend and 4 to 5 seconds with the JGroups backend.

I'm actually evaluating Hibernate Search for our application, which has heavy reads and writes: reads outnumber writes, but that doesn't mean writes are infrequent :)

Hibernate (Search and Core) version - 4.2.0.Final
Infinispan version - 5.2.6.Final
JGroups version - 3.2.7.Final


 Post subject: Re: How search ensures index is in sync with the transaction
Posted: Sat Jun 08, 2013 11:08 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
there are two main aspects to configure in a multi-node setup: the Directory and the Backend.

JMS and JGroups are both backends, and both can be configured in asynchronous or synchronous mode (although the details vary by technology). If your application requires a committed transaction to affect queries immediately, you need sync mode for the backend AND a Directory which provides instant replication.

For Directory options, besides the default FSDirectory, which is not suited for multi-node setups, you can either use the filesystem-master and filesystem-slave combination or use the Infinispan one. Only Infinispan is able to provide immediate replication.

If your application requires immediate replication you should verify that the backend is configured to be synchronous; for JGroups that means you would need Hibernate Search 4.3.0.Final, as 4.2.0.Final could only use JGroups in asynchronous mode.
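
As a rough illustration of the combination described above (Infinispan Directory plus a synchronous backend), the settings could be assembled like this. The property keys follow the Hibernate Search 4.x naming scheme (`hibernate.search.default.*`) and should be checked against the reference documentation for the exact version in use; the class and method names are made up for this sketch.

```java
import java.util.Properties;

public class SyncSlaveConfigSketch {

    // Settings for a slave node that wants committed changes to be visible
    // to queries right away. All keys are assumptions to verify against the
    // Hibernate Search reference documentation for your version.
    static Properties slaveSettings() {
        Properties p = new Properties();
        // Directory that replicates index changes across nodes.
        p.setProperty("hibernate.search.default.directory_provider", "infinispan");
        // Forward index work to the master over JGroups.
        p.setProperty("hibernate.search.default.worker.backend", "jgroupsSlave");
        // Process index work synchronously rather than in the background.
        p.setProperty("hibernate.search.default.worker.execution", "sync");
        return p;
    }

    public static void main(String[] args) {
        // Print the settings in key = value form, e.g. to paste into a properties file.
        slaveSettings().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The same keys can of course go straight into hibernate.properties or persistence.xml; building them in code is only a convenient way to show them together.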

It seems you already configured Infinispan for the Directory - correct as that's a requirement - but make sure you find an appropriate tuning for the Infinispan (and your network) configuration first before trying multiple nodes, as Infinispan can be slower than a plain FSDirectory when not properly configured.

Also I'm not sure what your definition of "high load" would be; consider that beyond a certain point tuning won't be enough and you will need to enable sharding. When enabling sharding it's good to remember that each shard can use a different master.
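
To make the "each shard can use a different master" point concrete, here is a hedged sketch that generates per-shard settings for one node. `sharding_strategy.nbr_of_shards` and the `<indexName>.<shardNumber>.<property>` form are the 4.x conventions as far as I know; applying `worker.backend` per shard in this way is an assumption to verify, and the index name "Books" is hypothetical.

```java
import java.util.Properties;

public class ShardedIndexConfigSketch {

    // Builds settings for one node: it acts as master for exactly one shard
    // (masterShard) and as slave for all the others.
    static Properties settingsFor(String indexName, int shards, int masterShard) {
        Properties p = new Properties();
        String prefix = "hibernate.search." + indexName + ".";
        p.setProperty(prefix + "sharding_strategy.nbr_of_shards", Integer.toString(shards));
        for (int shard = 0; shard < shards; shard++) {
            // Assumed per-shard override of the backend; verify in the docs.
            String backend = (shard == masterShard) ? "jgroupsMaster" : "jgroupsSlave";
            p.setProperty(prefix + shard + ".worker.backend", backend);
        }
        return p;
    }

    public static void main(String[] args) {
        // Node A masters shard 0, node B masters shard 1.
        System.out.println("node A: " + settingsFor("Books", 2, 0));
        System.out.println("node B: " + settingsFor("Books", 2, 1));
    }
}
```

Spreading the master role this way means index writes for different shards land on different nodes instead of queuing up on a single master.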

Finally, a tip: synchronous operations are significantly slower than asynchronous ones, because asynchronous processing can take advantage of batching. Ideally you want everything async, but even if that is not possible you might be able to identify some indexes for which async is acceptable.
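
A small sketch of mixing modes per index: settings under `hibernate.search.default.` act as the fallback and an index-specific key overrides them. The lookup helper below only imitates that fallback rule for illustration and is not Hibernate Search code; the index names "Orders" and "AuditLog" are hypothetical, and the `worker.execution` key should be verified for your version.

```java
import java.util.Properties;

public class ExecutionModeSketch {

    // Imitates the fallback rule: an index-specific value wins,
    // otherwise the "default" value applies (assumed to be sync if unset).
    static String executionFor(Properties p, String indexName) {
        String specific = p.getProperty("hibernate.search." + indexName + ".worker.execution");
        if (specific != null) {
            return specific;
        }
        return p.getProperty("hibernate.search.default.worker.execution", "sync");
    }

    static Properties mixedSettings() {
        Properties p = new Properties();
        // Most indexes must reflect a commit immediately.
        p.setProperty("hibernate.search.default.worker.execution", "sync");
        // This one tolerates a short delay, so it can benefit from batching.
        p.setProperty("hibernate.search.AuditLog.worker.execution", "async");
        return p;
    }

    public static void main(String[] args) {
        Properties p = mixedSettings();
        System.out.println("Orders: " + executionFor(p, "Orders"));
        System.out.println("AuditLog: " + executionFor(p, "AuditLog"));
    }
}
```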

Hibernate Search 4.3.0.Final will be released next week, and while Infinispan 5.2 is still the reference, it's also known to work fine with the upcoming Infinispan 5.3, which could be worth a shot in your case.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: How search ensures index is in sync with the transaction
Posted: Wed Jun 12, 2013 6:29 am
Newbie

Joined: Tue May 28, 2013 8:00 am
Posts: 3
Thanks Sanne for the detailed explanation. Please see my comments below.

sanne.grinovero wrote:
If your application requires immediate replication you should verify that the backend is configured to be synchronous; for JGroups that means you would need Hibernate Search 4.3.0.Final, as 4.2.0.Final could only use JGroups in asynchronous mode.

Ok. Will wait for HS 4.3.0.Final

Quote:
It seems you already configured Infinispan for the Directory - correct as that's a requirement - but make sure you find an appropriate tuning for the Infinispan (and your network) configuration first before trying multiple nodes, as Infinispan can be slower than a plain FSDirectory when not properly configured.

I have tried the different tuning options mentioned in the Hibernate Search and Infinispan docs. I increased the OS buffer size and ran the performance tests that come bundled with JGroups, which showed that the network can handle large messages. When I increased the chunk size from the default 16 KB to, say, 7 MB and restricted the generated segment size to under 7 MB, I got a "Read past EOF" exception during the initial state transfer using replication. I noticed that the exception was thrown for a particular entity, X, which is annotated with @IndexedEmbedded inside entity Y, but X itself was not included when running the MassIndexer. When I also included X in the MassIndexer run, the problem went away. Next, the initial state transfer was taking a lot of time even though I had configured the chunk size and chunk batch size according to our network settings. Finally, I settled on a chunk size of 512 KB, no segment size restriction, and a chunk batch size of 100, and now it's working fine. Any input from your end would also be helpful.
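
For reference, the final chunk size above could be expressed as follows. This is only a sketch: it assumes the Infinispan directory reads its chunk size from `hibernate.search.default.chunk_size` and that the value is given in bytes, both of which should be checked against the docs. The chunk batch size is left out because the post does not name the property it was set through.

```java
import java.util.Properties;

public class ChunkSizeSketch {

    static final int BYTES_PER_KB = 1024;

    // Converts a chunk size in KB to the byte value the setting is assumed to expect.
    static Properties withChunkSize(int kilobytes) {
        Properties p = new Properties();
        p.setProperty("hibernate.search.default.directory_provider", "infinispan");
        p.setProperty("hibernate.search.default.chunk_size",
                Integer.toString(kilobytes * BYTES_PER_KB));
        return p;
    }

    public static void main(String[] args) {
        // 512 KB, the value that ended up working in the setup described above.
        withChunkSize(512).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```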

Quote:
Also I'm not sure what your definition of "high load" would be; consider that beyond a certain point tuning won't be enough and you will need to enable sharding. When enabling sharding it's good to remember that each shard can use a different master.

I haven't tried index sharding yet. Will give it a shot


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.