
All times are UTC - 5 hours [ DST ]



 Post subject: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 03, 2011 6:09 am 
Newbie

Joined: Wed May 31, 2006 2:34 pm
Posts: 9
Hi,

we are using Hibernate Search successfully on a production system. The setup is a high-availability cluster (Linux-HA) with two servers. At any time one instance is the master, but the application runs on both systems. Each application has its own Lucene directory on its own server. The database is replicated. If the active server fails for whatever reason, Heartbeat switches over to the hot standby server. This works great for the application and the database.

But here comes the problem: As every application instance has its own Lucene directory, the Lucene index isn't up to date after switching.

What is the preferred Lucene / Hibernate Search configuration for a hot standby setup like ours, with instant failover capabilities?

I read about clustering and the options with FSMasterDirectoryProvider and the JMS solution as well. But I don't think they fit my situation, because the update interval is very long (default 60 minutes). The application must see index changes almost instantly, so syncing every 5 minutes or so would not be a solution either: data would still be lost if the master switch occurred after, let's say, 4 minutes.
In addition, the described solutions almost always assume that a single instance is the index master and the slaves only read the index.

In our case, we want to be able to switch the application to let the hot standby application become the new master.

Some questions, which came to my mind on this topic:
  • Is it possible to synchronize the Lucene directory from the master server to the slave server via a hardware synchronization solution?
  • Does Hibernate Search/Lucene recognize changes made to the Lucene FS directory while the application is running? Or is information additionally held in memory?
  • Is it possible to force an FSMasterDirectoryProvider to synchronize from sourceBase upon becoming the active server?
  • Can I assume that the Lucene directory is NOT touched by Lucene if there are no interactions with the database?
  • Is it possible to let two applications work on the same Lucene directory if parallel access is precluded by the HA setup? After a switch, would the new master pick up the changes the previous master made to the Lucene directory before the switch?

Thank you in advance for your help.


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 03, 2011 7:54 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

if you want to have this fail-over functionality you really need to keep the Lucene indexes in sync as well. I see several options here:


The latter is quite new, but I think it suits your use case.

--Hardy


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 03, 2011 10:38 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Yes, I'd use the Infinispan Directory; you can set it up to store the index write-through to the same database you're using for Hibernate.

Master/slave is not an option, as it always requires at least two machines running, and as you say there are delays in synchronizing the index.

Shared location approaches (Samba/NFS/other network disk) are strongly discouraged by the Lucene dev team; it might appear to work, but sooner or later you'll end up with a corrupt index because of some locking issue or open file descriptor leakage.

When using Infinispan you'll even get better performance, provided you can allocate enough memory to the JVM to keep the index in memory.
And after you've set up Infinispan for index replication, you might as well use it as second-level cache for Hibernate ;)

_________________
Sanne
http://in.relation.to/


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 03, 2011 11:31 am 
Newbie

Joined: Wed May 31, 2006 2:34 pm
Posts: 9
Hi,

thank you very much for your answers. I will look into the Infinispan solution.

I heard about this Shared-Location problem. But I thought of replicating the directory via DRDB. Have you made some experiences with this? In this setup we always have two servers running and only one is the master, which means only one instance at a time performs write operations on the database/index. In this case, isn't a shared location a simple solution?

One thing I don't understand is whether Hibernate Search / Lucene always opens the files in the directory when performing a search or an update, or whether HS/Lucene stores some values in memory and only synchronizes its internal state with the file system from time to time?

To put it in a nutshell: If only one instance at a time is able to write to the index, the shared location is synchronized in a fast and safe way with DRDB, and a master server switch occurs, will the new master server instantly be able to use the shared Lucene index without restarting the application?

Timo


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 03, 2011 11:46 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
I heard about this Shared-Location problem. But I thought of replicating the directory via DRDB. Have you made some experiences with this?

Never heard of it before, seems interesting. Let me know how it goes if you use it; keep in mind the many issues other systems had. This might be different, but I'd be very cautious.

Quote:
One thing I don't understand is whether Hibernate Search / Lucene always opens the files in the directory when performing a search or an update, or whether HS/Lucene stores some values in memory and only synchronizes its internal state with the file system from time to time?

Lucene keeps some things in memory, but Hibernate Search makes sure to flush and commit writes to disk at every transaction commit. One exception: when using the MassIndexer, flush and commit are performed regularly and at the end, but not after every operation (to speed things up).

Quote:
To put it in a nutshell: If only one instance at a time is able to write to the index, the shared location is synchronized in a fast and safe way with DRDB, and a master server switch occurs, will the new master server instantly be able to use the shared Lucene index without restarting the application?

Should work fine; the only issue you might have is the locking. A Lucene IndexWriter acquires a lock (writes a marker file) when making changes; you should make sure that during a failover the secondary node opens an unlocked index directory. The easy solution is to disable locking, but make sure you have guarantees that you won't ever have both nodes trying to write to it.
http://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#search-configuration-directory-lockfactories
The Infinispan Directory adds an additional locking implementation suited for clusters, but this lock is also not removed automatically if the node crashes after having taken it but before releasing it. On Infinispan, it's easy to create a listener for the "cluster view changed" event and forcefully remove the lock. Not removing it automatically is the default, as people unaware of this might prefer restarting a node to corrupting the index ;)
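
As a rough sketch of the "disable locking" option: the reference section linked above describes a per-index locking_strategy setting. The snippet below only builds the properties one would hand to Hibernate; the class and method names are made up for illustration, and applying the setting to the "default" index scope is an assumption about this setup.

```java
import java.util.Properties;

class LockingStrategyConfig {
    /**
     * Builds Hibernate Search properties that select a lock strategy
     * for all indexes (the "default" scope). Hypothetical helper.
     */
    static Properties withLockingStrategy(String strategy) {
        Properties props = new Properties();
        // Values described in the reference docs: "simple", "native", "single", "none".
        props.setProperty("hibernate.search.default.locking_strategy", strategy);
        return props;
    }

    public static void main(String[] args) {
        // "none" disables locking entirely; only safe if the HA setup
        // guarantees that a single node writes to the index at any time.
        Properties props = withLockingStrategy("none");
        System.out.println(props.getProperty("hibernate.search.default.locking_strategy"));
    }
}
```

The same property can of course be set directly in the Hibernate configuration file instead of in code.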

_________________
Sanne
http://in.relation.to/


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 03, 2011 12:01 pm 
Newbie

Joined: Wed May 31, 2006 2:34 pm
Posts: 9
Thanks for clarifying this.

I think I will give DRBD a try. With Heartbeat we can execute a script after switching the master. I think I will look for a lock file in the file system after the switch and remove the lock if appropriate.
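
A minimal sketch of such a post-switch cleanup step, written in Java so it could be called from the Heartbeat script. The lock file name "write.lock" and the class name are assumptions; check which file the configured lock factory actually creates, and only run this once the old master is known to be down.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class StaleLockCleaner {
    // Assumed lock file name; verify it against the lock factory in use.
    static final String LOCK_FILE_NAME = "write.lock";

    /** Deletes a leftover lock file in the index directory; returns true if one was removed. */
    static boolean removeStaleLock(Path indexDir) throws IOException {
        return Files.deleteIfExists(indexDir.resolve(LOCK_FILE_NAME));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: StaleLockCleaner <indexDir>");
            return;
        }
        Path indexDir = Paths.get(args[0]);
        System.out.println(removeStaleLock(indexDir)
                ? "removed stale lock"
                : "no lock file found");
    }
}
```

Note that with OS-level (native) locks the mere presence of the file does not mean the index is locked, so this step mainly matters for the simple file-based strategy.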

I will report my experiences with this setup. Thank you.

_________________
Timo


Last edited by timomeinen on Thu Mar 03, 2011 12:38 pm, edited 1 time in total.

 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 03, 2011 12:37 pm 
Newbie

Joined: Wed May 31, 2006 2:34 pm
Posts: 9
Hah! Even simpler. I could use an org.apache.lucene.store.SingleInstanceLockFactory, which is held in memory by the application. As the two application instances run in separate JVMs, this should be the solution.

BTW: I made a spelling error. The name of the replication solution is DRBD.

_________________
Timo


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 03, 2011 12:50 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
A little warning: these locks are especially useful to prevent you from mistakenly starting the application twice. Using that implementation you're not protected from this scenario;
if you don't care, you might as well use the "none" option.

The best option is likely to create a custom implementation extending the NativeFSLockFactory to write the locks to some other path which is not shared across nodes, and combine that with the Infinispan Directory.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Sun Mar 06, 2011 4:14 am 
Newbie

Joined: Thu Mar 03, 2011 1:20 pm
Posts: 5
You should implement the fail-safe functionality and keep the indexes up and synchronized as well. Reconfigure your clustering approach; the indexes should be shareable among their users. Also have a look at the Infinispan Directory. It is relatively new, but you can already set it up with the same database you are using for Hibernate. You will get better performance, but you need to make sure resources are allocated properly.


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 10, 2011 11:27 am 
Newbie

Joined: Wed May 31, 2006 2:34 pm
Posts: 9
Thanks for all the replies. In the meantime I created a HighAvailabilityLockFactory which only gives write access to the active node of the cluster.

I used NoLockFactory as a template and created a singleton instance of a HighAvailabilityLock, which only checks for the existence of a file. The file is only present on the active node and is removed/created by the Heartbeat daemon in case of a switchover:
Code:
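import java.io.File;
import java.io.IOException;
import org.apache.lucene.store.Lock;

/** Lock that can only be obtained on the node holding the "active node" guard file. */
class HALock extends Lock {
    // Instance field: a static field would be overwritten by every new HALock.
    private final File activeNodeGuard;

    public HALock(String activeNodeGuardLocation) {
        this.activeNodeGuard = new File(activeNodeGuardLocation);
    }

    public boolean obtain() throws IOException {
        // No mutual exclusion here: any caller on the active node succeeds.
        return isActiveNode();
    }

    public void release() throws IOException {
        // Nothing to release; the guard file is managed by Heartbeat.
    }

    public boolean isLocked() {
        // Reported as locked on the standby node, where the guard file is absent.
        return !isActiveNode();
    }

    private boolean isActiveNode() {
        return activeNodeGuard.exists();
    }
}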


With this approach I clearly disabled Lucene's locking ability. I think the behaviour is the same as with the NoLock strategy on the active node. Does Hibernate Search need locking on a file system basis if the system guarantees that the index is not shared? Is my solution safe for production use? Does HS take care of single access to the index on the file system?

I'm curious about your opinion. Thanks.

_________________
Timo


 Post subject: Re: How to configure a hot standby cluster in HA setup?
PostPosted: Thu Mar 10, 2011 11:56 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hibernate Search has two backends using the lock: the regular one and the one used by the MassIndexer.
If you only use one of the two there are no issues; the lock is used to prevent one backend from opening the index while the other is busy.

(If you only use the regular one, i.e. you never use the MassIndexer feature, then you're safe even without any lock; it never needs the lock, as its design prevents that.)

I think you should add an instance-level lock, so you can use the MassIndexer as well.
Have a look at org.apache.lucene.store.SingleInstanceLock.
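
To illustrate the idea of combining the active-node check with an instance-level lock, here is a minimal sketch. To stay self-contained it does not extend Lucene's Lock class; the class name GuardedInstanceLock is hypothetical and the method names merely mirror the ones used in the HALock above. It assumes that one shared instance is handed to both backends within the JVM.

```java
import java.io.File;
import java.util.concurrent.atomic.AtomicBoolean;

class GuardedInstanceLock {
    private final File activeNodeGuard;
    // In-JVM flag: true while some caller in this JVM holds the lock.
    private final AtomicBoolean held = new AtomicBoolean(false);

    GuardedInstanceLock(File activeNodeGuard) {
        this.activeNodeGuard = activeNodeGuard;
    }

    /** Succeeds only on the active node and only if nobody in this JVM holds the lock. */
    boolean obtain() {
        if (!activeNodeGuard.exists()) {
            return false; // standby node: never hand out the lock
        }
        return held.compareAndSet(false, true);
    }

    /** Releases the in-JVM lock; the guard file itself is left to Heartbeat. */
    void release() {
        held.set(false);
    }

    /** Locked if held inside this JVM, or if this node is not the active one. */
    boolean isLocked() {
        return held.get() || !activeNodeGuard.exists();
    }
}
```

With this, the regular backend and the MassIndexer backend cannot both hold the lock at once inside one JVM, while the guard file still keeps the standby node from writing.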

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.