Hi Sanne,
OK, I had a look at it today and hit a wall.
Quick recap: I am re-using the HS master and slave directories out of the box, without the JMS setup. I have implemented my own jobs/scheduled tasks for keeping the master index up to date. My worry is that the master node will copy the master index while an update is occurring, thus copying a corrupt index for clients to use. What I want to do is prevent the HS master directory provider from copying the index while an update to the index is in progress.
I re-used the locking mechanism from the FSMasterDirectoryProvider implementation. A snippet of my update implementation looks like this:
Code:
session = getFullTextSession();
directoryProviderLock = this.getDirectoryLock(session, entity);
if (directoryProviderLock != null) {
    directoryProviderLock.lock();
    try {
        removeOldIndex(entity, session);
        indexEntity(entity, session);
    } finally {
        // release the lock so the copy task can run again
        directoryProviderLock.unlock();
    }
} else {
    log.error("Lock is null, failed to get hold of the directory lock for entity " + entity);
}
The problem with this approach is that the directory gets locked first by calling lock(). Later on, though, in removeOldIndex() I have:
Code:
private void removeOldIndex(Class<?> entity, FullTextSession fullTextSession) {
    log.info("About to purge index for " + entity);
    try {
        fullTextSession.beginTransaction();
        // purge all records before we add new ones; no update in Lucene out of the box
        fullTextSession.purgeAll(entity);
        // the commit triggers a new thread which tries to acquire a lock on the
        // directory, but it won't get it because we acquired it earlier
        fullTextSession.getTransaction().commit();
        log.info("Finished purging for " + entity);
    } catch (HibernateException e) {
        log.error("Problem occurred while purging index for " + entity, e);
    }
}
On the commit, it spawns off a separate thread (a FutureTask). This FutureTask also tries to acquire a lock on the directory, but it won't get it because the parent thread holds the lock. Basically it ends up in an infinite wait in PerDPQueueProcessor.java:84.
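To illustrate what I think is going on (this is not Hibernate Search code, just a made-up demo of the pattern; LockWaitDemo and directoryLock are placeholder names, and I'm assuming the backend work runs on its own thread while the caller waits for it):
Code:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

public class LockWaitDemo {
    public static void main(String[] args) throws Exception {
        final ReentrantLock directoryLock = new ReentrantLock();
        ExecutorService backend = Executors.newSingleThreadExecutor();

        directoryLock.lock(); // my update job takes the directory lock first
        try {
            // commit() hands the Lucene work to a backend thread, and that
            // thread needs the same lock before it can touch the directory
            Future<?> work = backend.submit(new Runnable() {
                public void run() {
                    directoryLock.lock(); // blocks: the parent thread still holds it
                    try {
                        System.out.println("applying work to the index");
                    } finally {
                        directoryLock.unlock();
                    }
                }
            });
            // the parent waits for the backend while still holding the lock
            work.get(2, TimeUnit.SECONDS); // would hang forever without the timeout
        } catch (TimeoutException e) {
            System.out.println("backend never got the lock -> stuck, like PerDPQueueProcessor");
        } finally {
            directoryLock.unlock();
            backend.shutdown();
        }
    }
}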
I have split the deletion and the addition across two transactions. Should I be doing this, or should I use a single transaction? I read somewhere that splitting them was best practice (can't find the source now). I don't think even a single transaction would solve my problem, because in my "adding to index" code I'm calling fullTextSession.flushToIndexes(), which I assume writes to disk (very similar to what's in the manual for manual indexing).
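For reference, the one-transaction variant I have in mind would look roughly like this, adapted from the manual-indexing example in the reference documentation (BATCH_SIZE and the plain criteria query are placeholders, not my real code):
Code:
import org.hibernate.CacheMode;
import org.hibernate.FlushMode;
import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Transaction;
import org.hibernate.search.FullTextSession;

private static final int BATCH_SIZE = 100; // placeholder

private void rebuildIndex(Class<?> entity, FullTextSession fullTextSession) {
    fullTextSession.setFlushMode(FlushMode.MANUAL);
    fullTextSession.setCacheMode(CacheMode.IGNORE);
    Transaction tx = fullTextSession.beginTransaction();
    // purge and re-add inside the same transaction
    fullTextSession.purgeAll(entity);
    ScrollableResults results = fullTextSession.createCriteria(entity)
            .scroll(ScrollMode.FORWARD_ONLY);
    int index = 0;
    while (results.next()) {
        index++;
        fullTextSession.index(results.get(0)); // index each element
        if (index % BATCH_SIZE == 0) {
            fullTextSession.flushToIndexes(); // apply changes to the index
            fullTextSession.clear();          // free memory, queue is processed
        }
    }
    tx.commit(); // index changes are applied at commit time
}

Even here, flushToIndexes() inside the loop would still hand work to the backend thread, so I'd expect to hit the same lock problem.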
I have looked briefly at SnapshotDeletionPolicy; it seems like a roundabout way of doing what I want. I just want to prevent the master directory provider from copying the index while an update is in progress. You must have something like this already in place for the JMS implementation? An update comes in off the queue and the master node needs to apply it to the index. Can you point me to this code in the tree? I can't find it, thanks!
It looks like I've hit a stumbling block for now. The only idea I have at the moment is to write my own version of FSMasterDirectoryProvider, one that checks whether any updates are in progress and holds off on the copying until they are finished.
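Something like this rough sketch is what I have in mind (CopyIfIdleTask and copyIndexToSourceDir are made-up names, not anything from the HS tree; the point is just that the copy task takes the same directory lock the update job uses):
Code:
import java.util.TimerTask;
import java.util.concurrent.locks.Lock;

public class CopyIfIdleTask extends TimerTask {

    private final Lock directoryProviderLock;

    public CopyIfIdleTask(Lock directoryProviderLock) {
        this.directoryProviderLock = directoryProviderLock;
    }

    @Override
    public void run() {
        directoryProviderLock.lock(); // waits here while an update is in progress
        try {
            copyIndexToSourceDir(); // safe now: no writer holds the lock
        } finally {
            directoryProviderLock.unlock();
        }
    }

    private void copyIndexToSourceDir() {
        // ...copy the active index directory to the shared source directory...
    }
}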
Quote:
As an update is delete+add these two operations are wrapped in the ownership of the directory lock.
So Search will acquire this lock to make sure the index it copies is not containing an applied delete operation while it's missing a write operation.
This is probably the lock I need to get, but if my purge spins off a separate thread that tries to acquire the same lock, then the way I currently do it won't work for me.
Any ideas or input are welcome!
Thanks Sanne,
LL