All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 12 posts ] 
 Post subject: Searching while manually reindexing returns partial result
Posted: Fri Jan 16, 2009 10:20 am
Newbie

Joined: Fri Jan 16, 2009 10:00 am
Posts: 7
If I do a search while I'm performing a manual reindexing I only get a partial result back. By doing multiple searches back to back I can see how the index is being built up. As I understand it this is how Lucene is supposed to work, right?

The problem is that I do reindexing at regular intervals and I don't want the users to get partial results because of this. There seems to be a solution in place for this when using Hibernate Search in a clustered setup: the old index is used until the new index is fully in place. How come this is not used when the index is being rebuilt locally? Or is there another way to get around this problem?

Thanks!


 Post subject:
Posted: Tue Jan 20, 2009 6:14 am
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

The idea in the standalone version of Hibernate Search is that you index from scratch only once and then only incrementally (which is also much faster). In your case you would have to turn off automatic index updates via the event listeners. Otherwise, while you are re-indexing into a 'new' index, updates to the existing index could occur via the event system, and these would be lost when you do the index switch.

You could use the JMS setup locally on one single box. However, there is still no guarantee that the users never see a partial index. Index switches are time based, so if the timing is bad you might still end up with an 'incomplete' index.

If you can identify the entities you want to reindex (e.g. via a last-modified timestamp in the database), you can reindex just those entities. That way the index is never emptied and rebuilt from scratch, and the whole process is much more resource friendly.

--Hardy
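
A minimal sketch of the timestamp-based approach described above, in plain Java with no Hibernate dependency. `Doc`, `IncrementalIndexer` and `reindexModified` are hypothetical names, and the in-memory map only stands in for the Lucene index. In a real application the modified rows would presumably come from a database query and be handed to the indexing API instead.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an indexed entity with a last-modified column.
final class Doc {
    final long id;
    final String text;
    final Instant lastModified;

    Doc(long id, String text, Instant lastModified) {
        this.id = id;
        this.text = text;
        this.lastModified = lastModified;
    }
}

// Hypothetical stand-in for the search index: entity id -> indexed text.
final class IncrementalIndexer {
    private final Map<Long, String> index = new ConcurrentHashMap<>();
    private Instant lastRun = Instant.EPOCH;

    // Re-indexes only documents modified after the previous run.
    // Existing entries stay searchable throughout; nothing is purged first.
    int reindexModified(List<Doc> allDocs, Instant now) {
        int updated = 0;
        for (Doc d : allDocs) {
            if (d.lastModified.isAfter(lastRun)) {
                index.put(d.id, d.text);
                updated++;
            }
        }
        lastRun = now;
        return updated;
    }

    String lookup(long id) {
        return index.get(id);
    }

    int size() {
        return index.size();
    }
}
```

Because entries are overwritten in place rather than purged first, a search that runs during the pass sees either the old or the new version of a document, never an empty index. The sketch ignores deletions and changes that land exactly on the run boundary; both would need handling in practice.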


 Post subject:
Posted: Tue Jan 20, 2009 7:15 am
Newbie

Joined: Fri Jan 16, 2009 10:00 am
Posts: 7
Hi Hardy,

I'm not bothered that some updates might not make it into the index. They will still go in the next time the index is rebuilt. Currently I'm using the manual reindexing as a poor man's clustered search setup. The plan is to switch to a JMS setup at a later stage.

I guess what I'm after is really a way to switch between indexes. This way I could build a new index, switch over to it when it's done and delete the old index. Maybe I'm better off implementing the JMS queue since it will bring performance benefits as well.

Thanks!


 Post subject:
Posted: Tue Jan 20, 2009 3:46 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi Daniel,
you should write your own DirectoryProvider implementation; Search lets you plug in your own implementation by setting it in the configuration.

If you take a look at FSSlaveDirectoryProvider, you'll see that the JMS client is actually using two indexes, and switching from one to the other during copies. You could skip all code about file copying, and copy the structure you need to control when and how to switch indexes.

It's quite simple: you just have to return the index you want your searchers to use.
The tricky part is making sure that nobody is using the index you're going to delete. That's easy if you use Lucene's API to empty the index, as it will not change the state of IndexReaders still in use, and schedule the real file deletion for later.

_________________
Sanne
http://in.relation.to/
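
The two-index switch suggested above could be sketched roughly as follows. This is plain Java and does not implement the Hibernate Search DirectoryProvider interface; `SwitchableIndex`, `current` and `rebuild` are hypothetical names, and a list of strings stands in for a Lucene directory. It only illustrates the idea of building off to the side and publishing the finished index in a single step.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder for the index that searches read from.
final class SwitchableIndex {
    // Searchers always see a complete, read-only snapshot.
    private final AtomicReference<List<String>> active =
            new AtomicReference<>(Collections.<String>emptyList());

    // What a searcher would call: returns the currently published index.
    List<String> current() {
        return active.get();
    }

    // Builds a fresh index off to the side, then publishes it in one step.
    // Returns the old index so the caller can schedule its cleanup later.
    List<String> rebuild(Iterable<String> source) {
        List<String> fresh = new ArrayList<>();
        for (String doc : source) {
            fresh.add(doc); // searchers still read the old index here
        }
        return active.getAndSet(Collections.unmodifiableList(fresh));
    }
}
```

A searcher that grabbed the old snapshot before the swap keeps a valid, complete view, which is why the old index is returned rather than deleted immediately. With real files on disk, the deletion would have to wait until no reader still holds the old index.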


 Post subject:
Posted: Wed Jan 21, 2009 10:11 am
Newbie

Joined: Fri Jan 16, 2009 10:00 am
Posts: 7
Hi Sanne!

That sounds like a good idea, I'm gonna give it a shot!

Thanks!


 Post subject:
Posted: Wed Jan 21, 2009 12:26 pm
Newbie

Joined: Fri Jan 16, 2009 10:00 am
Posts: 7
Hi

I started looking into the solution proposed by Sanne, but I ran into a problem. How do I create an index in a directory other than the one being used for searches? When I switch the directory, that affects all the search operations in the application, right? Is there some way to override the directory used when indexing manually? I'm indexing using the example described in the Hibernate Search in Action book.

Or did you mean that I should create the index externally from another application and then just copy the index?

Thanks!


 Post subject:
Posted: Wed Jan 21, 2009 5:17 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
I was assuming you would disable the automatic index updates in the searching application, so that it only uses the index to do searches.

To update an index chosen at runtime you can start a second SessionFactory; this doesn't have to be a second application.
Most JUnit tests in the Search code start a SessionFactory configured by code; you could read the same configuration of your main SessionFactory and then make some changes programmatically, like changing the index path, and then start the SessionFactory and use it to update the index.

_________________
Sanne
http://in.relation.to/
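
Deriving the second configuration from the first might look something like this, assuming the settings are available as `java.util.Properties`. `hibernate.search.default.indexBase` is the Hibernate Search property for the index base directory; the class name `IndexerConfig`, the method `forRebuild` and the paths used with it are hypothetical. Building the second SessionFactory from the result is not shown.

```java
import java.util.Properties;

// Hypothetical helper that derives an indexing configuration from the main one.
final class IndexerConfig {
    // Hibernate Search property naming the base directory for indexes.
    static final String INDEX_BASE = "hibernate.search.default.indexBase";

    // Copies the main configuration and overrides only the index location,
    // leaving the original properties untouched.
    static Properties forRebuild(Properties main, String rebuildDir) {
        Properties copy = new Properties();
        copy.putAll(main);
        copy.setProperty(INDEX_BASE, rebuildDir);
        return copy;
    }
}
```

Calling `IndexerConfig.forRebuild(mainProps, "/var/index/rebuild")` (an illustrative path) would yield a copy that differs from the main configuration in one property only, so every other setting stays in sync with the searching SessionFactory.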


 Post subject:
Posted: Tue Jan 27, 2009 7:10 am
Newbie

Joined: Fri Jan 16, 2009 10:00 am
Posts: 7
Hi!

Ok, I've created a second sessionFactory that uses the same configuration as the first one. I've verified that my directoryProvider is created twice for each indexed entity. Now I have a new problem. No matter which sessionFactory I use to get my fullTextSession the same directoryProvider instance is returned. It seems the first directoryProvider instance is replaced by the second one. The first one is still there though, I can see that my timer is still running in it.

In my directoryProvider I have logic for switching to a new indexPath. I call this before I start updating the index, but since the same directoryProvider is used when searching I still have the original problem where partial results are returned when the index is being built.

It feels like I'm really close to solving the problem, this is the last piece of the puzzle. Does anyone have any ideas?

Thanks!


 Post subject:
Posted: Tue Jan 27, 2009 7:29 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Please try using different configurations; just one different property should be enough.
When configuration.equals(cfg) you get the same instances back to save memory and build time.

_________________
Sanne
http://in.relation.to/


 Post subject:
Posted: Tue Jan 27, 2009 8:35 am
Newbie

Joined: Fri Jan 16, 2009 10:00 am
Posts: 7
I tried changing the configuration but it didn't help. I even tried changing the default directory provider setting in the config file. Now both sessionFactories seem to be using the second configuration. Really weird since they are definitely different instances.


 Post subject:
Posted: Tue Jan 27, 2009 8:41 am
Newbie

Joined: Fri Jan 16, 2009 10:00 am
Posts: 7
They are actually not using the same configuration. If I change the max fetch depth setting I can see that they are using different values. So there's something else going on here.


 Post subject: Re: Searching while manually reindexing returns partial result
Posted: Wed Jul 29, 2009 5:20 am
Newbie

Joined: Wed Jul 29, 2009 5:10 am
Posts: 1
Hi Daniel,

I'm facing the same problem, any update on the configuration issue?

Thanks, Renaud


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.