
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 4 posts ] 
Author Message
 Post subject: Manual reindexing and optimizing without purging
PostPosted: Sat Mar 21, 2009 4:08 pm 
Beginner

Joined: Tue Feb 03, 2009 12:29 pm
Posts: 49
hi,

I need to manually index my entities periodically. Please let me know if there will be any issues if the following approach is taken.

1. Batch manual indexing every 15 minutes
2. Optimizing the index occasionally without purging

What is the effect of avoiding purging since optimizing seems to clean up the index?

Thanks,
Seema


 Post subject: Re: Manual reindexing and optimizing without purging
PostPosted: Mon Mar 23, 2009 12:56 pm 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Quote:
1. Batch manual indexing every 15 minutes

Sure, you can do that, provided 15 minutes is enough to re-index. The question is more why you want to do that and why you cannot rely on the automatic index updates. Search performance will suffer while you are re-indexing. Maybe you should use the master/slave setup? You could re-index on the master and regularly (e.g. every 15 minutes) update the slave indexes. But it all depends on your requirements.
Quote:
2. Optimizing the index occasionally without purging

Optimizing and purging are orthogonal. In fact, there is no point in optimizing the index if you have just purged it. Optimizing is useful after a batch index or at regular intervals during automatic index synchronization.
Quote:
What is the effect of avoiding purging since optimizing seems to clean up the index?

Purging removes all entities of the specified class (or the whole index). In the Lucene context this means that the documents are marked for deletion. They are no longer searchable, but the data is actually still in the index. Optimizing will then remove the data and update the index data structure. Just optimizing the index has nothing to do with purging. If you haven't deleted any documents before optimize is called, nothing will be removed.
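
To make the distinction concrete, here is a minimal sketch in plain Java. It is a toy model only: `ToyIndex`, `purgeAll`, `optimize` and `storedCount` are hypothetical names chosen for illustration, and the class imitates neither Lucene's internals nor the Hibernate Search API. It only shows the behaviour described above: purged documents stop being searchable but stay stored until an optimize physically removes them.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: this is not Lucene and not Hibernate Search. */
final class ToyIndex {
    private static final class Doc {
        final String id;
        final String text;
        boolean deleted; // "marked for deletion": hidden from search, still stored

        Doc(String id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    void add(String id, String text) {
        docs.add(new Doc(id, text));
    }

    /** Purge: mark every document as deleted; nothing is physically removed yet. */
    void purgeAll() {
        for (Doc d : docs) {
            d.deleted = true;
        }
    }

    /** Optimize: physically drop documents marked as deleted; returns how many were dropped. */
    int optimize() {
        int before = docs.size();
        docs.removeIf(d -> d.deleted);
        return before - docs.size();
    }

    /** Returns the ids of documents that contain the term and are not marked as deleted. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (!d.deleted && d.text.contains(term)) {
                hits.add(d.id);
            }
        }
        return hits;
    }

    /** Number of documents physically stored, including those marked as deleted. */
    int storedCount() {
        return docs.size();
    }
}

final class ToyIndexDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("1", "hibernate search");
        index.add("2", "lucene in action");

        System.out.println(index.optimize());       // nothing was deleted, so nothing is removed
        index.purgeAll();
        System.out.println(index.search("lucene")); // purged documents are no longer searchable
        System.out.println(index.storedCount());    // but their data is still stored
        index.optimize();
        System.out.println(index.storedCount());    // optimize reclaims the space
    }
}
```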

--Hardy


 Post subject: Re: Manual reindexing and optimizing without purging
PostPosted: Tue Mar 24, 2009 1:22 pm 
Beginner

Joined: Tue Feb 03, 2009 12:29 pm
Posts: 49
hardy.ferentschik wrote:
Quote:
1. Batch manual indexing every 15 minutes

Sure, you can do that, provided 15 minutes is enough to re-index. The question is more why you want to do that and why you cannot rely on the automatic index updates. Search performance will suffer while you are re-indexing. Maybe you should use the master/slave setup? You could re-index on the master and regularly (e.g. every 15 minutes) update the slave indexes. But it all depends on your requirements.

We are searching a table that is periodically updated by a stored procedure. Since the table is not updated from our application, we cannot use automatic index updates. Since search will be slow while re-indexing, we are planning to use two different indexes and swap between them.
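
The two-index idea could be sketched roughly as follows, assuming the rebuilt index can be represented as an immutable snapshot that is published in one step. `SwappableIndex` and its methods are hypothetical names for illustration, and the `Map` is only a stand-in for a real index; none of this is Hibernate Search API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder: searches read the live snapshot while a rebuild prepares the next one. */
final class SwappableIndex {
    // Snapshot: search term -> matching entity ids (a stand-in for a real index).
    private final AtomicReference<Map<String, List<Long>>> live =
            new AtomicReference<>(Collections.<String, List<Long>>emptyMap());

    /** Searches always see one complete snapshot, never a half-built one. */
    List<Long> search(String term) {
        List<Long> hits = live.get().get(term);
        return hits == null ? Collections.<Long>emptyList() : hits;
    }

    /** Publishes a fully built snapshot in one atomic step and returns the one it replaced. */
    Map<String, List<Long>> swap(Map<String, List<Long>> rebuilt) {
        // Copy the map so later changes to the caller's map do not leak into the live snapshot.
        // The value lists are not copied; the caller should not modify them after the swap.
        Map<String, List<Long>> snapshot =
                Collections.unmodifiableMap(new HashMap<>(rebuilt));
        return live.getAndSet(snapshot);
    }
}

final class SwappableIndexDemo {
    public static void main(String[] args) {
        SwappableIndex index = new SwappableIndex();

        // Build the second index off to the side, e.g. from a periodic job.
        Map<String, List<Long>> rebuilt = new HashMap<>();
        rebuilt.put("lucene", Arrays.asList(1L, 2L));

        System.out.println(index.search("lucene")); // still served by the old (empty) snapshot
        index.swap(rebuilt);
        System.out.println(index.search("lucene")); // now served by the rebuilt snapshot
    }
}
```

The point of the design is that readers never wait for a rebuild; they only ever pay for one reference read. The synchronisation caveat still applies: the live snapshot reflects the table as it was when the rebuild started.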


Quote:
2. Optimizing the index occasionally without purging

Optimizing and purging are orthogonal. In fact, there is no point in optimizing the index if you have just purged it. Optimizing is useful after a batch index or at regular intervals during automatic index synchronization.
Quote:
What is the effect of avoiding purging since optimizing seems to clean up the index?

Purging removes all entities of the specified class (or the whole index). In the Lucene context this means that the documents are marked for deletion. They are no longer searchable, but the data is actually still in the index. Optimizing will then remove the data and update the index data structure. Just optimizing the index has nothing to do with purging. If you haven't deleted any documents before optimize is called, nothing will be removed.


OK. I hope I can assume that both of the following approaches are fine and that there is no need to favor one over the other:
1. Re-index (on the existing index) and then optimize
2. Purge the existing index and then re-index without optimizing


Thanks,
Seema


 Post subject:
PostPosted: Tue Mar 24, 2009 1:45 pm 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

I think your use case might be a good fit for a master/slave setup. You have to keep in mind, though, that if your procedures manipulate the database while you still have an 'old' copy of the index, you might get synchronisation problems. It really depends on the requirements of your application.

Regarding your second question: "Re-index and then optimize" is not the same as "Purge the existing index and then re-index without optimizing". If, for example, your database procedure deleted a record which was previously indexed, the corresponding Lucene document would not get deleted from the index during re-indexing. You would have to purge first. Also, it is recommended to optimize the index after batch indexing. Optimizing is not only there to remove documents marked for deletion; the process also performs other optimizations, e.g. segment merging.
I recommend you read more about this in "Hibernate Search in Action" or "Lucene in Action". Either book would give you a deeper insight into Lucene.
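
The stale-document problem can be shown with a small sketch in plain Java. It is an illustration under simplifying assumptions, not Hibernate Search code: the "database" and the "index" are both plain maps keyed by row id, and `reindex` and `staleIds` are hypothetical helpers.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class ReindexDemo {
    /** Re-index: add or overwrite one index entry per row currently in the database. */
    static void reindex(Map<Long, String> index, Map<Long, String> databaseRows) {
        index.putAll(databaseRows);
    }

    /** Ids that are still in the index although the row no longer exists. */
    static Set<Long> staleIds(Map<Long, String> index, Map<Long, String> databaseRows) {
        Set<Long> stale = new TreeSet<>(index.keySet());
        stale.removeAll(databaseRows.keySet());
        return stale;
    }

    public static void main(String[] args) {
        Map<Long, String> db = new TreeMap<>();
        db.put(1L, "alpha");
        db.put(2L, "beta");

        Map<Long, String> index = new TreeMap<>();
        reindex(index, db);

        db.remove(2L); // the stored procedure deletes a row behind the application's back

        reindex(index, db);                      // re-index on top of the existing index
        System.out.println(staleIds(index, db)); // the deleted row is still indexed

        index.clear();                           // purge first
        reindex(index, db);
        System.out.println(staleIds(index, db)); // no stale entries remain
    }
}
```

Re-indexing only ever touches rows that still exist, so it cannot notice a row that has disappeared; only the purge gets rid of it.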

--Hardy


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.