All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 10 posts ] 
Author Message
 Post subject: Best way to update an already indexed domain with Hibernate Search
PostPosted: Tue Sep 16, 2008 9:20 am 
Regular

Joined: Tue Apr 01, 2008 5:39 pm
Posts: 61
Need help with Hibernate? Read this first:
http://www.hibernate.org/ForumMailingli ... AskForHelp

Hibernate version:
3.3.0 GA
Hibernate Search 3.1.0 beta

Hi, I was wondering what the fastest way is to run a periodic batch process using manual batch indexing (I have read the guide). As far as I can tell, it would delete the whole old index and reindex everything. I want to be able to index the same data again through a periodic update (for example once every three to five days, or once a week).


 Post subject:
PostPosted: Fri Sep 19, 2008 3:02 pm 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

You could implement a TimerTask or use a scheduling library like Quartz to schedule your batch indexing.

Why do you want to run this batch indexing on a schedule? Once you have an initial index, it should stay in sync thanks to Hibernate Search's transparent index synchronization.

--Hardy
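
A minimal sketch of the TimerTask suggestion above, using only java.util.Timer. The class name ReindexScheduler, the helper methods, and the two-day period are assumptions made for illustration; the job body is a placeholder for whatever indexing code you actually run.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Hibernate Search.
final class ReindexScheduler {

    /** Wraps the job so that a failure in one run does not end the schedule. */
    static TimerTask guarded(final Runnable reindexJob) {
        return new TimerTask() {
            @Override
            public void run() {
                try {
                    reindexJob.run();
                } catch (RuntimeException e) {
                    // An exception escaping run() would terminate the Timer thread,
                    // so it is reported here and the next run is still executed.
                    System.err.println("Reindexing failed, will retry on next run: " + e);
                }
            }
        };
    }

    /** Runs the job once immediately and then again every periodMillis. */
    static Timer schedule(Runnable reindexJob, long periodMillis) {
        Timer timer = new Timer("reindex-timer", true); // daemon thread
        timer.schedule(guarded(reindexJob), 0L, periodMillis);
        return timer;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder job; replace the lambda body with the real indexing call.
        Timer timer = schedule(() -> System.out.println("reindexing..."),
                TimeUnit.DAYS.toMillis(2));
        Thread.sleep(200L); // give the first run time to execute in this demo
        timer.cancel();
    }
}
```

A scheduling library such as Quartz would replace the Timer here if you need cron-style expressions or persistent schedules.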


 Post subject:
PostPosted: Fri Sep 19, 2008 4:20 pm 
Regular

Joined: Tue Apr 01, 2008 5:39 pm
Posts: 61
In order to keep the domain objects clean so that they can be exported as Java RMI services.


 Post subject:
PostPosted: Mon Sep 22, 2008 4:42 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

I am not quite sure what this has to do with RMI. Can you maybe elaborate on your whole use case?

--Hardy


 Post subject:
PostPosted: Mon Sep 22, 2008 9:06 pm 
Regular

Joined: Tue Apr 01, 2008 5:39 pm
Posts: 61
1.) Read from a third-party database.

2.) Index the data.

3.) Repeat every two days.


 Post subject:
PostPosted: Tue Sep 23, 2008 2:46 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
OK, now I understand, at least I think so.
You are indexing and searching against a third-party database. You are not modifying this database yourself, but the third party might. In this case, using some sort of timer to index seems to be the best approach. Whether you have to reindex everything depends on whether you can detect, via an SQL/Hibernate query, which entities have changed (e.g. through a last-update timestamp). If you cannot determine the changes via a query, you have to reindex from scratch.

--Hardy
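
A rough sketch of the "which entities have changed" check described above, assuming the third-party table exposes a last-update timestamp. The Row type, its fields, and changedSince are hypothetical names; in practice the same condition would more likely live in the query itself (something like "where lastUpdated > :lastRun") rather than in Java code.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these types come from Hibernate.
final class ChangedSinceSelector {

    /** Assumed shape of a row read from the third-party database. */
    static final class Row {
        public final long id;
        public final Instant lastUpdated; // null if the source does not record it

        public Row(long id, Instant lastUpdated) {
            this.id = id;
            this.lastUpdated = lastUpdated;
        }
    }

    /**
     * Returns the rows that need reindexing. A null lastRun means there was no
     * previous run, so every row is selected (reindex from scratch). Rows without
     * a timestamp are always selected because their state cannot be determined.
     */
    static List<Row> changedSince(List<Row> rows, Instant lastRun) {
        List<Row> changed = new ArrayList<>();
        for (Row row : rows) {
            if (lastRun == null
                    || row.lastUpdated == null
                    || row.lastUpdated.isAfter(lastRun)) {
                changed.add(row);
            }
        }
        return changed;
    }
}
```

Note that a timestamp only reveals inserts and updates. Rows deleted by the third party would not show up in such a query, so an occasional full reindex may still be needed.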


 Post subject:
PostPosted: Tue Sep 30, 2008 9:13 am 
Newbie

Joined: Wed Sep 24, 2008 9:53 am
Posts: 4
Hello,

I'm having a similar problem. I also want to update the index via a TimerTask because my database is updated by a third party, so I cannot benefit from Hibernate Search's transparent index synchronization.
I also cannot determine which entities have changed.

So I tried the following:

Code:
import java.util.List;
import javax.persistence.Query;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;

public void indexInitialization() {
    // Wrap the existing EntityManager (em) to get access to the indexing API
    FullTextEntityManager fullTextEntityManager = Search.createFullTextEntityManager(em);

    Query query = em.createNamedQuery("findAllUnlockedUserfood");

    @SuppressWarnings("unchecked")
    List<Userfood> userfoods = query.getResultList();

    for (Userfood userfood : userfoods) {
        // Force the (re)indexing of the userfood
        fullTextEntityManager.index(userfood);
    }
}


This method is called periodically, e.g. every week.
However, each run increases the size of my index files, even if nothing has changed in the database.
Do I have to remove the old entities from the index before forcing the reindexing with the index(Object o) method, or is there a different approach?

PS: I'm using Hibernate Search 3.0.1 GA

Thanks for your help!


 Post subject:
PostPosted: Tue Sep 30, 2008 10:03 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

Have you tried optimizing the index after re-indexing? In Lucene, documents cannot simply be changed in place. Effectively, a document gets deleted and re-added with the new values. This fragments the index, so you should optimize it periodically. If you rely on the automatic index synchronization, optimization is configurable and taken care of for you. If you index manually, you should also optimize the index afterwards.

--Hardy
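
The "reindex, then optimize" order can be sketched as follows. IndexOps is a hypothetical stand-in so that the example compiles without Hibernate on the classpath. As far as the Hibernate Search 3.x reference describes it, the real calls would be fullTextEntityManager.index(entity) for each entity and fullTextEntityManager.getSearchFactory().optimize(Userfood.class) once at the end; please check this against the version you use.

```java
// Hypothetical illustration of the call order; not a Hibernate Search class.
final class ReindexThenOptimize {

    /** Assumed abstraction over the two indexing operations involved. */
    interface IndexOps<T> {
        void index(T entity); // stand-in for fullTextEntityManager.index(entity)
        void optimize();      // stand-in for getSearchFactory().optimize(...)
    }

    /**
     * Reindexes every entity and then optimizes the index exactly once, so the
     * segments fragmented by the delete-and-re-add cycle get merged again.
     * Returns the number of entities that were reindexed.
     */
    static <T> int reindexAll(Iterable<T> entities, IndexOps<T> ops) {
        int count = 0;
        for (T entity : entities) {
            ops.index(entity);
            count++;
        }
        // Optimize after the loop rather than per entity; merging is expensive.
        ops.optimize();
        return count;
    }
}
```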


 Post subject:
PostPosted: Wed Oct 01, 2008 2:37 am 
Newbie

Joined: Wed Sep 24, 2008 9:53 am
Posts: 4
Thanks a lot Hardy,

After optimizing the index everything is fine, and the file size no longer increases when there are no changes in the database.

I thought that the automatic optimization of Hibernate Search had some default value and that I would not have to configure it explicitly.

Thanks a lot for your help!

