-->

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 
Author Message
 Post subject: Using MassIndexer on a subset of data
PostPosted: Wed Nov 03, 2010 3:14 pm 
Newbie

Joined: Wed Sep 09, 2009 12:28 am
Posts: 8
Is it possible to use MassIndexer, or something equivalent, to index a subset of the data in a table instead of the entire table? My table contains 15 million rows covering 21 days.

For instance, I have an application that manually indexes new data every minute. So far it's working fine. Manual indexing is far more efficient in terms of time than automatic indexing, which proved very costly in response time. My concern is that if the indexing application goes down, we lose that time interval. I have an alternative way to manually index the missed interval later, but I am wondering whether an API exists (that I am unaware of) that would let me index it the way MassIndexer does, instead of implementing my own (multiple threads running concurrently: divide the time interval into 10 slices and index them in parallel).
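
The "divide the interval into 10 and run in parallel" idea can be sketched roughly as below. This is only an illustration, not a Hibernate Search API: the class name `TimeSliceReindexer`, its methods, and the `indexSlice` callback are all hypothetical. The callback is the place where one would load the rows whose timestamp falls in the given slice and hand them to whatever manual indexing call the application already uses.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;

// Hypothetical helper: splits a missed time interval into slices and
// re-indexes the slices concurrently. Not part of Hibernate Search.
final class TimeSliceReindexer {

    /** Splits [start, end) into `parts` contiguous, non-overlapping slices. */
    static List<Instant[]> split(Instant start, Instant end, int parts) {
        if (parts < 1 || !start.isBefore(end)) {
            throw new IllegalArgumentException("need parts >= 1 and start < end");
        }
        long totalMillis = Duration.between(start, end).toMillis();
        List<Instant[]> slices = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            Instant from = start.plusMillis(totalMillis * i / parts);
            // The last slice ends exactly at `end` so no rounding remainder is lost.
            Instant to = (i == parts - 1)
                    ? end
                    : start.plusMillis(totalMillis * (i + 1) / parts);
            slices.add(new Instant[] { from, to });
        }
        return slices;
    }

    /**
     * Runs `indexSlice` once per slice, one thread per slice, and waits for all
     * of them. `indexSlice` is an assumed callback that indexes rows with
     * from <= timestamp < to.
     */
    static void reindex(Instant start, Instant end, int parts,
                        BiConsumer<Instant, Instant> indexSlice) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Instant[] slice : split(start, end, parts)) {
                pending.add(pool.submit(() -> indexSlice.accept(slice[0], slice[1])));
            }
            for (Future<?> f : pending) {
                try {
                    f.get(); // surfaces any failure from the worker thread
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while re-indexing", ex);
                } catch (ExecutionException ex) {
                    throw new IllegalStateException("a slice failed to index", ex.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

A call would look like `TimeSliceReindexer.reindex(outageStart, outageEnd, 10, (from, to) -> ...)`. Using half-open slices means a row sitting exactly on a boundary is picked up by one slice only, provided the query in the callback uses the same `>= from` and `< to` convention.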

BTW, the searches are "blazingly" fast, as one of our users put it. Our users are much happier with the Hibernate Search function. The JDBC searches implemented earlier could hardly query or summarize 2 days' data in a minute, but with Hibernate Search users can search 21 days' data and get a summary as well, and results are always returned in less than a minute. Hibernate Search really rocks!


 Post subject: Re: Using MassIndexer on a subset of data
PostPosted: Wed Nov 03, 2010 6:47 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi julie09,
First of all, thank you very much for the nice feedback.

Continual indexing shouldn't be a huge hit on performance, assuming you enable async processing and also have exclusive_index_use turned on.
Having second-level caches enabled also helps, both at query time and at indexing time (sometimes more data needs to be loaded, because all fields of the Document need to be rewritten).
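
As a rough illustration of the settings mentioned above, the sketch below collects them in a `java.util.Properties` object. The class name `SearchSettingsSketch` is hypothetical, and the property keys are the ones documented for Hibernate Search 3.x and Hibernate Core as far as I know; please verify them against the reference guide for the version you run, since names have changed between releases.

```java
import java.util.Properties;

// Hypothetical holder for the tuning options discussed in this thread.
// The keys are assumptions based on the Hibernate Search 3.x documentation.
final class SearchSettingsSketch {

    static Properties build() {
        Properties p = new Properties();
        // Apply index updates in a background thread instead of the caller's thread.
        p.setProperty("hibernate.search.worker.execution", "async");
        // Keep the index writer open, on the assumption that no other
        // process writes to the same index.
        p.setProperty("hibernate.search.default.exclusive_index_use", "true");
        // Turn on the second-level cache; a cache provider must be configured as well.
        p.setProperty("hibernate.cache.use_second_level_cache", "true");
        return p;
    }
}
```

The same keys can equally go into `hibernate.cfg.xml` or `persistence.xml`; the `Properties` form is used here only to keep the example in one language.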

About limiting the data targeted by the MassIndexer: sorry, that's planned but not implemented yet; see HSEARCH-499.
Feel free to provide suggestions about the API and/or implementation.

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.