All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 
Author Message
 Post subject: HS: How to have multiple indexes on the same table.
PostPosted: Fri Feb 13, 2009 5:19 am 
Newbie

Joined: Fri Feb 13, 2009 4:54 am
Posts: 10
Hi,

We are using the latest version of Hibernate Search. Our MySQL table has over 1 billion records and we are finding that Lucene is going to take about a month to index it all if it does it as one huge index - even using all the optimisations given in "HS in Action".

We don't need that anyway. The table has a field in it called marcTable which can contain values from 1 to 999 (only about half of the values are actually valid anyway). We would like to have 999 indexes for this one table with a discriminator using marcTable. Our use-case only requires searching to be carried out within each marcTable type, not across all of them.

I have already investigated sharding on the values of marcTable, but this is really not a solution because Lucene will still index across all rows for the table (as far as I understand sharding; it's splitting the index into different files, not different, siloed, indexes).

We could have 999 tables (marcTable1, marcTable2, etc) which would give us 999 separate indexes, but that is a horrible solution - not to mention having to define 999 classes and the enormous startup time for hibernate with so many classes.

Is there any way to set this up with Hibernate Search?

Thanks.


 Post subject:
PostPosted: Sat Feb 14, 2009 8:09 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
as far as I understand sharding; it's splitting the index into different files, not different, siloed, indexes

No, the purpose is really to give you different indexes, "siloed" in different directories. Each index can then be managed independently, and Lucene stores each one in its own set of files, as usual for indexes.

So it should really be the feature you're looking for; perhaps we need to fix some documentation? Could you point us to the reference which gave you the idea of "different files, coupled index"?
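To illustrate the routing idea only, here is a minimal plain-Java sketch that maps each marcTable value to its own shard. It deliberately does not use the Hibernate Search API: the class name MarcTableRouter, its methods, and the directory naming are all hypothetical. In Hibernate Search this kind of logic would presumably live in a custom IndexShardingStrategy implementation, whose exact method signatures depend on the version in use.

```java
/**
 * Hypothetical router: one shard (a separate Lucene index directory)
 * per marcTable value. Names here are illustrative, not Hibernate Search API.
 */
final class MarcTableRouter {
    private final int shardCount;

    MarcTableRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Maps marcTable values 1..shardCount onto shard numbers 0..shardCount-1. */
    int shardFor(int marcTable) {
        if (marcTable < 1 || marcTable > shardCount) {
            throw new IllegalArgumentException("marcTable out of range: " + marcTable);
        }
        return marcTable - 1;
    }

    /** Assumed naming scheme for the directory that holds one shard. */
    String directoryNameFor(int marcTable) {
        return "marcTable." + shardFor(marcTable);
    }
}
```

With 999 shards, a row whose marcTable is 42 is always written to, and searched in, shard 41 only, so a query for one marcTable type never touches the other indexes.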

The indexing speed is another issue: the index structure could be the performance bottleneck, but I doubt it. First you should check the way your entities get loaded: try enabling Hibernate's query log and verify how your data is being loaded. Most of the time the problem is that collections are loaded with subsequent lazy queries: one new query for each collection of each root entity adds up to a lot of DB queries and delays.
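As a rough back-of-the-envelope sketch of why lazy collection loading adds up, the following plain-Java helper counts queries under two assumed loading patterns. The class and method names are hypothetical, and the batched figure is a simplified model of what batch fetching (for example Hibernate's @BatchSize) aims for, not a measurement.

```java
/** Hypothetical helper estimating how many SQL queries a load pattern issues. */
final class LazyLoadCost {
    private LazyLoadCost() {
    }

    /** One query for the root entities, plus one per lazy collection per root. */
    static long lazyQueries(long rootEntities, int lazyCollectionsPerRoot) {
        return 1 + rootEntities * lazyCollectionsPerRoot;
    }

    /** Simplified model: each collection is fetched for batchSize roots at a time. */
    static long batchedQueries(long rootEntities, int lazyCollectionsPerRoot, int batchSize) {
        long batches = (rootEntities + batchSize - 1) / batchSize; // ceiling division
        return 1 + batches * lazyCollectionsPerRoot;
    }
}
```

For 1,000 root entities with 2 lazy collections each, the first pattern issues 2,001 queries, while the batched model with a batch size of 100 issues 21. The query log (for example with hibernate.show_sql enabled) shows which pattern a mapping actually produces.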

I am working on "indexing acceleration", but its release is being delayed by the work of making it "general purpose" rather than tailored only to my test cases. If you could provide me with the relevant model classes I am willing to take a look at it.

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.