-->

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 4 posts ] 
 Post subject: Weird Indexing Problem
PostPosted: Wed Jan 13, 2010 6:23 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Guys,
I'm having a strange problem with my indexing. Some field values aren't getting indexed and I have no idea why!

I'm following the standard approach outlined in your manual: http://docs.jboss.org/hibernate/stable/search/reference/en/html_single/#search-batchindex-indexing

I'm doing a batch index on my entity type Equity. Every entity contains an ISIN value. There are roughly 42,000 equity records. I can see from the log file at trace level that each record contains the ISIN, but when I query and open Luke I see that the ISIN isn't in the index even though the rest of the record is there (so it isn't that the record is completely missing).

The funny thing is this: in Luke I can see that all records with a document id up to 4671 have the ISIN value in the index, and all those after it don't. I have re-indexed from scratch several times, and it is always from the same document id (4671) onwards that the records have no ISIN.

I checked the logs for exceptions and I see none.
For example, this is the next record after document id 4671. I can see from the trace log that the ISIN is present, but when I query and open the index in Luke the ISIN value is missing.
Code:
10:08:07,901 [pool-17-thread-1] TRACE impl.lucene.works.AddWorkDelegate  - add to Lucene index: class com.mypackage.model.Equity#701251:
Document<stored/uncompressed,indexed<_hibernate_class:com.mypackage.model.Equity> stored/uncompressed,indexed<id:701251> stored/uncompressed,indexed,tokenized<name: NAME OF COMPANY-> stored/uncompressed,indexed<active:true> stored/uncompressed,indexed<isin:US541XXXXX> stored/uncompressed,indexed,tokenized<shortName:ShortNameOfCompany> indexed<riskClass:DYNAMIC>>


I'm baffled as to why it isn't there, and why records 0-4671 have ISINs while from 4671 onwards no ISINs are indexed, even though I can see values for them in the log file!

How do I turn on logging for Lucene via HS? I've read somewhere that I have to set an input stream to get it to work. How is this done via HS?

Thanks Guys,
LL


 Post subject: Re: Weird Indexing Problem
PostPosted: Wed Jan 13, 2010 7:11 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
OK, ignore the above, I found the cause. It is indexing twice: my generated join is returning the same record twice but with different values. I need to look into it more.


 Post subject: Re: Weird Indexing Problem
PostPosted: Sat Jan 16, 2010 1:51 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi, what do you mean by "my generated join"?
Could you try the new approach in Hibernate Search 3.2.0 Beta1?
Re-indexing your 42,000 records should be much faster and less error-prone.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Weird Indexing Problem
PostPosted: Mon Jan 18, 2010 9:33 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
By "generated" I mean this:

Code:
ScrollableResults results = fullTextSession.createCriteria( Equity.class )
    .setFetchSize(BATCH_SIZE)
    .scroll( ScrollMode.FORWARD_ONLY );



The select generated by this returns the same records twice (or more, depending on the number of values in the map), but with different values, or should I say with values missing.

There is a collection (a Map) inside Equity which maps to a separate table, and it can have several values, usually two. The generated join is a right join: it selects all the values from the Map table and then maps them back to the Equity table, thus giving me the same record two or three times depending on how many values are in the map.

So if I have 42,000 equity records and the map collection has two values for every equity, I get 84,000 records back from the generated join, and the map has only one value in each returned equity record. Thus the last indexed record will be missing a value in the map.

Hope that's clear,
LL
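
To make the arithmetic above concrete, here is a minimal plain-Java sketch of the effect as described in this post. It does not use Hibernate or Lucene: the class JoinDuplicatesSketch, the Row type and the two "index" methods are hypothetical names invented for illustration, and the index is modelled as a simple map keyed by equity id. It assumes, as described, that each joined row carries only one map entry and that indexing a row replaces any earlier document with the same id.

Code:
```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the problem; not Hibernate Search code.
class JoinDuplicatesSketch {

    /** One row of the joined result: the equity id plus a single map entry. */
    static final class Row {
        final long equityId;
        final String mapKey;
        final String mapValue;

        Row(long equityId, String mapKey, String mapValue) {
            this.equityId = equityId;
            this.mapKey = mapKey;
            this.mapValue = mapValue;
        }
    }

    /** Simulates the join: every equity is repeated once per entry in its map. */
    static List<Row> joinedRows(int equities, int entriesPerEquity) {
        List<Row> rows = new ArrayList<>();
        for (long id = 1; id <= equities; id++) {
            for (int e = 0; e < entriesPerEquity; e++) {
                rows.add(new Row(id, "key" + e, "value" + e));
            }
        }
        return rows;
    }

    /** Indexes every row as its own document; a later row replaces the earlier one. */
    static Map<Long, Map<String, String>> indexRowByRow(List<Row> rows) {
        Map<Long, Map<String, String>> index = new LinkedHashMap<>();
        for (Row r : rows) {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put(r.mapKey, r.mapValue);
            index.put(r.equityId, doc); // last write wins, earlier entries are lost
        }
        return index;
    }

    /** Groups rows by equity id so each document keeps all of its map entries. */
    static Map<Long, Map<String, String>> indexGroupedById(List<Row> rows) {
        Map<Long, Map<String, String>> index = new LinkedHashMap<>();
        for (Row r : rows) {
            index.computeIfAbsent(r.equityId, k -> new LinkedHashMap<>())
                 .put(r.mapKey, r.mapValue);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Row> rows = joinedRows(42_000, 2);
        System.out.println("joined rows: " + rows.size());
        System.out.println("map entries in document 1, row by row: "
                + indexRowByRow(rows).get(1L).size());
        System.out.println("map entries in document 1, grouped by id: "
                + indexGroupedById(rows).get(1L).size());
    }
}
```

The grouped variant only shows the idea of collapsing duplicate rows per id before indexing; how to get that from the real query (or avoid the join altogether) depends on the mapping and is not shown here.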


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.