
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 5 posts ] 
Author Message
 Post subject: Hibernate Search: Projecting DOCUMENT_ID from search
PostPosted: Sat Dec 17, 2011 9:07 am 
Regular

Joined: Thu Oct 07, 2004 4:45 pm
Posts: 92
If I use FullTextQuery.DOCUMENT_ID in a projection to get the document ID out of a search, what (if anything) guarantees that the IndexReader I subsequently get through the ReaderProvider will be the same one that was used for the search? I'm using shared readers, but index updates may be going on in the background. Does Hibernate hold onto the IndexReader until the end of the transaction? Thanks!


 
 Post subject: Re: Hibernate Search: Projecting DOCUMENT_ID from search
PostPosted: Mon Dec 19, 2011 9:31 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
There are several parts to the answer. First of all, there is no guarantee that you get the same IndexReader. Each time you call ReaderProvider#openIndexReader, IndexReader#reopen() is called under the hood, and whether you get a different reader back depends on whether or not the index has changed in the meantime.

Now to the transactions. A lot depends on the isolation level and on whether you are using transactional batching (which is the default). With batching, all Lucene work is queued and executed as a transaction synchronization. So the question is whether, in your setup, an update/delete transaction can complete while another read transaction is in progress. With the default isolation level of READ COMMITTED this can in fact happen, so there is no guarantee here either. You could change the isolation level, but that might have other implications and should be thought through carefully.

Is there a particular use-case you have in mind?

--Hardy


 
 Post subject: Re: Hibernate Search: Projecting DOCUMENT_ID from search
PostPosted: Mon Dec 19, 2011 11:44 am 
Regular

Joined: Thu Oct 07, 2004 4:45 pm
Posts: 92
It's possible that an update/delete could happen, and there's not much I can do about that, unfortunately.

The use case I have in mind is running Lucene's Highlighter module on a field in the index to show users the text that matched. Ideally, it would be great if I could get a callback invoked by Hibernate Search for each hit so that I can pass the document and IndexReader to the highlighter. But I don't think that's possible in the current release, right? So my next thought was to project the document ID and document from the Hibernate Search query result and iterate through them. But if there's a chance that the IndexReader could be different, then the document IDs are obsolete. I would have to run a separate search via the native Lucene API just to get the same documents again.
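The concern about obsolete document IDs can be illustrated with a toy model in plain Java. This is only a sketch and not Lucene or Hibernate Search code: the `SnapshotReader` class is a hypothetical stand-in for a point-in-time IndexReader, and it assumes a document ID is nothing more than a position that gets renumbered when deleted documents are compacted away.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleDocIdDemo {

    /** Hypothetical stand-in for a point-in-time reader: a doc ID is just a position. */
    static final class SnapshotReader {
        private final List<String> docs;

        SnapshotReader(List<String> docs) {
            // Copy the list so this snapshot never sees later index changes.
            this.docs = new ArrayList<>(docs);
        }

        String document(int docId) {
            return docs.get(docId);
        }
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>(Arrays.asList("alpha", "beta", "gamma"));

        // The reader used by the search, and the doc ID it reported for the hit "beta".
        SnapshotReader searchReader = new SnapshotReader(index);
        int hitDocId = 1;

        // A background delete plus compaction renumbers the remaining documents.
        index.remove(0);
        SnapshotReader laterReader = new SnapshotReader(index);

        System.out.println(searchReader.document(hitDocId)); // beta
        System.out.println(laterReader.document(hitDocId));  // gamma
    }
}
```

In this model the same numeric ID resolves to a different document once the reader changes, which is why an ID is only meaningful together with the reader that produced it.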


 
 Post subject: Re: Hibernate Search: Projecting DOCUMENT_ID from search
PostPosted: Mon Dec 19, 2011 12:08 pm 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

Another solution that does not require an IndexReader is to re-analyze the text. Depending on your setup, this might be only minimally slower. In fact, the Lucene folks even say: "Whether to index term vectors or reanalyze the text is an application-dependent decision: run your own tests to measure the difference in runtime and index size for each approach."

You can store the original text in the index and project the value. Once you have the value you use the right analyzer to create the token stream and pass everything to the highlighter.
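The idea of highlighting from the projected text alone, with no IndexReader and no document ID, can be sketched in plain Java. This is not Lucene's Highlighter API: the `ReanalyzeHighlightSketch` class, its `highlight` method, and the word-splitting, lowercasing "analyzer" are all hypothetical stand-ins. In a real setup the projected field value and the field's analyzer would presumably be handed to Lucene's Highlighter instead.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReanalyzeHighlightSketch {

    // Hypothetical "analyzer": tokens are runs of word characters, compared in lowercase.
    private static final Pattern TOKEN = Pattern.compile("\\w+");

    /** Re-tokenizes the stored text and wraps each token that matches a query term in B tags. */
    static String highlight(String storedText, Set<String> queryTerms) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(storedText);
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous token and this one unchanged.
            out.append(storedText, last, m.start());
            String token = m.group();
            if (queryTerms.contains(token.toLowerCase(Locale.ROOT))) {
                out.append("<B>").append(token).append("</B>");
            } else {
                out.append(token);
            }
            last = m.end();
        }
        // Copy whatever follows the final token.
        out.append(storedText.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(Arrays.asList("index", "reader"));
        // Prints: The <B>Index</B> is read through a <B>reader</B>.
        System.out.println(highlight("The Index is read through a reader.", terms));
    }
}
```

Because only the stored text and the query terms are needed, nothing here depends on which reader served the original search. The cost is that the whole field value has to be tokenized again for every hit.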

--Hardy


 
 Post subject: Re: Hibernate Search: Projecting DOCUMENT_ID from search
PostPosted: Mon Dec 19, 2011 12:41 pm 
Regular

Joined: Thu Oct 07, 2004 4:45 pm
Posts: 92
I'm concerned about how much RAM that could consume; the text field is lengthy. But I will investigate further. Thanks.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.