-->

All times are UTC - 5 hours [ DST ]



 Post subject: Behavior of Hibernate Search queries
Posted: Thu May 19, 2011 3:42 pm
Hi,

I am still trying out the various search options in Hibernate Search to find the one that is both most accurate and best performing. My use case is as follows.

I have three fields

Field 1 ---> Captures the name of a city
Field 2 ---> Key attractions of the city
Field 3 ---> Describes the city in around 1000-3500 words



The search form, as expected, has only one text box, which searches across all of these indexed fields.

My questions are as follows (assuming all these fields have been indexed using NGram and other typical analyzers):

1. Suppose a user enters "what are the museums in New York City".
How do I specify that Field 1 should be searched only with the city part of the text, while Field 2 and Field 3 should be searched with the whole text?

2. I tried boosting Field 2 and Field 3 higher than Field 1. I use a BooleanQuery that combines all three fields, so the fields are effectively OR'ed together. Since "New York" can occur many times, I assumed the OR could bring up results about things other than museums, so I boosted Field 2 and Field 3, expecting documents where "museum" was found to rank higher than documents where it was not. However, it didn't have any effect. I also use a sort field; could that be the cause? I removed the sort, but the results still do not seem to come back in the order implied by the boosts.
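
To make the expectation in question 2 concrete, here is a toy sketch in plain Java. It is not Lucene's scoring formula and uses no Hibernate Search API; the class name, the field names ("city", "attractions", "description"), the boost values and the sample data are all hypothetical. It only models the idea that, with the fields OR'ed together, a match in a higher-boosted field should contribute more to the ranking.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class BoostedOrSketch {

    // Lowercase the text and split it on anything that is not a letter or digit.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Toy score (an assumption, not Lucene's formula): every query token
    // found in a field adds that field's boost to the document's total.
    static double score(Map<String, String> doc, String query, Map<String, Double> boosts) {
        Set<String> queryTokens = tokenize(query);
        double total = 0.0;
        for (Map.Entry<String, Double> boost : boosts.entrySet()) {
            Set<String> fieldTokens = tokenize(doc.getOrDefault(boost.getKey(), ""));
            for (String token : queryTokens) {
                if (fieldTokens.contains(token)) {
                    total += boost.getValue();
                }
            }
        }
        return total;
    }

    // Hypothetical boosts: Field 2 and Field 3 weigh more than Field 1.
    static Map<String, Double> sampleBoosts() {
        Map<String, Double> boosts = new LinkedHashMap<>();
        boosts.put("city", 1.0);
        boosts.put("attractions", 3.0);
        boosts.put("description", 3.0);
        return boosts;
    }

    // Builds a three-field document as a simple map.
    static Map<String, String> doc(String city, String attractions, String description) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("city", city);
        d.put("attractions", attractions);
        d.put("description", description);
        return d;
    }

    public static void main(String[] args) {
        Map<String, String> withMuseums = doc("New York", "museums and galleries", "A large city.");
        Map<String, String> withoutMuseums = doc("New York", "parks", "A large city.");
        String query = "museums in New York";
        System.out.println(score(withMuseums, query, sampleBoosts()));
        System.out.println(score(withoutMuseums, query, sampleBoosts()));
    }
}
```

Under this toy model the document that mentions museums in the boosted field outranks the one that only matches the city name, which is the ordering I expected from the real query.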

3. My next problem is one I am sure many of you have faced: finding documents that contain all the words of the search text. For example, take the search text: which trains are available between Park Avenue and Brooklyn. I want to search for this in Field 2 and Field 3. I found that a PhraseQuery, or QueryParser with the ~ (slop) option, returns results, but large slops tend to be expensive. A field could hold a full text of 5000 words, and a slop of ~5000 is slow. My requirement is that every word of the search text must be found in the fields: for the search text above, a result should be returned only if all of its tokens occur somewhere in Field 2 or Field 3. What would be the approach for this?
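
The matching rule in question 3 can be written down as a minimal plain-Java sketch. It uses no Lucene or Hibernate Search API; the class and method names are hypothetical, and it assumes that a token may be found in either Field 2 or Field 3 and that word order and distance do not matter. In Lucene terms this is presumably closer to one required (`BooleanClause.Occur.MUST`) clause per token than to a phrase query with a very large slop.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class AllTermsMatch {

    // Lowercase the text and split it on anything that is not a letter or digit.
    // A real analyzer would typically also drop stop words; this sketch does not.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // True only if every token of the search text occurs somewhere in
    // Field 2 (attractions) or Field 3 (description), in any order.
    static boolean matchesAll(String searchText, String attractions, String description) {
        Set<String> available = tokenize(attractions);
        available.addAll(tokenize(description));
        return available.containsAll(tokenize(searchText));
    }

    public static void main(String[] args) {
        String attractions = "Brooklyn Bridge, Park Avenue";
        System.out.println(matchesAll("trains Park Avenue Brooklyn",
                attractions, "Several trains run daily."));
        System.out.println(matchesAll("trains Park Avenue Brooklyn",
                attractions, "Buses run daily."));
    }
}
```

The first call matches because every token appears in one of the two fields; the second does not, because "trains" is missing from both. Because the check is a set lookup per token, its cost does not depend on how far apart the words are in the text, which is the property I am after.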


Any pointers on these would be very helpful.

Thanks,
David

