Hi,
I am still trying out the various search options in Hibernate Search, looking for the one that is both the most accurate and the best performing. My use case is as follows.
I have three fields:
Field 1 ---> Captures the name of a city
Field 2 ---> Key attractions of the city
Field 3 ---> Describes the city in around 1000-3500 words
The search form, as expected, has only one text box, which searches across all of these indexed fields.
My questions are as follows (assuming all these fields have been indexed using NGram and other typical analyzers):
1. Suppose a user enters "what are the museums in New York City". How do I make Field 1 match only the city part of the text, while Field 2 and Field 3 are searched with the whole text?
2. I tried boosting Field 2 and Field 3 higher than Field 1. I use a BooleanQuery to combine all these fields, which means the fields are ORed together. Since "New York" can occur many times, I assumed the search could bring up results about things other than museums (because of the OR). So I boosted Field 2 and Field 3, expecting documents where "museum" was found to rank higher than those where it was not. However, it didn't have any effect. I use a sort field too; could that be the cause? I removed the sort, but the results still do not seem to come back in the order the boosts would suggest.
3. My next problem is one I am sure many of you have faced: finding all of the words of the search text in a field. For example, suppose the search text is "which trains are available between Park Avenue and Brooklyn" and I want to search for it in Field 2 and Field 3. I found that a PhraseQuery, or QueryParser with the ~ option (slop), returns results, but slop tends to be performance heavy. I could have a full-blown text of 5000 words, and using a slop of ~5000 is slow. My requirement is that every word of the search text must be matched. That is, for the search text above, a result should be returned only if all the tokens were found somewhere in Field 2 or Field 3, in any position. What would be the approach for this?
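To make question 3 concrete, here is a minimal sketch of the "every token is required, position does not matter" idea, written out as a classic Lucene query parser string (`+` marks a required clause, `field:term` targets a field). It is only an illustration under assumptions: the field names `attractions` and `description` stand in for Field 2 and Field 3, the class and method names are hypothetical, and the lowercasing and splitting here is a crude stand-in for whatever the real analyzers do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: builds a query string in which every token of the
// user input is required (+), but may match in any of the given fields.
final class AllTermsQuery {

    static String build(String userInput, String... fields) {
        List<String> clauses = new ArrayList<>();
        // Crude tokenization (assumption): lowercase, split on anything
        // that is not a letter or digit. A real analyzer would differ.
        String[] tokens = userInput.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first token on leading separators
            }
            List<String> perField = new ArrayList<>();
            for (String field : fields) {
                perField.add(field + ":" + token);
            }
            // The token must match, in at least one of the fields.
            clauses.add("+(" + String.join(" ", perField) + ")");
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(build("trains Brooklyn", "attractions", "description"));
    }
}
```

For the input "trains Brooklyn" this prints `+(attractions:trains description:trains) +(attractions:brooklyn description:brooklyn)`. The point is that no phrase or slop is involved, only one required clause per token; whether this is the right way to express it in Hibernate Search is exactly what I am asking.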
Any pointers on these would be very helpful.
Thanks, David