
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 7 posts ] 
 Post subject: Does Hibernate Search support finding similar objects?
PostPosted: Wed Sep 19, 2007 10:02 pm 
Newbie

Joined: Mon Jun 12, 2006 4:11 am
Posts: 6
Hi all,

I need to find similar objects in my application. Say, given a book object, I need to search for similar book objects according to category, keywords, summary, etc., and return them to the end user. I can use Lucene's similarity search to achieve that. Since Hibernate Search can handle most of my needs, I would like to use it as my wrapper around the underlying Lucene, but I wonder how to achieve this in Hibernate Search. Does it support similarity?

I think a better way would be something like QBE (query by example): I give the target book object, specify the fields to be considered, run the query, and get back a list of similar book objects. Many thanks.


PostPosted: Thu Sep 20, 2007 12:02 pm 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
Do you mean you want to inject an implementation of the Similarity class of Lucene (the score algorithm)?
Or do you want to use a FuzzyQuery based on the properties of an example object?

I guess something similar to the QBE API could do it: you can very well build the Lucene query out of an example object, and then inject the Lucene query into the Hibernate Search query.

_________________
Emmanuel


PostPosted: Thu Sep 20, 2007 9:37 pm 
Newbie

Joined: Mon Jun 12, 2006 4:11 am
Posts: 6
Hi emmanuel,

Thanks for your reply. To make it clear, I only want a function like Google's "similar pages" link in a search result list. For each page, you can get its similar pages through a similarity scoring algorithm (Lucene has a similarity algorithm based on term vectors).

I am afraid that ordinary QBE would not do this correctly, because it is turned into an SQL query, which is not good at full-text search (this is also why we need a search engine like Lucene, am I wrong?). I also think it is not the same thing as fuzzy search. For example, if I have a book about how to use Hibernate, I may also want other similar books (on the same topic), but I cannot give a definite query for this. Lucene uses a term vector to compute the frequency of every term and compares it across all documents, then returns the similar documents. In short, a fuzzy query still needs you to supply a query condition/target, while a similarity function automatically analyzes all the terms in the one or several fields you specify (just as with Google, you do not need to type a query to get similar pages).
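
To illustrate the idea, here is a minimal sketch in plain Java. It is not Lucene's actual implementation, and the class and method names are made up for illustration. Each text is turned into a term-frequency map, and two texts are compared by the cosine of the angle between their vectors, so no query has to be typed in.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene or Hibernate Search.
class TermVectorSimilarity {

    // Count how often each lower-cased word occurs in the text.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> freq = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                freq.merge(token, 1, Integer::sum);
            }
        }
        return freq;
    }

    // Cosine similarity: 0 when no term is shared, close to 1 when the
    // two vectors point in the same direction.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += (double) e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> target = termFrequencies("How to use Hibernate for object mapping");
        Map<String, Integer> similar = termFrequencies("Object mapping with Hibernate explained");
        Map<String, Integer> other = termFrequencies("Cooking pasta at home");
        System.out.println("similar book: " + cosine(target, similar));
        System.out.println("other book:   " + cosine(target, other));
    }
}
```

A real search engine would also weight rare terms more heavily and skip stop words; this sketch only counts raw frequencies.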

The reason I refer to QBE is that I think it would be better to expose this function in a QBE-like way. In the Google case, each page returned is a page object with a field named content; if you click the "similar pages" button, Google automatically analyzes its content (the content field), compares it to all the pages it has indexed, and returns similar pages to you. In this process you do not need to write a definite query condition; you only need to know the target page.

I hope I have expressed myself clearly. I just wonder how to achieve the function I described above. Could you give me a hand? Many thanks.


PostPosted: Fri Sep 21, 2007 12:13 am 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
Ah, got you now.
I looked at how this is implemented in plain Lucene (through the TermFreqVector), and I am pretty sure it can be done for Hibernate Search through a QBE-like API (not the exact QBE API). This QBE-like API could create a Lucene query that is then consumed by Hibernate Search.

If you're interested in implementing that, let's talk about it (probably on the dev mailing list). I can definitely give you advice to make sure it goes back into the HSearch code base.

In any case, open a JIRA issue; that's a great idea.

_________________
Emmanuel


PostPosted: Fri Sep 21, 2007 12:45 am 
Newbie

Joined: Mon Jun 12, 2006 4:11 am
Posts: 6
Hi Emmanuel,
Thanks a lot for your quick reply. I would like to try with your help and advice. What should I do for this?

BTW, I've already added this to JIRA.


PostPosted: Mon Oct 01, 2007 11:12 am 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
Hi,
sorry for the delay.
If you have a copy of Lucene in Action, check the section about finding similar objects (using TermFreqVector). This is how the core algorithm should be done, probably in a helper method of some kind.

As a first draft, look at the Example class: the first methods are agnostic to Hibernate, so we will then just need a toLuceneQuery() method that creates the query.
The hard part is that I don't expose the HSearch metadata today; this needs to be worked out as well.
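
A rough sketch of the query-building step, assuming the term frequencies of the example object's field are already available as a plain map (a stand-in for what TermFreqVector provides). The class name, the method name and the string output are hypothetical; a real toLuceneQuery() would build a Lucene Query object rather than a string, and would also have to exclude the example object itself from the results.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene or Hibernate Search.
class SimilarQuerySketch {

    // Pick the maxTerms most frequent terms of the example document and
    // OR them together as "field:term" clauses.
    // Assumption: terms are plain words, so no query-syntax escaping is done.
    static String toQueryString(String field, Map<String, Integer> termFreqs, int maxTerms) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(termFreqs.entrySet());
        // Highest frequency first; ties are broken alphabetically so the
        // result is deterministic.
        entries.sort((x, y) -> {
            int byFreq = Integer.compare(y.getValue(), x.getValue());
            return byFreq != 0 ? byFreq : x.getKey().compareTo(y.getKey());
        });
        int limit = Math.max(0, Math.min(maxTerms, entries.size()));
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, Integer> e : entries.subList(0, limit)) {
            if (query.length() > 0) {
                query.append(" OR ");
            }
            query.append(field).append(':').append(e.getKey());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> freqs = new HashMap<>();
        freqs.put("hibernate", 3);
        freqs.put("session", 2);
        freqs.put("cache", 2);
        freqs.put("query", 1);
        // prints: summary:hibernate OR summary:cache OR summary:session
        System.out.println(toQueryString("summary", freqs, 3));
    }
}
```

The field name would come from the mapping metadata, which is exactly the part that is not exposed yet.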

_________________
Emmanuel


PostPosted: Wed Oct 03, 2007 9:58 am 
Newbie

Joined: Mon Jun 12, 2006 4:11 am
Posts: 6
Hi emmanuel,
Thanks for your hints, I will check it out and talk to you later.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.