Hi Emmanuel,
Thanks for your reply. To be clear, I only want a function like Google's "Similar pages" link in a search result list: for each page, you can get its similar pages through a similarity score algorithm (Lucene has a similarity algorithm based on term vectors).
I am afraid that ordinary QBE would not do this correctly, because it is turned into an SQL query, and SQL is not good at full-text search (which is also why we need a search engine like Lucene, or am I wrong?). I also think this is not the same thing as fuzzy search. For example, if I have a book about how to use Hibernate, I may also want other similar books (ones that cover the same topic), but I cannot write a definite query for that. Lucene would use a term vector to calculate the frequency of every term, compare it across all documents, and then return the similar documents. In short, a fuzzy query still needs you to supply a query condition or target term, while a similarity function automatically analyzes all the information/terms in the one or several fields you specify (just like Google, where you do not need to enter a query to get similar pages).
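To show what I mean by comparing term vectors, here is a rough sketch in plain Java. It is not Lucene's API: the class and method names are made up for illustration, the tokenizer is deliberately naive, and using cosine similarity over raw term counts is only my assumption about how such a score could be computed.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustration only: hypothetical names, not part of Lucene or Hibernate.
final class TermVectorSimilarity {

    // Build a term -> frequency map (naive tokenizer: lowercase, split on non-alphanumerics).
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Cosine similarity of two term vectors: 0.0 when no term is shared,
    // close to 1.0 when both texts use the same terms in the same proportions.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    // Euclidean length of a term vector.
    private static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int f : v.values()) {
            sum += (double) f * f;
        }
        return Math.sqrt(sum);
    }
}
```

With this, two books about Hibernate would score higher against each other than a Hibernate book against a cookbook, and nobody had to type a query.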
The reason I mention QBE is that I think it would be best to expose this function in a QBE-like way. In the Google case, each page returned by Google is like a page object with a field named content. If you click the "Similar pages" link, Google automatically analyzes its content (the content field), compares it to all the pages it has indexed, and returns the similar pages to you. In this process you do not need to write a definite query condition; you only need to know the target page.
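The QBE-like call I have in mind could look roughly like the sketch below. Everything in it is hypothetical (the class `SimilarByExample`, the methods, the map of id to content standing in for an index), and the way it works is only my assumption: it picks the terms that are frequent in the target but rare elsewhere (tf times idf) and uses them as an automatically built query, then ranks the other documents by how many of those terms they contain.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustration only: hypothetical names, not a Hibernate or Lucene API.
final class SimilarByExample {

    // Naive tokenizer: lowercase, split on non-alphanumerics.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    // Terms of the target ranked by tf * log(1 + N/df); ties are broken alphabetically.
    static List<String> interestingTerms(String target, Map<String, String> contentById, int maxTerms) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (String content : contentById.values()) {
            for (String term : new HashSet<>(tokens(content))) docFreq.merge(term, 1, Integer::sum);
        }
        Map<String, Integer> tf = new HashMap<>();
        for (String term : tokens(target)) tf.merge(term, 1, Integer::sum);

        int n = contentById.size();
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int df = docFreq.getOrDefault(e.getKey(), 1);
            score.put(e.getKey(), e.getValue() * Math.log(1.0 + (double) n / df));
        }
        List<String> terms = new ArrayList<>(score.keySet());
        terms.sort((x, y) -> {
            int c = Double.compare(score.get(y), score.get(x));
            return c != 0 ? c : x.compareTo(y);
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }

    // Query by example: the caller passes only the id of the target, never a query string.
    static List<String> findSimilar(String targetId, Map<String, String> contentById, int maxTerms, int limit) {
        List<String> query = interestingTerms(contentById.get(targetId), contentById, maxTerms);
        Map<String, Integer> hits = new HashMap<>();
        for (Map.Entry<String, String> e : contentById.entrySet()) {
            if (e.getKey().equals(targetId)) continue; // skip the target itself
            Set<String> docTerms = new HashSet<>(tokens(e.getValue()));
            int matched = 0;
            for (String q : query) {
                if (docTerms.contains(q)) matched++;
            }
            if (matched > 0) hits.put(e.getKey(), matched);
        }
        List<String> ids = new ArrayList<>(hits.keySet());
        ids.sort((x, y) -> {
            int c = Integer.compare(hits.get(y), hits.get(x));
            return c != 0 ? c : x.compareTo(y);
        });
        return new ArrayList<>(ids.subList(0, Math.min(limit, ids.size())));
    }
}
```

The point of the design is the signature of `findSimilar`: the only input is the target object (here its id), plus limits, which is what makes it feel like QBE rather than a normal search.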
I hope I have expressed myself clearly. I just wonder how to achieve the function I described above. Could you give me a hand? Many thanks.