All times are UTC - 5 hours [ DST ]



 Post subject: Hibernate Search: Relaxing Searches Between 1/L or O/0
Posted: Thu Aug 21, 2008 6:03 pm
Regular

Joined: Fri Oct 05, 2007 1:17 pm
Posts: 78
I am working with Hibernate Search and finding it to be very cool, but I received an interesting request from my customer. For one particular search parameter, they want the search to treat the digit 1 (one) and the lowercase letter l (L) as interchangeable, and likewise the letter O and the digit 0 (zero).

I was wondering if Lucene offers any sort of query that would provide relevant functionality. If not, how would you go about implementing such a thing?

Thanks.


Posted: Fri Aug 22, 2008 4:36 am
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

The way to solve this is via a custom Analyzer. The idea is to insert additional tokens when analyzing the text. For example, 'Oval' would be indexed as 'Oval', '0val', 'Ova1' and '0va1'. This is similar to the synonym engine described in Lucene in Action. Maybe this article helps as well, though it is old: http://www.onjava.com/pub/a/onjava/2003/01/15/lucene.html
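
To make the token expansion concrete, here is a minimal sketch in plain Java of the step that turns one token into its look-alike variants. It is not Lucene code: the class name ConfusableVariants and the methods counterpart and expand are hypothetical names chosen for illustration, and the mapping (O/o to 0, 0 to o, l to 1, 1 to l) is an assumption about what the customer wants. A custom token filter could call something like this and emit each variant as an extra token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or Hibernate Search.
class ConfusableVariants {

    // Returns the look-alike counterpart of c, or '\0' if c has none.
    // The mapping itself is an assumption made for this sketch.
    static char counterpart(char c) {
        switch (c) {
            case 'O': case 'o': return '0';
            case '0': return 'o';
            case 'l': return '1';
            case '1': return 'l';
            default: return '\0';
        }
    }

    // Builds every combination of original and swapped characters.
    // The unmodified token is always the first element of the result.
    static List<String> expand(String token) {
        List<String> variants = new ArrayList<>();
        variants.add("");
        for (char c : token.toCharArray()) {
            char alt = counterpart(c);
            List<String> next = new ArrayList<>();
            for (String prefix : variants) {
                next.add(prefix + c);          // keep the character as written
                if (alt != '\0') {
                    next.add(prefix + alt);    // also try its look-alike
                }
            }
            variants = next;
        }
        return variants;
    }

    public static void main(String[] args) {
        // Prints the variants for the example token used above.
        System.out.println(expand("Oval"));
    }
}
```

Note that a token with k ambiguous characters produces 2^k variants, so the index grows quickly for tokens such as long serial numbers that contain many of these characters.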

--Hardy


Posted: Fri Aug 22, 2008 5:40 pm
Regular

Joined: Fri Oct 05, 2007 1:17 pm
Posts: 78
I am new to Hibernate Search (and thus to Lucene), so I am unclear where to do what you describe. Is it at the Token level or at the Filter level or at the Analyzer level? (Let's assume that question makes sense.)

I am thinking of a custom filter that gets called inside a custom analyzer, which basically decorates StandardAnalyzer. Does that sound reasonable?

Any hints as to where the implementation should be are appreciated.

Incidentally, I can apply my custom analyzer at the field level, overriding the default analyzer specified in my persistence.xml (this is a JPA app). Right?

Thanks.


Posted: Fri Aug 22, 2008 5:53 pm
Regular

Joined: Fri Oct 05, 2007 1:17 pm
Posts: 78
Just had another thought: a RegexQuery.

To use your example, I could take the term "oval" and do something like this:

[o0]va[l1]

Ugly I know...basically "(lowercase O or the number zero) then lowercase V then lowercase A then (lowercase L or the number one)"

and pass that to a RegexQuery that will look at the index for "oval" generated by plain old StandardAnalyzer.

Would that work? If so, would there be a performance drawback compared to a more elegant approach?
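
For illustration, the pattern could be built from the search term along these lines. This is only a sketch: the class name LenientPattern and the method toRegex are hypothetical, the term is assumed to be lowercased already (as StandardAnalyzer would do), and java.util.regex is used here purely to check the pattern. The regex dialect accepted by Lucene's RegexQuery may differ, so the escaping in particular is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or Hibernate Search.
class LenientPattern {

    // Turns a lowercased term into a regex in which o/0 and l/1 are interchangeable.
    static String toRegex(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            switch (c) {
                case 'o': case '0':
                    sb.append("[o0]");
                    break;
                case 'l': case '1':
                    sb.append("[l1]");
                    break;
                default:
                    // Backslash-escape anything that could be a regex metacharacter.
                    if (!Character.isLetterOrDigit(c)) {
                        sb.append('\\');
                    }
                    sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String regex = toRegex("oval");
        System.out.println(regex);
        System.out.println(Pattern.matches(regex, "0va1"));
    }
}
```

The index stays unchanged with this approach, and all the leniency is applied at query time.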

Thanks.


Posted: Sat Aug 23, 2008 1:31 pm
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

The Lucene FAQ has an example of how to write your own analyzer, and yes, you can specify the analyzer at the field level with an annotation.

I recommend you go down this path, since it is the Lucene way :)

--Hardy


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.