-->
All times are UTC - 5 hours [ DST ]



 Post subject: Querying fields containing punctuation characters (''', '.')
PostPosted: Fri Jul 10, 2009 12:59 am 
Newbie

Joined: Thu Mar 05, 2009 3:56 pm
Posts: 10
Hi, I have a name field in an entity, and a name can contain a dot (.) as a separator between the first, middle, and last names. I am using StandardFilterFactory which takes care of such tokenization. But when I run a query in which the first/middle/last names are separated by spaces, it returns no result, which means that the query did not match the entity.
Code:
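import java.io.Serializable;
import java.util.List;

import javax.persistence.CascadeType;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.ManyToMany;

import org.apache.solr.analysis.LowerCaseFilterFactory;
import org.apache.solr.analysis.SnowballPorterFilterFactory;
import org.apache.solr.analysis.StandardFilterFactory;
import org.apache.solr.analysis.StandardTokenizerFactory;
import org.hibernate.search.annotations.Analyzer;
import org.hibernate.search.annotations.AnalyzerDef;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Index;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.IndexedEmbedded;
import org.hibernate.search.annotations.Parameter;
import org.hibernate.search.annotations.Store;
import org.hibernate.search.annotations.TokenFilterDef;
import org.hibernate.search.annotations.TokenizerDef;

// Analyzer chain: standard tokenizer, standard filter, lowercasing, English Snowball stemmer
@Entity
@Indexed
@AnalyzerDef(name = "customanalyzer",
      tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
      filters = {
         @TokenFilterDef(factory = StandardFilterFactory.class),
         @TokenFilterDef(factory = LowerCaseFilterFactory.class),
         @TokenFilterDef(factory = SnowballPorterFilterFactory.class,
            params = { @Parameter(name = "language", value = "English") })
      })
public class Book implements Serializable {

   @Id @GeneratedValue
   private Long id;

   @Field(index = Index.TOKENIZED, store = Store.NO)
   @Analyzer(definition = "customanalyzer")
   private String title;

   // Authors are indexed as part of the Book document (authors.name)
   @ManyToMany(cascade = CascadeType.ALL)
   @IndexedEmbedded
   @Analyzer(definition = "customanalyzer")
   private List<Author> authors;
}

// @Entity is needed here because Author is the target of the @ManyToMany association
@Entity
public class Author implements Serializable {

   @Id @GeneratedValue
   private Long id;

   @Field(index = Index.TOKENIZED, store = Store.NO)
   @Analyzer(definition = "customanalyzer")
   private String name;
}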


In the above code, author.name is the field I want to query. For example, I have the author name "H.C.Verma" in the database. When I query with the string "H C Verma", it doesn't return the entity containing "H.C.Verma" as the author, but when I give "H.C.Verma" as the query, the appropriate result is returned. Can you please help me?

I have one more question, which is related to the stemmer. One of the books has the title "Mathematics". When I query for "Math" or "Maths", the entities containing "Mathematics" are not returned, but when I query with "Mathematics", the appropriate result is returned. How can I solve this issue too? I am using SnowballPorterFilterFactory, which acts as a stemmer.


 Post subject: Re: Querying fields containing punctuation characters (''', '.')
PostPosted: Fri Jul 10, 2009 4:17 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
satyendra_411 wrote:
In the above code, author.name is the field I want to query. For example, I have the author name "H.C.Verma" in the database. When I query with the string "H C Verma", it doesn't return the entity containing "H.C.Verma" as the author, but when I give "H.C.Verma" as the query, the appropriate result is returned. Can you please help me?

The StandardTokenizer is not cutting it in your case. The javadoc says: "Splits words at punctuation characters, removing punctuation. However, a dot that's not followed by whitespace is considered part of a token." So "H.C.Verma" ends up in the index as a single token, while the query "H C Verma" produces three separate ones. If you want to split your names, you will have to use a custom tokenizer, or maybe Solr's PatternTokenizer.
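
To illustrate the mismatch, here is a minimal plain-Java sketch. It does not use the Lucene or Solr APIs; the class and method names are made up for the example, and the whitespace-only split is just a simplified stand-in for the "dot stays inside the token" behaviour quoted above. It shows that splitting on dots as well as whitespace gives the same tokens for the stored name and for the query, which is roughly what a pattern-based tokenizer configured with a pattern like `[.\s]+` would be expected to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class NameTokenizerSketch {

    // Hypothetical helper: split on the given regex, drop empty parts, lowercase.
    static List<String> split(String text, String delimiterRegex) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split(delimiterRegex)) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Simplified stand-in for the current behaviour: dots stay inside the token.
    static List<String> splitOnSpacesOnly(String text) {
        return split(text, "\\s+");
    }

    // What a pattern-based tokenizer would aim for: dots also separate tokens.
    static List<String> splitOnDotsAndSpaces(String text) {
        return split(text, "[.\\s]+");
    }

    public static void main(String[] args) {
        // Dots kept: indexed name and query yield different tokens, so no match.
        System.out.println(splitOnSpacesOnly("H.C.Verma"));    // [h.c.verma]
        System.out.println(splitOnSpacesOnly("H C Verma"));    // [h, c, verma]

        // Dots treated as separators: both sides yield the same tokens.
        System.out.println(splitOnDotsAndSpaces("H.C.Verma")); // [h, c, verma]
        System.out.println(splitOnDotsAndSpaces("H C Verma")); // [h, c, verma]
    }
}
```

Whatever tokenizer you pick has to be used both at indexing time and at query time, and the index must be rebuilt after changing the analyzer definition, otherwise the old single-token entries stay in place.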

satyendra_411 wrote:
I have one more question, which is related to the stemmer. One of the books has the title "Mathematics". When I query for "Math" or "Maths", the entities containing "Mathematics" are not returned, but when I query with "Mathematics", the appropriate result is returned. How can I solve this issue too? I am using SnowballPorterFilterFactory, which acts as a stemmer.

Well, the stem for "Mathematics" is "mathemat", which is not what "Math" or "Maths" get stemmed to, so the terms never match. You probably need a prefix query, something like "Math*".
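
A small plain-Java sketch of why the exact term fails and the prefix form works. This only simulates term matching and is not how Lucene implements it; the class and method names are invented for the example, and the only indexed term is the stem "mathemat" mentioned above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class PrefixMatchSketch {

    // Exact term lookup: the query term must equal an indexed term.
    static boolean termMatches(List<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm.toLowerCase(Locale.ROOT));
    }

    // Prefix lookup for queries of the form "abc*": any indexed term
    // starting with the part before the '*' counts as a match.
    static boolean prefixMatches(List<String> indexedTerms, String prefixQuery) {
        if (!prefixQuery.endsWith("*")) {
            throw new IllegalArgumentException("expected a query ending in '*'");
        }
        String prefix = prefixQuery
                .substring(0, prefixQuery.length() - 1)
                .toLowerCase(Locale.ROOT);
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The title "Mathematics" is stored in the index as its stem.
        List<String> indexedTerms = Arrays.asList("mathemat");

        System.out.println(termMatches(indexedTerms, "math"));    // false
        System.out.println(prefixMatches(indexedTerms, "Math*")); // true
    }
}
```

Keep in mind that a prefix such as "Math*" will also match any other indexed term that happens to start with "math".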

Hope this helps.

--Hardy

