Confusion with Standard Analyzer

anilit99 · **Joined:** Wed Feb 20, 2008 6:42 pm **Posts:** 14

Hi,
Is it just me, or does everybody else know how to get around this issue? Basically, I have a persistence class more or less like this:

Code:

@Entity
@Indexed
@Table(name = "products") 
@SequenceGenerator(name = "products_seq", sequenceName = "products_seq")
public class Product {
  @DocumentId
  private Long productId;
  @Field(index = Index.TOKENIZED, store = Store.NO)
  private String productNm;
  @Field(index = Index.UN_TOKENIZED, store = Store.NO)
  private String catalogKy;
}

I indexed the class and everything is OK; I could open the index in Luke and everything seems to be fine. My retrieval code is this:

Code:

    QueryParser parser = new QueryParser("catalogKy", new StandardAnalyzer());
    Query luceneQuery = parser.parse("catalogKy:086244CK0101");

This was just not giving me the hit. The problem, I think, is that when I query using the standard analyzer it changes the entire value to lower case (086244ck0101), whereas Luke shows the original value (086244CK0101) in the index. The query below, however, was successful (all numeric values):

Code:

    QueryParser parser = new QueryParser("catalogKy", new StandardAnalyzer());
    Query luceneQuery = parser.parse("catalogKy:086244120101");

It's happening only with the untokenized columns. Am I missing something basic? I searched the forums for this kind of problem with no success. Any help is greatly appreciated.

thanks
Anil.

emmanuel · **Posted:** Tue Feb 26, 2008 2:07 pm

A Lucene query parser applies the analyzer (i.e. tokenizes) to the query text, whereas an untokenized field is indexed verbatim. So querying an untokenized field with the query parser is not reliable.

You can write a Lucene query programmatically (in your case a TermQuery) to avoid the analyzer work.
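To make the mismatch concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class name, the method names and the set standing in for the index are all made up for illustration, and only the lower-casing step of the analyzer is modelled. It just shows why an analyzed query misses a value that was stored verbatim, while an exact term lookup finds it.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model only (hypothetical names); this does not use the Lucene API.
public class UntokenizedLookupDemo {

    // Stand-in for an UN_TOKENIZED field: each value is stored verbatim as one term.
    static final Set<String> INDEXED_TERMS = new HashSet<>();
    static {
        INDEXED_TERMS.add("086244CK0101");
        INDEXED_TERMS.add("086244120101");
    }

    // Stand-in for the analysis the query parser applies to the query text.
    // Assumption: only the lower-casing step is modelled here.
    static String analyze(String queryText) {
        return queryText.toLowerCase(Locale.ROOT);
    }

    // Parser-style lookup: the query text is analyzed before it is matched.
    static boolean parsedQueryMatches(String queryText) {
        return INDEXED_TERMS.contains(analyze(queryText));
    }

    // Term-style lookup: the text is matched exactly as given, with no analysis.
    static boolean exactTermMatches(String term) {
        return INDEXED_TERMS.contains(term);
    }

    public static void main(String[] args) {
        System.out.println(parsedQueryMatches("086244CK0101")); // false: becomes 086244ck0101
        System.out.println(parsedQueryMatches("086244120101")); // true: digits are unaffected
        System.out.println(exactTermMatches("086244CK0101"));   // true: no analysis applied
    }
}
```

In Lucene itself, the exact lookup corresponds to building the query by hand, for example `new TermQuery(new Term("catalogKy", "086244CK0101"))`, using `org.apache.lucene.index.Term` and `org.apache.lucene.search.TermQuery`. The value then has to be passed exactly as it was indexed, upper-case letters included.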

anilit99 · **Joined:** Wed Feb 20, 2008 6:42 pm **Posts:** 14

Thanks a lot Emmanuel,
that should resolve my issue.

thanks
Anil.