-->

All times are UTC - 5 hours [ DST ]



 Post subject: Problem with NumericRangeQuery and precisionStep
Posted: Thu Aug 01, 2013 2:29 am
Newbie

Joined: Thu Aug 01, 2013 2:21 am
Posts: 4
I have a contact table with the following data:

Id: 5 | Name:Li Chao | Email:lichao@email.com | Salary:7000
Id: 1 | Name:Li Abhijit Ghosh | Email:abhijit@email.com | Salary:5000
Id: 3 | Name:Li My Name | Email:my_email@email.com | Salary:6000
Id: 6 | Name:Li Your Name | Email:tomli@email.com | Salary:10000
Id: 7 | Name:Li Your Name | Email:tomli@email.com | Salary:11000
Id: 24 | Name:Li Your Name | Email:your_email@email.com | Salary:25000

I am trying to run a range search on my Salary column using NumericRangeFilter. Can anyone tell me how to determine the precisionStep value to pass to NumericRangeFilter in my class in Filter.java? I am confused by the results I get with different precisionStep values.

App.java:-

Code:
// Search code.
// The queryString passed in is "Li", so the keyword query matches every document.
public static List<Contact> search(String queryString) {
   Session session = HibernateUtil.getSession();
   FullTextSession fullTextSession = Search.getFullTextSession(session);

   // Keyword query on the "name" field, built with the Hibernate Search query DSL.
   QueryBuilder queryBuilder = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(Contact.class).get();
   org.apache.lucene.search.Query luceneQuery = queryBuilder.keyword().onFields("name").matching(queryString).createQuery();

   FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(luceneQuery, Contact.class);
   // Enable the full-text filter registered as "security" and pass it the "session" parameter.
   fullTextQuery.enableFullTextFilter("security").setParameter("session", fullTextSession);
   System.out.println("full text query : " + fullTextQuery.toString());
   List<Contact> contactList = fullTextQuery.list();

   fullTextSession.close();

   return contactList;
}


Filter.java:-

Code:

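import org.apache.lucene.search.Filter;
import org.apache.lucene.search.NumericRangeFilter;
import org.hibernate.search.FullTextSession;
import org.hibernate.search.annotations.Factory;
import org.hibernate.search.annotations.Key;
import org.hibernate.search.filter.FilterKey;
import org.hibernate.search.filter.StandardFilterKey;

public class MyFilter {

   private FullTextSession session;

   // Receives the "session" parameter set on the full-text filter.
   public void setSession(FullTextSession session) {
      this.session = session;
   }

   @Factory
   public Filter buildDistributorFilter() {
      // Arguments: field, precisionStep, min, max, minInclusive, maxInclusive
      NumericRangeFilter<Integer> filter = NumericRangeFilter.newIntRange("salary", 14, 6999, 25023, true, true);
      return filter;
   }

   // Cache key for the filter instance, derived from its parameters.
   @Key
   public FilterKey getKey() {
      StandardFilterKey key = new StandardFilterKey();
      key.addParameter(session);
      return key;
   }
}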



Scenario 1:

With precisionStep set to 14, the search returns all four expected results.

Code:
NumericRangeFilter<Integer> filter = NumericRangeFilter.newIntRange("salary", 14, 6999, 25023, true, true);


Id: 5 | Name:Li Chao | Email:lichao@email.com | Salary:7000
Id: 6 | Name:Li Your Name | Email:tomli@email.com | Salary:10000
Id: 7 | Name:Li Your Name | Email:tomli@email.com | Salary:11000
Id: 24 | Name:Li Your Name | Email:your_email@email.com | Salary:25000

Scenario 2:

When I change precisionStep to 13, as in the filter below, only the following two results come back.

Code:
NumericRangeFilter<Integer> filter = NumericRangeFilter.newIntRange("salary", 13, 6999, 25023, true, true);


Id: 5 | Name:Li Chao | Email:lichao@email.com | Salary:7000
Id: 24 | Name:Li Your Name | Email:your_email@email.com | Salary:25000

As you can see, I have only changed the precisionStep and nothing else here.

Is there a rule for choosing the precisionStep? I have read that you need to test to find the precisionStep that works for you.

Excerpt from NumericRangeQuery Javadocs taken from http://lucene.apache.org/core/3_6_2/api/all/org/apache/lucene/search/NumericRangeQuery.html

Quote:
Precision Step:

You can choose any precisionStep when encoding values. Lower step values mean more precisions and so more terms in index (and index gets larger). On the other hand, the maximum number of terms to match reduces, which optimized query speed. The formula to calculate the maximum term count is:

n = [ (bitsPerValue/precisionStep - 1) * (2^precisionStep - 1 ) * 2 ] + (2^precisionStep - 1 )


(this formula is only correct, when bitsPerValue/precisionStep is an integer; in other cases, the value must be rounded up and the last summand must contain the modulo of the division as precision step). For longs stored using a precision step of 4, n = 15*15*2 + 15 = 465, and for a precision step of 2, n = 31*3*2 + 3 = 189. But the faster search speed is reduced by more seeking in the term enum of the index. Because of this, the ideal precisionStep value can only be found out by testing.
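
For reference, the formula quoted above can be evaluated with a few lines of Java. This is only a sketch: the class and method names (MaxTermCount, maxTerms) are made up for illustration, and the handling of the non-integer case is one reading of the Javadoc note (round the number of levels up and use the remainder of the division as the step in the last summand).

```java
class MaxTermCount {

    /**
     * Maximum number of terms a range query may have to match, following the
     * formula from the NumericRangeQuery Javadoc.
     * Assumption: when bitsPerValue is not a multiple of precisionStep, the level
     * count is rounded up and the last summand uses (bitsPerValue % precisionStep).
     */
    static long maxTerms(int bitsPerValue, int precisionStep) {
        int levels = (bitsPerValue + precisionStep - 1) / precisionStep; // rounded up
        int remainder = bitsPerValue % precisionStep;
        int lastStep = (remainder == 0) ? precisionStep : remainder;

        long termsPerLevel = (1L << precisionStep) - 1; // 2^precisionStep - 1
        long lastSummand = (1L << lastStep) - 1;

        return (levels - 1) * termsPerLevel * 2 + lastSummand;
    }

    public static void main(String[] args) {
        System.out.println(maxTerms(64, 4)); // 465, the long example from the Javadoc
        System.out.println(maxTerms(64, 2)); // 189, the second Javadoc example
        System.out.println(maxTerms(32, 4)); // 225, an int field with the default step
    }
}
```

The first two values reproduce the numbers given in the quote; the third applies the same formula to a 32-bit field such as salary.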


Can somebody please help me on this?


 Post subject: Re: Problem with NumericRangeQuery and precisionStep
Posted: Thu Aug 01, 2013 5:19 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
it's important to match the same precision you're using during indexing.

http://docs.jboss.org/hibernate/search/4.3/api/org/hibernate/search/annotations/NumericField.html#precisionStep()

When you experimented with different values, did you rebuild the index? If the index content and the query strategy are out of sync, the results become quite unpredictable because the units don't match.
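
To make the mismatch concrete, here is a small standalone simulation. It is a sketch, not Lucene code: it assumes the salary field was indexed with precisionStep 4 (the default of @NumericField), it only handles non-negative int values, and the names (PrecisionStepMismatch, split, filter) are invented for the example. The idea is that the index only stores terms at shifts 0, 4, 8, ... while the query splits the range at shifts that are multiples of its own precisionStep, so any sub-range at a shift that was never indexed finds nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrecisionStepMismatch {

    /** A sub-range of the query: terms stored at `shift` whose shifted value lies in [min, max]. */
    static final class SubRange {
        final int shift;
        final long min;
        final long max;

        SubRange(int shift, long min, long max) {
            this.shift = shift;
            this.min = min;
            this.max = max;
        }
    }

    /** Splits [minBound, maxBound] into per-shift sub-ranges for a 32-bit field (non-negative bounds only). */
    static List<SubRange> split(int queryStep, long minBound, long maxBound) {
        List<SubRange> out = new ArrayList<>();
        if (minBound > maxBound) {
            return out;
        }
        for (int shift = 0; ; shift += queryStep) {
            long diff = 1L << (shift + queryStep);
            long mask = ((1L << queryStep) - 1L) << shift;
            boolean hasLower = (minBound & mask) != 0L;
            boolean hasUpper = (maxBound & mask) != mask;
            long nextMin = (hasLower ? minBound + diff : minBound) & ~mask;
            long nextMax = (hasUpper ? maxBound - diff : maxBound) & ~mask;
            boolean wrapped = nextMin < minBound || nextMax > maxBound;

            if (shift + queryStep >= 32 || nextMin > nextMax || wrapped) {
                // No coarser precision is usable: cover what is left at the current shift.
                out.add(new SubRange(shift, minBound >>> shift, maxBound >>> shift));
                return out;
            }
            // Ragged edges are matched at the current shift, the middle at the next one.
            if (hasLower) {
                out.add(new SubRange(shift, minBound >>> shift, (minBound | mask) >>> shift));
            }
            if (hasUpper) {
                out.add(new SubRange(shift, (maxBound & ~mask) >>> shift, maxBound >>> shift));
            }
            minBound = nextMin;
            maxBound = nextMax;
        }
    }

    /** Only shifts 0, indexStep, 2*indexStep, ... below 32 are written at indexing time. */
    static boolean isIndexed(int shift, int indexStep) {
        return shift < 32 && shift % indexStep == 0;
    }

    /** Returns the values that the range query would find in an index built with indexStep. */
    static List<Integer> filter(List<Integer> values, int indexStep, int queryStep, int min, int max) {
        List<SubRange> ranges = split(queryStep, min, max);
        List<Integer> hits = new ArrayList<>();
        for (int value : values) {
            for (SubRange r : ranges) {
                long prefix = ((long) value) >>> r.shift;
                if (isIndexed(r.shift, indexStep) && prefix >= r.min && prefix <= r.max) {
                    hits.add(value);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> salaries = Arrays.asList(7000, 5000, 6000, 10000, 11000, 25000);
        System.out.println(filter(salaries, 4, 14, 6999, 25023)); // [7000, 10000, 11000, 25000]
        System.out.println(filter(salaries, 4, 13, 6999, 25023)); // [7000, 25000]
        System.out.println(filter(salaries, 4, 4, 6999, 25023));  // [7000, 10000, 11000, 25000]
    }
}
```

Under these assumptions the simulation gives the same two result sets as the scenarios reported above. With a query step of 14 the whole range happens to be covered at shift 0, which is always indexed, so it works by luck. With 13 the middle of the range is looked up at shift 13, which an index built with step 4 does not contain, so 10000 and 11000 are missed.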

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Problem with NumericRangeQuery and precisionStep
Posted: Fri Aug 02, 2013 2:04 am
Newbie

Joined: Thu Aug 01, 2013 2:21 am
Posts: 4
Hi Sanne,

Thanks for your post!!

This makes sense. I have also read that the ideal precisionStep for 64-bit data types (long, double) is 6 or 8, whereas the default precisionStep is 4.

Taken from http://lucene.apache.org/core/3_6_2/api/all/org/apache/lucene/search/NumericRangeQuery.html

Quote:

The default for all data types is 4, which is used, when no precisionStep is given.
Ideal value in most cases for 64 bit data types (long, double) is 6 or 8.
Ideal value in most cases for 32 bit data types (int, float) is 4.



What precisionStep would you recommend for these data types? I assume we will have to change the precisionStep both in the @NumericField annotation and in the query.

Thanks again!


 Post subject: Re: Problem with NumericRangeQuery and precisionStep
Posted: Fri Aug 02, 2013 4:19 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Right, it's best to follow the advice from the Lucene javadoc: as a starting point, set the precision according to the types you use (it can be different for each field). You can then of course try different values to see how they perform in your case, as long as you remember to rebuild the index when redeploying an application with an updated precision.

Correct, you'll have to change both the precisionStep on the mapping and on the queries.

_________________
Sanne
http://in.relation.to/

