
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 3 posts ] 
Author Message
 Post subject: How to correctly use boosting for custom scoring results?
PostPosted: Wed Nov 18, 2015 11:36 am 
Newbie

Joined: Wed Nov 18, 2015 10:46 am
Posts: 2
Hello,

For our application I am trying to implement a custom scoring mechanism based on some fields of our Route class. The idea is to disable Lucene's default scoring and instead add boosted range and keyword SHOULD queries that push matching results up the result list. For example, we want to boost Route objects that have more than 1000 views, Route objects that have one or more pictures, Route objects that have a RoutePoint whose Poi has the participant boolean set to true, and so on.

My first step was a custom Similarity, which I hoped would disable the default Lucene scoring. This is the code I use for that:

Code:
public class IgnoreScoringSimilarity extends DefaultSimilarity {
    @Override
    public float idf(long docFreq, long numDocs) {
        return 1.0f;
    }
   
    @Override
    public float tf(float freq) {
        return 1.0f;
    }
   
    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1.0f;
    }
   
    @Override
    public float lengthNorm(FieldInvertState state) {
        return 1.0f;
    }
   
    @Override
    public float queryNorm(float sumOfSquaredWeights) {
        return 1.0f;
    }
}
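
For reference, a sketch of why flattening these factors should leave only the boosts. It assumes the classic practical scoring function documented for Lucene's TFIDFSimilarity and that no index-time boosts are in play; the symbols follow that documentation, not anything stated in this thread:

```latex
\[
\mathrm{score}(q,d) = \mathrm{coord}(q,d)\cdot\mathrm{queryNorm}(q)\cdot
\sum_{t \in q} \mathrm{tf}(t,d)\cdot\mathrm{idf}(t)^{2}\cdot\mathrm{boost}(t)\cdot\mathrm{norm}(t,d)
\]
\[
\text{with } \mathrm{coord}=\mathrm{queryNorm}=\mathrm{tf}=\mathrm{idf}=\mathrm{norm}=1:
\qquad
\mathrm{score}(q,d) = \sum_{t \in q,\ t \text{ matches } d} \mathrm{boost}(t)
\]
```

Under that assumption every matching clause adds its boost to the score, including MUST clauses used purely for filtering (default boost 1.0). Also, norm(t,d) is written at index time, so a changed lengthNorm only takes effect after reindexing.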

And the persistence.xml entry:
Code:
<property name="hibernate.search.default.similarity" value="com.package.to.search.IgnoreScoringSimilarity"/>


Then, when we build the BooleanJunctions that filter our results (using MUST clauses), we add SHOULD clauses for our scoring parameters and boost those clauses. An example of this is:

Code:
        QueryBuilder qb = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(Route.class).get();
        BooleanJunction bj = qb.bool();

        for (Map.Entry<Integer, List<Filter>> entry : topLevelGroupedFilters.entrySet()) {
            BooleanJunction entryJunction = qb.bool();

            // For every filter in the entry, add it as a 'should' to the entry boolean junction, so when multiple filters of the same type get added,
            // they get added in an OR. When the entry has only one should for a type, it defaults to an AND
            for (Filter filter : entry.getValue()) {
                // Group owner filters
                List<GroupFilter> gfs = filterService.getGroupFilters(filter);
                if (gfs != null && !gfs.isEmpty()) {
                    BooleanJunction groupsBJ = qb.bool();
                    if (gfs.size() > 1) {
                        for (GroupFilter groupFilter : gfs) {
                            groupsBJ.should(qb.keyword().onField("group.id").matching(Integer.toString(groupFilter.getGroup().getId())).createQuery());
                        }
                    } else {
                        groupsBJ.must(qb.keyword().onField("group.id").matching(Integer.toString(gfs.get(0).getGroup().getId())).createQuery());
                    }
                    entryJunction.should(groupsBJ.createQuery());
                }
            }
            if (!entryJunction.isEmpty()) {
                bj.must(entryJunction.createQuery());
            }
            // Boost each SHOULD clause individually: boostedTo(...) is applied at field level,
            // before the query is created, so it affects that clause rather than the whole junction.
            bj.should(qb.range().onField("descriptionLength").boostedTo(2.0f).above(1000).createQuery());
            bj.should(qb.range().onField("views.views").boostedTo(1.0f).above(1000).createQuery());
            bj.should(qb.range().onField("nameLength").boostedTo(1.0f).above(20).createQuery());
            bj.should(qb.range().onField("picturesLength").boostedTo(5.0f).above(0).createQuery());
            bj.should(qb.range().onField("routeCategoriesLength").boostedTo(2.0f).above(2).createQuery());
            bj.should(qb.keyword().onField("routePoints.poi.participant").boostedTo(10.0f).matching("true").createQuery());
           
            org.apache.lucene.search.Query luceneQuery = bj.createQuery();

            org.hibernate.search.FullTextQuery query = fullTextSession.createFullTextQuery(luceneQuery, Route.class);
        }
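
The ranking this query is meant to produce can be sketched in plain Java, independent of Lucene: each rule that matches a route adds its boost, and routes are sorted by the total. This is only a model of the intended behaviour, not Hibernate Search code. `RouteDoc`, `BoostRule` and `AdditiveBoostSketch` are hypothetical names, and the thresholds follow the prose description (more than 1000 views, at least one picture).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the indexed Route fields mentioned in the thread.
final class RouteDoc {
    final String name;
    final int views;
    final int pictures;
    final boolean participant;

    RouteDoc(String name, int views, int pictures, boolean participant) {
        this.name = name;
        this.views = views;
        this.pictures = pictures;
        this.participant = participant;
    }
}

// One SHOULD-style rule: a condition plus the boost added when it matches.
final class BoostRule {
    final Predicate<RouteDoc> condition;
    final float boost;

    BoostRule(Predicate<RouteDoc> condition, float boost) {
        this.condition = condition;
        this.boost = boost;
    }
}

final class AdditiveBoostSketch {
    // Three of the boosts from the thread, expressed as rules.
    static List<BoostRule> rules() {
        List<BoostRule> rules = new ArrayList<>();
        rules.add(new BoostRule(r -> r.views > 1000, 1.0f));
        rules.add(new BoostRule(r -> r.pictures > 0, 5.0f));
        rules.add(new BoostRule(r -> r.participant, 10.0f));
        return rules;
    }

    // Score = sum of the boosts of all rules the document matches.
    static float score(RouteDoc doc, List<BoostRule> rules) {
        float total = 0f;
        for (BoostRule rule : rules) {
            if (rule.condition.test(doc)) {
                total += rule.boost;
            }
        }
        return total;
    }

    // Highest score first; the sort is stable, so ties keep their input order.
    static List<RouteDoc> rank(List<RouteDoc> docs, List<BoostRule> rules) {
        List<RouteDoc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingDouble((RouteDoc d) -> score(d, rules)).reversed());
        return sorted;
    }
}
```

Comparing the scores returned by the real query against a model like this is one way to spot which clause contributes unexpectedly. One thing to check in the query itself: as far as I know, the DSL's `above(...)` includes the limit unless `excludeLimit()` is called, so `above(0)` would also match routes with zero pictures.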


This yielded results that I was not expecting. Some results ranked quite high even though they matched none of the boosting queries. So I tested a single boost on its own and set it really high, to see whether that worked.

I tried setting only:

Code:
bj.should(qb.keyword().onField("routePoints.poi.participant").boostedTo(10000.0f).matching("true").createQuery());


The results were about 80% what I expected. Some of the top results still had routePoints.poi.participant set to false, and this seemed to happen more often as I added more filters. That makes me think that either I am doing something wrong, or Lucene is still scoring my results despite the custom Similarity. I asked on Stack Overflow (http://stackoverflow.com/questions/30708833/how-to-disable-default-scoring-boosting-in-hibernate-search-lucene) how to disable default scoring, but the question has yet to get a relevant answer.

Basically my question boils down to:

How do I use Hibernate Search/Lucene API to score my results based on some keyword/range queries on my Route object?

I am using WildFly 8.2 (which I believe bundles Hibernate core 4.3.7) and Hibernate Search 5.2.0.Final.


 Post subject: Re: How to correctly use boosting for custom scoring results?
PostPosted: Mon Nov 23, 2015 9:12 am 
Hibernate Team

Joined: Sat Jan 24, 2009 12:46 pm
Posts: 388
Hi,

Have you looked into dynamic boosting already (see http://docs.jboss.org/hibernate/search/5.5/reference/en-US/html_single/#section-dynamic-boost)? That way you should essentially be able to give different Route instances different boost values depending on their object state.

Hth,

--Gunnar

_________________
Visit my blog at http://musingsofaprogrammingaddict.blogspot.com/


 Post subject: Re: How to correctly use boosting for custom scoring results?
PostPosted: Fri Nov 27, 2015 4:11 am 
Newbie

Joined: Wed Nov 18, 2015 10:46 am
Posts: 2
Gunnar wrote:
Hi,

Have you looked into dynamic boosting already (see http://docs.jboss.org/hibernate/search/5.5/reference/en-US/html_single/#section-dynamic-boost)? That way you should essentially be able to give different Route instances different boost values depending on their object state.

Hth,

--Gunnar


Thanks for your reply!

I have tried writing a custom dynamic boost class, annotating the entity with @DynamicBoost(impl = CustomScoring.class), and then reindexing. However, the results I am seeing don't seem to take the dynamic boost into account. Could it be that I need to set higher values to compensate for Lucene's default scoring?
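
For readers following along, a minimal sketch of what such a boost strategy might look like. So that it compiles on its own, it declares a local stand-in for `org.hibernate.search.engine.BoostStrategy` (whose single method is `defineBoost(Object)`); in a real project the class would implement the Hibernate Search interface instead. `RouteState` is a hypothetical, trimmed-down Route, and the boost values are borrowed from the query-time boosts earlier in the thread, not from the actual `CustomScoring` class, which was not posted.

```java
// Local stand-in so the sketch compiles without Hibernate Search on the classpath.
// Assumption: the real interface is org.hibernate.search.engine.BoostStrategy.
interface BoostStrategy {
    float defineBoost(Object value);
}

// Hypothetical, trimmed-down view of the Route entity's state.
final class RouteState {
    final int views;
    final int pictureCount;
    final boolean hasParticipantPoi;

    RouteState(int views, int pictureCount, boolean hasParticipantPoi) {
        this.views = views;
        this.pictureCount = pictureCount;
        this.hasParticipantPoi = hasParticipantPoi;
    }
}

// Computes one index-time boost per entity from its state.
final class CustomScoring implements BoostStrategy {
    @Override
    public float defineBoost(Object value) {
        RouteState route = (RouteState) value;
        float boost = 1.0f; // 1.0 is the neutral boost
        if (route.views > 1000) {
            boost += 1.0f;
        }
        if (route.pictureCount > 0) {
            boost += 5.0f;
        }
        if (route.hasParticipantPoi) {
            boost += 10.0f;
        }
        return boost;
    }
}
```

Two caveats that may explain a dynamic boost that appears to be ignored; the thread does not say whether either applies here. In Lucene 4.x, DefaultSimilarity.lengthNorm multiplies the index-time field boost into the norm, so an override that always returns 1.0f (as in IgnoreScoringSimilarity above) would also discard that boost if the custom Similarity is still configured. And norms are stored with low precision, so small differences between boost values may not survive indexing.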


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.