Hello,
For our application I am trying to implement a custom scoring mechanism based on some fields of our Route class. The idea is to disable Lucene's default scoring and boost certain range and keyword SHOULD queries so that matching results move up in the result list. For example, we want to boost Route objects that have more than 1000 views, Route objects that have 1 or more pictures, Route objects that have a RoutePoint object that has a Poi object with the participant boolean set to true, and so on.
The way I tried to realize this is to use a custom Similarity, in the hope of first disabling the default Lucene scoring. This is the code that I use for that:
Code:
import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

// Pins every TF-IDF factor to 1.0 so that only query boosts should influence the score.
public class IgnoreScoringSimilarity extends DefaultSimilarity {

    @Override
    public float idf(long docFreq, long numDocs) {
        return 1.0f;
    }

    @Override
    public float tf(float freq) {
        return 1.0f;
    }

    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1.0f;
    }

    @Override
    public float lengthNorm(FieldInvertState state) {
        return 1.0f;
    }

    @Override
    public float queryNorm(float sumOfSquaredWeights) {
        return 1.0f;
    }
}
And the persistence.xml entry:
Code:
<property name="hibernate.search.default.similarity" value="com.package.to.search.IgnoreScoringSimilarity"/>
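For reference, the classic Lucene practical scoring function documented for TFIDFSimilarity is score = coord * queryNorm * sum over matching terms of (tf * idf^2 * boost * norm). The following is a minimal sketch of that formula in plain Java, not Lucene code, to show what should remain once every factor is pinned to 1.0. FlatScoreSketch, TermMatch and flat are hypothetical names introduced only for this illustration. It assumes the formula applies as-is to every term clause; range clauses may be scored differently, and norms are written at index time, so documents indexed before the similarity was configured may still carry older norm values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the classic TF-IDF formula; this is not Lucene code.
class FlatScoreSketch {

    /** Per-term factors of one matching query term. */
    static final class TermMatch {
        final float tf;
        final float idf;
        final float boost;
        final float norm;

        TermMatch(float tf, float idf, float boost, float norm) {
            this.tf = tf;
            this.idf = idf;
            this.boost = boost;
            this.norm = norm;
        }
    }

    /** score = coord * queryNorm * sum(tf * idf^2 * boost * norm) */
    static float score(float coord, float queryNorm, List<TermMatch> matches) {
        float sum = 0f;
        for (TermMatch m : matches) {
            sum += m.tf * m.idf * m.idf * m.boost * m.norm;
        }
        return coord * queryNorm * sum;
    }

    /** A match as seen through a similarity that returns 1.0 everywhere: only the boost varies. */
    static TermMatch flat(float boost) {
        return new TermMatch(1f, 1f, boost, 1f);
    }

    public static void main(String[] args) {
        // One unboosted filter term plus one term boosted to 10.
        float s = score(1f, 1f, Arrays.asList(flat(1f), flat(10f)));
        System.out.println(s); // prints 11.0
    }
}
```

Under these assumptions the score collapses to the sum of the boosts of the matching clauses. Note that this sum includes unboosted clauses used purely for filtering, each of which would still contribute 1.0.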
Then, when we build up the BooleanJunctions for filtering our results (using MUST clauses), we add SHOULD clauses with our scoring parameters, and boost those clauses. An example of this is:
Code:
QueryBuilder qb = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(Route.class).get();
BooleanJunction bj = qb.bool();
for (Map.Entry<Integer, List<Filter>> entry : topLevelGroupedFilters.entrySet()) {
    BooleanJunction entryJunction = qb.bool();
    // For every filter in the entry, add it as a 'should' to the entry boolean junction, so when multiple filters of the same type get added,
    // they get added in an OR. When the entry has only one should for a type, it defaults to an AND
    for (Filter filter : entry.getValue()) {
        // Group owner filters
        List<GroupFilter> gfs = filterService.getGroupFilters(filter);
        if (gfs != null && !gfs.isEmpty()) {
            BooleanJunction groupsBJ = qb.bool();
            if (gfs.size() > 1) {
                for (GroupFilter groupFilter : gfs) {
                    groupsBJ.should(qb.keyword().onField("group.id").matching(Integer.toString(groupFilter.getGroup().getId())).createQuery());
                }
            } else {
                groupsBJ.must(qb.keyword().onField("group.id").matching(Integer.toString(gfs.get(0).getGroup().getId())).createQuery());
            }
            entryJunction.should(groupsBJ.createQuery());
        }
    }
    if (!entryJunction.isEmpty()) {
        bj.must(entryJunction.createQuery());
    }
    bj.should(qb.range().onField("descriptionLength").above(1000).createQuery()).boostedTo(2.0f);
    bj.should(qb.range().onField("views.views").above(1000).createQuery()).boostedTo(1.0f);
    bj.should(qb.range().onField("nameLength").above(20).createQuery()).boostedTo(1.0f);
    bj.should(qb.range().onField("picturesLength").above(0).createQuery()).boostedTo(5.0f);
    bj.should(qb.range().onField("routeCategoriesLength").above(2).createQuery()).boostedTo(2.0f);
    bj.should(qb.keyword().onField("routePoints.poi.participant").matching("true").createQuery().boostedTo(10.0f));
    org.apache.lucene.search.Query luceneQuery = bj.createQuery();
    org.hibernate.search.FullTextQuery query = fullTextSession.createFullTextQuery(luceneQuery, Route.class);
}
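The intended ordering can also be written down independently of Lucene, which gives a reference to compare the actual result list against. The following is a minimal sketch in plain Java under stated assumptions: RouteStats and ExpectedRanking are hypothetical names, the fields mirror the indexed field names used in the query, and the filtering MUST clauses are left out. The thresholds follow the prose description (strictly more than 1000 views, at least one picture); the above(...) calls in the query may treat their limit differently, for example by including it, so boundary cases are worth checking separately.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the indexed fields of Route; not part of the original code.
class RouteStats {
    final String name;
    final int descriptionLength;
    final int views;
    final int nameLength;
    final int picturesLength;
    final int routeCategoriesLength;
    final boolean participantPoi;

    RouteStats(String name, int descriptionLength, int views, int nameLength,
               int picturesLength, int routeCategoriesLength, boolean participantPoi) {
        this.name = name;
        this.descriptionLength = descriptionLength;
        this.views = views;
        this.nameLength = nameLength;
        this.picturesLength = picturesLength;
        this.routeCategoriesLength = routeCategoriesLength;
        this.participantPoi = participantPoi;
    }
}

class ExpectedRanking {

    /** Sum of the boosts of every rule the route satisfies (boost values taken from the query). */
    static float expectedBoost(RouteStats r) {
        float sum = 0f;
        if (r.descriptionLength > 1000) sum += 2.0f;
        if (r.views > 1000) sum += 1.0f;
        if (r.nameLength > 20) sum += 1.0f;
        if (r.picturesLength >= 1) sum += 5.0f;
        if (r.routeCategoriesLength > 2) sum += 2.0f;
        if (r.participantPoi) sum += 10.0f;
        return sum;
    }

    /** Returns a copy of the routes ordered by expected boost, highest first. */
    static List<RouteStats> rank(List<RouteStats> routes) {
        List<RouteStats> sorted = new ArrayList<RouteStats>(routes);
        Collections.sort(sorted, new Comparator<RouteStats>() {
            @Override
            public int compare(RouteStats a, RouteStats b) {
                return Float.compare(expectedBoost(b), expectedBoost(a));
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<RouteStats> routes = new ArrayList<RouteStats>();
        routes.add(new RouteStats("plain", 200, 50, 10, 0, 1, false));
        routes.add(new RouteStats("participant", 1500, 2000, 25, 0, 3, true));
        routes.add(new RouteStats("pictures", 200, 50, 10, 3, 1, false));
        for (RouteStats r : rank(routes)) {
            System.out.println(r.name + " " + expectedBoost(r));
        }
    }
}
```

Running a handful of known routes through a model like this and comparing the order with what the full-text query returns should show which results deviate and by how much.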
This yielded results that I was not expecting. Some results that did not match any of the boosting queries appeared quite high in the result list. So I tested a single boost on its own and set it really high, to see if that worked.
I tried setting only:
Code:
bj.should(qb.keyword().onField("routePoints.poi.participant").matching("true").createQuery().boostedTo(10000.0f));
And the results were about 80% what I expected. There were some duds in the top results that had routePoints.poi.participant false, and it seemed to happen more when I set more filters. That is why I think I am doing something wrong, or that Lucene is still scoring my results in spite of me setting the custom Similarity. I posted a Stack Overflow question (http://stackoverflow.com/questions/30708833/how-to-disable-default-scoring-boosting-in-hibernate-search-lucene) about this, asking how to disable default scoring, but it has yet to get a relevant answer.
Basically my question boils down to:
How do I use the Hibernate Search/Lucene API to score my results based on some keyword/range queries on my Route object? I am using WildFly 8.2 (which has Hibernate 4.3.7 core built in, I believe) and Hibernate Search 5.2.0-Final.