
All times are UTC - 5 hours [ DST ]



 Post subject: [HSEARCH-726] Faceted search + embedded fields (*tomany)
PostPosted: Tue Apr 12, 2011 10:05 am 
Beginner

Joined: Mon Apr 11, 2011 7:56 am
Posts: 38
See [HSEARCH-726]

Hi,
I just updated to Hibernate Search 3.4.0CR2 and played with faceted search.
In my case, the counts in the returned facets are incorrect. This probably happens when searching on an embedded field with a *tomany relation (I will experiment with it more if it's not caused by my code).
Before I create an issue for this, I want to be sure it's not due to bugs in my code.

To illustrate the problem, here is a result (showing the author facet and search results) of a simple query:
Code:
Top 10 authors (faceted search):

    Brand Author(2)
    My Name 28690(1)
    My Name 12970(1)

Search results (Publications):

4 results found in 3 ms., displaying results 1-4

    Title: Title1
    Authors: My Name 12970, My Name 28690, My Name 20709

    Title: Title2
    Authors: My Name 12970, Brand Author
   
    Title: Title3
    Authors: Brand Author

    Title: Title4
    Authors: Brand Author

The problem is that the facet count of 'Brand Author' should be 3, that of 'My Name 12970' should be 2, and 'My Name 20709' should have a count of 1 (and thus should be displayed).
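
The expected counts can be checked by hand with a few lines of plain Java. This is only a sketch: the class ExpectedFacetCounts and its methods are made up for illustration and use no Hibernate Search API. It counts each author once for every publication it appears in, using the four results listed above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hibernate Search.
class ExpectedFacetCounts {

    // Authors per publication, copied from the search results above (Title1..Title4).
    static List<List<String>> samplePublications() {
        return Arrays.asList(
            Arrays.asList("My Name 12970", "My Name 28690", "My Name 20709"),
            Arrays.asList("My Name 12970", "Brand Author"),
            Arrays.asList("Brand Author"),
            Arrays.asList("Brand Author"));
    }

    // A multi-valued field contributes one count per distinct value per document.
    static Map<String, Integer> count(List<List<String>> publications) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> authors : publications) {
            for (String author : new LinkedHashSet<>(authors)) {
                counts.merge(author, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(samplePublications()));
    }
}
```

With this data the sketch gives 3 for 'Brand Author', 2 for 'My Name 12970', and 1 each for 'My Name 28690' and 'My Name 20709'.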

Something seems wrong in collecting the facets, or something is wrong in my code.
Here are the related snippets used for getting the facet objects and search results. The methods facets and list are using the same FullTextQuery instance ('query').
Code:
public java.util.List<Facet> facets(String field, int topN) {
      
      org.hibernate.search.query.dsl.QueryBuilder builder = fullTextSession
            .getSearchFactory().buildQueryBuilder().forEntity(entityClass)
            .get();

      FacetingRequest facetReq = builder.facet().name(field).onField(field).discrete().orderedBy(FacetSortOrder.COUNT_DESC).includeZeroCounts(false).maxFacetCount(topN).createFacetingRequest();

      if (validateQuery())
         return query.getFacetManager().enableFaceting(facetReq)
               .getFacets(field);
      else {
         return new ArrayList<Facet>();
      }
   }
.
.
public java.util.List<EntityClass> list() {
   if (validateQuery()) {
      return (java.util.List<EntityClass>) query.list();
   } else
      return new ArrayList<EntityClass>();
}
.
.
private boolean validateQuery() {
      if (luceneQueryChanged) {
         if (!searchTerms.isEmpty()) {
            org.apache.lucene.queryParser.QueryParser parser = new org.apache.lucene.queryParser.MultiFieldQueryParser(
                  luceneVersion, nonNGramSearchFields, fullTextSession
                        .getSearchFactory().getAnalyzer(entityClass));
            try {
               luceneQuery = parser.parse(searchTerms);
            } catch (org.apache.lucene.queryParser.ParseException pe) {
               return false;
            }
         }
         // Match all documents if no search terms are given
         else {
            luceneQuery = fullTextSession.getSearchFactory()
                  .buildQueryBuilder().forEntity(entityClass).get().all()
                  .createQuery();
         }
         query = fullTextSession.createFullTextQuery(luceneQuery,
               entityClass);
         luceneQueryChanged = false;

      }
      query.setFirstResult(offset);
      query.setMaxResults(limit);

      return true;
   }


Code snippets with annotations in data model (this is generated code), both sides of the relation are searchable through an embedded field:
Publication.java:
Code:
  @ManyToMany(mappedBy = "_publications", targetEntity = Author.class, fetch = javax.persistence.FetchType.LAZY) @org.hibernate.annotations.Cascade({org.hibernate.annotations.CascadeType.PERSIST, org.hibernate.annotations.CascadeType.SAVE_UPDATE, org.hibernate.annotations.CascadeType.MERGE}) @org.hibernate.search.annotations.ContainedIn protected java.util.Set<webdsl.generated.domain.Author> _authors = new java.util.LinkedHashSet<webdsl.generated.domain.Author>();

  @org.hibernate.search.annotations.IndexedEmbedded(depth = 1, prefix = "authors" + ".") public java.util.Set<webdsl.generated.domain.Author> getAuthors()
  {
    return _authors;
  }
Author.java:
Code:
  @ManyToMany(fetch = javax.persistence.FetchType.LAZY) @JoinTable(name = "Author_publications_Publication", joinColumns = {@JoinColumn(name = "Author_id_owner")}, inverseJoinColumns = {@JoinColumn(name = "Publication_id_inverse")}) @org.hibernate.annotations.Cascade({org.hibernate.annotations.CascadeType.PERSIST, org.hibernate.annotations.CascadeType.SAVE_UPDATE, org.hibernate.annotations.CascadeType.MERGE}) @org.hibernate.search.annotations.ContainedIn protected java.util.Set<webdsl.generated.domain.Publication> _publications = new java.util.LinkedHashSet<webdsl.generated.domain.Publication>();

  @org.hibernate.search.annotations.IndexedEmbedded(depth = 1, prefix = "publications" + ".") public java.util.Set<webdsl.generated.domain.Publication> getPublications()
  {
    return _publications;
  }
.
.
.
  @javax.persistence.Column(name = "\"_name\"", length = 255) @org.hibernate.annotations.AccessType(value = "field") protected String _name = "";

  @org.hibernate.search.annotations.Fields({@org.hibernate.search.annotations.Field(name = "name"), @org.hibernate.search.annotations.Field(index = org.hibernate.search.annotations.Index.UN_TOKENIZED, name = "name_untokenized")}) public String getName()
  {
    return _name;
  }


Last edited by Elmer on Fri May 13, 2011 5:10 am, edited 2 times in total.

 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Wed Apr 13, 2011 5:28 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

the code is unfortunately hard to follow due to the amount of annotations and the fact that it only shows part of the code. I would also be really interested in the actual query you are running and the faceting field name.

The best would really be if you could condense the problem down into a unit test which we can run to reproduce the problem.

Have you looked at the index using Lucene?

--Hardy


 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Wed Apr 13, 2011 8:40 am 
Beginner

Joined: Mon Apr 11, 2011 7:56 am
Posts: 38
Hi Hardy,

Sorry for the unclear code snippets. The Lucene index is fine (+ query.list() yielded the right results).

As suggested, I created a unit test for this case.
I have (locally) extended EmbeddedTest.java in org.hibernate.search.test.embedded with this test:
Code:
   public void testFacetEmbeddedAndCollections() throws Exception {
      Author a = new Author();
      a.setName( "Voltaire" );
      Author a2 = new Author();
      a2.setName( "Victor Hugo" );
      Author a3 = new Author();
      a3.setName( "Moliere" );

      Product p1 = new Product();
      p1.setName( "Candide" );
      p1.getAuthors().add( a );
      p1.getAuthors().add( a2 ); // be creative

      Product p2 = new Product();
      p2.setName( "Candide" );
      p2.getAuthors().add( a2 );
      p2.getAuthors().add( a3 );
      
      Product p3 = new Product();
      p3.setName( "Candide" );
      p3.getAuthors().add( a2 );
      p3.getAuthors().add( a3 );

      Session s = openSession();
      Transaction tx = s.beginTransaction();
      s.persist( a );
      s.persist( a2 );
      s.persist( a3 );
      s.persist( p1 );
      s.persist( p2 );
      s.persist( p3 );
      tx.commit();

      s.clear();

      FullTextSession session = Search.getFullTextSession( s );
      tx = session.beginTransaction();

      QueryParser parser = new MultiFieldQueryParser( getTargetLuceneVersion(), new String[] { "name" }, SearchTestCase.standardAnalyzer );
      Query query;
      FullTextQuery ftq;
      QueryBuilder builder;
      List<Facet> facets;

      query = parser.parse( "Candide" );
      ftq = session.createFullTextQuery( query, Product.class );
      FacetingRequest facetReq;
      

      builder = session.getSearchFactory().buildQueryBuilder().forEntity( Product.class ).get();
      facetReq = builder.facet().name("someFacet").onField("authors.name_untokenized")
      .discrete().orderedBy(FacetSortOrder.COUNT_DESC)
      .includeZeroCounts(false).maxFacetCount(10)
      .createFacetingRequest();
      facets = ftq.getFacetManager().enableFaceting(facetReq).getFacets("someFacet");
      
      assertEquals( "error in collecting facets on embedded collections, wrong number of facets", 3, facets.size() );
      for(Facet f : facets){
         if(f.getValue().equals(a.getName()))
            assertEquals( "error in collecting facets on embedded collections (Author a)", 1, f.getCount() );
         if(f.getValue().equals(a2.getName()))
            assertEquals( "error in collecting facets on embedded collections (Author a2)", 3, f.getCount() );
         if(f.getValue().equals(a3.getName()))
            assertEquals( "error in collecting facets on embedded collections (Author a3)", 2, f.getCount() );
      }


      tx.commit();

      s.clear();
      s.close();
   }

For faceting, the annotation on the name field in Author.java is slightly adapted to also index the name untokenized:
Code:
   @Fields({@Field(index= Index.TOKENIZED), @Field(index= Index.UN_TOKENIZED, name="name_untokenized")})
   private String name;


Result:
junit.framework.AssertionFailedError: error in collecting facets on embedded collections, wrong number of facets expected:<3> but was:<2>
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.failNotEquals(Assert.java:283)...

Seems to be a bug, what do you think? :)


 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Wed Apr 13, 2011 11:42 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

I was able to reproduce your case now and the problem is more by design atm. The current faceting implementation utilizes the Lucene FieldCache which has a limitation that each document must have a single value for the specified field. This is not the case in your use case.
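
To illustrate the limitation, here is a simplified model in plain Java. It is not Lucene's FieldCache code. It assumes the cache holds one slot per document and is filled by walking the terms in sorted order, so a later term overwrites an earlier one for the same document. The class SingleValueCacheModel and its methods are hypothetical names. The data is the three products from the unit test above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Simplified, hypothetical model of a single-value-per-document cache.
class SingleValueCacheModel {

    // One slot per document; each term, visited in sorted order, overwrites the slot.
    static String[] fillCache(List<List<String>> docs) {
        TreeSet<String> terms = new TreeSet<>();
        for (List<String> values : docs) {
            terms.addAll(values);
        }
        String[] cache = new String[docs.size()];
        for (String term : terms) {
            for (int doc = 0; doc < docs.size(); doc++) {
                if (docs.get(doc).contains(term)) {
                    cache[doc] = term; // earlier value for this document is lost
                }
            }
        }
        return cache;
    }

    // Counting from the cache sees at most one value per document.
    static Map<String, Integer> countFromCache(String[] cache) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : cache) {
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("Voltaire", "Victor Hugo"),   // p1
            Arrays.asList("Victor Hugo", "Moliere"),    // p2
            Arrays.asList("Victor Hugo", "Moliere"));   // p3
        System.out.println(countFromCache(fillCache(docs)));
    }
}
```

Under these assumptions the model yields only two facets (Victor Hugo with 2 and Voltaire with 1), which is consistent with the "expected:<3> but was:<2>" failure reported above.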

You could start a Jira issue to suggest this use case. To support it would mean to find a better approach for faceting under the hood (or maybe an alternative). The initial idea was really to facet on concrete values and in particular "cleanly" divide the search results.

--Hardy


 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Tue Apr 26, 2011 5:30 am 
Beginner

Joined: Mon Apr 11, 2011 7:56 am
Posts: 38
Hi again,

Is there any indication of the priority of this bug in the eyes of the HS devs? It hasn't been triaged yet, but I'd like to know if I can expect some changes in the (near) future. Thanks in advance!


 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Tue Apr 26, 2011 5:39 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
For the next version we are planning for quite a few other changes - http://community.jboss.org/wiki/PlansFo ... ateSearch4. In this light I don't see this issue treated w/ high priority.
If you have any ideas on how to handle this case or want maybe to contribute let us know.

--Hardy


 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Tue May 03, 2011 9:14 am 
Newbie

Joined: Tue May 03, 2011 9:05 am
Posts: 1
Hi Elmer

Did you find a way to resolve your problem? We're dealing with the same issue (faceted search on a 1-to-many field). Currently, I'm using a BitSet to solve our problem:
http://sujitpal.blogspot.com/2007/04/lu ... -with.html
However, it would be nice if Hibernate Search could provide a solution.
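
The general idea behind a BitSet-based count can be sketched as follows. This is a minimal illustration using only java.util.BitSet, not the code from the linked post and not a Hibernate Search API. The class BitSetFacetCounter, the document ids, and the way the per-author bit sets are obtained are all assumptions. Each facet value has a bit set of the documents containing it, and the count is the number of bits it shares with the bit set of query hits.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper showing facet counting by bit set intersection.
class BitSetFacetCounter {

    // AND each value's document bits with the query hits and count what remains.
    static Map<String, Integer> count(Map<String, BitSet> docsPerValue, BitSet queryHits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : docsPerValue.entrySet()) {
            BitSet overlap = (BitSet) entry.getValue().clone(); // and() mutates, so copy first
            overlap.and(queryHits);
            int n = overlap.cardinality();
            if (n > 0) {
                counts.put(entry.getKey(), n);
            }
        }
        return counts;
    }

    // Convenience factory for a bit set with the given document ids set.
    static BitSet bits(int... docIds) {
        BitSet b = new BitSet();
        for (int id : docIds) {
            b.set(id);
        }
        return b;
    }

    public static void main(String[] args) {
        // Assumed document ids 0..2 for the three products of the unit test in this thread.
        Map<String, BitSet> docsPerAuthor = new LinkedHashMap<>();
        docsPerAuthor.put("Voltaire", bits(0));
        docsPerAuthor.put("Victor Hugo", bits(0, 1, 2));
        docsPerAuthor.put("Moliere", bits(1, 2));
        System.out.println(count(docsPerAuthor, bits(0, 1, 2)));
    }
}
```

Because a document can have its bit set for several authors at once, multi-valued fields are counted once per value, which is what the discrete facet request in this thread fails to do.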


 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Thu May 12, 2011 11:10 am 
Newbie

Joined: Fri Jun 11, 2010 3:52 pm
Posts: 4
Very disappointing to hear that this may not be a major priority. Facets are practically useless without the ability to multi-select and search across 1-to-many associations.
Building some of the more useful faceted interfaces simply cannot be done using Hibernate Search.
(see LinkedIn http://www.linkedin.com/search/fpsearch?type=people as a great example of facet counts across *tomany associations)

The problem cascades because sorting becomes completely incorrect. When faceting (with many options for selection), being able to push the selection options with the greatest count to the top is massively important from a user experience perspective.

Mike, how are you using the BitSetCounter with Hibernate Search? It looks to me like the FacetCounters are built through FacetCollector:

Code:
private <N extends Number> FacetCounter createFacetCounter(FacetingRequestImpl request) {
      if ( request instanceof DiscreteFacetRequest ) {
         return new SimpleFacetCounter();
      }
      else if ( request instanceof RangeFacetRequest ) {
         @SuppressWarnings("unchecked")
         RangeFacetRequest<N> rangeFacetRequest = (RangeFacetRequest<N>) request;
         return new RangeFacetCounter<N>( rangeFacetRequest );
      }
      else {
         throw new IllegalArgumentException( "Unsupported cache type" );
      }
   }


I can't see how to circumvent this code, without ripping into the source.

thanks


 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Thu May 12, 2011 11:42 am 
Beginner

Joined: Mon Apr 11, 2011 7:56 am
Posts: 38
mike.chu wrote:
Hi Elmer

Did you find a way to resolve your problem? We're dealing with the same issue (faceted search on a 1-to-many field). Currently, I'm using a BitSet to solve our problem:
http://sujitpal.blogspot.com/2007/04/lu ... -with.html
However, it would be nice if Hibernate Search could provide a solution.

Hi Mike,

Unfortunately, I'm still busy implementing other features for my graduation project, but I hope to have some time next week to look into this problem again. Do I understand correctly that you implemented faceting without using any of the hsearch built-in facet facilities?

And I agree with ttobin852, faceting is likely to be used in a data model with *-to-many relations, making its current implementation not really usable for these cases.


 Post subject: Re: Bug or not? Faceted search + embedded fields (*tomany)
PostPosted: Fri May 13, 2011 4:01 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

we definitely have to improve on this. The current work is a first approach. The good news is that the API won't have to change. It is basically a backend problem in this case, because we are using Lucene's FieldCache functionality which requires that there is only a single value per field.

--Hardy


 Post subject: Re: [HSEARCH-726] Faceted search + embedded fields (*tomany)
PostPosted: Fri Jun 10, 2011 10:21 am 
Beginner

Joined: Mon Apr 11, 2011 7:56 am
Posts: 38
edit: see HSEARCH-776

Unfortunately it seems that more bugs appear in this case, but I'm not sure if the cause is the same.
For example, if I have a search query with results in which 21 authors appear, and I ask for the top 10 authors, I get:
top 10:
Code:
Zef Hemel (4)
Patricia Johann (2)
Sander Vermolen (2)
Maartje de Jonge (2)
Rob Vermaas (1)
Gabor Karsai (1)
Otto Skrove Bagge (1)
Markus Völter (1)
Jaakko Järvi (1)
Steven Eker (1)


The same query, now asking for the top 100 authors:

Code:
Eelco Visser (14)
Martin Bravenboer (8)
Lennart C. L. Kats (6)
Zef Hemel (4)
Karina Olmos (3)
Patricia Johann (2)
Sander Vermolen (2)
Maartje de Jonge (2)
Merijn de Jonge (2)
Zine-El-Abidine Benaissa (2)
Rob Vermaas (1)
Gabor Karsai (1)
Otto Skrove Bagge (1)
Markus Völter (1)
Jaakko Järvi (1)
Steven Eker (1)
Éric Tanter (1)
Mark van den Brand (1)
Ralf Lämmel (1)
Tobias Kuipers (1)
Karl Trygve Kalleberg (1)


As you can see, the top 10 is incorrect, whereas the top 100 seems to be correct (except for the incorrect counts due to the initial issue). Why is the top 10 different from the first 10 facets of the top 100?
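
The behaviour I would expect can be written down as a small plain-Java sketch: sort all counted facets by count first and only then cut the list off at N, so that the top 10 is always a prefix of the top 100. The class TopNFacets is a made-up name, the counts are a few values taken from the top 100 list above, and nothing here claims to describe what Hibernate Search does internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: order by count (descending), then truncate to N entries.
class TopNFacets {

    static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        Comparator<Map.Entry<String, Integer>> byCountDesc =
            Map.Entry.<String, Integer>comparingByValue().reversed();
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort(byCountDesc); // stable sort: ties keep their original order
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Zef Hemel", 4);
        counts.put("Patricia Johann", 2);
        counts.put("Eelco Visser", 14);
        counts.put("Martin Bravenboer", 8);
        counts.put("Lennart C. L. Kats", 6);
        for (Map.Entry<String, Integer> e : topN(counts, 3)) {
            System.out.println(e.getKey() + " (" + e.getValue() + ")");
        }
    }
}
```

Truncating before the sort is complete would give a list like the top 10 shown above, where high-count authors such as Eelco Visser are missing.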

The code I use for fetching the facets:
Code:
         QueryBuilder builder = getFullTextSession().getSearchFactory().buildQueryBuilder().forEntity(entityClass).get();
         FacetingRequest facetReq = builder
            .facet()
            .name(facetName) //"author.name-10" and "author.name-100" in these cases
            .onField(field) // "author.name" in this case
            .discrete()
            .orderedBy(FacetSortOrder.COUNT_DESC)
            .includeZeroCounts(false).maxFacetCount(topN) //10 and 100 in this case
            .createFacetingRequest();
         
         facets = fullTextQuery.getFacetManager().enableFaceting(facetReq).getFacets(facetName);


 Post subject: Re: [HSEARCH-726] Faceted search + embedded fields (*tomany)
PostPosted: Wed Nov 20, 2013 3:01 pm 
Newbie

Joined: Tue Sep 17, 2013 12:46 pm
Posts: 5
This is something of a solution to the multi-value facet-count problem for hibernate-search.
Blog: http://outbottle.com/hibernate-search-multivalue-facet-counts/

The blog is complete with a Java Class that can be reused to generate facet-counts for single-value and multi-value fields.
The solution provided is based on the BitSet solution provided here: http://sujitpal.blogspot.ie/2007/04/lucene-search-within-search-with.html

The blog has a Maven project which demonstrates the solution quite comprehensively. The project demonstrates using the hibernate-search faceting API to filter on a date-range AND a 1-to-many (single-value) facet-group AND a many-to-many (multi-value) facet-group combined. The solution is then invoked to correctly derive facet-counts for each facet-group.

The solution facilitates results similar to this jsFiddle emulation: http://jsfiddle.net/jralston/EeW97/embedded/result/ (except that the emulation does not demo the range faceting).

The jsFiddle is part of a larger blog which explores the concept of facet searching in general: http://outbottle.com/understanding-faceted-searching/ .

If you’re like me and are finding the whole notion of facet-searching quite confusing then this will help.
It may not be the best solution in the world, so feel free to give feedback.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.