All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 15 posts ] 
 Post subject: hibernate search - unique results
PostPosted: Wed Apr 07, 2010 4:50 am 
Newbie

Joined: Tue Mar 23, 2010 10:00 am
Posts: 6
Is there a way to make hibernate search return unique results?


 Post subject: Re: hibernate search - unique results
PostPosted: Tue Apr 20, 2010 3:41 pm 
Newbie

Joined: Tue Apr 20, 2010 3:40 pm
Posts: 3
Yes :-)


 Post subject: Re: hibernate search - unique results
PostPosted: Tue Apr 20, 2010 4:45 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Is there a way to make hibernate search return unique results?

It should always do that. Are you getting duplicates? What does your query look like?

There was a bug (HSEARCH-476) that, in very rare situations, could add multiple copies of the same object to the index. Try version 3.2.0.CR1, which fixed it, and check whether that solves your problem.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: hibernate search - unique results
PostPosted: Mon Dec 05, 2011 5:28 pm 
Beginner

Joined: Sun Sep 18, 2011 4:44 pm
Posts: 22
Am I right that org.hibernate.search.jpa.FullTextQuery does not offer a way to retrieve unique results?
I have a case where a query returns the same document multiple times.
I retrieve the data by projection.
I am using Hibernate 4.

Thanks


 Post subject: Re: hibernate search - unique results
PostPosted: Mon Dec 05, 2011 5:39 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi bcn,
What is your use case? Documents should be unique in the index.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: hibernate search - unique results
PostPosted: Tue Dec 06, 2011 12:03 pm 
Beginner

Joined: Sun Sep 18, 2011 4:44 pm
Posts: 22
I have an entity, neighborhood, which contains a set of another entity, street.
What I observe is that when a street entity is updated with a JPA merge, the Hibernate Search index of
the parent entity neighborhood contains an additional document, as far as I understand the Luke display.
The street index itself still has one document. The updated property is not the id of the street, just a normal field, but it is included in the index.

When a search is then executed for a neighborhood and street name, the result contains the same street several times because of the multiple documents in the neighborhood index.

Does what I am saying make sense?

I call EntityManager.flush at some points; could that affect the indexes?
I am using Hibernate Search 4.0.0.CR2.

Thanks


 Post subject: Re: hibernate search - unique results
PostPosted: Tue Dec 06, 2011 12:19 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Are both neighbourhood and street indexed entities? Could you show your mapping and a sample query?

You should NOT have multiple Documents in the index for the same neighbourhood instance, but the single Lucene Document can contain multiple streets. So any match on the street would make the neighbourhood "included" in the results, but only once as there is a single Document.
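
To illustrate the point with a toy model (plain Java, not the Lucene API): a document is sketched here as a map from field name to all the values stored under that name. The class and method names are made up for this example. A neighbourhood with three streets is still one document, so a street match yields that neighbourhood once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SingleDocumentMatchSketch {

    // Toy stand-in for a Lucene Document: field name -> every value stored under that name.
    static Map<String, List<String>> neighborhoodDoc(String id, String... streetNames) {
        Map<String, List<String>> doc = new HashMap<>();
        doc.put("id", Collections.singletonList(id));
        doc.put("TStreets.streetName", Arrays.asList(streetNames));
        return doc;
    }

    // Returns the id of each document that has at least one value of `field` starting with `prefix`.
    static List<String> search(List<Map<String, List<String>>> index, String field, String prefix) {
        List<String> hits = new ArrayList<>();
        for (Map<String, List<String>> doc : index) {
            List<String> values = doc.getOrDefault(field, Collections.<String>emptyList());
            boolean matches = values.stream().anyMatch(v -> v.startsWith(prefix));
            if (matches) {
                // One hit per document, no matter how many of its values matched.
                hits.add(doc.get("id").get(0));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, List<String>>> index = Arrays.asList(
                neighborhoodDoc("n1", "main street", "market street", "oak avenue"),
                neighborhoodDoc("n2", "elm road"));
        // Two streets of n1 match the prefix "ma", but n1 is reported once.
        System.out.println(search(index, "TStreets.streetName", "ma")); // prints [n1]
    }
}
```

If the same neighbourhood shows up more than once in real results, that points to several documents for it in the index rather than to several matching streets.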

_________________
Sanne
http://in.relation.to/


 Post subject: Re: hibernate search - unique results
PostPosted: Tue Dec 06, 2011 12:39 pm 
Beginner

Joined: Sun Sep 18, 2011 4:44 pm
Posts: 22
Yes, both neighborhood and street are indexed entities.

some parts of neighborhood entity:
Code:
   @DocumentId
   @IndexedEmbedded
   @FieldBridge(impl = NeighborhoodIdBridge.class)
   @EmbeddedId
   @AttributeOverrides({
         @AttributeOverride(name = "countryIso2", column = @Column(name = "country_iso2", nullable = false, length = 2)),
         @AttributeOverride(name = "subdivId", column = @Column(name = "subdiv_id", nullable = false, length = 4)),
         @AttributeOverride(name = "cityId", column = @Column(name = "city_id", nullable = false, length = 6)),
         @AttributeOverride(name = "neighId", column = @Column(name = "neigh_id", nullable = false, length = 6)) })
   public TNeighborhoodId getId() {
      return this.id;
   }

   @IndexedEmbedded
   @OneToMany(fetch = FetchType.LAZY, mappedBy = "TNeighborhood")
   public Set<TStreet> getTStreets() {
      return this.TStreets;
   }


street entity:
Code:
  @ContainedIn
   @ManyToOne(fetch = FetchType.LAZY)
   @JoinColumns({
         @JoinColumn(name = "country_iso2", referencedColumnName = "country_iso2", nullable = false),
         @JoinColumn(name = "subdiv_id", referencedColumnName = "subdiv_id", nullable = false),
         @JoinColumn(name = "city_id", referencedColumnName = "city_id", nullable = false),
         @JoinColumn(name = "neigh_id", referencedColumnName = "neigh_id", nullable = false) })
   public TNeighborhood getTNeighborhood() {
      return this.TNeighborhood;
   }

   @Field(index = Index.YES, analyze = Analyze.YES, store = Store.YES)
   @Column(name = "street_name", nullable = false, length = 200)
   public String getStreetName() {
      return this.streetName;
   }

   // must be indexed for sorting
   @Field(index = Index.YES, analyze = Analyze.NO, store = Store.YES)
   @Column(name = "street_numestab", nullable = false)
   public short getStreetNumestab() {
      return this.streetNumestab;
   }


streetNumestab is the field being updated, which seems to cause the problem.

Search code:
Code:
         FullTextEntityManager fullTextEntityManager = org.hibernate.search.jpa.Search
               .getFullTextEntityManager(this.entityManager);

         QueryContextBuilder qcb = fullTextEntityManager.getSearchFactory()
               .buildQueryBuilder();
         QueryBuilder qb = qcb.forEntity(TNeighborhood.class).get();

         BooleanJunction<?> bj = qb.bool().must(
               qb.keyword().onField("id.countryIso2").matching(countryIso2)
                     .createQuery());
         bj = bj.must(qb.keyword().onField("id.subdivId").matching(subdivId)
               .createQuery());

         if (CodeUtils.notEmptyOrHyphen(cityId)) {
            bj = bj.must(qb.keyword().onField("id.cityId").matching(cityId)
                  .createQuery());
         }
         else {
            bj = bj.must(qb.keyword().onField("TCity.cityNameSearchNoAna")
                  .matching(CodeUtils.removeDiacritics(cityName).toLowerCase())
                  .createQuery());
         }

         if (CodeUtils.notEmptyOrHyphen(neighborhoodId)) {
            bj = bj.must(qb.keyword().onField("id.neighId")
                  .matching(neighborhoodId).createQuery());
         }
         else {
            bj = bj.must(qb.keyword().onField("neighNameSearchNoAna")
                  .matching(CodeUtils.removeDiacritics(neighName).toLowerCase())
                  .createQuery());
         }

         if (prefix != null) {
            bj = bj.must(qb.keyword().wildcard().onField("TStreets.streetName")
                  .matching(CodeUtils.removeDiacritics(prefix).toLowerCase() + "*")
                  .createQuery());
         }

         FullTextQuery query = fullTextEntityManager.createFullTextQuery(
               bj.createQuery(), TNeighborhood.class);
         query.setProjection(ProjectionConstants.DOCUMENT);

         SortField sortField;
         if (order == Order.numEstabs) {
            sortField = new SortField("TStreets.streetNumestab", SortField.INT,
                  true);
         }
         else {
            sortField = new SortField("TStreets.streetName", SortField.STRING,
                  false);
         }
         query.setSort(new Sort(sortField));

         if (numMaxRows != null) {
            query.setMaxResults(numMaxRows);
         }
         query.limitExecutionTimeTo(2, TimeUnit.SECONDS);

         List<StreetBean> ret = new ArrayList<StreetBean>();
         for (Object row : query.getResultList()) {
            Document doc = (Document) ((Object[]) row)[0];
            String streetId = doc.get("TStreets.streetId");
            if (streetId != null) {
               ret.add(new StreetBean(
                     Integer.parseInt(streetId),
                     countryIso2,
                     subdivId,
                     cityId,
                     neighborhoodId,
                     doc.get("TStreets.streetName"),
                     Integer.parseInt(doc.get("TStreets.streetNumestab")),
                     Integer.parseInt(doc.get("TStreets.TStreetType"))));
            }
         }


The result list "ret" contains the same street several times, although the street index (and the db table, of course) contains it only once.
It seems that after a street entity is merged, a new document is created in the neighborhood index. Each of these neighborhood documents contains several streets of the neighborhood.

Thanks!


 Post subject: Re: hibernate search - unique results
PostPosted: Wed Dec 07, 2011 9:47 am 
Beginner

Joined: Sun Sep 18, 2011 4:44 pm
Posts: 22
Any insight?

I confirmed again that when an associated (street) entity is updated (EntityManager.merge), new root documents (neighborhoods) are created in the index.
In fact, the number of new root documents is equal to the number of associated elements in the set.
If there were 3 streets, 3 new documents are created.

Is this a bug or am I doing something wrong? Please, I need to solve this quickly.

Thanks!


 Post subject: Re: hibernate search - unique results
PostPosted: Wed Dec 07, 2011 5:26 pm 
Beginner

Joined: Sun Sep 18, 2011 4:44 pm
Posts: 22
I have dug further into this and it seems to be a problem with a field bridge.
When an associated entity (street) is created, the root entity (neighborhood) is updated. Hibernate Search first removes the old root document and then adds the new one.
For some reason the removal fails and the old document remains in the index!
The code is in DeleteWorkDelegate.performWork. No exception occurs! I guess the boolean term just does not match. At least an info message should be logged in this case, no?

The document id field is:
Code:
   @DocumentId
   @IndexedEmbedded
   @FieldBridge(impl = NeighborhoodIdBridge.class)
   @EmbeddedId
   @AttributeOverrides({
         @AttributeOverride(name = "countryIso2", column = @Column(name = "country_iso2", nullable = false, length = 2)),
         @AttributeOverride(name = "subdivId", column = @Column(name = "subdiv_id", nullable = false, length = 4)),
         @AttributeOverride(name = "cityId", column = @Column(name = "city_id", nullable = false, length = 6)),
         @AttributeOverride(name = "neighId", column = @Column(name = "neigh_id", nullable = false, length = 6)) })
   public TNeighborhoodId getId() {
      return this.id;
   }

and the field bridge is
Code:
public class NeighborhoodIdBridge implements TwoWayFieldBridge {
   private final static String SEP = "_";

   @Override
   public String objectToString(Object object) {
      TNeighborhoodId id = (TNeighborhoodId) object;
      return id.getNeighId() + SEP + id.getCityId() + SEP + id.getSubdivId()
            + SEP + id.getCountryIso2();
   }

   @Override
   public void set(String name, Object value, Document document,
         LuceneOptions luceneOptions) {
      TNeighborhoodId id = (TNeighborhoodId) value;

      luceneOptions.addFieldToDocument(name + ".neighId", id.getNeighId(),
            document);
      luceneOptions
            .addFieldToDocument(name + ".cityId", id.getCityId(), document);
      luceneOptions.addFieldToDocument(name + ".subdivId", id.getSubdivId(),
            document);
      luceneOptions.addFieldToDocument(name + ".countryIso2",
            id.getCountryIso2(), document);
   }

   @Override
   public Object get(String name, Document document) {
      return new TNeighborhoodId(document.get(name + ".countryIso2"),
            document.get(name + ".subdivId"), document.get(name + ".cityId"),
            document.get(name + ".neighId"));
   }

}

Maybe updating only works for string bridges? I can't see a bug in my code.

Thanks,
Rick


 Post subject: Re: hibernate search - unique results
PostPosted: Thu Dec 08, 2011 8:00 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi bcn,
sorry for the delay, and thanks a lot for looking into it; your last comment was very helpful.

I think you've almost found the problem yourself: the old document is not deleted because the delete statement fails to find the previous match.
The @DocumentId you're using seems to include several fields; are these changing?

The output of the field bridge used on the getter marked with @DocumentId MUST NOT change. Lucene needs to be able to match the same document again for the delete operation to work properly. We can't delete using the old value and insert with the new one; the two ids must be the same (after the transformation into a String).

So the question becomes: why are you using @DocumentId? If you don't override it, it will default to the same value as @Id and will likely do the correct thing.

Unfortunately we can't detect the issue automatically without specifically querying the index; Lucene does not return the number of documents affected by delete/update statements.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: hibernate search - unique results
PostPosted: Thu Dec 08, 2011 8:32 am 
Beginner

Joined: Sun Sep 18, 2011 4:44 pm
Posts: 22
Thanks for the answer.

When I remove @DocumentId I get on deployment:
Code:
Caused by: org.hibernate.search.SearchException: No document id in: com.domain.TNeighborhood
   at org.hibernate.search.engine.spi.DocumentBuilderIndexedEntity.<init>(DocumentBuilderIndexedEntity.java:187)
   at org.hibernate.search.spi.SearchFactoryBuilder.initDocumentBuilders(SearchFactoryBuilder.java:378)
   at org.hibernate.search.spi.SearchFactoryBuilder.buildNewSearchFactory(SearchFactoryBuilder.java:224)

I guess this happens because it is annotated with @EmbeddedId, not with @Id.

The id / field bridge output does not change, i.e. it is the same primary key.

What I see in DeleteWorkDelegate is that in
Code:
   idQueryTerm = new TermQuery( builder.getTerm( id ) );

builder.getTerm( id ) produces something like id:xxx_xxx_xxx_xxx, i.e. with the output of the bridge objectToString() method (at least Term.toString shows that).
But the field "id" does not exist in the index, there are only the individual fields id.cityId etc. as defined in the bridge.
It should produce 4 terms id.cityId:xxx, id.neighId:yyy etc.
Could this be the problem?

By the way, I removed @IndexedEmbedded because it caused the id.xxx fields to be created twice.

Thanks


 Post subject: Re: hibernate search - unique results
PostPosted: Thu Dec 08, 2011 8:57 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
It should produce 4 terms id.cityId:xxx, id.neighId:yyy etc.
Could this be the problem?

Absolutely. A FieldBridge applied to the document id should create a single String field, and it should use the field name passed in as the first parameter of "set(String NAME, ...)".

I see how the flexibility we allow on the FieldBridge has bitten you. I wonder whether we could make a single field and a unique term mandatory when the bridge is applied to a document id.

Could you please create an issue for this? https://hibernate.onjira.com/browse/HSEARCH
We could also make the DeleteWorkDelegate smarter so that it deletes on all matching terms, but that has a performance impact.
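
The failure mode can be reproduced with a toy index in plain Java. This is only a sketch under simplifying assumptions, not Hibernate Search or Lucene code: a document is a map of field names to values, an update is modelled as "delete by id term, then add", and all class and method names are invented for the example. With the multi-field layout there is no field called "id", so the delete silently matches nothing and every update leaves one more document behind.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DeleteByTermSketch {

    // Toy index: each document is a map of field name -> value.
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Removes every document whose `field` holds exactly `value`. A miss is silent, with no error.
    void deleteByTerm(String field, String value) {
        docs.removeIf(doc -> value.equals(doc.get(field)));
    }

    // An update is modelled as: delete the old document by its id term, then add the new one.
    void update(String idField, String idValue, Map<String, String> newDoc) {
        deleteByTerm(idField, idValue);
        docs.add(newDoc);
    }

    int size() {
        return docs.size();
    }

    // Layout like the multi-field bridge in this thread: only id.neighId and id.cityId, no plain "id".
    static Map<String, String> multiFieldDoc(String neighId, String cityId) {
        Map<String, String> doc = new HashMap<>();
        doc.put("id.neighId", neighId);
        doc.put("id.cityId", cityId);
        return doc;
    }

    // Layout with one string field named exactly "id".
    static Map<String, String> singleFieldDoc(String neighId, String cityId) {
        Map<String, String> doc = new HashMap<>();
        doc.put("id", neighId + "_" + cityId);
        return doc;
    }

    public static void main(String[] args) {
        DeleteByTermSketch multi = new DeleteByTermSketch();
        DeleteByTermSketch single = new DeleteByTermSketch();
        for (int i = 0; i < 3; i++) {
            multi.update("id", "7_42", multiFieldDoc("7", "42"));
            single.update("id", "7_42", singleFieldDoc("7", "42"));
        }
        System.out.println("multi-field layout: " + multi.size() + " documents");   // 3
        System.out.println("single-field layout: " + single.size() + " documents"); // 1
    }
}
```

The single-field layout keeps exactly one document because the delete term and the stored field agree on both name and value.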

_________________
Sanne
http://in.relation.to/


 Post subject: Re: hibernate search - unique results
PostPosted: Thu Dec 08, 2011 9:16 am 
Beginner

Joined: Sun Sep 18, 2011 4:44 pm
Posts: 22
Created issue: https://hibernate.onjira.com/browse/HSEARCH-1003

I think changing the field structure a bit and using a TwoWayStringBridge instead should work.
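
A rough sketch of what I have in mind, written as plain Java so it compiles without Hibernate Search on the classpath. The two methods have the same signatures as the ones declared by TwoWayStringBridge (objectToString and stringToObject), so the real class would add "implements TwoWayStringBridge". NeighborhoodKey is a made-up stand-in for TNeighborhoodId, and I am assuming that none of the id parts contains the separator.

```java
import java.util.Objects;

class NeighborhoodIdStringBridgeSketch {

    private static final String SEP = "_";

    // Hypothetical stand-in for TNeighborhoodId; same constructor argument order as in the thread.
    static final class NeighborhoodKey {
        final String countryIso2;
        final String subdivId;
        final String cityId;
        final String neighId;

        NeighborhoodKey(String countryIso2, String subdivId, String cityId, String neighId) {
            this.countryIso2 = countryIso2;
            this.subdivId = subdivId;
            this.cityId = cityId;
            this.neighId = neighId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof NeighborhoodKey)) {
                return false;
            }
            NeighborhoodKey other = (NeighborhoodKey) o;
            return countryIso2.equals(other.countryIso2) && subdivId.equals(other.subdivId)
                    && cityId.equals(other.cityId) && neighId.equals(other.neighId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(countryIso2, subdivId, cityId, neighId);
        }
    }

    // The whole composite id becomes ONE string, so the index gets one "id" field and one term.
    public String objectToString(Object object) {
        NeighborhoodKey id = (NeighborhoodKey) object;
        return id.neighId + SEP + id.cityId + SEP + id.subdivId + SEP + id.countryIso2;
    }

    // Inverse of objectToString; relies on the assumption that no part contains SEP.
    public Object stringToObject(String stringValue) {
        String[] parts = stringValue.split(SEP, -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Unexpected id format: " + stringValue);
        }
        return new NeighborhoodKey(parts[3], parts[2], parts[1], parts[0]);
    }
}
```

The id.countryIso2, id.subdivId etc. fields used in my search code would then have to be indexed some other way, separately from the document id. I have not checked yet which mapping is best for that.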

Thanks!


 Post subject: Re: hibernate search - unique results
PostPosted: Thu Dec 08, 2011 10:16 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Excellent, thanks a lot.
Not sure we can include an improvement for this in 4.0.0.Final, but I'm suggesting it to the team; it might need another RC.

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.