All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 11 posts ] 
 Post subject: Search across related entities in HSEARCH
Posted: Fri Apr 22, 2011 11:23 am
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
The nature of my project is such that search is needed and specifically search across related entities. We want to perform several queries involving a correlation between two or more properties of a given entity in a collection.

To put things in context, here is a snippet of the domain:

Code:
Artist { firstname, lastname, alias }
Album { title, releaseDate, genre }
House { address, city, state }


An artist can have many albums and houses. After indexing, I would like to be able to search for things like:

1) All artists who released an album in 2010 in the genre of jazz
2) All artists who live in New York, in the city of Long Island, who have released an album in 2011 with a genre of gospel

All my application cares about is returning "artist" entities at the end of every search.

One way to handle this would be to also index the collection entities (i.e. Album and House) and, when a correlated query like this comes in, for example in the case of 1), target Album and then collect all the matching artists. But this quickly gets messy when I have more than one related child collection, e.g. in 2).

It seems sub-optimal to perform several queries against child indexes and then combine the results to arrive at the final matches. In our case, we have several of these child collections.

What is the best way to handle things like this? I looked into filters, but they don't address issues like this, which concern the query itself; filters are meant more for cross-cutting concerns, which this is not.

Is it even ideal to try to handle this with search? Or is the best thing to just run HQL against the database?

Thanks.


 Post subject: Re: Search across related entities in HSEARCH
Posted: Fri Apr 22, 2011 2:12 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
This depends on your requirements. Almost all queries can be expressed with @IndexedEmbedded; did you try it?

I'll mention the limitation of @IndexedEmbedded so that you don't get disappointed later: all collections are literally "embedded" in the same Document, as Lucene has no notion of relations at all.

So considering your case
Quote:
2) All artists who live in New York, in the city of Long Island, who have released an album in 2011 with a genre of gospel

that would be easy to implement, but in its simplest form it will match all artists who:
- released at least one album in 2011
- released at least one gospel album

(omitting the other restrictions, which are easy). To be clear: you can't easily express the requirement that the album matching 2011 is the same which is classified as gospel.
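
The effect can be reproduced without Lucene at all. The following is a minimal plain-Java sketch, not Hibernate Search code: the class and field names are hypothetical, and the two sets stand in for the multi-valued fields that an embedded collection ends up as in a single Document.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java illustration only; no Hibernate Search or Lucene classes are used.
class FlattenedArtistDocument {
    // Hypothetical stand-ins for multi-valued fields such as "albums.year" and "albums.genre".
    final Set<String> albumYears = new HashSet<>();
    final Set<String> albumGenres = new HashSet<>();

    // Embedding an album adds its values to the artist-level fields;
    // which year belonged to which genre is not recorded.
    void embedAlbum(String year, String genre) {
        albumYears.add(year);
        albumGenres.add(genre);
    }

    // The "simplest form" of the query: a year term AND a genre term on the same document.
    boolean matches(String year, String genre) {
        return albumYears.contains(year) && albumGenres.contains(genre);
    }
}

class FlattenedDemo {
    public static void main(String[] args) {
        FlattenedArtistDocument doc = new FlattenedArtistDocument();
        doc.embedAlbum("2011", "jazz");
        doc.embedAlbum("2009", "gospel");
        // No single album is both 2011 and gospel, yet the document matches.
        System.out.println(doc.matches("2011", "gospel")); // prints "true"
    }
}
```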

In most cases this is a subtle difference that is easily overcome: low-scoring matches are not at the top anyway, so what the user is searching for is likely to be returned at the top of the results.

To implement this query correctly, the solution is simple: have an index of Albums as well, embedding the artist's metadata. You'll still be able to retrieve the matching artists from the usual relational query; this is still better than SQL, definitely more flexible and efficient.
It's usually possible to map surprisingly complex relations this way, but admittedly not all kinds of restrictions are possible. Sometimes you need "post-load" filtering: you can write your own filters at the Lucene level, or just use a scrollable result to perform some programmatic filtering on loaded entities. Sometimes you encode additional metadata in the Document for the purpose of filtering.
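
As a rough illustration of why an album-level index keeps the correlation, here is a plain-Java sketch under stated assumptions: `AlbumDocument` and `artistsWithAlbum` are hypothetical names, a list stands in for the Album index, and the artist id stands in for the embedded artist metadata. It does not use the Hibernate Search API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// One document per album, so year and genre stay together.
class AlbumDocument {
    final String artistId;   // stand-in for the embedded artist metadata
    final int releaseYear;
    final String genre;

    AlbumDocument(String artistId, int releaseYear, String genre) {
        this.artistId = artistId;
        this.releaseYear = releaseYear;
        this.genre = genre;
    }
}

class AlbumIndexSketch {
    // Both restrictions are tested against the same album document,
    // then the owning artists are collected without duplicates.
    static Set<String> artistsWithAlbum(List<AlbumDocument> albumIndex, int year, String genre) {
        Set<String> artistIds = new LinkedHashSet<>();
        for (AlbumDocument album : albumIndex) {
            if (album.releaseYear == year && album.genre.equals(genre)) {
                artistIds.add(album.artistId);
            }
        }
        return artistIds;
    }

    public static void main(String[] args) {
        List<AlbumDocument> albumIndex = Arrays.asList(
                new AlbumDocument("a1", 2011, "jazz"),
                new AlbumDocument("a1", 2009, "gospel"),
                new AlbumDocument("a2", 2011, "gospel"));
        // a1 has a 2011 album and a gospel album, but they are different albums.
        System.out.println(artistsWithAlbum(albumIndex, 2011, "gospel")); // prints "[a2]"
    }
}
```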

As for the suitability of the framework, the idea is that it should be effective to write a query in 99% of cases; the remaining 1% is not solvable efficiently. Highly relational queries require a relational model, hence this is not meant to make HQL or SQL obsolete but to complement them.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Search across related entities in HSEARCH
Posted: Fri Apr 22, 2011 3:07 pm
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
I have things mapped correctly with @IndexedEmbedded and all.

This is a snippet of my code (showing only the relevant annotation bits)

Code:
Artist {
  @IndexedEmbedded
  @OneToMany(mappedBy = "artist")
  private Set<Album> albums;
 
  @IndexedEmbedded
  @OneToMany(mappedBy = "artist")
  private Set<House> houses;

}

Album {
  @ManyToOne
  @JoinColumn(name="ARTIST_ID", insertable = false, updatable = false)
  @ContainedIn
  private Artist artist;
}



Quote:
To be clear: you can't easily express the requirement that the album matching 2011 is the same which is classified as gospel.


That is exactly my requirement. I want to be sure the album of 2011 is the same one classified as gospel.

Quote:
To implement this query correctly, the solution is simple: have an index of Albums as well, embedding the artist's metadata;


I don't follow. If I try to embed the metadata of Artist into Album (i.e. by putting @IndexedEmbedded on the artist property in Album), I get:

Code:
Caused by: org.hibernate.search.SearchException: Circular reference. Duplicate use of com.myapp.Album in root entity com.myapp.Album#artist.albums.
   at org.hibernate.search.engine.AbstractDocumentBuilder.checkForIndexedEmbedded(AbstractDocumentBuilder.java:599)
   at org.hibernate.search.engine.AbstractDocumentBuilder.initializeMemberLevelAnnotations(AbstractDocumentBuilder.java:436)
   at org.hibernate.search.engine.AbstractDocumentBuilder.initializeClass(AbstractDocumentBuilder.java:379)
   at org.hibernate.search.engine.AbstractDocumentBuilder.checkForIndexedEmbedded(AbstractDocumentBuilder.java:618)
   at org.hibernate.search.engine.AbstractDocumentBuilder.initializeMemberLevelAnnotations(AbstractDocumentBuilder.java:436)


Quote:
It's usually possible to map surprisingly complex relations this way, but admittedly not all kinds of restrictions are possible. Sometimes you need "post-load" filtering: you can write your own filters at the Lucene level, or just use a scrollable result to perform some programmatic filtering on loaded entities. Sometimes you encode additional metadata in the Document for the purpose of filtering.


Is there an example of this? Does the HSearch in Action book have one? If not, can you point me to any sample you know of?

I know search will be more efficient, which is why I want to use it. But if I can't guarantee that the results will match when correlated fields are involved in a search (which is 95% of our use cases), then it will not serve the purpose for which we are integrating it.

That is what I am facing (see https://forum.hibernate.org/viewtopic.php?f=9&t=1001551&hilit=correlated).


 Post subject: Re: Search across related entities in HSEARCH
Posted: Mon Apr 25, 2011 5:37 am
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
Still haven't heard back concerning this. Can anyone please help out as this will help determine how we move forward on the project?

Thanks in advance


 Post subject: Re: Search across related entities in HSEARCH
Posted: Mon Apr 25, 2011 6:02 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Is there an example of this? Does the HSearch in Action book have one? If not, can you point me to any sample you know of?

Assuming you have the book, chapter 8 shows a filter implemented by taking an external service into consideration.

Quote:
Still haven't heard back concerning this.

Ever heard of Easter holidays? ;)

Code:
Caused by: org.hibernate.search.SearchException: Circular reference. Duplicate use of com.myapp.Album in root entity

Something isn't right there, or maybe I misunderstood what you're trying to do. Care to make a test?

As an alternative you can map your objects to Lucene Documents with ultimate flexibility via a @ClassBridge. It requires a bit more coding, so I'd favour the plain annotations, but you could index a small portion of your graph with it to break any circularity or meet the weirdest requirements. I don't think you will find any blocking issue in the modelling flexibility, as ultimately the framework's goal is to help you out, never to prevent you from switching to a lower-level approach or replacing core components. Still, if something doesn't look right with your circular reference issue, we can improve that with your help and suggestions.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Search across related entities in HSEARCH
Posted: Mon Apr 25, 2011 7:16 am
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
Thanks Sanne. I do have a copy of the book and I have read through chap 8, but I still don't see how that helps me with this particular scenario. All I want is to be able to have search return the same results as I would get if I were to do a similar query against the database. That said, I know it is not meant to be a replacement for the database, but this particular restriction (i.e. not being able to query correlated fields in a collection) seems very limiting to all search frameworks I have investigated.

What I am asking is to see the best way I could go about solving this issue since we have already done quite a lot with HSearch at this point and would prefer to stick with it.

As I pointed out in my last post, my requirements are to:

Quote:
make sure the album matching 2011 is the same which is classified as gospel


This touches 2 correlated fields (year and genre) in the Album entity, which is embedded in the Artist entity.

What is the best way to get a test across to you? Would it be sufficient to have you pull from github?

Please let me know.

Thanks.

PS: Happy Easter!


Last edited by ronotica on Mon Apr 25, 2011 11:39 am, edited 1 time in total.

 Post subject: Re: Search across related entities in HSEARCH
Posted: Mon Apr 25, 2011 10:41 am
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
Hey Sanne - here is a project on github that gives you a test case.

https://github.com/berinle/hsforum_correlation

The scripts to create the tables and seed them are in src/main/scripts

Please take a look at the TestCase1 class.

Let me know if you need anything else. As I mentioned earlier, you still get the circular reference error when you try to embed Artist in Album as you had suggested. My goal is to make the 2 test cases pass.

You can run the tests by invoking ./gradlew tT from the directory you cloned the project into.

Thanks.


 Post subject: Re: Search across related entities in HSEARCH
Posted: Wed Apr 27, 2011 12:14 pm
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
Sanne - Have you had a chance to look at this? Please let me know. Thanks


 Post subject: Re: Search across related entities in HSEARCH
Posted: Wed Apr 27, 2011 1:03 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi ronotica, I'm travelling and very busy until next week. I'm glad you shared the project and will look into it when I can, but sorry, I have many other things in the pipeline, so I won't react as quickly as I usually do.
In the meantime, could you try working this out by using two indexes:
- one for the Artist
- one for the Album

That way you should be able to figure it out. If by any chance you keep getting an "org.hibernate.search.SearchException: Circular reference", that might be an issue, but as I said, don't let it block you: you can temporarily work around it by defining a custom ClassBridge, and later change it back to annotation-level properties once we have figured out what exactly is wrong.

While you implement a ClassBridge you'll have full control over the mapping to the index, so I assume you'll see for yourself that Search won't ever be a hindrance to flexibility. At worst, you might find that Lucene is not suited to what you need; in that case I can only help you by figuring out very advanced filters for Lucene. Be aware that people are working a lot on better support for relational queries via Lucene; we're just not there yet.
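
One thing full control over the mapping makes possible is writing an extra field that keeps correlated values together. The sketch below is an assumption for illustration, not something proposed in this thread and not the Hibernate Search bridge API: it only shows, in plain Java, the kind of combined year-and-genre token a custom bridge could add per album. The class name, the token format and the field it would live in are all hypothetical, and the approach only covers exact-match pairs.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration only; this is not the Hibernate Search bridge API.
class ArtistDocumentWithPairs {
    // Hypothetical extra field holding one combined token per album.
    final Set<String> yearGenreTokens = new HashSet<>();

    // Build a single token that keeps an album's year and genre together.
    static String token(int year, String genre) {
        return year + "_" + genre.toLowerCase(Locale.ROOT);
    }

    void addAlbum(int year, String genre) {
        yearGenreTokens.add(token(year, genre));
    }

    // A query for the combined token only matches when one album has both values.
    boolean hasAlbum(int year, String genre) {
        return yearGenreTokens.contains(token(year, genre));
    }
}

class PairTokenDemo {
    public static void main(String[] args) {
        ArtistDocumentWithPairs doc = new ArtistDocumentWithPairs();
        doc.addAlbum(2011, "Jazz");
        doc.addAlbum(2009, "Gospel");
        System.out.println(doc.hasAlbum(2011, "gospel")); // prints "false"
        System.out.println(doc.hasAlbum(2009, "gospel")); // prints "true"
    }
}
```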

SOLR-2272 is closely related, but as you can see it caused quite a fuss, so I'm not going to insist right now that they merge it ;)

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Search across related entities in HSEARCH
Posted: Wed Apr 27, 2011 2:44 pm
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
Thanks Sanne. I understand. Have a safe trip.

I'm not sure I fully understand what you are asking me to try, but I have tried with the two indexes (album and artist), and as long as I do multiple searches, I get the right results for correlated searches (Testcase2 does just that in the code I put on github).
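
For readers following along, the multiple-search approach boils down to running one search per child index and keeping only the artists present in every result. This is a minimal plain-Java sketch of that combining step, not the code from the github project; the class name, method name and artist ids are hypothetical, and the two input sets stand in for the artist ids returned by separate searches.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class MultiIndexSketch {
    // Keep only the artist ids that appear in both result sets.
    // The inputs are copied so the caller's sets are left untouched.
    static Set<String> intersect(Set<String> first, Set<String> second) {
        Set<String> result = new LinkedHashSet<>(first);
        result.retainAll(second);
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for: artists owning an album that is both 2011 and gospel.
        Set<String> fromAlbums = new LinkedHashSet<>(Arrays.asList("artist-1", "artist-2"));
        // Stand-in for: artists owning a house in the requested city and state.
        Set<String> fromHouses = new LinkedHashSet<>(Arrays.asList("artist-2", "artist-3"));
        System.out.println(intersect(fromAlbums, fromHouses)); // prints "[artist-2]"
    }
}
```

Each additional child collection adds one more search and one more intersection, which is the overhead the first post describes as getting messy.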

As for embedding, that is still not working: Album is already @IndexedEmbedded in Artist, so trying to embed Artist in Album leads to a circular reference error. I'll see how things work with the custom class bridge.

Thanks again.

PS: Please do keep this in mind (if you can) and let me know next week sometime when you get back and are able to take a closer look.


 Post subject: Re: Search across related entities in HSEARCH
Posted: Sun May 08, 2011 1:52 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi Ronotica,
I'm flying back very soon. Did you make any progress? Did you try using a ClassBridge to break out of the circularity?
For Hibernate Search 4 (soonish) I want to change the internal design so that we don't have any issue with circularity at all. That has been the plan for a long time, but we're quite strict about not breaking backwards compatibility, and this was not possible without doing so.

General plan:
http://community.jboss.org/wiki/Plansfo ... ateSearch4
Make sure to comment and help out if you want more changes. Of course more improvements are acceptable after 4.0, but the API should always be backwards compatible unless there are excellent reasons. Lucene has some experimental support for join operations in Lucene 4, but it doesn't look stable enough yet. We might consider it if we can get some help with how best to integrate it and with testing.

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.