-->
These old forums are deprecated now and set to read-only. We are waiting for you on our new forums!
More modern, Discourse-based and with GitHub/Google/Twitter authentication built-in.

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 11 posts ] 
Author Message
 Post subject: Collection initialization for nested sets
PostPosted: Fri Jan 12, 2007 5:41 pm 
Newbie

Joined: Tue Feb 14, 2006 5:13 am
Posts: 18
I am building an application that uses a nested set database-schema similar to the one I found in the extended CaveatEmptor example. So I have a bidirectional parent/children relation, and additional rows for left/right-numbers.

Now I want to use one of the big advantages of the nested set model and load a whole subtree with one query.

The Nested-Set-Query is quite simple:
Code:
from Node n where n.left >= :left and n.right <= :right


Unfortunately this doesn't initalize the children-collection of the loaded Node-Entities. So when I traverse these nodes by using the childrens-collection, hibernate will do lazy initialization for that collection and load again the same Node-Objects just loaded before.

Is there a way the tell hibernate what the children elements are?

Currently I use a fetch join query for that:
Code:
from Node n left outer join fetch n.children where n.left >= :left and n.right <= :right


This works, but with very bad performance (and that's why i use a nested set model) even for my small testcase with less than 100 nodes, the database now needs 100ms instead of 10ms for executing the query with the join. and since this is not a linear problem, I expect even worse performance ratio with larger resultsets.

So I'm looking for a way to tell hibernate, that all elements needed for the children-collection are contained in the resultset of the simple query.

Browsing the hibernate source i found the InitializeCollectionEventListener but my knowledge in this area is very low.

Any hints or better idea?

Thanks in advance,
Andi


Top
 Profile  
 
 Post subject:
PostPosted: Mon Jan 15, 2007 12:06 am 
Expert
Expert

Joined: Thu Dec 23, 2004 9:08 pm
Posts: 2008
I'm pretty sure that hibernate is not re-loading the same child objects each time, it's just loading the children's IDs. Turn on show_sql and turn up the log level of org.hibernate.type to DEBUG: if I'm right, you should see that for each parent you iterate over, the full list of child IDs is selected, but the children are individually selected only if they haven't already been. This is about as good as you can get it: certainly you won't easily be able to improve your query, because whatever optmizations you do there will (mostly) affect the query cache, but what you need to affect is the session cache.

_________________
Code tags are your friend. Know them and use them.


Top
 Profile  
 
 Post subject:
PostPosted: Mon Jan 15, 2007 5:47 am 
Newbie

Joined: Tue Feb 14, 2006 5:13 am
Posts: 18
unfortunately this isn't true. at least in my case, hibernate creates a query that reads the whole entities, not only the ids. maybe i could do some configuration stuff so that hibernate just loads the id's. but even if this would be possible, it still leads to unnecessary acess to the database. and since my nessted sets will be quite big (and deep), it really matters to avoid this.


Top
 Profile  
 
 Post subject:
PostPosted: Mon Jan 15, 2007 7:11 am 
Hibernate Team
Hibernate Team

Joined: Tue Aug 26, 2003 6:10 am
Posts: 8615
Location: Neuchatel, Switzerland (Danish)
do your first query and use .iterate() on the collections to make hibernate just do a select for id of the childrens and if you already loaded the children it will find it in the session avoiding more queries.

_________________
Max
Don't forget to rate


Top
 Profile  
 
 Post subject:
PostPosted: Mon Jan 15, 2007 8:39 am 
Newbie

Joined: Tue Feb 14, 2006 5:13 am
Posts: 18
Thanks for your replies.

Even with iterate() hibernate still executes the query for each single child-node. If i have, e.g. a tree with 4 nodes per level and 10 levels hibernate will execute 2000 unnecessary selects. I was able to reduce the nr of queries by setting BatchSize, but therefore the unnecessary queries are slightly more complex. Is there really no better way?

To me the InitializeCollectionEventListener sounds like a good starting point. I would dig deeper here, if someone with more knowledge of the EventListener could tell me, if this makes sense.

Regards,
Andi


Top
 Profile  
 
 Post subject:
PostPosted: Mon Jan 15, 2007 9:20 am 
Hibernate Team
Hibernate Team

Joined: Tue Aug 26, 2003 6:10 am
Posts: 8615
Location: Neuchatel, Switzerland (Danish)
look into subselect fetching then.

_________________
Max
Don't forget to rate


Top
 Profile  
 
 Post subject:
PostPosted: Tue Jan 16, 2007 10:07 am 
Newbie

Joined: Tue Feb 14, 2006 5:13 am
Posts: 18
bingo! seems like i've read over the subselect feature multiple times.

this is still not the best solution i could think of, but from my point of view, the overhead is acceptable compared the the alternatives.

thanks for the hint!

for those of you who are interested in some performance comparsion:
i used a small tree with 4 levels, each intermediate node has 10 childnode (so i have 1111 entites)

the pure database-timings were:

incremental traversal (without nested-set)
1111*30ms = 33330 ms

single nested-set-select with join fetch:
4900 ms

single nested-set-select for all tree-nodes plus second select with subselect for children-collection
350 ms + 490 ms = 840 ms

so using the subselect feature did a huge performance improvement and the advantage will rise the bigger the tree gets.

if hibernate would have internal knowlegde of the nested tree model, it could avoid the second subselect query at all, so reading the whole tree could take only half the time. but i doubt we'll see a tree-association-type in hibernate soon ;-)

but there is one final optimization that could be possible.

for the subselect hibernate creates a query like:
Code:
select n.parent_id, n.id, n.* from Node n where n.parent_id in (select id from Node where n.left >= x and n.right <= y)


since i know that all nodes were loaded before, it would be sufficient to execute
Code:
select n.parent_id, n.id from Node n where n.parent_id in (select id from Node where n.left >= x and n.right <= y)


in my example this would reduce the load-time of the subselect query from 490ms to 330ms.

i thought using @LazyToOne(LazyToOneOption.PROXY) should do this, but
the subselect-query stayed the same. is it possible to suppress the relaoding of data for the subselect, or do i have to live with that?

thanks again for you help (and for hibernate of course)

andi


Top
 Profile  
 
 Post subject:
PostPosted: Tue Jan 16, 2007 11:51 am 
Hibernate Team
Hibernate Team

Joined: Mon Aug 25, 2003 9:11 pm
Posts: 4592
Location: Switzerland
I wrote the nested set implementation for Hibernate (yes, it's not really complete but the tests work). I didn't plan using the Node interface for queries! It's just an interface that encapsulates the delegate with the node info. So in one application you could have Category nodes, and maybe, Posting nodes (like in a webforum), these classes would then implement the Node interface to get the delegate. The interface is then detected by the Hibernate interceptor, so that DML for nested set updates can be executed when the tree nodes are modified.

I did not really implement the querying part, the plan was to first try this:

select c from Category c where c.nodeInfo.left >= :left...

This should initialize the c.childCategories collection immediately!

A second step would be to have a custom NestedSet collection that would support additional fetching operations in addition to a normal Set contract.

_________________
JAVA PERSISTENCE WITH HIBERNATE
http://jpwh.org
Get the book, training, and consulting for your Hibernate team.


Top
 Profile  
 
 Post subject:
PostPosted: Tue Jan 16, 2007 1:16 pm 
Newbie

Joined: Tue Feb 14, 2006 5:13 am
Posts: 18
christian,

yesterday i thought about contacting you to ask you whether you're interested in a discussion about some limitations/possible improvements of your implementation. are you?

christian wrote:
I didn't plan using the Node interface for queries! It's just an interface that encapsulates the delegate with the node info. So in one application you could have Category nodes, and maybe, Posting nodes (like in a webforum), these classes would then implement the Node interface to get the delegate.


Don't get confused with my implementation. Although i use your kind of database-schema, i have a slighty different implementation. i removed the nodeinfo-class and instead implemented Node as a base class, that the Posting/Category classes are derived of. My interceptor implementation then checks for my Node-Baseclass instead of the Node-Interface.

christian wrote:
select c from Category c where c.nodeInfo.left >= :left...

This should initialize the c.childCategories collection immediately!


But where would be the difference to the first query in my initial post?

Maybe i don't understand hibernate good enough. How should hibernate know that this query returns all children of the returned Category nodes, without special knowledge of nested sets? Without this knowledge hibernate has to do a second query to populate the children-collection.


One more question about subselect:

The test-code i created yesterday to learn about subselect fetching did use the @Fetch(FetchMode.SUBSELECT) annotation, but in my real project i would like to enable subselect-fetching only for specific queries. But i can't find a way to do this. Neither for HQL nor for Criteria-Queries i could find a way to programmatic enable subselect fetching. Am i just blind again?

Regards,
Andi


Top
 Profile  
 
 Post subject:
PostPosted: Tue Jan 16, 2007 1:26 pm 
Hibernate Team
Hibernate Team

Joined: Mon Aug 25, 2003 9:11 pm
Posts: 4592
Location: Switzerland
I don't have time currently to finish or discuss the nested set pattern. I might need it myself in a few weeks but then I'll blog about it when it's done.

Quote:
Maybe i don't understand hibernate good enough. How should hibernate know that this query returns all children of the returned Category nodes, without special knowledge of nested sets? Without this knowledge hibernate has to do a second query to populate the children-collection.


The resultset is marshalled by Hibernate into an object graph, the tree. During that marshalling, Hibernate knows that a row is an entity instance which has to be referenced in the collection of the owner, identified by the PARENT_CATEGORY_ID column value. It connects the graph accordingly.

Hm, but you are probably right that this doesn't work because Hibernate doesn't know if the whole Set is "complete" after that. So the best implementation is probably a custom NestedSet collection that knows this.

There is no way to enable subselect fetching on a per-query basis.

_________________
JAVA PERSISTENCE WITH HIBERNATE
http://jpwh.org
Get the book, training, and consulting for your Hibernate team.


Top
 Profile  
 
 Post subject:
PostPosted: Tue Jan 16, 2007 4:58 pm 
Newbie

Joined: Tue Feb 14, 2006 5:13 am
Posts: 18
christian wrote:
I don't have time currently to finish or discuss the nested set pattern. I might need it myself in a few weeks but then I'll blog about it when it's done.

ok, so i'm going ahead on my self. things i'm planning to do:
- posibility to maintain the order of children of one node
- moving subtrees
- optimization for initial creation of a whole tree
- introduction of an change-queue to execute nestes-set-changes in correct order

if you continue working on your nested sets implementation and interested in possible results of mine, just give me a hint. your work gave me a lot of very valueable hints so i'm glad if i can give something back.

but maybe the timeline of my project forces me go ahead without nested set, (others are waiting on my stuff to proceed with their work) and do the performance optimizations with nested sets in a later project stage. so i can't guarantee that i'll have presentable results soon.

best regards,
andi


Top
 Profile  
 
Display posts from previous:  Sort by  
Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 11 posts ] 

All times are UTC - 5 hours [ DST ]


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.