-->
These old forums are deprecated now and set to read-only. We are waiting for you on our new forums!
More modern, Discourse-based and with GitHub/Google/Twitter authentication built-in.

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 1 post ] 
Author Message
 Post subject: How to control eager subselect usage for collections?
PostPosted: Mon Apr 26, 2010 11:05 am 
Newbie

Joined: Fri May 15, 2009 3:11 pm
Posts: 1
Hi,
The nub of this post: can I *dynamically* control/override when subselect is used when eagerly fetching collections?

Right now, I believe that:
1. If lazy=true, fetch=subselect is hardcoded in the hbm for collections, then there is no way to turn off eager fetching via subselect for any query (either critieria or hql-based). This causes performance of queries that don't need to fetch the collections to degrade.

2. If the hbm does NOT specify lazy=true, fetch=subselect for the collections, then there is no way to ever get fetch=subselect behavior for any query (either critieria or hql-based). This causes performce of queries that need to efficiently fetch the collections to degrade (eager joins suffer from cartesian explosion).

Is my understanding correct?

Details:
As an example, I have a hierarchical class 'Node' that has three collections (see Node hbm snippet at bottom), and must support two distinct use cases:

1. Deep Narrow Query: Grab all nodes for a given day (runDay=1/1/2010). Workflow accesses all collections, so need to efficiently fetch the Node collections (childNodes/events/values).
Sample HQL: from Node n where n.runDay = '1/1/2010'

2. Shallow Wide Query: Gather all 'root' nodes for a given week (1/1/2010 <= runDay < 1/8/2010), where 'root' is defined as node that has no parent. In this case, none of the collections are accessed as part of the workflow, so they don't need to be fetched.
Sample HQL: from Node n where n.parentNode is null and '1/1/2010' <= runDay and runDay < '1/8/2010'


Performance:
Code:
    HBM
Lazy     Fetch           DeepNarrow           ShallowWide
false     subselect        11 sec                34 sec                 
true      - (select)       71 sec*                6 sec       // 3N+1 Select on lazy fetches
true      join            600 sec**               6 sec       // cartesian explosion on eager join


* - lazy query, but as access all collections in the DeepNarrow result, results in 3N+1 selects overall.
** - via hql, I overrode lazy=true with eager fetch of all collections, resulting in cartesian explosion (roughly 10*10*10=1000 fold explosion) and horrible performance.

Thus the ideal solution is to use lazy=false, fetch=subselect strategy for the Deep Narrow query (11 sec), and lazy=true for the Shallow Wide query (6 sec). But there is no way to do this?

Note that if there is not, then I have essentially have to choose one of the strategies (probably lazy=false, fetch=subselect), and live with the at least 5-fold degradation (34 sec compared to 6 secs for the Shallow Wide query). Ouch!

And if I can't control subselect dynamically, I'm open to other suggestions as to how to get acceptable performance for *both* of the above workflows?

Thanks,
Brad

---------------------------------------------------------------------

I have a data structure that looks like (ignore lazy/fetch attribs for now):

Code:
  <class name="Node" table="Node" lazy="false">
    <id name="nodeId" column="NodeId" type="integer" unsaved-value="0">
      <generator class="native"/>
    </id>
     <property name="runDay" column="RunDay" type="Calendar" not-null="true"/>

     <!-- Many properties and many-to-one and one-to-one associations have been removed to make this example succinct -->

    <many-to-one name="parentNode" class="Node" column="ParentNodeId" cascade="none" not-null="false"
                 index="IDX_PARENT_NODE_ID" lazy="false" fetch="join"  />

     <!-- 3 collections -->   

    <bag name="childNodes" inverse="true" cascade="none" lazy="false" fetch="subselect">
      <key column="ParentNodeId"/>
      <one-to-many class="Node"/>
    </bag>   

    <set name="events" table="NodeEvent" cascade="all-delete-orphan" lazy="false" fetch="subselect">
      <key column="NodeId" not-null="true"/>
      <one-to-many class="NodeEvent"/>
    </set>

    <map name="values" table="NodePropertyValue" cascade="all-delete-orphan" lazy="false" fetch="subselect">
      <key column="NodeId"/>
      <index-many-to-many column="PropertyTypeId" class="NodeType"/>
      <element column="PropertyValue" type="text"/>
    </map>
</class>


Top
 Profile  
 
Display posts from previous:  Sort by  
Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 1 post ] 

All times are UTC - 5 hours [ DST ]


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.