I have two questions about querying collections as efficiently as possible.
1) In some cases Hibernate requires an unnecessary join of an intermediate table.
Consider the this example: A User has a many-to-one association with a Group, and a Group has a set of Permissions. So the USER table has a USERID primary key and a GROUPID foreign key, and the PERMISSION table also has a GROUPID foreign key.
Suppose I'd like to retrieve some information about a User and their Permissions, but I don't need to get any Group information. I could write HQL like "select u.username, p.type, p.value from User u join fetch u.group.permissions p". This will result in SQL like "select u.username, p.type, p.value from USER u inner join GROUP g on g.GROUPID = u.GROUPID inner join PERMISSION p on p.GROUPID = g.GROUPID".
In handwritten SQL, you could avoid joining the GROUP table, since we're not selecting any columns from it, and since the GROUPID is available in the USER table: "select u.username, p.type, p.value from USER u inner join PERMISSION p on p.GROUPID = u.GROUPID".
Is it possible to craft the HQL to avoid the unnecessary join in this case? Is a theta-style join possible here? Or do I need to use a SQLQuery?
2) What's the recommended way to efficiently load multiple collections for multiple entities?
If you want to load just one collection for a bunch of entities, that's easy: for example, "from Group g left join fetch g.permissions where g.creationDate > ?". But suppose Group also has a set of Users, and I'd like to load all the Users AND all the Permissions for the Groups meeting some criteria.
You can run the query above and then iterate through the results (taking care to coalesce the duplicate rows resulting from the join), initializing the users collection for each Group with a call like "Hibernate.initialize(group.getUsers())". That of course results in N additional selects.
You can run another query like the one above, this time joining the users collection: "from Group g left join fetch g.users where g.creationDate > ?". Since the Group objects are in the session cache, if the same list of Groups is returned by both queries, I guess you can end up with your list of Groups with both collections initialized. But this results in extraneous database overhead to load the Groups twice.
You can run the first query above (to get the Groups and eager-fetch the Permissions), then do query like "from User u where u.group.creationDate > ?", and finally manually assign the Users to their respective Group collections. This seems to be the most efficient, and is what I'd imagine writing in native SQL. But it only works for collections of entities, not components (unless you write native SQL and manually instantiate the components).
I know that in many cases this problem can be mitigated in other ways: caching, aggregation, etc. But sometimes you really do need to load all the data. Are there any elegant solutions to this problem?
Thanks in advance for your help!
|