I don't want my first post about the book to be too negative, so I should say that in almost every regard it has been phenomenal: easy to read, well indexed, and thorough. It has been, and will continue to be, an indispensable resource, well worth the six copies we purchased. I guess I'm just grumpy because I spent the better part of a day trying to figure out why some of my results had duplicates. As soon as I left the book and did a Google search, I found hundreds of discussions on the topic.
Anyway, here's the issue. Page 580 of Java Persistence with Hibernate states:
Code:
If you now load many Item objects, for example with createCriteria(Item.class).list(), this is how the resulting SQL statement looks:
select i.*, b.*
from ITEM i
left outer join BID b on i.ITEM_ID = b.ITEM_ID
The resultset now contains many rows, with duplicate data for each Item that has many bids, and NULL fillers for all Item objects that don’t have bids. Look at the resultset in figure 13.5.
Hibernate creates three persistent Item instances, as well as four Bid instances, and links them all together in the persistence context so that you can navigate this graph and iterate through collections—even when the persistence context is closed and all objects are detached.
It's that last paragraph that I believe has led us to some confusion. I would agree that what is written is indeed what should happen; however, that doesn't appear to be the case. In fact, it contradicts the more accurate statement found in the Advanced Problems FAQ:
Code:
Hibernate does not return distinct results for a query with outer join fetching enabled for a collection (even if I use the distinct keyword)?
Query result lists are always exactly the same size as the underlying JDBC ResultSet. Try using uniqueResult() if appropriate, or simply distinctify the results yourself using, eg.
Collection results = new HashSet( session.createQuery("select p from Parent p left join fetch p.children").list() );
This filters out duplicate references, not duplicate "objects" or values. To understand why the duplicate references appear, have a look at the SQL resultset. Hibernate does not by default hide these duplicate references but mimics the SQL resultset. If you want to keep the order of elements, use a LinkedHashSet. An alternative in Hibernate 3.2 is a ResultTransformer that can filter duplicate references (in memory, of course).
Of course, I suppose it's possible to interpret the book text in such a way that these two items don't contradict each other, but I think the vast majority of people reading that section of the book would take away from that snippet that the List returned from the createCriteria(Item.class).list() statement would contain 3 items with the child associations properly initialized. At the very least, users should be made aware that they will have duplicate references in their returned list so that they can apply one of the available techniques for removing those duplicates. We have 4 people on my team reading this book and, to a person, they struggled with this behavior because of the way the book text is worded.
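For anyone else who lands here, below is a rough sketch of the behavior in plain Java, with no Hibernate on the classpath. The Item class, the bid labels, and the split of the four bids across the three items are my own assumptions for illustration (I am not reproducing figure 13.5). It only mimics what the FAQ describes: the list has one entry per SQL row, and wrapping it in a LinkedHashSet removes the repeated references while keeping the order.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class DuplicateRootDemo {

    /** Hypothetical stand-in for the book's Item entity; plain Java, not a mapped class. */
    public static final class Item {
        public final long id;
        public final List<String> bids = new ArrayList<>();

        public Item(long id, String... bidLabels) {
            this.id = id;
            for (String label : bidLabels) {
                bids.add(label);
            }
        }
    }

    /**
     * Mimics the list returned for an outer-join fetch: one entry per SQL row,
     * so an Item with N bids appears N times (same reference each time), and an
     * Item with no bids appears once. The bid distribution is an assumption.
     */
    public static List<Item> simulateJoinedResult() {
        Item first = new Item(1, "bid-1", "bid-2");
        Item second = new Item(2, "bid-3", "bid-4");
        Item third = new Item(3); // no bids: still one row, with NULL fillers

        List<Item> rows = new ArrayList<>();
        for (Item item : new Item[] {first, second, third}) {
            int rowCount = Math.max(1, item.bids.size());
            for (int i = 0; i < rowCount; i++) {
                rows.add(item);
            }
        }
        return rows;
    }

    /**
     * Removes repeated references while preserving encounter order. Item does
     * not override equals/hashCode, so the set compares by identity, which is
     * the "duplicate references" case the FAQ talks about.
     */
    public static List<Item> distinctByReference(List<Item> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        List<Item> rows = simulateJoinedResult();
        System.out.println("entries as returned: " + rows.size());
        System.out.println("entries after LinkedHashSet: " + distinctByReference(rows).size());
    }
}
```

With the assumed data this goes from 5 entries down to 3. In real code the same wrapping would be applied to the result of list(). As far as I can tell, the ResultTransformer the FAQ mentions for Hibernate 3.2 is Criteria.DISTINCT_ROOT_ENTITY, passed to setResultTransformer() on the Criteria, but check that against the API docs for your version.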