I have the same problem. The scenario is this:
Parent (1) -> Children (2) -> ChildrensChildren (5)
Relations are mapped with cascade="all-delete-orphan". Numbers in parentheses are the object count (and row count in the DB).
Calling session.refresh(Parent) generates a single SQL statement that fetches data for all three entities from the DB, so it returns 5 rows. However, only the first two entities (Parent and Children) are expected, and the Children collection ends up containing 5 entities instead of the expected 2. One or both "Children" objects will be duplicated.
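To make the row arithmetic concrete, here is a small plain-Java sketch. It does not use Hibernate and is not how the loader is actually implemented; it only imitates the symptom. The ids and the 3/2 split of the five ChildrensChildren rows are assumptions made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; no Hibernate classes are involved.
class JoinRowDemo {

    // One row of the joined result: Parent x Children x ChildrensChildren.
    static final class Row {
        final int parentId;
        final int childId;
        final int grandChildId;

        Row(int parentId, int childId, int grandChildId) {
            this.parentId = parentId;
            this.childId = childId;
            this.grandChildId = grandChildId;
        }
    }

    // Rows for 1 parent, 2 children and 5 grandchildren.
    // Assumed split: 3 grandchildren under child 10, 2 under child 11.
    static List<Row> joinedRows() {
        return Arrays.asList(
            new Row(1, 10, 100),
            new Row(1, 10, 101),
            new Row(1, 10, 102),
            new Row(1, 11, 103),
            new Row(1, 11, 104));
    }

    // Naive hydration: one collection element per row, so children repeat.
    static List<Integer> naiveChildren(List<Row> rows) {
        List<Integer> children = new ArrayList<Integer>();
        for (Row r : rows) {
            children.add(r.childId);
        }
        return children;
    }

    // De-duplicating hydration: each child id is kept once, in row order.
    static List<Integer> distinctChildren(List<Row> rows) {
        Set<Integer> seen = new LinkedHashSet<Integer>();
        for (Row r : rows) {
            seen.add(r.childId);
        }
        return new ArrayList<Integer>(seen);
    }

    public static void main(String[] args) {
        List<Row> rows = joinedRows();
        System.out.println("naive children:    " + naiveChildren(rows).size());
        System.out.println("distinct children: " + distinctChildren(rows).size());
    }
}
```

The naive version yields one collection element per joined row (5), while the de-duplicating version yields the 2 children that actually exist.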
The problem has to do with an optimization that the refresh event uses to fetch all data in just one SQL statement. I don't know how to fix the optimization, but it can be disabled by commenting out line 122 in DefaultRefreshEventListener.
Code:
String previousFetchProfile = source.getFetchProfile();
// source.setFetchProfile("refresh");
Object result = persister.load( id, object, event.getLockMode(), source );
source.setFetchProfile(previousFetchProfile);
You can also avoid the problem by turning off refresh cascading in your mapping.
Hopefully Gavin will fix the real problem in the optimization. It has something to do with the class CascadeEntityLoader, which I don't understand.
A third option is to evict your object from all caches and then reload it. A session.load or session.get will then fetch a correctly loaded entity. This does, however, require that the object is not also held somewhere else within the session, or you end up with two active instances of the same entity. It is a risky approach, but it can be useful in some cases.
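A rough sketch of that third option, under some assumptions: SessionLike is a hypothetical stand-in I made up so the snippet compiles without Hibernate on the classpath. Its two methods mirror evict(Object) and get(Class, Serializable) on org.hibernate.Session, and in real code you would pass the Session itself. Only the session cache is handled here; if a second-level cache is enabled it would need its own eviction, which is not shown.

```java
import java.io.Serializable;

class EvictAndReloadSketch {

    // Hypothetical stand-in for org.hibernate.Session, limited to the two
    // calls this workaround needs. Not part of Hibernate.
    interface SessionLike {
        void evict(Object entity);
        Object get(Class<?> type, Serializable id);
    }

    // Detach the stale instance, then load a fresh one by id.
    // The caller must stop using 'stale' afterwards, otherwise two
    // instances of the same entity are in play.
    static <T> T evictAndReload(SessionLike session, T stale,
                                Class<T> type, Serializable id) {
        session.evict(stale);
        return type.cast(session.get(type, id));
    }
}
```

Usage would be something like parent = evictAndReload(session, parent, Parent.class, parentId), where parentId is whatever identifier your Parent mapping uses.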
Has anyone seen a Jira issue on this? More threads on the forum?