
All times are UTC - 5 hours [ DST ]



 Post subject: [bug?] Caching not handling collections correctly
Posted: Fri Mar 17, 2006 1:31 pm
Newbie

Joined: Fri Mar 17, 2006 12:54 pm
Posts: 10
I've got an example where I have the following object hierarchy:

Order 1-* Bag 1-* Item

I have a lot of these objects, so I want to cache them in a transactional manner (perhaps using TreeCache) to minimize the number of database queries. However, it looks like Hibernate is not caching the object collections correctly.

When I create an Order, and then a Bag (adding it to the Order) and then an Item (adding it to the Bag), the object cache's Order, Bag, and Item are correctly populated with these new objects. However, the collection cache's Order.bags and Bag.items are not. It looks like they are populated the next time these collections are loaded, but even that is not consistent.


I put together a test app which isolates just these objects and runs a JUnit test that shows the problem with the caching. It can be downloaded here.


Output from JUnit test (show_sql=true):
Code:
====== step 1 ======
Hibernate: select nextval ('order_sequence')
Hibernate: select nextval ('bag_sequence')
Hibernate: select nextval ('item_sequence')
Hibernate: insert into ce_order (id) values (?)
Hibernate: insert into ce_bag (order_id, index, id) values (?, ?, ?)
Hibernate: insert into ce_item (bag_id, index, id) values (?, ?, ?)
Hibernate: update ce_bag set order_id=?, index=? where id=?
Hibernate: update ce_item set bag_id=?, index=? where id=?
model.Bag.items ** {}
model.Item ** {1={index=null, _subclass=model.Item, bag=1, _lazyPropertiesUnfetched=false}}
model.Bag ** {1={index=null, _subclass=model.Bag, order=1, items=1, _lazyPropertiesUnfetched=false}}
model.Order ** {1={_subclass=model.Order, bags=1, _lazyPropertiesUnfetched=false}}
model.Order.bags ** {}
====== step 2 ======
Hibernate: select bags0_.order_id as order2_1_, bags0_.id as id1_, bags0_.index as index1_, bags0_.id as id0_, bags0_.order_id as order2_1_0_, bags0_.index as index1_0_ from ce_bag bags0_ where bags0_.order_id=?
Hibernate: select items0_.bag_id as bag2_1_, items0_.id as id1_, items0_.index as index1_, items0_.id as id0_, items0_.bag_id as bag2_2_0_, items0_.index as index2_0_ from ce_item items0_ where items0_.bag_id=?
Hibernate: select nextval ('item_sequence')
Hibernate: insert into ce_item (bag_id, index, id) values (?, ?, ?)
Hibernate: update ce_item set bag_id=?, index=? where id=?
model.Bag.items ** {}
model.Item ** {2={index=null, _subclass=model.Item, bag=1, _lazyPropertiesUnfetched=false}, 1={index=0, _subclass=model.Item, bag=1, _lazyPropertiesUnfetched=true}}
model.Bag ** {1={index=0, _subclass=model.Bag, order=1, items=1, _lazyPropertiesUnfetched=true}}
model.Order ** {1={_subclass=model.Order, bags=1, _lazyPropertiesUnfetched=false}}
model.Order.bags ** {1=[1]}
====== step 3 ======
Hibernate: select items0_.bag_id as bag2_1_, items0_.id as id1_, items0_.index as index1_, items0_.id as id0_, items0_.bag_id as bag2_2_0_, items0_.index as index2_0_ from ce_item items0_ where items0_.bag_id=?
Hibernate: select nextval ('item_sequence')
Hibernate: insert into ce_item (bag_id, index, id) values (?, ?, ?)
Hibernate: update ce_item set bag_id=?, index=? where id=?
model.Bag.items ** {}
model.Item ** {2={index=1, _subclass=model.Item, bag=1, _lazyPropertiesUnfetched=true}, 1={index=0, _subclass=model.Item, bag=1, _lazyPropertiesUnfetched=true}, 3={index=null, _subclass=model.Item, bag=1, _lazyPropertiesUnfetched=false}}
model.Bag ** {1={index=0, _subclass=model.Bag, order=1, items=1, _lazyPropertiesUnfetched=true}}
model.Order ** {1={_subclass=model.Order, bags=1, _lazyPropertiesUnfetched=false}}
model.Order.bags ** {1=[1]}


Notice how the first thing it does in step 2 is go to the database for the Bag and Item objects it just stored. These objects are actually in the cache (as seen in the cache dump from step 1), but the collection caches Order.bags and Bag.items are not correctly set.

XDoclet generated Mapping documents:

Code:
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping
>
    <class
        name="model.Order"
        table="ce_order"
    >
        <cache usage="transactional" />

        <id
            name="id"
            column="id"
            type="java.lang.Long"
        >
            <generator class="sequence">
                <param name="sequence">order_sequence</param>
              <!-- 
                  To add non XDoclet generator parameters, create a file named
                  hibernate-generator-params-Order.xml
                  containing the additional parameters and place it in your merge dir.
              -->
            </generator>
        </id>

        <list
            name="bags"
            lazy="false"
            cascade="save-update"
        >
            <cache
                usage="transactional"
            />

            <key
                column="order_id"
            >
            </key>

            <index
                column="index"
            />

            <one-to-many
                  class="model.Bag"
            />

        </list>

        <!--
            To add non XDoclet property mappings, create a file named
                hibernate-properties-Order.xml
            containing the additional properties and place it in your merge dir.
        -->

    </class>

</hibernate-mapping>


Code:
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping
>
    <class
        name="model.Bag"
        table="ce_bag"
    >
        <cache usage="transactional" />

        <id
            name="id"
            column="id"
            type="java.lang.Long"
        >
            <generator class="sequence">
                <param name="sequence">bag_sequence</param>
              <!-- 
                  To add non XDoclet generator parameters, create a file named
                  hibernate-generator-params-Bag.xml
                  containing the additional parameters and place it in your merge dir.
              -->
            </generator>
        </id>

        <many-to-one
            name="order"
            class="model.Order"
            cascade="none"
            outer-join="auto"
            update="true"
            insert="true"
            column="order_id"
            not-null="true"
        />

        <property
            name="index"
            type="java.lang.Integer"
            update="true"
            insert="true"
            column="index"
        />

        <list
            name="items"
            lazy="false"
            cascade="none"
        >
            <cache
                usage="transactional"
            />

            <key
                column="bag_id"
            >
            </key>

            <index
                column="index"
            />

            <one-to-many
                  class="model.Item"
            />

        </list>

        <!--
            To add non XDoclet property mappings, create a file named
                hibernate-properties-Bag.xml
            containing the additional properties and place it in your merge dir.
        -->

    </class>

</hibernate-mapping>


Code:
<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE hibernate-mapping PUBLIC
    "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping
>
    <class
        name="model.Item"
        table="ce_item"
        mutable="false"
    >
        <cache usage="transactional" />

        <id
            name="id"
            column="id"
            type="java.lang.Long"
        >
            <generator class="sequence">
                <param name="sequence">item_sequence</param>
              <!-- 
                  To add non XDoclet generator parameters, create a file named
                  hibernate-generator-params-Item.xml
                  containing the additional parameters and place it in your merge dir.
              -->
            </generator>
        </id>

        <many-to-one
            name="bag"
            class="model.Bag"
            cascade="none"
            outer-join="auto"
            update="true"
            insert="true"
            column="bag_id"
            not-null="true"
        />

        <property
            name="index"
            type="java.lang.Integer"
            update="true"
            insert="true"
            column="index"
        />

        <!--
            To add non XDoclet property mappings, create a file named
                hibernate-properties-Item.xml
            containing the additional properties and place it in your merge dir.
        -->

    </class>

</hibernate-mapping>



JUnit test demonstrating problem
Code:
package test;

import model.Order;
import model.Bag;
import model.Item;

import java.util.Map;

import org.hibernate.Session;
import org.hibernate.Transaction;
import org.hibernate.stat.Statistics;

/**
* 16:36 < unlord> anyway, the test should create an order, then create a bag and add it to the order, then create an Item and add it, close the tx
* 16:36 < unlord> then open another tx, try to retrieve the Order, then get the Bag off the order, and add a new Item
* 16:37 < unlord> close the tx, and then do the exact same thing a 3rd time
* 16:37 < unlord> on the second Tx, it should find the Order just fine, but will go to the DB for the bag and item
* 16:37 < unlord> on the third Tx, it should find the Bag just fine, but go to the db for the second Item
* 16:39 < unlord> with a transactional cache, there is no way the cache can be out of sync with the DB, so it should never need to go to the db for stuff it knows
*/
public class CacheTest extends BaseTestCase {

  protected Long orderId;

  protected Long bagId;

  protected void printCache() {
    Statistics st=sf.getStatistics();
    String[] regs=st.getSecondLevelCacheRegionNames();
    for (int i=0;i<regs.length;i++) {
      String r=regs[i];
      Map cacheEntries=st.getSecondLevelCacheStatistics(r).getEntries();
      System.out.println(r+" ** "+cacheEntries);
    }
  }

  protected void step1() {
    Session session=sf.openSession();
    Transaction tx=session.beginTransaction();

    // create an order
    Order o=new Order();
    session.save(o);

    // create a bag
    Bag b=new Bag();
    b.setOrder(o);
    session.save(b);

    // add the bag (not the order itself) to the order's collection
    o.getBags().add(b);

    // create an item
    Item i=new Item();
    i.setBag(b);
    session.save(i);

    b.getItems().add(i);

    // save it all
    session.update(o);

    orderId=o.getId();
    bagId=b.getId();

    // close the tx
    tx.commit();
    session.close();
  }

  protected void step2() {
    Session session=sf.openSession();
    Transaction tx=session.beginTransaction();

    // load the order
    Order o=(Order)session.get(Order.class,orderId);

    // retrieve the bag
    Bag b=(Bag)o.getBags().get(0);

    // create an item
    Item i=new Item();
    i.setBag(b);
    session.save(i);

    b.getItems().add(i);

    // save the item
    session.update(o);

    tx.commit();
    session.close();
  }

  public void testCache() {
    System.out.println("====== step 1 ======");
    step1();
    printCache();
    System.out.println("====== step 2 ======");
    step2();
    printCache();
    System.out.println("====== step 3 ======");
    step2();
    printCache();
  }

}



Tested against Hibernate versions
2.1.8
3.0.5
3.1.2


Name and version of the database you are using:
PostgreSQL 8.0


 Post subject:
Posted: Fri Mar 17, 2006 4:58 pm
Expert

Joined: Mon Jan 09, 2006 5:01 pm
Posts: 311
Location: Sacramento, CA
usage="transactional" is for clustering... and doesn't support read-write and nonstrict-read-write, which is what I think you want...

Try usage="read-write"

Also make sure to enable your cache provider in your hibernate config file. Try ehCache by:
hibernate.cache.provider_class org.hibernate.cache.EhCacheProvider
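
For reference, the settings mentioned here can be collected as plain key/value pairs. This is a minimal sketch, assuming Hibernate 3.x property names. `hibernate.generate_statistics` is included as an assumption, since the test in the first post reads the cache through `Statistics`. The class name `CacheSettingsSketch` is made up for illustration. The resulting `Properties` would normally be handed to Hibernate's `Configuration`; that step is left out so the snippet runs without Hibernate on the classpath.

```java
import java.util.Properties;

class CacheSettingsSketch {
    // Builds the second-level cache settings discussed in this thread.
    static Properties cacheSettings() {
        Properties p = new Properties();
        // Cache provider suggested in the post above (Hibernate 3.x class name).
        p.setProperty("hibernate.cache.provider_class",
                      "org.hibernate.cache.EhCacheProvider");
        // Make sure the second-level cache is switched on.
        p.setProperty("hibernate.cache.use_second_level_cache", "true");
        // Assumption: statistics enabled so cache regions can be inspected.
        p.setProperty("hibernate.generate_statistics", "true");
        return p;
    }

    public static void main(String[] args) {
        cacheSettings().list(System.out);
    }
}
```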

_________________
-JT

If you find my replies helpful, please rate by clicking 'Y' on them. I appreciate it.


 Post subject:
Posted: Sun Mar 19, 2006 8:40 am
Newbie

Joined: Fri Mar 17, 2006 12:54 pm
Posts: 10
jt_1000 wrote:
usage="transactional" is for clustering... and doesn't support read-write and nonstrict-read-write, which is what I think you want...

Try usage="read-write"

Also make sure to enable your cache provider in your hibernate config file. Try ehCache by:
hibernate.cache.provider_class org.hibernate.cache.EhCacheProvider


This example is actually done using EhCache 1.2beta4. I believe that as of 1.2 it will support distributed transactional caches.

In any event, I've run this test using "read-write" and EhCache locally (not distributed) and saw the same thing. I'm pretty sure this is an issue with how Hibernate is using the particular cache manager and not an issue with how the cache manager is caching collections.


 Post subject:
Posted: Sun Mar 19, 2006 8:50 am
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
This is expected (and correct) behavior.

Multiple concurrent txns can be adding to a collection simultaneously, so the only correct time to cache a collection is after fetching it from the db.


 Post subject:
Posted: Sun Mar 19, 2006 5:54 pm
Newbie

Joined: Fri Mar 17, 2006 12:54 pm
Posts: 10
gavin wrote:
This is expected (and correct) behavior.

Multiple concurrent txns can be adding to a collection simultaneously, so the only correct time to cache a collection is after fetching it from the db.


Thank you for your input. I'm not sure I totally understand why this would be expected behavior.

In the example code I provided, when I commit the transaction, Hibernate assigns an index to the child object based on where it is in the parent collection. If that SQL update is transactionally safe, it seems like you should be able to store that collection information in a transactional cache. Any subsequent modification of the collection will also be transactionally safe. When the collection cache is invalidated (falls out of the cache) only then should it need to be reloaded. When you first create the parent object, the collection can and should be transactionally cached (even if it contains no objects).

I look forward to any insight you can give me on this problem.


 Post subject:
Posted: Sun Mar 19, 2006 7:05 pm
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Nope, you've misunderstood what transactional guarantees are generally available. No-one *ever* runs their databases in serializable mode, so phantom reads are not prevented.


 Post subject:
Posted: Sun Mar 19, 2006 7:28 pm
Newbie

Joined: Fri Mar 17, 2006 12:54 pm
Posts: 10
gavin wrote:
Nope, you've misunderstood what transactional guarantees are generally available. No-one *ever* runs their databases in serializable mode, so phantom reads are not prevented.


Well then I am still confused. In my example code, the first step creates the objects and caches them just fine. In the second step, the Order object is already cached, so it does not need to be loaded, but Hibernate does need to look up all Bag objects for that Order (and then all Item objects associated with that Bag). However, in the third step, it does *not* go to the database for all Bag objects associated with the Order object (since the relationship is cached) but it still needs to look up all Item objects for that cached Bag object.

Why is it doing this? More to the topic of this post, why can't you simply cache the association when the object is created? It really seems like no other transaction in a different thread could possibly be modifying the association of an object that is still being created. If it's okay to cache the Bag association after it's been queried, why is it not okay to cache it when it is created?


 Post subject:
Posted: Sun Mar 19, 2006 7:41 pm
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
unlord wrote:
Why is it doing this? More to the topic of this post, why can't you simply cache the association when the object is created?


I already explained this. Sorry, I don't have time to explain phantom reads to you just now.

How about you just trust what Hibernate is doing, because it always has very good reasons for its very sophisticated caching behavior, and the people who designed this stuff have spent a lot, lot more time thinking about caching and transactions than you have.


 Post subject:
Posted: Sun Mar 19, 2006 7:52 pm
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Well, actually, in your special case, where you are creating the collection owner for the first time in the same transaction that creates the collection elements, it is probably safe to cache the collection immediately (I would have to re-think through a bunch of considerations), but this is of course not the general case of a "create collection" operation. You also have to consider cases where the owner is already existing.


 Post subject:
Posted: Sun Mar 19, 2006 8:05 pm
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
unlord wrote:
However, in the third step, it does *not* go to the database for all Bag objects associated with the Order object (since the relationship is cached) but it still needs to look up all Item objects for that cached Bag object.


Because you modified the items collection in step 2 (you added a new item). And another transaction could be doing the same thing concurrently (phantom reads are allowed). So we absolutely cannot recache the updated collection, we must simply evict the cache and let it be recached the next time we run a query. This is exactly what I was talking about. The behavior is expected, and it is the only possible correct behavior.
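
The argument in this reply can be illustrated with a toy model. This is only a sketch in plain Java, not Hibernate's actual implementation: the `db` and `cache` maps, the `Tx` class and the `run` method are all hypothetical names, and two overlapping transactions are simulated by interleaving their steps in one thread. Each transaction loads the items of bag 1, adds one item and commits. If each commit writes that transaction's own view of the collection back to the cache, the last committer overwrites the other's item. If each commit simply evicts the entry, the next read reloads the full list from the database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CollectionCacheRaceSketch {
    // Stand-in for the ce_item table: bag id -> committed item ids.
    static final Map<Long, List<Long>> db = new HashMap<>();
    // Stand-in for the Bag.items collection cache region.
    static final Map<Long, List<Long>> cache = new HashMap<>();

    // Read-through: use the cached collection if present, else load and cache it.
    static List<Long> read(long bagId) {
        List<Long> cached = cache.get(bagId);
        if (cached == null) {
            cached = new ArrayList<>(db.get(bagId));
            cache.put(bagId, cached);
        }
        return new ArrayList<>(cached);
    }

    // One transaction's private view of a bag's items.
    static class Tx {
        final long bagId;
        final List<Long> view;
        final List<Long> inserted = new ArrayList<>();

        Tx(long bagId) {
            this.bagId = bagId;
            this.view = read(bagId); // snapshot taken when the collection is loaded
        }

        void addItem(long itemId) {
            view.add(itemId);
            inserted.add(itemId);
        }

        void commit(boolean recache) {
            db.get(bagId).addAll(inserted); // the INSERTs become visible
            if (recache) {
                cache.put(bagId, new ArrayList<>(view)); // write own view to the cache
            } else {
                cache.remove(bagId); // evict; the next reader reloads from the db
            }
        }
    }

    // Runs two overlapping transactions and returns what a later reader sees.
    static List<Long> run(boolean recache) {
        db.clear();
        cache.clear();
        List<Long> initial = new ArrayList<>();
        initial.add(1L);
        db.put(1L, initial);

        Tx a = new Tx(1L); // both load the collection before either commits
        Tx b = new Tx(1L);
        a.addItem(2L);
        b.addItem(3L);
        a.commit(recache);
        b.commit(recache);
        return read(1L);
    }

    public static void main(String[] args) {
        System.out.println("recache on commit: " + run(true));  // item 2 is missing
        System.out.println("evict on commit:   " + run(false)); // all three items
    }
}
```

With recaching, the later reader sees `[1, 3]` while the database holds `[1, 2, 3]`. With eviction, the reader reloads and sees all three items.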


 Post subject:
Posted: Mon Mar 20, 2006 7:28 am
Newbie

Joined: Fri Mar 17, 2006 12:54 pm
Posts: 10
gavin wrote:
Well, actually, in your special case, where you are creating the collection owner for the first time in the same transaction that creates the collection elements, it is probably safe to cache the collection immediately (I would have to re-think through a bunch of considerations), but this is of course not the general case of a "create collection" operation. You also have to consider cases where the owner is already existing.


Yah, I've been thinking about it and I believe I see what is going on. Whenever an element is added to a collection, you simply invalidate the collection cache and require it to be reloaded from the db. This is because (as you rightly said) you don't want to serialize all db accesses against the cache by making sure that only one thread is modifying it at a time. I see how you could have a situation where two transactions are modifying an existing collection and neither of them knows about the other, so there is a race condition if you allow either to update the collection w/o a lock (like a cache should).

Thanks for considering the case where you create the owner and the collection elements in the same transaction. Is it possible to detect this and cache based on it? Would that be too much work for a special case?


 Post subject:
Posted: Mon Mar 20, 2006 9:24 am
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Probably too much effort to be worthwhile, yes.


 Post subject:
Posted: Fri Apr 07, 2006 6:29 am
Newbie

Joined: Sat Oct 22, 2005 9:26 pm
Posts: 11
Quote:
Thanks for considering the case where you create the owner and the collection elements in the same transaction. Is it possible to detect this and cache based on it? Would that be too much work for a special case?


We are having similar problems, although maybe our situation is slightly different. We have collections that are mapped, but essentially read only. For us it would be better if the collection were cached immediately, and invalidated completely on a single write.

I'm not sure how to solve this yet, except (shudder) doing a read immediately after each write, or abandoning the Hibernate collections altogether and using some serious custom types. There may be some simple mapping file tweaks Hibernate could add that would help here... maybe a read-only collection type? An event we could listen to and request caching early?

Help? The customer is always right? (except maybe in open source ;) )
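
The "read immediately after write" workaround can be modelled with two maps. This is a hedged sketch in plain Java, not Hibernate code: `db`, `cache`, `readItems`, `addItem` and `dbReadsInNextPhase` are hypothetical names. The write still evicts the cached collection, then optionally re-reads it at once so the next phase finds it cached. Note that this only moves the database read to write time, and it is presumably only reasonable when a single process is the sole writer, since a concurrent writer could make the warmed entry stale.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WarmAfterWriteSketch {
    // Stand-in for the database: bag id -> item ids.
    static final Map<Long, List<Long>> db = new HashMap<>();
    // Stand-in for the collection cache.
    static final Map<Long, List<Long>> cache = new HashMap<>();
    // Counts how often a collection had to be loaded from the "database".
    static int dbReads = 0;

    // Read-through access to a bag's items.
    static List<Long> readItems(long bagId) {
        List<Long> cached = cache.get(bagId);
        if (cached != null) {
            return cached;
        }
        dbReads++;
        List<Long> loaded = new ArrayList<>(db.get(bagId));
        cache.put(bagId, loaded);
        return loaded;
    }

    // Writes an item, evicts the cached collection, and optionally warms it again.
    static void addItem(long bagId, long itemId, boolean warm) {
        db.get(bagId).add(itemId);
        cache.remove(bagId);
        if (warm) {
            readItems(bagId); // the "read immediately after write"
        }
    }

    // Returns the number of database reads the following phase needs.
    static int dbReadsInNextPhase(boolean warm) {
        db.clear();
        cache.clear();
        db.put(1L, new ArrayList<Long>());
        addItem(1L, 10L, warm);
        dbReads = 0; // count only the reads of the next phase
        readItems(1L);
        return dbReads;
    }

    public static void main(String[] args) {
        System.out.println("without warming: " + dbReadsInNextPhase(false));
        System.out.println("with warming:    " + dbReadsInNextPhase(true));
    }
}
```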


 Post subject:
Posted: Wed Feb 07, 2007 5:37 am
Newbie

Joined: Thu Jan 05, 2006 9:36 am
Posts: 2
I also have similar problems and would benefit from immediately cached collections.

I am writing a billing application that will definitely run on one machine, so it will be the one and only process that adds items to collections. The billing runs in many phases. One phase adds some items to a collection in one transaction, which invalidates that collection's cache entry; the next phase, in another transaction, then accesses the collection and rereads it from the database. I was hoping to significantly increase speed by caching collections, but as things stand this does not work for me. The idea was to set the (eh-)cache to eternal=true and, in the first phase, read everything into the cache 'with left outer join fetch', so that later phases would not access the database at all (except for some queries). And it works, except for the collection problem described above.

I must say that I spent a lot of time discovering that this collection rereading behaviour is not a bug. And judging by the number of posts on this forum, other people also have trouble understanding this.

