
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 12 posts ] 
 Post subject: Problems with one-many performance. Unnecessary SQL?
PostPosted: Mon Jun 21, 2004 10:05 pm 
Beginner

Joined: Mon Jun 21, 2004 7:59 pm
Posts: 21
OK. I've read the Hibernate2 reference document (a few times), the examples, and the FAQs. They've helped, but now I'm stuck. I've found that for small one-to-many collections, performance is pretty good. However, Hibernate seems to make some unnecessary calls that really hurt performance on large collections.

In my example:
Person - A collection of Aliases (just a wrapper class)
Alias - Hosts demographic info on a Person
Address - Address info for the Alias.

Person 1 <-> * Alias 1 <-> * Address

Problem: To test and demo Hibernate2 I'm trying to do a few things, but one of them is a stress test on insert and delete performance.

Test1: Insert 1 Person with 1 million generated Aliases, and no Address associations. Fails, out of memory (kind of expected).

Test2: Insert 1 Person with 100,000 generated Aliases, and no Address associations. Fails, out of memory.

Test3: Modified test 2 so it doesn't fail, using session.evict(). However, it doesn't work correctly when cascade="all" or cascade="all-delete-orphan" is selected. I like the all-delete-orphan option since on smaller collections it has exactly the semantics I want, but in this test odd update statements are issued which hang things up.

Test4: Remove all the Persons created in tests 1, 2, or 3. The problem I'm seeing here is that Persons seem to be getting initialized before they're removed. This also kills performance.

I think I have the correct lazy and inverse settings, but if someone could follow my code/mapping files, take a look at my output, and post some suggestions, I'd really appreciate it.

Really wish the forum had an attachment option.

Hibernate version: 2.1.4
Java version: 1.4.2_03
Database: PostgreSQL 7.4.1

Mapping Docs:
Person.hbm.xml
Code:
<hibernate-mapping>
    <class
        name="db.Person"
        table="person"
        dynamic-update="true"
        dynamic-insert="true"
    >
        <cache usage="nonstrict-read-write" />

        <id
            name="ID"
            column="id"
            type="java.lang.Long"
        >
            <generator class="increment">
                <param name="sequence">person_sequence</param>
            </generator>
        </id>

        <set
            name="aliases"
            table="person_alias"
            lazy="true"
            inverse="true"
            cascade="all-delete-orphan"
            sort="unsorted"
        >
            <cache
                usage="nonstrict-read-write"
             />

              <key
                  column="personid"
              >
              </key>

              <one-to-many
                  class="db.Alias"
              />
        </set>

        <!--
            To add non XDoclet property mappings, create a file named
                hibernate-properties-Person.xml
            containing the additional properties and place it in your merge dir.
        -->

    </class>

        <query name="person.getPersonByAliasLName"><![CDATA[
            select person from db.Person as person, db.Alias as alias where alias.lastName = :lname and alias in elements(person.aliases)
        ]]></query>
        <query name="person.getOrphanedPersons"><![CDATA[
            from db.Person as person where person.aliases.size = 0
        ]]></query>
        <query name="person.getAllPersons"><![CDATA[
            select person from db.Person as person
        ]]></query>

</hibernate-mapping>

Alias.hbm.xml
Code:
<hibernate-mapping>
    <class
        name="db.Alias"
        table="person_alias"
        dynamic-update="true"
        dynamic-insert="true"
    >
        <cache usage="nonstrict-read-write" />

        <id
            name="ID"
            column="id"
            type="java.lang.Long"
        >
            <generator class="increment">
            </generator>
        </id>

        <property
            name="firstName"
            type="java.lang.String"
            update="true"
            insert="true"
            access="property"
            column="firstName"
            length="60"
        />

        <property
            name="lastName"
            type="java.lang.String"
            update="true"
            insert="true"
            access="property"
            column="lastName"
            length="60"
            not-null="true"
        />

        <property
            name="SSN"
            type="java.lang.String"
            update="true"
            insert="true"
            access="property"
            column="SSN"
            length="9"
        />

        <set
            name="addresses"
            lazy="true"
            inverse="true"
            cascade="all-delete-orphan"
            sort="unsorted"
        >
            <cache
                usage="nonstrict-read-write"
             />

              <key
                  column="aliasid"
              >
              </key>

              <one-to-many
                  class="db.Address"
              />
        </set>

        <many-to-one
            name="person"
            class="db.Person"
            cascade="none"
            outer-join="auto"
            update="true"
            insert="true"
            access="property"
            column="personid"
            not-null="true"
        />

        <!--
            To add non XDoclet property mappings, create a file named
                hibernate-properties-Alias.xml
            containing the additional properties and place it in your merge dir.
        -->

    </class>

</hibernate-mapping>

Address.hbm.xml
Code:
<hibernate-mapping>
    <class
        name="db.Address"
        table="address"
        dynamic-update="true"
        dynamic-insert="true"
    >
        <cache usage="nonstrict-read-write" />

        <id
            name="ID"
            column="id"
            type="java.lang.Long"
        >
            <generator class="increment">
            </generator>
        </id>

        <many-to-one
            name="alias"
            class="db.Alias"
            cascade="none"
            outer-join="auto"
            update="true"
            insert="true"
            access="property"
            column="aliasid"
            not-null="true"
        />

        <property
            name="street"
            type="java.lang.String"
            update="true"
            insert="true"
            access="property"
            column="street"
        />

        <property
            name="city"
            type="java.lang.String"
            update="true"
            insert="true"
            access="property"
            column="city"
        />

        <property
            name="state"
            type="java.lang.String"
            update="true"
            insert="true"
            access="property"
            column="state"
        />

        <property
            name="zip"
            type="java.lang.String"
            update="true"
            insert="true"
            access="property"
            column="zip"
            length="5"
        />

        <property
            name="zipPlusFour"
            type="java.lang.String"
            update="true"
            insert="true"
            access="property"
            column="zipPlusFour"
            length="4"
        />

        <!--
            To add non XDoclet property mappings, create a file named
                hibernate-properties-Address.xml
            containing the additional properties and place it in your merge dir.
        -->

    </class>

</hibernate-mapping>


Problem create code: I've tried a number of things to get this working. All of them either run out of memory when the number of aliases is > 80K, or have other issues when Session.evict() is used in conjunction with any cascade option that also does a save-update. The code below works and inserts 30k Aliases; however, the delete code has a lot of problems. I'd like the insert code to work for > 100k Aliases and the delete code to work efficiently.
Insert Code
Code:
Session s = sessionfactory.openSession();
        Transaction tx = s.beginTransaction();
        Person p = new Person();
        //s.save(p);
        //s.flush();
        List aliasList = new LinkedList();
        for(int i=0; i<30000; i++)
        {
            Alias a = new Alias();
            a.setLastName("Hazen"+i);
            a.setPerson(p);
            a.setFirstName("Jim");
            a.setSSN("555555555");
            //s.save(a);
            //aliasList.add(a);
            p.getAliases().add(a);
            //s.save(p);
            if(i%10000 == 0)
            {
                //s.save(p);
                //s.flush();
                //s.clear();
//                for(Iterator iterator = aliasList.iterator(); iterator.hasNext();)
//                {
//                    s.evict((Alias)iterator.next());
//                    iterator.remove();
//                }
                //
                //                //s.close();
                //                //s = sessionfactory.openSession();
                System.out.println("i = "+i);
                System.out.println("memory = "+Runtime.getRuntime().freeMemory());
            }
        }
       
        s.save(p);
        //s.flush();
        tx.commit();
        s.close();


Delete Code
Code:
Session s = sessionfactory.openSession();
        Transaction tx = s.beginTransaction();
       
        Query q = s.getNamedQuery("person.getAllPersons"); //select person from db.Person as person
        s.delete(q.getQueryString());
//        for(Iterator i = q.iterate(); i.hasNext();)
//        {
//            s.delete((Person)i.next());
//        }
       
        tx.commit();
        s.close();


Output
Code:
[java] i = 0
     [java] memory = 7372008
     [java] i = 10000
     [java] memory = 4977048
     [java] i = 20000
     [java] memory = 2170424
18:39:04,996  INFO DriverManagerConnectionProvider:143 - cleaning up connection pool: jdbc:postgresql:hibernate
     [java] Hibernate: insert into person (id) values (?)
     [java] Hibernate: insert into person_alias (firstName, lastName, SSN, personid, id) values (?, ?, ?, ?, ?)
     [java] Hibernate: select person0_.id as id from person person0_
     [java] Hibernate: select aliases0_.id as id__, aliases0_.personid as personid__, aliases0_.id as id0_, aliases0_.firstName as firstName0_, aliases0_.lastName as lastName0_, aliases0_.SSN as SSN0_, aliases0_.personid as personid0_ from person_alias aliases0_ where aliases0_.personid=?
[java] Hibernate: select addresses0_.id as id__, addresses0_.aliasid as aliasid__, addresses0_.id as id0_, addresses0_.aliasid as aliasid0_, addresses0_.street as street0_, addresses0_.city as city0_, addresses0_.state as state0_, addresses0_.zip as zip0_, addresses0_.zipPlusFour as zipPlusF7_0_ from address addresses0_ where addresses0_.aliasid=?
... (same message many times)
     [java] Hibernate: select addresses0_.id as id__, addresses0_.aliasid as aliasid__, addresses0_.id as id0_, addresses0_.aliasid as aliasid0_, addresses0_.street as street0_, addresses0_.city as city0_, addresses0_.state as state0_, addresses0_.zip as zip0_, addresses0_.zipPlusFour as zipPlusF7_0_ from address addresses0_ where addresses0_.aliasid=?
     [java] Hibernate: delete from person_alias where id=?

After which the code hangs, or at least the delete doesn't complete in a reasonable amount of time (I usually let it run for 1-5 minutes; the same call by hand takes only a second).

So my questions are: why is Hibernate doing all those selects? You can do the same thing very simply by hand, and it doesn't take nearly as long. I figure it must be my mapping, but if I change the mapping file, things don't work right; I get the same FAQ errors that people get when they don't set up their mapping correctly. So how do I get deletes to work in conjunction with cascade="all-delete-orphan" without all the initialization?

By hand the statements run much quicker:
Code:
delete from address where aliasid in (select id from person_alias where personid=?);
delete from person_alias where personid=?;
delete from person where id=?;


Also, if I try to use the session.evict() insert code with a cascade that does save-update, I get the following problem: the initial insert goes through, then the next, and then an update. Eventually this grinds things to a halt. I don't see the need for the update, and can't find a way to get both cascade="all-delete-orphan" semantics and avoid running out of memory on large collections. Sure, you can say that ORM tools aren't really designed for large collections, but being able to demonstrate that there is at least a way will be critical for my company.

Output with evict code
Code:
     [java] Hibernate: insert into person (id) values (?)
     [java] Hibernate: insert into person_alias (firstName, lastName, SSN, personid, id) values (?, ?, ?, ?, ?)
     [java] i = 0
     [java] memory = 7473288
18:59:07,667  INFO DriverManagerConnectionProvider:143 - cleaning up connection pool: jdbc:postgresql:hibernate
     [java] Hibernate: insert into person_alias (firstName, lastName, SSN, personid, id) values (?, ?, ?, ?, ?)
     [java] Hibernate: update person_alias set firstName=?, lastName=?, SSN=?, personid=? where id=?
     [java] i = 10000
     [java] memory = 3749384
     [java] Hibernate: insert into person_alias (firstName, lastName, SSN, personid, id) values (?, ?, ?, ?, ?)
     [java] Hibernate: update person_alias set firstName=?, lastName=?, SSN=?, personid=? where id=?

After which things grind to a halt. I'm sure it's because of the collection state, but I'm not sure what to do here. I'd like to batch these inserts (insert 10k, commit, then evict) but haven't found a way to do that while keeping cascade="all-delete-orphan" (which is what the program will need most of the time).
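
One way to get the "insert 10k, commit, then evict" shape without fighting the cascade is a session per chunk, at the cost of saving each child explicitly. A sketch (assumes the explicit-save path is not relying on cascade save-update; names and chunk counts are illustrative):
Code:
Person p = new Person();
Session s = sessionfactory.openSession();
Transaction tx = s.beginTransaction();
s.save(p);
tx.commit();
s.close();

for (int chunk = 0; chunk < 10; chunk++)
{
    s = sessionfactory.openSession();
    tx = s.beginTransaction();
    for (int i = 0; i < 10000; i++)
    {
        Alias a = new Alias();
        a.setPerson(p);
        a.setLastName("Hazen" + (chunk * 10000 + i));
        s.save(a);            // explicit save; no cascade involved
    }
    tx.commit();
    s.close();                // all session state discarded per chunk
}

Closing the session between chunks means there is never a collection snapshot for Hibernate to compare against, so no spurious updates are issued.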

Any help would be appreciated. I have a complete src package with ant build/run/db schema export script that I could send if people need to play around on their systems. Let me know.

Thanks in advance,
Jim


 
 Post subject:
PostPosted: Tue Jun 22, 2004 1:10 am 
Beginner

Joined: Mon Jun 21, 2004 7:59 pm
Posts: 21
Ok, I think I'm starting to understand what's screwing me up.

With cascade={all, all-delete-orphan, save-update}, Hibernate saves child elements for you automatically. With all-delete-orphan it essentially makes the DB look like the collection, including deletes. So with these options, if I:

Code:
Parent.getChildren().add(Child); //1 insert
Session.flush();
Parent.getChildren().add(Child); //1 insert, 1 update
Session.flush();

I'd ask Hibernate to only update the record if needed, but how is it supposed to know whether it's needed? Actually, now that I think about it, it should know. The first call to Parent.getChildren() should have initialized the collection (if eager) or at least retrieved all the PKs (if lazy). In either case Hibernate knows the current state of the children collection at that point. Add a child, flush, add another child: it should know what the state of the collection is and make the right choice.

Yes, yes, it should and it does. Of course due to my memory concerns I'm evicting all of this state information. Hibernate doesn't know the current state of my collection and plays it safe and assumes it's dirty. This is why I'm seeing what I'm seeing. Hibernate is doing the right thing. Be careful when you use evict with a cascade option that will save-update.


 
 Post subject:
PostPosted: Tue Jun 22, 2004 1:15 am 
Beginner

Joined: Mon Jun 21, 2004 7:59 pm
Posts: 21
What I'd really like is a programmatic way to change the cascade settings for a given relation, for a given session.

This would allow me to have my cascade="all-delete-orphan" for 90% of my work and give me the option to set this load to cascade="none" for my batch inserts (with eviction).

I would still have to call Session.save(Child) and Session.save(Parent) myself, but I can live with that. A few extra lines in a corner case is no big deal; having to change my cascade options for everything to support this corner case is a pain.

-Jim


 
 Post subject:
PostPosted: Tue Jun 22, 2004 2:51 am 
Beginner

Joined: Mon Jun 21, 2004 7:59 pm
Posts: 21
So my update problem makes sense. Does anyone know what's wrong with my delete method?

s.delete(Person);

I would have thought it would scan the association graph for Person and issue a few delete X where Y = ? statements, and we'd be set. Why is it doing so many selects before the delete? The delete also seems to take forever: removing a Person with 20k aliases takes Hibernate 2 minutes, while the hand-written delete from person_alias where personid = ? takes less than a second.

-Jim


 
 Post subject:
PostPosted: Tue Jun 22, 2004 2:53 am 
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Hibernate 2.1 does not implement efficient deletes. We will address this in a future version.

(Of course, real applications should never physically delete data via the online application, but that's another discussion entirely.....)


 
 Post subject:
PostPosted: Tue Jun 22, 2004 3:41 am 
Beginner

Joined: Mon Jun 21, 2004 7:59 pm
Posts: 21
Thanks. As long as I'm "doing it right". Is there any way of formulating things so that the current version gets things deleted a little quicker?

I guess for these large delete cases I can manage the graph myself:
Code:
Person.getAssociation().clear();
s.delete(Person);

And you're right, large scale deletes aren't the norm, but there are some reporting tables that we roll over one month, and then remove. So we do remove medium amounts of relatively unimportant data.

Any thoughts on how high a priority this is? Something for 2.1.6, or more like 2.2? I know that when I try to sell Hibernate here at my company it will be a trial to convince people that Hibernate generates efficient SQL. My folks will frown when they see this, since the first thing they'll do is create a bunch of objects and then remove a bunch of objects, even if deletes are only 10% of the workload.

-Jim


 
 Post subject:
PostPosted: Tue Jun 22, 2004 3:48 am 
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Don't expect anything until Hibernate3.

Really, I would say that purging data is not the role of an ORM solution, esp. not purging *large amounts* of data. What is object oriented about it?


 
 Post subject:
PostPosted: Tue Jun 22, 2004 4:13 am 
Beginner

Joined: Mon Jun 21, 2004 7:59 pm
Posts: 21
I'd like Hibernate and HQL to become our main DB access point. Keep things clean and OO for the most part. But sometimes you just have to issue:
Code:
s.delete("from LogEvent where currentperiod='N'");

Not much OO about that, but it takes forever if LogEvent has some one-to-many associations that also need to go. I guess I can grab the JDBC connection from the Session and issue my few association deletes, then my top-level delete, by hand. I would have to do that anyway without Hibernate, so it's still a net gain. It's just that with all Hibernate does well, I'm surprised this is something I'm finding a workaround for, especially since Hibernate knows what the associations are; it just appears to be initializing them before removing them. There doesn't seem to be a point in that.
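
A sketch of that workaround, dropping to the Session's JDBC connection (session.connection() is the Hibernate 2 escape hatch) for the deletes. The SQL follows the person/alias mappings above and is illustrative:
Code:
Session s = sessionfactory.openSession();
Transaction tx = s.beginTransaction();
Connection con = s.connection();   // underlying JDBC connection
PreparedStatement ps = con.prepareStatement(
    "delete from address where aliasid in " +
    "(select id from person_alias where personid=?)");
ps.setLong(1, personId.longValue());
ps.executeUpdate();
ps.close();
ps = con.prepareStatement("delete from person_alias where personid=?");
ps.setLong(1, personId.longValue());
ps.executeUpdate();
ps.close();
ps = con.prepareStatement("delete from person where id=?");
ps.setLong(1, personId.longValue());
ps.executeUpdate();
ps.close();
tx.commit();
s.close();                         // note: bypasses the session cache

Since this bypasses Hibernate entirely, any of these entities already loaded in a session (or second-level cache) would be stale afterwards, so it's best done in a fresh session.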

But hey, if it's so easy, why don't I do it, and send you the patch right? :) Looking forward to H3. Still shooting for Q1 05?

-Jim


 
 Post subject:
PostPosted: Tue Jun 22, 2004 8:17 am 
Hibernate Team

Joined: Mon Aug 25, 2003 9:11 pm
Posts: 4592
Location: Switzerland
The main access point to your data should be your database, not your OO-application. It doesn't get any "cleaner" than that.

_________________
JAVA PERSISTENCE WITH HIBERNATE
http://jpwh.org
Get the book, training, and consulting for your Hibernate team.


 
 Post subject:
PostPosted: Tue Jun 22, 2004 8:27 am 
CGLIB Developer

Joined: Thu Aug 28, 2003 1:44 pm
Posts: 1217
Location: Vilnius, Lithuania
jimhazen2000 wrote:
And you're right, large scale deletes aren't the norm, but there are some reporting tables that we roll over one month, and then remove. So we do remove medium amounts of relatively unimportant data.


Put a script on crontab. It's not OOP, but it shouldn't be a very big problem.


 
 Post subject:
PostPosted: Tue Jun 22, 2004 2:00 pm 
Beginner

Joined: Mon Jun 21, 2004 7:59 pm
Posts: 21
christian wrote:
The main access point to your data should be your database, not your OO-application. It doesn't get any "cleaner" than that.


I think that statement runs counter to what Hibernate is all about. Sure, the DB houses all the data, so it's the de facto access point to all the data. However, once you start developing data-centric applications, it becomes advantageous to access this data via the same idiomatic programming means as the rest of your business logic.

Enterprise connection pooling is needed by everyone, but is time-consuming to write and easy to get wrong. Integration with transaction managers: same thing. There are a ton of persistence frameworks and home-grown ORM solutions out there. These solutions are often buggy and incomplete, and their maintenance takes away from developing the core application. I would rather let DB and ORM specialists take care of this well-defined issue and leave my business to develop its business applications.

Hibernate is an excellent ORM persistence service. It has a number of features that wouldn't be available if I had all my DB code in DAOs:
Caching objects
Executing SQL statements later
Never updating unmodified objects
You get the idea.

Sure, at times there is overhead to OO development, especially when the problem isn't really OO in nature; artificially forcing something into an object that isn't one causes problems. I didn't expect my batch insert to be efficient. This was really a data-loading task rather than an objectified data-usage task. Plus, if I were inserting many objects instead of a single object with a huge number of children, performance would probably be much better. Again, no issue with the insert.

However, for the removal of data, I don't see a reason not to use Hibernate. Either I need to remove data that I've already been using in an objectified manner, or I'd like to take advantage of Hibernate's declarative association management to ensure that my delete is complete and I avoid FK violations. For the same reason that idiomatic access to collections of child objects is good, I should be looking to Hibernate to manage the cleanup of those collections. Sure, I could write the cleanup code by hand, but it's likely that I'll make a mistake: I'll forget to clean up an association, or get the ordering wrong, and have to go back and fix it. Hibernate can do all of this programmatically and should never make a mistake.

So far my only issue with Hibernate is that it is horribly inefficient when it comes to removing things. For persistent objects with child relations it seems to eagerly load the entire graph and then remove each relation one by one. This gets the job done and never causes FK problems, but it takes ages. To me this is simply a performance bug in Hibernate, not a fundamental abuse of the tool. There's no reason for Hibernate to eagerly load objects before removal, and removing the association all at once via a single delete statement, rather than recursing the delete on each element, seems doable.

So don't get me wrong here, guys. I'm a huge supporter of what Hibernate is trying to do; I think you're on exactly the right path. Delete performance is a known issue, and Gavin says it will be addressed in a future release. I'm cool with that. I'm not here to dwell on one thing that Hibernate doesn't do well; there are plenty of things it excels at, and there are workarounds for my issue. I just take exception to the notion that ORMs shouldn't do deletes, or that people shouldn't access data through an application.

Keep up the good work and I look forward to Hibernate3.

-Jim


 
 Post subject:
PostPosted: Thu Jun 24, 2004 1:45 pm 
Newbie

Joined: Thu Mar 04, 2004 11:37 am
Posts: 9
jimhazen2000:
Quote:
What I'd really like is a programmatic way to change the cascade settings for a given relation, for a given session.


I'd really like that too!

Has anyone found a solution to this problem? Does Hibernate permit changing this cascade setting programmatically?


 
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.