These old forums are deprecated now and set to read-only. We are waiting for you on our new forums!
More modern, Discourse-based and with GitHub/Google/Twitter authentication built-in.

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 
Author Message
 Post subject: Understanding DML style delete performance
PostPosted: Mon Mar 02, 2009 11:21 am 
Newbie

Joined: Mon Mar 02, 2009 10:30 am
Posts: 2
Hello fellow Hibernate Users!

I have a performance problem in my application when using DML style delete operations. Let me introduce the task first: At some given time, i have to delete all elements from a Parent/Child composition which are older than some timestamp. We talk about three millionen entries (so quite a large set of data). To get some decent performance, i'd like to use DML style deletes with HQL.

Fig. 1 shows a small example model (unfortunately i'm not allowed to show the real one). The real model is quite similar but each class contains up to 25 additional attributes (no additional associations) and there are other kinds of parents inside a class hierarchy. The mapping document (attached below) is generated by Teneo.

Image
Fig. 1: a small example model


As i understood so far, DML style deletes won't allow the cascade deletion of children. Because of this and because of the generated foreign key constraints i have to delete the children from the manyChildren composition first. The performance of this first delete Query is quite good (about 70 milliseconds for 10000 children with 5000 parents, each parent has two children).

After this first delete i tried to delete the parents. The performance problems start at this second Query: I took about 2000 milliseconds for the 5000 parents. For more parents the perfomance scales even worse.


Let me show the appropriate Java code and generated SQL next:

Code:
Session session = hbSessionFactory.openSession();
        Transaction tx = session.beginTransaction();

String quiteFastQuery = "delete ManyChild as toDelete where toDelete in " +
                                "( select children from Parent as p left outer join p.manyChildren children " +
                                " where p.timestamp < :someTimestamp " +
                                ")";

        Query q = session.createQuery(quiteFastQuery);
        q.setTimestamp("someTimestamp", new Date());
        q.executeUpdate();

Listing 1: The 'good'/high-performant first Query for deletion of the ManyChild children


Code:
delete from "manychild" where "many_child_id" in (select "many_child_id" from "parent" parent1_ left outer join "manychild" manychildr2_ on parent1_.e_id=manychildr2_."parent_manychildren_e_id" where parent1_."timestamp"<?)

Listing 2: The generated SQL for the high-performant first Query


Code:
String verySlowQuery = "delete from Parent p " +
                               "where p.timestamp < :someTimestamp";
        Query q2 = session.createQuery(verySlowQuery);
        q2.setTimestamp("someTimestamp", new Date());
        q2.executeUpdate();
       
        tx.commit();
        session.close();

Listing 3: The 'bad'/low-performant first Query for deletion of the parents


Code:
delete from "parent" where "timestamp"<?

Listing 4: The generated SQL for the low-performant second Query


Do you have any idea where the problem might be? For me this second Query looks quite trivial. Am i missing something important completely (I'm quite new to Hibernate)? Is there maybe a better way to perform such a massive delete?

What i tried beside the above was to use a StatelessSession. However the performance was quite similar.


---------------------------
Hibernate version: 3.3.1.GA

Mapping documents:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping auto-import="false">
   <class name="hql_delete_test.impl.ParentImpl" entity-name="Parent" abstract="false" lazy="false" discriminator-value="Parent" table="`parent`">
      <meta attribute="eclassName">Parent</meta>
      <meta attribute="epackage">http://hql_delete_test</meta>
      <id type="long" name="e_id" column="e_id" access="org.eclipse.emf.teneo.hibernate.mapping.identifier.IdentifierPropertyHandler">
         <meta attribute="syntheticId">true</meta>
         <generator class="native"/>
      </id>
      <discriminator column="`dtype`" type="string"/>
      <version name="e_version" column="e_version" access="org.eclipse.emf.teneo.hibernate.mapping.property.VersionPropertyHandler">
         <meta attribute="syntheticVersion">true</meta>
      </version>
      <list name="manyChildren" lazy="true" cascade="all,delete-orphan">
         <key update="true" foreign-key="parent_manychildren">
            <column name="`parent_manychildren_e_id`" not-null="false" unique="false"/>
         </key>
         <list-index column="`parent_manychildren_idx`"/>
         <one-to-many entity-name="ManyChild"/>
      </list>
      <many-to-one name="first_Child" entity-name="SingleChild" lazy="false" cascade="all" foreign-key="parent_first_child" insert="true" update="true" not-null="false">
         <column not-null="false" unique="false" name="`singlechild_first_child_id`"/>
      </many-to-one>
      <many-to-one name="second_Child" entity-name="SingleChild" lazy="false" cascade="all" foreign-key="parent_second_child" insert="true" update="true" not-null="false">
         <column not-null="false" unique="false" name="`singlechild_second_child_id`"/>
      </many-to-one>
      <property name="timestamp" lazy="false" insert="true" update="true" not-null="false" unique="false" type="timestamp">
         <column not-null="false" unique="false" name="`timestamp`"/>
      </property>
   </class>
   <class name="hql_delete_test.impl.ManyChildImpl" entity-name="ManyChild" abstract="false" lazy="false" discriminator-value="ManyChild" table="`manychild`">
      <meta attribute="eclassName">ManyChild</meta>
      <meta attribute="epackage">http://hql_delete_test</meta>
      <id name="ID" type="long" unsaved-value="0">
         <column not-null="false" unique="false" name="`many_child_id`"/>
         <generator class="native"/>
      </id>
      <discriminator column="`dtype`" type="string"/>
      <version name="e_version" column="e_version" access="org.eclipse.emf.teneo.hibernate.mapping.property.VersionPropertyHandler">
         <meta attribute="syntheticVersion">true</meta>
      </version>
   </class>
   <class name="hql_delete_test.impl.SingleChildImpl" entity-name="SingleChild" abstract="false" lazy="false" discriminator-value="SingleChild" table="`singlechild`">
      <meta attribute="eclassName">SingleChild</meta>
      <meta attribute="epackage">http://hql_delete_test</meta>
      <id name="ID" type="long" unsaved-value="0">
         <column not-null="false" unique="false" name="`single_child_id`"/>
         <generator class="native"/>
      </id>
      <discriminator column="`dtype`" type="string"/>
      <version name="e_version" column="e_version" access="org.eclipse.emf.teneo.hibernate.mapping.property.VersionPropertyHandler">
         <meta attribute="syntheticVersion">true</meta>
      </version>
   </class>
</hibernate-mapping>


Code between sessionFactory.openSession() and session.close():
Code:
Transaction tx = session.beginTransaction();

        String quiteFastQuery = "delete ManyChild as toDelete where toDelete in " +
                                "( select children from Parent as p left outer join p.manyChildren children " +
                                " where p.timestamp < :someTimestamp " +
                                ")";

        Query q = session.createQuery(quiteFastQuery);
        q.setTimestamp("someTimestamp", new Date());
        q.executeUpdate();

        String verySlowQuery = "delete from Parent p " +
                               "where p.timestamp < :someTimestamp";
        Query q2 = session.createQuery(verySlowQuery);
        q2.setTimestamp("someTimestamp", new Date());
        q2.executeUpdate();

        //... code for removing the orphan single children of the parent

        tx.commit();


Full stack trace of any exception that occurs:

Name and version of the database you are using:
PostgreSQL 8.2

The generated SQL (show_sql=true):
Code:
359 [main] DEBUG org.hibernate.SQL  - select nextval ('hibernate_sequence')
1390 [main] DEBUG org.hibernate.SQL  - select nextval ('hibernate_sequence')
1406 [main] DEBUG org.hibernate.SQL  - select nextval ('hibernate_sequence')
1406 [main] DEBUG org.hibernate.SQL  - select nextval ('hibernate_sequence')
1406 [main] DEBUG org.hibernate.SQL  - select nextval ('hibernate_sequence')
1422 [main] DEBUG org.hibernate.SQL  - insert into "singlechild" (e_version, econtainer_class, e_container, e_container_featureid, "dtype", "single_child_id") values (?, ?, ?, ?, 'SingleChild', ?)
1437 [main] DEBUG org.hibernate.SQL  - insert into "singlechild" (e_version, econtainer_class, e_container, e_container_featureid, "dtype", "single_child_id") values (?, ?, ?, ?, 'SingleChild', ?)
1437 [main] DEBUG org.hibernate.SQL  - insert into "parent" (e_version, "singlechild_first_child_id", "singlechild_second_child_id", "timestamp", "dtype", e_id) values (?, ?, ?, ?, 'Parent', ?)
1437 [main] DEBUG org.hibernate.SQL  - insert into "manychild" (e_version, econtainer_class, e_container, e_container_featureid, "dtype", "many_child_id") values (?, ?, ?, ?, 'ManyChild', ?)
1437 [main] DEBUG org.hibernate.SQL  - insert into "manychild" (e_version, econtainer_class, e_container, e_container_featureid, "dtype", "many_child_id") values (?, ?, ?, ?, 'ManyChild', ?)
1437 [main] DEBUG org.hibernate.SQL  - update "singlechild" set e_version=?, econtainer_class=?, e_container=?, e_container_featureid=? where "single_child_id"=? and e_version=?
1453 [main] DEBUG org.hibernate.SQL  - update "singlechild" set e_version=?, econtainer_class=?, e_container=?, e_container_featureid=? where "single_child_id"=? and e_version=?
1453 [main] DEBUG org.hibernate.SQL  - update "manychild" set "parent_manychildren_e_id"=?, "parent_manychildren_idx"=? where "many_child_id"=?
1453 [main] DEBUG org.hibernate.SQL  - update "manychild" set "parent_manychildren_e_id"=?, "parent_manychildren_idx"=? where "many_child_id"=?
1593 [main] DEBUG org.hibernate.SQL  - delete from "manychild" where "many_child_id" in (select "many_child_id" from "parent" parent1_ left outer join "manychild" manychildr2_ on parent1_.e_id=manychildr2_."parent_manychildren_e_id" where parent1_."timestamp"<?)
1609 [main] DEBUG org.hibernate.SQL  - delete from "parent" where "timestamp"<?


Debug level Hibernate log excerpt:
Code:
    [main] INFO  org.eclipse.emf.teneo.hibernate.HbHelper  - Creating emf data store and registering it under name: HQL_Delete_Test_Datastore
15   [main] INFO  org.eclipse.emf.teneo.hibernate.HbHelper  - Returning created emf data store, initialize this newly created data store!
SLF4J: This version of SLF4J requires log4j version 1.2.12 or later. See also http://www.slf4j.org/codes.html#log4j_version
750  [main] WARN  org.eclipse.emf.teneo.hibernate.HbDataStore  - No hibernate cache provider set, using org.hibernate.cache.HashtableCacheProvider
750  [main] WARN  org.eclipse.emf.teneo.hibernate.HbDataStore  - For production use please set the ehcache (or other) provider explicitly and configure it
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.fetch_containment_eagerly: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - hibernate.connection.password: ruegen
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - hibernate.cache.provider_class: org.hibernate.cache.HashtableCacheProvider
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.version_column: e_version
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.disable_econtainer: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - hibernate.connection.username: ruegen
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.default_cache_strategy: NONE
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.sql_name_escape_character: `
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.set_default_cascade_on_non_containment:
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.map_all_lists_as_idbag: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.max_sql_name_length: -1
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.default_temporal: TIMESTAMP
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.emap_as_true_map: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.default_varchar_length: -1
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.default_id_column: e_id
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.also_map_as_class: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - hibernate.connection.autocommit: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.add_index_for_fk: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.join_table_naming_strategy: unique
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.XSDDateClass: javax.xml.datatype.XMLGregorianCalendar
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.always_version: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.idbag_id_column_name: ID
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.max_comment_length: 0
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.set_entity_automatically: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.hibernate_mapping_file: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.set_foreign_key_name: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.cascade_policy_on_containment: ALL
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.id_feature_as_primary_key: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.set_generated_value_on_id_feature: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - hibernate.connection.driver_class: org.postgresql.Driver
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.runtime.update_schema: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.strategy: lowercase
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.always_map_list_as_bag: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - hibernate.connection.url: jdbc:postgresql://localhost/ruegen
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - hibernate.dialect: org.hibernate.dialect.PostgreSQLDialect
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.UserDateTimeType: org.eclipse.emf.teneo.hibernate.mapping.XSDDateTime
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.join_column_naming_strategy: unique
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.set_proxy: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.UserDateType: org.eclipse.emf.teneo.hibernate.mapping.XSDDate
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.cascade_policy_on_non_containment:
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.ignore_eannotations: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - hibernate.hbm2ddl.auto: create
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.naming.default_id_feature: e_id
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.map_embeddable_as_embedded: false
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.optimistic_locking: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.join_table_for_non_contained_associations: true
750  [main] INFO  org.eclipse.emf.teneo.hibernate.HbDataStore  - teneo.mapping.cascade_all_on_containment:
765  [main] WARN  org.eclipse.emf.teneo.hibernate.HbDataStore  - The teneo update schema option is not used anymore for hibernate, use the hibernate option: hibernate.hbm2ddl.auto


Top
 Profile  
 
 Post subject:
PostPosted: Mon Mar 02, 2009 12:54 pm 
Newbie

Joined: Mon Mar 02, 2009 10:30 am
Posts: 2
Hm after some additional tests i'm not sure if this is really a Hibernate issue. When i execute the plain SQL statements directly against the database the performance is bad as well.


Top
 Profile  
 
Display posts from previous:  Sort by  
Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 

All times are UTC - 5 hours [ DST ]


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.