Hello fellow Hibernate Users!
I have a performance problem in my application when using DML style delete operations. Let me introduce the task first: At some given time, i have to delete all elements from a Parent/Child composition which are older than some timestamp. We talk about three millionen entries (so quite a large set of data). To get some decent performance, i'd like to use DML style deletes with HQL.
Fig. 1 shows a small example model (unfortunately i'm not allowed to show the real one). The real model is quite similar but each class contains up to 25 additional attributes (no additional associations) and there are other kinds of parents inside a class hierarchy. The mapping document (attached below) is generated by Teneo.
Fig. 1: a small example model
As i understood so far, DML style deletes won't allow the cascade deletion of children. Because of this and because of the generated foreign key constraints i have to delete the children from the manyChildren composition first. The performance of this first delete Query is quite good (about 70 milliseconds for 10000 children with 5000 parents, each parent has two children).
After this first delete i tried to delete the parents. The performance problems start at this second Query: I took about 2000 milliseconds for the 5000 parents. For more parents the perfomance scales even worse.
Let me show the appropriate Java code and generated SQL next:
Code:
Session session = hbSessionFactory.openSession();
Transaction tx = session.beginTransaction();
String quiteFastQuery = "delete ManyChild as toDelete where toDelete in " +
"( select children from Parent as p left outer join p.manyChildren children " +
" where p.timestamp < :someTimestamp " +
")";
Query q = session.createQuery(quiteFastQuery);
q.setTimestamp("someTimestamp", new Date());
q.executeUpdate();
Listing 1: The 'good'/high-performant first Query for deletion of the ManyChild childrenCode:
delete from "manychild" where "many_child_id" in (select "many_child_id" from "parent" parent1_ left outer join "manychild" manychildr2_ on parent1_.e_id=manychildr2_."parent_manychildren_e_id" where parent1_."timestamp"<?)
Listing 2: The generated SQL for the high-performant first QueryCode:
String verySlowQuery = "delete from Parent p " +
"where p.timestamp < :someTimestamp";
Query q2 = session.createQuery(verySlowQuery);
q2.setTimestamp("someTimestamp", new Date());
q2.executeUpdate();
tx.commit();
session.close();
Listing 3: The 'bad'/low-performant first Query for deletion of the parentsCode:
delete from "parent" where "timestamp"<?
Listing 4: The generated SQL for the low-performant second QueryDo you have any idea where the problem might be? For me this second Query looks quite trivial. Am i missing something important completely (I'm quite new to Hibernate)? Is there maybe a better way to perform such a massive delete?
What i tried beside the above was to use a StatelessSession. However the performance was quite similar.
---------------------------Hibernate version: 3.3.1.GA Mapping documents:Code:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-mapping PUBLIC "-//Hibernate/Hibernate Mapping DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<hibernate-mapping auto-import="false">
<class name="hql_delete_test.impl.ParentImpl" entity-name="Parent" abstract="false" lazy="false" discriminator-value="Parent" table="`parent`">
<meta attribute="eclassName">Parent</meta>
<meta attribute="epackage">http://hql_delete_test</meta>
<id type="long" name="e_id" column="e_id" access="org.eclipse.emf.teneo.hibernate.mapping.identifier.IdentifierPropertyHandler">
<meta attribute="syntheticId">true</meta>
<generator class="native"/>
</id>
<discriminator column="`dtype`" type="string"/>
<version name="e_version" column="e_version" access="org.eclipse.emf.teneo.hibernate.mapping.property.VersionPropertyHandler">
<meta attribute="syntheticVersion">true</meta>
</version>
<list name="manyChildren" lazy="true" cascade="all,delete-orphan">
<key update="true" foreign-key="parent_manychildren">
<column name="`parent_manychildren_e_id`" not-null="false" unique="false"/>
</key>
<list-index column="`parent_manychildren_idx`"/>
<one-to-many entity-name="ManyChild"/>
</list>
<many-to-one name="first_Child" entity-name="SingleChild" lazy="false" cascade="all" foreign-key="parent_first_child" insert="true" update="true" not-null="false">
<column not-null="false" unique="false" name="`singlechild_first_child_id`"/>
</many-to-one>
<many-to-one name="second_Child" entity-name="SingleChild" lazy="false" cascade="all" foreign-key="parent_second_child" insert="true" update="true" not-null="false">
<column not-null="false" unique="false" name="`singlechild_second_child_id`"/>
</many-to-one>
<property name="timestamp" lazy="false" insert="true" update="true" not-null="false" unique="false" type="timestamp">
<column not-null="false" unique="false" name="`timestamp`"/>
</property>
</class>
<class name="hql_delete_test.impl.ManyChildImpl" entity-name="ManyChild" abstract="false" lazy="false" discriminator-value="ManyChild" table="`manychild`">
<meta attribute="eclassName">ManyChild</meta>
<meta attribute="epackage">http://hql_delete_test</meta>
<id name="ID" type="long" unsaved-value="0">
<column not-null="false" unique="false" name="`many_child_id`"/>
<generator class="native"/>
</id>
<discriminator column="`dtype`" type="string"/>
<version name="e_version" column="e_version" access="org.eclipse.emf.teneo.hibernate.mapping.property.VersionPropertyHandler">
<meta attribute="syntheticVersion">true</meta>
</version>
</class>
<class name="hql_delete_test.impl.SingleChildImpl" entity-name="SingleChild" abstract="false" lazy="false" discriminator-value="SingleChild" table="`singlechild`">
<meta attribute="eclassName">SingleChild</meta>
<meta attribute="epackage">http://hql_delete_test</meta>
<id name="ID" type="long" unsaved-value="0">
<column not-null="false" unique="false" name="`single_child_id`"/>
<generator class="native"/>
</id>
<discriminator column="`dtype`" type="string"/>
<version name="e_version" column="e_version" access="org.eclipse.emf.teneo.hibernate.mapping.property.VersionPropertyHandler">
<meta attribute="syntheticVersion">true</meta>
</version>
</class>
</hibernate-mapping>
Code between sessionFactory.openSession() and session.close():Code:
Transaction tx = session.beginTransaction();
String quiteFastQuery = "delete ManyChild as toDelete where toDelete in " +
"( select children from Parent as p left outer join p.manyChildren children " +
" where p.timestamp < :someTimestamp " +
")";
Query q = session.createQuery(quiteFastQuery);
q.setTimestamp("someTimestamp", new Date());
q.executeUpdate();
String verySlowQuery = "delete from Parent p " +
"where p.timestamp < :someTimestamp";
Query q2 = session.createQuery(verySlowQuery);
q2.setTimestamp("someTimestamp", new Date());
q2.executeUpdate();
//... code for removing the orphan single children of the parent
tx.commit();
Full stack trace of any exception that occurs:Name and version of the database you are using:PostgreSQL 8.2
The generated SQL (show_sql=true):Code:
359 [main] DEBUG org.hibernate.SQL - select nextval ('hibernate_sequence')
1390 [main] DEBUG org.hibernate.SQL - select nextval ('hibernate_sequence')
1406 [main] DEBUG org.hibernate.SQL - select nextval ('hibernate_sequence')
1406 [main] DEBUG org.hibernate.SQL - select nextval ('hibernate_sequence')
1406 [main] DEBUG org.hibernate.SQL - select nextval ('hibernate_sequence')
1422 [main] DEBUG org.hibernate.SQL - insert into "singlechild" (e_version, econtainer_class, e_container, e_container_featureid, "dtype", "single_child_id") values (?, ?, ?, ?, 'SingleChild', ?)
1437 [main] DEBUG org.hibernate.SQL - insert into "singlechild" (e_version, econtainer_class, e_container, e_container_featureid, "dtype", "single_child_id") values (?, ?, ?, ?, 'SingleChild', ?)
1437 [main] DEBUG org.hibernate.SQL - insert into "parent" (e_version, "singlechild_first_child_id", "singlechild_second_child_id", "timestamp", "dtype", e_id) values (?, ?, ?, ?, 'Parent', ?)
1437 [main] DEBUG org.hibernate.SQL - insert into "manychild" (e_version, econtainer_class, e_container, e_container_featureid, "dtype", "many_child_id") values (?, ?, ?, ?, 'ManyChild', ?)
1437 [main] DEBUG org.hibernate.SQL - insert into "manychild" (e_version, econtainer_class, e_container, e_container_featureid, "dtype", "many_child_id") values (?, ?, ?, ?, 'ManyChild', ?)
1437 [main] DEBUG org.hibernate.SQL - update "singlechild" set e_version=?, econtainer_class=?, e_container=?, e_container_featureid=? where "single_child_id"=? and e_version=?
1453 [main] DEBUG org.hibernate.SQL - update "singlechild" set e_version=?, econtainer_class=?, e_container=?, e_container_featureid=? where "single_child_id"=? and e_version=?
1453 [main] DEBUG org.hibernate.SQL - update "manychild" set "parent_manychildren_e_id"=?, "parent_manychildren_idx"=? where "many_child_id"=?
1453 [main] DEBUG org.hibernate.SQL - update "manychild" set "parent_manychildren_e_id"=?, "parent_manychildren_idx"=? where "many_child_id"=?
1593 [main] DEBUG org.hibernate.SQL - delete from "manychild" where "many_child_id" in (select "many_child_id" from "parent" parent1_ left outer join "manychild" manychildr2_ on parent1_.e_id=manychildr2_."parent_manychildren_e_id" where parent1_."timestamp"<?)
1609 [main] DEBUG org.hibernate.SQL - delete from "parent" where "timestamp"<?
Debug level Hibernate log excerpt:Code:
[main] INFO org.eclipse.emf.teneo.hibernate.HbHelper - Creating emf data store and registering it under name: HQL_Delete_Test_Datastore
15 [main] INFO org.eclipse.emf.teneo.hibernate.HbHelper - Returning created emf data store, initialize this newly created data store!
SLF4J: This version of SLF4J requires log4j version 1.2.12 or later. See also http://www.slf4j.org/codes.html#log4j_version
750 [main] WARN org.eclipse.emf.teneo.hibernate.HbDataStore - No hibernate cache provider set, using org.hibernate.cache.HashtableCacheProvider
750 [main] WARN org.eclipse.emf.teneo.hibernate.HbDataStore - For production use please set the ehcache (or other) provider explicitly and configure it
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.fetch_containment_eagerly: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - hibernate.connection.password: ruegen
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - hibernate.cache.provider_class: org.hibernate.cache.HashtableCacheProvider
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.version_column: e_version
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.disable_econtainer: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - hibernate.connection.username: ruegen
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.default_cache_strategy: NONE
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.sql_name_escape_character: `
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.set_default_cascade_on_non_containment:
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.map_all_lists_as_idbag: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.max_sql_name_length: -1
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.default_temporal: TIMESTAMP
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.emap_as_true_map: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.default_varchar_length: -1
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.default_id_column: e_id
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.also_map_as_class: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - hibernate.connection.autocommit: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.add_index_for_fk: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.join_table_naming_strategy: unique
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.XSDDateClass: javax.xml.datatype.XMLGregorianCalendar
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.always_version: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.idbag_id_column_name: ID
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.max_comment_length: 0
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.set_entity_automatically: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.hibernate_mapping_file: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.set_foreign_key_name: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.cascade_policy_on_containment: ALL
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.id_feature_as_primary_key: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.set_generated_value_on_id_feature: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - hibernate.connection.driver_class: org.postgresql.Driver
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.runtime.update_schema: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.strategy: lowercase
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.always_map_list_as_bag: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - hibernate.connection.url: jdbc:postgresql://localhost/ruegen
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - hibernate.dialect: org.hibernate.dialect.PostgreSQLDialect
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.UserDateTimeType: org.eclipse.emf.teneo.hibernate.mapping.XSDDateTime
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.join_column_naming_strategy: unique
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.set_proxy: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.UserDateType: org.eclipse.emf.teneo.hibernate.mapping.XSDDate
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.cascade_policy_on_non_containment:
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.ignore_eannotations: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - hibernate.hbm2ddl.auto: create
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.naming.default_id_feature: e_id
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.map_embeddable_as_embedded: false
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.optimistic_locking: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.join_table_for_non_contained_associations: true
750 [main] INFO org.eclipse.emf.teneo.hibernate.HbDataStore - teneo.mapping.cascade_all_on_containment:
765 [main] WARN org.eclipse.emf.teneo.hibernate.HbDataStore - The teneo update schema option is not used anymore for hibernate, use the hibernate option: hibernate.hbm2ddl.auto