-->
These old forums are deprecated now and set to read-only. We are waiting for you on our new forums!
More modern, Discourse-based and with GitHub/Google/Twitter authentication built-in.

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 
Author Message
 Post subject: Awful performance on large table for batch processing
PostPosted: Fri Feb 13, 2009 9:42 am 
Newbie

Joined: Fri Feb 06, 2009 2:16 am
Posts: 8
Hibernate 3.3.1.GA
JBOSS 4.3.0.GA
Oracle 10g, Driver ojdbc14.jar version 10.2.0.4.0

Hi everyone,

I would like to use hibernate to create a batch processing application.
It must transfer data from one Oracle instance to another.

I have tried to different ways: mapping tables independently and mapping one main entity with collections of other entities.
In both cases if I have awful performance results, so can somebody tell me what I’m doing wrong?
Here is my context :
    - JBOSS 4.3.0.GA
    - Hibernate 3.3.1.GA
    - I have one table OBJECT and different properties tables PTY1, PTY2, PTY3, PTY4, PTY5.
    - Fact is, the finale solution will have to map around 30 related tables.
    - Fact is some table contains a few millions records, some more than 10 millions.
    - There is an average of 60 000 records per PTY* table for one record of OBJECT.
    - I intend to work OBJECT record by OBJECT record.
    - Most of the tables have no defined primary keys:
      - some have one fields containing unique values so I make it my @Id field
      - for the others I created a composite primary key
    - Each of my PTY* tables have two fields in common with my OBJECT table. Basically, OBJECT has 2 fields names code and index, and PTY* have 2 fields names objectCode, object Index.

I call performance issue when to persist one object and its related data from the PTY*, it takes about one hour.

Historic:

So, after reading chapter 13 from Hibernate, I started by turning off my second level of cache.

I first created an entity E mapping OBJECT and for each PTY* table I create a field in E such as :
Code:
    @OneToMany(fetch = FetchType.EAGER)
    @Cascade(org.hibernate.annotations.CascadeType.ALL)
    @LazyCollection(LazyCollectionOption.EXTRA)
    @JoinColumns({
        @JoinColumn(name = "objectCode", referencedColumnName = "code"),
        @JoinColumn(name = "objectIndex", referencedColumnName = "index")
    })
    private List<PTYEntity> PTYEntities = new LinkedList();


(By the way, desperate, I unsuccessfully tried both List/LinkedList and Set/hashSet looking for better results.

Looking at the generated SQL, I saw that Hibernate generated a serie of LEFT OUTER JOINs. That ended up returning way more records than it should.
What I mean, is I was looking for SQL code:
Code:
SELECT *
FROM OBJECT o,
PTY1 p1,
PTY2 p2,
PTY3 p3,
PTY4 p4,
PTY5 p5
WHERE o.code = p1.objectCode
AND  o.index = p1.objectIndex
AND   o.code = p2.objectCode
AND  o.index = p2.objectIndex
AND   o.code = p3.objectCode
AND  o.index = p3.objectIndex
AND   o.code = p4.objectCode
AND  o.index = p4.objectIndex
AND   o.code = p5.objectCode
AND  o.index = p5.objectIndex


So I created a named query in each PTY* entity such as

Code:
SELECT p FROM PTYentity e WHERE e.objectCode=(:code) AND e.objectIndex = (:index)


And I created a service for PTY* entity

Then I first switched to a LAZY the fetch type of the collection of the entity E.

I created a service :
    - to call for one entity E by its primary key
    - to set its PTY* collections by calling the respective named queries
    - to persist in a destination data source the data
    - to remove the data from the origin data source (this I commented quickly because I first to fix my time performance issue on persisting the data)
Because it was slow, I decided to delete collection fileds from my entity Y, and programmed :
    - the retrieval of one record R from OBJECT
    - the retrieval of all related data from PTY1 for code and index of in R
    - the retrieval of all related data from PTY2 for code and index of in R
    - …etc
But it is still slow.

and I did level the fetch number to 50 first.
But now that I loop through collection of PTY* entities, I flush every 20.

Here is my development architecture :
    - a few EJB3 entities
    - a main entity having lazy collections of those former entities
    - a stateless EJB service to manage my entity E
    - a stateless EJB accessed remotely for execute data mangement using the service described in the line above

I did this development step by step :
    - these elements + a client access class
    - unit test to try executing the client and remotely access the service
    - test with smal amount of data : ok
    - test with big amount : ok
    - test with more and more lazy entity collections in my main entity
    - test with more and more entity collections separated from the main entity, called independently


What can I do to make it faster?
What is the way to this?
Am I wrong to think I could do it with Hibernate?

thanks for help and/or advice


Top
 Profile  
 
 Post subject:
PostPosted: Fri Feb 13, 2009 6:05 pm 
Senior
Senior

Joined: Thu Jan 08, 2009 3:48 pm
Posts: 168
Several thoughts:
- I think Hibernate does the outer joins because the association to PTY tables are not declared as not-null=true; this would change the joins to inner joins but is only applicable if every object _must_ have the property
- can more than one object have the same index? Does more than one object link to the same PTY? Or is O : PTY = 1 : 1 ?
- How is your search performance in the database itself? With such large tables you need proper indices and the like
- If i remember correctly <bag> has the best insert performance in hibernate because no check for double entries is necessary

Rating is welcome


Top
 Profile  
 
Display posts from previous:  Sort by  
Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 

All times are UTC - 5 hours [ DST ]


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.