Hibernate 3.3.1.GA
JBOSS 4.3.0.GA
Oracle 10g, Driver ojdbc14.jar version 10.2.0.4.0
Hi everyone,
I would like to use hibernate to create a batch processing application.
It must transfer data from one Oracle instance to another.
I have tried to different ways: mapping tables independently and mapping one main entity with collections of other entities.
In both cases if I have awful performance results, so can somebody tell me what I’m doing wrong?
Here is my context :
- JBOSS 4.3.0.GA
- Hibernate 3.3.1.GA
- I have one table OBJECT and different properties tables PTY1, PTY2, PTY3, PTY4, PTY5.
- Fact is, the finale solution will have to map around 30 related tables.
- Fact is some table contains a few millions records, some more than 10 millions.
- There is an average of 60 000 records per PTY* table for one record of OBJECT.
- I intend to work OBJECT record by OBJECT record.
- Most of the tables have no defined primary keys:
- some have one fields containing unique values so I make it my @Id field
- for the others I created a composite primary key
- Each of my PTY* tables have two fields in common with my OBJECT table. Basically, OBJECT has 2 fields names code and index, and PTY* have 2 fields names objectCode, object Index.
I call performance issue when to persist one object and its related data from the PTY*, it takes about one hour.
Historic:
So, after reading chapter 13 from Hibernate, I started by turning off my second level of cache.
I first created an entity E mapping OBJECT and for each PTY* table I create a field in E such as :
Code:
@OneToMany(fetch = FetchType.EAGER)
@Cascade(org.hibernate.annotations.CascadeType.ALL)
@LazyCollection(LazyCollectionOption.EXTRA)
@JoinColumns({
@JoinColumn(name = "objectCode", referencedColumnName = "code"),
@JoinColumn(name = "objectIndex", referencedColumnName = "index")
})
private List<PTYEntity> PTYEntities = new LinkedList();
(By the way, desperate, I unsuccessfully tried both List/LinkedList and Set/hashSet looking for better results.
Looking at the generated SQL, I saw that Hibernate generated a serie of LEFT OUTER JOINs. That ended up returning way more records than it should.
What I mean, is I was looking for SQL code:
Code:
SELECT *
FROM OBJECT o,
PTY1 p1,
PTY2 p2,
PTY3 p3,
PTY4 p4,
PTY5 p5
WHERE o.code = p1.objectCode
AND o.index = p1.objectIndex
AND o.code = p2.objectCode
AND o.index = p2.objectIndex
AND o.code = p3.objectCode
AND o.index = p3.objectIndex
AND o.code = p4.objectCode
AND o.index = p4.objectIndex
AND o.code = p5.objectCode
AND o.index = p5.objectIndex
So I created a named query in each PTY* entity such as
Code:
SELECT p FROM PTYentity e WHERE e.objectCode=(:code) AND e.objectIndex = (:index)
And I created a service for PTY* entity
Then I first switched to a LAZY the fetch type of the collection of the entity E.
I created a service :
- to call for one entity E by its primary key
- to set its PTY* collections by calling the respective named queries
- to persist in a destination data source the data
- to remove the data from the origin data source (this I commented quickly because I first to fix my time performance issue on persisting the data)
Because it was slow, I decided to delete collection fileds from my entity Y, and programmed :
- the retrieval of one record R from OBJECT
- the retrieval of all related data from PTY1 for code and index of in R
- the retrieval of all related data from PTY2 for code and index of in R
- …etc
But it is still slow.
and I did level the fetch number to 50 first.
But now that I loop through collection of PTY* entities, I flush every 20.
Here is my development architecture :
- a few EJB3 entities
- a main entity having lazy collections of those former entities
- a stateless EJB service to manage my entity E
- a stateless EJB accessed remotely for execute data mangement using the service described in the line above
I did this development step by step :
- these elements + a client access class
- unit test to try executing the client and remotely access the service
- test with smal amount of data : ok
- test with big amount : ok
- test with more and more lazy entity collections in my main entity
- test with more and more entity collections separated from the main entity, called independently
What can I do to make it faster?
What is the way to this?
Am I wrong to think I could do it with Hibernate?
thanks for help and/or advice