-->
These old forums are deprecated now and set to read-only. We are waiting for you on our new forums!
More modern, Discourse-based and with GitHub/Google/Twitter authentication built-in.

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 3 posts ] 
Author Message
 Post subject: EJB3 BULK INSERT and UPDATES Perf. compare to RAD soft
PostPosted: Tue Jan 06, 2009 9:03 am 
Newbie

Joined: Fri Mar 28, 2008 5:48 am
Posts: 5
Hi all,

I'm facing a performance problem which I think must be for a large part due to my lack of skill and/or knowledge :
Well I'm rewriting a legacy app made with a RAD tool (Delphi like)
In this app there is two tables I have to deal with carefully.

Table TabA is quite simple

Code:
TabA
--------
@GeneratedValue(strategy = GenerationType.AUTO)
id: Long

StrFld : String

@OneToMany(mappedBy = "source", cascade = {CascadeType.REMOVE})
    private Set<TabB > relatedSourceSet= new HashSet<TabB >();

@OneToMany(mappedBy = "target", cascade = {CascadeType.REMOVE})
    private Set<TabB > relatedTargetSet= new HashSet<TabB >();



Table TabB is a table maintaining relations between instance of TabA

Code:
TabB
--------
@GeneratedValue(strategy = GenerationType.AUTO)
id: Long

@ManyToOne()
@JoinColumn
Source : TabA

@ManyToOne()
@JoinColumn
Target :  TabA

@ManyToOne
RelationType: TabRelation


I got a TabRelation where we maintain a dozen or so kind of relation.

Well that's quite simple , I hope I was clear enough on this first part. The entities are really simple

Then I got a EJB (in fact EJB or seam Component ; I tried both
it's an EJB3 / SEAM application)
wich is in charge of IMPORTING instances of this from a file ( the file can be XML or labelled)
I already took care of taking the file , parsing it and splitting it...

so my EJB make all this step then start doing the persist / update job


Quote:

Code:
    TabA tabA_instanceSub;
TabA tabA_instance;
....


                        querySt = "SELECT t  FROM TabA descT WHERE UPPER(t.StrFld) = :item";
                        query = em.createQuery(querySt);
                        query.setParameter("item", listVal.get(i2).trim().toUpperCase());
                        lstRes = query.getResultList();
                        if (lstRes.isEmpty()) {
                            //This one is brand new !!
                            tabA_instance = new TabA();
                            tabA_instance.setStrFld(listVal.get(i2).trim());
                            em.persist(tabA_instance);
                        } else {
                            //We got it already
                            tabA_instance = (TabA) lstRes.get(0);
                        }


.....
...


                        TabB tabB_instance;
                        querySt = "SELECT relT  FROM TabB relT WHERE relT.relationType = :relationType";
                        querySt = querySt + " AND UPPER(relT.Source.StrFld) = :StrFld1 AND UPPER(relT.target.StrFld) = :StrFld2";
                        query = em.createQuery(querySt);
                        query.setParameter("relationType", DESCR_REL_TYPE);
                        query.setParameter("StrFld1", tabA_instance.getStrFld().toUpperCase());
                        query.setParameter("StrFld2", tabA_instanceSub.getStrFld().toUpperCase());
                        lstRes = query.getResultList();

                        if (lstRes.isEmpty()) {
                            tabB_instance = new TabB(tabA_instance, tabA_instanceSub, REL_TYP_TS);
                            em.persist(tabB_instance);
                        }



Well I tried to simplify everything I can for sake of clarity but all is here
: i'm in a transaction , I look to see if I already got the current item ? if not I create (and persist it) then I look for the relation and same simple process do I got it ?

Not so horrible hey ?!

Well now my problem is this : with a file with 9887 instances of TabA; and 10632 relations instances of tabB it took less than 5min in the legacy app and 3 hours with my JEE stuff !!
No need to say this is not possible for the users the are doing this quite every day ...


No way for the import file to be an sql.. it came from different sources that's why it pass through a xml so i can handle it.
Anyway if I have to i will do something on it but 2 questions here
    -how should I handle the Id which are JPA- annoted to be generated !!!??
    -And how to handle the fact tha I have to chase duplicate BEFORE creating a new instance of tabA ??


the transformation of the file to produce the arraylist I feed on is a matter of seconds. so it's clearly not the point here.
Which way should I tune the transaction to get back to some correct performance ?

Maybe I should try WITHOUT any transaction just to see ??

I d'idn't try yet because I need to be in a transaction it' s a requirement



Thanks for your help

Hibernate version: 3.3.1

Mapping documents: JPA

Code between sessionFactory.openSession() and session.close():

Full stack trace of any exception that occurs:

Name and version of the database you are using: MySql 5 and also Derby (both gives the same perf)

The generated SQL (show_sql=true):

Debug level Hibernate log excerpt:


Top
 Profile  
 
 Post subject:
PostPosted: Thu Jan 15, 2009 5:56 pm 
Newbie

Joined: Fri Mar 28, 2008 5:48 am
Posts: 5
No one got ome hints ? things to look at ? things to banish... ?

Is it possible to use addbatch and executebatch methods of statement with JPA ??

Still in need for some help. Thanks


Top
 Profile  
 
 Post subject:
PostPosted: Thu Jan 15, 2009 7:14 pm 
Expert
Expert

Joined: Wed Mar 03, 2004 6:35 am
Posts: 1240
Location: Lund, Sweden
Here is an (unsorted) list of things you might consider:

Are you doing all 9887+10632 instances in one go or are you flush()-ing at regular intervals? Note that default settings of Hibernate will automatically flush the session before a query is made. This can make things real slow unless you also clear() the session. It may be a good idea to set the flush mode to FlushMode.COMMIT, but if there is a chance of duplicates in the file you'll have to take care of that in your code.

How long time does it take to process the first 100 items as compared to the last 100? If there is a large difference it can be a sign of poor flush behavior.

Are you building new Query objects for each new instance or are you reusing the same two queries and just changing the query parameter?

I don't think doing it without a transaction will help much since in practice it may just mean that you are doing 9887+10632 transactions instead.

Have you made any time measurements to see what parts of your code that is taking a lot of time?

Is there an index on the StrFld column? Hmm... it may not be of any use anyway since you are using UPPPER().... but... MySQL is usually case-insensitive so you may not need UPPER at all...


Top
 Profile  
 
Display posts from previous:  Sort by  
Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 3 posts ] 

All times are UTC - 5 hours [ DST ]


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.