-->
These old forums are deprecated now and set to read-only. We are waiting for you on our new forums!
More modern, Discourse-based and with GitHub/Google/Twitter authentication built-in.

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 8 posts ] 
Author Message
 Post subject: bulk reading is getting slower the more I read
PostPosted: Thu May 24, 2007 7:44 am 
Newbie

Joined: Wed May 23, 2007 10:35 am
Posts: 6
I have the problem that a bulk read from a database is getting slower and slower the more objects I read. I have a simple object model:
Book, Author and Chapters. A book has OneToMany Authors and OneToMany Chapters.

My test case is to create 10000 books each having 10 authors and 10 chapters. The book title, chapter content and author name are short strings (around 10chars length).

Then I try to read all books (including authors and chapters). All classes have toString methods that print all properties. To prevent IO bottlenecks I call the toString method of the book (therefore retrieving all lazy collections) but do not print the content somewhere (I did before, did not change the numbers a lot).

I am logging how long the retrieval of 100 books takes and the times (in ms) are:
Code:
39;692;438;433;568;484;587;635;660;489;
552;542;555;603;602;764;537;845;627;642;
561;605;680;660;578;603;698;636;697;710;
827;770;785;872;951;879;894;934;970;1101;
1010;1067;1097;1133;1249;1196;1230;1352;1314;1661;
1545;1395;1420;1461;1538;1633;1615;1611;1643;1712;
1721;1831;1901;1987;2024;2061;2112;2145;2263;2186;
2210;2262;2300;2339;2368;2539;2427;2466;2523;2756;
2755;2599;2696;2800;2702;2771;2782;2819;2819;2868;
2951;3262;2962;3025;2991;3109;3115;3572;3406;3483;3565;

(first number for first 100 books, second for books 100-200)

Some statistical ups and downs, but the tendency is raising: books 9900 to 10000 take about 4 times longer to read than books 0-100!

The numbers above are for MySql, same tendency for Oracle9 (but generally slower).

The question is: why is getting hibernate (?) slower on more objects??? I clear the session cache, I do not use a 2nd level cache - I have no idea what is going on. But it prevents us from using hibernate in a large project!

My classes and settings:

Code:
@Entity
public class Book {
    @Id
    @GeneratedValue
    private Long id;
    @Basic
    private String title;
    @OneToMany(cascade=CascadeType.ALL, fetch=FetchType.LAZY)
    private List<Author> authors;
    @OneToMany(cascade=CascadeType.ALL, fetch=FetchType.LAZY)
    private List<Chapter> chapters;

  // getters and setters
}

@Entity
public class Author {
    @Id
    @GeneratedValue
    private Long id;
    private String name;
   
  // getters and setters
}

@Entity
public class Chapter {
    @Id
    @GeneratedValue
    private Long id;
    @Basic
    private String text;

  // getters and setters
}


Code between sessionFactory.openSession() and session.close():

Code:
        Session session = HibernateUtil.getSession();
        Transaction tx = session.beginTransaction();
        String hql = "select b from Book b where b.title LIKE 'book%'";
        Query query = session.createQuery(hql).setCacheMode(CacheMode.IGNORE);
        ScrollableResults books =  query.scroll(ScrollMode.FORWARD_ONLY);
        int count=0;
        Book book;
        String bookString;
        while ( books.next() ) {
            book = (Book) books.get(0);
            bookString = count + " " + book;
            if (count % 25 == 0) {
                //flush and release memory:
                session.flush();
                session.clear();
            }
            ++count;
        }
        tx.commit();
        session.close();


My hibernate.hbm.xml looks like:
Code:
<hibernate-configuration>
    <session-factory>
        <!-- Database connection settings -->
        <property name="connection.driver_class">com.mysql.jdbc.Driver</property>
        <property name="connection.url">jdbc:mysql://xxxx/aptest?useUnicode=true&amp;characterEncoding=UTF-8</property>
        <property name="connection.username">xx</property>
        <property name="connection.password">xx</property>
        <property name="dialect">org.hibernate.dialect.MySQL5InnoDBDialect</property>

        <!-- JDBC connection pool (use the built-in) -->
        <property name="connection.pool_size">1</property>

        <!-- Enable Hibernate's automatic session context management -->
        <property name="current_session_context_class">thread</property>

        <property name="cache.provider_class">org.hibernate.cache.NoCacheProvider</property>
       
        <!-- batch inserts -->
        <property name="hibernate.jdbc.batch_size">25</property>

        <!-- Echo all executed SQL to stdout -->
        <property name="show_sql">false</property>

        <property name="hbm2ddl.auto">validate</property> <!-- create, update, validate -->


         <!--  for configuration in annotations -->
         <mapping class="testonetomany.Book"/>
         <mapping class="testonetomany.Author"/>
         <mapping class="testonetomany.Chapter"/>
    </session-factory>

</hibernate-configuration>

Hibernate version: 3.2.3.ga, hibernate annotations 3.2.1.ga


Name and version of the database you are using:
mysql-5.0.18 (linux on suse) or oracle 9g
Code:


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 24, 2007 9:31 am 
Newbie

Joined: Tue May 08, 2007 12:25 am
Posts: 18
A few suggestions
1. try using a stateless session if possible.
2. make sure you have the right indices on the table.
3. a similar thing happened to me when I did not give MySQL enough memory to load the indices.

could you show me the sql thats being run on the table?


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 24, 2007 9:32 am 
Newbie

Joined: Tue May 08, 2007 12:25 am
Posts: 18
make sure there is a index on "title"


let me know if this works.


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 24, 2007 10:06 am 
Newbie

Joined: Wed May 23, 2007 10:35 am
Posts: 6
imosquer wrote:
1. try using a stateless session if possible.


thought about that but then I'd have to load the referenced collections by hand - not very comfortable.


imosquer wrote:
2. make sure you have the right indices on the table.


I did not create any indices on any tables manually. But you are right. My query could use an index. So I changed the query to
Code:
"select b from Book b"

So no index required, just give me all books! And for the foreign keys there should not be an index required!


imosquer wrote:
3. a similar thing happened to me when I did not give MySQL enough memory to load the indices.


Good point. Have to check that. But the same happened to a big oracle db as well. It nearly killed the oracle, but was even slower than mysql.

imosquer wrote:
could you show me the sql thats being run on the table?


The following queries are issued for the new query (no title like... anymore):
Code:
retrieve all books:
Hibernate: select book0_.id as id0_, book0_.title as title0_ from APTESTBOOK book0_

get authors for first book:
Hibernate: select authors0_.APTESTBOOK_id as APTESTBOOK1_1_, authors0_.authors_id as authors2_1_, author1_.id as id1_0_, author1_.name as name1_0_ from APTESTBOOK_APTESTAUTHOR authors0_ left outer join APTESTAUTHOR author1_ on authors0_.authors_id=author1_.id where authors0_.APTESTBOOK_id=?

get chapters for first book:
Hibernate: select chapters0_.APTESTBOOK_id as APTESTBOOK1_1_, chapters0_.chapters_id as chapters2_1_, chapter1_.id as id2_0_, chapter1_.text as text2_0_ from APTESTBOOK_APTESTCHAPTER chapters0_ left outer join APTESTCHAPTER chapter1_ on chapters0_.chapters_id=chapter1_.id where chapters0_.APTESTBOOK_id=?

print book:
0 Book[title=book0, authors=[Author[name=book0author0], Author[name=book0author1], Author[name=book0author2], Author[name=book0author3], Author[name=book0author4], Author[name=book0author5], Author[name=book0author6], Author[name=book0author7], Author[name=book0author8], Author[name=book0author9]], chapters=[Chapter[text=book0chapter0], Chapter[text=book0chapter1], Chapter[text=book0chapter2], Chapter[text=book0chapter3], Chapter[text=book0chapter4], Chapter[text=book0chapter5], Chapter[text=book0chapter6], Chapter[text=book0chapter7], Chapter[text=book0chapter8], Chapter[text=book0chapter9]]]

second book: first the authors:
Hibernate: select authors0_.APTESTBOOK_id as APTESTBOOK1_1_, authors0_.authors_id as authors2_1_, author1_.id as id1_0_, author1_.name as name1_0_ from APTESTBOOK_APTESTAUTHOR authors0_ left outer join APTESTAUTHOR author1_ on authors0_.authors_id=author1_.id where authors0_.APTESTBOOK_id=?

then the chapters:
Hibernate: select chapters0_.APTESTBOOK_id as APTESTBOOK1_1_, chapters0_.chapters_id as chapters2_1_, chapter1_.id as id2_0_, chapter1_.text as text2_0_ from APTESTBOOK_APTESTCHAPTER chapters0_ left outer join APTESTCHAPTER chapter1_ on chapters0_.chapters_id=chapter1_.id where chapters0_.APTESTBOOK_id=?
....


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 24, 2007 6:15 pm 
Newbie

Joined: Tue May 08, 2007 12:25 am
Posts: 18
In my experience when something like this happens, it usually has to do with indices. Because the more you go down the table the longer it takes the database to traverse the other join tables, and hence your delay.

I'm not sure if its your case or not, but its worth taking a closer look.

just to make sure, try adding indices to these columns:
APTESTBOOK_APTESTCHAPTER chapters_id
APTESTBOOK_APTESTAUTHOR authors_id


also, why are you using scrollable results? try just iterating over a list.
There shouldn't be much of a difference but just a thought.


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 24, 2007 7:59 pm 
Newbie

Joined: Wed May 23, 2007 6:25 pm
Posts: 7
Think about it this way:
First you ask the database for books # (NOT ID) 0 - 100. No problem, it reads off the first 100 records.
Second you ask the database for books # 100 - 200. Now it has to scan through 100 records then give you the next 100. Half the time was spent just getting to the first book you want.
...
You ask the database for books 1,000,000-1,000,100. It spends 10,000 times as much time finding row 1,000,000 as it does reading the data you want.

SQL Server 2005 just introduced the concept of row numbers, and AFAIK MySQL does not have it. The database cannot be treated as an array where you get instant access to a given row number. What you can do, however, is something like this:
int current = 0;
int PAGE = 100;
Results = Search for books matching name with id > current, limit PAGE
while ( Results.size > 0 )
{
print the books
current = id of last book printed
Results = Search for books matching name with id > current, limit PAGE
}

Sorry, you can add all the indexes you want but the behavior you noticed is always going to appear as long as you treat the query results like an array.


Top
 Profile  
 
 Post subject:
PostPosted: Fri May 25, 2007 3:45 am 
Newbie

Joined: Wed May 23, 2007 10:35 am
Posts: 6
imosquer wrote:
In my experience when something like this happens, it usually has to do with indices. Because the more you go down the table the longer it takes the database to traverse the other join tables, and hence your delay.

I'm not sure if its your case or not, but its worth taking a closer look.

just to make sure, try adding indices to these columns:
APTESTBOOK_APTESTCHAPTER chapters_id
APTESTBOOK_APTESTAUTHOR authors_id

also, why are you using scrollable results? try just iterating over a list.
There shouldn't be much of a difference but just a thought.


you are right, there is not much difference (and IMHO mysql and oracle do not support scrollable results, so in effect, they do not help)

for the indices: to be sure that the database is not the bottleneck I implemented the reading in plain jdbc (no hibernate used at all): I can read 10000 books (including authors and chapters and create all java objects (like hibernate)) in 18 seconds (whereas hibernate needed around 200 seconds!)

And it does not become slower the more I read!!! The books/second is constant over the whole reading!

So again: why becomes hibernate slower the more I read? Maybe someone from the hibernate team could respond as it is clearly a hibernate problem!


Top
 Profile  
 
 Post subject: 3.2.1.ga solves the problem
PostPosted: Fri May 25, 2007 4:13 am 
Newbie

Joined: Wed May 23, 2007 10:35 am
Posts: 6
I finally found a solution to the problem. Hibernate 3.2.1.ga or 3.2.4.ga reads the 10000 books in constant time (books read per second).

So it seems that the version 3.2.3.ga is kind of buggy.

Thanks for your answers anyway!


Top
 Profile  
 
Display posts from previous:  Sort by  
Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 8 posts ] 

All times are UTC - 5 hours [ DST ]


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.