postgresql slow?

MarcusZ · **Joined:** Fri May 07, 2004 2:18 am **Posts:** 19

Hello

I have spent the last 2 months becoming familiar with hibernate and at this point I can honestly state that it performs as advertised. The learning curve is a little steep. More from having to glean snippets of examples than from any inherent complexity. My personal experience with it was made a little more difficult because I used postgresql as my backed and was constantly frustrated by the the performance of selects and inserts as my objects grew in complexity and quantity.
Basically I have developed a 3 tier rmi app that has 400 classes and 300 tables. All strongly crosslinked with foreign keys. It works fine but as I developed it I came to realize that the performance would be unacceptable in real world use unless I started to implement many convoluted caching schemes. I then started to do a whole bunch of profiling to find out where the bottle necks were. It all came down to postgresql.

The app runs on 3 different server. All P4 3.2GHZ with 2GB ram that are dedicated to their one specific task eg the database runs alone on its own server. The dataset has about 40000 rows
Here are my results for my most complex object which has a fairly large object graph(lots of joins). The tests were repeated 20 times to average them out.
Postgresl 7.2.x on Redhat 9 select 1672 ms insert 1200ms
Postgresl 7.3.x on Redhat 9 select 1860 ms insert 1200ms
Postgresl 7.3.x on Debian testing select 1650 ms insert 1300ms

The numbers are similar enough to tell me that it isnt a problem with a specific Postgresql version or linux distro.
Also please note that this after considerable tweaking of postgresql.conf. The default numbers were worse
Since I have built my whole app using xdoclet tags I decided to try my luck with Mysql. I downloaded the current stable 4.x and altered a few settings in my ant build file and recreated the whole schema including data into mysql running on the same machines.
The results were stunning
Mysql 4.0.20 on Redhat and Debian select ~25ms insert ~19ms

Same data, same schema, mysql installed with a simple apt-get install with default config. This is an 80x improvement and it has been consistent and rock solid. I like postges(I dont know why) and I have been using it for years. I was sort of aware of its slowness and I would deliberately strive to avoid using foreign keys because of that but this is the first time I was able to seamlessly test it against mysql because xdoclet made it so easy and seamless.

Any comments. Please remember the test is not very scientific but it is the same data, same schema, same servers, same settings and the dataset is not random. It is from a legacy database that I have converted by hand into java objects that are then inserted into the database through hibernate as an ant task.
My timing tests were done with a simple
long start = System.currentTimeMillis();
entity = (Entity) session.get(entity.getClass(), new Long(entity.getId()));
long end = System.currentTimeMillis();
System.out.println("call took = " + (end - start) + " ms");
The postgres logs backup the numbers

Marcus

christian · **Posted:** Sat Jul 03, 2004 1:58 pm

Quote:

Same data, same schema, mysql installed with a simple apt-get install with default config. This is an 80x improvement and it has been consistent and rock solid. I like postges(I dont know why) and I have been using it for years. I was sort of aware of its slowness and I would deliberately strive to avoid using foreign keys because of that but this is the first time I was able to seamlessly test it against mysql because xdoclet made it so easy and seamless.

No, certainly not the same, most likely you don't even have transactions with MySQL enabled. It's just a dumb filestore, while PostgreSQL is a database. I recommend tuning the database first, read "SQL Tuning" from Dan Tow. Most likely you have some very suboptimal queries and indexes. Read the execution plans, optimize. I also recommend "Hibernate in Action", tuning is a topic that is very had to discuss online.

MarcusZ · **Joined:** Fri May 07, 2004 2:18 am **Posts:** 19

Why would I use hibernate and highly customized queries to recreate an object . I just want one object recreated. If I wanted to super customize the query I wouldnt bother with hibernate. That is the whole point of using session.get(). You will note I was not calling
Transaction tx= session.beginTransaction();
tx.commit();
In my function.
My main interest is select speed. I could live with the ridiculously slow insert.

Marcus

baliukas · **Posted:** Sat Jul 03, 2004 2:42 pm

I have tried it on my PC without any tuning (default configuration created by apt-get install)

Code:

Time: 122.865 ms
test=# select count(*) from test;
 count
-------
 90112
(1 row)

Time: 121.653 ms
test=# select * from test where id= 666::int8;
 id  | value1 | value2
-----+--------+--------
 666 |      2 | 3
(1 row)

Time: 3.015 ms
test=# INSERT INTO test values ( nextval('test_sec') ,2,3 );
INSERT 107269 1
Time: 1.203 ms

christian · **Posted:** Sat Jul 03, 2004 2:47 pm

Both of those queries run completely in the index (and the index is in the cache) of the database. This is not a usable test :)

baliukas · **Posted:** Sat Jul 03, 2004 3:02 pm

No, it is not in cache after "count", it is faster from cache:

Code:

test=# select * from test where id= 666::int8;
 id  | value1 | value2
-----+--------+--------
 666 |      2 | 3
(1 row)

Time: 0.856 ms
test=# select * from test where id= 666::int8;
 id  | value1 | value2
-----+--------+--------
 666 |      2 | 3
(1 row)

Time: 0.861 ms
test=# select * from test where id= 666::int8;
 id  | value1 | value2
-----+--------+--------
 666 |      2 | 3
(1 row)

baliukas · **Posted:** Sat Jul 03, 2004 3:04 pm

I can prove postgresql is the fasted database in the world, but it is posiible to prove the same for any database :)

MarcusZ · **Joined:** Fri May 07, 2004 2:18 am **Posts:** 19

I also get almost no speed difference between postgresql and mysql on simple objects. The difference becomes apparent as the object complexity increases. Especially as more foreign keys are involved. I like postgresql(I dont know why) but this slowness with my real world objects renders it unuseable. I would not have noticed were it not for hibernate. Then again if not for hibernate I would not be creating objects with this level of granularity( All those outer join statements make me go crosseyed after a while)

Marcus

christian · **Posted:** Sat Jul 03, 2004 4:20 pm

Guys, instead of doing a lot of handwaving like "complex" or "non-complex" objects, read the friggin SQL Tuning book.

baliukas · **Posted:** Sat Jul 03, 2004 4:48 pm

It is possible tune postgresql without book, you can not tune it like oracle, but
execute "VACUUM FULL ANALYSE" first, try "EXPLAIN ANALYSE sql" and see statistics in system tables. It must say why it is slow.
You can try to add more cache and more sort memory, but linux kernel cache with default configuration is very aggressive itself. Most performance problems are caused by DB fragmentation and bad query plans.

christian · **Posted:** Sat Jul 03, 2004 4:52 pm

Not surprisingly, "fragmentation" and execution plans is what "SQL Tuning" is all about.

MarcusZ · **Joined:** Fri May 07, 2004 2:18 am **Posts:** 19

Actually I did try vacuum analyze and every other trick I could think of. I have been using postgresql for years. I probably pissed away 5 days playing with every conceivable setting which is how I came down to ~1600ms. When I started it was ~4200ms. The fact that I couldnt go any lower made me really dig. I also notice that postgresql will pin the cpu while mysql is so fast I cant measure its cpu use. I am now trying it with Oracle and my preliminary results show its also a hell of a lot faster than postgresql. The more I play with it the more I come to realize that foreign keys just kill postgresql performance.

Marcus

baliukas · **Posted:** Sun Jul 04, 2004 1:49 am

It is very strange, paste query plan and SQL if it is not a secret. I am not sure I can help, but there are a lot of tricks to make postgresql faster. Probably not a foreign key is a problem, it is just a trigger. I am not sure about all versions, but "old" JDBC driver causes a lot of performance problems, "id= 666::int8" this trick with cast is very important too, it looks like two same queries, but performance is very different:

Code:

test=# select * from test where id= 666::int8;
 id  | value1 | value2
-----+--------+--------
 666 |      2 | 3
(1 row)

Time: 1.165 ms
test=# select * from test where id= 666;
 id  | value1 | value2
-----+--------+--------
 666 |      2 | 3
(1 row)

Time: 512.234 ms

baliukas · **Posted:** Sun Jul 04, 2004 5:29 am

BTW this explains the reason:

Index Scan vs Seq Scan 0.111 ms / 103.838 ms

Code:

test=# explain analyse select * from test where id= 666;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=2 width=412) (actual time=0.598..103.763 rows=1 loops=1)
   Filter: (id = 666)
 Total runtime: 103.838 ms
(3 rows)


test=# explain analyse select * from test where id= 666::int8;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Index Scan using test_pkey on test  (cost=0.00..4.82 rows=2 width=412) (actual time=0.041..0.045 rows=1 loops=1)
   Index Cond: (id = 666::bigint)
 Total runtime: 0.111 ms
(3 rows)

103.838 ms

MarcusZ · **Joined:** Fri May 07, 2004 2:18 am **Posts:** 19

I appreciate the response. Unfortunately I cannot post the code. I finished testing with oracle. It is sometimes equal and sometimes 1/3 the speed of mysql but still more than useable compared with postgresql. I noticed this person had similar issues.
http://forum.hibernate.org/viewtopic.ph ... light=slow

Perhaps I could make things better by writing all the jdbc by hand and using explain to modify my sql but what would I gain by this. My app works. It works now and it works under heavy load. Hibernate does its job. It eliminates a lot of extra hand coding and to be honest it feels more robust than going the jdbc route. I still like postgresql (Although I am no longer going to use it) and hopefully someone will come up with some simple solution to this bug

Marcus