I've been wrestling with performance issues for the last few days now. I know this stuff is tricky and far from trivial, so I've really done my homework: read some books and as many posts and articles about performance as I can find. But I'm still baffled by these two things about PostgreSQL.
1. Dealing with large tables where the database falls back to sequential scans
2. Dealing with the query planner not choosing the right join order, even when you know the correct one yourself and wish you could just specify it.
Issue #1 is the one that bothers me the most. For example, let's say I want to query for a list of objects of type A that each have hundreds of children of type B, and I want to use a filter on B to arrive at that set of A's. To me, this seems really straightforward: if there is an index from B to A, the query should be fast. But because the distribution is so bad, I'm basically forced to put the filter on A instead, to avoid joining to (and occasionally reading in) type B.
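To make the shape concrete, here is a rough sketch of what I mean. The table and column names are made up for illustration; my real schema is messier:

    -- Hypothetical schema: each row of a has hundreds of children in b,
    -- and b.a_id is indexed back to a.
    SELECT DISTINCT a.*
    FROM a
    JOIN b ON b.a_id = a.id
    WHERE b.status = 'active';   -- the filter I'd like to keep on B

    -- What I end up doing instead: pushing an equivalent filter onto a
    -- so the planner never has to touch b at all.
    SELECT a.*
    FROM a
    WHERE a.status = 'active';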
I draw Dan Tow's join diagrams (amazing book), since his method seems to work for many queries. Sometimes, though, the only filter is a LIMIT, because the other filters have very bad filter ratios. I have to make up requirements just to get the result sets down to a size where Postgres isn't looping. However, even with a pleasant 0.08 filter ratio on a 32,000k table, the query takes 250-275 milliseconds with all the proper indexes in place.
Yes, I've even dropped the indexes and started adding them back at random, running EXPLAIN several times. Dan Tow's method makes sense; that's the one that should work. But Postgres is just stupid, or I am. This leads me to believe the filters aren't good enough, but I can't muck with the queries any further without changing the business reason for running them (for example, that 0.08 filter ratio comes from fetching only the last 7 days of entries, and the 7 days is crucial to the business requirements).
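For what it's worth, the 7-day case boils down to something like this (again, the names are invented); this is the sort of statement I've been running through EXPLAIN ANALYZE over and over:

    -- Hypothetical: entries.created_at is indexed, and only the last
    -- 7 days of rows should qualify (that's where the 0.08 filter
    -- ratio comes from).
    EXPLAIN ANALYZE
    SELECT e.*
    FROM entries e
    JOIN accounts a ON a.id = e.account_id
    WHERE e.created_at >= now() - interval '7 days';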
This does alarm me, however. In Dan's book, he mentions that you should only need to tune about 5% of the big queries in an app; I find myself tuning practically everything. I used to think it might be Hibernate being inefficient. I used to think maybe I was fetching too much, or maybe not enough. But no, it's just Postgres.
But part of me really wonders why it takes 1.8 seconds to look up a unique three-column value (which has an implied unique index) in a table with 80,000 rows. I don't get it. The index is actually being used in this case, too. I drop the entire database, thinking it might be fudged somehow, and reload it, but I still get the same results. I run VACUUM FULL ANALYZE and the results are the same.
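Concretely, the lookup has this shape (made-up table and column names standing in for my real ones), and this is how I've been checking that the index really is used:

    -- Hypothetical table with a unique constraint on three columns,
    -- which gives an implied unique index on (col_a, col_b, col_c).
    -- All three columns are supplied, so I'd expect a single-row
    -- index scan, yet it still comes back in about 1.8 seconds.
    EXPLAIN ANALYZE
    SELECT *
    FROM lookup_table
    WHERE col_a = 42
      AND col_b = 'foo'
      AND col_c = 'bar';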
I'm not sure how big websites manage to serve so much content. Heck, even blogs sometimes join several tables and serve quite a bit per page, yet I can refresh those pages over and over and get great results, and I'm not even local to where the application is running. Meanwhile, I run this stuff on my local machine with 1 GB of RAM, a 3.0 GHz P4 and a decent 120 GB WD hard drive, and I get poor performance. How can I feel confident that this stuff is going to perform correctly?
Issue #2 is the really bogus one. It's just frustrating, because I take great care working out the diagrams from Dan Tow's book, and I know I'm doing it right since I got all of the answers correct in his exercises. Surely I'm skilled enough to tune my modest application, right? Perhaps I put too much faith in the logic of his techniques, but my best guess is that Postgres' query planner has other plans, choosing join orders that are very different from the optimal one.
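From what I've read, the closest thing to specifying the join order yourself is to write explicit JOINs and lower join_collapse_limit so the planner keeps them in the order written. Here's a minimal sketch with invented table names; I'm not at all sure this is a sane long-term answer:

    -- With join_collapse_limit = 1, the planner is supposed to keep
    -- the explicit JOIN order as written instead of reordering it.
    -- (Invented tables, shown only to illustrate the idea.)
    SET join_collapse_limit = 1;

    EXPLAIN ANALYZE
    SELECT *
    FROM filtered_driver f
    JOIN mid_table m ON m.driver_id = f.id
    JOIN big_table b ON b.mid_id = m.id
    WHERE f.created_at >= now() - interval '7 days';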
So, this leads me to one of two things:
1. I'm an idiot; or
2. It's time to choose a new database
I really respect OSS and I love it. Hibernate is fantastic once you get past the learning curve (and perhaps write the special cases down in a log somewhere, since my memory is just horrible). But when it comes to performance, I'm really worried.
Anyway, thanks for listening. If you have some comments, feel free to share.