-->
These old forums are deprecated now and set to read-only. We are waiting for you on our new forums!
More modern, Discourse-based and with GitHub/Google/Twitter authentication built-in.

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 8 posts ] 
Author Message
 Post subject: table with 5-700 cols, performance issue?
PostPosted: Fri Sep 05, 2003 8:34 am 
Regular
Regular

Joined: Tue Aug 26, 2003 7:53 pm
Posts: 66
Location: Lakeland, Florida USA
We've got a table with 5-700 cols. This table is in 3nd normal form as far as I can tell. We could break chucks of cols off to seporate tables but they would have 1:1 relationships with "mother" :-). Would this be preferable to 1 table with 5-700 cols?

My opining is that separate tables would be much preferable. But this is based on what I perceive to be a performance issue hydrating all those cols. Note that access to the table will probably always be through a foreign key where the rs would contain at most 10-12 rows.

Anyone ever use a table with this many cols?

Jeff Boring
Siemens Power Corp.


Top
 Profile  
 
 Post subject:
PostPosted: Fri Sep 05, 2003 9:38 am 
Hibernate Team
Hibernate Team

Joined: Tue Aug 26, 2003 3:00 pm
Posts: 1816
Location: Austin, TX
What is 5-700? Between 5 and 700? Five thousand and seven-hundred (5,700)?

From a database perfomance perspective, it is actually better to not break them out into seperate tables. One of the tables in our old schema modelled employee information in this manner. There was a main ENT_PEOPLE table and two corralary tables ENT_PEOPLE_SUPP and ENT_PEOPLE_AUX with one-to-one between all three. And sure enough every time we needed employee info the data we needed was not all on the main ENT_PEOPLE table, but on _SUPP or on _AUX or on both. Thus what should have been a simple lookup turned into a three table join each and every time. If you can more logically partition off the data elements this can probably be minimized though.

We were using direct sql queries at that time, so I can't speak to Hibernate specific performance in relation to this.

If you broke the data apart, it would be a true one-to-one, correct? It seems you could proceed one of two ways using Hibernate on top of this.
1) Break the tables apart and map the on-to-one relations using hibernate
2) Keep it all as one table and use the Lightweight pattern to acheive different lightweight views onto this data.


Top
 Profile  
 
 Post subject: normalization
PostPosted: Fri Sep 05, 2003 10:17 am 
Beginner
Beginner

Joined: Thu Sep 04, 2003 2:50 pm
Posts: 45
Location: US: New York NY
You can review nomalization by reading this:
http://www.databasejournal.com/sqletc/article.php/26861_1428511_4
or other stuff.

Database Normalization has nothing to do with hibernate. How far to normalize your data will depend on the data, the business and the database you choose. If your relational database does not perform well when the data is normalized, then you either have to forget achieving 1st normal form. If your database supports (performs well) nomalization, then there should be no reason not to. Hibernate seems to work well in either condition ( i say seems to because I have not used it extensivly.)

You always have to closely analyze your data and your business requirements. I've worked with dba's that pushed hard to always go 4th normal form. An example is a customer table, where each customer has 3 phone numbers. I would make that phone_1, _2 and _3 (or home_phone, work_phone, cell_phone,) because I am normally comfortable with that decision and it usually makes the coding less complicated. The 'always normalize' camp woud map the phone numbers and suggest a stored procedure to ease coding. I think that both parties are correct. However, every descision has a consequence.

Hope that helps. Russ


Top
 Profile  
 
 Post subject:
PostPosted: Fri Sep 05, 2003 11:07 am 
Regular
Regular

Joined: Tue Aug 26, 2003 7:53 pm
Posts: 66
Location: Lakeland, Florida USA
I got an actual count, it's 493. That's four hundred ninety three cols :-)

When I said the table is in 3NF, I meant that it appears the data has been properly modeled from a relational /data modeling view.
Quote:
> Database Normalization has nothing to do with hibernate. How far to normalize your data will depend on the data, the business and the database you choose. <


Most apps that I know of in the J2EE/RDBMS environment almost always store data in 3NF. The question was about performance of a 493 col table not normalization. I only mentioned normalization because the first question most people would ask about a 493 col table is, is it normalized.

Jeff


Top
 Profile  
 
 Post subject:
PostPosted: Fri Sep 05, 2003 3:08 pm 
Hibernate Team
Hibernate Team

Joined: Tue Aug 26, 2003 3:00 pm
Posts: 1816
Location: Austin, TX
Actually, the "one big table with a gazillion columns" concept is typically 4th-NF; i.e., slightly denormalized for query/access performance reasons.

The only concerns I would have would be:
* the amount of raw data that that would cause the jdbc driver to have to deal with
* the amount of throw-away objects created by jdbc in order to handle all those columns

Of ciurse the more data, the longer it will take hibernate to hydrate all the objects, but I've found that to be pretty linear performance.[/list]


Top
 Profile  
 
 Post subject:
PostPosted: Fri Sep 05, 2003 3:42 pm 
Beginner
Beginner

Joined: Thu Sep 04, 2003 2:50 pm
Posts: 45
Location: US: New York NY
steve, where did you get that definition for 4th noraml form... i would enjoy reading that.

http://home.earthlink.net/~billkent/Doc/simple5.htm

or

http://lochsa.lcsc.edu/cs351-sp-1999/normalization2.html

does not seem to fit your definition

if you have 493 columns, it would seem the data is not normalized, but that would be judging it on the cover. Get 10 rows of 493 columns will take longer then 10 rows of 10 columns. Will the same 10 rows return faster if normalized (assuming it could be normalized?) Good question, but one that should not be too hard to test. I would put $5 down that with lazy initialization and hibernate, the normalized set would perform better, but I can not back that with hard data.

there's my two cents :-)


Top
 Profile  
 
 Post subject:
PostPosted: Fri Sep 05, 2003 4:12 pm 
Hibernate Team
Hibernate Team

Joined: Tue Aug 26, 2003 3:00 pm
Posts: 1816
Location: Austin, TX
Quote:
concept is typically 4th-NF

Trying to respond to fast... Of course, 1st-NF

Quote:
Get 10 rows of 493 columns will take longer then 10 rows of 10 columns

You'll notice I said "From a database perfomance perspective"... In pure database performance terms, it actually has more to do with the number joins that have to be performed. Very rarely in SQL are you returning all rows in a table. So the total number of columns defined for a table is largely irrelavent. What is far more relevant is how you partition this data if you decide to break it up to avoid those joins as often as possible.

Quote:
Get 10 rows of 493 columns will take longer then 10 rows of 10 columns

Not if those 10 columns are spread across 10 different tables. The quickest access to data is going to be sans joins. Period. After about 5-6 join tables on most RDBMSs, you are going to start exponentially decreasing performance.

Quote:
I would put $5 down that with lazy initialization and hibernate, the normalized set would perform better, but I can not back that with hard data.

Of course, again "From a database perfomance perspective"... You'll notice my two suggestions included either breaking apart the data for Hibernate, or at the very least mapping lightweight views of the data so that not all the columns are returned


Top
 Profile  
 
 Post subject:
PostPosted: Fri Sep 05, 2003 6:24 pm 
Regular
Regular

Joined: Tue Aug 26, 2003 7:53 pm
Posts: 66
Location: Lakeland, Florida USA
Typical for us might be about 5 rows, 20-30 cols spread across 2 tables. This should perform better than 5 rows, 493 cols & 1 table. Of course, 5 rows, 20-30 cols & 5 tables might not. But perform better where?

Not from between PreparedStatement execute and ResultSet. I've never considered this would be the problem. I was thinking about Hibernate instantiating all those instances. The domain bean has no primitives. I ran an informal test today and it took between 5 & 10 secs to hydrate 1 row of 493 cols. By comparison, less than 1 sec for 1 row/table, 6 cols. [run in non-optimized environment - debug] There is now no question about finding a better way. This query will be run for 95% of the request. [Of course, we've also got to consider dehydration.]

We know what screens will use which cols and there seems to be some logical groupings so splitting the table will help. But is it necessary? Lightweight shows how to split a table into 2 main classes but we need about 5. Can we do something with interfaces? Can Hibernate deal with 1 class implementing 5 interfaces? Or 5 classes with 1-to-1 relationships all mapped to 1 table?

Thanks for your time,

Jeff


Top
 Profile  
 
Display posts from previous:  Sort by  
Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 8 posts ] 

All times are UTC - 5 hours [ DST ]


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.