-->

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 10 posts ] 
Author Message
 Post subject: Inheritance mapping: what's best? subclass, joined-subclass
PostPosted: Tue Oct 25, 2005 7:02 pm 
Beginner

Joined: Wed Oct 19, 2005 4:11 am
Posts: 48
I'm fairly new to Hibernate (just a few weeks) and thought I'd short-circuit a few days of experimentation by asking what most people find is the best way to do inheritance mapping.

I suppose the important considerations for me are (in order of priority):
(don't worry if you can't cover all points - any info would be valuable)

1. Ability to support complex multilevel inheritance hierarchies, i.e., a depth of more than 2 (not multiple inheritance).

e.g. (please excuse the Asciigram)

A
| \
B C
| | \
D E F

2. Full and transparent support for polymorphism.

3. Ability for subclasses to have associations with other classes (or even themselves)

4. Query Performance
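
A minimal Java sketch of the hierarchy in the diagram above, to make requirement 2 concrete. The class names simply mirror the letters in the Asciigram, and `countInstancesOf` is a hypothetical helper, not part of Hibernate: it shows what a polymorphic query for a given type would be expected to return, namely instances of that type and of all its subclasses.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical classes mirroring the diagram: B and C extend A,
// D extends B, E and F extend C.
class HierarchySketch {
    static class A { String describe() { return "A"; } }
    static class B extends A { @Override String describe() { return "B"; } }
    static class C extends A { @Override String describe() { return "C"; } }
    static class D extends B { @Override String describe() { return "D"; } }
    static class E extends C { @Override String describe() { return "E"; } }
    static class F extends C { @Override String describe() { return "F"; } }

    // Counts the objects that are instances of the given type or any subtype,
    // which is what a polymorphic query for that type would have to match.
    static long countInstancesOf(Class<? extends A> type, List<A> all) {
        return all.stream().filter(type::isInstance).count();
    }

    public static void main(String[] args) {
        List<A> all = Arrays.asList(new A(), new B(), new C(), new D(), new E(), new F());
        // Asking for C should also match E and F.
        System.out.println(countInstancesOf(C.class, all));
    }
}
```

Whichever mapping strategy is chosen, a query against C has to behave like this on the database side as well.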


PostPosted: Thu Oct 27, 2005 9:15 am 
Regular

Joined: Thu Oct 27, 2005 8:06 am
Posts: 55
Location: München, Germany
In your list of criteria, nothing speaks against table-per-class-hierarchy. The polymorphism and performance criteria even strongly speak for this method. The only drawback might be that attributes, including foreign keys for associations, of subclasses have to be nullable, so you can't use the database mechanisms for checking attributes that are logically non-nullable.

This nullability argument, however, is of limited importance. If your database is only updated via your application (in contrast to ad hoc SQL UPDATE or INSERT commands that some user might issue), your software has to check for not-null values, including association cardinality, anyway. If the user enters something that would lead to an illegal null value, you want to catch this error before hitting a database exception, don't you? Therefore, having the database check null values is just a double check on your software's correctness.

Therefore, the only case in which you have to think twice is when the database is updated via different pieces of software that you don't have under your control. In this case, that nullability argument might be so strong that you have to sacrifice some performance and software simplicity for it.
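
The application-side check described above might look roughly like the following sketch. `Account`, `CreditAccount` and `validate` are hypothetical names chosen for illustration; the point is only that a subclass attribute which must be nullable in a single shared table can still be enforced in code before anything reaches the database.

```java
import java.util.ArrayList;
import java.util.List;

class NullCheckSketch {
    // Hypothetical superclass and subclass; with table-per-class-hierarchy
    // the creditLimit column would have to be nullable in the shared table.
    static class Account { String owner; }
    static class CreditAccount extends Account { Long creditLimit; }

    // Collects violations in the application instead of relying on a
    // NOT NULL column constraint.
    static List<String> validate(Account account) {
        List<String> errors = new ArrayList<>();
        if (account.owner == null) {
            errors.add("owner must not be null");
        }
        if (account instanceof CreditAccount
                && ((CreditAccount) account).creditLimit == null) {
            errors.add("creditLimit must not be null for CreditAccount");
        }
        return errors;
    }

    public static void main(String[] args) {
        CreditAccount account = new CreditAccount();
        account.owner = "alice";
        // creditLimit was never set, so one violation is reported.
        System.out.println(validate(account));
    }
}
```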

As this is a methodological question for which no yes-no answer exists, I also encourage other experts to comment on this.


PostPosted: Thu Oct 27, 2005 9:28 am 
Beginner

Joined: Wed Oct 19, 2005 4:11 am
Posts: 48
Thanks for the info. I've heard other people say that the table-per-subclass is a bit more OO like. After investigating further there are a few things against the table-per-class-hierarchy that I have discovered:

1. Redundant data. The set of table columns is a union of all the fields representing every attribute in every subclass. When objects of the superclass are stored (assuming it's not abstract) then it takes up a row that has many columns that it will never use.

2. Schema changes. As we add new subclasses we must change the schema so that the table includes the new columns mapped from the new class's attributes. Not sure how much of a problem this is. Hibernate probably does a good job of updating in this case. Table per subclass allows you to add a new subclass without affecting the superclass's table.

3. Have to use a discriminator.
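
Points 1 and 2 above can be made concrete with a small sketch that derives the column layout of both strategies from the same class-to-attributes description. Everything here is illustrative: the column names `id` and `discriminator` and both helper methods are assumptions, not what Hibernate's schema export produces.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ColumnLayoutSketch {
    // Table per class hierarchy: one table holding the union of all
    // attribute columns, plus a key and a discriminator column.
    static Set<String> singleTableColumns(Map<String, List<String>> attrsByClass) {
        Set<String> columns = new LinkedHashSet<>();
        columns.add("id");
        columns.add("discriminator");
        for (List<String> attrs : attrsByClass.values()) {
            columns.addAll(attrs);
        }
        return columns;
    }

    // Table per subclass: one table per class, each holding the shared
    // primary key and only the attributes declared by that class.
    static Map<String, Set<String>> joinedTables(Map<String, List<String>> attrsByClass) {
        Map<String, Set<String>> tables = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : attrsByClass.entrySet()) {
            Set<String> columns = new LinkedHashSet<>();
            columns.add("id");
            columns.addAll(entry.getValue());
            tables.put(entry.getKey(), columns);
        }
        return tables;
    }

    public static void main(String[] args) {
        Map<String, List<String>> attrs = new LinkedHashMap<>();
        attrs.put("A", Arrays.asList("name"));
        attrs.put("B", Arrays.asList("b_value"));
        attrs.put("D", Arrays.asList("d_value"));
        System.out.println(singleTableColumns(attrs));
        System.out.println(joinedTables(attrs));
    }
}
```

Adding a class to the map widens the single table, while in the joined layout it only adds a new entry and leaves the existing tables untouched.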

I can see how table-per-hierarchy may give a performance boost, but is this boost overwhelming or is it just a 5-10% boost?


PostPosted: Thu Oct 27, 2005 9:56 am 
Regular

Joined: Thu Oct 27, 2005 8:06 am
Posts: 55
Location: München, Germany
Quote:
I've heard other people say that the table-per-subclass is a bit more OO like

Syntactically, it looks OO-like, because one table reflects exactly those attributes declared in one class. Semantically, it isn't, because in OO, when an instance of a subclass is created, the instance will also contain all attributes of the superclass. This is reflected better in the table-per-class-hierarchy approach, where you always get all data of one object.

There is no such thing as a really OO-like relational database structure. This is called the impedance mismatch between OO and RDBMS, and it was the original reason that people started thinking about ORM ;-))

Quote:
When objects of the superclass are stored (assuming it's not abstract) then it takes up a row that has many columns that it will never use.


Logically true. Physically, no serious database should use space for empty attributes.

Quote:
As we add new subclasses we must change the schema


That's true. If this happens a lot, it may be a nuisance. However, adding new subclasses is a major software change. If such a change occurs, isn't it likely that new attributes will be created, apart from new subclasses? In that case, you would have to touch your table structure anyway.

Quote:
Have to use a discriminator


Where's the problem? This is handled completely transparently by Hibernate.
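
For readers wondering what the discriminator actually does, here is a rough sketch of the idea. It is not Hibernate's implementation; `Payment`, `CardPayment`, the discriminator values and the column names are all made up for illustration. A row from the single table is turned into an object of the right class by looking at one extra column.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class DiscriminatorSketch {
    // Hypothetical two-class hierarchy stored in one table.
    static class Payment { String amount; }
    static class CardPayment extends Payment { String cardNumber; }

    // Maps each discriminator value to a factory for the matching class.
    static final Map<String, Supplier<Payment>> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put("PAYMENT", Payment::new);
        FACTORIES.put("CARD", CardPayment::new);
    }

    // Builds an object from a row, choosing the class by the discriminator.
    static Payment materialize(Map<String, String> row) {
        String discriminator = row.get("discriminator");
        Supplier<Payment> factory = FACTORIES.get(discriminator);
        if (factory == null) {
            throw new IllegalArgumentException("unknown discriminator: " + discriminator);
        }
        Payment payment = factory.get();
        payment.amount = row.get("amount");
        if (payment instanceof CardPayment) {
            // Subclass columns are only read for rows of that subclass.
            ((CardPayment) payment).cardNumber = row.get("card_number");
        }
        return payment;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("discriminator", "CARD");
        row.put("amount", "10.00");
        row.put("card_number", "4111");
        System.out.println(materialize(row).getClass().getSimpleName());
    }
}
```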


PostPosted: Thu Oct 27, 2005 9:58 am 
CGLIB Developer

Joined: Thu Aug 28, 2003 1:44 pm
Posts: 1217
Location: Vilnius, Lithuania
"table-per-class-hierarchy" is the natural way to express generalization in the relational model, and it is the best for performance reasons too. There are a few problems with this mapping, but all of them have solutions:

1. "NOT NULL constraint". It becomes CHECK ( NOT(discriminator = 'mySubclass' AND myField IS NULL) ) in subclasses.
2. "Redundant data". Declare nullable fields last; the database will optimize null storage.
3. "Single large table and indexes". Create a discriminator-based partition.
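
As a sketch of solution 1, the constraint text could be generated from the mapping rather than written by hand. The helper below is hypothetical and does no identifier quoting or escaping, so it is only suitable for trusted, fixed names; `rowSatisfies` evaluates the same condition in Java to show which rows the constraint accepts.

```java
class CheckConstraintSketch {
    // Builds the CHECK clause that replaces NOT NULL for a subclass column.
    // Assumes trusted identifiers; nothing is quoted or escaped here.
    static String notNullForSubclass(String discriminatorColumn,
                                     String discriminatorValue,
                                     String column) {
        return "CHECK ( NOT(" + discriminatorColumn + " = '" + discriminatorValue
                + "' AND " + column + " IS NULL) )";
    }

    // Same condition in Java: a row is rejected only when it belongs to
    // the subclass and the field is missing.
    static boolean rowSatisfies(String rowDiscriminator,
                                String discriminatorValue,
                                Object fieldValue) {
        return !(discriminatorValue.equals(rowDiscriminator) && fieldValue == null);
    }

    public static void main(String[] args) {
        System.out.println(notNullForSubclass("discriminator", "mySubclass", "myField"));
        // A superclass row may leave the subclass field empty.
        System.out.println(rowSatisfies("myBaseClass", "mySubclass", null));
        // A subclass row may not.
        System.out.println(rowSatisfies("mySubclass", "mySubclass", null));
    }
}
```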


PostPosted: Thu Oct 27, 2005 12:58 pm 
Beginner

Joined: Tue Jan 06, 2004 4:51 pm
Posts: 48
Why are you concerned if it is "OO-like" in a relational database? The whole point of Hibernate is to leverage the relational model on the database side and the OO model on the application end. You should build your tables such that it makes sense relationally on the DB end (which, I would suggest, is table-per-class-hierarchy).


PostPosted: Thu Oct 27, 2005 1:59 pm 
Hibernate Team

Joined: Mon Aug 25, 2003 9:11 pm
Posts: 4592
Location: Switzerland
Uhm, the normalized model is joined-subclass. Everything else is a hack, even worse than doing the class==table we are doing.


PostPosted: Thu Oct 27, 2005 2:48 pm 
Beginner

Joined: Wed Oct 19, 2005 4:11 am
Posts: 48
Quote:
Uhm, the normalized model is joined-subclass. Everything else is a hack, even worse than doing the class==table we are doing.


Yes, that's what I was thinking. Even though my experience over many years has been probably 90% OO modeling and coding and 10% RDBMS, I did enough RDBMS work to know good normalization when I see it, and I agree with Christian that table-per-subclass is the way to go.

Having said that, many of the legacy schemas that I worked on invariably expressed inheritance using 'table per hierarchy', but I think this is only because it is a lot easier to implement it that way when you are mapping "by hand", since you end up writing fewer queries with joins.

Things change though; now with Hibernate we can manage all the queries with joins. The effort involved in developing a normalized schema is no greater than that of developing a less normalized schema.

I don't think there would be too much of a performance penalty even though there are extra joins - they only involve a single common primary key and besides RDBMS are "pretty darn good" at joining.
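
To illustrate the kind of join meant here, the sketch below builds a lookup query across the tables of one inheritance path, all joined on the same primary key. The table names, the `id` column and the helper itself are assumptions for illustration; the SQL an ORM really emits will differ (for polymorphic queries it typically needs outer joins to the subclass tables).

```java
import java.util.Arrays;
import java.util.List;

class JoinedSubclassSketch {
    // Builds a SELECT that joins every table on the path from the root
    // class to a concrete subclass, using the shared primary key column.
    static String selectById(List<String> tablesRootFirst, String pkColumn) {
        String root = tablesRootFirst.get(0);
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(root);
        for (int i = 1; i < tablesRootFirst.size(); i++) {
            String table = tablesRootFirst.get(i);
            sql.append(" JOIN ").append(table)
               .append(" ON ").append(table).append('.').append(pkColumn)
               .append(" = ").append(root).append('.').append(pkColumn);
        }
        sql.append(" WHERE ").append(root).append('.').append(pkColumn).append(" = ?");
        return sql.toString();
    }

    public static void main(String[] args) {
        // Loading an F from the hierarchy A <- C <- F needs two joins.
        System.out.println(selectById(Arrays.asList("A", "C", "F"), "id"));
    }
}
```

The number of joins grows with the depth of the hierarchy, which is the cost being weighed against the single-table layout.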


PostPosted: Fri Oct 28, 2005 3:04 am 
CGLIB Developer

Joined: Thu Aug 28, 2003 1:44 pm
Posts: 1217
Location: Vilnius, Lithuania
Performance is the only reason to denormalize a database; if you do not have problems with performance, then you do not need denormalization. "table-per-class-hierarchy" violates second normal form, and there is redundancy for this reason, but the redundancy is always "NULL", so denormalization is not a problem in this case. Normal forms are very good guidelines for database design, but denormalization is not wrong in exceptional cases like this one.


PostPosted: Mon Oct 31, 2005 1:12 pm 
Regular

Joined: Thu Oct 27, 2005 8:06 am
Posts: 55
Location: München, Germany
Just to keep an interesting discussion alive -- does table-per-hierarchy really violate 2nd normal form? When reading this the first time, I mentally agreed, because having all those null values for the subclass the current instance is not a member of in the same table intuitively looks denormalized. To be sure, I looked up the formal definition of 2NF, and I can't find the violation.

When my only candidate key is an artificial key, there is no way to violate 2NF at all. Even the most scrupulous normalizers wouldn't take issue with a table like
Code:
primary key person-id
string first-name not null
string middle-name
string last-name not null

Now, if some inheritance aficionado comes along and puts Java classes on top of that table like this:
Code:
class Person {
   long personId;
   String firstName;
   String lastName;
   // ...
}

class PersonWithMiddleName extends Person {
   String middleName;
   // ...
}

-- then the table suddenly gets a table-per-class-hierarchy implementation of the class hierarchy. Silly example, but you get the idea, and I can't see any normalization fault with this. Btw, even the necessary introduction of a discriminator into the table doesn't change this, as, with the real-world semantics of persons, person-id still remains the only candidate key.
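
The 2NF argument above can be checked mechanically. Under the usual textbook definition, a 2NF violation needs a non-key attribute that depends on a proper subset of a candidate key. The hypothetical helper below lists those subsets (ignoring the degenerate empty subset, which is an assumption of this sketch); for a single-column key such as person-id the list is empty, so there is nothing a partial dependency could hang on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SecondNormalFormSketch {
    // Lists all non-empty proper subsets of a candidate key. A partial
    // dependency (and so a 2NF violation) needs at least one of them.
    static List<Set<String>> properNonEmptySubsets(List<String> key) {
        List<Set<String>> subsets = new ArrayList<>();
        int n = key.size();
        // Bitmask 0 is the empty set and (1 << n) - 1 is the full key;
        // both are skipped.
        for (int mask = 1; mask < (1 << n) - 1; mask++) {
            Set<String> subset = new LinkedHashSet<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    subset.add(key.get(i));
                }
            }
            subsets.add(subset);
        }
        return subsets;
    }

    public static void main(String[] args) {
        // Single-column artificial key: no proper non-empty subsets.
        System.out.println(properNonEmptySubsets(Arrays.asList("person-id")));
        // A composite key (hypothetical) does have them.
        System.out.println(properNonEmptySubsets(Arrays.asList("order-id", "line-no")));
    }
}
```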


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.