Please give me a sanity check...

eagle79 · **Posted:** Fri Oct 21, 2005 2:48 pm

Folks, please give me a sanity check on a few things:

First some background:
We are building an application with what I would describe as a moderately large domain model. There are not many inheritance relationships, but most objects have at least one association with another object - in many cases a one-to-many relationship. The object model is rooted almost entirely in a single object -- Person. Each Person is associated with one or more Case objects. Each Case object contains a lot of data collected during a period of time when the Person is a consumer of our company. This data is represented both as simple properties and as collections of related entities (many of which, in turn, have their own associated entities). So the object graph for a Consumer is fairly deep.

Now for the questions:
1. The domain model was designed with a lot of unidirectional associations, where the parent has a reference to the child, but the child has no reference back to the parent. However, in some cases, the child has an Integer property representing the database ID of the parent. For example, something like this:

Code:

-------------        
| Person    |        -------------
|-----------|        | Address   |
| personId  |        |-----------|
| addresses | -----> | addressId |
| ...       |        | personId  | (Integer)
|-----------|        | ...       |
| ...       |        |-----------|
-------------        | ...       |
                     -------------

My opinion, though, was twofold:
- Most importantly, if an object has an ID reference to another object, that reference should be replaced with an actual object reference. More like this:

Code:

-------------        
| Person    |        -------------
|-----------|        | Address   |
| personId  |        |-----------|
| addresses | -----> | addressId |
| ...       | <----- | person    | (Person)
|-----------|        | ...       |
| ...       |        |-----------|
-------------        | ...       |
                     -------------

- Second, I feel that a bidirectional association is preferred in most cases to a unidirectional association unless the relationship doesn't model a real-world situation (an EBook object on an online bookstore would likely not have a reference to its owner, though the owner object (Customer?) might have a collection of owned books).

Am I wrong on these?

2. Because we are using a Session Facade of remote EJBs (SLSBs) to retrieve information for the web tier, we must have all objects initialized (fetched) before they are passed over the RMI layer (since they have to be Serialized). Because of this, our developers have been using eager fetching in the object mappings a great deal. But this has me concerned that getting a Person object would result in a load of ALL the information for that person that's in the database, even if the data we want is right there in the Person object.

My opinion is that for the most part, the object model should be set up for lazy-loading, and data to be sent back to the client should be retrieved using the Query or Criteria APIs, so that only the necessary data will be fetched and can be immediately sent to the remote client.

Yes? No?

Thanks in advance for your help.

pksiv · **Posted:** Fri Oct 21, 2005 2:57 pm

The simple answer is you are sane. Your assumptions seem correct.

For the first one, having an associated object instead of just the PK allows you to persist the associated objects as part of their parent with fewer database queries. If all you have is Address.personId on the Address object, you will need to insert the Person, then take the ID and update each Address object, then save each Address.

If, on the other hand, the Address object contains a Person reference, when the Person object is inserted, the Address knows the personId automatically - assuming the Person reference has been set on the Address object.
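
To make that concrete, here is a minimal plain-Java sketch of the bidirectional version of the Person/Address diagram. It is only an illustration under assumptions: the `addAddress` helper, the derived `getPersonId()` on Address, and the `AssociationDemo` class are hypothetical names, and the call to `setPersonId` merely simulates the ID being assigned at insert time. No Hibernate API is involved.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical plain-Java versions of the Person and Address classes from the diagrams.
class Person {
    private Integer personId;   // stays null until the row is inserted
    private final Set<Address> addresses = new LinkedHashSet<>();

    Integer getPersonId() { return personId; }
    void setPersonId(Integer personId) { this.personId = personId; }

    Set<Address> getAddresses() { return Collections.unmodifiableSet(addresses); }

    // Helper that keeps both sides of the association in sync.
    void addAddress(Address address) {
        addresses.add(address);
        address.setPerson(this);
    }
}

class Address {
    private Person person;   // object reference instead of an Integer personId

    Person getPerson() { return person; }
    void setPerson(Person person) { this.person = person; }

    // The foreign key value is derived from the reference, so it is never copied by hand.
    Integer getPersonId() { return person == null ? null : person.getPersonId(); }
}

class AssociationDemo {
    public static void main(String[] args) {
        Person person = new Person();
        Address home = new Address();
        person.addAddress(home);

        // Simulate the ID being assigned when the Person row is inserted.
        person.setPersonId(42);

        System.out.println(home.getPersonId()); // prints 42
    }
}
```

The point of the helper method is that the reference on the child is set in the same place the child is added to the parent's collection, which is the "assuming the Person reference has been set" condition.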

The remote EJB layer will require you to have things fully/properly initialized prior to returning your objects to the Web Tier. And I agree with you, as does just about everything I've ever read from the Hibernate guys, that the mapping files should remain lazy, and the fetching strategy, depth etc... should be done using the APIs.
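
As a rough illustration of the "lazy by default, initialize on demand" idea, here is a plain-Java sketch. It is not Hibernate's proxy mechanism: `LazyList`, `LazyDemo`, and the counting loader are hypothetical stand-ins, and the loader only simulates a database hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a lazily loaded collection; not Hibernate's own implementation.
class LazyList<T> {
    private final Supplier<List<T>> loader;
    private List<T> loaded;   // null until first access

    LazyList(Supplier<List<T>> loader) { this.loader = loader; }

    boolean isInitialized() { return loaded != null; }

    // Loads on first access, then reuses the already loaded list.
    List<T> get() {
        if (loaded == null) {
            loaded = new ArrayList<>(loader.get());
        }
        return loaded;
    }
}

class LazyDemo {
    public static void main(String[] args) {
        int[] loads = {0};
        LazyList<String> cases = new LazyList<>(() -> {
            loads[0]++;   // counts simulated database hits
            return Arrays.asList("case-1", "case-2");
        });

        System.out.println(cases.isInitialized()); // prints false

        // A facade method that needs the cases touches them before returning.
        cases.get();
        cases.get();
        System.out.println(cases.isInitialized() + " " + loads[0]); // prints true 1
    }
}
```

A facade method that does not need the cases simply never calls `get()`, so nothing is loaded for it; that is the saving over eager mappings.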

Hope this helps... Not that you really needed my help.

eagle79 · **Posted:** Fri Oct 21, 2005 8:40 pm

Thanks for the reply pksiv. Any opinion on this statement?

Quote:

...I feel that a bidirectional association is preferred in most cases to a unidirectional association unless the relationship doesn't model a real-world situation (an EBook object on an online bookstore would likely not have a reference to its owner, though the owner object (Customer?) might have a collection of owned books).

eagle79 · **Posted:** Fri Oct 21, 2005 8:41 pm

Sorry, what I meant to say there is that basically, you should expect to use a bidirectional association unless you can think of a reason not to have one.

eagle79 · **Posted:** Mon Oct 24, 2005 2:44 am

eagle79 wrote:

...I feel that a bidirectional association is preferred in most cases to a unidirectional association unless the relationship doesn't model a real-world situation (an EBook object on an online bookstore would likely not have a reference to its owner, though the owner object (Customer?) might have a collection of owned books).

Am I wrong on these?

As a partial response to my own question, I found these quotes in Hibernate in Action, pp. 232-233:

Quote:

Good uses for unidirectional one-to-many associations are uncommon in practice…

Quote:

There is an important issue to consider, which, in our experience, puzzles many Hibernate users at first. In a unidirectional one-to-many association, the foreign key column… in the [child table] must be nullable. [A child object] could be saved without knowing anything about [its parent]—it’s a stand-alone entity! This is a consistent model and mapping, and you might have to think about it twice if you deal with a not-null foreign key and a parent/child relationship. Using a bidirectional association (and a Set) is the correct solution.

I would still like a more clear-cut opinion from the experts out there on when unidirectional vs. bidirectional associations are appropriate, though. What I have here tells me that in a one-to-many, a bidirectional association should be used unless the child can legitimately exist without an associated parent. What about other situations? one-to-one? many-to-one? Are there other times when a one-to-many should be unidirectional?

snpesnpe · **Joined:** Sat Jun 12, 2004 4:49 pm **Posts:** 915

one-to-many isn't so simple - what if the many side has > 1000000 rows?

eagle79 · **Posted:** Mon Oct 24, 2005 10:38 am

snpesnpe wrote:

one-to-many isn't so simple - what if the many side has > 1000000 rows?

How would that affect the directionality of the association?

baliukas · **Posted:** Mon Oct 24, 2005 11:04 am

eagle79 wrote:

Thanks for the reply pksiv. Any opinion on this statement?

Quote:

...I feel that a bidirectional association is preferred in most cases to a unidirectional association unless the relationship doesn't model a real-world situation (an EBook object on an online bookstore would likely not have a reference to its owner, though the owner object (Customer?) might have a collection of owned books).

You can make the "parent" property private if you want to "hide" it (Hibernate can access private properties). "Real-world" models always have a foreign key.

snpesnpe · **Joined:** Sat Jun 12, 2004 4:49 pm **Posts:** 915

Quote:

How would that affect the directionality of the association?

You get 'Out of memory', and you can't resolve it except by removing the association.

eagle79 · **Posted:** Mon Oct 24, 2005 12:57 pm

snpesnpe wrote:

Quote:

How would that affect the directionality of the association?

You get 'Out of memory', and you can't resolve it except by removing the association.

I don't follow.

If a Person has a collection of contacts (10000 of them), but the Contact doesn't have an object reference back to the Person, getting a person and his contacts would involve loading 1+10000 objects.

If a Person has a collection of contacts (10000 of them), and the Contact does have an object reference back to the Person, getting a person and his contacts would involve loading 1+10000 objects.

The only difference as far as memory is concerned is that in the unidirectional case, each Contact would take slightly less memory (32 bits or so?) because there is no additional object pointer from the Contact to the Person.

snpesnpe · **Joined:** Sat Jun 12, 2004 4:49 pm **Posts:** 915

Quote:

I don't follow.

If a Person has a collection of contacts (10000 of them), but the Contact doesn't have an object reference back to the Person, getting a person and his contacts would involve loading 1+10000 objects.

If a Person has a collection of contacts (10000 of them), and the Contact does have an object reference back to the Person, getting a person and his contacts would involve loading 1+10000 objects.

The only difference as far as memory is concerned is that in the unidirectional case, each Contact would take slightly less memory (32 bits or so?) because there is no additional object pointer from the Contact to the Person.

You don't understand me.

Example: a customer has 1000000 orders (millions, not thousands), and if you have a one-to-many relation to orders, you get 'out of memory' when the collection is loaded.

I never know how many rows are in a collection, and it could be a billion.

The relation orders -> customer is many-to-one, and I know that there is truly ONE customer per order.

My claim: every one-to-many is a potential 'Out of memory'.

eagle79 · **Posted:** Mon Oct 24, 2005 3:17 pm

I see... the conjecture here is that sometimes it is appropriate for the child to have a reference to the parent, but the parent to not have a collection of its children. This is a good point.

Usually for our domain model, however, we have the opposite: the parent has a collection of its children, while the children (currently) do not have a reference to their parent. Also, such large collections are not common for us, so it's a moot point.

However, from a theoretical standpoint, this is an important point to make. Thank you.

eagle79 · **Posted:** Mon Oct 24, 2005 3:26 pm

eagle79 wrote:

...the conjecture here is that sometimes it is appropriate for the child to have a reference to the parent, but the parent to not have a collection of its children...

As a side note, in such a situation, this would make it impossible to access all the children for the parent from the parent (it could of course be done through HQL), so it is something I would want to do only as a last resort:

Code:

// This won't compile: with no collection mapped, there is no Customer.getOrders() method
Customer customer = //get customer

Iterator iter = customer.getOrders().iterator();
while (iter.hasNext()) {
...

// We have to query for the orders instead, going through Order's reference to its Customer:
Query query = session.createQuery(
   "from Order o where o.customer.id = :id");
query.setParameter("id", customer.getId()); // assumes Customer exposes getId()

List orders = query.list();

Iterator orderIter = orders.iterator();
while (orderIter.hasNext()) {
...

pksiv · **Posted:** Mon Oct 24, 2005 3:29 pm

snpesnpe wrote:

My claim: every one-to-many is a potential 'Out of memory'.

I would disagree with this statement.

If you understand the business behind the association, I think it would be quite easy in most cases to determine which <one-to-many>'s have this OutOfMemory potential and which ones don't.

For example, I can be quite certain that a one-to-many from Person->Address in a Contacts system, or State->County, will never reach an unreasonable size.

snpesnpe · **Joined:** Sat Jun 12, 2004 4:49 pm **Posts:** 915

It isn't just a theoretical standpoint; it is a real case.

A customer has many orders, orders have items, and bidirectional relations really can cause 'out of memory' in an ERP application. You can control this in a web application (shopping type), but it is hard in a general application: the application works fine in the test environment and crashes in production.

The solution would be for Hibernate to have 'cursored' collections (the way a real database has cursor queries), so that only xxx rows (the fetch size) are in memory and the rest stay in the database. When the user asks for row xxx+1, Hibernate would load that row from the database and evict a row from memory.

I simulate this in my application for every query using scrollable results, and I have no one-to-many associations. My query loads xxx rows with a scrollable result (and computes the size of the table), so only xxx rows are in memory. When I need a one-to-many, I simulate it with a new HQL query (cursored again).

The problem with this is read consistency when I load rows from the database in different sessions, but that is a smaller problem than 'out of memory', and it only applies to CRUD-type applications. For true queries I do everything in one session and the database provides read consistency within the scrollable result (Hibernate's setMaxResults/setFirstResult always have the read-consistency problem).

MySQL (and PostgreSQL until the latest version) have no cursors and load all query rows into memory; they don't use the fetch size. For PostgreSQL < 8, a JDBC statement throws an out-of-memory exception on the query 'select * from table' for any table with many rows.

Oracle uses cursors automatically and loads only fetch-size rows at a time. You can have a table with billions of rows and iterate with ResultSet.next() without a problem (as long as you don't keep your own cache of the rows).
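
The "only xxx rows in memory" idea can be sketched in plain Java as a windowed iterator. This is only an illustration under assumptions: `WindowedRows`, `WindowDemo`, and `loadPage` are hypothetical names, the "table" is simulated in memory, and the loader is offset-based, so it resembles setFirstResult/setMaxResults paging rather than a true server-side cursor or scrollable result. With a real database, the read-consistency caveat mentioned above would apply to it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

// Hypothetical windowed reader: holds at most fetchSize rows in memory at a time.
class WindowedRows<T> implements Iterator<T> {
    private final BiFunction<Integer, Integer, List<T>> pageLoader; // (offset, limit) -> rows
    private final int fetchSize;
    private List<T> window = new ArrayList<>();
    private int posInWindow = 0;
    private int nextOffset = 0;
    private boolean exhausted = false;

    WindowedRows(BiFunction<Integer, Integer, List<T>> pageLoader, int fetchSize) {
        this.pageLoader = pageLoader;
        this.fetchSize = fetchSize;
    }

    @Override
    public boolean hasNext() {
        if (posInWindow < window.size()) return true;
        if (exhausted) return false;
        // Replace the previous window, so the old rows become garbage.
        window = pageLoader.apply(nextOffset, fetchSize);
        nextOffset += window.size();
        posInWindow = 0;
        if (window.size() < fetchSize) exhausted = true; // a short page means no more rows
        return !window.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return window.get(posInWindow++);
    }
}

class WindowDemo {
    // Simulated table of `total` rows; a real loader would run a query here.
    static List<Integer> loadPage(int total, int offset, int limit) {
        List<Integer> rows = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + limit, total); i++) rows.add(i);
        return rows;
    }

    public static void main(String[] args) {
        WindowedRows<Integer> rows =
            new WindowedRows<>((off, lim) -> loadPage(10, off, lim), 4);
        int count = 0;
        while (rows.hasNext()) { rows.next(); count++; }
        System.out.println(count); // prints 10
    }
}
```

Memory use is bounded by the window size rather than by the size of the collection, which is the property being asked for here.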