Time to give this a bump as I am faced with the same type of temporal problem and so far praying for Divine Intervention for a solution hasn't worked too well.
I'm in a situation where multiple tables within my object model are date effective, and the end user has the ability to go back in time in order to issue amendments against this time-sensitive data.
So imagine a fairly basic data graph where A -< B -< C.
If only C were date effective, I think the scenario would be pretty basic: I'd have multiple physical Cs and could use Hibernate 3's filter functionality to pare down my results.
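To make that concrete, this is roughly how I picture the filter side working (the filter and parameter names are just made up for illustration):

[code]
// Assuming C's mapping defines a filter along the lines of:
//   <filter-def name="effectiveDate"><filter-param name="asOfDate" type="date"/></filter-def>
//   <filter name="effectiveDate" condition=":asOfDate between VALID_FROM and VALID_TO"/>
Session session = sessionFactory.openSession();
session.enableFilter("effectiveDate")
       .setParameter("asOfDate", asOfDate);
List cs = session.createQuery("from C").list();   // only the Cs valid on asOfDate
[/code]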
Updating the data becomes a bit tricky, however. To stay away from composite keys I would resort to some surrogate key mechanism: each C would have its own meaningless ID and be bound by an effective date range. This is where it gets sticky in my head. In actuality, each physical C is a version of a logical C. By logical I mean that C represents a unique entity, say an address of 123 Main St., Anywhere USA. When I introduce time effectiveness and change some property, such as whether the address is a business or a residence, I imagine I would create a new C (with a new ID) bound to a new date range. In essence I now have two records that represent the same address.
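Roughly what I have in mind for the surrogate-key version rows, with made-up names (the createdOn column comes into play for option b) further down):

[code]
// One physical row per version of a logical C; all names are illustrative only.
public class AddressVersion {
    private Long id;                 // meaningless surrogate key, unique per version
    private Long logicalAddressId;   // identifies the logical C, e.g. "123 Main St."
    private Date validFrom;
    private Date validTo;
    private Date createdOn;          // when this version was entered
    private String usage;            // e.g. "BUSINESS" vs "RESIDENCE"
    // getters/setters omitted
}
[/code]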
Everything is still fine as long as the two Cs do not have overlapping date effectiveness. However, I may run into a situation where I say a record is effective from Jan 1, 2004 to Dec 31, 2004, and at some later date I say that from Feb 1, 2004 to Feb 15, 2004 there was some change to the record. I have a few possibilities at this point.
a) I try to time-slice the existing record, terminating it at Jan 31, then turn around and create a new record from Feb 1 to Feb 15, and finally dump in yet another record from Feb 16 to Dec 31 that mirrors the Jan 1 one. If you've followed that then you're probably saying, ick :)
b) Dump in a new record from Feb 1 to Feb 15 with an additional create/update timestamp, and then whenever I encounter two (or more) overlapping records, take the most recent.
I lean towards b) as I think it involves less hocus pocus when modifying records.
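In code I imagine b) boiling down to something like this (again, made-up names, and completely untested):

[code]
// Among the versions whose date range covers asOf, take the most recently created one.
public AddressVersion resolve(List<AddressVersion> versions, Date asOf) {
    AddressVersion winner = null;
    for (AddressVersion v : versions) {
        boolean covers = !asOf.before(v.getValidFrom()) && !asOf.after(v.getValidTo());
        if (covers && (winner == null || v.getCreatedOn().after(winner.getCreatedOn()))) {
            winner = v;
        }
    }
    return winner;
}
[/code]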
However, I believe it becomes much more tangled if my 'B' is also time effective. If I simply create another B whenever I attempt to update one, then I really need to copy/clone/insert-your-favourite-term any associated records that are also valid for that particular timeframe. I.e. B has a collection of Cs, and each C has an FK reference back to B, so if I am inserting a fresh B then I also need to insert fresh Cs that reference the newly inserted B.
In my case this has the potential to affect A LOT of records. However, the functionality I am trying to implement should not be used that often, so I am willing to take the hit of creating this extra batch of data.
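The brute-force version of that copy step looks something like this in my head (hand-rolled, names made up):

[code]
// Build a detached copy of B and its Cs with null ids, so that saving the copy
// results in INSERTs for the whole graph rather than UPDATEs.
public B newVersionOf(B current, Date validFrom, Date validTo) {
    B copy = new B();
    copy.setLogicalId(current.getLogicalId());
    copy.setValidFrom(validFrom);
    copy.setValidTo(validTo);
    for (C c : current.getCs()) {
        C cCopy = new C();
        cCopy.setLogicalId(c.getLogicalId());
        cCopy.setValidFrom(validFrom);
        cCopy.setValidTo(validTo);
        cCopy.setB(copy);            // FK reference back to the freshly created B
        copy.getCs().add(cCopy);
    }
    return copy;
}
[/code]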
If I were manipulating the C objects directly it might not be that bad; however, with cascades in the picture I am trying to find a slick method of flipping my UPDATEs to INSERTs when I send the objects off to be persisted. I thought of using unsaved-value='any', but that doesn't seem to work, for a few reasons:
-I do have some composite keys in my system, so I use an interceptor that calls an isSaved() method on my objects. To get as far as the unsaved-value check I have to return null from the interceptor, so that would require some manipulation to work (roughly the arrangement sketched below).
-However, if I return null from the interceptor and rely on unsaved-value='any', then during a cascade delete Hibernate thinks I have a transient object and rolls over.
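For reference, the interceptor arrangement I'm talking about is roughly this (isSaved() and the Persistable interface are my own, not anything Hibernate provides):

[code]
import org.hibernate.EmptyInterceptor;

public class SavedStateInterceptor extends EmptyInterceptor {
    public Boolean isTransient(Object entity) {
        if (entity instanceof Persistable) {              // my own base interface
            if (((Persistable) entity).isSaved()) {
                return Boolean.FALSE;                     // definitely persistent
            }
            // Returning null is what lets Hibernate fall through to the
            // unsaved-value setting in the mapping.
            return null;
        }
        return null;
    }
}
[/code]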
So, in essence, for a Hibernate solution I am looking for a 'good' way to implement this INSERT-rather-than-UPDATE behaviour for my time-sensitive objects.
I think I can handle the selects, etc. via the filtering mechanism (I hope); I'm just having huge difficulty getting my head wrapped around actually maintaining the data and keeping it sane with the cascading. Worst case, I guess I would turn cascading off and handle it manually, but that thought makes me cringe.
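To put the cascading problem in concrete terms, what I'd like to end up with is something like this (using the copy helper sketched above; totally untested):

[code]
B loaded = (B) session.get(B.class, bId);
B fresh = newVersionOf(loaded, validFrom, validTo);   // detached copies with null ids
session.evict(loaded);      // make sure nothing on the original gets flushed as an UPDATE
session.save(fresh);        // null ids + cascade="save-update" => INSERTs for B and its Cs
session.flush();
[/code]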
My 'other' option is to use Oracle's Workspace Manager Valid Time support. We've already opted to use the Workspace Manager in order to provide a working sandbox for the user to manipulate the data in a 'safe' environment, verify their changes, and then merge the data back in.
For those not at all familiar with it, the Valid Time support essentially slices up your data based on VALID_FROM/VALID_TO columns that it adds to your tables and manages internally. When you open a connection you can specify a valid time range and you will only see data that meets that criterion. It also handles the slicing and dicing of the records and converts the UPDATEs to INSERTs.
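My understanding of the mechanics (from the docs, not from working code) is that you set the valid-time window on the JDBC connection via DBMS_WM.SetValidTime before Hibernate touches it; something along these lines:

[code]
Connection con = session.connection();   // grab the underlying JDBC connection
CallableStatement call = con.prepareCall("{ call DBMS_WM.SetValidTime(?, ?) }");
call.setTimestamp(1, new java.sql.Timestamp(validFrom.getTime()));
call.setTimestamp(2, new java.sql.Timestamp(validTo.getTime()));
call.execute();
call.close();
[/code]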
In the beginning this looked like the silver bullet for my particular problem. However, what it does is take what you've defined as your PK and wrap it up in its own composite key, so in reality you can get multiple records for what you think is your PK. This is all fine as long as you don't get multiple records within the same time period.
I am thinking that if I had a many-to-one defined that tried to load the 'B' with ID 5 and found two of them, chaos would ensue.
Maybe I am overcomplicating this, but if I am, someone please give me a slap and a nudge in the right direction, will you? I'm thinking the solution that doesn't involve Oracle's valid time support would be cleaner in the sense that it doesn't tie us further to Oracle, but in reality it doesn't matter as long as we can get it to work.
I'd love to hear any suggestions or experiences with any aspect of this. I need a fresh perspective and hopefully I can find some way to implement this relatively cleanly.