
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 3 posts ] 
 Post subject: Incremental development cycle and seed data (long)
PostPosted: Tue Jun 22, 2004 5:33 pm 
Beginner

Joined: Wed Jan 28, 2004 3:42 pm
Posts: 36
Hi All,

Apologies if this is the wrong forum for this kind of question.

Each developer in our team has his own Oracle schema. We have a bunch of seed data that is constantly expanding as we make progress in the project with new data from our clients or as required test data is added. This seed data is mostly reference data.

Currently this data is in CSV format. The nature of the project and development cycle means the databases (schemas really, but for simplicity's sake) need to be reloaded frequently as data is added and as the db schema (mapping files) is modified and developed, as much as 5 or more times a day. It's been decided to keep the data in CSV format so that it can be modified with relative ease in a spreadsheet program when the data model changes. This also allows our clients to maintain some of the data or give us additional data.

Currently data is imported with a loader I wrote that uses Hibernate to persist these CSV files. It's a bit messy right now, and the order of population & relationships needs to be resolved programmatically, but a new object/table can be added to the infrastructure fairly quickly. Nothing too complicated. It's implemented as a unit test and runs from ant. So the workflow is something like this:

ant schemaexport (creates the new data model)
<manipulate csv data if there was a data model change that requires it>
ant testallrefence (loads the csv files and persists them)

[If only Hibernate could make schema changes without destroying everything first :)] So it's a fairly quick and lightweight process. Inherited objects are represented in their own file with all the properties (including those from the parent object).
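For the curious, the "order of population" bit boils down to a topological sort over the table dependencies: a table can only be loaded after everything it references. A minimal sketch (the table names and dependency map here are made up for illustration; the real ones would come from the mapping metadata):

```java
import java.util.*;

public class LoadOrder {
    // Table -> tables it references. These names are hypothetical examples.
    static final Map<String, List<String>> DEPS = new LinkedHashMap<>();
    static {
        DEPS.put("COUNTRY", List.of());
        DEPS.put("REGION", List.of("COUNTRY"));
        DEPS.put("CLIENT", List.of("REGION", "COUNTRY"));
    }

    // Depth-first topological sort: referenced tables are emitted first,
    // so each table is loaded only after its dependencies.
    static List<String> order() {
        List<String> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String t : DEPS.keySet()) visit(t, seen, out);
        return out;
    }

    static void visit(String t, Set<String> seen, List<String> out) {
        if (!seen.add(t)) return; // already placed
        for (String dep : DEPS.getOrDefault(t, List.of())) visit(dep, seen, out);
        out.add(t);
    }

    public static void main(String[] args) {
        System.out.println(order()); // [COUNTRY, REGION, CLIENT]
    }
}
```

(This assumes no circular references between tables; a cycle would need to be broken by deferring the foreign key update.)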

Now, exporting the hierarchical object model to a "flat" CSV view is a little trickier. I got the bulk of it working using Hibernate for db access but had to divert attention elsewhere. In the meantime we needed something quick to export the db to CSV, so I wrote a quick tool using JDBC that just dumps the flat view into a bunch of CSV files. I also explored other options for loading the data:

1. Oracle's SQL*Loader, which requires control files too; it's not too hard to generate them along with the CSV files while you still have access to the metadata.
2. Dump a bunch of INSERT statements into text files so that the values can be formatted correctly (for example, use TO_DATE for date fields, enclose varchar2 fields in quotes, etc.).
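To illustrate option 2, here's roughly what the per-row statement generation looks like. This is a simplified sketch with hypothetical table/column names and a hard-coded date format; a real version would pull the types from the metadata:

```java
import java.util.*;

public class InsertWriter {
    // Column types that need special formatting (hypothetical schema metadata).
    enum Type { VARCHAR2, NUMBER, DATE }

    // Render one CSV row as an Oracle INSERT statement: strings are quoted
    // (with embedded quotes doubled) and dates are wrapped in TO_DATE so the
    // resulting script can be run as-is in SQL*Plus.
    static String insert(String table, String[] cols, Type[] types, String[] row) {
        StringBuilder sb = new StringBuilder();
        sb.append("INSERT INTO ").append(table)
          .append(" (").append(String.join(", ", cols)).append(") VALUES (");
        for (int i = 0; i < row.length; i++) {
            if (i > 0) sb.append(", ");
            if (types[i] == Type.VARCHAR2) {
                sb.append("'").append(row[i].replace("'", "''")).append("'");
            } else if (types[i] == Type.DATE) {
                sb.append("TO_DATE('").append(row[i]).append("', 'YYYY-MM-DD')");
            } else {
                sb.append(row[i]); // numbers pass through unquoted
            }
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println(insert("CLIENT",
                new String[] { "ID", "NAME", "SIGNED_ON" },
                new Type[] { Type.NUMBER, Type.VARCHAR2, Type.DATE },
                new String[] { "1", "O'Reilly", "2004-06-22" }));
        // INSERT INTO CLIENT (ID, NAME, SIGNED_ON) VALUES (1, 'O''Reilly', TO_DATE('2004-06-22', 'YYYY-MM-DD'))
    }
}
```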

These solutions are not optimal either: coordinating the execution of the insert statements, or calling sqlldr for each table, in the correct order(!) will require some additional development, especially to make it feasible cross-platform, as the bulk of the developers are on Windows.

So a third solution is to convert the JDBC CSV dump into CSV files usable by the Hibernate CSV loader, specifically dealing with the inherited classes. That should be feasible.
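The fiddly part for the inherited classes is joining the parent-table row and the subclass-table row back into the single flat record the loader expects (one file per concrete class, with all properties including the parent's). A toy sketch of that merge, with made-up column names:

```java
import java.util.*;

public class Flatten {
    // Merge a subclass row with its parent-table row (already joined on ID)
    // into the single flat record the CSV loader expects. Parent columns come
    // first; on overlap (e.g. the shared ID) the subclass value wins.
    static Map<String, String> flatten(Map<String, String> parent, Map<String, String> child) {
        Map<String, String> flat = new LinkedHashMap<>(parent);
        flat.putAll(child);
        return flat;
    }

    public static void main(String[] args) {
        Map<String, String> person = new LinkedHashMap<>();
        person.put("ID", "7");
        person.put("NAME", "Peter");
        Map<String, String> employee = new LinkedHashMap<>();
        employee.put("ID", "7");
        employee.put("DEPT", "IT");
        System.out.println(flatten(person, employee)); // {ID=7, NAME=Peter, DEPT=IT}
    }
}
```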

Of course at this point, I have to wonder: what is everybody else doing? :)

Any comments, insights, suggestions welcome!

Thanks,
Peter


 Post subject:
PostPosted: Tue Jun 22, 2004 5:42 pm 
Beginner

Joined: Wed Jan 28, 2004 3:42 pm
Posts: 36
Oh, also: most ETL tools look like complete overkill? Or is this the way to go? Oracle's Warehouse Builder looks like another fine horror from Redwood Shores. Virtually impossible to install on Fedora, as usual. Currently messing around with the Windows version.


 Post subject:
PostPosted: Wed Jun 23, 2004 10:30 am 
Hibernate Team

Joined: Tue Aug 26, 2003 3:00 pm
Posts: 1816
Location: Austin, TX
If you have access to SQLServer, it has a really nice import/export tool called DTS (Data Transformation Services). It can take data in a large number of formats and storages, optionally run transformations on it, and put it in an equal number of formats/storages. It can deal with CSV files and Oracle databases, and the whole transformation process can be saved off, kind of like a script, and rerun later. I used to use this in a previous job where we supported both SQLServer and Oracle, and had customers sending us data in CSV and Access formats. I used this DTS tool to keep everything in sync in our Oracle and SQLServer dev and test dbs.




© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.