We are faced with the task of migrating data from a DB2/AS400 database to an Oracle 10g database. We are talking about 7 tables in the DB2 environment that have to be replicated in the Oracle environment.
The issue is that on the Oracle side we are generating our IDs with sequences, which was not the case on the DB2 side. Once all data is migrated over to the Oracle database, all relations between the tables have to be maintained. On the DB2 side this was done with composite keys and keys that carried too much business meaning, which is exactly what we wanted to avoid on the Oracle side.
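For reference, the Oracle-side entities get their surrogate keys from an Oracle sequence through Hibernate. A minimal sketch of what such a mapping could look like with annotations is below; the names are purely illustrative, and an equivalent hbm.xml mapping with a sequence generator works the same way:

    import javax.persistence.*;

    @Entity
    @Table(name = "ORA_POLICY")
    public class OraPolicy {

        // Surrogate key populated from an Oracle sequence by Hibernate at save time
        @Id
        @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "policySeq")
        @SequenceGenerator(name = "policySeq", sequenceName = "POLICY_SEQ")
        private Long id;

        // ... business columns; the key itself carries no business meaning
    }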
So when tasked with this enormous project, we had very little time and very little expertise in the field of migrating large amounts of data from one database to another. Having used Hibernate in a few projects in the past, I came to the conclusion that we could accomplish this project with Hibernate in a fairly easy way. Although there would probably be a performance penalty, we were attracted to the fact that we could load a fully initialized object graph from the DB2 side, compose our objects on the Oracle side, call save on the session, and Hibernate would handle all of the key generation and the relations within our object graph.
Our first crack at this is very simple. We have a data loader class that uses a Hibernate ScrollableResults and fetches 20 fully initialized objects at a time. It returns this list of 20 objects to another class, which in turn creates a separate thread to compose the Oracle objects from the DB2 objects and save them to the Oracle database.
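To give a rough idea of the overall flow, the driving client looks more or less like the simplified sketch below (class and method names here are illustrative, not our exact code):

    // Illustrative driver: pull batches from DB2 and hand each batch to a saver thread
    PolicyDataLoader loader = new PolicyDataLoader();   // wraps fetchNextBatch()
    List batch;
    while ((batch = loader.fetchNextBatch()) != null) {
        PolicySaver saver = new PolicySaver(batch);     // Runnable shown further down
        new Thread(saver).start();
    }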
This data loader class has one significant method:
public List fetchNextBatch() throws Exception {
    List result = new ArrayList();
    int count = 0;
    // Lazily open the DB2 session and the scrollable cursor on the first call
    if (!resultInitialized()) {
        BaseDAO base = (BaseDAO) as400MigrationDao;
        session = base.getSessionFactory().openSession();
        Query q = session.createQuery(BaseDAO.POLICY);
        results = q.scroll();
        started = true;
    }
    while (results.next()) {
        Policy as400Pol = (Policy) results.get(0);
        result.add(as400Pol);
        count++;
        // Every BATCHSIZE rows (or at the end of the cursor), flush/clear the session and return the batch
        if (count == BATCHSIZE || results.isLast()) {
            session.flush();
            session.clear();
            return result;
        }
    }
    return null;
}
As you can see, every BATCHSIZE rows the session gets flushed and cleared and the list of results is returned to the calling client. This client then calls a saver class, which is responsible for composing the Oracle objects and saving them in a separate thread:
public void run() {
    List oraPolicies = new ArrayList();
    Policy as400Pol = null;
    try {
        // Compose an Oracle policy from each DB2 policy in this batch
        for (int i = 0; i < policies.size(); i++) {
            as400Pol = (Policy) policies.get(i);
            OraPolicy oraPolicy = new OraPolicy();
            getImportService().composePolicy(oraPolicy, as400Pol);
            oraPolicies.add(oraPolicy);
        }
        oraDao.savePolicies(oraPolicies);
        // Null out the policies in case they are not being garbage collected
        this.policies = null;
    }
    catch (RuntimeException rex) {
        MigrationUtils.handlePolicySaveExceptions(as400Pol.getReferenceNumber(), as400Pol.getPolicyNumber(), rex);
    }
    catch (Exception ex) {
        MigrationUtils.handlePolicySaveExceptions(as400Pol.getReferenceNumber(), as400Pol.getPolicyNumber(), ex);
    }
}
In the above example, policies is the List of policies obtained from the DB2 database. We loop through those policies, and for each policy we build an Oracle policy, add it to a different List and then call a DAO which saves all the newly composed Oracle policies.
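The DAO's savePolicies method is not shown above; it is presumably nothing more than a loop that saves each composed policy through a second, Oracle-side session. A minimal sketch of the idea, assuming a plain Hibernate DAO (the session and transaction handling here is illustrative, not our exact code):

    public void savePolicies(List oraPolicies) {
        Session oraSession = getSessionFactory().openSession();
        Transaction tx = oraSession.beginTransaction();
        try {
            for (int i = 0; i < oraPolicies.size(); i++) {
                // Hibernate assigns the sequence-generated key and cascades to related objects
                oraSession.save(oraPolicies.get(i));
            }
            tx.commit();
        }
        catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
        finally {
            oraSession.close();
        }
    }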
In a nutshell, this is our process. We have recently encountered an out-of-memory error when processing more than 40,000 policies. As you will notice, we are flushing and clearing the session on the loading side. So if anyone can give me any pointers on where else to look to identify the memory problem, I would appreciate it.
I understand that an ORM tool might not be the right solution in terms of performance for this type of project, but we are not so much worried about performance as we are about getting the data over to the Oracle database with all dependencies maintained and with keys that do not carry any business meaning. We are especially pleased with how we can load a graph of objects from the DB2 side, recreate the different objects on the Oracle side, and save them relatively easily with tools like BeanUtils and a little bit of hand coding at times. So we are getting about 90% of what we need from Hibernate; we just need to figure out the memory issue and we would be on our way.
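For the curious, the composition step mentioned above mostly boils down to property copying plus some hand-written mapping. A rough sketch of the idea, assuming Commons BeanUtils (the setter shown is an illustrative field name, not our actual model):

    import org.apache.commons.beanutils.BeanUtils;

    public void composePolicy(OraPolicy oraPolicy, Policy as400Pol) throws Exception {
        // Copy the properties that share names and types between the two models
        BeanUtils.copyProperties(oraPolicy, as400Pol);
        // Hand-coded pieces where the models differ, e.g. carrying the old reference
        // along as a plain attribute instead of a business key (illustrative name)
        oraPolicy.setLegacyReference(as400Pol.getReferenceNumber());
    }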
I apologize for the long post and for not following the posting guidelines, but I didn't see it as necessary given the nature of my post.
--Thanks in advance,
Juan