All times are UTC - 5 hours [ DST ]



 Post subject: Using NHibernate for long running process
PostPosted: Fri May 25, 2007 5:34 am 
Beginner

Joined: Wed Nov 29, 2006 12:23 pm
Posts: 42
I've been working on a system with a large and complex entity model (using 3 session factories/databases etc.) that is used with a web application. I have now reached a point where I need to implement a number of "long running" batch processes, and am wondering what the best approach is. These batch processes involve iterating over a lot of records and traversing many entity associations while producing an output: one produces letters to be printed, another carries out calculations to produce invoices and invoice lines, and there are others as well.

Each individual action as part of the batch (i.e. each iteration) is not too long, but the entire batch process could be - up to a number of minutes. I'm concerned about locking of tables impacting on web users.

The entire process doesn't need to be transactional (although in the case of invoicing, this would be nice).

Is the best approach to create a session at the start of the batch process and then put each iteration of the loop into its own transaction, beginning and committing the transaction at the start and end of each iteration, using the same session for the whole batch? Or should I also open and close the session at the start and end of each iteration? (This is based on what I've read on page 154 of the NHibernate manual, quoted below.)

Any advice would be appreciated.

2 of the 3 batch processes I'm writing could be contained to overnight processing, where locking is not such an issue, but the third runs numerous times throughout the day, and could impact web users.

Would another option be detached objects? This would not be preferable, given the potential size of the data needed to process the entire batch.

Quote:
In a two tiered architecture, consider using session disconnection.
Database Transactions have to be as short as possible for best scalability. However, it is often necessary to
implement long running Application Transactions, a single unit-of-work from the point of view of a user.
This Application Transaction might span several client requests and response cycles. Either use Detached
Objects or, in two tiered architectures, simply disconnect the NHibernate Session from the ADO.NET connection
and reconnect it for each subsequent request. Never use a single Session for more than one Application
Transaction usecase, otherwise, you will run into stale data.


PostPosted: Sat May 26, 2007 10:32 am 
Beginner

Joined: Wed Nov 29, 2006 12:23 pm
Posts: 42
anyone? any tips?


PostPosted: Tue May 29, 2007 1:41 pm 
Expert

Joined: Fri May 13, 2005 11:13 am
Posts: 292
Location: Rochester, NY
Your post is very open ended, and doesn't seem to lend itself to a quick answer.

I would not suggest using a single session for the whole batch. Session efficiency tends to degrade with size, because of dirty-object checking and God-knows-what-else that goes on inside there.

I use NH for some batch processes. First, I'll generate a single session and then "pilfer" its connection for the rest of the sessions (there is probably a way to do this without generating the session, but it works). From that connection I'll build a Session for the initial load (might be 1000s of graphed objects, so that query needs tuning for production). Then I close the session.

In the processing loop, I generate a session per iteration (using the same connection!). This may require first reattaching the objects of interest for that iteration to the new session by locking them. This usually involves only tens of objects (or fewer), so the load on these sessions is much less than if one session were used for the whole process. This also fits well with our transactional needs.
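
To make the dirty-checking point concrete, here is a toy model in plain Python. It is not NHibernate code: `ToySession` is a hypothetical class, and the sketch assumes that every commit flushes the session and that a flush compares every tracked object against a snapshot taken at load time. Under those assumptions it counts how many comparisons each approach performs for a batch of 1000 records.

```python
class ToySession:
    """Toy unit of work (hypothetical, not NHibernate): tracks loaded
    objects and snapshot-compares all of them on every flush."""

    def __init__(self):
        self._tracked = []  # list of (object, snapshot) pairs
        self.checks = 0     # dirty checks performed so far

    def load(self, record):
        obj = dict(record)
        self._tracked.append((obj, dict(obj)))
        return obj

    def flush(self):
        """Compare every tracked object with its snapshot; return the dirty ones."""
        dirty = []
        for i, (obj, snapshot) in enumerate(self._tracked):
            self.checks += 1
            if obj != snapshot:
                dirty.append(obj)
                self._tracked[i] = (obj, dict(obj))  # refresh the snapshot
        return dirty


def one_session_for_whole_batch(records):
    """One long-lived session, flushed at the end of every iteration."""
    session = ToySession()
    for record in records:
        obj = session.load(record)
        obj["processed"] = True
        session.flush()
    return session.checks


def session_per_iteration(records):
    """A fresh session for every iteration, discarded afterwards."""
    total = 0
    for record in records:
        session = ToySession()
        obj = session.load(record)
        obj["processed"] = True
        session.flush()
        total += session.checks
    return total


records = [{"id": i, "processed": False} for i in range(1000)]
print(one_session_for_whole_batch(records))  # grows quadratically with batch size
print(session_per_iteration(records))        # grows linearly with batch size
```

In this model the single session re-checks everything it has ever loaded on each flush, while the per-iteration sessions only ever check the one object they loaded. A real session does more than this, so treat the counts as an illustration of the trend rather than a measurement.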

YMMV. This also may be a monumentally stupid way of going about this, but at least I'm starting a conversation :)


PostPosted: Tue May 29, 2007 2:10 pm 
Beginner

Joined: Wed Nov 29, 2006 12:23 pm
Posts: 42
Thanks very much for the reply. I'm glad you decided to start the conversation, as I was starting to lose hope of a reply. Sorry the initial post was a bit open ended, but I wasn't too sure what to ask. I've created the whole site using a particular architecture, and then wrote the batch processes using some similar concepts, and realised that it may be totally inappropriate. When I started looking around for recommendations for batch processing, apart from the couple of short points in the pdf doc, I was unable to find much. I don't have a great deal of time on this project at this stage to allow for a lot of testing different ideas, or for performance testing, so thought I'd see how others were doing it.

My main concern is how running the batch may affect other live users. The batch process won't be very long running (< 5 mins), and only runs 5 times throughout the day. The process itself actually runs on a grunty box, so the overall performance of the batch job is not a huge issue (although I'd like to optimise it as much as is easily possible). I guess I'm more concerned about locks etc. being taken in the database while the process is running, preventing web users from working.

I see what you mean about the session growing during the execution of the batch process, and using the same connection throughout the process.

I'll be looking at the batch jobs tomorrow, and shall take your suggested approach into account while reworking the job I've already written.

When executing the processing loop, I'm assuming if any of your objects get updated/created, you just use SaveOrUpdate with whatever session you're using at the time to persist the changes?

For the initial load of your "1000s of graphed objects", do you have separate NHibernate queries that retrieve objects using joins? i.e. retrieve the object hierarchy all at once (i.e. non lazily)?

It sounds like generally speaking the batch process I'm working on is quite similar to yours, so if it works for you, hopefully it will work for me :)

Thanks again.


PostPosted: Tue May 29, 2007 2:44 pm 
Regular

Joined: Mon May 08, 2006 6:00 am
Posts: 53
Location: India
I'm not sure how good an idea it is. Why don't you try simulating your batch with dummy data? Basically a proof of concept; that should give you more confidence. Think about it.

Sudhir


PostPosted: Tue May 29, 2007 3:55 pm 
Expert

Joined: Fri May 13, 2005 11:13 am
Posts: 292
Location: Rochester, NY
hitch wrote:
My main concern is how the running of the batch may affect other live users. The batch process won't be very long running (< 5 mins), and only runs 5 times throughout the day.


I guess it depends on how much you need to shield the records in question from changes while the batch is running. It sounds like in your case, you may want to just load the identifiers for the records you need to act on. Then in the "iteration" section of your batch, load the data with each new transaction. That should minimize collisions.
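
A minimal sketch of that idea, written against Python's built-in `sqlite3` module rather than NHibernate so that it runs on its own. The `account` and `invoice` tables, their columns, and the `process_batch` function are all made up for illustration. The shape is what matters: read only the identifiers up front, then re-read and update each record inside its own short transaction.

```python
import sqlite3


def process_batch(conn):
    """Invoice every account not yet invoiced, one short transaction per account."""
    # Load only the identifiers up front; no transaction is left open here.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM account WHERE invoiced = 0 ORDER BY id")]

    done = 0
    for account_id in ids:
        conn.execute("BEGIN")
        try:
            # Re-read the current data inside this iteration's transaction,
            # in case it changed since the identifiers were loaded.
            row = conn.execute(
                "SELECT balance, invoiced FROM account WHERE id = ?",
                (account_id,)).fetchone()
            if row is not None and not row[1]:
                conn.execute(
                    "INSERT INTO invoice (account_id, amount) VALUES (?, ?)",
                    (account_id, row[0]))
                conn.execute(
                    "UPDATE account SET invoiced = 1 WHERE id = ?",
                    (account_id,))
                done += 1
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
    return done


# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN/COMMIT statements above define the transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL, invoiced INTEGER)")
conn.execute("CREATE TABLE invoice (account_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, 0)", [(1, 10.0), (2, 25.5), (3, 7.25)])

print(process_batch(conn))  # number of accounts invoiced in this run
print(process_batch(conn))  # a second run finds nothing left to do
```

Because each transaction touches a single record and commits straight away, any locks it takes are held only briefly. The trade-off is that the batch as a whole is not atomic: a failure part-way through leaves the earlier records committed.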

Quote:
When executing the processing loop, I'm assuming if any of your objects get updated/created, you just use SaveOrUpdate with whatever session you're using at the time to persist the changes?


Yes.

Quote:
For the initial load of your "1000s of graphed objects", do you have separate NHibernate queries that retrieve objects using joins? i.e. retrieve the object hierarchy all at once (i.e. non lazily)?


Exactly.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.