-->

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 
Author Message
 Post subject: Strange Caching Problem
Posted: Sat Dec 24, 2005 3:28 am
Newbie

Joined: Thu Nov 17, 2005 5:50 am
Posts: 6
We have a Spring/Hibernate web application running on two clustered Tomcat servers with LVS for load balancing. In turn, the datasource used by the servers load balances between two Oracle 10g database servers in an RAC.

We have a function in the application that displays a list and allows users to remove items from that list.

We process the request by removing the item from the list in one request and then issuing a redirect to the browser to request the list again (this is so the user can hit refresh on the list page without retrying the removal).

Sometimes, when the user removes an item, it is still displayed in the list afterwards (until the user does a refresh).

I have determined that when this happens, the request to perform the update and the second request to query the list are serviced by different Tomcat servers.

On the face of it, it sounds like a cache that is holding data that has been updated in the DB by the other process.

But:

- We are not using second-level caching anywhere, so it can't be that.

- We are using Spring's OpenSessionInViewFilter, so there should be a new session for each request. Given this, I don't see how it could be a stale session cache.

- The isCacheQueries() method on Spring's HibernateTemplate reports that query caching is not enabled.

Given all this, I am at a loss to explain it from a Hibernate perspective.

Does anyone know of a possible explanation that I have missed?

cheers
Perryn

PS: Failing that, does anyone know of a possible caching problem between the two Oracle servers?


 Post subject:
Posted: Fri Jan 20, 2006 6:37 pm
Newbie

Joined: Wed Aug 24, 2005 3:16 pm
Posts: 7
Location: Minneapolis, MN
Perryn, do you have any more information on your situation? I am wondering if this is due to the RAC environment of the Oracle database.

I have been experiencing the same behavior. Like you, I have an action (the Index action) that displays a list of items. When the user chooses to delete an item, the request is handled by the Delete action. After the item is deleted, the Delete action redirects the user back to the Index action. Initially the list still includes the deleted item, but a quick refresh by the user causes it to disappear from the list. I get a similar result when adding a new item. After the item is added, the Add action redirects to the Index action, and the new item is not included in the list. This behavior is not consistent, however, and sometimes things work just fine.

Also like you, the item class is not set up to use the second level cache, nor is the query used by the Index action being cached.

We are using Oracle 9i, but it is a RAC setup.

After discussing this with the DBA, he suggested that this is a result of the RAC environment. With the RAC environment, there are multiple nodes, each containing a copy of the database. Any given database connection works directly with just one node. When the delete transaction is committed, the node for the database connection is changed immediately, but the commit is not broadcast to the other nodes immediately. Since there is a redirect between the Delete action and the Index action, the Index action gets its own database connection from the pool, which may not be the same connection as the Delete action. More importantly, the database connection for the Index action may be for a different node than the one for the Delete action. The query for the Index action may not be seeing that the item was deleted because the commit has not yet been propagated to that node.

*** Of course, this is JUST A THEORY at this point.
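To make the theory concrete, here is a toy model of it in plain Java. It is only a sketch of the DBA's explanation as described above, not of how Oracle RAC works internally. The class name, the node numbers, and the worst-case assumption that other nodes see a commit only after the full delay has elapsed are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical): a delete is seen at once on the committing node, later on the others. */
public final class DelayedPropagationModel {
    private final long delayMillis;
    // itemId -> time of the delete commit, and the node that committed it
    private final Map<String, Long> deleteCommitTime = new HashMap<>();
    private final Map<String, Integer> deleteNode = new HashMap<>();

    public DelayedPropagationModel(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    /** Records that itemId was deleted and committed on the given node at nowMillis. */
    public void commitDelete(String itemId, int node, long nowMillis) {
        deleteCommitTime.put(itemId, nowMillis);
        deleteNode.put(itemId, node);
    }

    /** Returns true if a query on the given node at nowMillis would still list the item. */
    public boolean isVisible(String itemId, int node, long nowMillis) {
        Long committedAt = deleteCommitTime.get(itemId);
        if (committedAt == null) {
            return true; // never deleted
        }
        if (deleteNode.get(itemId) == node) {
            return false; // the committing node sees the delete immediately
        }
        // Worst-case assumption: other nodes lag by the full delay.
        return nowMillis - committedAt < delayMillis;
    }

    public static void main(String[] args) {
        DelayedPropagationModel rac = new DelayedPropagationModel(7000);
        rac.commitDelete("item-1", 1, 0);                        // Delete action, connection on node 1
        System.out.println(rac.isVisible("item-1", 2, 50));      // Index action on node 2 after redirect: true
        System.out.println(rac.isVisible("item-1", 1, 50));      // Index action on node 1: false
        System.out.println(rac.isVisible("item-1", 2, 7000));    // node 2 once the delay has passed: false
    }
}
```

In this model the stale list appears only when the redirected request lands on a different node within the delay window, which would match the intermittent behavior described above.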

BUT IF THIS IS THE PROBLEM, there is a setting for the RAC that would eliminate it. There is a configuration setting named MAX_COMMIT_PROPAGATION_DELAY. This setting determines the maximum amount of time that may pass before a commit to one node is propagated to the other nodes. It is specified in hundredths of a second. It defaults to 700, which means the maximum time is 7 seconds. If this value is set to zero (or apparently anything less than 100), a commit will be immediately propagated to all nodes. This is known as "broadcast on commit."

There is, of course, a performance trade-off for broadcast-on-commit. According to the DBA, Oracle recommends against using broadcast-on-commit. Therefore, he is unwilling to consider it. Instead, he is suggesting one of the following changes be made to the code:

(1) A 7-second delay be placed in the code after the delete transaction is committed and before the redirect is issued. This would ensure that the commit has been propagated to all other nodes before any subsequent queries are executed for the user. (Unless the user doesn't actually wait for the response and navigates to another page.) I find this unacceptable. Of course, a shorter delay like 1 or 2 seconds would probably be long enough for the commit to be propagated most of the time. Perhaps a compromise could be reached, and the MAX_COMMIT_PROPAGATION_DELAY could be reduced, but not all the way to zero.
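The unit conversion is easy to get wrong, so here is a small sketch of the arithmetic in Java. The class and method names are hypothetical, and the "less than 100" rule is only what I was told (hence the "apparently" above), not something I have verified against Oracle's documentation.

```java
import java.time.Duration;

/** Hypothetical helper for reasoning about MAX_COMMIT_PROPAGATION_DELAY values. */
public final class CommitPropagationDelay {
    private CommitPropagationDelay() {}

    /** Converts a setting expressed in hundredths of a second to a Duration. */
    public static Duration toDuration(int hundredths) {
        if (hundredths < 0) {
            throw new IllegalArgumentException("delay must not be negative: " + hundredths);
        }
        return Duration.ofMillis(hundredths * 10L); // 1/100 s = 10 ms
    }

    /** Assumption from the post: zero, or apparently anything below 100, means broadcast on commit. */
    public static boolean isBroadcastOnCommit(int hundredths) {
        return hundredths >= 0 && hundredths < 100;
    }

    public static void main(String[] args) {
        System.out.println(toDuration(700).getSeconds());  // the default of 700 gives 7 seconds
        System.out.println(isBroadcastOnCommit(0));        // true
        System.out.println(isBroadcastOnCommit(700));      // false
    }
}
```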

(2) Change the redirect to a forward, so the query to get the list would use the same database connection that was used by the delete transaction. Again, I do not find this acceptable. I consider the Redirect-After-Post (aka Post-Redirect-Get) idiom to be a best practice.

In either case, I don't like that the RAC environment causes a change in code like this.

Does anyone else have any experience with this? In particular, does the theory sound reasonable? Is this really the reality of a RAC environment? If so, what is the best way to deal with it? Is changing the Oracle setting to broadcast-on-commit really that big of a performance hit, or is it the way to go?

I apologize that this appears to have nothing to do with Hibernate specifically, but my first reaction was that this was a caching issue, too.

-John


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.