I have a repeatable case where my app appears to be hanging or deadlocked trying to flush or commit items to an Oracle DB. In my simplified testing structure, I have up to 6 threads doing inserts, though I believe it will happen with even less because it happens one at a time. One will hang, and then 1-5 minutes later another, and so on until all have hung. Some threads may start, flush, commit, and complete just fine in between. The number of items being committed does not seem to make a difference. I've seen the hang with 8, 16, 50, 200, etc. It will happen within 10-20 transactions of starting, so doesn't take long at all.
I cannot get the hang to happen when not running under JBoss with the JTA. If I just run in a standalone test with the exact same code, it runs without a problem. It also does not seem to happen (at least not easily seen) when running under JBoss but always keeping the number of items to commit low - say somewhere between 50 - 100. As I noted above, I see the hang on threads that are committing very few items, but I wonder if there is some interaction with other threads that are in the process of committing many items (my test allows up to 2000 per thread). I am trying to figure out this factor more now, but wanted to see if anyone had any insight.
Thanks, e
Hibernate version: 3.1rc3 under JBoss 4.0.3sp1 on Windows XP SP2.
Application Setup: My Hibernate mapping docs are in a HAR within an EAR. All classes are in a
.deployer SAR loaded before the HAR.
Mapping documents: The only entity being written is this one:
Code:
<hibernate-mapping default-lazy="false">
<class table="DATASOURCE_ITEMS" batch-size="20" name="com.osc.QueryTreeNG.datasource.DataSourceItem">
<cache usage="read-write"/>
<id name="id">
<generator class="assigned"/>
</id>
<property name="itemId" not-null="true"/>
<many-to-one name="source" class="com.osc.QueryTreeNG.query.QuerySource">
<column name="sourceId"/>
</many-to-one>
</class>
<query name="getItem"><![CDATA[from DataSourceItem doc where doc.source = :source and doc.itemId = :itemId]]></query>
</hibernate-mapping>
The related QuerySouce is always already in the DB and doesnt change:
Code:
<hibernate-mapping default-lazy="false">
<class table="Query_Source" name="com.osc.QueryTreeNG.query.QuerySource">
<cache usage="read-write"/>
<id column="id" name="id">
<generator class="assigned"/>
</id>
<property name="name" unique="true">
<column name="name" index="name_index"/>
</property>
<property name="online">
<column name="is_online"/>
</property>
</class>
<query name="getQuerySource"><![CDATA[from QuerySource as source where source.name = :name]]></query>
<query name="getOfflineSources"><![CDATA[from QuerySource as source where source.online = 0]]></query>
</hibernate-mapping>
Code between sessionFactory.openSession() and session.close(): This is the method where both examples of the hang that I list below occur. As you can see, it's based on the example DAO setup from CaveatEmptor and is pretty straightforward. Most of the logic is for handling the case where another thread commits a duplicate item before this one gets to commit. In that case, we rollback this session and try again. However, the lockups occur even when there have been no conflicting commits.
Code:
public Set<DataSourceItem> getItems(QuerySource source, Collection<String> itemIds)
{
Set<DataSourceItem> retVal = null;
final int flushInterval = 500;
int curTries = 0;
long timer = Calendar.getInstance().getTimeInMillis();
boolean committed = false;
//only have to create items once
retVal = new LinkedHashSet<DataSourceItem>();
for (String itemId : itemIds)
{
DataSourceItem newObj = new DataSourceItem(source, itemId);
retVal.add(newObj);
}
while (!committed)
{
long attemptTimer = Calendar.getInstance().getTimeInMillis();
curTries++;
try
{
int count = 0;
final Session localSession = getSession();
//now persist any new ones
for (DataSourceItem item : retVal)
{
count++;
if (findById(item.getId()) == null)
makePersistent(item);
if (++count % flushInterval == 0)
{
localSession.flush();
localSession.clear();
}
}
//attempt to commit the changes
DaoFactory.getInstance().finishSession(this);
committed = true;
}
catch (Exception e)
{
attemptTimer = Calendar.getInstance().getTimeInMillis() - attemptTimer;
if (curTries < RETRY_ATTEMPTS)
{
log.warn("Error getting items. Elapsed time: " + attemptTimer + " for " + itemIds.size() + " items. Trying again. Try #" + curTries + ". Msg: " + e.getMessage());
DaoFactory.getInstance().rollbackSession(this);
}
else
{
log.error("Could not get items after multiple tries. Elapsed time: " + attemptTimer + " for " + itemIds.size() + " items. Msg: " + e.getMessage());
DaoFactory.getInstance().rollbackSession(this);
throw new RuntimeException(e);
}
}
finally
{
DaoFactory.getInstance().finishSession(this);
}
}
timer = Calendar.getInstance().getTimeInMillis() - timer;
log.debug(ProfileUtils.logTimePerItem("GetDsItems", itemIds.size(), timer));
return retVal;
}
Full stack trace of any exception that occurs: No exception, but have two samples of thread dumps:
Code:
"MQ_4660" prio=5 tid=0x2706a0f0 nid=0xfec runnable [0x2a02f000..0x2a02fae8]
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at oracle.net.ns.Packet.receive(Unknown Source)
at oracle.net.ns.DataPacket.receive(Unknown Source)
at oracle.net.ns.NetInputStream.getNextPacket(Unknown Source)
at oracle.net.ns.NetInputStream.read(Unknown Source)
at oracle.net.ns.NetInputStream.read(Unknown Source)
at oracle.net.ns.NetInputStream.read(Unknown Source)
at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1099)
at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1070)
at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:478)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:216)
at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:955)
at oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:10580)
- locked <0x07b83ab8> (a oracle.jdbc.driver.T4CPreparedStatement)
- locked <0x06bf74a8> (a oracle.jdbc.driver.T4CConnection)
at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:487)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:58)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:195)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:230)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:140)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:296)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:27)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:908)
at com.osc.QueryTreeNG.dao.hibernate.DataSourceItemDaoHib.getItems(DataSourceItemDaoHib.java:127)
at com.osc.QueryTreeNG.dao.hibernate.DataSourceItemDaoHib.getItems(DataSourceItemDaoHib.java:89)
at com.osc.QueryTreeNG.query.MassQuery.loadTermResults(MassQuery.java:538)
at com.osc.QueryTreeNG.query.MassQuery.doRun(MassQuery.java:357)
at com.osc.util.dao.SessionAwareRunner.run(SessionAwareRunner.java:26)
at java.lang.Thread.run(Thread.java:595)
And:
Code:
"MQ_4656" prio=5 tid=0x287195f0 nid=0x3ac runnable [0x29f2f000..0x29f2fc68]
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at oracle.net.ns.Packet.receive(Unknown Source)
at oracle.net.ns.DataPacket.receive(Unknown Source)
at oracle.net.ns.NetInputStream.getNextPacket(Unknown Source)
at oracle.net.ns.NetInputStream.read(Unknown Source)
at oracle.net.ns.NetInputStream.read(Unknown Source)
at oracle.net.ns.NetInputStream.read(Unknown Source)
at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1099)
at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1070)
at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:478)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:216)
at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:955)
at oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:10580)
- locked <0x07a9f368> (a oracle.jdbc.driver.T4CPreparedStatement)
- locked <0x0789ee48> (a oracle.jdbc.driver.T4CConnection)
at org.jboss.resource.adapter.jdbc.WrappedStatement.executeBatch(WrappedStatement.java:487)
at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:58)
at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:195)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:230)
at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:140)
at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:296)
at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:27)
at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:908)
at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:344)
at org.hibernate.transaction.CacheSynchronization.beforeCompletion(CacheSynchronization.java:59)
at org.jboss.tm.TransactionImpl.doBeforeCompletion(TransactionImpl.java:1473)
at org.jboss.tm.TransactionImpl.beforePrepare(TransactionImpl.java:1092)
at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:306)
at org.jboss.tm.TxManager.commit(TxManager.java:224)
at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:126)
at org.hibernate.transaction.JTATransaction.commit(JTATransaction.java:146)
at com.osc.util.hibernate.HibernateManager.commitTransaction(HibernateManager.java:371)
at com.osc.util.hibernate.HibernateManager.closeSession(HibernateManager.java:287)
at com.osc.util.hibernate.HibernateDaoFactory.finishSession(HibernateDaoFactory.java:63)
at com.osc.QueryTreeNG.dao.hibernate.DataSourceItemDaoHib.getItems(DataSourceItemDaoHib.java:132)
at com.osc.QueryTreeNG.dao.hibernate.DataSourceItemDaoHib.getItems(DataSourceItemDaoHib.java:89)
at com.osc.QueryTreeNG.query.MassQuery.loadTermResults(MassQuery.java:538)
at com.osc.QueryTreeNG.query.MassQuery.doRun(MassQuery.java:357)
at com.osc.util.dao.SessionAwareRunner.run(SessionAwareRunner.java:26)
at java.lang.Thread.run(Thread.java:595)
Name and version of the database you are using: Oracle 10g with newest driver (10.2.1.0). Also happens with last driver (10.1.2.0).