First: I *know* this is not a Hibernate issue, per se. However, I also know this forum way better than any other forum, and I have gotten way more help here than anywhere else ever, so I'm taking a chance on posting here.
Our enterprise application includes a distributed set of data-munging servers (coordinated via JMS and Hibernate). This is basically a distributed batch-processing job-handling system. When there are bugs in the job scheduling logic, it's painful to go looking through all the different log files on the various servers. So I've been working on adding a central database log of the actions taken by the job system.
The basic top-level object in the job system is a "WorkRequest" which is kind of like a "job ticket" -- it describes a set of related data processing operations that should be scheduled and run as efficiently as possible. The basic logging object is a "WorkLogEntry". WorkLogEntries refer to WorkRequests. WorkRequests cascade delete to WorkLogEntries (to clean up the logging info when we clean up the WorkRequest itself).
The relevant mapping portions look like this:
Code:
<hibernate-mapping>
<class name="com.nimblefish.core.domain.work.WorkRequest">
<meta attribute="generated-class">com.nimblefish.core.domain.generated.WorkRequestGen</meta>
<id name="id" type="long" unsaved-value="null" >
<meta attribute="scope-set">private</meta>
<generator class="native"/>
</id>
<!-- the time at which the last task in this WorkRequest was run.
Used by the work scheduler to ensure fairness among all running WorkRequests. -->
<property name="lastTaskTime" type="long"/>
...
<!-- The WorkLogEntries pertaining to this WorkRequest. -->
<set name="workLogEntries" cascade="delete" inverse="true" lazy="true">
<key column="workrequest_id"/>
<one-to-many class="com.nimblefish.core.domain.work.WorkLogEntry"/>
</set>
</class>
</hibernate-mapping>
<hibernate-mapping>
<class name="com.nimblefish.core.domain.work.WorkLogEntry">
<meta attribute="generated-class">com.nimblefish.core.domain.generated.WorkLogEntryGen</meta>
<id name="id" type="long" unsaved-value="null" >
<meta attribute="scope-set">private</meta>
<generator class="native"/>
</id>
<property name="server" type="string"/>
<property name="env" type="string"/>
<property name="timestamp" type="timestamp"/>
<property name="level" type="string"/>
<property name="loggingClass" type="string"/>
<property name="message" type="string"/>
<many-to-one name="workRequest" column="workrequest_id" not-null="true"
class="com.nimblefish.core.domain.work.WorkRequest"/>
<many-to-one name="job" column="job_id"
class="com.nimblefish.core.domain.work.Job"/>
</class>
</hibernate-mapping>
This is all relatively fine. However, the weird part is that when I run this locally -- with my J2EE server, EJB-client Junit acceptance test, and MySQL InnoDB database all colocated on my desktop box -- I am getting all kinds of timing-dependent deadlocks.
For example, if I have two threads both performing this kind of code:
Code:
WorkLogEntry entry = new WorkLogEntry(
BarracudaProperties.getProperty("barracuda.server"),
BarracudaProperties.getProperty("barracuda.env"),
new Timestamp(System.currentTimeMillis()), level,
loggingClass,
message,
workRequest, job);
// TODO: research (stress test?) whether explicit save is much more expensive than cascade save
// (and group flush from WorkRequest).
Persistence.save(entry);
MySQL reports the following (from "show innodb status"):
[code]------------------------
LATEST DETECTED DEADLOCK
------------------------
040409 15:35:16
*** (1) TRANSACTION:
TRANSACTION 0 210343, ACTIVE 0 sec, OS thread id 3148 inserting
mysql tables in use 1, locked 1
LOCK WAIT 8 lock struct(s), heap size 1024, undo log entries 1
MySQL thread id 395, query id 147979 localhost 127.0.0.1 dummy update
insert into WorkLogEntry (server, env, timestamp, level, loggingClass, message, workrequest_id, job_
id) values ('localhost', 'local_test', '2004-04-09
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 16388 n bits 72 table barracuda_test/workrequest index PRIMARY trx i
d 0 210343 lock mode S locks rec but not gap waiting
Record lock, heap no 4 RECORD: info bits 0 0: len 8; hex 8000000000000003; asc