Hi,
we're trying to do a bulk/batch insert of about 100000 objects uploaded via a web service (XFire) in packages containing 1000 objects.
The environment is a JBoss application server with container managed persistence.
At the moment, each single object is stored in the database in its own transaction, managed by the container. Here is an example of the DAO class:
Code:
@javax.ejb.TransactionAttribute(javax.ejb.TransactionAttributeType.REQUIRED)
public abstract class EmployerDaoBase
    implements EmployerDao
{
    /**
     * Inject persistence context ejb3xfire
     */
    @javax.persistence.PersistenceContext(unitName = "ejb3xfire")
    protected javax.persistence.EntityManager emanager;
    ...
    public Object create(final Employer employer)
        throws EmployerDaoException
    {
        if (employer == null)
        {
            throw new IllegalArgumentException(
                "Employer.create - 'employer' can not be null");
        }
        try
        {
            emanager.persist(employer);
            emanager.flush();
            return employer;
        }
        catch (Exception ex)
        {
            throw new EmployerDaoException(ex);
        }
    }
    ...
}
The reason for this is that if the insert of one object fails, the other objects should be inserted in the database anyway.
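To make that requirement concrete, here is a minimal plain-Java sketch of the behaviour we need. It is not our real code: ItemWriter and writeAll are hypothetical names, and write() merely stands in for a DAO call such as create(), which in our setup runs in its own container-managed transaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PerItemInsertSketch {

    // Hypothetical stand-in for a DAO call such as create(employer).
    interface ItemWriter<T> {
        void write(T item) throws Exception;
    }

    // Writes each item on its own and returns the items that failed.
    // A failure of one item never prevents the remaining items from being written.
    static <T> List<T> writeAll(List<T> items, ItemWriter<T> writer) {
        List<T> failed = new ArrayList<>();
        for (T item : items) {
            try {
                writer.write(item);
            } catch (Exception ex) {
                failed.add(item); // remember the failure and carry on
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "bad", "c");
        // The writer below simulates a constraint violation for one item.
        List<String> failed = writeAll(items, item -> {
            if (item.equals("bad")) {
                throw new IllegalStateException("simulated constraint violation");
            }
        });
        System.out.println("failed items: " + failed);
    }
}
```

This is the semantics we want to keep, only without paying for one transaction per object.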
As far as I can tell from the batch/bulk insert examples that flush and clear the session after every hibernate.jdbc.batch_size items (see
http://www.hibernate.org/hib_docs/reference/en/html/batch.html), all inserted items would be rolled back if even a single item fails, right?
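For reference, the flush/clear pattern I mean looks roughly like the following sketch. To keep it self-contained it does not use Hibernate at all: the Store interface is a hypothetical stand-in for the persist/flush/clear calls of an EntityManager, CountingStore only counts calls, and BATCH_SIZE = 50 is an assumed value that would normally match hibernate.jdbc.batch_size.

```java
import java.util.ArrayList;
import java.util.List;

class BatchFlushSketch {

    // Hypothetical stand-in for the EntityManager calls used in the loop.
    interface Store<T> {
        void persist(T item);
        void flush();
        void clear();
    }

    // Assumed value; would normally match hibernate.jdbc.batch_size.
    static final int BATCH_SIZE = 50;

    // Persists all items, flushing and clearing after every BATCH_SIZE items.
    static <T> void insertAll(List<T> items, Store<T> store) {
        int count = 0;
        for (T item : items) {
            store.persist(item);
            count++;
            if (count % BATCH_SIZE == 0) {
                store.flush(); // send the pending batch of inserts
                store.clear(); // release the first-level cache
            }
        }
        if (count % BATCH_SIZE != 0) {
            store.flush(); // send the incomplete last batch
            store.clear();
        }
    }

    // Store implementation that only counts calls, for demonstration.
    static class CountingStore<T> implements Store<T> {
        int persisted;
        int flushes;
        int clears;

        public void persist(T item) { persisted++; }
        public void flush() { flushes++; }
        public void clear() { clears++; }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            items.add(i);
        }
        CountingStore<Integer> store = new CountingStore<>();
        insertAll(items, store);
        System.out.println(store.persisted + " items, " + store.flushes + " flushes");
    }
}
```

In the documented pattern the whole loop runs inside one transaction, which is why I assume that a single failing item takes all the others down with it.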
Does anybody have an idea how we can improve performance with a batch/bulk insert that does not fail completely when one or more items cannot be inserted?
Additionally, we are facing a second issue:
Our objects, e.g. Employer in the example above, hold a lazy reference to an Employee class (many-to-one, 1..*). The referenced Employees already exist in the database when the Employers are inserted.
Now, when an Employer object is persisted, Hibernate fetches a database snapshot to check whether the referenced Employee exists in the database, in order to avoid a foreign key constraint violation. In our case this check is not necessary, and it costs an additional SELECT against the database.
I tried using a StatelessSession in the DAO shown above to measure the performance gain. It is about 30%, because the Employers are inserted without checking the referenced Employees (and some other features of the normal session are omitted).
Code:
@TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED)
public Object create(final Employer employer)
    throws EmployerDaoException
{
    if (employer == null)
    {
        throw new IllegalArgumentException(
            "Employer.create - 'employer' can not be null");
    }
    StatelessSession session = null;
    try
    {
        // emf: EntityManagerFactory field of the DAO (declaration not shown here)
        HibernateEntityManagerFactory hibEMF =
            (HibernateEntityManagerFactory) ((InjectedEntityManagerFactory)emf).getDelegate();
        SessionFactory sf = hibEMF.getSessionFactory();
        session = sf.openStatelessSession();
        org.hibernate.Transaction tx = session.beginTransaction();
        session.insert(employer);
        tx.commit();
        return employer;
    }
    catch (Exception ex)
    {
        throw new EmployerDaoException(ex);
    }
    finally
    {
        // the stateless session is opened by hand, so it has to be closed by hand
        if (session != null)
        {
            session.close();
        }
    }
}
But I think this is not the best solution. Is it possible to use a native query inside a container managed environment to accomplish this?
Sorry, my knowledge of this topic is still limited and I couldn't find a hint anywhere else ...
Thanks,
Jürgen
Hibernate version: 3.2.4.sp1 (JBoss 4.2.2.GA)
Name and version of the database you are using: Oracle 10.2.01