All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 13 posts ] 
Author Message
 Post subject: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Tue Jul 27, 2010 4:23 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
HSearch version 3.2.0.Final

Very odd. At one of my customers this problem is consistently reproducible. A process runs that adds a lot of new @Indexed entities, each of which has at least one @IndexedEmbedded entity. Every time, one thread hangs while using 100% of one CPU core, and it appears to stay stuck indefinitely.

If they restart their server, the next time the process runs we get stuck in the exact same place.

Here's the thread dump in question in case anyone has a suggestion:

Code:
"schedulerFactoryBean_Worker-5" prio=10 tid=0x000000000d6d9800 nid=0x5ca3 runnable [0x0000000042786000..0x0000000042787c10]
   java.lang.Thread.State: RUNNABLE
   at org.hibernate.search.engine.DocumentBuilderIndexedEntity.addWorkToQueue(DocumentBuilderIndexedEntity.java:319)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.addWorkForEmbeddedValue(DocumentBuilderContainedEntity.java:726)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.processSingleContainedInInstance(DocumentBuilderContainedEntity.java:709)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.processContainedInInstances(DocumentBuilderContainedEntity.java:664)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.processSingleContainedInInstance(DocumentBuilderContainedEntity.java:705)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.processContainedInInstances(DocumentBuilderContainedEntity.java:659)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.addWorkToQueue(DocumentBuilderContainedEntity.java:612)
   at org.hibernate.search.backend.impl.BatchedQueueingProcessor.addWorkToBuilderQueue(BatchedQueueingProcessor.java:270)
   at org.hibernate.search.backend.impl.BatchedQueueingProcessor.processWorkByLayer(BatchedQueueingProcessor.java:248)
   at org.hibernate.search.backend.impl.BatchedQueueingProcessor.prepareWorks(BatchedQueueingProcessor.java:147)
   at org.hibernate.search.backend.impl.PostTransactionWorkQueueSynchronization.beforeCompletion(PostTransactionWorkQueueSynchronization.java:70)
   at org.hibernate.search.backend.impl.EventSourceTransactionContext$DelegateToSynchronizationOnBeforeTx.doBeforeTransactionCompletion(EventSourceTransactionContext.java:144)
   at org.hibernate.engine.ActionQueue$BeforeTransactionCompletionProcessQueue.beforeTransactionCompletion(ActionQueue.java:530)
   at org.hibernate.engine.ActionQueue.beforeTransactionCompletion(ActionQueue.java:211)
   at org.hibernate.impl.SessionImpl.beforeTransactionCompletion(SessionImpl.java:563)
   at org.hibernate.jdbc.JDBCContext.beforeTransactionCompletion(JDBCContext.java:229)
   at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:142)
   at org.hibernate.ejb.TransactionImpl.commit(TransactionImpl.java:76)
   at org.springframework.orm.jpa.JpaTransactionManager.doCommit(JpaTransactionManager.java:467)
   at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:754)
   at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:723)
   at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:375)
   at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:120)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
   at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
   at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
   at $Proxy103.processEvent(Unknown Source)
   at com.attensa.core.job.AggregationJobImpl.executeInternal(AggregationJobImpl.java:75)
   at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
   at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
   at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:525)


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Wed Jul 28, 2010 12:06 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
DocumentBuilderIndexedEntity.java:319

Sorry, there's no code in 3.2.0.Final at that line. Please explain: if there's an issue I'd like to track it.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Wed Jul 28, 2010 6:25 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
Well I didn't expect that - I checked and you're correct. Perhaps somehow this customer has an older hsearch jar on their classpath. I'll check that possibility first.


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Fri Jul 30, 2010 3:45 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
We verified that the correct version of Hibernate Search is in use:
hibernate-search-3.2.0.Final.jar 457,393

Stopped and redeployed the server with a fresh copy of the app, still seeing the exact same problem. Ran the scheduled job where we see the issue. Same thing, still on line 319.

Thread hung, 100% CPU, stack trace:


Code:
"schedulerFactoryBean_Worker-2" prio=10 tid=0x00002aab793e2800 nid=0x596a runnable [0x0000000042fe4000]
   java.lang.Thread.State: RUNNABLE
   at org.hibernate.search.engine.DocumentBuilderIndexedEntity.addWorkToQueue(DocumentBuilderIndexedEntity.java:319)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.addWorkForEmbeddedValue(DocumentBuilderContainedEntity.java:726)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.processSingleContainedInInstance(DocumentBuilderContainedEntity.java:709)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.processContainedInInstances(DocumentBuilderContainedEntity.java:664)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.processSingleContainedInInstance(DocumentBuilderContainedEntity.java:705)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.processContainedInInstances(DocumentBuilderContainedEntity.java:659)
   at org.hibernate.search.engine.DocumentBuilderContainedEntity.addWorkToQueue(DocumentBuilderContainedEntity.java:612)
   at org.hibernate.search.backend.impl.BatchedQueueingProcessor.addWorkToBuilderQueue(BatchedQueueingProcessor.java:270)
   at org.hibernate.search.backend.impl.BatchedQueueingProcessor.processWorkByLayer(BatchedQueueingProcessor.java:248)
   at org.hibernate.search.backend.impl.BatchedQueueingProcessor.prepareWorks(BatchedQueueingProcessor.java:147)
   at org.hibernate.search.backend.impl.PostTransactionWorkQueueSynchronization.beforeCompletion(PostTransactionWorkQueueSynchronization.java:70)
   at org.hibernate.search.backend.impl.EventSourceTransactionContext$DelegateToSynchronizationOnBeforeTx.doBeforeTransactionCompletion(EventSourceTransactionContext.java:144)
   at org.hibernate.engine.ActionQueue$BeforeTransactionCompletionProcessQueue.beforeTransactionCompletion(ActionQueue.java:530)
   at org.hibernate.engine.ActionQueue.beforeTransactionCompletion(ActionQueue.java:211)
   at org.hibernate.impl.SessionImpl.beforeTransactionCompletion(SessionImpl.java:563)
        ...


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Fri Jul 30, 2010 7:44 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
We've narrowed this down a bit. The problem only occurs during transactions that add a large number of LuceneWork items.

In one case we were able to reproduce the problem by updating about 1000 @Indexed entities (each with, on average, two @IndexedEmbedded entities included within) in a single transaction.

The problem appears to be that the algorithm used in addWorkToQueue is inefficient for large N. In our case it can get so bad that this runs for days.

If I understand correctly, this seems to be a known issue per this comment in the source code:

Code:
//TODO with the caller loop we are in a n^2: optimize it using a HashMap for work recognition
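
To illustrate why a per-item scan of the queue turns quadratic, here is a minimal sketch. It is not the Hibernate Search code: the class `ScanVersusHashDemo`, its method names, and the (entityClass, id) key pairs are all made up for the illustration, and it only counts how many checks each approach needs when every queued item is distinct.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: a list-based "is this work already queued?" check
// versus a hash-based one. Keys are hypothetical (entityClass, id) pairs.
final class ScanVersusHashDemo {

    /** Queues n distinct keys, scanning the list each time; returns the number of comparisons. */
    static long comparisonsWithLinearScan(int n) {
        List<List<Object>> queue = new ArrayList<>();
        long comparisons = 0;
        for (int i = 0; i < n; i++) {
            List<Object> key = Arrays.<Object>asList(String.class, i);
            boolean alreadyQueued = false;
            for (List<Object> queued : queue) {
                comparisons++;
                if (queued.equals(key)) {
                    alreadyQueued = true;
                    break;
                }
            }
            if (!alreadyQueued) {
                queue.add(key);
            }
        }
        return comparisons;
    }

    /** Queues n distinct keys using a HashSet; returns the number of set lookups. */
    static long lookupsWithHashSet(int n) {
        Set<List<Object>> seen = new HashSet<>();
        long lookups = 0;
        for (int i = 0; i < n; i++) {
            lookups++;
            // add() doubles as the membership check
            seen.add(Arrays.<Object>asList(String.class, i));
        }
        return lookups;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("linear scan comparisons: " + comparisonsWithLinearScan(n));
        System.out.println("hash set lookups:        " + lookupsWithHashSet(n));
    }
}
```

With n distinct items the scan performs n(n-1)/2 comparisons in this sketch, while the hash-based version performs n lookups.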


Of course a possible workaround for us is to try to find a way to break the work into multiple smaller transactions. This would be a massive change to our system but it might be worthwhile for us to make the change in any case just because of other issues (like locking+concurrency) introduced by transactions that are too large.


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Sat Jul 31, 2010 5:50 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Thanks for the valuable insight, I've opened an issue to track this: http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-570

If you happen to take a closer look at the addWorkToQueue method, feel free to suggest a patch ;)

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Sat Jul 31, 2010 5:31 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
We should discuss design a bit. I might be able to provide a patch but my initial idea is a very big change. Because I am not familiar with hsearch internals I am hesitant to attempt such a large change.

I think that it might be worth considering that List<LuceneWork> is not the best data structure given the way this data is used. To make it more refactorable in the future it would be best to create a new class LuceneWorkQueue that abstracts operations on a queue and hides the storage implementation.

Doing this would of course have a very big impact.

Based on the logic in addWorkToQueue it seems that the best data structure might be:
Map<Class, Map<Serializable, LuceneWork>>

Outer map key is entityClass, inner map key is id. No looping would be needed in the main part of DocumentBuilderIndexedEntity. What do you think? Do you have a suggestion that is easier?


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Sat Jul 31, 2010 7:46 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
Also in the plan I've described, LuceneWorkQueue would need to maintain internally both the Map of Maps and a List. The List is needed to preserve ordering.
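
A rough sketch of what such a class could look like, to make the idea concrete. Everything here is hypothetical: the name `LuceneWorkQueueSketch`, the generic parameter `W` standing in for LuceneWork, and especially the rule that a second work for the same (entityClass, id) is rejected, which is an assumption and not necessarily what addWorkToQueue does.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical queue abstraction; W stands in for LuceneWork.
final class LuceneWorkQueueSketch<W> {

    // Fast lookup: entity class -> entity id -> queued work.
    private final Map<Class<?>, Map<Serializable, W>> byClassAndId = new HashMap<>();

    // Insertion order, kept separately because the nested maps do not preserve it.
    private final List<W> inOrder = new ArrayList<>();

    /**
     * Queues the work unless work for the same (entityClass, id) is already queued.
     * Rejecting the duplicate is an assumption made for this sketch.
     */
    boolean add(Class<?> entityClass, Serializable id, W work) {
        Map<Serializable, W> byId =
                byClassAndId.computeIfAbsent(entityClass, k -> new HashMap<>());
        if (byId.containsKey(id)) {
            return false;
        }
        byId.put(id, work);
        inOrder.add(work);
        return true;
    }

    /** Returns the queued work for the given key, or null if there is none. */
    W find(Class<?> entityClass, Serializable id) {
        Map<Serializable, W> byId = byClassAndId.get(entityClass);
        return byId == null ? null : byId.get(id);
    }

    /** Read-only view of the queued work in insertion order. */
    List<W> asList() {
        return Collections.unmodifiableList(inOrder);
    }
}
```

Callers would only see add, find and asList, so the storage could later change without touching them.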


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Fri Oct 08, 2010 7:50 am 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
Hi jnadler,
Sorry we dropped the ball on this for so long. If you are still in the game, I would appreciate your ideas and, even better, a patch :)

List<LuceneWork> is a semi-public type used when jobs are serialized / deserialized and in a few other semi-public APIs like WorkQueue. But internally we can have intermediate structures, and it seems you want the optimized structure within BatchedQueueingProcessor#prepareWorks.

We have a copy of Hibernate Search on GitHub and it's trivial to do a fork and play with the source code http://github.com/emmanuelbernard/hibernate-search

Let me know if you are still interested.

Emmanuel

_________________
Emmanuel


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Fri Oct 08, 2010 12:12 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
Hi Emmanuel,

Thanks for the follow up. I'm happy to work on this but it will take me some time: Wife's due for a baby any day now and I'll be out from work for a bit.

If I recall correctly my idea was that before this loop starts, we transform the data structure to Map<Class, Map<Serializable, LuceneWork>> - outer map key is entityClass, inner map key is id.

Once this is done, no looping would be needed in the main part of DocumentBuilderIndexedEntity, it's just a couple of map lookups. For a given class, for a given key, get its LuceneWork.
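
Roughly, the one-off transformation could look like the sketch below. `QueuedWork` is a made-up stand-in for LuceneWork with only the two key fields, `WorkIndexSketch` is a hypothetical helper, and keeping the first work per key is an assumption.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for LuceneWork, reduced to the two fields used as keys.
final class QueuedWork {
    final Class<?> entityClass;
    final Serializable id;

    QueuedWork(Class<?> entityClass, Serializable id) {
        this.entityClass = entityClass;
        this.id = id;
    }
}

final class WorkIndexSketch {

    /** Builds the class -> id -> work index in a single pass over the existing queue. */
    static Map<Class<?>, Map<Serializable, QueuedWork>> index(List<QueuedWork> queue) {
        Map<Class<?>, Map<Serializable, QueuedWork>> index = new HashMap<>();
        for (QueuedWork work : queue) {
            // Assumption: the first work seen for a given key wins.
            index.computeIfAbsent(work.entityClass, k -> new HashMap<>())
                 .putIfAbsent(work.id, work);
        }
        return index;
    }

    /** Two map lookups instead of a scan over the queue; returns null if nothing is queued. */
    static QueuedWork find(Map<Class<?>, Map<Serializable, QueuedWork>> index,
                           Class<?> entityClass, Serializable id) {
        Map<Serializable, QueuedWork> byId = index.get(entityClass);
        return byId == null ? null : byId.get(id);
    }
}
```

The original List stays untouched, so anything that depends on its ordering or on the List<LuceneWork> type is unaffected.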

Does this make sense at least abstractly? I'm not confident with the HSearch internal design so I'd love some validation from you guys before I build the patch.

Thanks again,

Jeff


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Fri Oct 08, 2010 12:14 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
For anyone else having a similar problem: I worked around this in my application by breaking the work up into multiple smaller transactions. Needless to say this is easier on the DB as well. Just wanted to make it clear that for most apps it probably isn't strictly necessary to do such large transactions.

In my case the only downside is the need to catch any exceptions and do some specialized clean-up to preserve atomicity.
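
For anyone who wants the general shape of that workaround, here is a simplified sketch. It is not my actual application code: `BatchedWorkSketch`, `partition` and `runInBatches` are hypothetical names, the `inTransaction` callback is assumed to open and commit one transaction per batch, and `cleanUp` is assumed to undo whatever the already committed batches did.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: splits a large unit of work into smaller transactions.
final class BatchedWorkSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /**
     * Runs each batch through inTransaction (assumed to commit one transaction per call).
     * If a batch fails, the batches committed so far are handed to cleanUp and the
     * exception is rethrown, so the caller can restore all-or-nothing behaviour.
     */
    static <T> void runInBatches(List<T> items, int batchSize,
                                 Consumer<List<T>> inTransaction,
                                 Consumer<List<List<T>>> cleanUp) {
        List<List<T>> committed = new ArrayList<>();
        for (List<T> batch : partition(items, batchSize)) {
            try {
                inTransaction.accept(batch);
                committed.add(batch);
            } catch (RuntimeException e) {
                cleanUp.accept(committed);
                throw e;
            }
        }
    }
}
```

The batch size then caps how much LuceneWork a single commit has to process.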


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Fri Oct 08, 2010 12:22 pm 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
Yes it made sense to me at least :)

I wonder if we should have something that kicks in only for big lists of work, i.e. the regular processing for small and medium lists and the extra data structure creation for the big lists. I'm a bit concerned that the structure creation brings overhead and nothing more for most use cases.
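
As a very rough sketch of that idea, under the assumption that work can be identified by some key: scan the list while it is small and build a hash index only once it reaches a threshold. `AdaptiveWorkSet`, its methods and the threshold value are hypothetical, and none of this is existing Hibernate Search code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical container: linear scan for small queues, hash index for big ones.
final class AdaptiveWorkSet<K> {

    private final int threshold;
    private final List<K> keys = new ArrayList<>();
    private Set<K> index; // stays null until keys.size() reaches the threshold

    AdaptiveWorkSet(int threshold) {
        this.threshold = threshold;
    }

    /** Adds the key unless it is already present; returns true if it was added. */
    boolean add(K key) {
        if (contains(key)) {
            return false;
        }
        keys.add(key);
        if (index != null) {
            index.add(key);
        } else if (keys.size() >= threshold) {
            // One-time cost, paid only by large transactions.
            index = new HashSet<>(keys);
        }
        return true;
    }

    /** Uses the hash index when it exists, otherwise scans the list. */
    boolean contains(K key) {
        return index != null ? index.contains(key) : keys.contains(key);
    }

    boolean isIndexed() {
        return index != null;
    }

    int size() {
        return keys.size();
    }
}
```

Small transactions never allocate the extra structure, and both strategies sit behind the same two methods.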

_________________
Emmanuel


 Post subject: Re: Indexing Hangs Thread in addWorkToQueue on TX Commit
PostPosted: Fri Oct 08, 2010 12:29 pm 
Beginner

Joined: Wed Nov 10, 2004 5:48 pm
Posts: 32
Location: Portland
Makes sense. I'm a little nervous about having two distinct code paths for this core functionality; it's always easy for someone to change the 'main' path in the future and forget about the 'big data' path.

Still, I understand the concern about overhead for the typical transaction with perhaps 10 or fewer LuceneWork items.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.