-->
These old forums are deprecated now and set to read-only. We are waiting for you on our new forums!
More modern, Discourse-based and with GitHub/Google/Twitter authentication built-in.

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 6 posts ] 
Author Message
 Post subject: How to investigate slow index updates?
PostPosted: Sat May 15, 2010 5:25 am 
Beginner
Beginner

Joined: Wed Sep 30, 2009 5:29 am
Posts: 29
Hi all,

I have a web app running Hibernate 3.5.1 and Search 3.2.0. I had similar problems with Hibernate 3.3.1 and Search 3.3.1, however, upgrading hasn't fixed the problem. I only get this problem on my VPS box, at the hosting company, my local installation seems fine. My question is about how to enable better logging to help investigate. I use java.util.logging, and Hibernate Search seems to do so also (via commons-logging?) yet I don't see enough information to solve my problem.

I'm running on test data, there are only about 10 objects being updated in this operation (there are only about 20 objects in total in the entire database). I've tried deleting my entire index directory, re-indexing my database, and running the operation again, same result. So, I don't think my index is stuffed with old or otherwise incorrect data.

The problem is that updating my objects, which are stored in a MySQL database, persisted by Hibernate, and indexed by Hibernate Search to a local FSDirectoryProvider, is very slow. I've tracked it down to something in the Hibernate Search indexing operations, while a transaction is being committed.

My code does the usual:
    - start a session
    - start a transaction
    - load objects using a Criteria query
    - modify some objects
    - commit the transaction
    - close the session

When the 'commit the transaction' step is reached, there is a significant pause between the call to close the transaction, and the call to close the session, and turning on java.util.logging levels for hibernate search shows it's probably something in the indexing operation:
Code:
15-May-2010 09:49:49 ltd.myapp.Brains interpretMessage
FINE: processMessage: committing active transaction.
15-May-2010 09:49:49 ltd.myapp.db.dbcp.DBCPConnectionProvider logStatistics
INFO: active: 0 (max: 15)   idle: 2(max: 8)
15-May-2010 09:49:49 org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor run
FINE: Opening an IndexWriter for update
15-May-2010 09:49:49 org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor run
FINE: Opening an IndexWriter for update
15-May-2010 09:49:53 ltd.myapp.Brains interpretMessage
FINE: processMessage closing session.
15-May-2010 09:49:53 ltd.myapp.Brains interpretMessage
FINER: processMessage session closed.
15-May-2010 09:49:53 ltd.myapp.MyService processMessage
FINER: MyService saw interpretMessage returned.
15-May-2010 09:49:53 ltd.myapp.MyService _ctl
FINE: ****** RETURN SERVICE MESSAGE: uid 3, messageClass: subscribeToNotification
15-May-2010 09:49:53 ltd.myapp.MyService _ctl
INFO: TIMING subscribeToNotification took (4730) ms


After the PerDPQueueProcessor line of "Openening an IndexWriter for update" there's a 4 second pause, leading to my simple data update request to take 4.7 seconds in total. The pause is sometimes greater, leading to operations of up to 15 seconds for fairly trivial updates.

My indexes are stored in a local filesystem, in /var/lucene/indexes. It's not NFS mounted or otherwise (according to 'mount'), I can't see why file system access would be degrading performance.

My logging.properties file contains:
Quote:
org.hibernate.search.level = FINE
org.apache.lucene.level = FINE


I tried turning on some config in hibernate.cfg.xml:
Code:
      <property name="hibernate.search.worker.execution">sync</property>
      <property name="hibernate.search.default.optimizer.transaction_limit.max">100</property>


I also found a post on this forum telling me to try something like this:
Code:
      <property name="log4j.logger.org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate">trace</property>


However, this doesn't work, I see no extra logging.. I don't use log4j in my project, should I provide a log4j.xml file with any specific setup to debug Hibernate Search?

What should I do to further investigate the slow indexing operation? I've a feeling that extra logging may help, I just don't see how to turn it on.

Thanks :)
Nick


Top
 Profile  
 
 Post subject: Re: How to investigate slow index updates?
PostPosted: Sat May 15, 2010 5:50 am 
Beginner
Beginner

Joined: Wed Sep 30, 2009 5:29 am
Posts: 29
The obvious occurred to me, I set the two j.u.l levels to FINEST instead of FINE, and got a little more info. I've changed entity names and cut out the Document<> contents for brevity. The situation is that a single Parent object, and the 7 Child objects that it owns (in a @OneToMany List<>) are all being updated, having previously existed in both database and index.

Code:
15-May-2010 10:39:30 org.hibernate.search.backend.impl.BatchedQueueingProcessor performWorks
FINEST: Lucene WorkQueue to send to backend:
        DeleteLuceneWork: ltd.myapp.db.ParentEntity#fa4fde2e-2c76-43ef-8f90-4f531d9e9745
        AddLuceneWork: ltd.myapp.db.ParentEntity#fa4fde2e-2c76-43ef-8f90-4f531d9e9745
        DeleteLuceneWork: ltd.myapp.db.ChildEntity#726015c2-40bc-4a92-8f55-8b895230af83
        AddLuceneWork: ltd.myapp.db.ChildEntity#726015c2-40bc-4a92-8f55-8b895230af83
        DeleteLuceneWork: ltd.myapp.db.ChildEntity#769ec92e-d6a9-42cd-a5f8-f1d53e235001
        AddLuceneWork: ltd.myapp.db.ChildEntity#769ec92e-d6a9-42cd-a5f8-f1d53e235001
        DeleteLuceneWork: ltd.myapp.db.ChildEntity#769f7524-5450-4191-a48b-f00e06b1f77c
        AddLuceneWork: ltd.myapp.db.ChildEntity#769f7524-5450-4191-a48b-f00e06b1f77c
        DeleteLuceneWork: ltd.myapp.db.ChildEntity#5cde210b-4a37-447e-87b5-04ec63a7674a
        AddLuceneWork: ltd.myapp.db.ChildEntity#5cde210b-4a37-447e-87b5-04ec63a7674a
        DeleteLuceneWork: ltd.myapp.db.ChildEntity#03287780-0f08-4147-a692-fd5d865103e8
        AddLuceneWork: ltd.myapp.db.ChildEntity#03287780-0f08-4147-a692-fd5d865103e8
        DeleteLuceneWork: ltd.myapp.db.ChildEntity#6f29787d-0e86-4812-8b6a-40e503bef94a
        AddLuceneWork: ltd.myapp.db.ChildEntity#6f29787d-0e86-4812-8b6a-40e503bef94a
        DeleteLuceneWork: ltd.myapp.db.ChildEntity#d1b2a805-8acc-401b-9c07-1e7a2fe1d8a9
        AddLuceneWork: ltd.myapp.db.ChildEntity#d1b2a805-8acc-401b-9c07-1e7a2fe1d8a9

15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor run
FINE: Opening an IndexWriter for update
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor run
FINE: Opening an IndexWriter for update
15-May-2010 10:39:31 org.hibernate.search.backend.Workspace getIndexWriter
FINEST: IndexWriter opened
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.DeleteExtWorkDelegate performWork
FINEST: Removing class ltd.myapp.db.ParentEntity#fa4fde2e-2c76-43ef-8f90-4f531d9e9745 by id using an IndexWriter.
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate performWork
FINEST: add to Lucene index: class ltd.myapp.db.ParentEntity#fa4fde2e-2c76-43ef-8f90-4f531d9e9745:Document<XXXCUTXXX>
15-May-2010 10:39:31 org.hibernate.search.backend.Workspace getIndexWriter
FINEST: IndexWriter opened
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.DeleteExtWorkDelegate performWork
FINEST: Removing class ltd.myapp.db.ChildEntity#726015c2-40bc-4a92-8f55-8b895230af83 by id using an IndexWriter.
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate performWork
FINEST: add to Lucene index: class ltd.myapp.db.ChildEntity#726015c2-40bc-4a92-8f55-8b895230af83:Document<XXXCUTXXX>
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.DeleteExtWorkDelegate performWork
FINEST: Removing class ltd.myapp.db.ChildEntity#769ec92e-d6a9-42cd-a5f8-f1d53e235001 by id using an IndexWriter.
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate performWork
FINEST: add to Lucene index: class ltd.myapp.db.ChildEntity#769ec92e-d6a9-42cd-a5f8-f1d53e235001:Document<XXXCUTXXX>
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.DeleteExtWorkDelegate performWork
FINEST: Removing class ltd.myapp.db.ChildEntity#769f7524-5450-4191-a48b-f00e06b1f77c by id using an IndexWriter.
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate performWork
FINEST: add to Lucene index: class ltd.myapp.db.ChildEntity#769f7524-5450-4191-a48b-f00e06b1f77c:Document<XXXCUTXXX>
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.DeleteExtWorkDelegate performWork
FINEST: Removing class ltd.myapp.db.ChildEntity#5cde210b-4a37-447e-87b5-04ec63a7674a by id using an IndexWriter.
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate performWork
FINEST: add to Lucene index: class ltd.myapp.db.ChildEntity#5cde210b-4a37-447e-87b5-04ec63a7674a:Document<XXXCUTXXX>
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.DeleteExtWorkDelegate performWork
FINEST: Removing class ltd.myapp.db.ChildEntity#03287780-0f08-4147-a692-fd5d865103e8 by id using an IndexWriter.
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate performWork
FINEST: add to Lucene index: class ltd.myapp.db.ChildEntity#03287780-0f08-4147-a692-fd5d865103e8:Document<XXXCUTXXX>
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.DeleteExtWorkDelegate performWork
FINEST: Removing class ltd.myapp.db.ChildEntity#6f29787d-0e86-4812-8b6a-40e503bef94a by id using an IndexWriter.
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate performWork
FINEST: add to Lucene index: class ltd.myapp.db.ChildEntity#6f29787d-0e86-4812-8b6a-40e503bef94a:Document<XXXCUTXXX>
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.DeleteExtWorkDelegate performWork
FINEST: Removing class ltd.myapp.db.ChildEntity#d1b2a805-8acc-401b-9c07-1e7a2fe1d8a9 by id using an IndexWriter.
15-May-2010 10:39:31 org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate performWork
FINEST: add to Lucene index: class ltd.myapp.db.ChildEntity#d1b2a805-8acc-401b-9c07-1e7a2fe1d8a9:Document<XXXCUTXXX>
15-May-2010 10:39:36 org.hibernate.search.backend.Workspace commitIndexWriter
FINEST: Index changes commited.
15-May-2010 10:39:36 org.hibernate.search.backend.Workspace closeIndexWriter
FINEST: IndexWriter closed
15-May-2010 10:39:36 org.hibernate.search.backend.Workspace commitIndexWriter
FINEST: Index changes commited.
15-May-2010 10:39:36 org.hibernate.search.backend.Workspace closeIndexWriter
FINEST: IndexWriter closed
15-May-2010 10:39:36 ltd.myapp.Brains interpretMessage
FINE: processMessage closing session.


Is it normal that all objects would be removed from the index, and re-added, if modified?

Anyway, there is still a 5 second gap in the logs, between the dump of the last object being indexed and the final "Index changes committed" message.


Top
 Profile  
 
 Post subject: Re: How to investigate slow index updates?
PostPosted: Sat May 15, 2010 2:30 pm 
Hibernate Team
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Is it normal that all objects would be removed from the index, and re-added, if modified?

yes it is, there's not "update" in Lucene, so we have to remove and add again; but this shouldn't be a problem it's usually fast and we do it in the most efficient manner, applying some additional tricks if the specific model permits.

Quote:
I tried turning on some config in hibernate.cfg.xml:
Code:
<property name="hibernate.search.worker.execution">sync</property>
<property name="hibernate.search.default.optimizer.transaction_limit.max">100</property>

using "async" instead of "sync" will prevent your user thread from being blocked and continue with other work (like serving a response to user), but of course it's not a solution but rather a workaround to hide your problem as it should be way faster. I think your VPS appliance is having very slow disk, or some hardware or OS level problem.

If your application is the only one writing to this index, you can enable a mayor performance boost by setting:
Code:
<property name="hibernate.search.default.exclusive_index_use">true</property>

to debug your issue, you could try the blackhole backend to verify the problem is not to be found in the loading of additional data from MySQL (see manual).
Then you could also try configuring a RAMDirectoryProvider, to see if something strange happens during indexwriting, but not limited to your real disk.

Finally about the logging: Hibernate uses slf4j, similar to commons logging, which provides adapters to whatever logging you like (no need to configure log4j)

Run some filesystem benchmarks and compare with your local disk ;)

_________________
Sanne
http://in.relation.to/


Top
 Profile  
 
 Post subject: Re: How to investigate slow index updates on a shared VPS host?
PostPosted: Sun May 16, 2010 10:56 am 
Beginner
Beginner

Joined: Wed Sep 30, 2009 5:29 am
Posts: 29
Sanne, thank you for the very helpful reply.

s.grinovero wrote:
using "async" instead of "sync" will prevent your user thread from being blocked and continue with other work (like serving a response to user), but of course it's not a solution but rather a workaround to hide your problem as it should be way faster. I think your VPS appliance is having very slow disk, or some hardware or OS level problem.


I think you're essentially right about the VPS hardware problem. I am paying some tiny amount of money each month, and God only knows what else I'm sharing the hardware with. I have run some disk i/o benchmarking (using 'iozone') and found their disks to be up to twice as slow (or half as fast..) as my laptop's puny 5600rpm drive. However, the VPS performance seems to be up and down, which is making my performance testing rather erratic.

s.grinovero wrote:
If your application is the only one writing to this index, you can enable a mayor performance boost by setting:
Code:
<property name="hibernate.search.default.exclusive_index_use">true</property>

to debug your issue, you could try the blackhole backend to verify the problem is not to be found in the loading of additional data from MySQL (see manual).


'blackhole' completely removed my timing problems, of course, and all operations took a few ms to complete.

s.grinovero wrote:
Then you could also try configuring a RAMDirectoryProvider, to see if something strange happens during indexwriting, but not limited to your real disk.


I saw RAMDirectoryProvider in the Hibernate Search docs, but so little is said about it I didn't understand it, so didn't try it before now. I thought it would behave like a FSDirectoryProvider starting with an empty index directory each time the app server is started.. that is, it would lose all data when the application is quit. It doesn't seem to use any properties, so I presume indexBase and in fact any disk-backed index store is not used. http://opensource.atlassian.com/project ... SEARCH-135 seems to support this.

I tried running with it, and it gives extremely good performance, so it's clearly a file system i/o problem. The index functions, very quickly, so I don't think my use of the core indexing system is broken.

s.grinovero wrote:
Finally about the logging: Hibernate uses slf4j, similar to commons logging, which provides adapters to whatever logging you like (no need to configure log4j)


Yeah, some 'documentation' you can run into on the internet can be rather misleading :)

Switching back to FSDirectoryProvider, I find the performance problem no longer occurs, and the latest benchmark I ran on my VPS host produced ridiculously small (i.e. fast) results. It could be that running the disk benchmarking software caused the VM management system to allocate my VPS better resources, at least temporarily. Before today, it would have been doing damn near nothing 100% of the time, and so possibly allocated the least of their available VPS i/o bandwidth. Perhaps I should schedule a benchmark in cron daily :P

If I manage to get the problem to recur, I'll try taking all my Indexing annotations off my classes and introduce them one by one until the problem happens again.

Thanks again,
Nick


Top
 Profile  
 
 Post subject: Re: How to investigate slow index updates?
PostPosted: Sun May 16, 2010 11:12 am 
Hibernate Team
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
It could be that running the disk benchmarking software caused the VM management system to allocate my VPS better resources, at least temporarily.

that's very likely; I've seen more reports -not necessarily related to Search- about this kind of very slow resource allocation on first access on other virtual host providers.
A known trick is to use "dd" command to create a big file piping from /dev/zero to fill up your filesystem, and then remove the file: that makes sure all your space is really allocated to you. It shouldn't be needed to be repeated, if you manage to get all space assigned.

Quote:
I saw RAMDirectoryProvider in the Hibernate Search docs, but so little is said about it I didn't understand it, so didn't try it before now. I thought it would behave like a FSDirectoryProvider starting with an empty index directory each time the app server is started.. that is, it would lose all data when the application is quit.

Ok I'll take note to improve docs in that: very simply said it avoids using the disk, it keeps the index in memory (RAM);
so if you can afford to loose the index at application crash/restart and the index is not so huge to fit in your free memory it's the best performing option.

Quote:
If I manage to get the problem to recur, I'll try taking all my Indexing annotations off my classes and introduce them one by one until the problem happens again.

no need to do that, from the blackhole test results you got it's quite sure the problem was to be found in IO.

_________________
Sanne
http://in.relation.to/


Top
 Profile  
 
 Post subject: Re: How to investigate slow index updates?
PostPosted: Mon May 17, 2010 3:08 am 
Beginner
Beginner

Joined: Wed Sep 30, 2009 5:29 am
Posts: 29
s.grinovero wrote:
Quote:
I saw RAMDirectoryProvider in the Hibernate Search docs, but so little is said about it I didn't understand it, so didn't try it before now. I thought it would behave like a FSDirectoryProvider starting with an empty index directory each time the app server is started.. that is, it would lose all data when the application is quit.

Ok I'll take note to improve docs in that: very simply said it avoids using the disk, it keeps the index in memory (RAM);
so if you can afford to loose the index at application crash/restart and the index is not so huge to fit in your free memory it's the best performing option.


Still not quite clear. When you talk of affording to lose data, I imagine you're talking about an ungraceful shutdown (e.g. crash or forced kill of the process) that would cause RAMDirectoryProvider to lose your data. Do you mean that in the case of graceful shutdown, it would actually save your index somehow? As the RAMDirectoryProvider does not take config (e.g. indexBase for a file system location) I don't see how it would ever save your index between restarts of the application.

I imagine a use case would be that the app would be configured with RAMDirectoryProvider, and force re-indexing on startup to populate the in-memory index. In the case of a Hibernate app backed by Hibernate Search and Lucene, this would involve the 'mass re-indexing' scenario covered in the docs, to force all persistent entities to be re-examined by the indexing system. Without this re-indexing on startup, the RAMDirectoryProvider's in-memory index would remain empty until an indexed Entity is persisted (saved or updated). A few words along these lines in the docs would clarify how and why to use this Directory Provider.

Cheers :)
Nick


Top
 Profile  
 
Display posts from previous:  Sort by  
Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 6 posts ] 

All times are UTC - 5 hours [ DST ]


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.