
All times are UTC - 5 hours [ DST ]



 Post subject: Tuning massindexer
Posted: Tue Feb 05, 2013 7:39 pm
Newbie

Joined: Tue Feb 05, 2013 7:24 pm
Posts: 2
Hi

I'm using Hibernate Search 3.4 and I've changed our system to use the massindexer to rebuild an index. There are about 750,000 records for this, and when I run the indexer with the following parameters:

fullTextSession
    .createIndexer( c )
    .batchSizeToLoadObjects( 50 )
    .cacheMode( CacheMode.IGNORE )
    .threadsToLoadObjects( 30 )
    .threadsForSubsequentFetching( 20 )
    .start();

it runs fine for about the first 200,000 records.

I'm watching the progress on the MBeans, and I can see that the LoadedEntitiesCount goes up by around 10k per second, and the DocumentsAddedCount runs a few thousand behind, but keeping up.

After about 200k records, however, it slows down dramatically: the LoadedEntitiesCount rate drops to about 100 per second, and the DocumentsAddedCount runs almost level with it, suggesting the indexing isn't the problem.
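To put those two rates in perspective, here is a small sketch that turns counter readings into a rate and a rough time-to-finish. The class and method names are made up for illustration; the only inputs are the figures quoted above (750,000 records, about 10k per second at first, about 100 per second after the first 200k).

```java
public class IndexingEta {

    /** Entities per second between two readings of an increasing counter. */
    public static double ratePerSecond(long countBefore, long countAfter, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return (countAfter - countBefore) * 1000.0 / elapsedMillis;
    }

    /** Seconds needed to process the remaining entities at a constant rate. */
    public static double secondsRemaining(long total, long done, double ratePerSecond) {
        if (ratePerSecond <= 0) {
            throw new IllegalArgumentException("ratePerSecond must be positive");
        }
        return (total - done) / ratePerSecond;
    }

    public static void main(String[] args) {
        // Whole job at the initial rate: 750,000 / 10,000 per second = 75 s.
        double atInitialRate = secondsRemaining(750_000, 0, 10_000.0);
        // Remaining 550,000 records at the degraded rate: 5,500 s.
        double atDegradedRate = secondsRemaining(750_000, 200_000, 100.0);
        System.out.printf("initial rate: %.0f s, degraded rate: %.0f s%n",
                atInitialRate, atDegradedRate);
    }
}
```

In other words, the whole rebuild would take a little over a minute at the initial rate, while the remaining records alone take about an hour and a half at the degraded one.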

I can see that all 30 entityLoader threads are running, but I'm not sure what causes the huge slowdown. It's possible it's the database, but while the CPU is at 100% usage, it's roughly 75% Java and 25% SQL Server, so I don't _think_ that's the problem.
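One way to check what the loader threads are actually doing is to group them by thread state. The following is a minimal sketch using only the standard `java.lang.management` API; it has to run inside the indexing JVM (a thread dump taken with a tool such as jstack gives the same information from outside). The name fragment "entityloader" is an assumption based on the thread names mentioned above, so adjust it to whatever the thread list actually shows.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

public class ThreadStateSummary {

    /** Counts live threads whose name contains the fragment (case-insensitive), by state. */
    public static Map<Thread.State, Integer> countByState(String nameFragment) {
        String needle = nameFragment.toLowerCase(Locale.ROOT);
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            if (info.getThreadName().toLowerCase(Locale.ROOT).contains(needle)) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // "entityloader" is an assumed name fragment; pass another one as the first argument.
        String fragment = args.length > 0 ? args[0] : "entityloader";
        System.out.println(countByState(fragment));
    }
}
```

If most of the matching threads turn out to be BLOCKED or WAITING rather than RUNNABLE, that points to contention or to waiting on the database rather than to CPU-bound work.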

Any ideas?

thanks
Pete


 Post subject: Re: Tuning massindexer
Posted: Tue Feb 05, 2013 7:49 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi

By "SQL Server", do you mean Microsoft SQL Server? Which version and which driver are you using?

Could you connect with a profiler or JConsole and get an idea of the garbage collection activity?

Lucene will start to slow down a bit when the index gets larger, but it should be barely noticeable so I don't think it's Lucene; although all parties involved are memory hungry, so I'd check your memory levels in the JVM.
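For anyone without a profiler at hand, the same garbage collection and heap figures can be read from inside the JVM with the standard `java.lang.management` beans. This is only a sketch: the class name and the ten-second sampling window are arbitrary choices, and the code needs to run in the same JVM as the indexer.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class GcSnapshot {

    /** Accumulated collection time over all collectors, in milliseconds. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Fraction of a wall-clock window that was spent in garbage collection. */
    public static double gcFraction(long gcMillisBefore, long gcMillisAfter, long wallMillis) {
        if (wallMillis <= 0) {
            throw new IllegalArgumentException("wallMillis must be positive");
        }
        return (double) (gcMillisAfter - gcMillisBefore) / wallMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long windowMillis = 10_000; // arbitrary sampling window
        long before = totalGcMillis();
        Thread.sleep(windowMillis);
        long after = totalGcMillis();

        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("GC share of time: %.1f%%, heap used: %d MB, heap max: %d MB%n",
                100.0 * gcFraction(before, after, windowMillis),
                heap.getUsed() / (1024 * 1024),
                heap.getMax() / (1024 * 1024)); // max may be reported as -1 if undefined
    }
}
```

A GC share that climbs sharply around the point where loading slows down would suggest memory pressure; a share that stays low suggests looking elsewhere.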

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Tuning massindexer
Posted: Tue Feb 05, 2013 9:05 pm
Newbie

Joined: Tue Feb 05, 2013 7:24 pm
Posts: 2
Yes, MS SQL Server, in this case 2012, via the jTDS driver.

I'm connected using JVisualVM to view the MBeans, and judging by the profile there, there's not a lot of GC going on and the heap never fills completely (there's 512 MB available; usage climbs from about 180 MB to 320 MB and then a GC runs again).

I'm fairly sure it's not the Lucene side of things, as the IndexWriter threads are waiting most of the time, and it can indeed load 'documents' and queue them for Lucene to process. It's the loading that seems to grind to a halt.

I guess it could be SQL Server's caches slowing things down, but even if that were the case, I don't think it would slow down by a factor of 100! Weird.

