Hibernate Search 4.3.0.Final scalability

weebl · **Joined:** Fri Apr 24, 2015 1:28 pm **Posts:** 1

Hey guys,

I was wondering how well Hibernate Search scales on a single instance. We are using Hibernate Search as part of ModeShape 3.x (DV 6.1), indexing about 10 string properties per node. We currently have about 1 million nodes that we query. The machines each have two cores and 24 GB of RAM. Indexing is currently done asynchronously because we are hitting a bottleneck, which is probably the filesystem. We have been trying to use the near-real-time index manager, but the JBoss EAP kit for ModeShape does not correctly pass this attribute on to Hibernate Search; that is being worked on.

We mostly query on only one property of the nodes. Equality queries currently perform well, on the order of 30-150 ms per query. I was wondering how large the indexes could grow before query time exceeds 2 seconds; of course this depends on a lot of factors. Currently the heap size is 16 GB since we were experiencing some problems with garbage collection; we are trying to reduce this heap size. I understand we should leave as much memory as possible to the operating system so that it can cache the index files in memory. The number of concurrent queries is currently lower than one, but will probably grow in the future.

The OS is RHEL 6, but the mmap directory does not seem to be used, and it is unclear why. The filesystem is ext4, and it seems ModeShape is passing the correct "auto" parameter to Hibernate Search.

Any input is welcome. If you would like to know more details, let me know. :)

Thanks in advance!

sanne.grinovero · **Posted:** Sun May 03, 2015 3:28 pm

Hi weebl,
I believe we had a chat on IRC on the same subject? Was that before you posted this? I'm wondering if you still need some advice.

I agree with what you wrote so far; in particular, leaving free memory to the OS would help keep the indexes buffered. Not least, a smaller heap would keep your performance consistent over time.

A heap that is too small, though, would prevent the caching of Filter definitions, as these use Soft References. I'm not sure if ModeShape declares any such filter: I'm just pointing out the possible things to check.
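To illustrate why heap pressure matters for anything held through Soft References, here is a minimal generic sketch of a soft-reference cache. It is not Hibernate Search's actual filter caching code; the class name `SoftCacheSketch` and its `get` method are made up for this example. The point is only that the garbage collector is allowed to clear softly referenced values when memory runs low, so a tight heap means more cache misses and more recomputation.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical illustration only: a tiny cache whose values are softly referenced.
public class SoftCacheSketch<K, V> {

    private final ConcurrentMap<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    // Returns the cached value, or recomputes it if it was never stored
    // or if the garbage collector has already cleared the soft reference.
    public V get(K key, Supplier<V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.get();
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

With a comfortable heap the loader typically runs once per key; under memory pressure the same key may be reloaded many times.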

Is the JVM running in 64-bit mode? That's one of the conditions for it to pick the MMapDirectory. Other conditions depend on the JDK in use: in particular, you'll need to have "sun.misc.Cleaner" and "java.nio.DirectByteBuffer" available on the classpath, and java.nio.DirectByteBuffer should have a method named "cleaner". (Details are in the source code of org.apache.lucene.store.MMapDirectory: the UNMAP_SUPPORTED constant defines whether the MMapDirectory is safe enough to be enabled.)
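If you want to check these conditions on the target machine without digging through Lucene, something like the following rough sketch could be run with the same JVM that runs ModeShape. It does not call Lucene at all; it only approximates the checks described above using plain reflection, so treat Lucene's own UNMAP_SUPPORTED value as the real answer. The class name `MMapPreconditions` and its helper methods are made up for this example, and the 64-bit detection via the `sun.arch.data.model` and `os.arch` system properties is an assumption about how to detect it, not a copy of Lucene's logic.

```java
import java.lang.reflect.Method;

// Hypothetical diagnostic: approximates the MMapDirectory preconditions with plain reflection.
public class MMapPreconditions {

    // True if the named class can be loaded (without initializing it).
    public static boolean classPresent(String className) {
        try {
            Class.forName(className, false, MMapPreconditions.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // True if the named class declares at least one method with the given name.
    public static boolean hasMethod(String className, String methodName) {
        try {
            Class<?> type = Class.forName(className, false, MMapPreconditions.class.getClassLoader());
            for (Method m : type.getDeclaredMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError | SecurityException e) {
            return false;
        }
    }

    // Assumption: a 64-bit JVM reports "64" in sun.arch.data.model, or failing that in os.arch.
    public static boolean is64Bit() {
        String model = System.getProperty("sun.arch.data.model");
        if (model != null) {
            return model.contains("64");
        }
        String arch = System.getProperty("os.arch");
        return arch != null && arch.contains("64");
    }

    public static void main(String[] args) {
        System.out.println("64-bit JVM:                  " + is64Bit());
        System.out.println("sun.misc.Cleaner present:    " + classPresent("sun.misc.Cleaner"));
        System.out.println("DirectByteBuffer present:    " + classPresent("java.nio.DirectByteBuffer"));
        System.out.println("DirectByteBuffer.cleaner():  " + hasMethod("java.nio.DirectByteBuffer", "cleaner"));
    }
}
```

If any of the four lines prints false, that would be the first place to look when "auto" does not end up selecting the MMapDirectory.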