How to use Hibernate Search with Amazon Elastic Beanstalk?

pmeier · **Joined:** Thu Apr 28, 2011 6:19 am **Posts:** 2

Hi,

I am new to Hibernate Search and have started reading the book "Hibernate Search in Action". While reading, I asked myself how to implement this in my application. It is a normal Java web application built on Spring Web MVC, with Hibernate for persistence, running on the new Amazon Elastic Beanstalk environment (several machines with a Tomcat server installed, behind a load balancer). The database server is not running in the cloud but on a dedicated server on the internet; communication between the web application and the database server goes over an SSH tunnel set up with JSch. As you might guess, I have full control of that database server (running Ubuntu Server 10.04), which may be useful information.

It would be pretty nice if I could use Hibernate Search in this environment. After some reading, thinking and googling I have found no satisfying answer; the one main problem remaining is how and where to store the Lucene index.

So far the only suitable option seems to be the JDBCDirectory provided by the Compass Project, but I have read that this isn't the best solution either. My other thought was to store the index on the dedicated database server, but I didn't find a solid way to do this.

I hope there is a way because I don't want to use Solr - Hibernate Search would be exactly what I need.

Thanks a lot for any answers and ideas....

sanne.grinovero · **Posted:** Thu Apr 28, 2011 8:03 am

Hi,
I have never used Beanstalk, but the Infinispan Lucene Directory was developed mainly around EC2 requirements; EC2 has been my development testbed.

To read about what it is:
http://community.jboss.org/wiki/InfinispanAsADirectoryForLucene

We made the integration with Hibernate Search (it's really just a couple of binding classes):
http://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#infinispan-directories

As Infinispan is "in memory", you can either use a CacheLoader to keep an asynchronous write-through copy of its content, or, in case you lose all nodes, just reindex. I'd suggest configuring Infinispan either to use S3 as a storage option, or to use the JDBCCacheStore to replicate the index in the same database you use for Hibernate entities.
On the Infinispan forums I once pasted a complete cacheloader configuration tuned for MySQL.
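
As a rough illustration of the Hibernate Search side of this, here is a minimal sketch in Java that only assembles the two settings which select the Infinispan directory. The property names are the ones I remember from the reference documentation linked above, so please check them against the version you use. The resource name `my-search-infinispan.xml` is a made-up placeholder for your own Infinispan configuration file, which is where the cache store (S3 or JDBC) would be declared.

```java
import java.util.Properties;

/** Sketch: Hibernate Search settings that select the Infinispan-backed Lucene directory. */
final class InfinispanDirectorySettings {

    /** Hypothetical classpath resource holding the Infinispan configuration. */
    static final String INFINISPAN_CONFIG = "my-search-infinispan.xml";

    /** Builds the properties to merge into the Hibernate configuration. */
    static Properties build() {
        Properties p = new Properties();
        // Store all indexes in Infinispan instead of on the local filesystem.
        p.setProperty("hibernate.search.default.directory_provider", "infinispan");
        // Point Hibernate Search at the Infinispan configuration to start from.
        p.setProperty("hibernate.search.infinispan.configuration_resourcename", INFINISPAN_CONFIG);
        return p;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The resulting `Properties` can be merged into whatever already carries your Hibernate settings, for example the `hibernateProperties` of a Spring session factory bean.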

Infinispan is based on JGroups, which usually uses multicast to autodiscover peer nodes; on Amazon multicast is blocked, so you need to configure an alternative discovery protocol. Having a database already makes this easy via JDBC_PING:

http://community.jboss.org/wiki/JDBCPING
http://www.jgroups.org/javadoc/org/jgro ... _PING.html
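
To give an idea of what the discovery element might look like, here is a small Java sketch that renders a `JDBC_PING` element for a JGroups stack file. The four `connection_*` attribute names are the ones I recall from the JDBC_PING documentation linked above; the URL, user, password and driver in `main` are invented placeholders, and the rest of the protocol stack is not shown.

```java
/** Sketch: renders a JDBC_PING element to paste into a JGroups protocol stack file. */
final class JdbcPingSnippet {

    /** Escapes characters that are not allowed inside an XML attribute value. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds the element; all four values are supplied by the caller. */
    static String jdbcPing(String url, String user, String password, String driver) {
        return "<JDBC_PING connection_url=\"" + escape(url) + "\"\n"
             + "           connection_username=\"" + escape(user) + "\"\n"
             + "           connection_password=\"" + escape(password) + "\"\n"
             + "           connection_driver=\"" + escape(driver) + "\"/>";
    }

    public static void main(String[] args) {
        // Placeholder values only: a MySQL database reached through a local tunnel port.
        System.out.println(jdbcPing(
                "jdbc:mysql://localhost:3306/appdb?useUnicode=true&characterEncoding=UTF-8",
                "appuser", "secret", "com.mysql.jdbc.Driver"));
    }
}
```

Note that a JDBC URL containing `&` has to be written as `&amp;` inside the XML file, which is the only reason the sketch bothers with escaping.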

Finally, having a database behind an SSH tunnel might make your application slow (and expensive, as you pay for the outbound traffic), so I'd suggest also using Infinispan as the Hibernate second-level cache; this comes almost for free, as you're configuring Infinispan anyway.
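
For the second-level cache part, a minimal sketch of the Hibernate settings involved could look like this. It assumes a Hibernate version that ships the Infinispan cache module with the region factory class named below (please verify against your version); it does not cover sharing one cache manager between the cache and the index, and entities still need to be marked as cacheable in their mappings.

```java
import java.util.Properties;

/** Sketch: Hibernate settings that enable Infinispan as the second-level cache. */
final class SecondLevelCacheSettings {

    /** Builds the properties to merge into the Hibernate configuration. */
    static Properties build() {
        Properties p = new Properties();
        // Turn the second-level cache on.
        p.setProperty("hibernate.cache.use_second_level_cache", "true");
        // Region factory from the hibernate-infinispan module (assumed to be on the classpath).
        p.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.infinispan.InfinispanRegionFactory");
        return p;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```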

pmeier · **Joined:** Thu Apr 28, 2011 6:19 am **Posts:** 2

Thanks a lot for your quick and great answer.