
All times are UTC - 5 hours [ DST ]



 Post subject: Hibernate Search with Infinispan - not syncing
PostPosted: Fri Aug 03, 2012 2:22 pm 
Newbie

Joined: Fri Aug 03, 2012 11:41 am
Posts: 3
We're trying to switch to the Infinispan directory provider for Hibernate Search. Until now we used Hibernate Search with a custom rsync-based directory provider. It worked okay, but it had a 120-second delay built in, and we're hoping Infinispan will propagate index updates faster.

We have followed the suggestions in the documentation and are using JGroups to channel all writes to a single IndexWriter. We have a master and one slave at the moment.

Master settings:

Code:
hibernate.search.lucene_version=LUCENE_35
hibernate.search.default.exclusive_index_use=true
hibernate.search.infinispan.configuration_resourcename=infinispan/search.xml
hibernate.search.default.directory_provider.copytask.enabled=false
hibernate.search.worker.backend=jgroupsMaster
hibernate.search.default.directory_provider=infinispan

The slave has the same settings, except for this one:

Code:
hibernate.search.worker.backend=jgroupsMaster

With Infinispan, I can bring up the master: it does a full reindexing, the app works, and I see no problems in the logs. However, we have a FileCacheStore configured and I can't find the actual Lucene files in it.

Code:
         <loaders passivation="true" shared="false" preload="true">
            <loader class="org.infinispan.loaders.file.FileCacheStore"
               fetchPersistentState="true" ignoreModifications="false"
               purgerThreads="3" purgeSynchronously="true" purgeOnStartup="false">
               <properties>
                  <property name="location" value="/var/dex/lucene" />
               </properties>
               <singletonStore enabled="true" pushStateWhenCoordinator="true" pushStateTimeout="20000"/>
            </loader>
         </loaders>

Then when I bring up the slave I see these errors:

STDOUT [ERROR] [2012.08.03 13:50:05] cacheviews.CacheViewsManagerImpl - ISPN000172: Failed to prepare view CacheView{viewId=5, members=[MarkMac-29668, firecracker-57641]} for cache LuceneIndexesData, rolling back to view CacheView{viewId=1, members=[MarkMac-29668]}
java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: TimeoutException
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:262)
at java.util.concurrent.FutureTask.get(FutureTask.java:119)
at org.infinispan.cacheviews.CacheViewsManagerImpl.clusterPrepareView(CacheViewsManagerImpl.java:318)
at org.infinispan.cacheviews.CacheViewsManagerImpl.clusterInstallView(CacheViewsManagerImpl.java:249)
at org.infinispan.cacheviews.CacheViewsManagerImpl$ViewInstallationTask.call(CacheViewsManagerImpl.java:875)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: TimeoutException
at org.infinispan.util.Util.rewrapAsCacheException(Util.java:525)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:172)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:489)
at org.infinispan.cacheviews.CacheViewsManagerImpl$2.call(CacheViewsManagerImpl.java:302)
at org.infinispan.cacheviews.CacheViewsManagerImpl$2.call(CacheViewsManagerImpl.java:299)
... 5 more
Caused by: org.jgroups.TimeoutException: TimeoutException
at org.jgroups.util.Promise._getResultWithTimeout(Promise.java:145)
at org.jgroups.util.Promise.getResultWithTimeout(Promise.java:40)
at org.jgroups.util.AckCollector.waitForAllAcks(AckCollector.java:93)
at org.jgroups.protocols.RSVP$Entry.block(RSVP.java:275)
at org.jgroups.protocols.RSVP.down(RSVP.java:114)
at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:238)
at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:1025)
at org.jgroups.JChannel.down(JChannel.java:729)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:617)
at org.jgroups.blocks.RequestCorrelator.sendUnicastRequest(RequestCorrelator.java:202)
at org.jgroups.blocks.UnicastRequest.sendRequest(UnicastRequest.java:44)
at org.jgroups.blocks.Request.execute(Request.java:83)
at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:366)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:270)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:165)
... 8 more

What techniques can I use to troubleshoot this? How can I monitor what Infinispan is doing with regard to JGroups?


 Post subject: Re: Hibernate Search with Infinispan - not syncing
PostPosted: Sat Aug 04, 2012 5:31 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2296
Location: Third rock from the Sun
Hi,
the stack trace you shared seems to point to a problem during state transfer, in either Infinispan or JGroups.

That's best reported on the Infinispan forums at https://community.jboss.org/en/infinispan?view=discussions , where they can help you better. I guess they will ask you for a TRACE-level log for the JGroups and Infinispan categories.
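
To capture such a log, one option is to raise the log level for the two libraries' root packages. This is a minimal sketch, assuming log4j 1.x is the logging backend (the thread does not say which one is in use). org.infinispan and org.jgroups are the package names visible in the stack trace above.

Code:
# assumption: log4j 1.x properties format
log4j.logger.org.infinispan=TRACE
log4j.logger.org.jgroups=TRACE

TRACE output is very verbose, so it is probably best enabled only while reproducing the failed join.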

Some more comments on your configuration:

- hibernate.search.worker.backend=jgroupsMaster
That's correct on the master, but you mention having it on the slaves as well? I guess it's a copy/paste error.

- singletonStore
Why is your CacheLoader set as singleton? Are you sharing the directory via a network filesystem?
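
For comparison, a slave node would be expected to use the slave variant of the backend. This is a sketch that reuses the property key from the first post and changes only the value.

Code:
# slave node: sends index work to the master instead of writing the index itself
hibernate.search.worker.backend=jgroupsSlave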

You won't find the typical set of Lucene files in the CacheLoader storage directory, as Infinispan encodes its data in a different way. I've contributed a "translator" to Infinispan which can read a Lucene index as a CacheLoader source, but I didn't implement writing. If you need that, I could point you at the methods that would have to be implemented.

Also, could you tell us which versions of Search, Infinispan and JGroups you're using? What are your Infinispan and JGroups configurations? The default ones should work pretty much out of the box, but when using JGroups both for the master/slave channel and indirectly for Infinispan, you might want to make sure the two clusters communicate over different ports.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Hibernate Search with Infinispan - not syncing
PostPosted: Sun Aug 05, 2012 1:28 pm 
Newbie

Joined: Fri Aug 03, 2012 11:41 am
Posts: 3
Thanks so much for replying.

Versions:

4.1.1.Final hibernate-search, hibernate-infinispan
3.1.0.Final jgroups

Quote:
- singletonStore
Why is your CacheLoader set as singleton? Are you sharing the directory via a network filesystem?

No, we just wanted a persistent store in case the master goes down, so that we don't have to reindex everything when it comes back up. I'm not sure how to accomplish that. We thought a CacheStore would solve it.

Quote:
What are your Infinispan and JGroups configurations? The default ones should work pretty much out of the box, but when using JGroups both for the master/slave channel and indirectly for Infinispan, you might want to make sure the two clusters communicate over different ports.


This is interesting. The Hibernate Search backend is starting a JGroups cluster, because it seems the hsearch JGroupsBackendQueueProcessor initializes one. It uses a default name of "Hibernate Search Cluster". Again, here are the props available to the entity manager when hsearch starts up:

Code:
hibernate.search.worker.backend=jgroupsMaster
hibernate.search.worker.backend.jgroups.configurationFile=hsearch/jgroups-configuration.xml
hibernate.search.default.directory_provider=infinispan
hibernate.search.infinispan.configuration_resourcename=infinispan/search.xml


And then Infinispan's JGroupsTransport also starts a channel, based on the Infinispan config (search.xml in my setup):

Code:
        <transport
            transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport"
            clusterName="Hibernate_Search_Cluster"
            distributedSyncTimeout="120000">
            <properties>
                <property name="configurationFile" value="hsearch/jgroups-configuration.xml" />
            </properties>
            <!-- Note that the JGroups transport uses sensible defaults if no configuration
                property is defined. See the JGroupsTransport javadocs for more flags -->
        </transport>


Should I really have two channels? What is the purpose of each cluster? Based on the documentation, I thought JGroups was recommended in order to make sure there was just one IndexWriter. I guess I'm just unclear on how to configure that. Btw, we're using TCP; here is our JGroups config. Full Infinispan config below. Again, thank you so much!

Code:
<config>

    <TCP bind_port="7804"
         max_bundle_size="120000"/>
    <TCPPING timeout="3000"
             initial_hosts="127.0.0.1[7804],127.0.0.1[7805]"
             port_range="1"
             num_initial_members="2"/>
    <VERIFY_SUSPECT timeout="1500"  />
    <pbcast.NAKACK use_mcast_xmit="false"
                   retransmit_timeout="300,600,1200,2400,4800"
                   discard_delivered_msgs="true"/>
    <pbcast.STABLE stability_delay="1000"
                   desired_avg_gossip="50000"
                   max_bytes="400000"/>
    <pbcast.GMS print_local_addr="true"
              join_timeout="5000"
              view_bundling="true"/>
    <pbcast.FLUSH level="info"/>
    <pbcast.STATE />
</config>



Code:
<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns="urn:infinispan:config:5.1"
            xsi:schemaLocation="urn:infinispan:config:5.1 http://www.infinispan.org/schemas/infinispan-config-5.1.xsd">

    <!-- *************************** -->
    <!-- System-wide global settings -->
    <!-- *************************** -->

    <global>
   
        <!-- Duplicate domains are allowed so that multiple deployments with default configuration
            of Hibernate Search applications work - if possible it would be better to use JNDI to share
            the CacheManager across applications -->
        <globalJmxStatistics
            enabled="false"
            cacheManagerName="HibernateSearch"
            allowDuplicateDomains="true" />

        <!-- If the transport is omitted, there is no way to create distributed or clustered
            caches. There is no added cost to defining a transport but not creating a cache that uses one,
            since the transport is created and initialized lazily. -->
        <transport
            transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport"
            clusterName="Hibernate_Search_Cluster"
            distributedSyncTimeout="120000">
            <properties>
                <property name="configurationFile" value="hsearch/jgroups-configuration.xml" />
            </properties>
            <!-- Note that the JGroups transport uses sensible defaults if no configuration
                property is defined. See the JGroupsTransport javadocs for more flags -->
        </transport>

        <!-- Used to register JVM shutdown hooks. hookBehavior: DEFAULT, REGISTER, DONT_REGISTER.
            Hibernate Search takes care to stop the CacheManager so registering is not needed -->
        <shutdown
            hookBehavior="DONT_REGISTER" />

    </global>

    <!-- *************************** -->
    <!-- Default "template" settings -->
    <!-- *************************** -->

    <default>

         <loaders passivation="true" shared="false" preload="true">
            <loader class="org.infinispan.loaders.file.FileCacheStore"
               fetchPersistentState="true" ignoreModifications="false"
               purgerThreads="3" purgeSynchronously="true" purgeOnStartup="false">
               <properties>
                  <property name="location" value="/var/dex/lucene" />
               </properties>
               <singletonStore enabled="true" pushStateWhenCoordinator="true" pushStateTimeout="20000"/>
            </loader>
         </loaders>


        <locking
            lockAcquisitionTimeout="120000"
            writeSkewCheck="false"
            concurrencyLevel="5000"
            useLockStriping="false" />
           

        <!-- Invocation batching is required for use with the Lucene Directory -->
        <invocationBatching
            enabled="true" />

        <!-- This element specifies that the cache is clustered. modes supported: distribution
            (d), replication (r) or invalidation (i). Don't use invalidation to store Lucene indexes (as
            with Hibernate Search DirectoryProvider). Replication is recommended for best performance of
            Lucene indexes, but make sure you have enough memory to store the index in your heap.
            Also distribution scales much better than replication on high number of nodes in the cluster. -->
        <clustering
            mode="replication">

            <!-- Prefer loading all data at startup than later -->
            <stateRetrieval
                timeout="120000"
                fetchInMemoryState="true"
                />

            <!-- Network calls are synchronous by default -->
            <sync
                replTimeout="20000" />
        </clustering>

        <jmxStatistics
            enabled="false" />

        <eviction
            maxEntries="-1"
            strategy="NONE" />

        <expiration
            maxIdle="-1" />

    </default>

    <!-- ******************************************************************************* -->
    <!-- Individually configured "named" caches.                                         -->
    <!--                                                                                 -->
    <!-- While default configuration happens to be fine with similar settings across the -->
    <!-- three caches, they should generally be different in a production environment.   -->
    <!--                                                                                 -->
    <!-- Current settings could easily lead to OutOfMemory exception as a CacheStore     -->
    <!-- should be enabled, and maybe distribution is desired.                           -->
    <!-- ******************************************************************************* -->

    <!-- *************************************** -->
    <!--  Cache to store Lucene's file metadata  -->
    <!-- *************************************** -->
    <namedCache
        name="LuceneIndexesMetadata">
        <clustering
            mode="replication">
            <stateRetrieval
                fetchInMemoryState="true" />
            <sync
                replTimeout="120000" />
        </clustering>
    </namedCache>

    <!-- **************************** -->
    <!--  Cache to store Lucene data  -->
    <!-- **************************** -->
    <namedCache
        name="LuceneIndexesData">
        <clustering
            mode="replication">
            <stateRetrieval
                fetchInMemoryState="true"/>
            <sync
                replTimeout="120000" />
        </clustering>
    </namedCache>

    <!-- ***************************** -->
    <!--  Cache to store Lucene locks  -->
    <!-- ***************************** -->
    <namedCache
        name="LuceneIndexesLocking">
        <clustering
            mode="replication">
            <stateRetrieval
                fetchInMemoryState="true"/>
            <sync
                replTimeout="120000" />
        </clustering>
    </namedCache>

</infinispan>


 Post subject: Re: Hibernate Search with Infinispan - not syncing
PostPosted: Tue Aug 07, 2012 9:32 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2296
Location: Third rock from the Sun
Quote:
Versions:

4.1.1.Final hibernate-search, hibernate-infinispan
3.1.0.Final jgroups

You should use JGroups 3.0.9.Final or a newer version in the 3.0.x series: this version of Hibernate Search requires Infinispan 5.1.x, which in turn requires JGroups 3.0.x.
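
If the project is built with Maven (an assumption; the thread does not name a build tool), pinning the suggested version could look like the sketch below. The version number is the one recommended above.

Code:
<!-- assumption: Maven build; overrides the 3.1.0.Final currently on the classpath -->
<dependency>
    <groupId>org.jgroups</groupId>
    <artifactId>jgroups</artifactId>
    <version>3.0.9.Final</version>
</dependency>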

Quote:
No, we just wanted a persistent store in case the master goes down, so that we don't have to reindex everything when it comes back up. I'm not sure how to accomplish that. We thought a CacheStore would solve it.

Yes, a CacheStore is the right way to get a persistent index, but it doesn't need to be a SingletonStore. Whether you need one depends on which node you want writing to that filesystem path; for read operations, all nodes should be able to access it directly. This implies the path you configured should be valid and point to the same physical files on each of your nodes, as with a network shared mount point.
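
As a sketch of that suggestion, this is the loader from the original post with only the singletonStore element dropped. All other attributes are left exactly as posted and have not been tuned; in particular, shared should reflect whether /var/dex/lucene really is the same mount on every node.

Code:
<!-- Same FileCacheStore as posted, without the singletonStore element.
     Worth checking: with passivation="true" Infinispan writes entries to the
     store when they are evicted from memory, which may not be what you want
     for an index that must survive a restart. -->
<loaders passivation="true" shared="false" preload="true">
   <loader class="org.infinispan.loaders.file.FileCacheStore"
      fetchPersistentState="true" ignoreModifications="false"
      purgerThreads="3" purgeSynchronously="true" purgeOnStartup="false">
      <properties>
         <property name="location" value="/var/dex/lucene" />
      </properties>
   </loader>
</loaders>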

Quote:
This is interesting. The Hibernate Search backend is starting a JGroups cluster, because it seems the hsearch JGroupsBackendQueueProcessor initializes one. It uses a default name of "Hibernate Search Cluster". Again, here are the props available to the entity manager when hsearch starts up:


That's not wrong and should work, as JGroups isolates network packets that carry a different cluster name. Still, you might want to fine-tune things by customizing the configuration of each group and setting different network ports.

Quote:
Should I really have two channels? What is the purpose of each cluster? Based on the documentation, I thought JGroups was recommended in order to make sure there was just one IndexWriter. I guess I'm just unclear on how to configure that. Btw, we're using TCP; here is our JGroups config. Full Infinispan config below. Again, thank you so much!

Yes, for now you need two channels, and you're correct about the reasons.
That's not a bad thing in terms of performance, as the two aspects can be tuned independently, but I realize it might be annoying to maintain two sets of JGroups configuration files plus system and firewall rules. The next version will allow sharing a single channel for both services, but that won't buy you extra performance: it's just a configuration convenience.
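
To illustrate keeping the two channels apart: in the setup posted above, both the Hibernate Search backend and the Infinispan transport load hsearch/jgroups-configuration.xml. Below is a sketch of pointing them at separate files instead. The two file names are hypothetical placeholders; the property keys are the ones already used in this thread.

Code:
# Hibernate Search master/slave channel (file name is a placeholder)
hibernate.search.worker.backend.jgroups.configurationFile=hsearch/jgroups-backend.xml

Code:
<!-- Infinispan transport in search.xml (file name is a placeholder) -->
<properties>
    <property name="configurationFile" value="infinispan/jgroups-infinispan.xml" />
</properties>

Each file can then set its own TCP bind_port and a matching initial_hosts list, so the two clusters do not compete for the same ports.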

The JGroups configuration you posted seems incomplete or outdated; where did you get it from? I wouldn't recommend writing a new configuration from scratch unless you're a JGroups expert. I would suggest taking the jgroups-tcp.xml contained in the Infinispan jar as a starting point, then tuning it as needed (for example with your custom TCPPING).
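
A rough sketch of that approach: copy jgroups-tcp.xml out of the Infinispan jar, leave the rest of the stack untouched, and swap whichever discovery element it ships with for the TCPPING element from the configuration posted earlier. The host list and ports below are simply the values from that post and must match the bind_port used in the copied file.

Code:
<!-- In your copy of jgroups-tcp.xml: replaces the shipped discovery element.
     Values taken from the configuration posted above; adjust to your hosts. -->
<TCPPING timeout="3000"
         initial_hosts="127.0.0.1[7804],127.0.0.1[7805]"
         port_range="1"
         num_initial_members="2"/>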

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.