
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 19 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: HS : OutOfMemoryException
PostPosted: Wed Jun 30, 2010 5:38 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Guys,

I currently have an OutOfMemoryException occurring during searches. I have failed to reproduce it locally, but the problem arises after 1-2 weeks in our test environment.

The stack trace when the OOM occurs looks like this:

Quote:
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:688)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208)
at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:676)
at org.apache.lucene.search.FieldComparator$StringOrdValComparator.setNextReader(FieldComparator.java:667)
at org.apache.lucene.search.TopFieldCollector$MultiComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:435)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:253)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:236)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:179)
at org.apache.lucene.search.Searcher.search(Searcher.java:90)
at org.hibernate.search.query.QueryHits.updateTopDocs(QueryHits.java:103)
at org.hibernate.search.query.QueryHits.<init>(QueryHits.java:61)
at org.hibernate.search.query.FullTextQueryImpl.getQueryHits(FullTextQueryImpl.java:376)
at org.hibernate.search.query.FullTextQueryImpl.list(FullTextQueryImpl.java:293)
at org.hibernate.search.jpa.impl.FullTextQueryImpl.getResultList(FullTextQueryImpl.java:92)


My indexes total roughly 1.9 GB, so small in terms of Lucene's capabilities. My VM has 1 GB allocated to it and is set up to use the FSSlaveDirectoryProvider. I took a closer look at the heap using MAT and I can see what is hogging the memory: org.apache.lucene.search.FieldCacheImpl accounts for roughly 850 MB of the total VM.

I'm just looking for tips on a way forward with this one. Any suggestions as to a cause or solution?

Thanks guys,
LL


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Wed Jun 30, 2010 12:24 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi, the FieldCacheImpl is known to take a lot of memory; depending on your index structure and what you indexed, it might grow to a significant percentage of the whole index size.
Are you sure this is a memory leak and not a constant requirement? You might want to check the behaviour using a larger heap.

If you really find a memory leak in FieldCacheImpl there's not much we can do; you should write to the Lucene mailing list.
BTW, which Lucene version are we talking about?

_________________
Sanne
http://in.relation.to/


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Thu Jul 01, 2010 8:41 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Quote:
depending on your index structure and what you indexed, it might grow to a significant percentage of the whole index size.

The data isn't exactly what I would call big; there are a lot of records, but the field data is generally small, mainly strings 15-30 characters long on average.

Can you tell me more about how the FieldCacheImpl works? What gets stored, and when does it get released? What happens, for example, when the slave directory switches over to the new index: does FieldCacheImpl still store all the values from the old index, or does it release them? My hunch is that maybe, after switching X times, the FieldCacheImpl still holds the old index values in memory and this builds up over time. It is just a hunch/theory at the moment and I need to prove it.

Quote:
BTW, which Lucene version are we talking about?

I'm using Lucene 2.9.2 with HS 3.1.1. I know there are newer versions and I plan to upgrade, but I would like to solve this problem first, or at least find out the reason for it.


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Thu Jul 01, 2010 10:56 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Can you tell me more about how the FieldCacheImpl works? What gets stored, and when does it get released?

It's used mainly for sorting, so if you sort it gets used more. Its size also depends on the number of unique terms you're sorting on, which in turn depends on the analyzers: how you split sentences and how many unique terms your project generates (it's always good to tune the analyzers).

Quote:
What happens, for example, when the slave directory switches over to the new index: does FieldCacheImpl still store all the values from the old index, or does it release them?

Each FieldCache is local to the currently opened IndexReader, so as long as you don't forget to close IndexReaders you shouldn't have trouble. Hibernate Search handles the IndexReader lifecycle in sane ways AFAIK, but you might be asking for readers through the low-level API and not closing them all?
Which ReaderProvider are you using? The default one should be your best choice.
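To illustrate the point about the cache being local to the reader, here is a plain-Java sketch (not the actual Lucene code, and `FakeReader` is a hypothetical stand-in for an IndexReader): FieldCacheImpl keeps its per-reader entries in a WeakHashMap keyed on the reader instance, so a cached entry can only be collected once nothing references the reader anymore, i.e. once the reader has been properly closed and released.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Sketch of the caching pattern FieldCacheImpl uses: entries live in a
// WeakHashMap keyed on the reader, so cached sort data stays reachable
// exactly as long as something still holds the reader.
public class ReaderKeyedCache {

    static class FakeReader { }

    private static final Map<FakeReader, int[]> CACHE =
            new WeakHashMap<FakeReader, int[]>();

    // Returns the cached sort data for this reader, computing it on first use.
    static int[] getSortData(FakeReader reader) {
        int[] data = CACHE.get(reader);
        if (data == null) {
            data = new int[1024]; // pretend this is the expensive StringIndex
            CACHE.put(reader, data);
        }
        return data;
    }

    static int cachedEntryCount() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        FakeReader reader = new FakeReader();
        getSortData(reader);
        // While 'reader' is strongly referenced (i.e. the IndexReader is never
        // released), its cache entry cannot be collected:
        System.out.println("entries while reader held: " + cachedEntryCount());
    }
}
```

If readers leak, each one pins its own copy of the sort caches, which would match the "duplicate entries" symptom.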

_________________
Sanne
http://in.relation.to/


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Mon Jul 05, 2010 11:51 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Sanne,
Quote:
It's used mainly for sorting, so if you sort it gets used more.

Yes, we do sort. If we run a sort query, will the FieldCache load all the unique terms in one go, or does it do so over a period of time? So if it is the same query with different input, it shouldn't make the FieldCache grow after the first query, because all the unique terms should already be loaded by the first query, right?

Quote:
Its size also depends on the number of unique terms you're sorting on, which in turn depends on the analyzers: how you split sentences and how many unique terms your project generates (it's always good to tune the analyzers).

Can you expand on the tuning side here? Any examples or links? I couldn't find anything on the web about it.

Quote:
Which ReaderProvider are you using? The default one should be your best choice.

Using default.

Thanks Sanne,
LL


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Tue Jul 06, 2010 8:52 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
So if it is the same query with different input, it shouldn't make the FieldCache grow after the first query, because all the unique terms should already be loaded by the first query, right?

Partially right: that assumes the new query didn't hit any new terms, i.e. that they were all loaded by the first query. It might hit new terms, which makes the cache grow larger as it initializes them.

Quote:
Can you expand on the tuning side here? Any examples or links? I couldn't find anything on the web about it.

Sorry, my fault: "tuning" is not very appropriate in this case. I meant to carefully choose and configure your analyzers: depending on how you analyze the text, you might produce a larger set of unique terms.
For example, if you don't lowercase your text, you'll have all the different case combinations found in your documents for each word, while if you lowercase, the index will contain only the lowercase representation.
If you lowercase, remove symbols, and apply tri-grams (chunking words into groups of three characters), you'll have more terms per document but likely a reduced number of unique terms, and so consume less memory for sorting purposes.

Correction: I nearly forgot, you won't be able to sort on fields using tri-grams, as the field must contain a single term for sorting to make sense. But the example is still fine in the sense that you could apply some analysis to reduce the set of unique terms.
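A toy illustration of the point above (not a real Lucene Analyzer, just plain Java): how lowercasing shrinks the set of unique terms, and what tri-gram chunking looks like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy demonstration of how analysis choices change the number of unique
// terms, which is what drives FieldCache memory when sorting.
public class TermCounter {

    // Counts distinct terms, optionally lowercasing first.
    static int uniqueTerms(List<String> tokens, boolean lowercase) {
        Set<String> terms = new HashSet<String>();
        for (String t : tokens) {
            terms.add(lowercase ? t.toLowerCase(Locale.ROOT) : t);
        }
        return terms.size();
    }

    // Splits a word into overlapping tri-grams: "search" -> sea, ear, arc, rch.
    static List<String> trigrams(String word) {
        List<String> out = new ArrayList<String>();
        for (int i = 0; i + 3 <= word.length(); i++) {
            out.add(word.substring(i, i + 3));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Search", "search", "SEARCH", "index");
        System.out.println("without lowercasing: " + uniqueTerms(tokens, false)); // 4
        System.out.println("with lowercasing:    " + uniqueTerms(tokens, true));  // 2
        System.out.println("trigrams of 'search': " + trigrams("search"));
    }
}
```

With lowercasing, "Search", "search" and "SEARCH" collapse into a single term, so the sort cache has fewer unique strings to hold.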

This is really a subject for the Lucene mailing list; there you'll find better insight on these aspects. All I can say is that it's possible you're not hitting a memory leak but simply have a higher memory requirement in your use of Lucene. If you can't try a larger heap, you should try changing the way text is processed, or limit sorting to fields you can predict to be of limited cardinality.
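As a back-of-the-envelope check on whether this is a leak or just a high constant requirement, the 2.9-era FieldCache StringIndex holds roughly an int per document plus one String per unique term of the sorted field. The per-String overhead and bytes-per-char constants below are rough JVM assumptions on my part, not measured values:

```java
// Rough estimate of what one sorted-on field can cost in the FieldCache.
// 40 bytes of String object overhead and 2 bytes per char are assumptions.
public class FieldCacheEstimate {

    static long estimateBytes(long maxDoc, long uniqueTerms, long avgTermChars) {
        long ordsArray = 4L * maxDoc;                               // int[] order
        long termStrings = uniqueTerms * (40L + 2L * avgTermChars); // String[] lookup
        return ordsArray + termStrings;
    }

    public static void main(String[] args) {
        // Hypothetical figures: 10M docs, 2M unique 20-char terms on one field.
        long bytes = estimateBytes(10_000_000L, 2_000_000L, 20L);
        System.out.println(bytes / (1024 * 1024) + " MB");
    }
}
```

Multiply by the number of fields you sort on, and a cache in the hundreds of MB can be a legitimate steady-state cost rather than a leak.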

_________________
Sanne
http://in.relation.to/


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Thu Jul 08, 2010 10:39 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Sanne,
I have set up a load test locally and query my index with the same query but with random input values; after each GC my VM size stays at roughly the 40 MB mark. No updates to the index were done. I then introduced updates to the index and continued with the same query, and I now notice that after a while my VM size after GC is 80 MB.

I had a look at the before and after pictures over a period of time (12 hours); the difference is the size of search.FieldCacheImpl, looking at the different hashmap buckets and their entries. It looks like there are duplicate entries compared to how it looked when no switching of the index goes on.

A question for you: when the swap occurs from "current1" to "current2", how does HS know the swap has occurred and close the old readers? From what I can see, all HS has to work with is the DirectoryProvider interface, which has start(), stop() and getDirectory(). I know you said it works on a lifecycle basis, so how or when does the underlying IndexReader for the old directory get closed?

I adapted the FSSlaveDirectoryProvider slightly just to see what happens when we swap over to the new index. It looks something like this:

Code:
@Override
public FSDirectory getDirectory() {
   int readState = current; // read once so the next two checks are consistent
   log.debug("CurrentDirectory = " + directoryIndex);
   if (readState == directoryIndex) {
      return directory;
   }

   synchronized (this) {
      if (directory != null) {
         try {
            directory.close();
         } catch (Exception e) {
            log.error("Unable to properly close Lucene directory "
                  + directory.getFile(), e);
         }
      }
      directoryIndex = readState;
      try {
         directory = FSDirectory.getDirectory(new File(indexDir,
               String.valueOf(readState)));
      } catch (IOException e) {
         log.error("Cannot create FSDirectory " + indexDir + ", "
               + readState, e);
         return null;
      }
      return directory;
   }
}
I've cleaned out the unwanted parts, but you should get the idea: when the marker changes we close the current directory and open the new one. I just wanted to see what would happen under the covers. What I would have expected is the heap to grow with more frequent GC cycles, but what actually happens is it grows and grows and eventually blows up, because nothing gets released even though we closed the old directory.

How would you have expected the above to run?

Cheers,
LL


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Fri Jul 09, 2010 5:07 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Sanne,
Could you point me in the right direction a bit, please: on the switch-over, if I proactively cleaned up rather than waiting for the "lifecycle" to finish up, what would I need to do? I see that FieldCacheImpl has a purgeAllCaches() which would be good to call on the switch-over, but I don't know what effects, if any, it might have.

Do you know how I could get hold of it through the HS API?


Thanks,
LL


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Fri Jul 09, 2010 10:48 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
I'm getting an idea: could you please verify whether you still have the problem using the "not-shared" ReaderProvider?
You'll experience lower performance, but it might fix the problem. I'm not sure of it, sorry, I'm guessing; I'll find some time next week to write a proper test for it.

My idea is that the new internal caches of Lucene 2.9 don't work properly with the way we reuse buffers in the SharingBufferReaderProvider when using the master/slave directories. If you could confirm this, that would be great, and I think we could propose a solution.

Quote:
Do you know how i could get hold of it through the HS API??

The "native" chapter in the reference docs has hints about getting access to all the Lucene APIs; we don't hide them in case of advanced need, but unfortunately I'm not sure Lucene's API exposes this. Also look into ImmutableSearchFactory: you should be able to access more non-public-API services (but no backwards compatibility is attempted).

_________________
Sanne
http://in.relation.to/


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Mon Jul 12, 2010 5:18 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Quote:
I'm getting an idea: could you please verify whether you still have the problem using the "not-shared" ReaderProvider?

I will set this reader and let you know what happens.

Quote:
I'm not sure of it, sorry, I'm guessing; I'll find some time next week to write a proper test for it.

A quick and dirty way would be to adapt the SlaveDirectoryProvider to check whether the marker file has changed, and if it has, close the current directory and open the new one. Repeat the cycle and watch what happens. Something similar to what's posted above would get you started fairly quickly.


Quote:
My idea is that the new internal caches of Lucene 2.9 don't work properly with the way we reuse buffers in the SharingBufferReaderProvider when using the master/slave directories. If you could confirm this, that would be great, and I think we could propose a solution.

I am using the master/slave paradigm, but without JMS: I have my own custom updater that updates the master index. The rest follows the master/slave paradigm.


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Mon Jul 12, 2010 8:38 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
I am using the master/slave paradigm, but without JMS: I have my own custom updater that updates the master index

Out of curiosity, did you try out the JGroups-based backend?
Also, if you plug in an Infinispan Directory you won't need the directory switch anymore, and you'll greatly improve performance and ease of setup; that's the goal we have for the next version.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Tue Jul 13, 2010 6:12 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Sanne,
I tried the not-shared reader and that is actually worse: I made 5 queries and got an OOM exception, so it kind of fast-forwarded the problem. Please note I am using Lucene 2.9.2 with HS 3.1.1. I am going to step up to the latest versions of HS and see if the problem persists.

A quick look at my heap shows what is hogging the memory:

Class Name | Shallow Heap | Retained Heap | Percentage
----------------------------------------------------------------------------------------------------------------------------------
org.apache.catalina.loader.WebappClassLoader @ 0x51110340 | 168 | 867.624.864 | 93,49%
|- class org.apache.lucene.search.FieldCache @ 0x8f30f8b0 | 48 | 861.936.056 | 92,87%
| |- org.apache.lucene.search.FieldCacheImpl @ 0x54d3e3f8 | 16 | 861.935.928 | 92,87%
| | '- java.util.HashMap @ 0x54d3f038 | 40 | 861.935.912 | 92,87%
| | '- java.util.HashMap$Entry[16] @ 0x54d40b00 | 80 | 861.935.872 | 92,87%
| | |- java.util.HashMap$Entry @ 0x54d37f50 | 24 | 861.933.992 | 92,87%
| | | '- org.apache.lucene.search.FieldCacheImpl$StringIndexCache @ 0x54d3e3a8| 16 | 861.933.968 | 92,87%
| | | '- java.util.WeakHashMap @ 0x54d3efb8 | 48 | 861.933.952 | 92,87%
| | | |- java.util.WeakHashMap$Entry[16] @ 0x54d40a80 | 80 | 718.278.240 | 77,39%
| | | |- java.util.WeakHashMap$Entry @ 0x54d37e78 | 40 | 143.655.632 | 15,48%
----------------------------------------------------------------------------------------------------------------------------------

Hopefully you can replicate it on your side.

Quote:
Out of curiosity, did you try out the JGroups-based backend?

No I haven't; time is against me, I'm afraid, so I won't be able to give it a go.


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Tue Jul 13, 2010 7:40 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Just another thing: I'm running it inside Tomcat!

I just saw http://opensource.atlassian.com/project ... SEARCH-314; although I'm not redeploying or anything like that, it could be related.


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Tue Jul 13, 2010 9:19 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
I made 5 queries and got an OOM exception.

Are you sure you close the IndexReader? And yes, please try HS 3.2.

The fact that you're hitting OOM sooner with this implementation makes me think that you just don't have enough memory, not that there's a leak, which brings back my first thought: you might simply need a lot of memory to do sorting.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: HS : OutOfMemoryException
PostPosted: Tue Jul 13, 2010 9:21 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
I just saw http://opensource.atlassian.com/project ... SEARCH-314; although I'm not redeploying or anything like that, it could be related.

No, that problem is really limited to redeployment on the same Tomcat.

_________________
Sanne
http://in.relation.to/



© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.