 Post subject: File Descriptor Leak
PostPosted: Thu Jul 21, 2011 7:53 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Guys,
I'm currently experiencing a file descriptor leak in our app. We are using HS 3.3 and Lucene 3.0.3, and it appears that after batch indexing via the MassIndexer our app doesn't release the file descriptors of deleted index files.

Are any other users seeing this?

I will post more when I know more, but I just want to make sure it isn't a common issue (I have looked but don't see anyone else having this problem).


LL

Forgot to mention that we currently have a running version with Lucene 2.9.3 and HS 2.3.1 and don't experience this problem in that app; it is only since we upgraded that we see it.


 
 Post subject: Re: File Descriptor Leak
PostPosted: Thu Jul 21, 2011 10:32 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
we currently have a running version with lucene 2.9.3 and HS 2.3.1

that should be HS 3.3.1 right?

I'm also not aware of problems in this area. Sometimes someone reports issues like this, but then it usually turns out they forgot to close something in a proper finally block.
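For illustration, a minimal sketch of the kind of cleanup that usually goes missing (the session variable here is just an example, not code from your application):

Code:
// Minimal sketch: always release the FullTextSession in a finally block, even
// when the work in between throws; otherwise readers and their file handles
// can stay referenced much longer than expected.
FullTextSession fullTextSession = Search.getFullTextSession( session );
try {
   fullTextSession.getTransaction().begin();
   // ... queries or indexing work here ...
   fullTextSession.getTransaction().commit();
} finally {
   fullTextSession.close();
}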
Please let me know if you find out what's happening.

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: File Descriptor Leak
PostPosted: Thu Jul 21, 2011 12:51 pm 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Quote:
that should be HS 3.3.1 right?


Correct, that was a typo.


Quote:
Sometimes someone reports issues like this, but then it usually turns out they forgot to close something in a proper finally block.

Yep, I know; that's why I just wanted to make sure.


Right, I have made some headway on this and need the expert view. We are seeing this on our Linux server, and I'm seeing something odd on Windows (my local setup).

I'll go with my local setup because it's easier for me to play with.

OK, we index many entities, but for the test case I have focused on one.

I start with a clean index for my entity; after the server starts I check for any file handles to the index:

1. No handles.
2. Kick off a batch index for this entity: handles are open while it runs, but when it finishes no handles remain. The index now exists.
3. Kick off the batch index again: same result as step 2.
4. At the end of each run there are no handles open.


Now here is the strange part, and I need your input on it:
1. Start the server with the existing index in place.
2. Check for file handles: there are file handles open (nothing triggered or running).
3. Kick off a batch index for this entity: more file handles open, but when it finishes only what was open in step 2 is present.
4. My entity's index is now double in size!


We have a setup similar to master/slave, except instead of JMS we have a custom server which periodically checks the DB for updates. We have a custom MasterDirectory, but I switched it out for the above test case and used the HS version, and I still see the same behaviour.


The getDirectory call on the MasterDirectoryProvider is the start,

and eventually it goes through SharingBufferReaderProvider and IndexReader.open:

Code:
protected IndexReader readerFactory(final Directory directory) throws IOException {
      return IndexReader.open( directory, true );
}


This makes my file handles look like so:

Quote:
50: File C:\lucene\master\myentity\_g.nrm
54: File C:\lucene\master\myentity\_g.fdx
58: File C:\lucene\master\myentity\_g.fdt
5c: File C:\lucene\master\myentity\_g.prx
60: File C:\lucene\master\myentity\_g.frq
64: File C:\lucene\master\myentity\_g.tis


My entity's index directory, however, looks like so:

Quote:
$ ls -l
total 30059
-r-xr-x---+ 1 mkgroup-l-d 2968775 Jul 21 18:28 _a.fdt
-r-xr-x---+ 1 mkgroup-l-d 548692 Jul 21 18:28 _a.fdx
-r-xr-x---+ 1 mkgroup-l-d 3346799 Jul 21 18:28 _a.frq
-r-xr-x---+ 1 mkgroup-l-d 1028794 Jul 21 18:28 _a.nrm
-r-xr-x---+ 1 mkgroup-l-d 1827444 Jul 21 18:28 _a.prx
-r-xr-x---+ 1 mkgroup-l-d 5618402 Jul 21 18:28 _a.tis
-r-xr-x---+ 1 mkgroup-l-d 2968775 Jul 21 18:29 _g.fdt
-r-xr-x---+ 1 mkgroup-l-d 548692 Jul 21 18:29 _g.fdx
-r-xr-x---+ 1 mkgroup-l-d 230 Jul 21 18:29 _g.fnm
-r-xr-x---+ 1 mkgroup-l-d 3346450 Jul 21 18:29 _g.frq
-r-xr-x---+ 1 mkgroup-l-d 1028794 Jul 21 18:29 _g.nrm
-r-xr-x---+ 1 mkgroup-l-d 1827444 Jul 21 18:29 _g.prx
-r-xr-x---+ 1 mkgroup-l-d 72787 Jul 21 18:29 _g.tii
-r-xr-x---+ 1 mkgroup-l-d 5618399 Jul 21 18:29 _g.tis
-r-xr-x---+ 1 mkgroup-l-d 20 Jul 21 18:29 segments.gen
-r-xr-x---+ 1 mkgroup-l-d 275 Jul 21 18:29 segments_7


So I have these handles open even though I'm doing nothing with the index. Is this expected behaviour? I will try the test case above on our Linux server tomorrow just to confirm it, but this is what I'm seeing on Windows.

It should be an easy enough test case for you to try locally.
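For reference, the plain Lucene 3.0.x pattern for refreshing a reader without leaking handles looks roughly like this (just a sketch of what I would expect to happen, not the actual Hibernate Search internals):

Code:
// Sketch: reopen() returns a new reader when the index has changed; the old
// reader must then be closed explicitly, otherwise its handles to the old
// segment files stay open.
IndexReader oldReader = IndexReader.open( directory, true );
// ... batch indexing rewrites the segments ...
IndexReader newReader = oldReader.reopen();
if ( newReader != oldReader ) {
   oldReader.close(); // releases the handles to the superseded segments
}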

LL


 
 Post subject: Re: File Descriptor Leak
PostPosted: Thu Jul 21, 2011 1:03 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
The IndexReaders are pooled to provide better search performance, hence the index is opened at boot time and there is always at least one IndexReader open for each index (and that can use several file handles, since an index is composed of several segments).

Now, when batch-indexing, that IndexReader might not notice the new arrivals, and old segments are only removed when the indexer optimises the index at the end (optional, but enabled by default); at that point the already-open IndexReader might not know yet that it should free the previous file handles. This will be taken care of at the first Query.

So what you described looks fine... it might be worth repeating the test by sequentially re-running the batch job, doing some queries, and running this in a loop, to see if there is a steady rise in file handles.
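For instance, a sketch of that loop (the openSession() helper and the MyEntity class are only placeholders for however your application obtains a Session and for your indexed type):

Code:
// Re-run the MassIndexer, then run a trivial query in a fresh transaction so
// the pooled IndexReader is refreshed and can drop handles to old segments;
// repeat and watch the file handle count.
public void reindexAndQueryLoop() throws InterruptedException {
   for ( int i = 0; i < 10; i++ ) {
      FullTextSession fts = Search.getFullTextSession( openSession() );
      try {
         fts.createIndexer( MyEntity.class ).startAndWait();

         fts.getTransaction().begin();
         fts.createFullTextQuery( new MatchAllDocsQuery(), MyEntity.class )
            .setMaxResults( 1 )
            .list();
         fts.getTransaction().commit();
      } finally {
         fts.close();
      }
      // check open file handles at this point (lsof on Linux, Process Explorer on Windows)
   }
}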

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: File Descriptor Leak
PostPosted: Thu Jul 21, 2011 1:04 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
this will be taken care of at the first Query.

actually, please read that as "will be taken care of at the next transaction", assuming you always close previous transactions.

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: File Descriptor Leak
PostPosted: Fri Jul 22, 2011 4:23 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Quote:
Now, when batch-indexing, that IndexReader might not notice the new arrivals, and old segments are only removed when the indexer optimises the index at the end (optional, but enabled by default); at that point the already-open IndexReader might not know yet that it should free the previous file handles. This will be taken care of at the first Query.


I don't have any user queries. The app's sole purpose is to generate the master index, so it just has two jobs: one checks periodically for updates and the other does the batch indexing. When the master index is ready it is copied over to the shared directory, and it eventually gets to the slaves, where the real user queries happen.
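For context, the master node is configured roughly like this (the paths and refresh period shown here are only illustrative):

Code:
// Master-side Hibernate Search settings (sketch; paths and refresh period are illustrative)
props.put("hibernate.search.default.directory_provider", "org.hibernate.search.store.FSMasterDirectoryProvider");
props.put("hibernate.search.default.indexBase", "/var/lucene/master");
props.put("hibernate.search.default.sourceBase", "/var/lucene/shared");
props.put("hibernate.search.default.refresh", "300");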

Quote:
it might be worth repeating the test by sequentially re-running the batch job, doing some queries, and running this in a loop, to see if there is a steady rise in file handles.

I ran the batch job several times; no leakage of file handles.

This is how it looks to me:
1. When the index doesn't exist at startup, no handles are open for it.
2. Carry out the batch job as many times as you like: all handles are closed at the end of each job. So no file handles stay open and no signs of leakage.

If the index exists:
1. File handles are open at startup.
2. The batch job opens its own new reader (well, this is how it looks), ignoring the already-open reader, and writes its own index as if there were no existing index, thus ending up with double the index size!


So it looks to me like the batch indexer doesn't reuse the already-open reader from the MasterDirectoryProvider: it writes using its own reader and closes it, but the reader opened at startup hasn't been cleaned up. Any thoughts on this? Where does the cleanup of the references to the already-open reader happen?

Funnily enough, though, it just stays at double the original size. If I shut down and restart with the doubled index and run another batch run, it stays the same size, so it doesn't double again. Very strange!


 
 Post subject: Re: File Descriptor Leak
PostPosted: Tue Aug 09, 2011 12:46 pm 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Guys,
Sorry for the delay, but I'm only getting back to this now. I've written a unit test; hopefully you can run it and tell me what you see.


Code:
import java.util.Properties;
import java.util.UUID;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

import org.hibernate.Session;
import org.hibernate.search.FullTextSession;
import org.hibernate.search.Search;
import org.junit.Before;
import org.junit.Test;


public class FileLeakTest {

   private EntityManagerFactory emf = null;
   
   @Before
   public void setup(){
      emf = Persistence.createEntityManagerFactory("cors",getProperties());
   }
   
   private static Properties getProperties(){
      Properties props = new Properties();
      props.put("hibernate.connection.driver_class","org.h2.Driver");
      props.put("hibernate.dialect","org.hibernate.dialect.H2Dialect");
      props.put("hibernate.cache.provider_class","org.hibernate.cache.NoCacheProvider");
      props.put("hibernate.jdbc.charSet","UTF-8");
      props.put("hibernate.hbm2ddl.auto","create-drop");
      
      props.put("hibernate.connection.url","jdbc:h2:mem:test");
      props.put("hibernate.connection.username","sa");
      props.put("hibernate.connection.password","");

      props.put("hibernate.search.default.sourceBase","C:/lucene/fileleaktest/shared");
      props.put("hibernate.search.default.indexBase","C:/lucene/fileleaktest/master");
      props.put("hibernate.search.default.refresh","300");
      props.put("hibernate.search.default.directory_provider","org.hibernate.search.store.FSMasterDirectoryProvider");
      
      return props;
   }
   
   
   public FullTextSession getFulltextSession(){
      EntityManager em = emf.createEntityManager();
      Session session = (Session) em.getDelegate();
      FullTextSession fullTextSession = Search.getFullTextSession(session);
      return fullTextSession;
   }
   
   
   public void createDocuments(int size){
      FullTextSession session = getFulltextSession();
      try{
         session.getTransaction().begin();
         for(int i = 0; i < size; i++){
            HibDocument d = new HibDocument(UUID.randomUUID().toString(),UUID.randomUUID().toString(),UUID.randomUUID().toString());
            session.persist(d);
         }
         session.getTransaction().commit();

      }finally{
         session.close();
      }
   }
   
   
   public void batchIndex(){
      FullTextSession session = getFulltextSession();
      try {
         // tried with a transaction and without - same effect
         session.getTransaction().begin();
         session.createIndexer().batchSizeToLoadObjects(30)
         .threadsForSubsequentFetching(4)
         .threadsToLoadObjects(2)
         .startAndWait();
         session.getTransaction().commit();
         
      } catch (InterruptedException e) {
         e.printStackTrace();
      }finally{
         session.close();
      }
   }
   
   
   @Test
   public void testForLeaks(){
      // on the second run we re-create the documents again; not ideal (a file-based DB could be used instead) but I don't see this as an issue
      createDocuments(50000);
      while(true){
         try {
            batchIndex();
            Thread.sleep(10000);
         } catch (InterruptedException e) {
            e.printStackTrace();
         }
      }
      
   }
   
   
}



HibDocument is actually your own test "Document" class; I just had to rename it for packaging purposes.

Code:
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.search.annotations.Boost;
import org.hibernate.search.annotations.DocumentId;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Index;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Store;

@Entity
@Indexed(index = "Documents")
public class HibDocument {
   private Long id;
   private String title;
   private String summary;
   private String text;

   public HibDocument() {
   }

   public HibDocument(String title, String summary, String text) {
      super();
      this.summary = summary;
      this.text = text;
      this.title = title;
   }

   @Id
   @GeneratedValue
   @DocumentId
   public Long getId() {
      return id;
   }

   public void setId(Long id) {
      this.id = id;
   }

   @Field(store = Store.YES, index = Index.TOKENIZED)
   @Boost(2)
   public String getTitle() {
      return title;
   }

   public void setTitle(String title) {
      this.title = title;
   }

   @Field(name = "Abstract", store = Store.NO, index = Index.TOKENIZED)
   public String getSummary() {
      return summary;
   }

   public void setSummary(String summary) {
      this.summary = summary;
   }

   @Field(store = Store.NO, index = Index.TOKENIZED)
   public String getText() {
      return text;
   }

   public void setText(String text) {
      this.text = text;
   }
}


Hopefully the above is easy for you to run.

When I run it with no existing index (so a clean directory), the index size at the end, when terminated, is roughly 12 MB, and after each batch run all file handles are cleaned up. So no file handles remain open.

Kill the test case and re-run it with the existing "Documents" index present. If you set a breakpoint before the batch index, you will notice file handles open to the existing index; this is the master directory holding them open. Do a batch index: during the batch index you will see more file handles open, but they are closed when the batch index finishes. You should notice that the file handles open at the very start are still open, even though we have purged the index and run a full batch index. The result is that I have now doubled my index size from 12 MB to 24 MB. The file handles that the master directory opens at startup are never freed, no matter how many batch runs take place.


Maybe I'm doing something wrong, but if so it should be clear from the test above. One potential flaw is that I'm using the in-memory DB and in the second run I re-create the same number of documents again, but I don't see this as an issue. Like I said before, I see this on both Windows and Linux.
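If it helps, the handle check can also be automated inside the test on Linux; a small sketch (this relies on the Sun/Oracle-specific UnixOperatingSystemMXBean, so it is not portable; on Windows I just watched the handles in Process Explorer):

Code:
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Returns the JVM's open file descriptor count, or -1 when not available.
public static long openFileDescriptors() {
   OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
   if (os instanceof UnixOperatingSystemMXBean) {
      return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
   }
   return -1;
}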

Thanks for the help,
LL


 
 Post subject: Re: File Descriptor Leak
PostPosted: Tue Aug 09, 2011 3:28 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi lovelyliatroim,
that looks very useful, thanks!
Could you please open an issue in JIRA and attach your test to it? That way we can schedule it and I won't forget it (too many forum posts).

Also make sure to link the issue to this forum thread.

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: File Descriptor Leak
PostPosted: Wed Aug 10, 2011 3:58 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Done

https://hibernate.onjira.com/browse/HSEARCH-842


 
 Post subject: Re: File Descriptor Leak
PostPosted: Mon Oct 17, 2011 7:00 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Guys,

Any sign of a solution for this? In the meantime, what approach would you suggest as a workaround? (https://hibernate.onjira.com/browse/HSEARCH-842)

Revert to the old way of indexing, from before the MassIndexer (see the sketch below)? We need a solution for this issue.
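By "the old way" I mean something along these lines; a sketch using the pre-MassIndexer APIs, where openSession(), the HibDocument entity and the batch size of 100 are just illustrative:

Code:
// Rebuild the index manually: purge everything, then scroll through the
// entities, index them in batches, and flush/clear periodically.
FullTextSession fts = Search.getFullTextSession( openSession() );
try {
   fts.getTransaction().begin();
   fts.purgeAll( HibDocument.class );

   ScrollableResults results = fts.createCriteria( HibDocument.class )
         .scroll( ScrollMode.FORWARD_ONLY );
   int batch = 0;
   while ( results.next() ) {
      fts.index( results.get( 0 ) );   // add the entity to the index
      if ( ++batch % 100 == 0 ) {
         fts.flushToIndexes();         // apply pending index changes
         fts.clear();                  // free session memory
      }
   }
   fts.getTransaction().commit();      // remaining work is applied on commit
} finally {
   fts.close();
}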

Cheers,
JB


 
 Post subject: Re: File Descriptor Leak
PostPosted: Mon Oct 17, 2011 7:10 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
did you try version 4.0.0.CR1?

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: File Descriptor Leak
PostPosted: Mon Oct 17, 2011 7:44 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Unfortunately, moving up to a newer version is not an option for me; our production release is fast approaching.

I need to come up with a workaround on the version we are using.


 
 Post subject: Re: File Descriptor Leak
PostPosted: Mon Oct 17, 2011 8:43 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi, please help me understand how critical this issue is. I'm not too worried about it temporarily using twice the index size, as that is often needed for other maintenance processes too: you should always have enough free space on disk to store two or three times the optimized index, since disk space needs to be flexible when adding new segments.

I've been analysing the behaviour of file deletion and file handles for other issues too: it seems that memory-mapped files (the kind Lucene uses to get good performance) cannot be deleted aggressively. A deletion will happen eventually, but it depends on other timings, such as the garbage collector needing to kick in to run finalizers; look for example into

http://bugs.sun.com/bugdatabase/view_bu ... id=4715154
(but there are many other reports about deleting files in Java)

Of course, if this is not the same case and the leaks are permanent, that would be critical.

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: File Descriptor Leak
PostPosted: Mon Oct 17, 2011 11:10 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hi Sanne,

Quote:
Hi, please help me understand how critical this issue is:


If you take the original test case posted in this thread, you should see the side effects this has. It should be straightforward for you to set up and follow.


LL


 
 Post subject: Re: File Descriptor Leak
PostPosted: Mon Aug 06, 2012 8:56 am 
Regular

Joined: Thu Oct 08, 2009 10:34 am
Posts: 55
Hey guys,

I upgraded to 4.1 and I still see this.

Any movement on it?

https://hibernate.onjira.com/browse/HSEARCH-842




Cheers,
LL

