
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 9 posts ] 
 Post subject: Update Index using Lucene
PostPosted: Thu Sep 17, 2009 11:00 am 
Pro

Joined: Wed Oct 03, 2007 2:31 pm
Posts: 205
Hi

I have a requirement to update the index created by HSearch and add some additional fields that are not mapped and cannot be mapped. So I was thinking of doing the following, and I was wondering whether it is OK:

Code:
public void addAdditionalFieldsToIndex(Map<String, String> valuesToIndex, Long id) throws Exception {
      if (valuesToIndex == null || valuesToIndex.isEmpty()) {
         return;
      }
      FullTextSession fullTextSession = Search.getFullTextSession(sessionFactory.getCurrentSession());
      SearchFactory searchFactory = fullTextSession.getSearchFactory();
      // IndexedClass stands for the indexed entity; use the same class for providers and analyzer.
      @SuppressWarnings("unchecked")
      DirectoryProvider[] providers = searchFactory.getDirectoryProviders(IndexedClass.class);
      ReaderProvider readerProvider = searchFactory.getReaderProvider();

      IndexReader indexReader = readerProvider.openReader(providers);
      IndexWriter indexWriter = null;
      Term idTerm = new Term("id", String.valueOf(id));
      try {
         // Find the existing document(s) for this id.
         TermDocs termDocs = indexReader.termDocs(idTerm);
         Analyzer analyzer = searchFactory.getAnalyzer(IndexedClass.class);
         indexWriter = new IndexWriter(indexReader.directory(), analyzer, MaxFieldLength.UNLIMITED);
         while (termDocs.next()) {
            // Only stored fields come back here; unstored fields are lost when the document is re-added.
            Document loadedDoc = indexReader.document(termDocs.doc());
            for (Map.Entry<String, String> entry : valuesToIndex.entrySet()) {
               loadedDoc.add(new Field(entry.getKey(), entry.getValue(), Field.Store.YES, Field.Index.ANALYZED));
            }
            // updateDocument deletes every document matching idTerm, then adds loadedDoc.
            indexWriter.updateDocument(idTerm, loadedDoc);
         }
      } finally {
         if (indexWriter != null) {
            try {
               indexWriter.close();
            } catch (IOException ioex) {
               throw new IllegalStateException(ioex);
            }
         }
         readerProvider.closeReader(indexReader);
         // Note: this also closes the wrapped current Session.
         fullTextSession.close();
      }
   }


Is the above OK to use, or are there any issues I need to be concerned about, given that I am changing the document? We will be running a master/slave configuration: do we have to create a LuceneWork item and place it on the queue?


 Post subject: Re: Update Index using Lucene
PostPosted: Thu Sep 17, 2009 5:09 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
The code itself looks OK, but you're losing a lot of flexibility. It is not going to work in a master/slave configuration, of course: a slave shouldn't write to its local index but has to send changes to the master in the form of LuceneWork(s).
Keep in mind that IndexWriter.updateDocument() does a delete followed by an insert, so we usually prefer to map updates that way ourselves, as some optimizations can then be applied. Generally speaking, Hibernate Search takes care of most optimizations, but with this approach you'll have to learn them and apply them yourself. Of course, LuceneWork only knows about add and delete, so you'll have to map your updates to adds and deletes anyway.
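
To make the "update = delete + add" point concrete, here is a minimal sketch in plain Java. It is only a toy model: IndexWork, DeleteWork, AddWork and the Map-based index are hypothetical stand-ins invented for illustration, not the Hibernate Search LuceneWork classes, and the index is just an in-memory map keyed by document id.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a unit of index work (hypothetical, not LuceneWork). */
interface IndexWork {
    void applyTo(Map<String, Map<String, String>> index);
}

/** Removes the document stored under the given id, if any. */
final class DeleteWork implements IndexWork {
    private final String id;

    DeleteWork(String id) {
        this.id = id;
    }

    public void applyTo(Map<String, Map<String, String>> index) {
        index.remove(id);
    }
}

/** Adds a document (a map of field name to value) under the given id. */
final class AddWork implements IndexWork {
    private final String id;
    private final Map<String, String> fields;

    AddWork(String id, Map<String, String> fields) {
        this.id = id;
        this.fields = new LinkedHashMap<>(fields);
    }

    public void applyTo(Map<String, Map<String, String>> index) {
        index.put(id, fields);
    }
}

final class UpdateAsDeleteAdd {
    /** Expresses an update as a delete followed by an add, in that order. */
    static List<IndexWork> update(String id, Map<String, String> newFields) {
        List<IndexWork> queue = new ArrayList<>();
        queue.add(new DeleteWork(id));
        queue.add(new AddWork(id, newFields));
        return queue;
    }

    /** Applies the queued work in order, as the single index-owning node would. */
    static void apply(List<IndexWork> queue, Map<String, Map<String, String>> index) {
        for (IndexWork work : queue) {
            work.applyTo(index);
        }
    }
}
```

In this model the sender only builds the queue; the node that owns the index is the only one that calls apply, which mirrors the rule that a slave sends work instead of writing locally.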

On the point of flexibility, why don't you use a ClassBridge? You can define all the fields yourself, as you did with the Map<String,String>, but get away from IO concerns, transaction concerns, and a lot of code.

Another way to map fields to the index without mapping them to the database is to combine the @Transient and @Field annotations on the same getter. The getter contains the logic that computes the value of the unmapped field, and you still get all the declarative features to define how it is indexed. This is IMHO much better, as it is also future-proof: in the next version you'll be able to use the MassIndexer to rebuild your data, or use the QueryBuilder API, whereas with your code Hibernate Search can't help you out with nice new features.
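
Both suggestions share one idea: the extra fields are computed from the entity every time it is indexed, instead of being patched into an existing document. The sketch below illustrates that idea in plain Java under loud assumptions: Risk, EntityBridge, ToyIndexer, RiskBridge and the lookup function are hypothetical names made up for this example, and none of them is the Hibernate Search API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical entity; only id and name are "persisted" in this toy model. */
final class Risk {
    final long id;
    final String name;

    Risk(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Stand-in for a class-level bridge: it sees the whole entity and may add any fields. */
interface EntityBridge<T> {
    void writeFields(T entity, Map<String, String> document);
}

final class ToyIndexer {
    /** Builds the document from scratch on every (re)index, so bridge fields survive updates. */
    static <T> Map<String, String> buildDocument(T entity, EntityBridge<T> bridge) {
        Map<String, String> document = new LinkedHashMap<>();
        bridge.writeFields(entity, document);
        return document;
    }
}

/** Adds the regular fields plus extra values fetched by id from some other source. */
final class RiskBridge implements EntityBridge<Risk> {
    private final Function<Long, Map<String, String>> extraValuesLookup;

    RiskBridge(Function<Long, Map<String, String>> extraValuesLookup) {
        this.extraValuesLookup = extraValuesLookup;
    }

    public void writeFields(Risk risk, Map<String, String> document) {
        document.put("id", String.valueOf(risk.id));
        document.put("name", risk.name);
        // The extra fields are looked up at indexing time, not stored on the entity.
        document.putAll(extraValuesLookup.apply(risk.id));
    }
}
```

Because the document is rebuilt from the entity each time, a later update of the entity produces the extra fields again, with no direct IndexWriter handling in application code.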

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Update Index using Lucene
PostPosted: Fri Sep 18, 2009 3:37 am 
Pro

Joined: Wed Oct 03, 2007 2:31 pm
Posts: 205
Thanks for your reply. I would like to do what you recommend, but unfortunately the model I am working with makes that almost impossible. Basically, we receive an XML message from our client, which is magically converted into an object that is persisted; however, the data I need is stored somewhere completely different, and it is impossible to link it to the object I am indexing. I have spent a couple of days trying to create a relationship, but with no luck.

So if I want to use the master/slave configuration, I need to do the following:

Code:
        FullTextSession fullTextSession = Search.getFullTextSession(sessionFactory.getCurrentSession());
        DirectoryProvider[] directoryProviders = fullTextSession.getSearchFactory().getDirectoryProviders(IndexedClass.class);
        ReaderProvider readerProvider = fullTextSession.getSearchFactory().getReaderProvider();
        IndexReader indexReader = readerProvider.openReader(directoryProviders[0]);
        final List<LuceneWork> queue = new ArrayList<LuceneWork>();
        try {
            Term t = new Term("id", String.valueOf(id));
            TermDocs termDocs = indexReader.termDocs(t);
            if (termDocs.next()) {
                if (IndexWriter.isLocked(directoryProviders[0].getDirectory())) {
                    IndexWriter.unlock(directoryProviders[0].getDirectory());
                }
                Document docLoaded = indexReader.document(termDocs.doc());
                for(Map.Entry<String, String> entry: values.entrySet()) {
                    docLoaded.add(new Field(entry.getKey(), entry.getValue(), Field.Store.YES, Field.Index.ANALYZED));
                }
                LuceneWork deleteWork = new DeleteLuceneWork(id, id.toString(), IndexedClass.class);
                LuceneWork addWork = new AddLuceneWork(id, id.toString(), IndexedClass.class, docLoaded);
                queue.add(deleteWork);
                queue.add(addWork);
         
                jmsTemplate.send(destination, new MessageCreator() {
                      public Message createMessage(Session session) throws JMSException {
                        ObjectMessage objectMessage = session.createObjectMessage();
                        objectMessage.setObject((Serializable)queue);
                        return objectMessage;
                    }
                });
            }
        } finally {
            readerProvider.closeReader(indexReader);
        }


The above would participate in an existing transaction. Is creating a delete and an add work item OK?

Any help is greatly appreciated!


 Post subject: Re: Update Index using Lucene
PostPosted: Fri Sep 18, 2009 4:14 am 
Pro

Joined: Wed Oct 03, 2007 2:31 pm
Posts: 205
Just realised the solution may not work. The code handles the case where the object is created. However, if the object is later updated, Hibernate Search re-indexes it and we lose the additional fields that we added to the Lucene Document. Back to the drawing board.


 Post subject: Re: Update Index using Lucene
PostPosted: Fri Sep 18, 2009 4:52 am 
Pro

Joined: Wed Oct 03, 2007 2:31 pm
Posts: 205
Hi

Is it possible to add some logic to the FullTextIndexEventListener to add the additional fields? Could it be extended to get the additional data via sql and then add to the document created? I couldn't see anywhere where it was done. Another alternative I was considering is adding the fields to the document when we receive the message in the AbstractJMSHibernateSearchController. This would mean hitting the database to get the data, adding it to the document, and then letting super.onMessage do the work.

Is there a way to test this without JMS?

Cheers
Amin


 Post subject: Re: Update Index using Lucene
PostPosted: Fri Sep 18, 2009 5:44 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
if (IndexWriter.isLocked(directoryProviders[0].getDirectory())) {
IndexWriter.unlock(directoryProviders[0].getDirectory());
}

Don't do this. Why are you unlocking? The index is in read-only mode anyway when opened by a ReaderProvider, and you don't want to apply changes to it directly, as that would also break the master/slave configuration.

Another problem: you can't just read the local version of the Document, add fields and send it back to the master. You might have an out-of-date Document, as the master/slave index copy is asynchronous. Don't trust the current state of the index, trust the database: that is what is really transactional, and that is what you regularly back up.
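
The stale-copy problem can be shown with a small plain-Java model. Everything here is an assumption made for illustration: DocumentSource, its two maps and the replicate() step are hypothetical stand-ins for the database, the slave's local index copy and the periodic asynchronous copy; no Hibernate Search or Lucene classes are involved.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class DocumentSource {
    // Stand-in for the transactional database (always current).
    private final Map<Long, Map<String, String>> database = new HashMap<>();
    // Stand-in for the slave's local index copy (only as fresh as the last replication).
    private final Map<Long, Map<String, String>> slaveIndexCopy = new HashMap<>();

    /** Commits a row to the database. */
    void save(long id, Map<String, String> row) {
        database.put(id, new LinkedHashMap<>(row));
    }

    /** Simulates the periodic, asynchronous master-to-slave index copy. */
    void replicate() {
        slaveIndexCopy.clear();
        for (Map.Entry<Long, Map<String, String>> e : database.entrySet()) {
            slaveIndexCopy.put(e.getKey(), new LinkedHashMap<>(e.getValue()));
        }
    }

    /** Risky: starts from the local index copy, which may lag behind the database. */
    Map<String, String> documentFromSlaveCopy(long id, Map<String, String> extraFields) {
        return merge(slaveIndexCopy.getOrDefault(id, Map.of()), extraFields);
    }

    /** Safer: starts from the current database state. */
    Map<String, String> documentFromDatabase(long id, Map<String, String> extraFields) {
        return merge(database.getOrDefault(id, Map.of()), extraFields);
    }

    private static Map<String, String> merge(Map<String, String> base, Map<String, String> extra) {
        Map<String, String> document = new LinkedHashMap<>(base);
        document.putAll(extra);
        return document;
    }
}
```

If a row is saved, replicated and then saved again before the next replication, the document built from the slave copy still carries the old values, while the one built from the database carries the new ones.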

Quote:
Is it possible to add some logic to the FullTextIndexEventListener to add the additional fields? Could it be extended to get the additional data via sql and then add to the document created? I couldn't see anywhere where it was done.

You can extend FullTextIndexEventListener and register your own instead of the default; still, I don't like this solution.

Could we go back to your first sentence, and elaborate a bit more?
Quote:
I have a requirement to update the index created by HSearch and add some additional fields that are not mapped and cannot be mapped.

Where do the values for these additional fields come from? And why can't you map them?

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Update Index using Lucene
PostPosted: Fri Sep 18, 2009 6:31 am 
Pro

Joined: Wed Oct 03, 2007 2:31 pm
Posts: 205
Hi

Thanks for your response. Some of the code I added was experimental, and I agree completely with what you mentioned. After talking to the original developer of the domain model, I managed to find a way to map the fields (using formula, etc.). I now have the fields that I want, and they are a part of the indexed entity.

Apologies for the long-winded thread... our domain model is overly complicated for what we are doing!

Cheers


Last edited by amin-mc on Fri Sep 18, 2009 8:42 am, edited 1 time in total.

 Post subject: Re: Update Index using Lucene
PostPosted: Fri Sep 18, 2009 6:44 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
After spending a few more hours and talking to the developer who originally wrote the domain model (and who has since left the company), we managed to find a solution that works with Hibernate Search. No updates to anything except for the domain model.

Thanks!

Cool! As a final thought, it would have been really weird to have fields coming from nowhere, and also quite hard to present the query results without a DTO. You could have mapped that DTO with Hibernate Search, and it would probably have been good to save it in the database anyway for several reasons (record keeping, backup, index rebuilding, data state validation, not to mention debugging).

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Update Index using Lucene
PostPosted: Fri Sep 18, 2009 8:53 am 
Pro

Joined: Wed Oct 03, 2007 2:31 pm
Posts: 205
You have to see our codebase...:)


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.