All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 8 posts ] 
 Post subject: Restricting Hibernate/Lucene Indexing of the Entity
PostPosted: Mon Sep 24, 2012 2:56 am 
Newbie

Joined: Mon Sep 24, 2012 2:40 am
Posts: 5
I am fairly new to Lucene and Hibernate Search, and I am using FullTextEntityManager to create the indexes for my entity.
My entity has more than 12 million records (the last 4-5 years of data) and it has an updated_date_time field.
Is it possible to restrict indexing of my entity by date (i.e. index just the last 2 years)?

Code:
private Future doIndex(Class<?>... classes) {
    logger.info("Starting Hibernate Search index");
    FullTextEntityManager fullTextEntityManager = entityManagerFactory.createFullTextEntityManager();
    Future indexer = fullTextEntityManager.createIndexer(classes)
            .batchSizeToLoadObjects(50)
            .threadsToLoadObjects(1)
            .start();
    fullTextEntityManager.close();
    return indexer;
}


 Post subject: Re: Restricting Hibernate/Lucene Indexing of the Entity
PostPosted: Mon Sep 24, 2012 4:50 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
yes, we added a way to decide, for each entity instance, whether indexing should happen or not:

http://docs.jboss.org/hibernate/search/4.1/reference/en-US/html_single/#search-mapping-indexinginterceptor

The MassIndexer applies this as well, but only after loading each entity (to skip indexing it), so the index writing phase will be much faster but all the entities will still be loaded from the database.
We should improve that: as you can see, this new feature is flagged experimental in the docs. We're waiting for more feedback on what people think of the API and whether it addresses your needs. Let me know how it works for you ;-)

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Restricting Hibernate/Lucene Indexing of the Entity
PostPosted: Mon Sep 24, 2012 7:47 pm 
Newbie

Joined: Mon Sep 24, 2012 2:40 am
Posts: 5
Thanks Sanne, I will try this out and provide my feedback :)


 Post subject: Re: Restricting Hibernate/Lucene Indexing of the Entity
PostPosted: Tue Sep 25, 2012 12:35 am 
Newbie

Joined: Mon Sep 24, 2012 2:40 am
Posts: 5
Hi,
I have tried the interceptor but it doesn't seem to work.

This is my entity class:
Code:
@Entity
@Indexed(interceptor=RestrictIndexOnDateInterceptor.class)
@Table(name = "transaction")
public class Transaction implements Serializable {

    @Id
    @Column(name = "transaction_id")
    private BigDecimal transaction_id;
    .....
    @Column(name = "updated_tm")
    private Date updated_tm;
    ...

   public Date getUpdated_tm() {
        return updated_tm;
    }
   ...
}


This is the interceptor class
Code:
// Imports assume Joda-Time for DateTime and SLF4J for logging.
import org.hibernate.search.indexes.interceptor.EntityIndexingInterceptor;
import org.hibernate.search.indexes.interceptor.IndexingOverride;
import org.joda.time.DateTime;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RestrictIndexOnDateInterceptor implements EntityIndexingInterceptor<Transaction> {

    private static final int NO_OF_YEARS_TO_RETAIN_INDEXING = 2;
    private final Logger logger = LoggerFactory.getLogger(RestrictIndexOnDateInterceptor.class);

    // True when the entity was updated within the retention window.
    // Entities without an update timestamp are treated as not recent.
    private boolean isRecent(Transaction entity) {
        if (entity.getUpdated_tm() == null) {
            return false;
        }
        DateTime lastEntityUpdatedTime = new DateTime(entity.getUpdated_tm());
        DateTime cutoff = new DateTime().minusYears(NO_OF_YEARS_TO_RETAIN_INDEXING);
        return !lastEntityUpdatedTime.isBefore(cutoff);
    }

    @Override
    public IndexingOverride onAdd(Transaction entity) {
        if (isRecent(entity)) {
            logger.info("Adding Index");
            return IndexingOverride.APPLY_DEFAULT;
        }
        logger.info("Skipping Index");
        return IndexingOverride.SKIP;
    }

    @Override
    public IndexingOverride onUpdate(Transaction entity) {
        if (isRecent(entity)) {
            logger.info("Updating Index");
            return IndexingOverride.UPDATE;
        }
        // The entity has aged out of the retention window: drop it from the index.
        logger.info("Removing index");
        return IndexingOverride.REMOVE;
    }

    @Override
    public IndexingOverride onDelete(Transaction entity) {
        return IndexingOverride.APPLY_DEFAULT;
    }

    @Override
    public IndexingOverride onCollectionUpdate(Transaction entity) {
        return onUpdate(entity);
    }
}



This one creates the indexes
Code:
@Component
public class HibernateSearchIndex {
....

    private Future doIndex(Class<?>... classes) {
        logger.info("Starting Hibernate Search index");
        FullTextEntityManager fullTextEntityManager = entityManagerFactory.createFullTextEntityManager();
        Future indexer = fullTextEntityManager.createIndexer(classes)
                .progressMonitor(progressMonitor())
                .batchSizeToLoadObjects(batchSize)
                .threadsToLoadObjects(1)
                .start();
        fullTextEntityManager.close();
        return indexer;
    }
.....
}


I tried this program on a subset of the data (50K records).
When I search for a transaction that is more than 2 years old, it is still returned in the results.

Also, the log does not show any of the logger.info statements.
Is this interceptor really getting called?


 Post subject: Re: Restricting Hibernate/Lucene Indexing of the Entity
PostPosted: Thu Sep 27, 2012 10:38 pm 
Newbie

Joined: Mon Sep 24, 2012 2:40 am
Posts: 5
Hi,

Could you point me to anything that might help?
I am stuck here.

Regards,
Sumit


 Post subject: Re: Restricting Hibernate/Lucene Indexing of the Entity
PostPosted: Fri Sep 28, 2012 5:19 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
apparently the MassIndexer is ignoring the indexing interceptor. I'll look into that tomorrow and fix the issue if there is one.

In the meantime, could you use the other indexing strategy, the one described as "using flush to indexes" in the docs at
http://docs.jboss.org/hibernate/search/4.2/reference/en-US/html_single/#search-batchindex-flushtoindexes
?

That way, since you control the process manually, you can define the Criteria that selects the range of entities to index directly, so you can add any restriction you might need.
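
To make the shape of that manual loop concrete, here is a minimal sketch in plain Java, with no Hibernate classes involved. Everything in it is a hypothetical stand-in: Row plays the role of the entity, IndexSink plays the role of the session's index() and flushToIndexes()/clear() calls, and the in-memory date check stands in for the date restriction that the Criteria would apply in the database. It only illustrates the "filter by date, index, flush every N" pattern and is not the code from the docs.

```java
import java.util.Calendar;
import java.util.Date;

final class RangeIndexingSketch {

    /** Hypothetical stand-in for the entity; only the timestamp matters here. */
    static final class Row {
        final long id;
        final Date updatedTm;

        Row(long id, Date updatedTm) {
            this.id = id;
            this.updatedTm = updatedTm;
        }
    }

    /** Hypothetical stand-in for the index-writing side of the session. */
    interface IndexSink {
        void index(Row row);   // plays the role of index(entity)
        void flushAndClear();  // plays the role of flushToIndexes() followed by clear()
    }

    /** Returns the instant that lies the given number of years before now. */
    static Date cutoff(Date now, int years) {
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(now);
        calendar.add(Calendar.YEAR, -years);
        return calendar.getTime();
    }

    /**
     * Indexes only rows updated at or after the cutoff, flushing after every
     * batchSize indexed rows and once more for a trailing partial batch.
     * Returns the number of rows indexed.
     */
    static int indexRecent(Iterable<Row> rows, Date cutoff, int batchSize, IndexSink sink) {
        int indexed = 0;
        for (Row row : rows) {
            // In a real job the Criteria restriction would exclude these rows
            // in the database query, so they would never be loaded at all.
            if (row.updatedTm == null || row.updatedTm.before(cutoff)) {
                continue;
            }
            sink.index(row);
            indexed++;
            if (indexed % batchSize == 0) {
                sink.flushAndClear(); // keeps memory use bounded
            }
        }
        if (indexed % batchSize != 0) {
            sink.flushAndClear(); // write out the last partial batch
        }
        return indexed;
    }
}
```

The important difference from the interceptor approach is where the filtering happens: with a restriction in the query, rows older than the cutoff are never loaded, which matters with 12 million records.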

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Restricting Hibernate/Lucene Indexing of the Entity
PostPosted: Thu Oct 04, 2012 1:58 am 
Newbie

Joined: Mon Sep 24, 2012 2:40 am
Posts: 5
Thanks Sanne.

Just a quick question: which version of Hibernate is recommended?

I am using the following:

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-core</artifactId>
    <version>4.1.2.Final</version>
</dependency>
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-entitymanager</artifactId>
    <version>4.1.2.Final</version>
</dependency>
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-validator</artifactId>
    <version>4.2.0.Final</version>
</dependency>
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search</artifactId>
    <version>4.1.0.Final</version>
</dependency>


I saw there is a beta release of hibernate-search, 4.2.0.Beta1.
Is this feature activated in that release?


 Post subject: Re: Restricting Hibernate/Lucene Indexing of the Entity
PostPosted: Thu Oct 04, 2012 6:46 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
This is the dependency list from the 4.1.1.Final release of Hibernate Search, so these versions are certainly tested together:

Code:
  org.hibernate:hibernate-core:4.1.3.Final
  org.hibernate:hibernate-search-engine:4.1.1.Final
  org.hibernate:hibernate-search-orm:4.1.1.Final
  org.apache.lucene:lucene-analyzers:3.5.0
  org.hibernate.common:hibernate-commons-annotations:4.0.1.Final
  org.hibernate:hibernate-entitymanager:4.1.3.Final
  org.apache.lucene:lucene-core:3.5.0
  dom4j:dom4j:1.6.1
  org.jboss.logging:jboss-logging:3.1.0.GA
  org.hibernate.javax.persistence:hibernate-jpa-2.0-api:1.0.1.Final
  org.javassist:javassist:3.15.0-GA


If you take the 4.2 Beta1 release, or try [master] from git, most of these are slightly updated; the notable difference is that Search 4.2.x works with Lucene 3.6.x rather than 3.5.x.

I made these lists by running "mvn dependency:list" on the tagged version; another way to check the recommended combination of jars is to look in the distribution and see which jar versions it contains.

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.