All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 11 posts ] 
 Post subject: Unindexing soft deletes
PostPosted: Sat Dec 10, 2011 6:06 pm 
Newbie

Joined: Sat Dec 10, 2011 5:57 pm
Posts: 11
Hi everybody.
What is the right approach to unindexing soft-deleted entities? I used to extend FullTextIndexEventListener, overriding the onPostUpdate() method and checking the entity's isDeleted() flag there.
But in the latest Hibernate Search it is no longer easy to extend the class and override that method. I'm certain there must be a better solution...
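
A minimal, library-free sketch of the listener idea described above, for readers who have not seen it: an update hook that removes the entity from the index when its soft-delete flag is set and re-indexes it otherwise. The Article, InMemoryIndex and SoftDeleteAwareListener types are hypothetical stand-ins invented for illustration, not Hibernate Search classes; in a real application the hook would be the overridden onPostUpdate() of the event listener.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity with a soft-delete flag (stand-in for a mapped entity).
class Article {
    final long id;
    final String title;
    boolean deleted;

    Article(long id, String title) {
        this.id = id;
        this.title = title;
    }

    boolean isDeleted() {
        return deleted;
    }
}

// Hypothetical in-memory stand-in for the full-text index.
class InMemoryIndex {
    private final Map<Long, String> documents = new HashMap<>();

    void addOrUpdate(long id, String text) {
        documents.put(id, text);
    }

    void remove(long id) {
        documents.remove(id);
    }

    boolean contains(long id) {
        return documents.containsKey(id);
    }
}

// Models the overridden update hook: purge soft-deleted entities, index the rest.
class SoftDeleteAwareListener {
    private final InMemoryIndex index;

    SoftDeleteAwareListener(InMemoryIndex index) {
        this.index = index;
    }

    void onPostUpdate(Article entity) {
        if (entity.isDeleted()) {
            index.remove(entity.id);                    // unindex instead of updating
        } else {
            index.addOrUpdate(entity.id, entity.title); // normal re-indexing
        }
    }
}
```

The point of the design is that no document is ever built for a soft-deleted entity; the hook short-circuits before any indexing work happens.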

Thanks in advance


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 6:58 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
since I don't think we offer a better strategy (suggestions welcome), you will likely still need to replace the FullTextIndexEventListener; what is preventing you from doing so?

You would likely have to write your own version of org.hibernate.search.hcore.impl.HibernateSearchIntegrator to register your custom listeners, and you can disable auto-registration of the default one by setting the Hibernate property hibernate.search.autoregister_listeners to false.

As an alternative, you could use sharding: soft deleted entities go to a special shard which is configured to use the blackhole backend, which just discards any index update.
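
The sharding idea can be sketched without the library as follows. This is only a model under stated assumptions: Shard, StoringShard, BlackholeShard and SoftDeleteRouter are hypothetical names, not the Hibernate Search sharding API, and an update is modelled as "delete the old copy everywhere, then add to the shard picked by the flag".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one index shard.
interface Shard {
    void add(long id, String document);
    void delete(long id);
    boolean contains(long id);
}

// Keeps what it receives, like a shard backed by a real index.
class StoringShard implements Shard {
    private final Map<Long, String> documents = new HashMap<>();

    public void add(long id, String document) { documents.put(id, document); }
    public void delete(long id) { documents.remove(id); }
    public boolean contains(long id) { return documents.containsKey(id); }
}

// Discards every operation, modelling a shard on the blackhole backend.
class BlackholeShard implements Shard {
    public void add(long id, String document) { /* dropped on purpose */ }
    public void delete(long id) { /* dropped on purpose */ }
    public boolean contains(long id) { return false; }
}

// Routes each update to a shard based on the soft-delete flag.
class SoftDeleteRouter {
    private final Shard liveShard;
    private final Shard deletedShard;

    SoftDeleteRouter(Shard liveShard, Shard deletedShard) {
        this.liveShard = liveShard;
        this.deletedShard = deletedShard;
    }

    void update(long id, String document, boolean softDeleted) {
        // Purge any previous copy from both shards...
        liveShard.delete(id);
        deletedShard.delete(id);
        // ...then add the new document to the shard chosen by the flag.
        Shard target = softDeleted ? deletedShard : liveShard;
        target.add(id, document);
    }
}
```

Note that the router receives an already built document, so this model does nothing to avoid the cost of building it.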

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 8:24 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Quote:
As an alternative, you could use sharding: soft deleted entities go to a special shard which is configured to use the blackhole backend, which just discards any index update.


I like this idea. It is actually a good use case for the blackhole backend, which is otherwise used only for testing :-)


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 8:32 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Yep, I like it too as it's pretty clean, but Documents would still be built before being discarded, so they might even trigger additional loads from the database. Not optimal performance-wise.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 9:29 am 
Newbie

Joined: Mon Aug 30, 2010 3:57 am
Posts: 4
s.grinovero wrote:
Hi,
since I don't think we offer a better strategy (suggestions welcome), you will likely still need to replace the FullTextIndexEventListener; what is preventing you from doing so?


Hi, thanks for your reply.
Well, it would probably be possible to copy FullTextIndexEventListener and modify it to suit my needs. But ideally I would like to override just the onPostUpdate() method, which would make my code easier to maintain across HSearch updates. However, the onPostUpdate() method uses a private attribute (dirtyStrategy), so I haven't managed to override it yet.

s.grinovero wrote:
As an alternative, you could use sharding: soft deleted entities go to a special shard which is configured to use the blackhole backend, which just discards any index update.


As you mentioned later, performance may suffer in this case, so I would like to avoid building documents on an index delete action.
A similar strategy might be to index even soft-deleted entities with a deleted field, then filter them out at search time with a Lucene filter.
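
To make the filtering alternative concrete, here is a small library-free sketch. IndexedDoc and FilteringSearch are hypothetical names, and the substring match stands in for a real full-text query; the only point is that soft-deleted documents stay in the index and are dropped at query time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical indexed document carrying an extra "deleted" field.
class IndexedDoc {
    final String text;
    final boolean deleted;

    IndexedDoc(String text, boolean deleted) {
        this.text = text;
        this.deleted = deleted;
    }
}

// Indexes everything, then filters soft-deleted documents out of each search.
class FilteringSearch {
    private final List<IndexedDoc> index = new ArrayList<>();

    void index(String text, boolean deleted) {
        index.add(new IndexedDoc(text, deleted));
    }

    int indexSize() {
        return index.size();
    }

    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (IndexedDoc doc : index) {
            // The filter: skip documents flagged as deleted.
            if (!doc.deleted && doc.text.contains(term)) {
                hits.add(doc.text);
            }
        }
        return hits;
    }
}
```

In this model the index holds every document, deleted or not, and every search has to apply the filter.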


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 9:39 am 
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
MDMDMD wrote:
As you mentioned later, performance may suffer in this case, so I would like to avoid building documents on an index delete action.
A similar strategy might be to index even soft-deleted entities with a deleted field, then filter them out at search time with a Lucene filter.


Right. Indexing the soft-deleted entities and then applying a filter at search time is yet another possibility. You are still building Lucene documents for these soft-deleted entities, though, so I don't think there is a performance gain compared to the blackhole approach. Also, with the filtering approach the index is bigger than with the blackhole approach. The blackhole discards the documents, which is also why you don't need a filter in that case.

In the end, choosing the best strategy depends a little on the use case and on personal taste.

--Hardy


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 9:39 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Well, it would probably be possible to copy FullTextIndexEventListener and modify it to suit my needs. But ideally I would like to override just the onPostUpdate() method, which would make my code easier to maintain across HSearch updates. However, the onPostUpdate() method uses a private attribute (dirtyStrategy), so I haven't managed to override it yet.

If you could propose a patch for FullTextIndexEventListener making the private method protected, plus any other changes you might need, we will integrate it. I think at some point we will still make FullTextIndexEventListener final and not extension-friendly, but obviously only after having found a decent solution for your use case.

Quote:
As you mentioned later, performance may suffer in this case, so I would like to avoid building documents on an index delete action.
A similar strategy might be to index even soft-deleted entities with a deleted field, then filter them out at search time with a Lucene filter.

It wouldn't be that similar: you would do real I/O on the index, which is the most expensive operation, invalidate all Reader caches, make your index larger, be forced to use a filter, and use more memory for every query.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 10:01 am 
Newbie

Joined: Mon Aug 30, 2010 3:57 am
Posts: 4
s.grinovero wrote:
It wouldn't be that similar: you would do real I/O on the index, which is the most expensive operation, invalidate all Reader caches, make your index larger, be forced to use a filter, and use more memory for every query.


You are right, good point...

However, I'm quite surprised there is no "better" solution, one designed for this use case. I thought unindexing soft deletes was quite a common use case.


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 10:24 am 
Newbie

Joined: Sat Dec 10, 2011 5:57 pm
Posts: 11
s.grinovero wrote:
If you could propose a patch for FullTextIndexEventListener making the private method protected, plus any other changes you might need, we will integrate it. I think at some point we will still make FullTextIndexEventListener final and not extension-friendly, but obviously only after having found a decent solution for your use case.


So I've finally managed to override it. The problem was that I didn't perform a clean & build in the NB IDE after the HSearch update, so I was getting strange errors. But now I have overridden only the onPostUpdate() method and it works.

If you can make the FullTextIndexEventListener.getDocumentBuilder() method protected (I had to copy it), that will be all I need.


 Post subject: Re: Unindexing soft deletes
PostPosted: Mon Dec 12, 2011 11:59 am 
Newbie

Joined: Wed Sep 21, 2011 2:20 pm
Posts: 16
MDMDMD wrote:
However, I'm quite surprised there is no "better" solution, one designed for this use case. I thought unindexing soft deletes was quite a common use case.

I can only second this; there should be a better solution. I think it is quite a common use case. For example, I have an attribute "visible" in an indexed entity, and I use a filter while searching. But it would be much better if entities with visible=false didn't go into the index at all. I think it is very common that the index does not have to be a one-to-one copy of the table.

It would be good to have some kind of configurable bridge (or whatever) that receives the entity and can return false when the entity should not be indexed. HS can then decide whether to just ignore the entity or delete it (if it was indexed before).

I mean, it is possible to leave some attributes/columns of a table out of the index, so it should also be possible to leave some entities/rows out of the index entirely (based on some computation on the entity).
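
As a rough illustration of what such a hook might look like, here is a library-free sketch. Everything in it is hypothetical: ConditionalIndexer, its Action enum and the Product entity are invented names, not an existing Hibernate Search API, and the indexer keeps its own set of indexed ids purely so the example can tell "ignore" from "delete".

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical entity with a "visible" attribute.
class Product {
    final boolean visible;

    Product(boolean visible) {
        this.visible = visible;
    }
}

// Applies a user-supplied rule to decide whether an entity belongs in the index.
class ConditionalIndexer<T> {
    enum Action { INDEXED, REMOVED, IGNORED }

    private final Set<Long> indexedIds = new HashSet<>();
    private final Predicate<T> shouldIndex;

    ConditionalIndexer(Predicate<T> shouldIndex) {
        this.shouldIndex = shouldIndex;
    }

    Action onChange(long id, T entity) {
        if (shouldIndex.test(entity)) {
            indexedIds.add(id);   // add or refresh the entry
            return Action.INDEXED;
        }
        // remove() reports whether the id was present, i.e. indexed before.
        return indexedIds.remove(id) ? Action.REMOVED : Action.IGNORED;
    }
}
```

With a rule such as `p -> p.visible`, a visible product is indexed, a product that becomes invisible is removed, and one that was never indexed is simply ignored.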

Kind regards


 Post subject: Re: Unindexing soft deletes
PostPosted: Fri Dec 16, 2011 3:20 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
I agree with all your use cases; it sounds important to implement this.
But there is a catch: let's say you have an indexed entity and its state changes, so that you end up in either of these cases:

- it was indexed before, and now it should not be
- it wasn't indexed before, and now it should be

I don't think we can detect this situation, so to make sure the index is updated we would have to delete documents which might not be in the index, or use update statements even for elements which might be added for the first time (and update has a significantly higher performance cost than a plain add).
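
The catch can be made concrete with a tiny library-free sketch. BlindSyncPlanner and its Op enum are hypothetical names, and an update is modelled here as a delete followed by an add. Because the planner does not know whether the entity was indexed before, it always schedules the delete.

```java
import java.util.ArrayList;
import java.util.List;

// Plans index work without knowing the entity's previous indexing state.
class BlindSyncPlanner {
    enum Op { DELETE, ADD }

    static List<Op> plan(boolean shouldBeIndexedNow) {
        List<Op> ops = new ArrayList<>();
        // Always delete: the document may or may not be in the index.
        ops.add(Op.DELETE);
        if (shouldBeIndexedNow) {
            // Delete followed by add is how this sketch models an update.
            ops.add(Op.ADD);
        }
        return ops;
    }
}
```

An entity that should be indexed for the first time therefore costs a delete plus an add instead of a plain add, and an entity that was never indexed still triggers a delete.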

We might even add more listeners in Hibernate Core to detect the state change, but that is complex and, most importantly, it won't be able to make the right decision in all cases, for example when your index exclusion/inclusion rule evaluates external attributes, like the current time.

What would you expect the new feature to look like, API-wise? We need some ideas...

_________________
Sanne
http://in.relation.to/

