All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 7 posts ] 
Author Message
 Post subject: PostCollectionRecreateEvent causes reindexing
PostPosted: Wed May 06, 2009 8:13 am 
Newbie

Joined: Thu Apr 24, 2008 4:55 am
Posts: 6
Hi,

I am trying to implement a way to suppress indexing unless indexed fields were modified, so that an entity is reindexed if and only if one of its indexed fields changed.

After thinking about it and trying some things out, I decided to override org.hibernate.search.event.FullTextIndexEventListener (FullTextIndexEventListener).

My code now checks the event, extracts the modified object and then calls my own function, which says whether indexed fields were modified for this object.
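A minimal sketch of what such a check could look like, written as plain Java so it runs without Hibernate on the classpath. The class `IndexedFieldDirtyCheck` and its method are hypothetical names, not part of Hibernate Search. The three arrays are assumed to be parallel, in the way a Hibernate `PostUpdateEvent` exposes property names, old state and new state, and the set of indexed property names is assumed to be maintained by the application.

```java
import java.util.Objects;
import java.util.Set;

/** Hypothetical helper, not part of Hibernate Search. */
final class IndexedFieldDirtyCheck {

    private IndexedFieldDirtyCheck() {}

    /**
     * Returns true when at least one property named in indexedProperties
     * has a different value in newState than in oldState.
     * propertyNames, oldState and newState hold one slot per mapped property.
     */
    static boolean indexedFieldChanged(String[] propertyNames,
                                       Object[] oldState,
                                       Object[] newState,
                                       Set<String> indexedProperties) {
        if (oldState == null || newState == null) {
            // No snapshot to compare against: be conservative and reindex.
            return true;
        }
        for (int i = 0; i < propertyNames.length; i++) {
            if (indexedProperties.contains(propertyNames[i])
                    && !Objects.equals(oldState[i], newState[i])) {
                return true;
            }
        }
        return false;
    }
}
```

A listener subclass would call this before delegating to the default behaviour and skip the delegation when it returns false. Note that this only compares property values, so it cannot see changes that reach the index through a custom bridge.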

My issue is with PostCollectionRecreateEvent. This event is sometimes fired when the session auto-flushes.
The problem is that it causes reindexing. The event is captured by FullTextIndexEventListener, which then queues work that I suspect reindexes the item. I have noticed that PostCollectionRecreateEvent is fired for collections that have nothing to do with indexing and should not cause indexing to occur.

Does anyone know about the design of FullTextIndexEventListener and why this event is treated this way?


 
 Post subject: Re: PostCollectionRecreateEvent causes reindexing
PostPosted: Thu May 14, 2009 2:05 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
the PostCollectionRecreateEvent is registered in Hibernate (Core) so that Search is able to manage indexed collections; Core, however, doesn't know the difference between indexed and non-indexed entities, so the event has to be raised, and Search should ignore this kind of event for unindexed collections.
If they are not ignored (I didn't verify), please open a JIRA issue!

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: PostCollectionRecreateEvent causes reindexing
PostPosted: Thu May 14, 2009 4:29 am 
Newbie

Joined: Thu Apr 24, 2008 4:55 am
Posts: 6
Hi,

Can you please explain a bit more about why this event is important?

Also note that Hibernate is not capable of detecting changes (see the bugs quoted below), so how can it check for changes in the parent entity or the children?

http://opensource.atlassian.com/project ... SEARCH-361
http://opensource.atlassian.com/project ... /HSEARCH-5


 
 Post subject: Re: PostCollectionRecreateEvent causes reindexing
PostPosted: Thu May 14, 2009 6:47 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hibernate is definitely able to detect changes in all its managed entities;
what's not really doable is to detect whether the index is still consistent with the changed entities, as the transformation into a Lucene Document is, in most use cases, one-way.
So if Hibernate detects that "some" change happened on the entity,
it will fire this event and Search will have to recreate the Lucene Document to replace the existing one in the index.
What is not implemented is the idea that you might verify that only non-indexed fields changed, making the reindexing unnecessary. Still, even if only one field changed, or if a single custom classbridge/fieldbridge is present, reindexing might be necessary (the bridge might use time-dependent information, or depend on the state of non-indexed fields).

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: PostCollectionRecreateEvent causes reindexing
PostPosted: Thu May 14, 2009 7:40 am 
Newbie

Joined: Thu Apr 24, 2008 4:55 am
Posts: 6
So for reindexing to be useful in the context of this event, the following conditions have to be met:

1. The collection is used in the Lucene index.
2. An element in the collection was modified.

And the assumption is that some class/field bridge would then be fired and the Lucene index would be correctly updated.

However, I have noticed the event being fired when the entity was modified but its collections were not modified at all and are not relevant for indexing (which I understand Hibernate cannot know).
This would not be noticed easily, because normally changing the entity forces reindexing anyway, except in my case, where I suppress it.

In any case, I can't see a more optimal solution than what I implemented, because Lucene has no idea what fields I'm going to add to the index and what metadata they will be using. So it is up to me to maintain this dirty bit and use it.

I do think there is actually a deep issue here, caused by Hibernate Search not being aware of the fields. The fieldbridge/classbridge design is too loose and allows the developer to create any field they like, which is why Hibernate cannot detect changes.

Perhaps the fieldbridge/classbridge should take dirty checking into consideration, so that the developer has to get involved and implement something, e.g. by adding a function called isDirty() to each field/classbridge. This would still leave a significant amount of work for the developer, but at least they would now be aware of it.
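To make the isDirty() idea concrete, here is a rough sketch in plain Java. Everything in it is hypothetical: Hibernate Search bridges have no `isDirty()` method, and the names `DirtyAwareBridge`, `PlainValueBridge`, `TimeDependentBridge` and `BridgeDirtyCheck` are invented for illustration. The default implementation is assumed to answer "dirty", so a bridge that says nothing about itself still forces a reindex.

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical extension point; real Hibernate Search bridges have no such method. */
interface DirtyAwareBridge {
    /** Says whether the indexed output could differ between the two values. */
    default boolean isDirty(Object oldValue, Object newValue) {
        // Conservative default: without bridge-specific knowledge, assume dirty.
        return true;
    }
}

/** Example bridge whose output depends only on the property value itself. */
final class PlainValueBridge implements DirtyAwareBridge {
    @Override
    public boolean isDirty(Object oldValue, Object newValue) {
        return !Objects.equals(oldValue, newValue);
    }
}

/** Example bridge with time-dependent output; it keeps the conservative default. */
final class TimeDependentBridge implements DirtyAwareBridge { }

final class BridgeDirtyCheck {

    private BridgeDirtyCheck() {}

    /** Reindex when any bridge reports that its output may have changed. */
    static boolean needsReindex(List<DirtyAwareBridge> bridges,
                                Object[] oldValues,
                                Object[] newValues) {
        for (int i = 0; i < bridges.size(); i++) {
            if (bridges.get(i).isDirty(oldValues[i], newValues[i])) {
                return true;
            }
        }
        return false;
    }
}
```

With this shape, one bridge that cannot answer the question is enough to fall back to always reindexing, which is the behaviour without any dirty checking.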


 
 Post subject: Re: PostCollectionRecreateEvent causes reindexing
PostPosted: Thu May 14, 2009 9:01 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Yes, you have some good ideas, but it would be better to make the isDirty() implementation optional, as in most cases it's not a priority for the developer.
But this is http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-361, right?

What about first implementing a simpler improvement, like ignoring the collection event if a) there are no classbridges (or some classbridge option "enables" this), b) there are no indexed collections?
In this case you could avoid all hash/dirty strategies, which could eventually be added later.
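The simpler rule could be sketched as a single predicate. This is an illustration only: `CollectionEventFilter` and its parameters are hypothetical names, it assumes that conditions a) and b) must both hold, and it assumes the caller already knows whether the entity has a class bridge and which of its collections are indexed.

```java
import java.util.Set;

/** Hypothetical helper, not part of Hibernate Search. */
final class CollectionEventFilter {

    private CollectionEventFilter() {}

    /**
     * A collection event can be ignored only when nothing could have turned
     * collection content into index fields: no class bridge on the entity
     * and no indexed collections at all.
     */
    static boolean canIgnoreCollectionEvent(boolean hasClassBridge,
                                            Set<String> indexedCollections) {
        return !hasClassBridge && indexedCollections.isEmpty();
    }
}
```

No old or new state is compared here, which is why this variant needs no hash or dirty strategy.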

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: PostCollectionRecreateEvent causes reindexing
PostPosted: Thu May 14, 2009 9:39 am 
Newbie

Joined: Thu Apr 24, 2008 4:55 am
Posts: 6
I think you have the right idea here: the source of all the trouble is the class bridge and its ability to add fields which are unknown to Hibernate.
But there is another way to add an "unknown" field.
In a field bridge (not a class bridge), it is possible (and I do it) to add fields with a different name than the one given in the annotation or the name of the property.
I do it to implement the following.
We have Content C1 and C2.
Content has a field called "title".
C1 relates to C2 through a property called "related" (a many-to-one relationship).
What I wanted is that when you search for some word that appears in C1, you get C1 and C2 as search results because C1 is related to C2, but the related match must rank lower because the word does not appear as part of the main content.
So now we have two different fields: one for C1.title and one for C1.related.title.
I implemented this by putting a field bridge on "title" such that when the field bridge detects "related.title" in the name, it adds the field to the parent content BUT gives it a lower boost value (implemented using a payload).
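The name-dependent behaviour described above can be modelled roughly as follows. This is a plain-Java sketch, not the real `FieldBridge` API: `IndexField` stands in for a Lucene field, the list stands in for the Lucene Document, a simple boost number replaces the payload mechanism, and the value `0.5f` is an arbitrary assumption.

```java
import java.util.List;

/** Hypothetical stand-in for a Lucene field: name, value and boost. */
final class IndexField {
    final String name;
    final String value;
    final float boost;

    IndexField(String name, String value, float boost) {
        this.name = name;
        this.value = value;
        this.boost = boost;
    }
}

/** Hypothetical model of a bridge that reacts to the field name it is called with. */
final class RelatedTitleBridge {

    static final float DEFAULT_BOOST = 1.0f;
    static final float RELATED_BOOST = 0.5f; // assumed value, for illustration only

    private RelatedTitleBridge() {}

    /**
     * Adds the value to the document under the given field name.
     * A name containing "related.title" means the bridge was reached through
     * the "related" association, so the field gets a lower boost.
     */
    static void set(String fieldName, String value, List<IndexField> document) {
        if (value == null) {
            return; // nothing to index
        }
        float boost = fieldName.contains("related.title") ? RELATED_BOOST : DEFAULT_BOOST;
        document.add(new IndexField(fieldName, value, boost));
    }
}
```

Calling `set("title", ...)` and `set("related.title", ...)` on the same document yields two differently named fields from one bridge, which is exactly the kind of field that a generic dirty check cannot predict.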

I don't know for sure what the best solution is going forward; perhaps some more people should share their experiences.


 
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.