All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 
 Post subject: behavior of @AnalyzerDiscriminator on collections
PostPosted: Tue May 17, 2011 9:45 am 
Newbie

Joined: Tue Jan 18, 2011 9:01 am
Posts: 4
Hello all!

I am currently working on i18n of certain properties of our JPA entities. I wanted to use @AnalyzerDiscriminator on a 'language' field of my localized entities as follows (I only include the important part of the code):
Code:
@Entity
@Indexed
@AnalyzerDefs({
      @AnalyzerDef(name = "en",
            tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
            filters = { @TokenFilterDef(factory = SnowballPorterFilterFactory.class,
                  params = { @Parameter(name = "language", value = "English") }) }),
      @AnalyzerDef(name = "fr",
            tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
            filters = { @TokenFilterDef(factory = SnowballPorterFilterFactory.class,
                  params = { @Parameter(name = "language", value = "French") }) }) })
@Analyzer(definition = "fr")
public class IndexedEntity {

   @Id
   @GeneratedValue(strategy = GenerationType.AUTO)
   private long id;

   @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
   @MapKey(name = "language")
   @IndexedEmbedded
   private Map<String, IndexedEntityI18N> i18n = new HashMap<String, IndexedEntityI18N>();

[...]
}

@Entity
public class IndexedEntityI18N {

   @Id
   @GeneratedValue(strategy = GenerationType.AUTO)
   private long id;

   private String language;

   @Field
   private String name;

   @AnalyzerDiscriminator(impl = I18NDiscriminator.class)
   public String getLanguage() {
      return language;
   }
[...]
}

public class I18NDiscriminator implements Discriminator {

   @Override
   public String getAnalyzerDefinitionName(Object value, Object entity,
         String field) {
      return (String) value;
   }

}

The idea is to index the same field several times, once per entry in a map of localized objects. The analyzer discriminator was meant to select the appropriate language analyzer based on the 'language' property (i.e. if language is set to 'en' we use the 'en' analyzer, and so on). I was really happy with this design :) Unfortunately it doesn't work that way. After some debugging I found out that the analyzer is determined only once for all collection elements (since they all share the same index keys).

The solution is then to also 'localize' the index key. This way it is unique for each collection element, and the analyzer is determined separately for each of them. For example, using a class bridge like this (I know it is not generic, but it's just an example):
Code:
public class I18NBridge implements FieldBridge {

   @Override
   public void set(String name, Object value, Document document,
         LuceneOptions options) {
      // The bridged value is one localized entry of the collection.
      IndexedEntityI18N i18nInfos = (IndexedEntityI18N) value;
      // Suffix the field name with the language so that each language
      // gets its own index key.
      options.addFieldToDocument("i18n.name." + i18nInfos.getLanguage(),
            i18nInfos.getName(), document);
   }

}

Now we have separate 'i18n.name.fr' and 'i18n.name.en' index keys, each indexed with the correct analyzer.
The only problem is that this approach complicates search queries when we want to search all language versions at once.
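
One way to keep such multi-language searches manageable is to generate the localized field names from a single list of supported languages. What follows is only a minimal sketch in plain Java: the LocalizedFields class, its methods and the list of languages are hypothetical names chosen for illustration, and only the 'i18n.name.<language>' naming comes from the bridge above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hibernate Search or Lucene.
final class LocalizedFields {

    // Builds the index key for one language, e.g. "i18n.name" + "fr" -> "i18n.name.fr".
    static String fieldFor(String baseField, String language) {
        return baseField + "." + language;
    }

    // Builds the index keys for every supported language, in the given order.
    static String[] fieldsFor(String baseField, List<String> languages) {
        String[] fields = new String[languages.size()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fieldFor(baseField, languages.get(i));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Assumption: the application supports exactly these two languages.
        String[] fields = fieldsFor("i18n.name", Arrays.asList("en", "fr"));
        System.out.println(Arrays.toString(fields)); // prints [i18n.name.en, i18n.name.fr]
    }
}
```

The resulting array could then be handed to whatever multi-field query mechanism is in use (Lucene's MultiFieldQueryParser, for instance), keeping in mind that each field should still be analyzed with its own language analyzer at query time.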

The question is: is this behavior on collections a bug, or is it intentional (and thus has a logical explanation)?

Thanks in advance for your opinions!

Best regards,
Michal


 Post subject: Re: behavior of @AnalyzerDiscriminator on collections
PostPosted: Wed May 18, 2011 9:34 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
The problem is mainly that you don't want to mix different analyzers on the same field: when you perform a query, you want to analyze the user's query text with the same analyzer as the field you're matching against.
If you were to apply different analyzers (one per language) to the same field, their output would all be stored in the same field and matching it properly would be a mess. You would likely not get the results you expect; it may look like it works, but it would hardly be predictable.
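
To make the mismatch concrete, here is a toy sketch in plain Java. It uses neither Lucene nor Hibernate Search: the two "analyzers" are made-up suffix strippers standing in for real per-language stemmers, and MixedAnalyzerDemo and its members are hypothetical names. It only shows that a term indexed with one analyzer is found when the query goes through the same analyzer and can be missed when it goes through another.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

// Toy illustration only; these are NOT real stemmers.
final class MixedAnalyzerDemo {

    // Toy "English" analyzer: lowercases and strips a trailing "s".
    static final Function<String, String> TOY_EN =
            w -> stripSuffix(w.toLowerCase(Locale.ROOT), "s");

    // Toy "French" analyzer: lowercases and strips a trailing "es".
    static final Function<String, String> TOY_FR =
            w -> stripSuffix(w.toLowerCase(Locale.ROOT), "es");

    static String stripSuffix(String word, String suffix) {
        return word.endsWith(suffix)
                ? word.substring(0, word.length() - suffix.length())
                : word;
    }

    // A query matches when its analyzed form equals a term stored in the field.
    static boolean matches(Set<String> indexedTerms, String query,
            Function<String, String> queryAnalyzer) {
        return indexedTerms.contains(queryAnalyzer.apply(query));
    }

    public static void main(String[] args) {
        // One shared field, indexed with the toy French analyzer.
        Set<String> field = new HashSet<>();
        field.add(TOY_FR.apply("Tables")); // stores "tabl"

        System.out.println(matches(field, "tables", TOY_FR)); // true: "tabl" is in the field
        System.out.println(matches(field, "tables", TOY_EN)); // false: "table" is not
    }
}
```

With a separate field per language, the query side can always pick the analyzer that was used at indexing time for that field.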

I agree the code is a bit verbose; if you have suggestions for improvements, they're welcome.

_________________
Sanne
http://in.relation.to/

