Hello!
I have tried to set up a multi-language application using Hibernate Search. I have run into some trouble and would be very thankful if you could have a look at it and give me some help!
The Arrangement class
Code:
@Indexed
public class Arrangement {
    private Set<Text> summarytexts = new HashSet<Text>();
    private Set<Tag> tags = new HashSet<Tag>();

    public Arrangement() {}

    @ManyToMany(targetEntity=Tag.class, cascade={CascadeType.PERSIST, CascadeType.MERGE}, fetch=FetchType.LAZY)
    @JoinTable(name="ARRANGEMENT_TAG", joinColumns={@JoinColumn(name="arrangement_id")}, inverseJoinColumns={@JoinColumn(name="arrangementtag_id")})
    @IndexedEmbedded
    @Boost(2.5f)
    public Set<Tag> getTags() {
        return tags;
    }

    public void setTags(Set<Tag> arrangementtags) {
        this.tags = arrangementtags;
    }

    @OneToMany(cascade=CascadeType.ALL, fetch=FetchType.EAGER)
    @OrderBy("language ASC")
    @JoinTable(name="ARRANGEMENT_SUMMARY", joinColumns={@JoinColumn(name="arrangement_id")}, inverseJoinColumns={@JoinColumn(name="text_id")})
    @Boost(1.3f)
    @Field(
        name="summary",
        index=Index.TOKENIZED,
        store=Store.YES,
        bridge = @FieldBridge(impl=I18FieldBridge.class,
            params = @Parameter(name="prefix", value="summary")))
    public Set<Text> getSummarytexts() {
        return summarytexts;
    }

    public void setSummarytexts(Set<Text> summarytexts) {
        this.summarytexts = summarytexts;
    }
}
The field bridge
Code:
public class I18FieldBridge implements FieldBridge, ParameterizedBridge {
    // Field name prefix, set from the "prefix" bridge parameter.
    private String prefix;

    public void setParameterValues(Map parameters) {
        this.prefix = (String) parameters.get("prefix");
    }

    @SuppressWarnings("unchecked")
    public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
        if (value == null) {
            return;
        }
        Set<Text> texts = (Set<Text>) value;
        for (Text text : texts) {
            // Skip empty entries instead of aborting the whole loop.
            if (text == null || text.getWord() == null) {
                continue;
            }
            // One Lucene field per language, e.g. summary_SWE or summary_ENG.
            Field field = new Field(
                    prefix + "_" + text.getLanguage(),
                    text.getWord(),
                    luceneOptions.getStore(),
                    luceneOptions.getIndex(),
                    luceneOptions.getTermVector());
            Float boost = luceneOptions.getBoost();
            if (boost != null) {
                field.setBoost(boost);
            }
            document.add(field);
        }
    }
}
The Text class.
Code:
@Table(name="TEXT")
@AnalyzerDefs({
    @AnalyzerDef(name = "SWE",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = StopFilterFactory.class, params = {
                @Parameter(name = "words", value = "stopwords_swe.properties"),
                @Parameter(name = "ignoreCase", value = "true") }),
            @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
                @Parameter(name = "language", value = "Swedish") })
        }),
    @AnalyzerDef(name = "ENG",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = ISOLatin1AccentFilterFactory.class),
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = StopFilterFactory.class, params = {
                @Parameter(name = "words", value = "stopwords_eng.properties"),
                @Parameter(name = "ignoreCase", value = "true") }),
            @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
                @Parameter(name = "language", value = "English") })
        }),
    @AnalyzerDef(name = "onsearchAnalyzerSWE",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = SynonymFactory.class, params = {
                @Parameter(name = "ignoreCase", value = "true"),
                @Parameter(name = "expand", value = "true"),
                @Parameter(name = "synonyms", value = "synonyms_swe.properties") }),
            @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
                @Parameter(name = "language", value = "Swedish") })
        })
})
@AnalyzerDiscriminator(impl = LanguageDiscriminator.class)
public class Text extends Base {
    private String word;
    private Language language;

    @Enumerated(EnumType.STRING)
    public Language getLanguage() {
        return language;
    }

    public void setLanguage(Language language) {
        this.language = language;
    }

    @Column(nullable=true)
    public String getWord() {
        return word;
    }

    public void setWord(String word) {
        this.word = word;
    }
}
The LanguageDiscriminator
Code:
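public class LanguageDiscriminator implements Discriminator {
    public String getAnanyzerDefinitionName(Object value, Object entity, String field) {
        // Return null (keep the default analyzer) when no language is available.
        if (!(entity instanceof Text) || ((Text) entity).getLanguage() == null) {
            return null;
        }
        // The enum name (SWE or ENG) must match the name of an @AnalyzerDef.
        return ((Text) entity).getLanguage().name();
    }
}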
I have simplified this a bit, but as an example, an Arrangement has two Text objects in it: one with an English summary and one with a Swedish summary.
The field bridge makes sure that each Text object is indexed as summary_SWE or summary_ENG, and when I search, I specify which summary field should be searched depending on the language of the search word.
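To make the naming convention concrete, here is a minimal sketch of how the field name and the search analyzer name are chosen on the search side. SummaryFieldNames and its methods are made-up helper names for illustration, not part of Hibernate Search, and the fallback for languages without a dedicated search analyzer is only an assumption.

```java
// Sketch only: SummaryFieldNames is a hypothetical helper, not a Hibernate Search class.
final class SummaryFieldNames {

    // Assumed to mirror the Language enum used by Text (SWE, ENG).
    enum Language { SWE, ENG }

    private SummaryFieldNames() {}

    // Same naming rule as I18FieldBridge: prefix + "_" + language.
    static String fieldName(String prefix, Language language) {
        return prefix + "_" + language.name();
    }

    // Name of the analyzer definition to use at query time.
    // Only "onsearchAnalyzerSWE" is defined on Text; for any other language this
    // falls back to the index-time definition name (an assumption).
    static String searchAnalyzerName(Language language) {
        return language == Language.SWE ? "onsearchAnalyzerSWE" : language.name();
    }
}
```

A Swedish search word would then be run against the field summary_SWE, with its query terms produced by onsearchAnalyzerSWE.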
The language discriminator is used to apply different analyzers to the different Text objects, depending on the value of their language property.
So far so good. I would like to know whether I am using the correct approach here, but my actual problem is this:
It seems that the SnowballPorterFilterFactory is not applied at indexing time; it is only applied when I search and use "onsearchAnalyzerSWE". The other filters in the SWE and ENG analyzer definitions seem to be applied, but not the Snowball filter.
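To illustrate why this matters, here is a toy sketch of what happens when terms are stemmed at query time but not at indexing time. It does not use Lucene at all: toyStem is a made-up suffix stripper standing in for the Snowball filter (it is not the Snowball algorithm), and AnalyzerMismatchDemo is a hypothetical class name.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of index-time vs. query-time analysis; not Lucene code.
final class AnalyzerMismatchDemo {

    private AnalyzerMismatchDemo() {}

    // Made-up stand-in for a stemmer: strips a trailing "ar" or "er".
    static String toyStem(String token) {
        if (token.length() > 4 && (token.endsWith("ar") || token.endsWith("er"))) {
            return token.substring(0, token.length() - 2);
        }
        return token;
    }

    // Lowercases, splits on whitespace and optionally stems each token.
    static Set<String> analyze(String text, boolean stem) {
        Set<String> terms = new LinkedHashSet<String>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            terms.add(stem ? toyStem(token) : token);
        }
        return terms;
    }

    // True if at least one analyzed query term equals an analyzed indexed term.
    static boolean matches(String indexedText, boolean stemAtIndex,
                           String query, boolean stemAtQuery) {
        Set<String> indexed = analyze(indexedText, stemAtIndex);
        for (String term : analyze(query, stemAtQuery)) {
            if (indexed.contains(term)) {
                return true;
            }
        }
        return false;
    }
}
```

In this toy model, matches("Konserter i Stockholm", false, "konserter", true) is false because the query term is reduced to "konsert" while the index still holds "konserter"; with stemming on both sides the same call matches.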
Can anyone explain why, or tell me what I should check?
Thankful for any help, I have been trying for quite a long time.