Hi,
I have a Tag class, which defines tags, e.g.
* Infection
* Infection of the hand
* Infection of the ear
This is a multilingual class and I load the translations from an external file. I use a class bridge to store the translated keys in the index, one field per language:
tag_en
tag_pt
tag_nl
etc
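As a rough sketch of the naming scheme only (plain Java, no Hibernate Search involved; the class name FieldNames and the helper localizedField are made up for illustration), each field name is the base name plus the language code of the locale:

```java
import java.util.Locale;

// Hypothetical helper: only illustrates how the per-language field names are built.
final class FieldNames {

    // Appends the ISO language code of the locale to the base field name.
    static String localizedField(String baseField, Locale locale) {
        return baseField + "_" + locale.getLanguage();
    }

    public static void main(String[] args) {
        System.out.println(localizedField("tag", Locale.ENGLISH));
        System.out.println(localizedField("tag", Locale.forLanguageTag("pt")));
        System.out.println(localizedField("tag", Locale.forLanguageTag("nl")));
    }
}
```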
In some cases I want to search "fuzzily"; in others I only want exact (keyword) matches.
That is why I define two analyzers and two class bridges:
Code:
@AnalyzerDefs({
    @AnalyzerDef(name = "mlAnalyzer",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = StandardFilterFactory.class),
            @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
            @TokenFilterDef(factory = LowerCaseFilterFactory.class)
        }),
    @AnalyzerDef(name = "keywordAnalyzer",
        tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class))
})
@ClassBridges({
    @ClassBridge(
        impl = MultiLingualClassBridge.class,
        analyzer = @Analyzer(definition = "mlAnalyzer"),
        params = @Parameter(name = "field", value = "tag")
    ),
    // Index the tag name without tokenizing, for exact matches
    @ClassBridge(
        impl = MultiLingualClassBridge.class,
        analyzer = @Analyzer(definition = "keywordAnalyzer"),
        params = @Parameter(name = "field", value = "tag_kw")
    )
})
@Indexed
public class Tag implements Serializable {
Code:
@Configurable
public class MultiLingualClassBridge implements FieldBridge {
    @Override
    public void set(
            String name,
            Object value,
            Document document,
            LuceneOptions luceneOptions) {
        ....
        luceneOptions.addFieldToDocument(fld + "_" + locale.getLanguage(), translatedValue, document);
    }
However, I'm seeing that the "keyword version" (the tag_kw fields) is also indexed in tokenized form instead of as a single keyword. It seems that when multiple class bridges are used, the first analyzer is always applied? I could be wrong.
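To make the expectation concrete, here is a minimal plain-Java sketch (no Lucene on the classpath) of what I expect each kind of field to contain. The class AnalyzerSketch and both methods are hypothetical stand-ins: mlTokens only approximates the StandardTokenizer + ASCII folding + lowercase chain, and the field names in the printout assume the bridge appends the language code to the configured "field" parameter.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: approximates the two analyzers without using Lucene.
final class AnalyzerSketch {

    // Rough stand-in for "mlAnalyzer": strip accents, lowercase, split on non-alphanumerics.
    static List<String> mlTokens(String text) {
        String folded = Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
        List<String> tokens = new ArrayList<>();
        for (String part : folded.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Stand-in for "keywordAnalyzer": the whole input is one unmodified token.
    static List<String> keywordTokens(String text) {
        return Collections.singletonList(text);
    }

    public static void main(String[] args) {
        String tag = "Infection of the hand";
        System.out.println("tag_en    -> " + mlTokens(tag));      // several lowercase tokens
        System.out.println("tag_kw_en -> " + keywordTokens(tag)); // one token, unchanged
    }
}
```

What I actually observe is that the tag_kw fields look like the first case (several tokens) rather than the second.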
Kind regards,
Marc