Hi,
I am new to Hibernate Search and I have a pretty stupid question :-) I have this entity:
Code:
@Indexed
@AnalyzerDef(name = "customAnalyzer",
    tokenizer = @TokenizerDef(factory = HTMLStripStandardTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = ISOLatin1AccentFilterFactory.class),
        @TokenFilterDef(factory = LowerCaseFilterFactory.class)
    })
public class ArticleDetails extends Article {
    private static final long serialVersionUID = 1L;

    @Field(index = Index.TOKENIZED, boost = @Boost(5f))
    @Analyzer(definition = "customAnalyzer")
    private String heading;

    @Field(index = Index.TOKENIZED)
    @Analyzer(definition = "customAnalyzer")
    private String text;

    private String articleUntransformed;
    private Set<Attachment> attachments;
.
.
.
What I want to do is: remove diacritics (my articles are in Czech, so we have letters like ěščřžýáíéůú), then strip the HTML tags, because the articles are stored as HTML (<p>I am article</p>), and finally convert everything to lowercase.
But when I look at the index with Luke, the top-ranking terms include words like h2, p, div, využívá (Czech for "uses") etc., and I think that with this mapping they should not be there.
Another problem: when I search for the word "sifra" (in Czech, "šifra" means cipher, so the only difference is the diacritics), the articles that use the word "šifra" are not found.
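To make the expectation concrete, here is a minimal plain-Java sketch of the terms I would expect to end up in the index. It does not use the Lucene/Solr analyzer chain at all: the class ExpectedTerms and its helper methods are hypothetical names made up for this illustration, the regex-based tag stripping is a crude stand-in for a real HTML stripper, and the sample article text is invented.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only: shows the normalization I expect, not how Lucene does it.
public class ExpectedTerms {

    // Crude tag removal (assumption: good enough for simple markup like <p>...</p>).
    static String stripHtml(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Decompose accented letters (NFD), then drop the combining marks.
    static String removeDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Strip tags, remove diacritics, lowercase, and split on non-alphanumerics.
    static List<String> expectedTerms(String html) {
        String text = removeDiacritics(stripHtml(html)).toLowerCase(Locale.ROOT);
        List<String> terms = new ArrayList<>();
        for (String term : text.split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Made-up sample article fragment.
        System.out.println(expectedTerms("<h2>Šifra</h2><p>Článek využívá šifru</p>"));
        // prints [sifra, clanek, vyuziva, sifru]
    }
}
```

So for this sample I would expect neither h2 nor p nor any accented form such as využívá among the terms, and a search for "sifra" should match the heading.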
This is the query code.
Code:
public List<ArticleDetails> find(String s) {
    FullTextSession fullTextSession =
            Search.getFullTextSession(factory.getCurrentSession());
    Transaction tx = fullTextSession.beginTransaction();

    // create native Lucene query
    String[] fields = new String[]{"heading", "text"};
    MultiFieldQueryParser parser = new MultiFieldQueryParser(fields,
            fullTextSession.getSearchFactory().getAnalyzer("customAnalyzer"));
    org.apache.lucene.search.Query query;
    try {
        query = parser.parse(s);
    } catch (ParseException ex) {
        Logger.getLogger(ArticleDetailsHibernateDAO.class.getName()).log(Level.SEVERE, null, ex);
        throw new IllegalArgumentException(ex);
    }

    // wrap Lucene query in a org.hibernate.Query
    org.hibernate.Query hibQuery =
            fullTextSession.createFullTextQuery(query, ArticleDetails.class);
    List result = hibQuery.list();
    return result;
}
Thanks.
Pavel