I ran several tests.
I added the Lucene class files and the org.apache.lucene.search.highlight contrib package to my project.
Then I added the following to a classic HSearch search:
Code:
try {
    IndexSearcher searcher = new IndexSearcher("/home/fmn/templucene/dao.modele.Entite");
    Hits hits = searcher.search(luceneQuery);
    // Wrap each matched term in <b>...</b>; QueryScorer scores fragments against the query terms
    Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter("<b>", "</b>"), new QueryScorer(luceneQuery));
    StandardAnalyzer analyzer = new StandardAnalyzer(); // created once, reused for every document
    String fieldName = "documents.pleinTexte";
    for (int i = 0; i < hits.length(); i++) { // for each document
        String fieldText = hits.doc(i).get(fieldName);
        if (fieldText == null) {
            continue; // field not stored for this document, nothing to highlight
        }
        TokenStream tokenStream = analyzer.tokenStream(fieldName, new StringReader(fieldText));
        // Get the 3 best fragments; each one is printed between "..." below
        String[] results = highlighter.getBestFragments(tokenStream, fieldText, 3);
        System.out.println("Result fragments :\n");
        for (String result : results) {
            System.out.println("... " + result + " ...");
        }
    }
    searcher.close(); // release the index files once all hits have been read
} catch (IOException e) {
    // TODO
    e.printStackTrace();
}
The result looks like this, for example:
Code:
... grande première formation Turquie première proposition <b>et</b> première victoire en grand prix le départ <b>et</b> mouvementée à la tête a été pour physique <b>et</b> la ...
... victime d' une crevaison sain <b>et</b> puis aussi mais dans le tour suivant ...
Here I want at most 3 matching fragments per document, with the matched word in bold. It works very well, but I think I am running the search a second time when I call
Code:
Hits hits = searcher.search(luceneQuery);
equivalent to
Code:
luceneQuery = parser.parse(requeteLucene);
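To make the expected behaviour concrete (at most 3 fragments, matched word in bold), here is a minimal plain-Java sketch that uses no Lucene at all. It is only an illustration of the idea, not how Highlighter works internally: it cuts the text into fixed windows of words, counts the matches in each window, and returns the windows with the most matches. The class FragmentSketch, the method bestFragments and the wordsPerFragment parameter are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Lucene or HSearch).
class FragmentSketch {

    // One window of text together with the number of matches it contains.
    private static final class Candidate {
        final String text;
        final int score;
        Candidate(String text, int score) { this.text = text; this.score = score; }
    }

    // Splits text into windows of wordsPerFragment words, wraps every
    // case-insensitive whole-word match of term in <b>...</b>, and returns
    // up to maxFragments windows, the ones with the most matches first.
    static List<String> bestFragments(String text, String term,
                                      int maxFragments, int wordsPerFragment) {
        if (wordsPerFragment <= 0) {
            throw new IllegalArgumentException("wordsPerFragment must be positive");
        }
        String[] words = text.trim().split("\\s+");
        List<Candidate> candidates = new ArrayList<>();
        for (int start = 0; start < words.length; start += wordsPerFragment) {
            int end = Math.min(start + wordsPerFragment, words.length);
            StringBuilder sb = new StringBuilder();
            int score = 0;
            for (int i = start; i < end; i++) {
                if (i > start) {
                    sb.append(' ');
                }
                if (words[i].equalsIgnoreCase(term)) {
                    sb.append("<b>").append(words[i]).append("</b>");
                    score++;
                } else {
                    sb.append(words[i]);
                }
            }
            if (score > 0) { // keep only windows that contain a match
                candidates.add(new Candidate(sb.toString(), score));
            }
        }
        // Stable sort: windows with equal scores stay in document order.
        candidates.sort((a, b) -> Integer.compare(b.score, a.score));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < candidates.size() && i < maxFragments; i++) {
            result.add(candidates.get(i).text);
        }
        return result;
    }
}
```

For instance, FragmentSketch.bestFragments("et a b c et d et e", "et", 1, 4) returns the single fragment "<b>et</b> d <b>et</b> e", because the second window holds two matches and the first only one. Note that the sketch only needs the text and the term, not an index or a Hits object.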