Hi,
The first step is the search and highlighting itself. There you might be interested in this
thread. Make sure to have a look at the Lucene Highlighter API to see how this works.
The bigger question is the PDF document. Do you really want to highlight in the PDF itself? Or are you indexing the PDF content and want to search and highlight that content?
Either way, neither Lucene nor Hibernate Search can index PDF as is. You need to extract the actual text first (e.g. via Apache Tika). Then you can index and search on this extracted text. The offsets (needed for highlighting) will be relative to this extracted text, not to the PDF file. I am not sure whether you could easily use them as offsets into the PDF itself (btw, I am not even aware of a library which can manipulate PDF).
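To make the point about offsets concrete, here is a minimal plain-Java sketch. It uses neither Lucene nor Tika: the class OffsetHighlighter and its methods are made-up names for illustration, and the naive substring match only stands in for what a real analyzer and highlighter would do. It just shows that the start/end positions you get back index into the extracted string, which is why they cannot be applied to the PDF bytes directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene, Tika or Hibernate Search.
class OffsetHighlighter {

    // Start (inclusive) and end (exclusive) offsets into the extracted text.
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Finds all non-overlapping, case-insensitive occurrences of term in text.
    static List<Span> findOffsets(String text, String term) {
        List<Span> spans = new ArrayList<>();
        if (term.isEmpty()) {
            return spans;
        }
        int i = 0;
        while (i + term.length() <= text.length()) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                spans.add(new Span(i, i + term.length()));
                i += term.length(); // continue after the match
            } else {
                i++;
            }
        }
        return spans;
    }

    // Wraps each span in <b>...</b>; spans must be sorted and non-overlapping,
    // which is what findOffsets returns.
    static String highlight(String text, List<Span> spans) {
        StringBuilder out = new StringBuilder();
        int last = 0;
        for (Span s : spans) {
            out.append(text, last, s.start);
            out.append("<b>").append(text, s.start, s.end).append("</b>");
            last = s.end;
        }
        out.append(text, last, text.length());
        return out.toString();
    }
}
```

You would call findOffsets on the string you got out of the text extraction and then pass the result to highlight together with that same string. With Lucene the offsets would come from the highlighter rather than from a substring scan, but they would still refer to the extracted text.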
Hope this helps.
--Hardy