Quote:
For example, if we took a backup of the indexes when we last restarted the server (let's say a month ago), then we can restore the backup and run MassIndexer for only the changes since that restart date on top of the existing index...
I'm not sure which MassIndexer can do that. Certainly not the one I implemented, which is now in Hibernate Search, unless you're restricting what Hibernate can see. Or are you mapping a filtered view for this purpose?
That might be an interesting experiment, but I'm not sure how to help you guarantee that the resulting index will be in sync, unless you can prevent other changes from happening concurrently. We've had some discussions on the mailing list about using changeset ids, timestamps or transaction ids, but no single solution is safe for general-purpose use. I agree you might be able to build something which works fine for your specific requirements.
Back to your original question about using CheckIndex: sure, you can run it, but you'll have to restore your backup if you find non-recoverable issues, because it's not possible to tell from either the segment id or the filename which keys need to be reindexed. I suspect one possible solution would be to check, for each key in the database, whether there's a matching document in any other segment, and skip the key if there is, as there should never be a duplicate. I'm not sure you can implement this efficiently enough to be faster than reindexing everything, though, as you'll still need to iterate over at least all ids. This pre-filtering approach could be a good idea if the indexer has to produce complex Lucene Documents and/or load complex object graphs.
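To make the pre-filtering idea a bit more concrete, here is a minimal sketch in plain Java. Everything in it is hypothetical: MissingKeyFilter, findMissingKeys and the existsInIndex predicate are names I made up for illustration, not Hibernate Search or Lucene API. In a real setup I'd assume the predicate is backed by a term lookup on the id field against the surviving segments, and the key iterator by a scrollable query selecting only the ids.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Sketch only: all names here are hypothetical, not Hibernate Search or Lucene API.
class MissingKeyFilter {

    /**
     * Walks every primary key coming from the database and collects those
     * for which no matching document is found in the index.
     * Keys that are already indexed are skipped, since there should never
     * be a duplicate document for the same key.
     */
    static <K> List<K> findMissingKeys(Iterator<K> allDatabaseKeys, Predicate<K> existsInIndex) {
        List<K> missing = new ArrayList<>();
        while (allDatabaseKeys.hasNext()) {
            K key = allDatabaseKeys.next();
            if (!existsInIndex.test(key)) {
                missing.add(key); // only these need the expensive load + Document build
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-ins for "all ids in the database" and "ids still present in healthy segments".
        List<Long> databaseKeys = List.of(1L, 2L, 3L, 4L, 5L);
        Set<Long> keysStillIndexed = Set.of(1L, 2L, 5L);

        List<Long> toReindex = findMissingKeys(databaseKeys.iterator(), keysStillIndexed::contains);
        System.out.println(toReindex); // prints [3, 4]
    }
}
```

Note that the loop still touches every id, so the gain only comes from skipping the entity loading and Document building for keys that are already indexed.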
HTH