skalyana1006 wrote:
Can we have one IndexWriter per partition (if we do shards)?
What I am trying to do is get as much parallelism as possible in indexing.
I am not sure sharding is the solution to your problem. If you want to increase indexing speed, you might want to have a look at the hibernate.search.worker.execution option and set it to async. Indexing then happens in a separate thread. However, this option is normally intended for the case where indexing and searching happen on the same machine and you want to ensure fast response times for searches. In a master/slave configuration the situation is slightly different.
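For illustration, here is a minimal sketch of how that setting could be assembled as plain properties before being handed to Hibernate. Only hibernate.search.worker.execution and its async value come from the advice above. The thread pool property and its value are assumptions based on the Hibernate Search 3.x configuration options, so check them against the version you run. The class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper: collects the worker settings discussed above.
public class AsyncIndexingConfig {

    public static Properties asyncWorkerProperties() {
        Properties props = new Properties();
        // From the post: run index updates in a separate thread.
        props.setProperty("hibernate.search.worker.execution", "async");
        // Assumption: size of the thread pool used for async execution.
        // The value 2 is only a placeholder, not a recommendation.
        props.setProperty("hibernate.search.worker.thread_pool.size", "2");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings so they can be copied into a properties file.
        asyncWorkerProperties().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The same two keys can equally go into hibernate.properties or persistence.xml; the code form is just easier to show in one piece.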
The same caveat applies to sharding. In fact, searches may become slower with sharding, since the index will consist of more files.
If you are not trying to solve a particular performance problem, I would recommend starting off with a simple master/slave configuration and seeing how it performs. Take a benchmark and use it as a baseline for configuration changes.
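A baseline can be as simple as timing one full indexing run before touching any configuration. The sketch below is only a rough wall-clock measurement, not a proper benchmark harness; the class name, the timeMillis helper, and the placeholder workload are all hypothetical and stand in for whatever code triggers your indexing.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for taking a rough baseline measurement.
public class IndexingBaseline {

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    public static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder workload: replace with the code that indexes your entities.
        Runnable indexingRun = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        long elapsed = timeMillis(indexingRun);
        System.out.println("Indexing run took " + elapsed + " ms");
    }
}
```

Record the number for the plain master/slave setup first, then repeat the same run after each configuration change so the results stay comparable.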
Another thought: even if you have many entities, you will probably index all the data only once (or maybe at regular intervals). The rest of the time you will only have incremental index updates.
Quote:
Also, if we are talking about 5-10 entity types to be indexed (with 1 million instances each), the master had better be a beefy box with lots of RAM and processing power. Correct me if I am wrong.
Sure, the bigger the better. RAM being more important though.
--Hardy