Hi Sanne, again, thanks for your prompt reply.
Maybe some more background info would be useful.
By multi-threaded search, I mean that I launch around 8 threads which will execute around 30k searches in total.
As I have an abstraction around the underlying search engine, I will execute the following code 30k times:
Code:
FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(entityManager);
FullTextQuery persistenceQuery = fullTextEntityManager.createFullTextQuery(query);
List results = persistenceQuery.getResultList();
// total hit count, independent of any pagination
int totalCount = persistenceQuery.getResultSize();
return results;
I'm assuming this does not open/close the underlying IndexReader on every call, but I might be wrong.
To your points:
If you see this often, I think these could be the causes:
- a very very large index is being opened and you're optimizing too often, consider disabling optimization altogether
My index consists of 43 entity types, but only 3 or 4 are searched on. The largest one has around 100k entities. The total disk size is around 400 MB, so I don't think I qualify as very, very large, but most of that 400 MB is in the entity I'm querying.
I don't do anything in particular with regard to optimization, but just in case, how can I disable it?
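Just to be explicit about my setup: if I read the documentation correctly, automatic optimization only kicks in when the optimizer limits are configured, and I have not set either of these anywhere (shown here with example values, for the default index):
Code:
hibernate.search.default.optimizer.operation_limit.max = 1000
hibernate.search.default.optimizer.transaction_limit.max = 100
Is there anything else that could cause an optimization to run implicitly?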
- your index is extremely small - e.g. you have just a couple of documents to run your test; this could make all search operations very fast and highlight this contention point
See above.
- you're not only searching but also writing a lot to the index: every time a write is performed the IndexReader cache is invalidated and the instance needs to hit the disk again to refresh very often, preventing other threads from acquiring the lock. Did you try the NRT IndexManager?
I do write to the index, but much less frequently than I search. Furthermore, the writes target a different entity from the ones I search on, so I assume they should not trigger a close of the readers for the other indexes.
I will try to put together a simplified sample; I have not tried NRT yet.
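For the NRT test, my plan is to simply switch the index manager for the default index (assuming the near-real-time index manager is the right one to use here):
Code:
hibernate.search.default.indexmanager = near-real-time
If there is a better-suited setup for a read-heavy index with occasional writes, I'm happy to try that instead.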
On another note, I have a question about the async operation mode. Part of my index is geo data (think a hierarchy of country/state/county/city), so I have a lot of @ContainedIn back references.
When I add a data point to a country, it triggers re-indexing of the related objects (cities, states, counties), which is expected; for a country with a large number of cities this can take a while.
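To illustrate, here is a simplified, hypothetical version of the mapping (not my real classes), showing where the back reference sits:
Code:
import java.util.ArrayList;
import java.util.List;
import javax.persistence.*;
import org.hibernate.search.annotations.ContainedIn;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.IndexedEmbedded;

@Entity
@Indexed
class City {
    @Id @GeneratedValue
    Long id;

    @Field
    String name;

    // The city's document embeds the country's fields...
    @ManyToOne
    @IndexedEmbedded
    Country country;
}

@Entity
@Indexed
class Country {
    @Id @GeneratedValue
    Long id;

    @Field
    String name;

    // ...so a change to the country marks every contained city as dirty
    // and schedules all of them for re-indexing.
    @OneToMany(mappedBy = "country")
    @ContainedIn
    List<City> cities = new ArrayList<City>();
}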
I switched on the async operation mode, hoping not to block the UI in the process:
Code:
hibernate.search.default.worker.execution = async
However, it seems that only the execution of the work is asynchronous, not the prepare phase. See the profiler stack below:
Code:
org.hibernate.search.backend.impl.PostTransactionWorkQueueSynchronization.beforeCompletion()
org.hibernate.search.backend.impl.BatchedQueueingProcessor.prepareWorks()
org.hibernate.search.backend.impl.WorkQueue.prepareWorkPlan()
org.hibernate.search.engine.impl.WorkPlan.getPlannedLuceneWork()
org.hibernate.search.engine.impl.WorkPlan$PerClassWork.enqueueLuceneWork()
org.hibernate.search.engine.impl.WorkPlan$PerEntityWork.enqueueLuceneWork()
org.hibernate.search.engine.spi.DocumentBuilderIndexedEntity.addWorkToQueue()
org.hibernate.search.engine.spi.DocumentBuilderIndexedEntity.createUpdateWork()
org.hibernate.search.engine.spi.DocumentBuilderIndexedEntity.getDocument()
org.hibernate.search.engine.spi.DocumentBuilderIndexedEntity.buildDocumentFields() <== 145s
org.hibernate.collection.internal.PersistentList.iterator() <== 92s
org.hibernate.collection.internal.PersistentBag.iterator() <== 36s
org.hibernate.search.engine.spi.DocumentBuilderIndexedEntity.buildDocumentFields() <== 16s
The prepare phase takes around 145s and blocks the UI (updating a data point for the country USA, which contains 15k cities...), loading all the lazy collections in the process.
I assume the prepare phase cannot be done in the background?