
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 8 posts ] 
 Post subject: Indexing arbitrary documents in Hibernate Search
Posted: Thu Mar 17, 2011 4:25 pm
Newbie

Joined: Thu Mar 17, 2011 4:19 pm
Posts: 4
Hi there.

I'm looking for a way to add arbitrary documents to the index of my Hibernate Search application.
By "arbitrary", I mean documents that are not entities.
I understand that I'll have to handle document updates and removal manually, and that's OK.

I looked for such facilities in the SearchFactory, but everything there is tied to an entity.

Any idea how to do such a thing? I really don't want to set up a separate native Lucene index manually.

Thanks a lot.


 Post subject: Re: Indexing arbitrary documents in Hibernate Search
Posted: Fri Mar 18, 2011 5:46 am
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
Hi,

I am just wondering why you want to do that. Wouldn't a separate Lucene index be better, both for resource management and for application design? Why add unrelated documents to the index managed by Hibernate Search?

Not only would you have to manage the document updates manually, you would also have to deal with concurrent access to the index. Maybe you could describe your use case a little more.

That said, you can always gain access to the underlying Lucene resources via the SearchFactory. There you can get hold of the underlying Lucene directory, but I would not recommend going down this path.

--Hardy


 Post subject: Re: Indexing arbitrary documents in Hibernate Search
Posted: Fri Mar 18, 2011 6:07 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
The Hibernate Search in Action book has some examples of indexing Word and PDF documents, but as Hardy said, it mostly makes sense when they are structured as "binary attachments" of entities, so that they are not totally unrelated to the entities.
Ideally I'd map them as the content of a lazily loaded Blob field of an entity, as that doesn't necessarily mean they are stored in the database. Using a custom persister, you can have the entity represent the metadata (going into both index and database) and have the original document stored elsewhere (the file system?). Search then handles indexing metadata and content consistently, and the result sets give you easy-to-use pointers to both.
There's no secret in the book, just some examples: you still have to use a custom FieldBridge to extract the text from the binary/proprietary formats, and it shows how to plug in converters such as Tika or POI.
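
To make the bridge idea a bit more concrete, here is a minimal, library-free sketch. The names TextExtractor, PlainTextExtractor and AttachmentBridge are hypothetical and invented for illustration; this is not the Hibernate Search FieldBridge API, and a real bridge would delegate to a converter such as Tika or POI instead of decoding plain text.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a converter such as Tika or POI:
// turns the raw bytes of an attachment into indexable text.
interface TextExtractor {
    String extract(byte[] content);
}

// Trivial extractor for illustration only: treats the bytes as UTF-8 text.
class PlainTextExtractor implements TextExtractor {
    @Override
    public String extract(byte[] content) {
        return new String(content, StandardCharsets.UTF_8);
    }
}

// Hypothetical bridge: maps one binary property to one text field.
// The "document" is simply a map of field name to text.
class AttachmentBridge {
    private final TextExtractor extractor;

    AttachmentBridge(TextExtractor extractor) {
        this.extractor = extractor;
    }

    void set(String fieldName, byte[] value, Map<String, String> document) {
        if (value == null) {
            return; // nothing to index for a missing attachment
        }
        document.put(fieldName, extractor.extract(value));
    }
}

class AttachmentBridgeDemo {
    public static void main(String[] args) {
        Map<String, String> document = new LinkedHashMap<>();
        AttachmentBridge bridge = new AttachmentBridge(new PlainTextExtractor());
        bridge.set("content", "some rule text".getBytes(StandardCharsets.UTF_8), document);
        System.out.println(document);
    }
}
```

Keeping the extractor behind its own small interface means the format-specific converter can be swapped without touching the code that fills the document.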

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Indexing arbitrary documents in Hibernate Search
Posted: Sat Mar 19, 2011 4:00 am
Newbie

Joined: Thu Mar 17, 2011 4:19 pm
Posts: 4
You're right, I need to give some context.

I'm writing a game engine in Java which can be managed dynamically: you define all the gaming stuff while the application is running.
I used Groovy heavily for storing game rules, and JPA for storing types and objects.

So my application stores two kinds of POJOs:
- classical entities, persisted in the database through JPA: "types and objects"
- "actions": POJOs linked to a Groovy script, each representing a game rule

Thus I have some classical DAOs for types and objects, and other DAOs that persist actions on the file system as Groovy scripts.

But all that stuff is inner mechanics; from the user's point of view, he manipulates types, objects and rules.
That is why I wanted a single query to return both classical entities and "arbitrary documents": my action POJOs.

Unfortunately, as I said, the SearchFactory does not allow accessing or creating a DirectoryProvider unrelated to a persisted entity class. Unless I missed a method?


 Post subject: Re: Indexing arbitrary documents in Hibernate Search
Posted: Sat Mar 19, 2011 2:20 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Thanks for the context, though it's not clear to me what you need. I'm assuming you want to index annotated POJOs which are not entities.

Consider that indexed entities don't necessarily need to be JPA entities: as long as they are POJOs with an @Indexed annotation, Search will be able to provide you with an indexing service for them (transformations to and from Lucene documents, and DirectoryProvider/Reader management). Any object needs an identifier, though. When it's a JPA/Hibernate entity we figure it out easily via the @Id mapped property, but you can use a plain @DocumentId and avoid the @Id, or you can play with @ProvidedId.
The downside of not using managed entities is that Hibernate won't automatically create events to update your object; you'll have to fire events yourself.
Use programmatic configuration to add more POJOs to be indexed (autodiscovery is not going to work either, as it won't do annotation scanning for @Indexed objects).
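
As a rough illustration of what "firing events yourself" can look like on the application side, here is a small sketch of a DAO that notifies an indexing layer on every save and removal. IndexListener and NotifyingDao are hypothetical names, not Hibernate Search types, and an in-memory map stands in for the file system storage.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback that an indexing layer would implement.
interface IndexListener<T> {
    void onSaved(Integer id, T object); // the document must be added or updated
    void onRemoved(Integer id);         // the document must be deleted
}

// Hypothetical DAO for objects that are not JPA entities.
// Nothing fires events for them automatically, so the DAO does it explicitly.
class NotifyingDao<T> {
    private final Map<Integer, T> storage = new HashMap<>();
    private final List<IndexListener<T>> listeners = new ArrayList<>();

    void addListener(IndexListener<T> listener) {
        listeners.add(listener);
    }

    void save(Integer id, T object) {
        storage.put(id, object);
        for (IndexListener<T> listener : listeners) {
            listener.onSaved(id, object);
        }
    }

    void remove(Integer id) {
        // Only notify when something was actually removed.
        if (storage.remove(id) != null) {
            for (IndexListener<T> listener : listeners) {
                listener.onRemoved(id);
            }
        }
    }
}
```

The point of routing every write through one place is that the index cannot silently drift from the stored objects, which is the job Hibernate's own event listeners do for managed entities.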

A great source of examples is the Infinispan Query module:
https://github.com/infinispan/infinispa ... ster/query
This project sets up its own event listeners and indexes objects put in the Infinispan cache (a key/value store). Have a look at the examples: it's still a new, small and simple project, just a bunch of classes, so I hope it gives you an idea without getting lost in it.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Indexing arbitrary documents in Hibernate Search
Posted: Tue Mar 22, 2011 2:41 pm
Newbie

Joined: Thu Mar 17, 2011 4:19 pm
Posts: 4
Thanks Sanne.

Indeed, Infinispan Query seems pretty hard to understand at first sight, but I'll have a deeper look as soon as possible.

I tried to get my annotated POJO indexed by Hibernate Search (I run Hibernate with Spring):
applicationContext.xml
Code:
<bean id="entityManagerFactory"
      class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
   <property name="dataSource" ref="dataSource" />
   <property name="persistenceUnitPostProcessors" ref="persistenceUnitPostProcessor"/>
   ...
</bean>
<bean id="persistenceUnitPostProcessor" class="org.mythicforge.chronos.dao.jpa.ConfigurationPostProcessor"/>


ConfigurationPostProcessor.java
Code:
public class ConfigurationPostProcessor implements PersistenceUnitPostProcessor {

   @Override
   public void postProcessPersistenceUnitInfo(MutablePersistenceUnitInfo pui) {
      SearchMapping mapping = new SearchMapping();
      mapping.entity(Action.class).indexed();
      pui.getProperties().put("hibernate.search.model_mapping", mapping);
   } // postProcessPersistenceUnitInfo().
}


Action.java
Code:
@Indexed
public class Action  implements Serializable {
   @DocumentId
   public Integer getId() {
   ...


But when I run the following code:
Code:
      
FullTextEntityManager searcher = Search.getFullTextEntityManager(entityManager);
searcher.purgeAll(Action.class);


I got an error indicating that Action.class is not recognized:
Code:
java.lang.IllegalArgumentException: org.mythicforge.model.rule.Action is not an indexed entity or a subclass of an indexed entity
   at org.hibernate.search.impl.FullTextSessionImpl.purge(FullTextSessionImpl.java:148)
   at org.hibernate.search.impl.FullTextSessionImpl.purgeAll(FullTextSessionImpl.java:125)
   at org.hibernate.search.jpa.impl.FullTextEntityManagerImpl.purgeAll(FullTextEntityManagerImpl.java:117)
   at org.mythicforge.chronos.dao.jpa.SearchDao.resetIndexes(SearchDao.java:120)


 Post subject: Re: Indexing arbitrary documents in Hibernate Search
Posted: Wed Mar 30, 2011 3:43 am
Newbie

Joined: Thu Mar 17, 2011 4:19 pm
Posts: 4
Hi! As explained, it really seems that Hibernate Search is unable to index POJOs that are only annotated with @Indexed and are neither @Entity nor @MappedSuperclass.

Am I wrong?

Damien.


 Post subject: Re: Indexing arbitrary documents in Hibernate Search
Posted: Wed Mar 30, 2011 6:15 am
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
No, it should work pretty well; all tests on Infinispan are green :)

I don't know much about Spring, but it seems you're registering your POJOs as entities. Don't mess with the persistence context, and just start the SearchFactory yourself:
Quote:
SearchConfiguration config = new SearchableCacheConfiguration(new Class[0], indexingProperties);
searchFactory = new SearchFactoryBuilder().configuration(config).buildSearchFactory();

But then it's up to you to send updates to the engine in the form of "update", "delete" and "add" index operations, and to run searches using the createHSQuery() method.
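
For readers unsure what the three operations mean in practice, here is a toy model of them. ToyIndex is a hypothetical class written for this illustration and has nothing to do with the actual Hibernate Search engine or its work queue; it only assumes Lucene-like semantics, where "add" appends without checking for an existing document and "update" behaves as delete-by-id followed by add.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical toy index keyed by a document identifier.
class ToyIndex {
    enum Operation { ADD, UPDATE, DELETE }

    private static final class Entry {
        final Integer id;
        final String text;

        Entry(Integer id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void apply(Operation op, Integer id, String text) {
        // UPDATE and DELETE first drop every entry with this id;
        // ADD does not, so adding the same id twice leaves a duplicate.
        if (op != Operation.ADD) {
            entries.removeIf(e -> e.id.equals(id));
        }
        // ADD and UPDATE then append the new entry.
        if (op != Operation.DELETE) {
            entries.add(new Entry(id, text));
        }
    }

    // Returns the id of each entry containing the term as a whole word,
    // case-insensitively, in insertion order.
    List<Integer> search(String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        List<Integer> hits = new ArrayList<>();
        for (Entry e : entries) {
            String[] words = e.text.toLowerCase(Locale.ROOT).split("\\W+");
            if (Arrays.asList(words).contains(wanted)) {
                hits.add(e.id);
            }
        }
        return hits;
    }
}
```

The duplicate left behind by a repeated "add" is the reason a changed object should be sent as an "update" rather than as a second "add".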

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.