All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 11 posts ] 
Author Message
 Post subject: Search on multiple indexes
PostPosted: Fri May 13, 2011 1:08 pm 
Newbie

Joined: Fri May 13, 2011 1:01 pm
Posts: 16
Hi everyone,
I have to implement a full-text search over different sets of tables (e.g. search "acme" in Orders, Invoices and Accounts); each set of tables, which I'll call a search domain, may have no relationships with the others.
I've implemented an object model for every domain and created a separate index for each one. Using BooleanJunction, I can build a single query that spans multiple indexes. Here's a code snippet with two fixed domains:

Code:
// Domain1 and Domain2 are the root classes of the domains' object models
QueryBuilder
domain1QueryBuilder = searchFactory.buildQueryBuilder().forEntity(Domain1.class).get(),
domain2QueryBuilder = searchFactory.buildQueryBuilder().forEntity(Domain2.class).get();

...

org.apache.lucene.search.Query luceneQuery = domain1QueryBuilder.exact().onFields(domain1SearchField).matching(searchString).createQuery();
BooleanJunction<BooleanJunction> j = domain1QueryBuilder.bool().should(luceneQuery);

luceneQuery = domain2QueryBuilder.exact().onFields(domain2SearchField).matching(searchString).createQuery();
j.should(luceneQuery);

org.hibernate.search.FullTextQuery hibQuery = fullTextSession.createFullTextQuery(j.createQuery());

// Hits from both indexes come back in one list, so dispatch on the runtime type.
List<?> result = hibQuery.list();

for (Object o : result) {
    if (o instanceof Domain1) {
        // ... handle a Domain1 hit
    } else if (o instanceof Domain2) {
        // ... handle a Domain2 hit
    }
}


First question: I can see DB round trips in the log, even though every class attribute is stored in the index. Is this normal? Can I eliminate these DB round trips without switching to projections?

I tried projections, but the classes have different attributes with different names (in their indexes), so I defined a projection with all the property names:
Code:
hibQuery.setProjection( "domain1_property1", "domain1_property2", "domain2_property1", "domain3_property1", "domain3_property2");


but I get null values in the output. I only get non-null values for a property if I use the same name in the different indexes. Is this normal? Or is there a way to use projections across multiple properties and multiple indexes?

Last but not least: do you think this is a good approach for a multiple indexes search?

Thank you very much in advance,
Andrea


 Post subject: Re: Search on multiple indexes
PostPosted: Sun May 15, 2011 10:44 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Last but not least: do you think this is a good approach for a multiple indexes search?

I don't think it's the approach I would use, but if it works for you that's fine. I'd prefer to have some type safety in the return value: for example, have all three types inherit from a common object and show the user only the common fields. That way, when using projections, I know exactly which fields are going to be valid, and I show only those. Then, if the user selects a result, I can cast it to the appropriate type and show all the details.

Quote:
Can I eliminate these DB round trips without switching to projections?

Yes: if you use projections to load only properties that you have marked as stored in the index, no database query will be executed. That's another reason to have consistent fields to load.
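
To illustrate why consistent field names matter, here is a minimal plain-Java sketch with no Hibernate Search dependency. It assumes, as reported earlier in the thread, that a projection over several indexes yields one Object[] per hit and that the slots for fields missing from that hit's index come back null. The rows are hand-built stand-ins, and `SparseProjectionRows`, `firstNonNull` and `displayValues` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch: collapse sparse projection rows into one display value per hit.
final class SparseProjectionRows {

    // Returns the first non-null slot of a row, or null if every slot is null.
    static Object firstNonNull(Object[] row) {
        for (Object value : row) {
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    // Maps each projection row to its first populated column.
    static List<Object> displayValues(List<Object[]> rows) {
        List<Object> out = new ArrayList<>();
        for (Object[] row : rows) {
            out.add(firstNonNull(row));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built stand-ins for rows projected on
        // ("domain1_property1", "domain2_property1"); not real query output.
        List<Object[]> rows = Arrays.asList(
                new Object[] {"ACME inc.", null},          // hit from the Domain1 index
                new Object[] {null, "Italian ACME ltd."}); // hit from the Domain2 index
        System.out.println(displayValues(rows));
    }
}
```

With a shared field name in every index, each row would have a single populated column and this kind of post-processing would not be needed.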

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Search on multiple indexes
PostPosted: Mon May 16, 2011 4:51 am 
Newbie

Joined: Fri May 13, 2011 1:01 pm
Posts: 16
Hi Sanne and hi everyone,
thank you for your answer.

s.grinovero wrote:
Quote:
Last but not least: do you think this is a good approach for a multiple indexes search?

I don't think it's the approach I would use, but if it works for you that's fine. I'd prefer to have some type safety in the return value: for example, have all three types inherit from a common object and show the user only the common fields. That way, when using projections, I know exactly which fields are going to be valid, and I show only those. Then, if the user selects a result, I can cast it to the appropriate type and show all the details.


In this case I can't have (useful) common fields or a common root object because
  • every search domain has a different set of related entities, so the corresponding object models are different
  • the primary keys (that I want to show in the result) can be composed of a different number of fields
  • the other non-key fields that I want to show in the result can be different

Another option I'm considering is to "normalize" all the different entities into a single SQL view:
  • a single key field would contain the concatenated key fields
  • a field would contain the data to index
  • a field would express the domain type (PurchaseInvoice, Account, ...)
  • another field (or set of fields) would handle the relationships (e.g. the invoice line records are related to the invoice header record, the supplier record is connected to the invoice header record, ...)

With this DB view I can have a single Java class with a single index (possibly sharded). The object model would be completely application-neutral, but this way I would have common fields to use with projections.
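
As a rough sketch of what one row of such a normalized view might look like on the Java side: the class below is plain Java, and `SearchEntry`, its field names and `displayLine` are hypothetical, chosen only to mirror the four kinds of columns listed above. Index mapping annotations are deliberately left out.

```java
// Sketch of one row of the hypothetical normalized view; all names are illustrative.
final class SearchEntry {
    final String key;        // concatenated primary-key fields
    final String domainType; // e.g. "PurchaseInvoice", "Account"
    final String text;       // the data to index
    final String parentKey;  // key of the related row (e.g. invoice header), or null

    SearchEntry(String key, String domainType, String text, String parentKey) {
        this.key = key;
        this.domainType = domainType;
        this.text = text;
        this.parentKey = parentKey;
    }

    // True when this row hangs off another row, such as an invoice line under its header.
    boolean hasParent() {
        return parentKey != null;
    }

    // One display line per hit, built only from the common columns.
    String displayLine() {
        return domainType + " " + key + ": " + text;
    }
}
```

Because every domain maps to the same columns, a projection on these names would be populated for every hit regardless of the originating table.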

s.grinovero wrote:
Quote:
Can I eliminate these DB round trips without switching to projections?

Yes: if you use projections to load only properties that you have marked as stored in the index, no database query will be executed. That's another reason to have consistent fields to load.


Excuse me, maybe I didn't express my question clearly (my English is not great). I would like to get Java objects as results, with every field corresponding to a stored property: in this case, does Hibernate Search perform a DB access? In other words, is there a way to eliminate DB round trips without using projections? I've read the documentation and run some tests, and the answer seems to be no, but I'm looking for confirmation.

Thank you,
Andrea


 Post subject: Re: Search on multiple indexes
PostPosted: Mon May 16, 2011 5:12 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Excuse me, maybe I didn't express my question clearly (my English is not great). I would like to get Java objects as results, with every field corresponding to a stored property: in this case, does Hibernate Search perform a DB access? In other words, is there a way to eliminate DB round trips without using projections? I've read the documentation and run some tests, and the answer seems to be no, but I'm looking for confirmation.

Ah, sorry.
Well, that's tricky. If we were to return a managed object (a Java entity) without interacting with Hibernate Core, you would expect this object to be to some extent "consistent" with the database state, i.e. transactionally safe. But the index might get out of sync, as Lucene is not a transactional resource. We work around that by applying index changes only as a transaction synchronization on the database, but there might still be a small delay (especially with async backends, which perform so much better and are therefore frequently used).
What you can do is make sure the second-level cache of Hibernate is involved: read "5.1.3.6. Customizing object initialization strategies" in the reference docs. Using the second-level cache is in all cases the best-performing option (assuming it hits the cache), since extracting values from the index also has some performance cost.

About your primary question: I understand, and you're welcome to do whatever is more practical for you. Just one observation: if you were to make a SQL view, you would likely want to define some common column names for the result set. And just to present the list to the user: if you present it all mixed in a single table, what are the column names going to be?
It looks to me like you actually want to present three different result sets to the end user; in that case you should make three different queries and show each kind of result in its most appropriate representation. This isn't much of a technical question anymore, but think about the user experience of looking at your data.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Search on multiple indexes
PostPosted: Mon May 16, 2011 5:14 am 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Almost forgot:
also read section "5.2.3. ResultTransformer"; it might be useful.
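
A hedged sketch of what a transformer might do with projection rows: Hibernate's `ResultTransformer` interface declares `transformTuple(Object[] tuple, String[] aliases)`. The method below only mirrors that parameter shape in plain Java so that it compiles without Hibernate on the classpath. It assumes the aliases carry the projected field names; `TupleToMap` is a hypothetical class name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in with the same parameter shape as ResultTransformer.transformTuple.
final class TupleToMap {

    // Turns one projection row into a name -> value map, skipping null slots,
    // so callers see only the fields that exist for that particular hit.
    static Map<String, Object> transformTuple(Object[] tuple, String[] aliases) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < tuple.length; i++) {
            if (tuple[i] != null) {
                row.put(aliases[i], tuple[i]);
            }
        }
        return row;
    }
}
```

In a real transformer the same logic would live in a class implementing the interface, which would then be attached to the query.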

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Search on multiple indexes
PostPosted: Mon May 16, 2011 1:01 pm 
Newbie

Joined: Fri May 13, 2011 1:01 pm
Posts: 16
s.grinovero wrote:
Quote:
Excuse me but maybe I didn't express my question correctly ...

Ah, sorry.
Well that's tricky; ...
What you can do is make sure the second level cache of Hibernate is involved: ...


Thank you, now I understand. Unfortunately the persistence layer of the application is not developed with Hibernate (no Java at all, it's a COBOL app), so I can't use caches and have to find a way to use projections.




Quote:
About your primary question: I understand, and you're welcome to do whatever is more practical for you. Just one observation: if you were to make a SQL view, you would likely want to define some common column names for the result set. And just to present the list to the user: if you present it all mixed in a single table, what are the column names going to be?

Sorry, I forgot to mention that I would define a mapping layer: for every physical table involved, I would define somewhere how the real columns map to the view's columns. For example, if TABLE1 has two key fields, PKFIELD1 and PKFIELD2, I would record the names of the fields, their lengths, ...
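
One possible shape for such a mapping entry, as a minimal sketch: it assumes fixed-length key fields padded with spaces, which the thread does not specify. `KeyMapping`, `compose` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a per-table key mapping: field names plus fixed lengths (an assumption).
final class KeyMapping {
    private final String[] fieldNames;
    private final int[] lengths;

    KeyMapping(String[] fieldNames, int[] lengths) {
        if (fieldNames.length != lengths.length) {
            throw new IllegalArgumentException("one length per field is required");
        }
        this.fieldNames = fieldNames.clone();
        this.lengths = lengths.clone();
    }

    // Right-pads each value with spaces to its declared length and concatenates them.
    String compose(String... values) {
        if (values.length != lengths.length) {
            throw new IllegalArgumentException("expected " + lengths.length + " values");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (values[i].length() > lengths[i]) {
                throw new IllegalArgumentException(fieldNames[i] + " is too long");
            }
            sb.append(values[i]);
            for (int p = values[i].length(); p < lengths[i]; p++) {
                sb.append(' ');
            }
        }
        return sb.toString();
    }

    // Splits a composed key back into its fields; trim() also drops any
    // leading or trailing spaces that were part of the original value.
    List<String> split(String key) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        for (int len : lengths) {
            out.add(key.substring(pos, pos + len).trim());
            pos += len;
        }
        return out;
    }
}
```

Being able to split the concatenated key again is what lets a result line link back to the legacy record it came from.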


Quote:
It looks to me like you actually want to present three different result sets to the end user; in that case you should make three different queries and show each kind of result in its most appropriate representation. This isn't much of a technical question anymore, but think about the user experience of looking at your data.

Yes, I want to query different domains (which the user can select), but I want to present a single result. For example, if I search for ACME, I want to present a single list of invoices, orders, accounts, ... that have ACME in their text fields (descriptions of items, accounts, firm names, ...). So the list, ordered by relevance, would be something like:

Order JO0001 for customer ACME inc.
Purchase invoice JI022289 for Italian ACME ltd.
Purchase invoice JI022290 for Italian ACME ltd.
Account FA00001: ACME inc.
...


Now it's clear to me that the SQL view (with the mapping layer) can solve my problems (otherwise I would have to merge the results of the different queries). I will look into ResultTransformers, thank you.
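
For comparison, the alternative mentioned above (merging the results of separate per-domain queries) could be sketched roughly as follows. This is plain Java with made-up scores; `ScoredHit` and `HitMerger` are hypothetical names. Note that relevance scores produced by different queries against different indexes are not generally comparable, which is one argument for the single-index approach.

```java
import java.util.ArrayList;
import java.util.List;

// A display label plus the relevance score the hit received in its own query.
final class ScoredHit {
    final String label;
    final float score;

    ScoredHit(String label, float score) {
        this.label = label;
        this.score = score;
    }
}

final class HitMerger {

    // Concatenates the per-domain result lists and sorts by descending score.
    // List.sort is stable, so equally scored hits keep their input order.
    static List<ScoredHit> mergeByScore(List<List<ScoredHit>> perDomain) {
        List<ScoredHit> all = new ArrayList<>();
        for (List<ScoredHit> hits : perDomain) {
            all.addAll(hits);
        }
        all.sort((a, b) -> Float.compare(b.score, a.score));
        return all;
    }
}
```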

Thank you very much for your support Sanne!

All the best,
Andrea


 Post subject: Re: Search on multiple indexes
PostPosted: Mon May 16, 2011 1:59 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Thank you, now I understand. Unfortunately the persistence layer of the application is not developed with Hibernate (no Java at all, it's a COBOL app), so I can't use caches and have to find a way to use projections.

That's not totally clear to me: are you saying that you don't use Hibernate Core? I guess you are providing a "read-only view" to Hibernate Core, and the changes are made by a COBOL application?
In that case you're right that you can't use the second-level cache, as it might not be aware of changes in the data. But then how could you rely on the indexes? Will they get updated by the COBOL application? Otherwise you're going to have stale data in the indexes as well, and if that's not a problem, then you could use the second-level cache too.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Search on multiple indexes
PostPosted: Wed May 18, 2011 10:53 am 
Newbie

Joined: Fri May 13, 2011 1:01 pm
Posts: 16
s.grinovero wrote:
Quote:
Thank you, now I understand. Unfortunately the persistence layer of the application is not developed with Hibernate (no Java at all, it's a COBOL app), so I can't use caches and have to find a way to use projections.

That's not totally clear to me: are you saying that you don't use Hibernate Core? I guess you are providing a "read-only view" to Hibernate Core, and the changes are made by a COBOL application?
In that case you're right that you can't use the second-level cache, as it might not be aware of changes in the data. But then how could you rely on the indexes? Will they get updated by the COBOL application? Otherwise you're going to have stale data in the indexes as well, and if that's not a problem, then you could use the second-level cache too.


We're building a full-text search over our existing COBOL application, so
  • the changes (insert, delete, update) are made by the COBOL application
  • the search engine provides a way to find the entities (invoices, orders, ...) related to a string
  • the search result page has a link to go to the "legacy" page that manages the specified entity in our app (e.g. the result line "Invoice no. I000Y71" will have a link to the page of the application showing that invoice data, and the "back-end" of this page is COBOL)

Yes, indexing is challenging in this case. We don't want to change the COBOL code, so we're considering SQL triggers on the relevant tables:
  • the trigger stores the modified data in a log table
  • a Java daemon will periodically read this log table and update the index

So the index will not be updated synchronously, but with a little delay.
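
The polling side of this design could be sketched as below. This is plain Java under stated assumptions: the log table is represented by an in-memory list ordered by an ascending `id` column, and `ChangeLogRow`, `IndexUpdater` and `ChangeLogPoller` are hypothetical names. A real daemon would read the rows over JDBC and run `poll` on a schedule.

```java
import java.util.List;

// One row written by a trigger: which entity changed and how (column names assumed).
final class ChangeLogRow {
    final long id;
    final String entityName;
    final String primaryKey;
    final String operation; // e.g. "INSERT", "UPDATE", "DELETE"

    ChangeLogRow(long id, String entityName, String primaryKey, String operation) {
        this.id = id;
        this.entityName = entityName;
        this.primaryKey = primaryKey;
        this.operation = operation;
    }
}

// Hypothetical callback that applies one change to the index.
interface IndexUpdater {
    void apply(ChangeLogRow row);
}

final class ChangeLogPoller {
    private long lastSeenId = 0;

    // Applies every row newer than the last processed id and returns how many
    // rows were handled. Assumes the rows are ordered by ascending id.
    int poll(List<ChangeLogRow> logTable, IndexUpdater updater) {
        int processed = 0;
        for (ChangeLogRow row : logTable) {
            if (row.id > lastSeenId) {
                updater.apply(row);
                lastSeenId = row.id;
                processed++;
            }
        }
        return processed;
    }
}
```

Tracking the last processed id means a repeated poll does not reapply earlier changes, and the interval between polls is the index delay mentioned above.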


 Post subject: Re: Search on multiple indexes
PostPosted: Wed May 18, 2011 12:42 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
That should work; thank you for the explanation.

FYI, some databases support triggers that invoke Java functions. I've never used this myself, but you could have the trigger call index() directly: you would just write a custom listener that receives the entity name and the primary key that changed. If you build that, we'd be interested in helping out in case you want to contribute it to the project.

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Search on multiple indexes
PostPosted: Thu May 19, 2011 3:44 am 
Newbie

Joined: Fri May 13, 2011 1:01 pm
Posts: 16
We already use Java stored procedures called by triggers in another module of our application, for some DBMSs (just DB2 and Informix). For the search engine, however, we have to support several other DBMSs, including SQL Server 2000, which has no Java support: I would have to write a DLL module that calls Java via JNI... The architecture would be very clean, but probably too expensive to develop compared to the log table with the Java daemon (we already have another module with the same architecture, so we can reuse ideas and code :-)

Thank you for the suggestion anyway: we will evaluate it and let you know if we can contribute something to the project.

Andrea


 Post subject: Re: Search on multiple indexes
PostPosted: Mon May 20, 2013 2:45 pm 
Newbie

Joined: Sun May 19, 2013 2:44 pm
Posts: 10
Andrea,

How did you get the relations between the tables working? I have a similar requirement: I need to search across multiple tables and show the related entities.

Order JO0001 for customer ACME inc.
Purchase invoice JI022289 for Italian ACME ltd.
Purchase invoice JI022290 for Italian ACME ltd.
Account FA00001: ACME inc.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.