All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 4 posts ] 
Author Message
Post subject: Search performance enhancement
Posted: Thu Nov 11, 2010 6:05 am
Beginner

Joined: Tue Sep 28, 2010 5:14 am
Posts: 25
Hi,

Could you please suggest some ways to improve search performance? With my current implementation, a search takes around a minute to return about 800 records. I would like to know whether something is wrong in my implementation, or whether I am missing something that makes it take so long.

I have around 93000 records. If a simple search that returns 800 records takes 1 minute, I suspect a search that returns a few thousand records would take much longer. I want to improve the performance, because in the real world a search could sometimes return 10000 records.

I would appreciate any suggestions on improving the search time.

Quote:
Configuration file :
<hibernate-configuration>
<session-factory>
<property name="hibernate.connection.driver_class">oracle.jdbc.driver.OracleDriver</property>
<property name="hibernate.connection.url">jdbc:oracle:thin:@url:1531:schema</property>
<property name="hibernate.connection.username">username</property>
<property name="hibernate.connection.password">pwd</property>
<property name="hibernate.connection.pool_size">10</property>
<property name="show_sql">true</property>
<property name="dialect">org.hibernate.dialect.Oracle9Dialect</property>
<property name="hibernate.hbm2ddl.auto">update</property>

<!-- Hibernate search configuration -->
<property name="hibernate.search.default.directory_provider">org.hibernate.search.store.FSDirectoryProvider</property>
<property name="hibernate.search.default.indexName">cp_employees</property>
<property name="hibernate.search.default.indexBase">\indexes</property>
<property name="hibernate.search.default.batch.merge_factor">10</property>
<property name="hibernate.search.default.batch.max_buffered_docs">10</property>

<!-- Mapping files -->
<mapping resource="employees.hbm.xml"/>

</session-factory>
</hibernate-configuration>


Quote:
Indexing the document

public void indexRecord() {
    FullTextSession ftSession = SessionUtil.getFullTextSessionInstance();

    ftSession.getTransaction().begin();
    try {
        long startTime = System.currentTimeMillis();

        // Rebuild the index for CP_Employees and block until it is done
        ftSession.createIndexer(CP_Employees.class)
                .batchSizeToLoadObjects(100)
                .cacheMode(CacheMode.IGNORE)
                .threadsToLoadObjects(10)
                .threadsForSubsequentFetching(25)
                .startAndWait();

        long batchTime = System.currentTimeMillis() - startTime;
        System.out.println("Indexing time (ms): " + batchTime);
    } catch (InterruptedException e) {
        // Restore the interrupt flag so callers can see the interruption
        Thread.currentThread().interrupt();
        e.printStackTrace();
    }

    ftSession.getTransaction().commit(); // indexes are written at commit time
}


Quote:
Generic Search

@SuppressWarnings({ "deprecation" })
public List<?> genericCPEmployeesSearch(String searchQuery) {
    String[] searchFields = getSearchFields();

    FullTextSession ftSession = SessionUtil.getFullTextSessionInstance();

    MultiFieldQueryParser multiQueryParser = new MultiFieldQueryParser(
            searchFields, ftSession.getSearchFactory().getAnalyzer(CP_Employees.class));

    try {
        org.apache.lucene.search.Query luceneQuery =
                multiQueryParser.parse(makeGenericQueryString(searchQuery));

        long startTime = System.currentTimeMillis();
        // list() runs the Lucene query and then loads every matching entity from the database
        @SuppressWarnings("unchecked")
        List<CP_Employees> results = ftSession.createFullTextQuery(luceneQuery, CP_Employees.class).list();
        long timeDiff = System.currentTimeMillis() - startTime;

        System.out.println("results.size() " + results.size());
        // Print milliseconds; integer division by 60*1000 would truncate to whole minutes
        System.out.println("timeDiff (ms) " + timeDiff);
        return results;
    } catch (ParseException e) {
        e.printStackTrace();
    }

    return null;
}


The getSearchFields() method returns the names of all fields in the CP_Employees entity.

Quote:
ENTITY Definition used for search

public class CP_Employees {

@Id
@GeneratedValue
@DocumentId
private long emplid;

@Field (index=Index.TOKENIZED, store=Store.NO)
private String emplClass;

@Field (index=Index.TOKENIZED, store=Store.NO)
private String status;

@Field (index=Index.TOKENIZED, store=Store.YES)
private String title;

@Field (index=Index.TOKENIZED, store=Store.YES)
private String firstName;

@Field (index=Index.TOKENIZED, store=Store.YES)
private String lastName;

@Field (index=Index.UN_TOKENIZED, store=Store.NO)
private String busName;

@Field (index=Index.UN_TOKENIZED, store=Store.NO)
private String name;

@Field (index=Index.TOKENIZED, store=Store.NO)
private Date birthDate;

@Field (index=Index.TOKENIZED, store=Store.NO)
private Date hireDate;

@Field (index=Index.TOKENIZED, store=Store.YES)
private long hrStatus;

@Field (index=Index.UN_TOKENIZED, store=Store.NO)
private String jobTitle;

@Field (index=Index.TOKENIZED, store=Store.NO)
private String location;

@Field (index=Index.TOKENIZED, store=Store.YES)
private String localDesc;

@Field (index=Index.TOKENIZED, store=Store.YES)
private String physicalCity;

@Field (index=Index.TOKENIZED, store=Store.YES)
private String physicalCountry;

@Field (index=Index.TOKENIZED, store=Store.YES)
private String physicalRegion;

@Field (index=Index.UN_TOKENIZED, store=Store.NO)
private String rankDesc;

@Field (index=Index.UN_TOKENIZED, store=Store.NO)
private String sex;

@Field (index=Index.UN_TOKENIZED, store=Store.NO)
private String payrollCompany;


Thanks in advance.

Cheers,
Manoj


Post subject: Re: Search performance enhancement
Posted: Thu Nov 11, 2010 6:26 am
Hibernate Team

Joined: Thu Apr 05, 2007 5:52 am
Posts: 1689
Location: Sweden
1 minute to search within 93000 records for a relatively simple-looking entity does indeed seem long. It is hard to say what the cause could be. I would first try to find out how much time is spent in the Lucene query and how much in retrieving the objects from the db. There are several ways to do that. You can turn on debug logging, where you should be able to see what is happening when. Inspect the SQL statements which get executed. Maybe take one and run it directly against the db to see how long it takes. Do the same for the Lucene query: open the Lucene index using Luke and run your query there. How long does that take?
You can also attach a profiler to see where most of the time is spent.
Last but not least, if you are running the latest version of Search you can use JMX to monitor Lucene vs DB query times. Refer to the online manual on how to monitor Search via JMX.
The bottom line is to figure out where the bottleneck is. Once you know that, you can start focusing on the problem.

--Hardy
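
As a rough illustration of the advice above, the two phases can be timed separately instead of wrapping the whole search in one stopwatch. The sketch below is a minimal, library-free helper; `PhaseTimer`, `time`, `millis` and `report` are hypothetical names invented here, not part of Hibernate Search, and the two lambdas in `main` are stand-ins for the real calls. In the code from the first post, the "index" phase could wrap a call that only touches the Lucene index and the "load" phase could wrap `list()`, assuming those are the two costs being compared.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Hibernate Search): times named phases separately. */
class PhaseTimer {
    // Insertion-ordered so the report lists phases in the order they ran.
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<String, Long>();

    /** Runs the task, records its elapsed time under the given name, and returns its result. */
    <T> T time(String phase, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            nanosByPhase.put(phase, System.nanoTime() - start);
        }
    }

    /** Elapsed milliseconds for a recorded phase, or -1 if that phase never ran. */
    long millis(String phase) {
        Long nanos = nanosByPhase.get(phase);
        return nanos == null ? -1L : nanos / 1_000_000L;
    }

    /** One line per recorded phase, in the form "name: N ms". */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : nanosByPhase.entrySet()) {
            sb.append(e.getKey()).append(": ")
              .append(e.getValue() / 1_000_000L).append(" ms\n");
        }
        return sb.toString();
    }
}

class PhaseTimerDemo {
    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();

        // Stand-in for the index-only part of the search (assumption for the demo).
        int hits = timer.time("index", () -> 800);

        // Stand-in for loading the matching entities from the database.
        int[] loaded = timer.time("load", () -> new int[hits]);

        System.out.println("loaded " + loaded.length + " rows");
        System.out.print(timer.report());
    }
}
```

If the "load" line dominates the report, the database side is the place to look; if "index" dominates, running the same query in Luke is the natural next check.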


Post subject: Re: Search performance enhancement
Posted: Mon Nov 15, 2010 10:17 am
Beginner

Joined: Tue Sep 28, 2010 5:14 am
Posts: 25
Hi Hardy,

Thanks for your reply. This makes sense, but what I am confused about is the number of records that the search brings back. Would that also affect the time taken? To give an example, when I search for "United kingdom", I get back around 10000 records. Because it retrieves such a large volume of records, it takes around 25 mins to return the result.

I did restrict this by setting firstResult and the maximum result size, but I would still need the total number of records for pagination.

What I want to understand is whether I can get back just a subset of the total search result.

--Manoj


Post subject: Re: Search performance enhancement
Posted: Mon Nov 15, 2010 5:54 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi,
you don't need to load a single entity to know the number of results: http://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#d0e3705.
This works especially well with pagination: load only the objects you need to show, no more.

If your query performs worse when more objects are returned, then it is likely not the full-text query that is slow but the subsequent loading of objects; you might also like to use projection.
In any case, follow Hardy's suggestion, as it's hard to help if you don't know what is slow.

_________________
Sanne
http://in.relation.to/
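
The pagination pattern described above (ask for the total hit count, then load only one page of objects) can be sketched without any library as follows. `PagedQuery`, `InMemoryQuery`, `Page` and `Pager` are hypothetical types written for this illustration; the method names `getResultSize`, `setFirstResult`, `setMaxResults` and `list` are modelled on the query methods discussed in the thread, and the in-memory implementation exists only so the example runs. It is a sketch under those assumptions, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a query that can count hits without loading them. */
interface PagedQuery<T> {
    int getResultSize();                  // total number of hits, no objects loaded
    PagedQuery<T> setFirstResult(int n);  // offset of the first object to load
    PagedQuery<T> setMaxResults(int n);   // maximum number of objects to load
    List<T> list();                       // loads only the requested window
}

/** In-memory implementation, present only to make the sketch runnable. */
class InMemoryQuery<T> implements PagedQuery<T> {
    private final List<T> hits;
    private int first = 0;
    private int max = Integer.MAX_VALUE;

    InMemoryQuery(List<T> hits) { this.hits = hits; }

    public int getResultSize() { return hits.size(); }
    public PagedQuery<T> setFirstResult(int n) { first = n; return this; }
    public PagedQuery<T> setMaxResults(int n) { max = n; return this; }

    public List<T> list() {
        int from = Math.min(first, hits.size());
        int to = (int) Math.min((long) from + max, hits.size());
        return new ArrayList<T>(hits.subList(from, to));
    }
}

/** One page of results plus the totals a pager widget needs. */
class Page<T> {
    final List<T> rows;
    final int totalHits;
    final int totalPages;

    Page(List<T> rows, int totalHits, int totalPages) {
        this.rows = rows;
        this.totalHits = totalHits;
        this.totalPages = totalPages;
    }
}

class Pager {
    /** Fetches page pageIndex (0-based) of pageSize rows, plus the overall totals. */
    static <T> Page<T> fetch(PagedQuery<T> query, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        int total = query.getResultSize();                   // count only
        int totalPages = (total + pageSize - 1) / pageSize;  // ceiling division
        List<T> rows = query
                .setFirstResult(Math.multiplyExact(pageIndex, pageSize))
                .setMaxResults(pageSize)
                .list();                                     // at most pageSize objects
        return new Page<T>(rows, total, totalPages);
    }
}
```

With 10000 hits and a page size of 50, `Pager.fetch(query, 0, 50)` loads 50 objects and reports 200 pages, so the cost of loading no longer grows with the total number of matches.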


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.