
All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 3 posts ] 
 Post subject: Hibernate Search- Define a Sorting sequence for UTF-8 langua
PostPosted: Tue Apr 20, 2010 12:53 am 
Newbie

Joined: Mon Apr 19, 2010 11:55 pm
Posts: 5
Hi,

I am using Hibernate Search for a trilingual application that holds over 2 million records. I am very impressed with the performance of the search operation. Thanks to the Hibernate Search team :)

However, I have been unable to sort data stored in the Sinhala language (UTF-8 encoded), although sorting English data works fine.


I am new to Hibernate Search and I don't know of a way to define the sort order for the Sinhala language (stored as UTF-8). It would be like defining an alphabet, similar to A, B, C, D, etc. in English.

I am thinking of something similar to the way we define a customized stop word file to override the default StopFilter, which removes commonly used English words (such as "is", "the" and "and").


Please advise me on this.

I am using the following versions:

hibernate-search-3.0.0.GA.jar
JBoss 4.2.3


 
 Post subject: Re: Hibernate Search- Define a Sorting sequence for UTF-8 langua
PostPosted: Tue Apr 20, 2010 4:29 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
I am using Hibernate Search for a trilingual application that holds over 2 million records. I am very impressed with the performance of the search operation. Thanks to the Hibernate Search team :)

Thanks, nice to hear that!
Quote:
hibernate-search-3.0.0.GA.jar

That's a very old version! I'd recommend upgrading to at least version 3.1.1.GA; it brings very big performance improvements and also some bug fixes.

Quote:
However, I have been unable to sort data stored in the Sinhala language (UTF-8 encoded), although sorting English data works fine.

I'm sorry, full support of UTF-8 was not implemented yet. You might get better results by updating to the latest candidate release, 3.2.0.CR1, but I don't know the details of the Sinhala language, so I'm not sure it would help.

The safest thing you can do is to add an additional "sortfield" in which you encode a string following the sorting rules of Sinhala.
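
A minimal sketch of how such a sort value could be encoded, using only the JDK's java.text.RuleBasedCollator. The class name SortKeyEncoder, its methods and the DEMO_RULES string are hypothetical names made up for illustration, and the rules are a deliberately artificial placeholder (c before b before a), not the real Sinhala alphabet; the actual Sinhala ordering would have to be supplied as rules. The idea is to turn each value into a collation key and hex-encode its bytes, so that a plain string comparison of the encoded values follows the order defined by the rules.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class SortKeyEncoder {

    // Placeholder rules for illustration only: "c" sorts before "b", which sorts before "a".
    // Assumption: the real Sinhala ordering would be written in the same rule syntax.
    static final String DEMO_RULES = "< c < b < a";

    // Builds a collator from custom rules, wrapping the checked ParseException.
    static Collator collatorFor(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid collation rules", e);
        }
    }

    // Encodes a value as the hex form of its collation key bytes.
    // Two hex digits per byte keep the unsigned byte order under String.compareTo.
    static String encode(Collator collator, String value) {
        CollationKey key = collator.getCollationKey(value);
        StringBuilder hex = new StringBuilder();
        for (byte b : key.toByteArray()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        Collator collator = collatorFor(DEMO_RULES);
        String first = encode(collator, "c");
        String last = encode(collator, "a");
        // Compares the encoded values the way a plain string sort would.
        System.out.println(first.compareTo(last) < 0);
    }
}
```

The encoded string would be what gets written to the extra untokenized sort field, for example from a custom field bridge, while the original text stays in the searchable field.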

_________________
Sanne
http://in.relation.to/


 
 Post subject: Re: Hibernate Search- Define a Sorting sequence for UTF-8 langua
PostPosted: Wed Apr 21, 2010 5:54 am 
Newbie

Joined: Mon Apr 19, 2010 11:55 pm
Posts: 5
Hi Sanne,
Thank you very much for your reply. As you suggested, I have used separate columns to store the trilingual data in the database, and I have indexed the columns twice, in tokenized and untokenized modes (one for searching and the other for sorting). Please refer to the code below.

Database
----------
LAST_NAME_EN – last name in English
LAST_NAME_SI – last name in Sinhala, etc.

Entity
------
// Index the Sinhala last name twice: tokenized for searching, untokenized for sorting.
@Fields({
    @Field(index = Index.TOKENIZED),
    @Field(name = "childLastNameSi_sort", index = Index.UN_TOKENIZED, store = Store.YES)
})
public String getChildLastNameSi() {
    return this.childLastNameSi;
}


The sorting is done as follows:

FullTextQuery query = ftSession.createFullTextQuery(lquery, BirthDeclarations.class);

// Sort ascending on the untokenized field.
SortField sortField = new SortField("childLastNameSi_sort", SortField.STRING, false);
Sort sort = new Sort(sortField);
query.setSort(sort);


But when I use "childLastNameSi_sort" as the sort field, I do not get the order that the Sinhala alphabet specifies. (Similarly, if the sorting is done in English, records should be ordered according to the English alphabet.)
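
To illustrate the mismatch described above: as far as I understand it, a plain string sort compares raw character values and does not apply any language rules. The self-contained sketch below shows the difference using English sample values and the JDK's Collator; the class SortOrderDemo and its method names are hypothetical and not part of Hibernate Search or Lucene, and English is used only because its collation rules are known to ship with the JDK.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class SortOrderDemo {

    // Sorts by String.compareTo, i.e. by raw UTF-16 code unit values.
    static List<String> byCodeUnits(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    // Sorts by the language rules of the given collator.
    static List<String> byCollator(List<String> values, Collator collator) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("b", "a", "C");
        // Uppercase letters have lower code values than lowercase ones.
        System.out.println(byCodeUnits(names));
        // The English collator orders by letter first, then by case.
        System.out.println(byCollator(names, Collator.getInstance(Locale.ENGLISH)));
    }
}
```

The same kind of gap can appear for any script whose alphabetical order differs from the numeric order of its character codes, which is why a pre-encoded sort value is one way around it.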

Anyway, I'll upgrade Hibernate Search to version 3.1.1.GA and let you know the result.

Thanks
Indika


 
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.