All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 
 Post subject: cp1252 character data on Linux
Posted: Tue Apr 10, 2007 2:50 pm
Newbie

Joined: Wed Jun 28, 2006 8:58 am
Posts: 9
Using Hibernate v3.2.2 with JSQLConnect (JDBC) v4.0 and Java v1.6.0 against MSSQL v9.0.3042.

I have a mapped VARCHAR column in an MSSQL database, with data encoded in the MSSQL default code page, Cp1252. When running Java on a Windows server whose JVM default code page is Cp1252, Hibernate converts this character data successfully into a String object, with all characters properly translated from Cp1252.

When I run Java on a Linux server (Red Hat Enterprise Linux v3.0), characters in the "extended ASCII" range 0x80-0x9F seem to get mistranslated when the query data is converted into a String object. Hibernate builds the String object from the raw ResultSet returned by the JDBC layer. In particular, I am testing with an EM DASH character, which has the Cp1252 code 151 (0x97) and the Unicode code point 8212 (U+2014).
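To make the symptom concrete, here is a minimal sketch (not taken from the original setup) that decodes the single byte 0x97 with three charsets using only the standard `java.nio.charset.Charset` API. The class and method names are hypothetical, and it assumes the JVM provides the `windows-1252` charset, which is Java's canonical name for Cp1252.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: shows how the same byte decodes under different charsets.
public class EmDashDecoding {

    // Decode one byte with the named charset and return the resulting UTF-16 code unit.
    static int decodeFirstChar(byte b, String charsetName) {
        String s = new String(new byte[] { b }, Charset.forName(charsetName));
        return s.charAt(0);
    }

    public static void main(String[] args) {
        byte emDash = (byte) 0x97; // EM DASH in Cp1252
        String[] names = { "windows-1252", "ISO-8859-1", "UTF-8" };
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-12s -> U+%04X%n", names[i], decodeFirstChar(emDash, names[i]));
        }
    }
}
```

Only the `windows-1252` decode yields U+2014. ISO-8859-1 maps the byte to the control character U+0097, and UTF-8 treats a lone 0x97 as malformed input and substitutes the replacement character U+FFFD.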

The default JVM code page on Windows is Cp1252, while on Linux it is not Cp1252 and is most likely ISO-8859-1. Is there some way to configure Hibernate so that it knows the character data for this property, or for the whole database, is in Cp1252 rather than the default JVM code page on Linux?

Thanks,

Greg Burcher


 Post subject: Hibernate configuration of code page for character data?
Posted: Wed Apr 11, 2007 3:16 pm
Newbie

Joined: Wed Jun 28, 2006 8:58 am
Posts: 9
I confirmed that the default code page on my Linux server is UTF-8. I also confirmed through debugging that JDBC is returning the correct Cp1252 bytes from the MSSQL VARCHAR column value. I haven't seen the pertinent Hibernate code, but I presume Hibernate is constructing a String object from the bytes using the default code page, UTF-8. This is where the character values get mangled.

Is there a way to tell Hibernate which code page to use, either for a single property or for all connections in a session factory?

As a workaround I have created a wrapper class for a byte[] that presents a String-like interface, converting to and from byte[] underneath. I changed my mapped property from String to this new class and created a UserType implementation to read and write the data as type BINARY instead of type STRING. This avoids the problem of Hibernate mangling the character codes.
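The post does not show the wrapper itself, so the following is only a rough sketch of what such a class could look like. The name `Cp1252Text` and all of its methods are hypothetical. The Hibernate `UserType` that would read and write the column as BINARY is deliberately left out, since it depends on the Hibernate API and the actual mapping.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical wrapper: stores the raw Cp1252 bytes and exposes them as text.
public final class Cp1252Text implements CharSequence {

    // "windows-1252" is Java's canonical name for Cp1252 (assumed available in the JVM).
    private static final Charset CP1252 = Charset.forName("windows-1252");

    private final byte[] bytes;

    private Cp1252Text(byte[] bytes) {
        this.bytes = bytes;
    }

    // Wrap bytes as they come from the database (a custom type would call this on read).
    public static Cp1252Text fromBytes(byte[] raw) {
        return new Cp1252Text(raw.clone());
    }

    // Encode application text to Cp1252, independent of the JVM default charset.
    public static Cp1252Text fromString(String s) {
        return new Cp1252Text(s.getBytes(CP1252));
    }

    // Bytes to hand back to the database (a custom type would call this on write).
    public byte[] toBytes() {
        return bytes.clone();
    }

    // Decode with an explicit charset so the result does not depend on the platform.
    @Override
    public String toString() {
        return new String(bytes, CP1252);
    }

    @Override
    public int length() {
        return toString().length();
    }

    @Override
    public char charAt(int index) {
        return toString().charAt(index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return toString().subSequence(start, end);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Cp1252Text && Arrays.equals(bytes, ((Cp1252Text) other).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}
```

The key design choice is that every conversion names the charset explicitly, so the same bytes decode identically on Windows and Linux regardless of the JVM default.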

Can anyone think of a less complex solution?

Thanks,

Greg


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.