Hibernate Community • View topic - cp1252 character data on Linux



All times are UTC - 5 hours [ DST ]

cp1252 character data on Linux


gburcher

Post subject: cp1252 character data on Linux

Posted: Tue Apr 10, 2007 2:50 pm

Newbie

Joined: Wed Jun 28, 2006 8:58 am
Posts: 9

Using Hibernate v3.2.2 with JSQLConnect (JDBC) v4.0, Java v1.6.0, and MSSQL v9.0.3042.

I have a mapped VARCHAR column in an MSSQL database, with data encoded in the MSSQL default code page, Cp1252. When running Java on a Windows server where the JVM default code page is also Cp1252, Hibernate converts this character data successfully into a String object, with all characters properly translated from Cp1252.

When I run Java on a Linux server (Red Hat Enterprise Linux v3.0), characters in the "extended ASCII" range 0x80-0x9F seem to get mistranslated when the query data is converted into a String object. Hibernate builds the String object from the raw ResultSet returned by the JDBC layer. In particular, I am testing with an EM DASH character, which has the Cp1252 code 151 (0x97) and the Unicode code point 8212 (U+2014).
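To make the symptom concrete, here is a minimal sketch of how the single byte 0x97 decodes under three charsets. It uses only the JDK and assumes Java 7 or later for `StandardCharsets`; the class and method names are made up for illustration and are not part of Hibernate or the JDBC driver.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: the class and method names are hypothetical.
class EmDashDecodingDemo {
    // The single Cp1252 byte for EM DASH, as it would sit in the VARCHAR column.
    static final byte[] EM_DASH_CP1252 = { (byte) 0x97 };

    // Decode the bytes with the given charset and return the first UTF-16 code unit.
    static int firstCharAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset).charAt(0);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        // prints "windows-1252: U+2014" (EM DASH, the intended character)
        System.out.printf("windows-1252: U+%04X%n", firstCharAs(EM_DASH_CP1252, cp1252));
        // prints "ISO-8859-1:   U+0097" (a C1 control character)
        System.out.printf("ISO-8859-1:   U+%04X%n",
                firstCharAs(EM_DASH_CP1252, StandardCharsets.ISO_8859_1));
        // prints "UTF-8:        U+FFFD" (a lone 0x97 is malformed UTF-8, so it is replaced)
        System.out.printf("UTF-8:        U+%04X%n",
                firstCharAs(EM_DASH_CP1252, StandardCharsets.UTF_8));
    }
}
```

Only the decode that names Cp1252 explicitly yields U+2014, which would fit the behaviour described above whenever the bytes are decoded with some other charset.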

The default JVM code page on Windows is Cp1252, while on Linux it is not Cp1252; it is most likely ISO-8859-1. Is there some way to configure Hibernate so that it knows the character data for this property, or for the whole database, is in Cp1252 rather than the default JVM code page on Linux?

Thanks,

Greg Burcher


gburcher

Post subject: Hibernate configuration of code page for character data?

Posted: Wed Apr 11, 2007 3:16 pm


I confirmed that the default code page on my Linux server is UTF-8. I also confirmed in the debugger that JDBC is returning the correct Cp1252 bytes from the MSSQL VARCHAR column value. I haven't seen the pertinent Hibernate code, but I presume Hibernate is constructing a String object from the bytes using the default code page, UTF-8. This is where the character values get mangled.
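For anyone who wants to repeat the first check, a minimal sketch of reporting the JVM default charset follows. It uses only standard JDK calls (`Charset.defaultCharset()` and the `file.encoding` system property); the class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Illustrative only: the class and method names are hypothetical.
class DefaultCharsetCheck {
    // True when the JVM default charset is windows-1252 (Cp1252).
    static boolean defaultIsCp1252() {
        return Charset.defaultCharset().equals(Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // The charset used by calls such as new String(byte[]) that do not name one.
        System.out.println("default charset: " + Charset.defaultCharset().name());
        // The system property the default is usually derived from.
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        System.out.println("matches Cp1252:  " + defaultIsCp1252());
    }
}
```

Running this on both servers shows whether the two JVMs really disagree about the default before looking any further up the stack.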

Is there not a way for me to tell Hibernate the code page to use, either for a single property or for all connections in a session factory?

As a workaround, I have created a wrapper class for a byte[] that presents a String-like interface, performing the conversion to and from byte[] underneath. I changed my mapped property from String to this new class and created a UserType implementation to read and write the data as type BINARY instead of type STRING. This avoids the problem of Hibernate mangling the character codes.
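A rough sketch of what such a wrapper could look like is below. This is not the actual class from my project; `Cp1252Text` and its methods are hypothetical names, and the Hibernate side is left out so the example runs with the JDK alone. The assumption is that a UserType's nullSafeGet would pass the result of `ResultSet.getBytes(...)` to `fromBytes`, and nullSafeSet would pass `toBytes()` to `PreparedStatement.setBytes(...)`.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical wrapper: holds the raw Cp1252 bytes and converts on demand.
final class Cp1252Text {
    private static final Charset CP1252 = Charset.forName("windows-1252");

    private final byte[] bytes;

    private Cp1252Text(byte[] bytes) {
        this.bytes = bytes;
    }

    // Wrap bytes read from the database (defensive copy keeps the wrapper immutable).
    static Cp1252Text fromBytes(byte[] raw) {
        return new Cp1252Text(raw.clone());
    }

    // Encode application text as Cp1252; unmappable characters become '?'.
    static Cp1252Text fromString(String text) {
        return new Cp1252Text(text.getBytes(CP1252));
    }

    // Bytes to write back to the column as BINARY.
    byte[] toBytes() {
        return bytes.clone();
    }

    // Decode with an explicit charset, so the JVM default never matters.
    @Override
    public String toString() {
        return new String(bytes, CP1252);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Cp1252Text && Arrays.equals(bytes, ((Cp1252Text) other).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }

    public static void main(String[] args) {
        Cp1252Text text = Cp1252Text.fromBytes(new byte[] { 'a', (byte) 0x97, 'b' });
        // The middle character is U+2014 regardless of the platform default charset.
        System.out.printf("U+%04X%n", (int) text.toString().charAt(1));
    }
}
```

The point of the design is that every conversion names the charset explicitly, which is the part that still feels like more machinery than the problem deserves.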

Can anyone think of a less complex solution?

Thanks,

Greg

