
All times are UTC - 5 hours [ DST ]



 Post subject: Is Hibernate able to read UTF-8 4-byte chars from database?
Posted: Mon Mar 29, 2010 7:20 pm
This seems like a fairly simple problem that many people must have run into, but a basic search has surprisingly not turned up any answers.

We can save and read UTF-8 characters with Hibernate and MySQL, but only when using varchar columns. The problem, as anyone who has dealt with Asian and other international text can attest, is that varchar columns are no help if you need characters outside the Basic Multilingual Plane (BMP). MySQL's varchar columns, with its utf8 character set (at most 3 bytes per character), only support the BMP. If you want 4-byte UTF-8 characters (that is, the supplementary characters beyond the BMP), you have to use varbinary columns. This is where Hibernate has trouble, and I'm hoping there's a configuration option or a workaround. According to Oracle, MySQL 6, if you compile from the latest sources, does support the full UTF-8 character set in varchar columns, but that's not an option for us.
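To make the BMP versus 4-byte distinction concrete, here is a minimal Java sketch. The class and method names (`Utf8Widths`, `utf8Length`, `hasSupplementary`) are hypothetical helpers made up for illustration; they are not part of Hibernate or the MySQL driver. It only uses standard JDK calls to show how many UTF-8 bytes a code point needs and whether a String contains anything outside the BMP.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Hibernate or MySQL Connector/J.
final class Utf8Widths {

    /** Number of bytes that the UTF-8 encoding of a single code point occupies. */
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if any code point in the text lies outside the BMP (above U+FFFF). */
    static boolean hasSupplementary(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // U+D558 is a Hangul syllable inside the BMP: 3 bytes in UTF-8.
        System.out.println(utf8Length(0xD558));
        // U+20000 is a CJK Extension B ideograph outside the BMP: 4 bytes in UTF-8.
        System.out.println(utf8Length(0x20000));
        // Two Hangul syllables, both in the BMP.
        System.out.println(hasSupplementary("\uD558\uB298"));
        // The surrogate pair D840 DC00 is how Java represents U+20000.
        System.out.println(hasSupplementary("\uD840\uDC00"));
    }
}
```

A check like `hasSupplementary` could be used to find out which stored values actually need the 4-byte range and which would fit in a BMP-only column.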

So my question is: can I get Hibernate to work with 4-byte UTF-8 characters from MySQL? It doesn't work out of the box.

What happens is this: everything works fine during the Hibernate save operation, and MySQL contains the correct values (byte sequences) in the varbinary columns. If I use Hibernate to query for an entity by the String that I saved, I get the correct entities (select cat where name=:catName). But when Hibernate populates the entity with the Strings from the DB, I get different values than the ones I used in the query that retrieved them. Instead of the correct values, I receive a series of U+FFFD Unicode replacement characters.

For example: if I store "하늘" in the DB as the cat name and then I query for it (select cat where name='하늘'), the resulting cat returned will have the correct id, but the name String will be \uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD.
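As a point of comparison, here is a small sketch of one way six replacement characters can come out of that two-character name. It is not a diagnosis of what Hibernate or the driver does internally; it only assumes, for illustration, that the stored UTF-8 bytes get decoded with a charset that cannot represent them (US-ASCII here). `ReplacementDemo` and `roundTrip` are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not touch Hibernate or a database.
final class ReplacementDemo {

    /** Encodes text as UTF-8, then decodes the bytes with the given charset. */
    static String roundTrip(String text, Charset readCharset) {
        byte[] stored = text.getBytes(StandardCharsets.UTF_8);
        return new String(stored, readCharset);
    }

    public static void main(String[] args) {
        String name = "\uD558\uB298"; // the two-syllable cat name from the post

        // Same charset on both sides: the name survives unchanged.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name)); // true

        // Mismatched charset: each of the 6 stored bytes is undecodable as ASCII,
        // so each one becomes U+FFFD.
        String broken = roundTrip(name, StandardCharsets.US_ASCII);
        System.out.println(broken.length()); // 6
    }
}
```

If the count of replacement characters matches the number of stored bytes, as it does here, that hints at a byte-by-byte decode with the wrong charset somewhere on the read path rather than at corrupted data.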

* The DB connection has the following parameters set: useUnicode=true&characterEncoding=UTF-8
* I've also tried the following Hibernate configuration properties, but they didn't solve the problem:
- connection.useUnicode = true
- connection.characterEncoding = UTF-8
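For reference, the settings listed above could be assembled roughly like this. This is only a sketch using `java.util.Properties`: the class name, the method names, and the host/database in the example URL are made up, and the `hibernate.`-prefixed keys assume that Hibernate hands `hibernate.connection.*` entries to the JDBC driver with the prefix stripped.

```java
import java.util.Properties;

// Hypothetical sketch; builds settings only, it does not open a connection.
final class ConnectionSettingsSketch {

    /** Appends the two encoding parameters from the post to a base JDBC URL. */
    static String withUnicodeParams(String baseUrl) {
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "useUnicode=true&characterEncoding=UTF-8";
    }

    /** Collects the URL plus the two connection.* properties mentioned in the post. */
    static Properties hibernateProperties(String baseUrl) {
        Properties props = new Properties();
        props.setProperty("hibernate.connection.url", withUnicodeParams(baseUrl));
        // Assumption: these are passed through to the driver as
        // useUnicode / characterEncoding.
        props.setProperty("hibernate.connection.useUnicode", "true");
        props.setProperty("hibernate.connection.characterEncoding", "UTF-8");
        return props;
    }

    public static void main(String[] args) {
        // Host and database name are placeholders.
        Properties props = hibernateProperties("jdbc:mysql://localhost:3306/catdb");
        System.out.println(props.getProperty("hibernate.connection.url"));
    }
}
```

Note that both routes express the same two driver settings, so if one of them is already in effect the other would not be expected to change the behaviour.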

So Hibernate has a problem populating an entity's String from a varbinary column. What can we do? Is there a workaround? Or do we need to make all of our entity fields byte arrays instead of Strings? That does work, but it is far from ideal for us and would be a last resort.
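The byte-array last resort does not have to leak into the rest of the code. A minimal sketch, assuming the persistent field is a `byte[]` and the String accessors do the UTF-8 conversion explicitly: the `Cat` class below is hypothetical, its mapping (annotations or hbm.xml) is omitted, and how the String accessors are kept out of the mapping depends on the setup.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical entity sketch; mapping metadata is intentionally left out.
final class Cat {

    // This is the field that would be mapped to the varbinary column.
    private byte[] nameBytes;

    /** Raw accessors, intended for the persistence layer. */
    byte[] getNameBytes() {
        return nameBytes;
    }

    void setNameBytes(byte[] nameBytes) {
        this.nameBytes = nameBytes;
    }

    /** String view for application code; decodes the stored bytes as UTF-8. */
    String getName() {
        return nameBytes == null ? null : new String(nameBytes, StandardCharsets.UTF_8);
    }

    /** Encodes the name as UTF-8, so supplementary characters take 4 bytes each. */
    void setName(String name) {
        this.nameBytes = name == null ? null : name.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Cat cat = new Cat();
        // One supplementary character (U+20000) followed by one BMP character (U+D558).
        cat.setName("\uD840\uDC00\uD558");
        System.out.println(cat.getNameBytes().length); // 4 + 3 = 7 bytes
        System.out.println(cat.getName().equals("\uD840\uDC00\uD558")); // true
    }
}
```

This keeps the charset decision in one place, which is the main point: the conversion no longer depends on what the driver or the connection settings decide to do with a binary column.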


 Post subject: Re: Is Hibernate able to read UTF-8 4-byte chars from database?
Posted: Tue Mar 30, 2010 10:16 pm
Has nobody reading this forum had to deal with Hibernate and MySQL together with international characters (including supplementary characters)? We're about to change all of our Strings to byte arrays, but would love a better solution.

