This is a fairly simple problem that I have to imagine many people have run into, but a basic search surprisingly has not led me to any answers...
We're able to save and read UTF-8 characters with Hibernate and MySQL, but only when using varchar columns. The problem, as anyone who has had to deal with Asian and other international characters can attest, is that you can't use varchar columns if you want more than the Basic Multilingual Plane (BMP) set of characters. MySQL's varchar columns only support the BMP. If you want 4-byte UTF-8 characters (that is, a fuller set of international characters), you have to use varbinary columns. This is where Hibernate has trouble, and I'm hoping there's a configuration option or a workaround. According to Oracle, MySQL 6, if you compile from the latest sources, actually does support the full UTF-8 character set in varchar columns, but that's not an option for us.
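To make the "4-byte" distinction concrete, here is a minimal Java sketch (the class and method names are made up for illustration). It compares a BMP character with one outside the BMP: the first takes 3 bytes in UTF-8, the second takes 4, and in Java the second is stored as a surrogate pair.

```java
import java.nio.charset.StandardCharsets;

class Utf8Widths {
    // Number of bytes the given string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+D558 is a Hangul syllable inside the BMP.
        String bmp = "\uD558";
        // U+20000 is a CJK ideograph outside the BMP.
        String supplementary = new String(Character.toChars(0x20000));

        System.out.println(utf8Length(bmp));           // 3
        System.out.println(utf8Length(supplementary)); // 4

        // One code point, but two Java chars (a surrogate pair).
        System.out.println(supplementary.length());                                  // 2
        System.out.println(supplementary.codePointCount(0, supplementary.length())); // 1
    }
}
```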
So my question is basically: can I get Hibernate to work with 4-byte UTF-8 characters from MySQL? It doesn't work out of the box.
What happens is that everything works fine during the Hibernate save operation. MySQL contains the correct values (sequences of bytes) in the varbinary columns. If I query for an entity using Hibernate by the String that I saved, I get the correct entities (select cat where name=:catName). But when Hibernate populates the entity with the Strings from the DB, I get different values than the ones I used in the query that retrieved them. Instead of the correct values, I receive a bunch of U+FFFD Unicode replacement characters.
For example: if I store "하늘" in the DB as the cat name and then I query for it (select cat where name='하늘'), the resulting cat returned will have the correct id, but the name String will be \uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD.
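I can reproduce the same symptom in plain Java without Hibernate. The sketch below is only an assumption about what might be happening somewhere between the driver and the entity, not a confirmed diagnosis: it takes the UTF-8 bytes of the name and decodes them with a charset that cannot represent them (US-ASCII is picked here purely as an example). The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReplacementCharDemo {
    // Encodes s as UTF-8 (as stored in the varbinary column), then decodes
    // the bytes with the given charset, as a misconfigured layer might.
    static String decodeUtf8BytesAs(String s, Charset charset) {
        byte[] stored = s.getBytes(StandardCharsets.UTF_8);
        return new String(stored, charset);
    }

    public static void main(String[] args) {
        String name = "\uD558\uB298"; // the cat name from the example above

        // Decoding with the right charset round-trips the name.
        String good = decodeUtf8BytesAs(name, StandardCharsets.UTF_8);
        System.out.println(good.equals(name)); // true

        // Decoding with US-ASCII turns each of the 6 bytes into U+FFFD.
        String bad = decodeUtf8BytesAs(name, StandardCharsets.US_ASCII);
        System.out.println(bad.length()); // 6
        for (char c : bad.toCharArray()) {
            System.out.printf("\\u%04X", (int) c);
        }
        System.out.println();
    }
}
```

The six replacement characters match the six UTF-8 bytes of the two-character name, which is why I suspect the bytes are being decoded with the wrong charset on the way back.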
* The DB connection has the following parameters set: useUnicode=true&characterEncoding=UTF-8
* I've tried using the following configurations for Hibernate, but that didn't solve the problem:
  - connection.useUnicode = true
  - connection.characterEncoding = UTF-8
So Hibernate has a problem filling in an entity's String from a varbinary column. What can we do? Is there a workaround? Or do we need to make all of our entity fields byte arrays instead of Strings (this does work, but it is far from ideal for us and is a last resort)?
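For reference, this is roughly what the byte-array fallback looks like, written as a plain Java sketch with no Hibernate dependency. The class, field, and accessor names are hypothetical. The idea is that only the byte[] property would be mapped to the varbinary column, while the String accessors convert explicitly with UTF-8 and would have to be excluded from the mapping (for example with @Transient if annotations are used).

```java
import java.nio.charset.StandardCharsets;

class Cat {
    // Hypothetical persistent property: the raw UTF-8 bytes of the name,
    // which is what would be mapped to the varbinary column.
    private byte[] nameBytes;

    public byte[] getNameBytes() { return nameBytes; }
    public void setNameBytes(byte[] nameBytes) { this.nameBytes = nameBytes; }

    // Convenience accessors for application code; not meant to be persisted.
    public String getName() {
        return nameBytes == null ? null : new String(nameBytes, StandardCharsets.UTF_8);
    }

    public void setName(String name) {
        this.nameBytes = name == null ? null : name.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Cat cat = new Cat();
        // A BMP name followed by U+20000, which needs 4 bytes in UTF-8.
        String name = "\uD558\uB298" + new String(Character.toChars(0x20000));
        cat.setName(name);
        System.out.println(cat.getNameBytes().length); // 10
        System.out.println(cat.getName().equals(name)); // true
    }
}
```

This keeps the conversion in one place, but every String field needs the same pair of accessors, which is the part we would like to avoid.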