Thursday, August 03, 2006

I'm getting different bytes when pulling a Shift_JIS string from a POST

I'm creating a servlet that receives Japanese characters in Shift_JIS character encoding. When I pull a parameter out of the request object, the bytes look right when I use GET but anomalous when I use POST. (Why I'm pulling out raw bytes is another story... a long one... anyways...)

If you're able to view Japanese characters, the String I'm supposed to be receiving is this:
ミーティングルーム (...don't ask me what it means, I don't know either.)
If I use GET, the bytes I receive are these (hex):

[83] [7e] [81] [5b] [83] [65] [83] [42] [83] [93] [83] [4f] [83] [8b] [81] [5b] [83] [80]
These are the values I expect from Shift_JIS encoding. However, using POST, I get these (hex):
[ff] [fd] [7e] [ff] [fd] [5b] [ff] [fd] [65] [ff] [fd] [42] [ff] [fd] [ff] [fd] [ff] [fd] [4f] [ff] [fd] [ff] [fd] [ff] [fd] [5b] [ff] [fd] [ff] [fd]
Does anyone here know why the POST bytes are the way they are?
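
For what it's worth, the POST dump has the shape you'd get if the Shift_JIS bytes were decoded with some other charset along the way: every byte of 0x80 or above becomes U+FFFD (the Unicode replacement character, which shows up as [ff] [fd]), while bytes below 0x80 pass through untouched. Here's a minimal Java sketch that reproduces that pattern. The class name `ShiftJisPostDemo` and its helper methods are made up for illustration, and decoding as UTF-8 is only an assumption; which charset the servlet container really applies to the POST body is not established here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ShiftJisPostDemo {
    // The expected string, written as Unicode escapes so the source file
    // encoding does not matter.
    static final String EXPECTED =
            "\u30df\u30fc\u30c6\u30a3\u30f3\u30b0\u30eb\u30fc\u30e0";

    // Encode the string the way a Shift_JIS client would send it.
    static byte[] shiftJisBytes() {
        return EXPECTED.getBytes(Charset.forName("Shift_JIS"));
    }

    // ASSUMPTION: something decodes the Shift_JIS bytes as UTF-8.
    // Malformed input is replaced with U+FFFD by the String constructor.
    static String misdecoded() {
        return new String(shiftJisBytes(), StandardCharsets.UTF_8);
    }

    // Hex dump of raw bytes, e.g. "[83] [7e]".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("[%02x]", b & 0xff));
        }
        return sb.toString();
    }

    // Hex dump of chars: two bytes for chars above 0xff, one byte otherwise.
    // This mimics the layout of the POST dump in this post.
    static String dumpChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) sb.append(' ');
            if (c > 0xff) {
                sb.append(String.format("[%02x] [%02x]", c >> 8, c & 0xff));
            } else {
                sb.append(String.format("[%02x]", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Shift_JIS bytes: " + hex(shiftJisBytes()));
        System.out.println("After bad decode: " + dumpChars(misdecoded()));
    }
}
```

If the second line printed matches the POST dump, the place to look is wherever the request body gets turned into a String, since the original bytes are already lost by the time the parameter comes out as U+FFFD.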

By the way, in case you were not able to view the Japanese characters, the Unicode code points (in hex) of the characters in the String I'm supposed to be receiving are these:
[30df] [30fc] [30c6] [30a3] [30f3] [30b0] [30eb] [30fc] [30e0] (You can tell I'm a newbie to charsets and encodings.)
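
Since code points and encoded bytes are easy to mix up, here's a small Java sketch that decodes the Shift_JIS bytes and prints both the Unicode code points and the UTF-8 bytes of the same string. The values listed above are code points; the UTF-8 encoding of those characters is a different byte sequence, three bytes per character. The class name `CodePointsVsBytes` and its helpers are hypothetical, and the sketch assumes a JDK that ships the Shift_JIS charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodePointsVsBytes {
    // Shift_JIS bytes for the nine katakana characters
    // (0x83 0x4f is the Shift_JIS encoding of U+30B0).
    static final byte[] SJIS = {
        (byte) 0x83, 0x7e, (byte) 0x81, 0x5b, (byte) 0x83, 0x65,
        (byte) 0x83, 0x42, (byte) 0x83, (byte) 0x93, (byte) 0x83, 0x4f,
        (byte) 0x83, (byte) 0x8b, (byte) 0x81, 0x5b, (byte) 0x83, (byte) 0x80
    };

    // Decode with the correct charset to get the real characters.
    static String decoded() {
        return new String(SJIS, Charset.forName("Shift_JIS"));
    }

    // Unicode code points, four hex digits each.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("[%04x]", cp));
        });
        return sb.toString();
    }

    // UTF-8 bytes of the same string, two hex digits each.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("[%02x]", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = decoded();
        System.out.println("Code points: " + codePoints(s));
        System.out.println("UTF-8 bytes: " + utf8Hex(s));
    }
}
```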

Looking forward to your insights! Thanks!
