“Remain calm, serene, always in command of yourself. You will then find out how easy it is to get along.”
-By Yogananda

Sunday, December 2, 2012

Encoding and Decoding
=================

You can use encoding to render a specific language properly in any browser. When you support encoding from your front end, make sure the following three areas all use the same encoding, in order to avoid issues later.

1. OS (Windows, Ubuntu, etc.)
2. Browser
3. Code (whole flow from JSP to the back end)

1. OS

Different operating systems use different default encodings. For example, Windows 7 with a Western locale defaults to cp1252, which is a Western character set, while Ubuntu defaults to UTF-8, which is a Unicode encoding. In Java, the JVM by default picks up its charset from the OS (since JDK 18 the default is UTF-8 regardless of the OS). You can change the JVM's charset by adding an environment variable. For example, to use UTF-8 on Windows:

Variable name  : JAVA_TOOL_OPTIONS
Variable value : -Dfile.encoding=UTF-8

FYI: Windows has supported Unicode internally for a long time (Windows XP included), but the default charset that the JVM picks up comes from the code page of the system locale. So you need to add this variable only if the JVM's default charset is different from UTF-8.

I have taken UTF-8 as the example because it is a Unicode encoding and it is what we support in our code most of the time. Keeping the whole flow on one single charset is important when it comes to encoding and decoding.
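To see why a single charset across the flow matters, here is a minimal sketch in Java. The class name `DefaultCharsetCheck` and the helper `roundTrip` are made up for this illustration. It prints the charset the JVM is currently using as its default, then shows what happens to "café" when it is encoded as UTF-8 but decoded as cp1252 (known to Java as `windows-1252`).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
public class DefaultCharsetCheck {

    // Encodes the text to bytes with one charset and decodes those bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // The charset the JVM is currently using as its default.
        System.out.println("Default charset: " + Charset.defaultCharset());

        Charset cp1252 = Charset.forName("windows-1252");
        String original = "caf\u00e9"; // "café"

        // Same charset on both sides: the text survives unchanged.
        String same = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(original.equals(same));

        // Mismatched charsets: the two UTF-8 bytes of "é" are read back
        // as two separate cp1252 characters, so the text no longer matches.
        String mixed = roundTrip(original, StandardCharsets.UTF_8, cp1252);
        System.out.println(original.equals(mixed));
    }
}
```

If the first line printed is not UTF-8 on your machine, that is the case where the JAVA_TOOL_OPTIONS setting above is worth adding.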

2. Browser

To view a particular language, you may have to switch to the correct encoding in the browser. When choosing the encoding, the browser gives precedence in this order:

          *  The encoding sent by the server (in a JSP, e.g. the encoding specified in the meta tag)
          *  The user's preference set in the browser

Since we specify UTF-8 or some other encoding in our JSPs, the browser tends to choose that encoding every time a page is loaded. To view the page in an encoding different from the one in the JSP's meta tag, the user would have to select it again on every page reload, which is a pain. So we have to support such encodings from the JSP itself. Only then can users carry on without facing any problems.

3. Code (whole flow from JSP to the back end)

Everywhere in the flow, encoding and decoding should use the correct charset. To make sure of that, we may have to develop the back end so that it encodes and decodes correctly at every step. For example:
Encoding in JSPs
Encoding of query parameters
Encoding of DB input values

Likewise, you will have to check each and every place where encoding or decoding is handled, in order to avoid losing data because the wrong encoding was used somewhere in the middle.
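As a rough sketch of the query parameter case, the Java snippet below encodes a value with UTF-8 and then decodes it twice: once with the same charset and once with a different one. The class `QueryParamEncoding` and its two helper methods are hypothetical names for this illustration; `URLEncoder` and `URLDecoder` are the standard `java.net` classes. On Java 10 and later, their overloads that take a `Charset` would avoid the checked exception handled here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of any library.
public class QueryParamEncoding {

    // Percent-encodes a query parameter value using the given charset name.
    static String encodeParam(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Decodes a percent-encoded value using the given charset name.
    static String decodeParam(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String name = "\u65e5\u672c\u8a9e"; // "日本語"

        String encoded = encodeParam(name, "UTF-8");
        System.out.println("name=" + encoded); // name=%E6%97%A5%E6%9C%AC%E8%AA%9E

        // Decoding with the charset used for encoding restores the value.
        System.out.println(name.equals(decodeParam(encoded, "UTF-8"))); // true

        // Decoding with a different charset silently produces the wrong text.
        System.out.println(name.equals(decodeParam(encoded, "ISO-8859-1"))); // false
    }
}
```

Note that the wrong decode does not throw an error. It just returns different characters, which is why a mismatch in the middle of the flow is easy to miss.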

This post is just to give an idea about encoding and its issues, so that we can think more carefully about supporting internationalization when developing a web site.