TechMasala - Technology Spice Rack

Archive for March 9, 2007

Unicode Transformation Formats

Text formats and the representation of language characters have been a focus ever since computers were invented. For obvious reasons, we wanted to interact with the computer in a language we understand rather than in binary. Initially, the focus was clearly on building a representation for English, the international language. But as the Internet has grown more sophisticated, global applications increasingly look for systems that let users work in their own locale (language, currency, date and time formats, etc.). As far as language is concerned, older formats such as ASCII and EBCDIC cannot represent the characters of the world's many languages.

The Unicode Consortium, a non-profit organization, developed the Unicode Standard and its Unicode Transformation Formats (UTF), which help in representing the characters of any language in the world. The Unicode Standard defines three encoding forms (UTF-8, UTF-16 and UTF-32) that allow the same data to be transmitted in a byte, word or double word oriented format (i.e. in 8, 16 or 32 bits per code unit). All three encoding forms encode the same common character repertoire and can be efficiently transformed into one another without loss of data. UTF-8 (Unicode Transformation Format 8) is the format most commonly used for web applications, that is, applications that use HTML for visual representation of text. “The Unicode® Standard: A Technical Introduction” on the Unicode site gives an introduction to the technical details of UTF.
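
To make the idea of three encoding forms concrete, here is a minimal Python sketch. The sample string and the helper names (`code_unit_count`, `transcode`) are made up for illustration and do not come from the Unicode documentation. It encodes the same text with 8, 16 and 32-bit code units, then converts between the forms to show that nothing is lost.

```python
# Hypothetical sample text: a Latin letter, a Devanagari letter,
# and an emoji that lies outside the Basic Multilingual Plane.
text = "A\u0928\U0001F600"

# Encoding form -> size of one code unit in bytes.
# The "-le" variants are used so Python does not prepend a byte order mark.
FORMS = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}


def code_unit_count(s: str, encoding: str) -> int:
    """Return how many code units the given encoding form needs for s."""
    return len(s.encode(encoding)) // FORMS[encoding]


def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Convert encoded bytes from one encoding form to another."""
    return data.decode(src).encode(dst)


counts = {name: code_unit_count(text, name) for name in FORMS}
print(counts)  # {'utf-8': 8, 'utf-16-le': 4, 'utf-32-le': 3}

# Round trip: UTF-8 -> UTF-16 -> UTF-32 -> UTF-8 gives back the same bytes.
utf8 = text.encode("utf-8")
utf16 = transcode(utf8, "utf-8", "utf-16-le")
utf32 = transcode(utf16, "utf-16-le", "utf-32-le")
round_tripped = transcode(utf32, "utf-32-le", "utf-8")
print(round_tripped == utf8)  # True
```

The same three characters take a different number of code units in each form, because UTF-8 and UTF-16 are variable-width while UTF-32 always uses one code unit per character. The underlying characters are identical in all three.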




Creative Commons License  This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.