Blog dedicated to Oracle Applications (E-Business Suite) Technology; covers Apps Architecture, Administration and third party bolt-ons to Apps

Tuesday, October 23, 2007

Difference between UTF8 and AL32UTF8

UTF8 and AL32UTF8 are Oracle database character sets that encode Unicode, so they cover the characters of all modern languages. Either one allows Oracle Applications to be run from a single database instance using any combination of supported languages. The advantage of AL32UTF8 over UTF8 is in the handling of supplementary characters, which are increasingly used in certain languages. AL32UTF8 is the current default database character set for Oracle databases 10g and 11g and Oracle E-Business Suite R12.
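
To make the supplementary-character difference concrete, here is a small Python sketch. It assumes that Oracle's UTF8 character set stores a supplementary character as a surrogate pair with each half written as three bytes (the scheme Unicode calls CESU-8), while AL32UTF8 uses the standard four-byte UTF-8 form. The function name oracle_utf8_style_encode is made up for this illustration and is not an Oracle API.

```python
def oracle_utf8_style_encode(text: str) -> bytes:
    """Encode text the CESU-8 way (assumed here to match Oracle UTF8).

    Characters up to U+FFFF are encoded exactly like standard UTF-8.
    Characters above U+FFFF are first split into a UTF-16 surrogate pair,
    and each surrogate is then written as its own 3-byte sequence.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            offset = cp - 0x10000
            high = 0xD800 + (offset >> 10)    # high (lead) surrogate
            low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
            for unit in (high, low):
                # surrogatepass lets Python emit the bytes of a lone surrogate
                out += chr(unit).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)


# U+20000 is a CJK ideograph outside the Basic Multilingual Plane.
supplementary = "\U00020000"
al32utf8_bytes = supplementary.encode("utf-8")            # standard UTF-8
utf8_bytes = oracle_utf8_style_encode(supplementary)      # CESU-8 style

print(len(al32utf8_bytes), al32utf8_bytes.hex())
print(len(utf8_bytes), utf8_bytes.hex())
```

For characters inside the Basic Multilingual Plane the two encodings produce identical bytes, which is why the difference only shows up with supplementary characters.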

If you happen to create a fresh Apps Production instance, be sure to select AL32UTF8 as the character set, provided all the clients and servers connecting to your instance are 9i or above. This is because 8i clients and servers have trouble connecting to AL32UTF8 databases, as per Metalink note 237593.1.

Caution:

AL32UTF8 is the Oracle Database character set that is appropriate for XMLType data. It is equivalent to the IANA registered standard UTF-8 encoding, which supports all valid XML characters.

Do not confuse Oracle Database database character set UTF8 (no hyphen) with database character set AL32UTF8 or with character encoding UTF-8. Database character set UTF8 has been superseded by AL32UTF8. Do not use UTF8 for XML data. UTF8 supports only Unicode version 3.1 and earlier; it does not support all valid XML characters. AL32UTF8 has no such limitation.
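
If you want to know whether your data would be affected, one rough check is to look for supplementary characters (code points above U+FFFF), since those are the ones UTF8 does not handle the way AL32UTF8 does. The Python sketch below is only an illustration under that assumption; supplementary_chars is a hypothetical helper, not part of any Oracle tool.

```python
def supplementary_chars(text: str) -> list:
    """Return (index, code point label) for every character above U+FFFF."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ord(ch) > 0xFFFF
    ]


# A document containing one supplementary CJK ideograph (U+20000).
with_supplementary = "<name>\U00020000</name>"
# A document containing only Basic Multilingual Plane characters.
bmp_only = "<name>caf\u00e9</name>"

print(supplementary_chars(with_supplementary))
print(supplementary_chars(bmp_only))
```

An empty result means the text contains nothing outside the Basic Multilingual Plane.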

Using database character set UTF8 for XML data could cause a fatal error or affect security negatively. If a character that is not supported by the database character set appears in an input-document element name, a replacement character (usually a question mark) is substituted for it. This will terminate parsing and raise an exception.
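
The failure described above can be imitated outside the database. The Python sketch below is an analogy only: it uses ASCII as a stand-in for a character set that lacks a character, lets the encoder substitute a question mark, and then feeds the result to a standard XML parser. The helper names store_in_limited_charset and parses are invented for this example.

```python
import xml.etree.ElementTree as ET


def store_in_limited_charset(xml_text: str, charset: str = "ascii") -> str:
    """Round-trip text through a charset, replacing unsupported characters with '?'."""
    return xml_text.encode(charset, errors="replace").decode(charset)


def parses(xml_text: str) -> bool:
    """Return True if the text is well-formed XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


original = "<caf\u00e9>1</caf\u00e9>"          # element name contains e-acute
damaged = store_in_limited_charset(original)   # e-acute is not in ASCII

print(damaged)
print(parses(original), parses(damaged))
```

The question mark is not a legal character in an XML element name, so the parser stops with an error instead of returning a document.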

7 comments:

Anonymous said...

Thanks, that helped :)

meitham said...

Thanks for the article. Could you please provide an example of a character supported by XML but not by the standard UTF-8 encoding of Unicode?

meitham said...

Thank you for the article. Could you please provide an example of a character supported by XML but not by the standard UTF-8 encoding of Unicode?

sreejith said...

Hi,

Could you please tell me whether the AL32UTF8 character set supports the pound symbol?

Unknown said...

Hi All,

I agree with this, but my job is still failing even though I used the same settings as above. The source data contains symbols such as theta and pi, and because of that the job fails while loading data from a flat file into the TD database.

Could you please suggest how I can resolve this issue?
I am using \006 as the delimiter with the UTF8 encoding page.

Thanks,
Srinivas

Unknown said...

Hi All,

I am facing the same type of issue while loading data from a flat file into the database.

We use ecap as the delimiter with the MS Windows Latin code page and \006 as the delimiter with the UTF8 encoding page. We are getting different symbols such as theta and pi. I am using the UTF8 encoding, but the job still fails because the data is breaking.
Could you please suggest how to resolve this issue?

kishore said...

This is a very good post... it's an aid to my understanding...