Showing posts with label Globalization Support. Show all posts

Wednesday, March 11, 2009

Difference between WE8MSWIN1252 and WE8ISO8859P15 characterset

The characters and code points of the Oracle database character set WE8ISO8859P15 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305176.aspx.

The characters and code points of the Oracle database character set WE8MSWIN1252 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305145.aspx.

Comparing WE8MSWIN1252 with WE8ISO8859P15, 27 code points (all in the range 0x80 to 0x9F) are not used in WE8ISO8859P15 but are filled in WE8MSWIN1252.

Every character that exists in WE8ISO8859P15 also exists in WE8MSWIN1252, so WE8MSWIN1252 is a logical superset of WE8ISO8859P15, but not a binary superset.

It is not a binary superset because 8 code points hold a different symbol in WE8MSWIN1252 than in WE8ISO8859P15 at the same physical code point.
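The counts can be cross-checked outside the database. Below is a minimal sketch, assuming Python's cp1252 and iso8859_15 codecs are acceptable stand-ins for WE8MSWIN1252 and WE8ISO8859P15 (Oracle's own conversion tables are not consulted here). The helper name decode_byte is made up for illustration. Python maps 0x80 to 0x9F in ISO 8859-15 to C1 control codes, which the sketch treats as the "not used" range.

```python
def decode_byte(value, codec):
    """Return the character a single byte maps to, or None if unassigned."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None

# Code points in 0x80-0x9F that cp1252 fills with a character.
filled_in_1252 = [b for b in range(0x80, 0xA0)
                  if decode_byte(b, "cp1252") is not None]

# Code points in 0xA0-0xFF where the two sets hold a different symbol.
different_symbol = [b for b in range(0xA0, 0x100)
                    if decode_byte(b, "cp1252") != decode_byte(b, "iso8859_15")]

# Logical superset check: collect every ISO 8859-15 character in 0xA0-0xFF
# that cannot be encoded in cp1252 (an empty list means none are missing).
missing_in_1252 = []
for b in range(0xA0, 0x100):
    ch = decode_byte(b, "iso8859_15")
    try:
        ch.encode("cp1252")
    except UnicodeEncodeError:
        missing_in_1252.append(ch)

print(len(filled_in_1252))
print([f"{b:02X}" for b in different_symbol])
print(missing_in_1252)
```

With Python's tables this reports 27 filled code points, the eight differing positions A4, A6, A8, B4, B8, BC, BD and BE, and no missing characters.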

Below is the list of all characters in both character sets along with their code points.


Hex  Unicode Char.   WE8ISO8859P15 (stated if different) Description
---- ------- ------- ----------------------------------- -----------

0x00 0x0000 [ ] [ ] NULL
0x01 0x0001 [ ] [ ] START OF HEADING
0x02 0x0002 [ ] [ ] START OF TEXT
0x03 0x0003 [ ] [ ] END OF TEXT
0x04 0x0004 [ ] [ ] END OF TRANSMISSION
0x05 0x0005 [ ] [ ] ENQUIRY
0x06 0x0006 [ ] [ ] ACKNOWLEDGE
0x07 0x0007 [ ] [ ] BELL
0x08 0x0008 [ ] [ ] BACKSPACE
0x09 0x0009 [ ] [ ] HORIZONTAL TABULATION
0x0A 0x000A [ ] [ ] LINE FEED
0x0B 0x000B [ ] [ ] VERTICAL TABULATION
0x0C 0x000C [ ] [ ] FORM FEED
0x0D 0x000D [ ] [ ] CARRIAGE RETURN
0x0E 0x000E [ ] [ ] SHIFT OUT
0x0F 0x000F [ ] [ ] SHIFT IN
0x10 0x0010 [ ] [ ] DATA LINK ESCAPE
0x11 0x0011 [ ] [ ] DEVICE CONTROL ONE
0x12 0x0012 [ ] [ ] DEVICE CONTROL TWO
0x13 0x0013 [ ] [ ] DEVICE CONTROL THREE
0x14 0x0014 [ ] [ ] DEVICE CONTROL FOUR
0x15 0x0015 [ ] [ ] NEGATIVE ACKNOWLEDGE
0x16 0x0016 [ ] [ ] SYNCHRONOUS IDLE
0x17 0x0017 [ ] [ ] END OF TRANSMISSION BLOCK
0x18 0x0018 [ ] [ ] CANCEL
0x19 0x0019 [ ] [ ] END OF MEDIUM
0x1A 0x001A [ ] [ ] SUBSTITUTE
0x1B 0x001B [ ] [ ] ESCAPE
0x1C 0x001C [ ] [ ] FILE SEPARATOR
0x1D 0x001D [ ] [ ] GROUP SEPARATOR
0x1E 0x001E [ ] [ ] RECORD SEPARATOR
0x1F 0x001F [ ] [ ] UNIT SEPARATOR
0x20 0x0020 [ ] [ ] SPACE
0x21 0x0021 [!] [!] EXCLAMATION MARK
0x22 0x0022 ["] ["] QUOTATION MARK
0x23 0x0023 [#] [#] NUMBER SIGN
0x24 0x0024 [$] [$] DOLLAR SIGN
0x25 0x0025 [%] [%] PERCENT SIGN
0x26 0x0026 [&] [&] AMPERSAND
0x27 0x0027 ['] ['] APOSTROPHE
0x28 0x0028 [(] [(] LEFT PARENTHESIS
0x29 0x0029 [)] [)] RIGHT PARENTHESIS
0x2A 0x002A [*] [*] ASTERISK
0x2B 0x002B [+] [+] PLUS SIGN
0x2C 0x002C [,] [,] COMMA
0x2D 0x002D [-] [-] HYPHEN-MINUS
0x2E 0x002E [.] [.] FULL STOP
0x2F 0x002F [/] [/] SOLIDUS
0x30 0x0030 [0] [0] DIGIT ZERO
0x31 0x0031 [1] [1] DIGIT ONE
0x32 0x0032 [2] [2] DIGIT TWO
0x33 0x0033 [3] [3] DIGIT THREE
0x34 0x0034 [4] [4] DIGIT FOUR
0x35 0x0035 [5] [5] DIGIT FIVE
0x36 0x0036 [6] [6] DIGIT SIX
0x37 0x0037 [7] [7] DIGIT SEVEN
0x38 0x0038 [8] [8] DIGIT EIGHT
0x39 0x0039 [9] [9] DIGIT NINE
0x3A 0x003A [:] [:] COLON
0x3B 0x003B [;] [;] SEMICOLON
0x3C 0x003C [<] [<] LESS-THAN SIGN
0x3D 0x003D [=] [=] EQUALS SIGN
0x3E 0x003E [>] [>] GREATER-THAN SIGN
0x3F 0x003F [?] [?] QUESTION MARK
0x40 0x0040 [@] [@] COMMERCIAL AT
0x41 0x0041 [A] [A] LATIN CAPITAL LETTER A
0x42 0x0042 [B] [B] LATIN CAPITAL LETTER B
0x43 0x0043 [C] [C] LATIN CAPITAL LETTER C
0x44 0x0044 [D] [D] LATIN CAPITAL LETTER D
0x45 0x0045 [E] [E] LATIN CAPITAL LETTER E
0x46 0x0046 [F] [F] LATIN CAPITAL LETTER F
0x47 0x0047 [G] [G] LATIN CAPITAL LETTER G
0x48 0x0048 [H] [H] LATIN CAPITAL LETTER H
0x49 0x0049 [I] [I] LATIN CAPITAL LETTER I
0x4A 0x004A [J] [J] LATIN CAPITAL LETTER J
0x4B 0x004B [K] [K] LATIN CAPITAL LETTER K
0x4C 0x004C [L] [L] LATIN CAPITAL LETTER L
0x4D 0x004D [M] [M] LATIN CAPITAL LETTER M
0x4E 0x004E [N] [N] LATIN CAPITAL LETTER N
0x4F 0x004F [O] [O] LATIN CAPITAL LETTER O
0x50 0x0050 [P] [P] LATIN CAPITAL LETTER P
0x51 0x0051 [Q] [Q] LATIN CAPITAL LETTER Q
0x52 0x0052 [R] [R] LATIN CAPITAL LETTER R
0x53 0x0053 [S] [S] LATIN CAPITAL LETTER S
0x54 0x0054 [T] [T] LATIN CAPITAL LETTER T
0x55 0x0055 [U] [U] LATIN CAPITAL LETTER U
0x56 0x0056 [V] [V] LATIN CAPITAL LETTER V
0x57 0x0057 [W] [W] LATIN CAPITAL LETTER W
0x58 0x0058 [X] [X] LATIN CAPITAL LETTER X
0x59 0x0059 [Y] [Y] LATIN CAPITAL LETTER Y
0x5A 0x005A [Z] [Z] LATIN CAPITAL LETTER Z
0x5B 0x005B [[] [[] LEFT SQUARE BRACKET
0x5C 0x005C [\] [\] REVERSE SOLIDUS
0x5D 0x005D []] []] RIGHT SQUARE BRACKET
0x5E 0x005E [^] [^] CIRCUMFLEX ACCENT
0x5F 0x005F [_] [_] LOW LINE
0x60 0x0060 [`] [`] GRAVE ACCENT
0x61 0x0061 [a] [a] LATIN SMALL LETTER A
0x62 0x0062 [b] [b] LATIN SMALL LETTER B
0x63 0x0063 [c] [c] LATIN SMALL LETTER C
0x64 0x0064 [d] [d] LATIN SMALL LETTER D
0x65 0x0065 [e] [e] LATIN SMALL LETTER E
0x66 0x0066 [f] [f] LATIN SMALL LETTER F
0x67 0x0067 [g] [g] LATIN SMALL LETTER G
0x68 0x0068 [h] [h] LATIN SMALL LETTER H
0x69 0x0069 [i] [i] LATIN SMALL LETTER I
0x6A 0x006A [j] [j] LATIN SMALL LETTER J
0x6B 0x006B [k] [k] LATIN SMALL LETTER K
0x6C 0x006C [l] [l] LATIN SMALL LETTER L
0x6D 0x006D [m] [m] LATIN SMALL LETTER M
0x6E 0x006E [n] [n] LATIN SMALL LETTER N
0x6F 0x006F [o] [o] LATIN SMALL LETTER O
0x70 0x0070 [p] [p] LATIN SMALL LETTER P
0x71 0x0071 [q] [q] LATIN SMALL LETTER Q
0x72 0x0072 [r] [r] LATIN SMALL LETTER R
0x73 0x0073 [s] [s] LATIN SMALL LETTER S
0x74 0x0074 [t] [t] LATIN SMALL LETTER T
0x75 0x0075 [u] [u] LATIN SMALL LETTER U
0x76 0x0076 [v] [v] LATIN SMALL LETTER V
0x77 0x0077 [w] [w] LATIN SMALL LETTER W
0x78 0x0078 [x] [x] LATIN SMALL LETTER X
0x79 0x0079 [y] [y] LATIN SMALL LETTER Y
0x7A 0x007A [z] [z] LATIN SMALL LETTER Z
0x7B 0x007B [{] [{] LEFT CURLY BRACKET
0x7C 0x007C [|] [|] VERTICAL LINE
0x7D 0x007D [}] [}] RIGHT CURLY BRACKET
0x7E 0x007E [~] [~] TILDE
0x7F 0x007F [ ] [ ] DELETE
0x80 0x20AC [€] [€] UNDEFINED EURO SIGN
0x81 [ ] [ ] UNDEFINED UNDEFINED
0x82 0x201A [‚] [‚] UNDEFINED SINGLE LOW-9 QUOTATION MARK
0x83 0x0192 [ƒ] [ƒ] UNDEFINED LATIN SMALL LETTER F WITH HOOK
0x84 0x201E [„] [„] UNDEFINED DOUBLE LOW-9 QUOTATION MARK
0x85 0x2026 […] […] UNDEFINED HORIZONTAL ELLIPSIS
0x86 0x2020 [†] [†] UNDEFINED DAGGER
0x87 0x2021 [‡] [‡] UNDEFINED DOUBLE DAGGER
0x88 0x02C6 [ˆ] [ˆ] UNDEFINED MODIFIER LETTER CIRCUMFLEX ACCENT
0x89 0x2030 [‰] [‰] UNDEFINED PER MILLE SIGN
0x8A 0x0160 [Š] [Š] UNDEFINED LATIN CAPITAL LETTER S WITH CARON
0x8B 0x2039 [‹] [‹] UNDEFINED SINGLE LEFT-POINTING ANGLE QUOTATION MARK
0x8C 0x0152 [Œ] [Œ] UNDEFINED LATIN CAPITAL LIGATURE OE
0x8D [ ] [ ] UNDEFINED UNDEFINED
0x8E 0x017D [Ž] [Ž] UNDEFINED LATIN CAPITAL LETTER Z WITH CARON
0x8F [ ] [ ] UNDEFINED UNDEFINED
0x90 [ ] [ ] UNDEFINED UNDEFINED
0x91 0x2018 [‘] [‘] UNDEFINED LEFT SINGLE QUOTATION MARK
0x92 0x2019 [’] [’] UNDEFINED RIGHT SINGLE QUOTATION MARK
0x93 0x201C [“] [“] UNDEFINED LEFT DOUBLE QUOTATION MARK
0x94 0x201D [”] [”] UNDEFINED RIGHT DOUBLE QUOTATION MARK
0x95 0x2022 [•] [•] UNDEFINED BULLET
0x96 0x2013 [–] [–] UNDEFINED EN DASH
0x97 0x2014 [—] [—] UNDEFINED EM DASH
0x98 0x02DC [˜] [˜] UNDEFINED SMALL TILDE
0x99 0x2122 [™] [™] UNDEFINED TRADE MARK SIGN
0x9A 0x0161 [š] [š] UNDEFINED LATIN SMALL LETTER S WITH CARON
0x9B 0x203A [›] [›] UNDEFINED SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
0x9C 0x0153 [œ] [œ] UNDEFINED LATIN SMALL LIGATURE OE
0x9D [ ] [ ] UNDEFINED UNDEFINED
0x9E 0x017E [ž] [ž] UNDEFINED LATIN SMALL LETTER Z WITH CARON
0x9F 0x0178 [Ÿ] [Ÿ] UNDEFINED LATIN CAPITAL LETTER Y WITH DIAERESIS
0xA0 0x00A0 [ ] [ ] NO-BREAK SPACE
0xA1 0x00A1 [¡] [¡] INVERTED EXCLAMATION MARK
0xA2 0x00A2 [¢] [¢] CENT SIGN
0xA3 0x00A3 [£] [£] POUND SIGN
0xA4 0x00A4 [¤] [¤] Euro Sign(€) MS1252 code point 80 CURRENCY SIGN
0xA5 0x00A5 [¥] [¥] YEN SIGN
0xA6 0x00A6 [¦] [¦] LATIN CAPITAL LETTER S WITH CARON(Š) 8A BROKEN BAR
0xA7 0x00A7 [§] [§] SECTION SIGN
0xA8 0x00A8 [¨] [¨] LATIN SMALL LETTER S WITH CARON(š) 9A DIAERESIS
0xA9 0x00A9 [©] [©] COPYRIGHT SIGN
0xAA 0x00AA [ª] [ª] FEMININE ORDINAL INDICATOR
0xAB 0x00AB [«] [«] LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0xAC 0x00AC [¬] [¬] NOT SIGN
0xAD 0x00AD [ ] [ ] SOFT HYPHEN
0xAE 0x00AE [®] [®] REGISTERED SIGN
0xAF 0x00AF [¯] [¯] MACRON
0xB0 0x00B0 [°] [°] DEGREE SIGN
0xB1 0x00B1 [±] [±] PLUS-MINUS SIGN
0xB2 0x00B2 [²] [²] SUPERSCRIPT TWO
0xB3 0x00B3 [³] [³] SUPERSCRIPT THREE
0xB4 0x00B4 [´] [´] LATIN CAPITAL LETTER Z WITH CARON(Ž) 8E ACUTE ACCENT
0xB5 0x00B5 [µ] [µ] MICRO SIGN
0xB6 0x00B6 [¶] [¶] PILCROW SIGN
0xB7 0x00B7 [·] [·] MIDDLE DOT
0xB8 0x00B8 [¸] [¸] LATIN SMALL LETTER Z WITH CARON(ž) 9E CEDILLA
0xB9 0x00B9 [¹] [¹] SUPERSCRIPT ONE
0xBA 0x00BA [º] [º] MASCULINE ORDINAL INDICATOR
0xBB 0x00BB [»] [»] RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0xBC 0x00BC [¼] [¼] LATIN CAPITAL LIGATURE OE(Œ) 8C VULGAR FRACTION ONE QUARTER
0xBD 0x00BD [½] [½] LATIN SMALL LIGATURE OE(œ) 9C VULGAR FRACTION ONE HALF
0xBE 0x00BE [¾] [¾] LATIN CAPITAL LETTER Y WITH DIAERESIS(Ÿ) 9F VULGAR FRACTION THREE QUARTERS
0xBF 0x00BF [¿] [¿] INVERTED QUESTION MARK
0xC0 0x00C0 [À] [À] LATIN CAPITAL LETTER A WITH GRAVE
0xC1 0x00C1 [Á] [Á] LATIN CAPITAL LETTER A WITH ACUTE
0xC2 0x00C2 [Â] [Â] LATIN CAPITAL LETTER A WITH CIRCUMFLEX
0xC3 0x00C3 [Ã] [Ã] LATIN CAPITAL LETTER A WITH TILDE
0xC4 0x00C4 [Ä] [Ä] LATIN CAPITAL LETTER A WITH DIAERESIS
0xC5 0x00C5 [Å] [Å] LATIN CAPITAL LETTER A WITH RING ABOVE
0xC6 0x00C6 [Æ] [Æ] LATIN CAPITAL LETTER AE
0xC7 0x00C7 [Ç] [Ç] LATIN CAPITAL LETTER C WITH CEDILLA
0xC8 0x00C8 [È] [È] LATIN CAPITAL LETTER E WITH GRAVE
0xC9 0x00C9 [É] [É] LATIN CAPITAL LETTER E WITH ACUTE
0xCA 0x00CA [Ê] [Ê] LATIN CAPITAL LETTER E WITH CIRCUMFLEX
0xCB 0x00CB [Ë] [Ë] LATIN CAPITAL LETTER E WITH DIAERESIS
0xCC 0x00CC [Ì] [Ì] LATIN CAPITAL LETTER I WITH GRAVE
0xCD 0x00CD [Í] [Í] LATIN CAPITAL LETTER I WITH ACUTE
0xCE 0x00CE [Î] [Î] LATIN CAPITAL LETTER I WITH CIRCUMFLEX
0xCF 0x00CF [Ï] [Ï] LATIN CAPITAL LETTER I WITH DIAERESIS
0xD0 0x00D0 [Ð] [Ð] LATIN CAPITAL LETTER ETH
0xD1 0x00D1 [Ñ] [Ñ] LATIN CAPITAL LETTER N WITH TILDE
0xD2 0x00D2 [Ò] [Ò] LATIN CAPITAL LETTER O WITH GRAVE
0xD3 0x00D3 [Ó] [Ó] LATIN CAPITAL LETTER O WITH ACUTE
0xD4 0x00D4 [Ô] [Ô] LATIN CAPITAL LETTER O WITH CIRCUMFLEX
0xD5 0x00D5 [Õ] [Õ] LATIN CAPITAL LETTER O WITH TILDE
0xD6 0x00D6 [Ö] [Ö] LATIN CAPITAL LETTER O WITH DIAERESIS
0xD7 0x00D7 [×] [×] MULTIPLICATION SIGN
0xD8 0x00D8 [Ø] [Ø] LATIN CAPITAL LETTER O WITH STROKE
0xD9 0x00D9 [Ù] [Ù] LATIN CAPITAL LETTER U WITH GRAVE
0xDA 0x00DA [Ú] [Ú] LATIN CAPITAL LETTER U WITH ACUTE
0xDB 0x00DB [Û] [Û] LATIN CAPITAL LETTER U WITH CIRCUMFLEX
0xDC 0x00DC [Ü] [Ü] LATIN CAPITAL LETTER U WITH DIAERESIS
0xDD 0x00DD [Ý] [Ý] LATIN CAPITAL LETTER Y WITH ACUTE
0xDE 0x00DE [Þ] [Þ] LATIN CAPITAL LETTER THORN
0xDF 0x00DF [ß] [ß] LATIN SMALL LETTER SHARP S
0xE0 0x00E0 [à] [à] LATIN SMALL LETTER A WITH GRAVE
0xE1 0x00E1 [á] [á] LATIN SMALL LETTER A WITH ACUTE
0xE2 0x00E2 [â] [â] LATIN SMALL LETTER A WITH CIRCUMFLEX
0xE3 0x00E3 [ã] [ã] LATIN SMALL LETTER A WITH TILDE
0xE4 0x00E4 [ä] [ä] LATIN SMALL LETTER A WITH DIAERESIS
0xE5 0x00E5 [å] [å] LATIN SMALL LETTER A WITH RING ABOVE
0xE6 0x00E6 [æ] [æ] LATIN SMALL LETTER AE
0xE7 0x00E7 [ç] [ç] LATIN SMALL LETTER C WITH CEDILLA
0xE8 0x00E8 [è] [è] LATIN SMALL LETTER E WITH GRAVE
0xE9 0x00E9 [é] [é] LATIN SMALL LETTER E WITH ACUTE
0xEA 0x00EA [ê] [ê] LATIN SMALL LETTER E WITH CIRCUMFLEX
0xEB 0x00EB [ë] [ë] LATIN SMALL LETTER E WITH DIAERESIS
0xEC 0x00EC [ì] [ì] LATIN SMALL LETTER I WITH GRAVE
0xED 0x00ED [í] [í] LATIN SMALL LETTER I WITH ACUTE
0xEE 0x00EE [î] [î] LATIN SMALL LETTER I WITH CIRCUMFLEX
0xEF 0x00EF [ï] [ï] LATIN SMALL LETTER I WITH DIAERESIS
0xF0 0x00F0 [ð] [ð] LATIN SMALL LETTER ETH
0xF1 0x00F1 [ñ] [ñ] LATIN SMALL LETTER N WITH TILDE
0xF2 0x00F2 [ò] [ò] LATIN SMALL LETTER O WITH GRAVE
0xF3 0x00F3 [ó] [ó] LATIN SMALL LETTER O WITH ACUTE
0xF4 0x00F4 [ô] [ô] LATIN SMALL LETTER O WITH CIRCUMFLEX
0xF5 0x00F5 [õ] [õ] LATIN SMALL LETTER O WITH TILDE
0xF6 0x00F6 [ö] [ö] LATIN SMALL LETTER O WITH DIAERESIS
0xF7 0x00F7 [÷] [÷] DIVISION SIGN
0xF8 0x00F8 [ø] [ø] LATIN SMALL LETTER O WITH STROKE
0xF9 0x00F9 [ù] [ù] LATIN SMALL LETTER U WITH GRAVE
0xFA 0x00FA [ú] [ú] LATIN SMALL LETTER U WITH ACUTE
0xFB 0x00FB [û] [û] LATIN SMALL LETTER U WITH CIRCUMFLEX
0xFC 0x00FC [ü] [ü] LATIN SMALL LETTER U WITH DIAERESIS
0xFD 0x00FD [ý] [ý] LATIN SMALL LETTER Y WITH ACUTE
0xFE 0x00FE [þ] [þ] LATIN SMALL LETTER THORN
0xFF 0x00FF [ÿ] [ÿ] LATIN SMALL LETTER Y WITH DIAERESIS


Related Documents

http://arjudba.blogspot.com/2009/03/difference-between-we8iso8859p1-and.html
http://arjudba.blogspot.com/2009/03/difference-between-we8iso8859p1-and_11.html

Difference between WE8ISO8859P1 and WE8ISO8859P15 characterset

The characters and code points of the Oracle database character set WE8ISO8859P1 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305167.aspx.

The characters and code points of the Oracle database character set WE8ISO8859P15 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305176.aspx.

The Oracle database character set WE8ISO8859P15 differs from WE8ISO8859P1 in only a few positions.

WE8ISO8859P15 introduces the euro sign and some national letters used in French and Finnish, and in exchange omits some rarely used special characters that existed in WE8ISO8859P1.

Below is the list of code positions at which WE8ISO8859P1 and WE8ISO8859P15 differ.


Code  | WE8ISO8859P1 (ISO Latin 1)    | WE8ISO8859P15 (ISO Latin 9)
(hex) | name                          | name
------+-------------------------------+------------------------------------------
A4    | general currency symbol (¤)   | euro sign (€)
A6    | broken vertical bar (¦)       | latin capital letter s with caron (Š)
A8    | umlaut (diaeresis) accent (¨) | latin small letter s with caron (š)
B4    | acute accent (´)              | latin capital letter z with caron (Ž)
B8    | cedilla (¸)                   | latin small letter z with caron (ž)
BC    | one fourth (one quarter) (¼)  | latin capital ligature oe (Œ)
BD    | one half (½)                  | latin small ligature oe (œ)
BE    | three quarters (¾)            | latin capital letter y with diaeresis (Ÿ)



Apart from the characters above and the undefined code points, every character in WE8ISO8859P15 has the same code point in WE8ISO8859P1.

Note that in both WE8ISO8859P15 and WE8ISO8859P1 the code points from 0x80 to 0x9F are undefined. So whenever you look for differences between the two, the undefined code points also appear in the list.

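SQL> set serveroutput on
declare
  ch varchar2(1);
begin
  -- Walk every single-byte code point; the loop index i is declared implicitly.
  for i in 0..255 loop
    ch := chr(i);
    -- convert(string, destination_charset, source_charset):
    -- read ch as WE8ISO8859P15 and convert it to WE8ISO8859P1.
    if convert(ch, 'WE8ISO8859P1', 'WE8ISO8859P15') != ch then
      dbms_output.put_line('Difference- Decimal:'|| i ||' Hexa:'|| to_char(i,'XXXX'));
    end if;
  end loop;
end;
/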

Difference- Decimal:128 Hexa: 80
Difference- Decimal:129 Hexa: 81
Difference- Decimal:130 Hexa: 82
Difference- Decimal:131 Hexa: 83
Difference- Decimal:132 Hexa: 84
Difference- Decimal:133 Hexa: 85
Difference- Decimal:134 Hexa: 86
Difference- Decimal:135 Hexa: 87
Difference- Decimal:136 Hexa: 88
Difference- Decimal:137 Hexa: 89
Difference- Decimal:138 Hexa: 8A
Difference- Decimal:139 Hexa: 8B
Difference- Decimal:140 Hexa: 8C
Difference- Decimal:141 Hexa: 8D
Difference- Decimal:142 Hexa: 8E
Difference- Decimal:143 Hexa: 8F
Difference- Decimal:144 Hexa: 90
Difference- Decimal:145 Hexa: 91
Difference- Decimal:146 Hexa: 92
Difference- Decimal:147 Hexa: 93
Difference- Decimal:148 Hexa: 94
Difference- Decimal:149 Hexa: 95
Difference- Decimal:150 Hexa: 96
Difference- Decimal:151 Hexa: 97
Difference- Decimal:152 Hexa: 98
Difference- Decimal:153 Hexa: 99
Difference- Decimal:154 Hexa: 9A
Difference- Decimal:155 Hexa: 9B
Difference- Decimal:156 Hexa: 9C
Difference- Decimal:157 Hexa: 9D
Difference- Decimal:158 Hexa: 9E
Difference- Decimal:159 Hexa: 9F
Difference- Decimal:164 Hexa: A4
Difference- Decimal:166 Hexa: A6
Difference- Decimal:168 Hexa: A8
Difference- Decimal:180 Hexa: B4
Difference- Decimal:184 Hexa: B8
Difference- Decimal:188 Hexa: BC
Difference- Decimal:189 Hexa: BD
Difference- Decimal:190 Hexa: BE

PL/SQL procedure successfully completed.

In both character sets all the code points from 0x80 to 0x9F are undefined. The remaining 8 code points hold different characters in the two character sets.
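The eight real differences can also be listed without a database. This is a minimal sketch, assuming Python's latin_1 and iso8859_15 codecs approximate WE8ISO8859P1 and WE8ISO8859P15; the function name differing_positions is hypothetical. Unlike the PL/SQL output above, the range 0x80 to 0x9F does not show up here, because Python maps those bytes to the same C1 control codes in both codecs.

```python
import unicodedata

def differing_positions(codec_a="latin_1", codec_b="iso8859_15"):
    """Yield (byte, char_a, char_b) wherever the two codecs disagree."""
    for value in range(256):
        raw = bytes([value])
        char_a = raw.decode(codec_a)
        char_b = raw.decode(codec_b)
        if char_a != char_b:
            yield value, char_a, char_b

rows = list(differing_positions())
for value, char_a, char_b in rows:
    # Decimal, hex, Latin-1 character, Latin-9 character, Unicode name of the latter.
    print(f"{value:3d}  {value:02X}  {char_a}  {char_b}  {unicodedata.name(char_b)}")
```

The decimal values printed correspond to the last eight lines of the PL/SQL output (164 through 190).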

Consequently, neither WE8ISO8859P1 nor WE8ISO8859P15 is a binary superset of the other.
Related Documents
Difference between WE8MSWIN1252 and WE8ISO8859P15 characterset
Difference between WE8ISO8859P1 and WE8MSWIN1252 characterset
CSSCAN fails with CSS-00151: failed to enumerate user tables CSS-00120

CSSCAN fails with error while loading shared libraries: libclntsh.so.10.1

How to run csscan in the background as a sysdba

CSSCAN fails with CSS-00107: Character set migration utility schema not installed
ORA-00904: "CNVTYPE" CSS-08888: failed to update conversion type
CSSCAN fails with ORA-00600, CSS-00152, CSS-00120

Tuesday, March 10, 2009

Difference between WE8ISO8859P1 and WE8MSWIN1252 characterset

The characters and code points of the Oracle database character set WE8ISO8859P1 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305167.aspx.

The characters and code points of the Oracle database character set WE8MSWIN1252 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305145.aspx.

  • Comparing the characters and code points of both character sets, every character defined in WE8ISO8859P1 also exists in WE8MSWIN1252, and WE8MSWIN1252 contains some additional characters. So WE8MSWIN1252 is a logical superset of WE8ISO8859P1.

  • In more detail, a total of 27 code points that are not used in WE8ISO8859P1 are filled in WE8MSWIN1252.

  • Also, no code point holds a different symbol in WE8MSWIN1252 than in WE8ISO8859P1, so WE8MSWIN1252 is a binary superset of WE8ISO8859P1.
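The two notions of superset used in these bullets can be written down as small predicates. The sketch below is only an illustration: it assumes Python's cp1252, latin_1 and iso8859_15 codecs stand in for the Oracle character sets, and it treats the C1 control codes U+0080 to U+009F as "not used", which is how this post counts them. All function names are made up for the example.

```python
def assigned(value, codec):
    """Character for a byte, or None if unassigned or a C1 control code."""
    try:
        ch = bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None
    # Assumption: U+0080-U+009F count as "not used", as in the post.
    return None if 0x80 <= ord(ch) <= 0x9F else ch

def is_binary_superset(big, small):
    """Every code point used in `small` holds the same character in `big`."""
    return all(assigned(b, small) is None
               or assigned(b, big) == assigned(b, small)
               for b in range(256))

def is_logical_superset(big, small):
    """Every character used in `small` exists at some code point in `big`."""
    big_chars = {assigned(b, big) for b in range(256)} - {None}
    small_chars = {assigned(b, small) for b in range(256)} - {None}
    return small_chars <= big_chars

print(is_logical_superset("cp1252", "latin_1"))
print(is_binary_superset("cp1252", "latin_1"))
print(is_binary_superset("cp1252", "iso8859_15"))
```

Under these assumptions cp1252 comes out as both a logical and a binary superset of latin_1, but only a logical superset of iso8859_15.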

In WE8MSWIN1252 the euro symbol (€) is defined at code point 80 (hexadecimal), whereas in WE8ISO8859P1 the euro symbol is not defined at all. Below is the list of characters that are defined in WE8MSWIN1252 but undefined in WE8ISO8859P1.

In the list,
Column #1 is the WE8MSWIN1252 code point in hexadecimal.
Column #2 is the Unicode code point in hexadecimal.
Column #3 is the character, displayed once by its numeric code and once by its literal value.
Column #4 is the description of the character.


WIN-1252 Unicode Characters Description
-------- ------- ---------- ----------------------------

0x80 0x20AC [€] [€] EURO SIGN
0x82 0x201A [‚] [‚] SINGLE LOW-9 QUOTATION MARK
0x83 0x0192 [ƒ] [ƒ] LATIN SMALL LETTER F WITH HOOK
0x84 0x201E [„] [„] DOUBLE LOW-9 QUOTATION MARK
0x85 0x2026 […] […] HORIZONTAL ELLIPSIS
0x86 0x2020 [†] [†] DAGGER
0x87 0x2021 [‡] [‡] DOUBLE DAGGER
0x88 0x02C6 [ˆ] [ˆ] MODIFIER LETTER CIRCUMFLEX ACCENT
0x89 0x2030 [‰] [‰] PER MILLE SIGN
0x8A 0x0160 [Š] [Š] LATIN CAPITAL LETTER S WITH CARON
0x8B 0x2039 [‹] [‹] SINGLE LEFT-POINTING ANGLE QUOTATION MARK
0x8C 0x0152 [Œ] [Œ] LATIN CAPITAL LIGATURE OE
0x8E 0x017D [Ž] [Ž] LATIN CAPITAL LETTER Z WITH CARON
0x91 0x2018 [‘] [‘] LEFT SINGLE QUOTATION MARK
0x92 0x2019 [’] [’] RIGHT SINGLE QUOTATION MARK
0x93 0x201C [“] [“] LEFT DOUBLE QUOTATION MARK
0x94 0x201D [”] [”] RIGHT DOUBLE QUOTATION MARK
0x95 0x2022 [•] [•] BULLET
0x96 0x2013 [–] [–] EN DASH
0x97 0x2014 [—] [—] EM DASH
0x98 0x02DC [˜] [˜] SMALL TILDE
0x99 0x2122 [™] [™] TRADE MARK SIGN
0x9A 0x0161 [š] [š] LATIN SMALL LETTER S WITH CARON
0x9B 0x203A [›] [›] SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
0x9C 0x0153 [œ] [œ] LATIN SMALL LIGATURE OE
0x9E 0x017E [ž] [ž] LATIN SMALL LETTER Z WITH CARON
0x9F 0x0178 [Ÿ] [Ÿ] LATIN CAPITAL LETTER Y WITH DIAERESIS
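A listing like this can be regenerated rather than typed by hand. The following is a rough sketch, assuming Python's cp1252 codec matches WE8MSWIN1252 in the range 0x80 to 0x9F and using the standard unicodedata module for the descriptions; the function name win1252_only_rows is hypothetical, and each character is printed once rather than twice.

```python
import unicodedata

def win1252_only_rows():
    """Build one text row per cp1252 character in the range 0x80-0x9F."""
    rows = []
    for value in range(0x80, 0xA0):
        try:
            ch = bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # unassigned in cp1252: 0x81, 0x8D, 0x8F, 0x90, 0x9D
        rows.append(f"0x{value:02X} 0x{ord(ch):04X} [{ch}] {unicodedata.name(ch)}")
    return rows

for row in win1252_only_rows():
    print(row)
```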

Below is the list of characters in WE8ISO8859P1 and WE8MSWIN1252 along with their code points.

Hex  Unicode Char.   WE8ISO8859P1 (stated if different) Description
---- ------- ------- ---------------------------------- -----------
0x00 0x0000 [ ] [ ] NULL
0x01 0x0001 [ ] [ ] START OF HEADING
0x02 0x0002 [ ] [ ] START OF TEXT
0x03 0x0003 [ ] [ ] END OF TEXT
0x04 0x0004 [ ] [ ] END OF TRANSMISSION
0x05 0x0005 [ ] [ ] ENQUIRY
0x06 0x0006 [ ] [ ] ACKNOWLEDGE
0x07 0x0007 [ ] [ ] BELL
0x08 0x0008 [ ] [ ] BACKSPACE
0x09 0x0009 [ ] [ ] HORIZONTAL TABULATION
0x0A 0x000A [ ] [ ] LINE FEED
0x0B 0x000B [ ] [ ] VERTICAL TABULATION
0x0C 0x000C [ ] [ ] FORM FEED
0x0D 0x000D [ ] [ ] CARRIAGE RETURN
0x0E 0x000E [ ] [ ] SHIFT OUT
0x0F 0x000F [ ] [ ] SHIFT IN
0x10 0x0010 [ ] [ ] DATA LINK ESCAPE
0x11 0x0011 [ ] [ ] DEVICE CONTROL ONE
0x12 0x0012 [ ] [ ] DEVICE CONTROL TWO
0x13 0x0013 [ ] [ ] DEVICE CONTROL THREE
0x14 0x0014 [ ] [ ] DEVICE CONTROL FOUR
0x15 0x0015 [ ] [ ] NEGATIVE ACKNOWLEDGE
0x16 0x0016 [ ] [ ] SYNCHRONOUS IDLE
0x17 0x0017 [ ] [ ] END OF TRANSMISSION BLOCK
0x18 0x0018 [ ] [ ] CANCEL
0x19 0x0019 [ ] [ ] END OF MEDIUM
0x1A 0x001A [ ] [ ] SUBSTITUTE
0x1B 0x001B [ ] [ ] ESCAPE
0x1C 0x001C [ ] [ ] FILE SEPARATOR
0x1D 0x001D [ ] [ ] GROUP SEPARATOR
0x1E 0x001E [ ] [ ] RECORD SEPARATOR
0x1F 0x001F [ ] [ ] UNIT SEPARATOR
0x20 0x0020 [ ] [ ] SPACE
0x21 0x0021 [!] [!] EXCLAMATION MARK
0x22 0x0022 ["] ["] QUOTATION MARK
0x23 0x0023 [#] [#] NUMBER SIGN
0x24 0x0024 [$] [$] DOLLAR SIGN
0x25 0x0025 [%] [%] PERCENT SIGN
0x26 0x0026 [&] [&] AMPERSAND
0x27 0x0027 ['] ['] APOSTROPHE
0x28 0x0028 [(] [(] LEFT PARENTHESIS
0x29 0x0029 [)] [)] RIGHT PARENTHESIS
0x2A 0x002A [*] [*] ASTERISK
0x2B 0x002B [+] [+] PLUS SIGN
0x2C 0x002C [,] [,] COMMA
0x2D 0x002D [-] [-] HYPHEN-MINUS
0x2E 0x002E [.] [.] FULL STOP
0x2F 0x002F [/] [/] SOLIDUS
0x30 0x0030 [0] [0] DIGIT ZERO
0x31 0x0031 [1] [1] DIGIT ONE
0x32 0x0032 [2] [2] DIGIT TWO
0x33 0x0033 [3] [3] DIGIT THREE
0x34 0x0034 [4] [4] DIGIT FOUR
0x35 0x0035 [5] [5] DIGIT FIVE
0x36 0x0036 [6] [6] DIGIT SIX
0x37 0x0037 [7] [7] DIGIT SEVEN
0x38 0x0038 [8] [8] DIGIT EIGHT
0x39 0x0039 [9] [9] DIGIT NINE
0x3A 0x003A [:] [:] COLON
0x3B 0x003B [;] [;] SEMICOLON
0x3C 0x003C [<] [<] LESS-THAN SIGN
0x3D 0x003D [=] [=] EQUALS SIGN
0x3E 0x003E [>] [>] GREATER-THAN SIGN
0x3F 0x003F [?] [?] QUESTION MARK
0x40 0x0040 [@] [@] COMMERCIAL AT
0x41 0x0041 [A] [A] LATIN CAPITAL LETTER A
0x42 0x0042 [B] [B] LATIN CAPITAL LETTER B
0x43 0x0043 [C] [C] LATIN CAPITAL LETTER C
0x44 0x0044 [D] [D] LATIN CAPITAL LETTER D
0x45 0x0045 [E] [E] LATIN CAPITAL LETTER E
0x46 0x0046 [F] [F] LATIN CAPITAL LETTER F
0x47 0x0047 [G] [G] LATIN CAPITAL LETTER G
0x48 0x0048 [H] [H] LATIN CAPITAL LETTER H
0x49 0x0049 [I] [I] LATIN CAPITAL LETTER I
0x4A 0x004A [J] [J] LATIN CAPITAL LETTER J
0x4B 0x004B [K] [K] LATIN CAPITAL LETTER K
0x4C 0x004C [L] [L] LATIN CAPITAL LETTER L
0x4D 0x004D [M] [M] LATIN CAPITAL LETTER M
0x4E 0x004E [N] [N] LATIN CAPITAL LETTER N
0x4F 0x004F [O] [O] LATIN CAPITAL LETTER O
0x50 0x0050 [P] [P] LATIN CAPITAL LETTER P
0x51 0x0051 [Q] [Q] LATIN CAPITAL LETTER Q
0x52 0x0052 [R] [R] LATIN CAPITAL LETTER R
0x53 0x0053 [S] [S] LATIN CAPITAL LETTER S
0x54 0x0054 [T] [T] LATIN CAPITAL LETTER T
0x55 0x0055 [U] [U] LATIN CAPITAL LETTER U
0x56 0x0056 [V] [V] LATIN CAPITAL LETTER V
0x57 0x0057 [W] [W] LATIN CAPITAL LETTER W
0x58 0x0058 [X] [X] LATIN CAPITAL LETTER X
0x59 0x0059 [Y] [Y] LATIN CAPITAL LETTER Y
0x5A 0x005A [Z] [Z] LATIN CAPITAL LETTER Z
0x5B 0x005B [[] [[] LEFT SQUARE BRACKET
0x5C 0x005C [\] [\] REVERSE SOLIDUS
0x5D 0x005D []] []] RIGHT SQUARE BRACKET
0x5E 0x005E [^] [^] CIRCUMFLEX ACCENT
0x5F 0x005F [_] [_] LOW LINE
0x60 0x0060 [`] [`] GRAVE ACCENT
0x61 0x0061 [a] [a] LATIN SMALL LETTER A
0x62 0x0062 [b] [b] LATIN SMALL LETTER B
0x63 0x0063 [c] [c] LATIN SMALL LETTER C
0x64 0x0064 [d] [d] LATIN SMALL LETTER D
0x65 0x0065 [e] [e] LATIN SMALL LETTER E
0x66 0x0066 [f] [f] LATIN SMALL LETTER F
0x67 0x0067 [g] [g] LATIN SMALL LETTER G
0x68 0x0068 [h] [h] LATIN SMALL LETTER H
0x69 0x0069 [i] [i] LATIN SMALL LETTER I
0x6A 0x006A [j] [j] LATIN SMALL LETTER J
0x6B 0x006B [k] [k] LATIN SMALL LETTER K
0x6C 0x006C [l] [l] LATIN SMALL LETTER L
0x6D 0x006D [m] [m] LATIN SMALL LETTER M
0x6E 0x006E [n] [n] LATIN SMALL LETTER N
0x6F 0x006F [o] [o] LATIN SMALL LETTER O
0x70 0x0070 [p] [p] LATIN SMALL LETTER P
0x71 0x0071 [q] [q] LATIN SMALL LETTER Q
0x72 0x0072 [r] [r] LATIN SMALL LETTER R
0x73 0x0073 [s] [s] LATIN SMALL LETTER S
0x74 0x0074 [t] [t] LATIN SMALL LETTER T
0x75 0x0075 [u] [u] LATIN SMALL LETTER U
0x76 0x0076 [v] [v] LATIN SMALL LETTER V
0x77 0x0077 [w] [w] LATIN SMALL LETTER W
0x78 0x0078 [x] [x] LATIN SMALL LETTER X
0x79 0x0079 [y] [y] LATIN SMALL LETTER Y
0x7A 0x007A [z] [z] LATIN SMALL LETTER Z
0x7B 0x007B [{] [{] LEFT CURLY BRACKET
0x7C 0x007C [|] [|] VERTICAL LINE
0x7D 0x007D [}] [}] RIGHT CURLY BRACKET
0x7E 0x007E [~] [~] TILDE
0x7F 0x007F [ ] [ ] DELETE
0x80 0x20AC [€] [€] UNDEFINED EURO SIGN
0x81 [ ] [ ] UNDEFINED UNDEFINED
0x82 0x201A [‚] [‚] UNDEFINED SINGLE LOW-9 QUOTATION MARK
0x83 0x0192 [ƒ] [ƒ] UNDEFINED LATIN SMALL LETTER F WITH HOOK
0x84 0x201E [„] [„] UNDEFINED DOUBLE LOW-9 QUOTATION MARK
0x85 0x2026 […] […] UNDEFINED HORIZONTAL ELLIPSIS
0x86 0x2020 [†] [†] UNDEFINED DAGGER
0x87 0x2021 [‡] [‡] UNDEFINED DOUBLE DAGGER
0x88 0x02C6 [ˆ] [ˆ] UNDEFINED MODIFIER LETTER CIRCUMFLEX ACCENT
0x89 0x2030 [‰] [‰] UNDEFINED PER MILLE SIGN
0x8A 0x0160 [Š] [Š] UNDEFINED LATIN CAPITAL LETTER S WITH CARON
0x8B 0x2039 [‹] [‹] UNDEFINED SINGLE LEFT-POINTING ANGLE QUOTATION MARK
0x8C 0x0152 [Œ] [Œ] UNDEFINED LATIN CAPITAL LIGATURE OE
0x8D [ ] [ ] UNDEFINED UNDEFINED
0x8E 0x017D [Ž] [Ž] UNDEFINED LATIN CAPITAL LETTER Z WITH CARON
0x8F [ ] [ ] UNDEFINED UNDEFINED
0x90 [ ] [ ] UNDEFINED UNDEFINED
0x91 0x2018 [‘] [‘] UNDEFINED LEFT SINGLE QUOTATION MARK
0x92 0x2019 [’] [’] UNDEFINED RIGHT SINGLE QUOTATION MARK
0x93 0x201C [“] [“] UNDEFINED LEFT DOUBLE QUOTATION MARK
0x94 0x201D [”] [”] UNDEFINED RIGHT DOUBLE QUOTATION MARK
0x95 0x2022 [•] [•] UNDEFINED BULLET
0x96 0x2013 [–] [–] UNDEFINED EN DASH
0x97 0x2014 [—] [—] UNDEFINED EM DASH
0x98 0x02DC [˜] [˜] UNDEFINED SMALL TILDE
0x99 0x2122 [™] [™] UNDEFINED TRADE MARK SIGN
0x9A 0x0161 [š] [š] UNDEFINED LATIN SMALL LETTER S WITH CARON
0x9B 0x203A [›] [›] UNDEFINED SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
0x9C 0x0153 [œ] [œ] UNDEFINED LATIN SMALL LIGATURE OE
0x9D [ ] [ ] UNDEFINED UNDEFINED
0x9E 0x017E [ž] [ž] UNDEFINED LATIN SMALL LETTER Z WITH CARON
0x9F 0x0178 [Ÿ] [Ÿ] UNDEFINED LATIN CAPITAL LETTER Y WITH DIAERESIS
0xA0 0x00A0 [ ] [ ] NO-BREAK SPACE
0xA1 0x00A1 [¡] [¡] INVERTED EXCLAMATION MARK
0xA2 0x00A2 [¢] [¢] CENT SIGN
0xA3 0x00A3 [£] [£] POUND SIGN
0xA4 0x00A4 [¤] [¤] CURRENCY SIGN
0xA5 0x00A5 [¥] [¥] YEN SIGN
0xA6 0x00A6 [¦] [¦] BROKEN BAR
0xA7 0x00A7 [§] [§] SECTION SIGN
0xA8 0x00A8 [¨] [¨] DIAERESIS
0xA9 0x00A9 [©] [©] COPYRIGHT SIGN
0xAA 0x00AA [ª] [ª] FEMININE ORDINAL INDICATOR
0xAB 0x00AB [«] [«] LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0xAC 0x00AC [¬] [¬] NOT SIGN
0xAD 0x00AD [ ] [ ] SOFT HYPHEN
0xAE 0x00AE [®] [®] REGISTERED SIGN
0xAF 0x00AF [¯] [¯] MACRON
0xB0 0x00B0 [°] [°] DEGREE SIGN
0xB1 0x00B1 [±] [±] PLUS-MINUS SIGN
0xB2 0x00B2 [²] [²] SUPERSCRIPT TWO
0xB3 0x00B3 [³] [³] SUPERSCRIPT THREE
0xB4 0x00B4 [´] [´] ACUTE ACCENT
0xB5 0x00B5 [µ] [µ] MICRO SIGN
0xB6 0x00B6 [¶] [¶] PILCROW SIGN
0xB7 0x00B7 [·] [·] MIDDLE DOT
0xB8 0x00B8 [¸] [¸] CEDILLA
0xB9 0x00B9 [¹] [¹] SUPERSCRIPT ONE
0xBA 0x00BA [º] [º] MASCULINE ORDINAL INDICATOR
0xBB 0x00BB [»] [»] RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0xBC 0x00BC [¼] [¼] VULGAR FRACTION ONE QUARTER
0xBD 0x00BD [½] [½] VULGAR FRACTION ONE HALF
0xBE 0x00BE [¾] [¾] VULGAR FRACTION THREE QUARTERS
0xBF 0x00BF [¿] [¿] INVERTED QUESTION MARK
0xC0 0x00C0 [À] [À] LATIN CAPITAL LETTER A WITH GRAVE
0xC1 0x00C1 [Á] [Á] LATIN CAPITAL LETTER A WITH ACUTE
0xC2 0x00C2 [Â] [Â] LATIN CAPITAL LETTER A WITH CIRCUMFLEX
0xC3 0x00C3 [Ã] [Ã] LATIN CAPITAL LETTER A WITH TILDE
0xC4 0x00C4 [Ä] [Ä] LATIN CAPITAL LETTER A WITH DIAERESIS
0xC5 0x00C5 [Å] [Å] LATIN CAPITAL LETTER A WITH RING ABOVE
0xC6 0x00C6 [Æ] [Æ] LATIN CAPITAL LETTER AE
0xC7 0x00C7 [Ç] [Ç] LATIN CAPITAL LETTER C WITH CEDILLA
0xC8 0x00C8 [È] [È] LATIN CAPITAL LETTER E WITH GRAVE
0xC9 0x00C9 [É] [É] LATIN CAPITAL LETTER E WITH ACUTE
0xCA 0x00CA [Ê] [Ê] LATIN CAPITAL LETTER E WITH CIRCUMFLEX
0xCB 0x00CB [Ë] [Ë] LATIN CAPITAL LETTER E WITH DIAERESIS
0xCC 0x00CC [Ì] [Ì] LATIN CAPITAL LETTER I WITH GRAVE
0xCD 0x00CD [Í] [Í] LATIN CAPITAL LETTER I WITH ACUTE
0xCE 0x00CE [Î] [Î] LATIN CAPITAL LETTER I WITH CIRCUMFLEX
0xCF 0x00CF [Ï] [Ï] LATIN CAPITAL LETTER I WITH DIAERESIS
0xD0 0x00D0 [Ð] [Ð] LATIN CAPITAL LETTER ETH
0xD1 0x00D1 [Ñ] [Ñ] LATIN CAPITAL LETTER N WITH TILDE
0xD2 0x00D2 [Ò] [Ò] LATIN CAPITAL LETTER O WITH GRAVE
0xD3 0x00D3 [Ó] [Ó] LATIN CAPITAL LETTER O WITH ACUTE
0xD4 0x00D4 [Ô] [Ô] LATIN CAPITAL LETTER O WITH CIRCUMFLEX
0xD5 0x00D5 [Õ] [Õ] LATIN CAPITAL LETTER O WITH TILDE
0xD6 0x00D6 [Ö] [Ö] LATIN CAPITAL LETTER O WITH DIAERESIS
0xD7 0x00D7 [×] [×] MULTIPLICATION SIGN
0xD8 0x00D8 [Ø] [Ø] LATIN CAPITAL LETTER O WITH STROKE
0xD9 0x00D9 [Ù] [Ù] LATIN CAPITAL LETTER U WITH GRAVE
0xDA 0x00DA [Ú] [Ú] LATIN CAPITAL LETTER U WITH ACUTE
0xDB 0x00DB [Û] [Û] LATIN CAPITAL LETTER U WITH CIRCUMFLEX
0xDC 0x00DC [Ü] [Ü] LATIN CAPITAL LETTER U WITH DIAERESIS
0xDD 0x00DD [Ý] [Ý] LATIN CAPITAL LETTER Y WITH ACUTE
0xDE 0x00DE [Þ] [Þ] LATIN CAPITAL LETTER THORN
0xDF 0x00DF [ß] [ß] LATIN SMALL LETTER SHARP S
0xE0 0x00E0 [à] [à] LATIN SMALL LETTER A WITH GRAVE
0xE1 0x00E1 [á] [á] LATIN SMALL LETTER A WITH ACUTE
0xE2 0x00E2 [â] [â] LATIN SMALL LETTER A WITH CIRCUMFLEX
0xE3 0x00E3 [ã] [ã] LATIN SMALL LETTER A WITH TILDE
0xE4 0x00E4 [ä] [ä] LATIN SMALL LETTER A WITH DIAERESIS
0xE5 0x00E5 [å] [å] LATIN SMALL LETTER A WITH RING ABOVE
0xE6 0x00E6 [æ] [æ] LATIN SMALL LETTER AE
0xE7 0x00E7 [ç] [ç] LATIN SMALL LETTER C WITH CEDILLA
0xE8 0x00E8 [è] [è] LATIN SMALL LETTER E WITH GRAVE
0xE9 0x00E9 [é] [é] LATIN SMALL LETTER E WITH ACUTE
0xEA 0x00EA [ê] [ê] LATIN SMALL LETTER E WITH CIRCUMFLEX
0xEB 0x00EB [ë] [ë] LATIN SMALL LETTER E WITH DIAERESIS
0xEC 0x00EC [ì] [ì] LATIN SMALL LETTER I WITH GRAVE
0xED 0x00ED [í] [í] LATIN SMALL LETTER I WITH ACUTE
0xEE 0x00EE [î] [î] LATIN SMALL LETTER I WITH CIRCUMFLEX
0xEF 0x00EF [ï] [ï] LATIN SMALL LETTER I WITH DIAERESIS
0xF0 0x00F0 [ð] [ð] LATIN SMALL LETTER ETH
0xF1 0x00F1 [ñ] [ñ] LATIN SMALL LETTER N WITH TILDE
0xF2 0x00F2 [ò] [ò] LATIN SMALL LETTER O WITH GRAVE
0xF3 0x00F3 [ó] [ó] LATIN SMALL LETTER O WITH ACUTE
0xF4 0x00F4 [ô] [ô] LATIN SMALL LETTER O WITH CIRCUMFLEX
0xF5 0x00F5 [õ] [õ] LATIN SMALL LETTER O WITH TILDE
0xF6 0x00F6 [ö] [ö] LATIN SMALL LETTER O WITH DIAERESIS
0xF7 0x00F7 [÷] [÷] DIVISION SIGN
0xF8 0x00F8 [ø] [ø] LATIN SMALL LETTER O WITH STROKE
0xF9 0x00F9 [ù] [ù] LATIN SMALL LETTER U WITH GRAVE
0xFA 0x00FA [ú] [ú] LATIN SMALL LETTER U WITH ACUTE
0xFB 0x00FB [û] [û] LATIN SMALL LETTER U WITH CIRCUMFLEX
0xFC 0x00FC [ü] [ü] LATIN SMALL LETTER U WITH DIAERESIS
0xFD 0x00FD [ý] [ý] LATIN SMALL LETTER Y WITH ACUTE
0xFE 0x00FE [þ] [þ] LATIN SMALL LETTER THORN
0xFF 0x00FF [ÿ] [ÿ] LATIN SMALL LETTER Y WITH DIAERESIS
Related Documents
CSSCAN fails with CSS-00151: failed to enumerate user tables CSS-00120
CSSCAN fails with error while loading shared libraries: libclntsh.so.10.1
How to run csscan in the background as a sysdba
CSSCAN fails with CSS-00107: Character set migration utility schema not installed
ORA-00904: "CNVTYPE" CSS-08888: failed to update conversion type
Difference between WE8ISO8859P1 and WE8ISO8859P15 characterset
Difference between WE8MSWIN1252 and WE8ISO8859P15 characterset

Saturday, March 7, 2009

CSSCAN fails with ORA-00600, CSS-00152, CSS-00120

Problem Description
While running csscan, it fails with the error messages ORA-00600, CSS-00152: failed to enumerate all tables, and CSS-00120, as shown below.

$ csscan system/a FULL=Y FROMCHAR=WE8ISO8859P1 TOCHAR=WE8MSWIN1252 LOG=csscanwin1252 ARRAY=1000000 PROCESS=2



Character Set Scanner v2.1 : Release 10.2.0.0.0 - Production on Sat Mar 7 21:10:05 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options

Enumerating tables to scan...
Warning: Entry/Exit code is optimized. Cannot restore context (UNWIND 22)

ORA-00600: internal error code, arguments: [15160], [], [], [], [], [], [], []
CSS-00152: failed to enumerate all tables
CSS-00120: failed to enumerate tables to scan

Scanner terminated unsuccessfully.

Cause of the Problem
The scan fails because there are tables in the recycle bin.

Solution of the Problem
1)Purge Recyclebin Objects: Query dba_recyclebin and make sure you no longer need those objects. If you don't, purge them. To do so, connect as sys as sysdba and issue,

SQL>conn sys as sysdba

SQL>purge dba_recyclebin;


2)Run csscan again.
$ csscan system/a FULL=Y FROMCHAR=WE8ISO8859P1 TOCHAR=WE8MSWIN1252 LOG=csscanwin1252 ARRAY=1000000 PROCESS=2


Related Documents
CSSCAN fails with CSS-00151: failed to enumerate user tables CSS-00120
CSSCAN fails with error while loading shared libraries: libclntsh.so.10.1
How to run csscan in the background as a sysdba
CSSCAN fails with CSS-00107: Character set migration utility schema not installed
ORA-00904: "CNVTYPE" CSS-08888: failed to update conversion type

CSSCAN fails with error while loading shared libraries: libclntsh.so.10.1

Problem Description
While running csscan to check all character data in the database and to test the effects and problems of changing the character set, it fails with "error while loading shared libraries: libclntsh.so.10.1: cannot open shared object file", as below.

[oracle@dbsoft ~]$ csscan sys/a as sysdba full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
csscan: error while loading shared libraries: libclntsh.so.10.1: cannot open shared object file: No such file or directory

Cause and solution of the Problem
The problem happens because the LD_LIBRARY_PATH environment variable is not set or does not include the Oracle library directory. Setting the variable properly solves the problem. On my 32-bit Red Hat Linux system, setting
$export LD_LIBRARY_PATH=$ORACLE_HOME/lib
solves the problem.
Details about this problem are discussed at,

http://arjudba.blogspot.com/2008/09/on-solaris-64-bit-rman-fails-with.html

How to run csscan in the background as a sysdba

With a simple command,
csscan system/test full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
you can run csscan to check all character data in the database and to test the effects and problems of changing the character set encoding.

By default csscan runs in the foreground, so if you exit the terminal from which you started it, csscan stops too. This is painful when you run csscan on a remote computer via ssh or other terminal software, because you can't guarantee network connectivity. If the network drops, your terminal session ends and csscan ends with it.

The unix nohup tool solves this problem. With nohup we can run the process in the background and send its output to a text file, so the process keeps running after we exit the terminal. After hours or days we can check whether it has completed.

To run csscan in the background, issue the following command,
$nohup csscan system/a full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4 &

Note that you have to append an ampersand at the end to send the process to the background.

Later we can check the status of csscan with,
$ps -ef |grep csscan
to see whether the scan has completed.

A character set scan needs to read the full database, and sys is the most powerful user, so to access everything always run csscan as "sys as sysdba". Oracle also recommends running csscan as the sys user. However, when running csscan as sys you might face difficulties such as the following:

[oracle@dbsoft ~]$ csscan sys/a as sysdba full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
LRM-00108: invalid positional parameter value 'as'
failed to process command line parameters

Scanner terminated unsuccessfully.

[oracle@dbsoft ~]$ csscan userid="sys/a as sysdba" full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
[1] 8042
LRM-00112: multiple values not allowed for parameter 'userid'

[oracle@dbsoft ~]$ csscan sys/a full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4


Character Set Scanner v2.1 : Release 10.2.0.0.0 - Production on Sat Mar 7 18:03:59 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.


ORA-28009: connection as SYS should be as SYSDBA or SYSOPER

Scanner terminated unsuccessfully.
You can avoid the last error by following http://arjudba.blogspot.com/2008/05/ora-28009-connection-as-sys-should-be.html, that is, by setting O7_DICTIONARY_ACCESSIBILITY=TRUE, but this is not recommended.

So the question is how to run csscan as "sys as sysdba" and still have the process run in the background. The steps are below.

Step 01: Run csscan with nohup, but without the userid parameter (username and password).
$ nohup csscan <All options except username/password go here>

For example:
$nohup csscan full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4

Note that at the end there is no ampersand.

Step 02: Press the Enter key.

Step 03: At this step all terminal output is being redirected to nohup.out, so you can't see it, but your terminal is waiting for a username and password.
So give the password of sys and connect as sysdba.
For example, enter the following,
sys/a as sysdba
where the password of user sys is a.

Step 04: Press the Enter key.

Step 05: At this stage csscan should be running in the foreground, with all terminal output redirected to nohup.out.

The terminal is still there but it accepts no commands. Just press ctrl+z.

Step 06: In the shell prompt type,
bg

That's it. csscan now runs in the background, and you may close the current window, disconnect from the network or log off the terminal. The process will keep running and its terminal output goes to the file nohup.out. Check the status of the process with,

$ps -ef |grep csscan

Note: Make sure that you type "bg" at the shell after ctrl+z. If you don't, the process remains suspended and does nothing.

Here is a sample session from my terminal.

[oracle@dbsoft ~]$ nohup csscan full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
nohup: appending output to `nohup.out'
sys/a as sysdba
bg


After this I clicked the close button to close the window.
In a new session I logged in and found the progress in nohup.out.
[oracle@dbsoft ~]$ cat nohup.out
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

Enumerating tables to scan...

. process 2 scanning MAXIMSG.SUBSCRIBERS_CARDS[AAAM6NAAGAAAXwJAAA]
. process 1 scanning MAXIMSG.FIRST_LEG_ACC[AAAM4IAAGAAAD+JAAA]


There is also an alternative: you can do the same thing in a single command by escaping the quote characters around the userid value, like below.

$nohup $ORACLE_HOME/bin/csscan userid=\'sys/a as sysdba\' full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4 &
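
To see why the quoting matters, the sketch below uses Python's `shlex.split`, which follows POSIX shell word-splitting rules, to show the arguments the shell would hand to csscan for three of the command lines tried in this post (shortened to the relevant options). How csscan then interprets those arguments is not modelled here; only the shell's splitting is. The dictionary keys and function name are made up for illustration.

```python
import shlex

# Command lines from the post, as they would be typed at the shell prompt.
COMMANDS = {
    "bare": "csscan sys/a as sysdba full=y",
    "double_quoted": 'csscan userid="sys/a as sysdba" full=y',
    "escaped_quotes": r"csscan userid=\'sys/a as sysdba\' full=y",
}


def shell_words(command_line):
    """Split a command line into arguments the way a POSIX shell would."""
    return shlex.split(command_line)


if __name__ == "__main__":
    for name, cmd in COMMANDS.items():
        print(name, shell_words(cmd))
```

With the bare form, `as` arrives as its own positional argument, which fits the LRM-00108 message. With double quotes the shell strips the quotes and passes a single argument that contains spaces. With `\'` the quote characters themselves reach csscan, which is presumably what lets its parameter parser treat the three words as one quoted value.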

Enjoy the post. Keep reading my blog.

Related Documents
CSSCAN fails with CSS-00151: failed to enumerate user tables CSS-00120
CSSCAN fails with error while loading shared libraries: libclntsh.so.10.1
CSSCAN fails with CSS-00107: Character set migration utility schema not installed
ORA-00904: "CNVTYPE" CSS-08888: failed to update conversion type
CSSCAN fails with ORA-00600, CSS-00152, CSS-00120

Wednesday, March 4, 2009

Unicode characterset in Oracle database.

Before starting this post let's get an idea of what Unicode is. Unicode is a universal encoding scheme designed to include far more characters than an ordinary character set; in fact, Unicode aims to list ALL characters. So, with Unicode support in Oracle, data in any language can be stored in and retrieved from the database.

Oracle has supported Unicode through several character sets, starting with Oracle 7.

Below is the list of character sets used to support Unicode in Oracle.

1)AL24UTFFSS: This was the first Unicode character set supported by Oracle. The AL24UTFFSS encoding scheme was based on the Unicode 1.1 standard, which is now obsolete. This character set was used from Oracle 7.2 to 8.1.


2)UTF8:
UTF8 was the UTF-8 encoded character set in Oracle8 and 8i. It followed the Unicode 2.1 standard between Oracle 8.0 and 8.1.6, and was upgraded to Unicode version 3.0 for Oracle versions 8.1.7, 9i, 10g and 11g. If a supplementary character is inserted into a UTF8 database, it is treated as 2 separate undefined characters, occupying 6 bytes in storage. So for full support of supplementary characters use the AL32UTF8 character set instead of UTF8.

3)UTFE: UTFE is the Unicode character set for EBCDIC platforms. It has the same properties as UTF8 has on ASCII-based platforms, and like UTF8 it has been carried through several Oracle versions.


4)AL32UTF8:
This is the UTF-8 encoded character set introduced in Oracle9i.
In Oracle 9.2 AL32UTF8 implemented unicode 3.1,
in 10.1 it implemented the Unicode 3.2 standard,
in Oracle 10.2 it supports the Unicode 4.01 standard and
in Oracle 11.1 it supports the Unicode 5.0.


AL32UTF8 was introduced to provide support for the newly defined supplementary characters. All supplementary characters are stored as 4 bytes in AL32UTF8. Because there was no concept of supplementary characters when UTF8 was designed, UTF8 has a maximum of 3 bytes per character.

5)AL16UTF16: This is the first UTF-16 encoded character set in Oracle. It was introduced in Oracle9i as the default national character set (NLS_NCHAR_CHARACTERSET). It also provides support for the newly defined supplementary characters. All supplementary characters are stored as 4 bytes.
As with AL32UTF8, the plan is to keep enhancing AL16UTF16 as necessary to support future versions of the Unicode standard.

AL16UTF16 cannot be used as a database character set (NLS_CHARACTERSET); it can only be used as the national character set (NLS_NCHAR_CHARACTERSET).

As with AL32UTF8, the supported Unicode version depends on the release:
In Oracle 9.0 AL16UTF16 implemented unicode 3.0,
in Oracle 9.2 it implemented unicode 3.1,
in 10.1 it implemented the Unicode 3.2 standard,
in Oracle 10.2 it supports the Unicode 4.01 standard and
in Oracle 11.1 it supports the Unicode 5.0.
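
The byte counts quoted above for supplementary characters can be reproduced with a small Python sketch. It assumes that Oracle's UTF8 stores a supplementary character as a surrogate pair with each half encoded separately (the scheme usually called CESU-8), that AL32UTF8 matches standard UTF-8, and that AL16UTF16 matches big-endian UTF-16. The sample character and the function names are illustrative only.

```python
# Storage size of one supplementary character under three encodings.
SAMPLE = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the Basic Multilingual Plane


def to_surrogate_pair(ch):
    """Split a supplementary character into its UTF-16 high/low surrogates."""
    v = ord(ch) - 0x10000
    return chr(0xD800 + (v >> 10)), chr(0xDC00 + (v & 0x3FF))


def utf8_style_bytes(ch):
    """Encode each surrogate on its own, 3 bytes each (assumed UTF8 behaviour)."""
    high, low = to_surrogate_pair(ch)
    return (high + low).encode("utf-8", "surrogatepass")


def al32utf8_style_bytes(ch):
    """Standard UTF-8: one 4-byte sequence (assumed AL32UTF8 behaviour)."""
    return ch.encode("utf-8")


def al16utf16_style_bytes(ch):
    """Big-endian UTF-16: one surrogate pair, 2 + 2 bytes (assumed AL16UTF16 behaviour)."""
    return ch.encode("utf-16-be")


if __name__ == "__main__":
    print("UTF8      :", len(utf8_style_bytes(SAMPLE)), "bytes")
    print("AL32UTF8  :", len(al32utf8_style_bytes(SAMPLE)), "bytes")
    print("AL16UTF16 :", len(al16utf16_style_bytes(SAMPLE)), "bytes")
```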


Related Documents
What is NLS_LANG environmental variable?
What is database character set and how to check it
Different ways to set up NLS parameters
What is national character set / NLS_NCHAR_CHARACTERSET?
Which datatypes use the National Character Set?
What is character set and character set encoding

Saturday, February 28, 2009

What is NLS_LANG environmental variable?

NLS_LANG is a client-side environment variable. Setting it is the simplest way to specify locale behavior.

Setting NLS_LANG on the client machine specifies the language, territory and character set used by the client application. Because NLS_LANG also specifies the client character set, Oracle knows which character set is used for data entered or displayed by a client program, and can convert (if needed) between the client's character set and the database character set.


On a UNIX machine NLS_LANG is an environment variable; on a Windows machine its value comes from the registry settings.

NLS_LANG has the following format.
NLS_LANG=[Language]_[Territory].[client character set]
The default value of NLS_LANG is AMERICAN_AMERICA.US7ASCII, which indicates that
the language is AMERICAN,
the territory is AMERICA, and
the character set is US7ASCII.


The first part of NLS_LANG is the language. It is used for Oracle Database messages, sorting, day names, and month names. Each language has a unique name.
The language specifies default values for territory and character set, so if the language is specified the other two parts can be omitted. The language can have a value like AMERICAN, GERMAN, FRENCH, JAPANESE etc. The default value is AMERICAN.

The second part of NLS_LANG is the territory. It is used for default date, monetary, and numeric formats. Each territory has a unique name. The territory can have a value like AMERICA, FRANCE, JAPAN, CANADA etc. If the territory is not specified, its value is derived from the language value.

The third part of NLS_LANG is the client character set. It specifies the character set that is used by the client application. The client character set used for Oracle should be equivalent to the character set supported on the client machine. This character set should also be equivalent to, or a subset of, the character set used for your database so that every character input through the terminal has a matching character to map to in the database. Examples of client character sets are US7ASCII, WE8ISO8859P1, WE8DEC, WE8MSWIN1252 etc.

It is important to note that all three parts of NLS_LANG are optional. If a part is not specified, a default value is used, which may be a derived value. You can specify the territory and/or the character set without a language; in this case you must include the preceding delimiter: underscore (_) for territory and period (.) for character set. If you don't include the delimiter, the whole value is parsed as a language name.

For example, you can set only the territory portion with,
NLS_LANG=_FRANCE

You can set only the client character set portion with,
NLS_LANG=.WE8MSWIN1252
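
The role of the two delimiters can be sketched in a few lines of Python. This is only a simplified illustration of the format described above, not Oracle's actual parser; the function name `parse_nls_lang` is made up, and parts that are not given come back as None rather than as Oracle's derived defaults.

```python
def parse_nls_lang(value):
    """Split an NLS_LANG string into (language, territory, charset).

    Simplified illustration only; missing parts are returned as None.
    """
    rest, _, charset = value.partition(".")       # charset follows the period
    language, _, territory = rest.partition("_")  # territory follows the underscore
    return (language or None, territory or None, charset or None)


if __name__ == "__main__":
    for setting in ("AMERICAN_AMERICA.US7ASCII", "_FRANCE", ".WE8MSWIN1252", "FRANCE"):
        print(setting, "->", parse_nls_lang(setting))
```

The last call shows the pitfall mentioned above: without the leading underscore, FRANCE lands in the language slot.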

The three parts of NLS_LANG can be combined in many ways, but not every combination works properly. For example,
NLS_LANG = JAPANESE_JAPAN.WE8ISO8859P1

This combination will not work properly, because it tries to support Japanese using a Western European character set, and the WE8ISO8859P1 character set does not contain any Japanese characters.

So if you set your NLS_LANG environment variable as above, you can't store or display Japanese characters.
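
The point about JAPANESE_JAPAN.WE8ISO8859P1 can be illustrated outside Oracle with Python's codecs. The sketch assumes that Python's `iso8859_1` and `euc_jp` codecs correspond to WE8ISO8859P1 and JA16EUC; that mapping and the helper name `can_encode` are assumptions made for illustration.

```python
def can_encode(text, codec):
    """Return True if every character of text exists in the given character set."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    japanese = "\u65e5\u672c\u8a9e"  # the word "Japanese" written in Japanese
    print("iso8859_1:", can_encode(japanese, "iso8859_1"))
    print("euc_jp   :", can_encode(japanese, "euc_jp"))
```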

Some logical combinations:
NLS_LANG = AMERICAN_AMERICA.WE8MSWIN1252
NLS_LANG = FRENCH_CANADA.WE8ISO8859P1
NLS_LANG = JAPANESE_JAPAN.JA16EUC


On the server machine there is no need to set the NLS_LANG environment variable; it is only needed on the client machine. The character set defined in NLS_LANG should be a subset of, or equal to, the database character set, so that every client character has a match in the database and Oracle can convert the client character set correctly. It is also important that the character set value of NLS_LANG reflects the character set the client machine actually supports, so that the client can display the data properly. For example, if Japanese character support is not installed on the client machine but NLS_LANG is set to JAPANESE_JAPAN.JA16EUC, the client will not be able to display Japanese characters properly.

Important Notes About NLS_LANG Parameter
1)NLS_LANG lets Oracle know what character set the client's OS is using so that Oracle can convert (if needed) from the client's character set to the database character set.

2)NLS_LANG does not need to be the same as the database character set.

3)The character set defined in NLS_LANG does not change your client's character set. You cannot change the character set of your client by using a different NLS_LANG setting. NLS_LANG only lets Oracle know what character set you are using on the client side.

4)If you don't set NLS_LANG on the client, it does not use the NLS_LANG of the server. Instead the default NLS_LANG described earlier in this post is used.

5)If the character set in NLS_LANG matches the database character set, Oracle performs no conversion or validation on the data, so an incorrect NLS_LANG setting may let garbage data enter the database.

Related Documents
Unicode characterset in Oracle database.
What is database character set and how to check it
Different ways to set up NLS parameters
What is national character set / NLS_NCHAR_CHARACTERSET?
Which datatypes use the National Character Set?
What is character set and character set encoding

Different ways to set up NLS parameters

NLS stands for National Language Support. The NLS_* parameters determine the locale-specific behavior on both the client and the server, where the * in NLS_* stands for the various suffixes that make up the individual NLS parameters.

There are many NLS_* parameters, like NLS_SORT, NLS_LANGUAGE, NLS_CHARACTERSET, NLS_DATE_LANGUAGE etc. In this post I will show the ways NLS parameters can be set, in order of priority.

1)In SQL functions:
If you set NLS_* parameters inside SQL functions, that setting has the highest priority.

You can set them in SQL functions like,
TO_CHAR(sysdate, 'DD/MON/YYYY', 'nls_date_language = FRENCH')

Below is an example. Note that French is not installed on my client machine, so the output might not display properly.

SQL> select sysdate from dual;

SYSDATE
---------
07-FEB-09

SQL> select TO_CHAR(sysdate, 'DD/MON/YYYY', 'nls_date_language = FRENCH') from dual;

TO_CHAR(SYSDA
-------------
07/F╔VR./2009

Setting parameters this way (inside SQL functions) overrides the default values that are set for the session in the initialization parameter file, set for the client with environment variables, or set for the session by the ALTER SESSION statement.
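
One plausible explanation for the odd character in 07/F╔VR./2009 is a code page mismatch on the client. The sketch below assumes the French month abbreviation FÉVR. reached the client as Windows-1252 bytes and was drawn by a console using OEM code page 437. Neither assumption is stated in this post, and the function name is made up for illustration.

```python
def as_seen_on_console(text, sent_as="cp1252", shown_as="cp437"):
    """Re-interpret bytes written in one code page as if they were another."""
    return text.encode(sent_as).decode(shown_as)


if __name__ == "__main__":
    # Written with an escape so the source stays ASCII: U+00C9 is E with acute.
    print(as_seen_on_console("07/F\u00c9VR./2009"))
```

Under those assumptions the byte 0xC9, which is É in Windows-1252, is displayed as the box-drawing character ╔.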

2)With the ALTER SESSION statement:
Setting through the ALTER SESSION statement has the second highest priority. Values set by an ALTER SESSION statement override the default values that are set for the session in the initialization parameter file or set by the client with environment variables.

Below is an example. Japanese is not installed on my client machine, so the Japanese characters might not display properly.

SQL> select sysdate from dual;

SYSDATE
---------
07-FEB-09

SQL> alter session set NLS_DATE_LANGUAGE=JAPANESE;

Session altered.

SQL> select sysdate from dual;

SYSDATE
----------
07-2┐ -09

3)Through Environmental variable on the client machine:
This setting has the third highest priority. You can set NLS_* parameters through OS environment variables. How to set an environment variable is platform specific. On a Windows machine you can set it with,
C:\>set NLS_*=value
On a unix machine
$export NLS_*=value (bash shell)
$setenv NLS_* value (c shell)


Below is an example on my windows client machine.
C:\>set NLS_SORT=FRENCH

4)As initialization parameters on the server:
You can set the NLS_* parameters on the server machine inside the initialization parameter file. Settings in the initialization parameter file specify a default session NLS environment. They have no effect on the client side; they control only the server's behavior.
For example, if you use an spfile you can set the NLS_TERRITORY parameter as below,

SQL> ALTER SYSTEM SET NLS_TERRITORY = "CZECH REPUBLIC" scope=spfile;

System altered.
Then bounce the database for the change to take effect.

Summarized as a table of priorities and the ways to set the parameters:

Priority Ways to do the task.
----------- -----------------------------------------

1 (highest) Set in SQL functions
2 Set by an ALTER SESSION statement
3 Set as an environment variable
4 Specified in the initialization parameter file
5 (lowest) Default
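
The priority table can be read as a simple "first value that is set wins" rule. The Python sketch below only models that rule as described in this post; it does not talk to a database, and the function and argument names are made up for illustration.

```python
def effective_nls_value(sql_function=None, session=None, environment=None,
                        init_param=None, default=None):
    """Return the value from the highest-priority source that is set."""
    # Sources are listed from highest to lowest priority, as in the table.
    for value in (sql_function, session, environment, init_param):
        if value is not None:
            return value
    return default


if __name__ == "__main__":
    # ALTER SESSION beats the environment variable and the init parameter.
    print(effective_nls_value(session="JAPANESE", environment="FRENCH",
                              init_param="GERMAN", default="AMERICAN"))
```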

Related Documents
Unicode characterset in Oracle database.
What is NLS_LANG environmental variable?
What is database character set and how to check it
What is national character set / NLS_NCHAR_CHARACTERSET?
Which datatypes use the National Character Set?
What is character set and character set encoding

How to know whether there is N-type columns on database

The query below returns the owner and name of every table in the database that has N-type (NCHAR, NVARCHAR2 or NCLOB) columns.

SQL> select distinct OWNER, TABLE_NAME from DBA_TAB_COLUMNS where DATA_TYPE
in ('NCHAR','NVARCHAR2', 'NCLOB') order by 1;


OWNER TABLE_NAME
------------------------------ ------------------------------
SYS ALL_REPPRIORITY
SYS DBA_AUDIT_EXISTS
SYS DBA_AUDIT_OBJECT
SYS DBA_AUDIT_STATEMENT
SYS DBA_AUDIT_TRAIL
SYS DBA_COMMON_AUDIT_TRAIL
SYS DBA_FGA_AUDIT_TRAIL
SYS DBA_REPPRIORITY
SYS DEFLOB
SYS STREAMS$_DEF_PROC
SYS USER_AUDIT_OBJECT
SYS USER_AUDIT_STATEMENT
SYS USER_AUDIT_TRAIL
SYS USER_REPPRIORITY
SYSTEM DEF$_LOB
SYSTEM DEF$_TEMP$LOB
SYSTEM REPCAT$_PRIORITY

17 rows selected.


DBA_FGA_AUDIT_TRAIL belongs to Fine Grained Auditing.

ALL_REPPRIORITY, DBA_REPPRIORITY, USER_REPPRIORITY, DEF$_LOB, DEF$_TEMP$LOB and REPCAT$_PRIORITY belong to Advanced Replication.

DEFLOB belongs to the Deferred Transactions functionality.

STREAMS$_DEF_PROC belongs to Oracle Streams.

Related Documents
Unicode characterset in Oracle database.
What is NLS_LANG environmental variable?
What is database character set and how to check it
Different ways to set up NLS parameters
What is national character set / NLS_NCHAR_CHARACTERSET?
Which datatypes use the National Character Set?
What is character set and character set encoding

Friday, February 27, 2009

Which datatypes use the National Character Set?

There are three datatypes which can store data in the national character set.

1)NCHAR: A fixed-length character datatype stored in the national character set. It uses CHAR length semantics, that is, the length of an NCHAR column is defined in characters.

2)NVARCHAR2: A variable-length character datatype stored in the national character set. It uses CHAR length semantics, that is, the length of an NVARCHAR2 column is defined in characters.

3)NCLOB: It stores national character set data up to four gigabytes. Data is always stored in UCS2 or AL16UTF16, even if the NLS_NCHAR_CHARACTERSET is UTF8.

If you use the NCHAR/NVARCHAR2/NCLOB datatypes, use the N'...' syntax for literals so that they are marked as being in the national character set by the prefix letter N.

Below is an example.


SQL> create table t_test(col1 NVARCHAR2(30));

Table created.

SQL> insert into t_test values(N'This is NLS_NCHAR_CHARACTERSET');

1 row created.

Related Documents
Unicode characterset in Oracle database.
What is NLS_LANG environmental variable?
What is database character set and how to check it
Different ways to set up NLS parameters
What is national character set / NLS_NCHAR_CHARACTERSET?
What is character set and character set encoding

What is national character set / NLS_NCHAR_CHARACTERSET?

  • The national character set is a character set that is defined in an Oracle database in addition to the normal character set.

  • The normal character set is defined by the parameter NLS_CHARACTERSET and the national character set is defined by the parameter NLS_NCHAR_CHARACTERSET.

  • The national character set is used for data stored in NCHAR, NVARCHAR2 and NCLOB columns, while the normal character set is used for data stored in CHAR, VARCHAR2 and CLOB columns.

  • You can get the value of the national character set, NLS_NCHAR_CHARACTERSET, with any of the queries below,


SQL> select value from nls_database_parameters where parameter='NLS_NCHAR_CHARACTERSET';

VALUE
----------------------------------------
AL16UTF16

SQL> select value$ from sys.props$ where name='NLS_NCHAR_CHARACTERSET';

VALUE$
--------------------------------------------------------------------------------
AL16UTF16

SQL> select property_value from database_properties where property_name
='NLS_NCHAR_CHARACTERSET';


PROPERTY_VALUE
--------------------------------------------------------------------------------
AL16UTF16

  • NLS_NCHAR_CHARACTERSET is defined when the database is created; it is specified in the CREATE DATABASE command.


  • The default value of NLS_NCHAR_CHARACTERSET is AL16UTF16.


  • From Oracle 9i onwards the NLS_NCHAR_CHARACTERSET can have only 2 values, either UTF8 or AL16UTF16, and both are Unicode character sets.


  • National character set columns are always defined in CHAR length semantics; you cannot define them in BYTE. That means if you define NCHAR(5), a maximum of 5 characters can be stored regardless of how many bytes they occupy.


  • Many people think that they need to use the NLS_NCHAR_CHARACTERSET to have Unicode support in Oracle, but this is not true. You can use Unicode in either of two ways: store data in NCHAR, NVARCHAR2 or NCLOB columns, or use "normal" CHAR and VARCHAR2 columns in a database that has an AL32UTF8 / UTF8 NLS_CHARACTERSET.
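
The difference between counting characters and counting bytes, which is what CHAR length semantics is about, can be seen with a short Python sketch. It assumes Python's `utf-16-be` and `utf-8` encodings stand in for AL16UTF16 and UTF8, and it uses a sample string of ordinary (non-supplementary) characters; the function name is illustrative.

```python
def char_and_byte_lengths(text):
    """Count characters and encoded bytes for the same string."""
    return {
        "characters": len(text),
        "AL16UTF16 bytes": len(text.encode("utf-16-be")),  # assumed equivalent
        "UTF8 bytes": len(text.encode("utf-8")),           # assumed equivalent
    }


if __name__ == "__main__":
    # Three Japanese characters followed by two ASCII letters: 5 characters.
    sample = "\u65e5\u672c\u8a9eab"
    print(char_and_byte_lengths(sample))
```

The sample is 5 characters long, so it fits an NCHAR(5) column by the rule above even though it needs 10 or 11 bytes depending on the encoding.
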
Related Documents
Unicode characterset in Oracle database.
What is NLS_LANG environmental variable?
What is database character set and how to check it
Different ways to set up NLS parameters
Which datatypes use the National Character Set?
What is character set and character set encoding

Thursday, February 26, 2009

What is Oracle Globalization Support

The term Oracle Globalization Support refers to the ability of the Oracle database to store, process, and retrieve data in all languages. It also ensures that database utilities, error messages, date, time, monetary, numeric, and calendar conventions automatically adapt to any native language and locale.

Before 9i, Oracle Globalization Support was referred to as the National Language Support (NLS) features. From 9i onwards, NLS is a subset of globalization support. NLS is the ability to choose a national language and store data in a specific character set.

The Oracle globalization support feature enables you to develop multilingual applications and software products that can be accessed from anywhere in the world and in any language. In the database you can now store data in any language you wish.

Related Documents
Unicode characterset in Oracle database.
What is NLS_LANG environmental variable?
What is database character set and how to check it
Different ways to set up NLS parameters
What is national character set / NLS_NCHAR_CHARACTERSET?
Which datatypes use the National Character Set?
What is character set and character set encoding

What is database character set and how to check it

Note that "database character set" here refers to a character set encoding; in Oracle the terms character set and character set encoding are often used interchangeably.

The database character set determines the set of characters that can be stored in the database. It is also the character set used for object identifiers, PL/SQL variables and stored PL/SQL program source.

The database character set information is stored in the data dictionary table SYS.PROPS$.

You can get the character set used in the database from the SYS.PROPS$ table or from views such as database_properties and nls_database_parameters. The value of the NLS_CHARACTERSET parameter contains the database character set name. Get it from,


SQL> select value$ from sys.props$ where name='NLS_CHARACTERSET';

VALUE$
--------------------------------------------------------------------------------
WE8MSWIN1252

SQL> select property_value from database_properties where property_name=
'NLS_CHARACTERSET';


PROPERTY_VALUE
--------------------------------------------------------------------------------
WE8MSWIN1252

SQL> select value from nls_database_parameters where parameter='NLS_CHARACTERSET';

VALUE
----------------------------------------
WE8MSWIN1252

Related Documents

Unicode characterset in Oracle database.

What is NLS_LANG environmental variable?

Different ways to set up NLS parameters

What is national character set / NLS_NCHAR_CHARACTERSET?

Which datatypes use the National Character Set?

What is character set and character set encoding