Oracle in World: Globalization Support

Wednesday, March 11, 2009

Difference between WE8MSWIN1252 and WE8ISO8859P15 characterset

The characters and code points of the Oracle database character set WE8ISO8859P15 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305176.aspx.

The characters and code points of the Oracle database character set WE8MSWIN1252 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305145.aspx.

Comparing WE8MSWIN1252 with WE8ISO8859P15, 27 code points that are undefined in WE8ISO8859P15 are filled in (used) in WE8MSWIN1252.

Also, all of the characters that exist in WE8ISO8859P15 also exist in WE8MSWIN1252. So we can say WE8MSWIN1252 is a logical superset of WE8ISO8859P15, but not a binary superset.

The reason is that 8 code points carry a different symbol in WE8MSWIN1252 than in WE8ISO8859P15 for the same physical code point.
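To make the "logical but not binary superset" distinction concrete, here is a minimal Python sketch. It assumes that Python's `iso8859_15` and `cp1252` codecs are reasonable stand-ins for Oracle's WE8ISO8859P15 and WE8MSWIN1252; that mapping and the sample string are assumptions of this example, not something taken from Oracle.

```python
# Assumption: Python's "iso8859_15" and "cp1252" codecs stand in for
# Oracle's WE8ISO8859P15 and WE8MSWIN1252 character sets.
text = "€100 Œuvre"  # hypothetical sample value

# Bytes as a WE8ISO8859P15 database would hold them.
stored = text.encode("iso8859_15")

# Logical superset: a real conversion succeeds, but some bytes change.
converted = stored.decode("iso8859_15").encode("cp1252")

# Not a binary superset: reading the same bytes as cp1252 shows other symbols.
misread = stored.decode("cp1252")

print(stored.hex(" "))     # a4 31 30 30 20 bc 75 76 72 65
print(converted.hex(" "))  # 80 31 30 30 20 8c 75 76 72 65
print(misread)             # ¤100 ¼uvre
```

The euro sign moves from A4 to 80 and the ligature from BC to 8C, which is why the stored bytes have to be converted rather than simply relabeled.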

Below is the list of all characters in both character sets along with their code points.


Hex   Unicode Char      WE8ISO8859P15 character                         Description
----  ------  -------   (stated if different)                           -----------
0x00  0x0000  [ ] [ ]                                                   NULL
0x01  0x0001  [ ] [ ]                                                   START OF HEADING
0x02  0x0002  [ ] [ ]                                                   START OF TEXT
0x03  0x0003  [ ] [ ]                                                   END OF TEXT
0x04  0x0004  [ ] [ ]                                                   END OF TRANSMISSION
0x05  0x0005  [ ] [ ]                                                   ENQUIRY
0x06  0x0006  [ ] [ ]                                                   ACKNOWLEDGE
0x07  0x0007  [ ] [ ]                                                   BELL
0x08  0x0008  [ ] [ ]                                                   BACKSPACE
0x09  0x0009  [ ] [ ]                                                   HORIZONTAL TABULATION
0x0A  0x000A  [ ] [ ]                                                   LINE FEED
0x0B  0x000B  [ ] [ ]                                                   VERTICAL TABULATION
0x0C  0x000C  [ ] [ ]                                                   FORM FEED
0x0D  0x000D  [ ] [ ]                                                   CARRIAGE RETURN
0x0E  0x000E  [ ] [ ]                                                   SHIFT OUT
0x0F  0x000F  [ ] [ ]                                                   SHIFT IN
0x10  0x0010  [ ] [ ]                                                   DATA LINK ESCAPE
0x11  0x0011  [ ] [ ]                                                   DEVICE CONTROL ONE
0x12  0x0012  [ ] [ ]                                                   DEVICE CONTROL TWO
0x13  0x0013  [ ] [ ]                                                   DEVICE CONTROL THREE
0x14  0x0014  [ ] [ ]                                                   DEVICE CONTROL FOUR
0x15  0x0015  [ ] [ ]                                                   NEGATIVE ACKNOWLEDGE
0x16  0x0016  [ ] [ ]                                                   SYNCHRONOUS IDLE
0x17  0x0017  [ ] [ ]                                                   END OF TRANSMISSION BLOCK
0x18  0x0018  [ ] [ ]                                                   CANCEL
0x19  0x0019  [ ] [ ]                                                   END OF MEDIUM
0x1A  0x001A  [ ] [ ]                                                   SUBSTITUTE
0x1B  0x001B  [ ] [ ]                                                   ESCAPE
0x1C  0x001C  [ ] [ ]                                                   FILE SEPARATOR
0x1D  0x001D  [ ] [ ]                                                   GROUP SEPARATOR
0x1E  0x001E  [ ] [ ]                                                   RECORD SEPARATOR
0x1F  0x001F  [ ] [ ]                                                   UNIT SEPARATOR
0x20  0x0020  [ ] [ ]                                                   SPACE
0x21  0x0021  [!] [!]                                                   EXCLAMATION MARK
0x22  0x0022  ["] ["]                                                   QUOTATION MARK
0x23  0x0023  [#] [#]                                                   NUMBER SIGN
0x24  0x0024  [$] [$]                                                   DOLLAR SIGN
0x25  0x0025  [%] [%]                                                   PERCENT SIGN
0x26  0x0026  [&] [&]                                                   AMPERSAND
0x27  0x0027  ['] [']                                                   APOSTROPHE
0x28  0x0028  [(] [(]                                                   LEFT PARENTHESIS
0x29  0x0029  [)] [)]                                                   RIGHT PARENTHESIS
0x2A  0x002A  [*] [*]                                                   ASTERISK
0x2B  0x002B  [+] [+]                                                   PLUS SIGN
0x2C  0x002C  [,] [,]                                                   COMMA
0x2D  0x002D  [-] [-]                                                   HYPHEN-MINUS
0x2E  0x002E  [.] [.]                                                   FULL STOP
0x2F  0x002F  [/] [/]                                                   SOLIDUS
0x30  0x0030  [0] [0]                                                   DIGIT ZERO
0x31  0x0031  [1] [1]                                                   DIGIT ONE
0x32  0x0032  [2] [2]                                                   DIGIT TWO
0x33  0x0033  [3] [3]                                                   DIGIT THREE
0x34  0x0034  [4] [4]                                                   DIGIT FOUR
0x35  0x0035  [5] [5]                                                   DIGIT FIVE
0x36  0x0036  [6] [6]                                                   DIGIT SIX
0x37  0x0037  [7] [7]                                                   DIGIT SEVEN
0x38  0x0038  [8] [8]                                                   DIGIT EIGHT
0x39  0x0039  [9] [9]                                                   DIGIT NINE
0x3A  0x003A  [:] [:]                                                   COLON
0x3B  0x003B  [;] [;]                                                   SEMICOLON
0x3C  0x003C  [<] [<]                                                   LESS-THAN SIGN
0x3D  0x003D  [=] [=]                                                   EQUALS SIGN
0x3E  0x003E  [>] [>]                                                   GREATER-THAN SIGN
0x3F  0x003F  [?] [?]                                                   QUESTION MARK
0x40  0x0040  [@] [@]                                                   COMMERCIAL AT
0x41  0x0041  [A] [A]                                                   LATIN CAPITAL LETTER A
0x42  0x0042  [B] [B]                                                   LATIN CAPITAL LETTER B
0x43  0x0043  [C] [C]                                                   LATIN CAPITAL LETTER C
0x44  0x0044  [D] [D]                                                   LATIN CAPITAL LETTER D
0x45  0x0045  [E] [E]                                                   LATIN CAPITAL LETTER E
0x46  0x0046  [F] [F]                                                   LATIN CAPITAL LETTER F
0x47  0x0047  [G] [G]                                                   LATIN CAPITAL LETTER G
0x48  0x0048  [H] [H]                                                   LATIN CAPITAL LETTER H
0x49  0x0049  [I] [I]                                                   LATIN CAPITAL LETTER I
0x4A  0x004A  [J] [J]                                                   LATIN CAPITAL LETTER J
0x4B  0x004B  [K] [K]                                                   LATIN CAPITAL LETTER K
0x4C  0x004C  [L] [L]                                                   LATIN CAPITAL LETTER L
0x4D  0x004D  [M] [M]                                                   LATIN CAPITAL LETTER M
0x4E  0x004E  [N] [N]                                                   LATIN CAPITAL LETTER N
0x4F  0x004F  [O] [O]                                                   LATIN CAPITAL LETTER O
0x50  0x0050  [P] [P]                                                   LATIN CAPITAL LETTER P
0x51  0x0051  [Q] [Q]                                                   LATIN CAPITAL LETTER Q
0x52  0x0052  [R] [R]                                                   LATIN CAPITAL LETTER R
0x53  0x0053  [S] [S]                                                   LATIN CAPITAL LETTER S
0x54  0x0054  [T] [T]                                                   LATIN CAPITAL LETTER T
0x55  0x0055  [U] [U]                                                   LATIN CAPITAL LETTER U
0x56  0x0056  [V] [V]                                                   LATIN CAPITAL LETTER V
0x57  0x0057  [W] [W]                                                   LATIN CAPITAL LETTER W
0x58  0x0058  [X] [X]                                                   LATIN CAPITAL LETTER X
0x59  0x0059  [Y] [Y]                                                   LATIN CAPITAL LETTER Y
0x5A  0x005A  [Z] [Z]                                                   LATIN CAPITAL LETTER Z
0x5B  0x005B  [[] [[]                                                   LEFT SQUARE BRACKET
0x5C  0x005C  [\] [\]                                                   REVERSE SOLIDUS
0x5D  0x005D  []] []]                                                   RIGHT SQUARE BRACKET
0x5E  0x005E  [^] [^]                                                   CIRCUMFLEX ACCENT
0x5F  0x005F  [_] [_]                                                   LOW LINE
0x60  0x0060  [`] [`]                                                   GRAVE ACCENT
0x61  0x0061  [a] [a]                                                   LATIN SMALL LETTER A
0x62  0x0062  [b] [b]                                                   LATIN SMALL LETTER B
0x63  0x0063  [c] [c]                                                   LATIN SMALL LETTER C
0x64  0x0064  [d] [d]                                                   LATIN SMALL LETTER D
0x65  0x0065  [e] [e]                                                   LATIN SMALL LETTER E
0x66  0x0066  [f] [f]                                                   LATIN SMALL LETTER F
0x67  0x0067  [g] [g]                                                   LATIN SMALL LETTER G
0x68  0x0068  [h] [h]                                                   LATIN SMALL LETTER H
0x69  0x0069  [i] [i]                                                   LATIN SMALL LETTER I
0x6A  0x006A  [j] [j]                                                   LATIN SMALL LETTER J
0x6B  0x006B  [k] [k]                                                   LATIN SMALL LETTER K
0x6C  0x006C  [l] [l]                                                   LATIN SMALL LETTER L
0x6D  0x006D  [m] [m]                                                   LATIN SMALL LETTER M
0x6E  0x006E  [n] [n]                                                   LATIN SMALL LETTER N
0x6F  0x006F  [o] [o]                                                   LATIN SMALL LETTER O
0x70  0x0070  [p] [p]                                                   LATIN SMALL LETTER P
0x71  0x0071  [q] [q]                                                   LATIN SMALL LETTER Q
0x72  0x0072  [r] [r]                                                   LATIN SMALL LETTER R
0x73  0x0073  [s] [s]                                                   LATIN SMALL LETTER S
0x74  0x0074  [t] [t]                                                   LATIN SMALL LETTER T
0x75  0x0075  [u] [u]                                                   LATIN SMALL LETTER U
0x76  0x0076  [v] [v]                                                   LATIN SMALL LETTER V
0x77  0x0077  [w] [w]                                                   LATIN SMALL LETTER W
0x78  0x0078  [x] [x]                                                   LATIN SMALL LETTER X
0x79  0x0079  [y] [y]                                                   LATIN SMALL LETTER Y
0x7A  0x007A  [z] [z]                                                   LATIN SMALL LETTER Z
0x7B  0x007B  [{] [{]                                                   LEFT CURLY BRACKET
0x7C  0x007C  [|] [|]                                                   VERTICAL LINE
0x7D  0x007D  [}] [}]                                                   RIGHT CURLY BRACKET
0x7E  0x007E  [~] [~]                                                   TILDE
0x7F  0x007F  [ ] [ ]                                                   DELETE
0x80  0x20AC  [€] [€]      UNDEFINED                                    EURO SIGN
0x81          [ ] [ ]      UNDEFINED                                    UNDEFINED
0x82  0x201A  [‚] [‚]      UNDEFINED                                    SINGLE LOW-9 QUOTATION MARK
0x83  0x0192  [ƒ] [ƒ]      UNDEFINED                                    LATIN SMALL LETTER F WITH HOOK
0x84  0x201E  [„] [„]      UNDEFINED                                    DOUBLE LOW-9 QUOTATION MARK
0x85  0x2026  […] […]      UNDEFINED                                    HORIZONTAL ELLIPSIS
0x86  0x2020  [†] [†]      UNDEFINED                                    DAGGER
0x87  0x2021  [‡] [‡]      UNDEFINED                                    DOUBLE DAGGER
0x88  0x02C6  [ˆ] [ˆ]      UNDEFINED                                    MODIFIER LETTER CIRCUMFLEX ACCENT
0x89  0x2030  [‰] [‰]      UNDEFINED                                    PER MILLE SIGN
0x8A  0x0160  [Š] [Š]      UNDEFINED                                    LATIN CAPITAL LETTER S WITH CARON
0x8B  0x2039  [‹] [‹]      UNDEFINED                                    SINGLE LEFT-POINTING ANGLE QUOTATION MARK
0x8C  0x0152  [Œ] [Œ]      UNDEFINED                                    LATIN CAPITAL LIGATURE OE
0x8D          [ ] [ ]      UNDEFINED                                    UNDEFINED
0x8E  0x017D  [Ž] [Ž]      UNDEFINED                                    LATIN CAPITAL LETTER Z WITH CARON
0x8F          [ ] [ ]      UNDEFINED                                    UNDEFINED
0x90          [ ] [ ]      UNDEFINED                                    UNDEFINED
0x91  0x2018  [‘] [‘]      UNDEFINED                                    LEFT SINGLE QUOTATION MARK
0x92  0x2019  [’] [’]      UNDEFINED                                    RIGHT SINGLE QUOTATION MARK
0x93  0x201C  [“] [“]      UNDEFINED                                    LEFT DOUBLE QUOTATION MARK
0x94  0x201D  [”] [”]      UNDEFINED                                    RIGHT DOUBLE QUOTATION MARK
0x95  0x2022  [•] [•]      UNDEFINED                                    BULLET
0x96  0x2013  [–] [–]      UNDEFINED                                    EN DASH
0x97  0x2014  [—] [—]      UNDEFINED                                    EM DASH
0x98  0x02DC  [˜] [˜]      UNDEFINED                                    SMALL TILDE
0x99  0x2122  [™] [™]      UNDEFINED                                    TRADE MARK SIGN
0x9A  0x0161  [š] [š]      UNDEFINED                                    LATIN SMALL LETTER S WITH CARON
0x9B  0x203A  [›] [›]      UNDEFINED                                    SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
0x9C  0x0153  [œ] [œ]      UNDEFINED                                    LATIN SMALL LIGATURE OE
0x9D          [ ] [ ]      UNDEFINED                                    UNDEFINED
0x9E  0x017E  [ž] [ž]      UNDEFINED                                    LATIN SMALL LETTER Z WITH CARON
0x9F  0x0178  [Ÿ] [Ÿ]      UNDEFINED                                    LATIN CAPITAL LETTER Y WITH DIAERESIS
0xA0  0x00A0  [ ] [ ]                                                   NO-BREAK SPACE
0xA1  0x00A1  [¡] [¡]                                                   INVERTED EXCLAMATION MARK
0xA2  0x00A2  [¢] [¢]                                                   CENT SIGN
0xA3  0x00A3  [£] [£]                                                   POUND SIGN
0xA4  0x00A4  [¤] [¤]      Euro Sign(€) MS1252 code point 80            CURRENCY SIGN
0xA5  0x00A5  [¥] [¥]                                                   YEN SIGN
0xA6  0x00A6  [¦] [¦]      LATIN CAPITAL LETTER S WITH CARON(Š) 8A      BROKEN BAR
0xA7  0x00A7  [§] [§]                                                   SECTION SIGN
0xA8  0x00A8  [¨] [¨]      LATIN SMALL LETTER S WITH CARON(š)   9A      DIAERESIS
0xA9  0x00A9  [©] [©]                                                   COPYRIGHT SIGN
0xAA  0x00AA  [ª] [ª]                                                   FEMININE ORDINAL INDICATOR
0xAB  0x00AB  [«] [«]                                                   LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0xAC  0x00AC  [¬] [¬]                                                   NOT SIGN
0xAD  0x00AD  [ ] [ ]                                                   SOFT HYPHEN
0xAE  0x00AE  [®] [®]                                                   REGISTERED SIGN
0xAF  0x00AF  [¯] [¯]                                                   MACRON
0xB0  0x00B0  [°] [°]                                                   DEGREE SIGN
0xB1  0x00B1  [±] [±]                                                   PLUS-MINUS SIGN
0xB2  0x00B2  [²] [²]                                                   SUPERSCRIPT TWO
0xB3  0x00B3  [³] [³]                                                   SUPERSCRIPT THREE
0xB4  0x00B4  [´] [´]      LATIN CAPITAL LETTER Z WITH CARON(Ž) 8E      ACUTE ACCENT
0xB5  0x00B5  [µ] [µ]                                                   MICRO SIGN
0xB6  0x00B6  [¶] [¶]                                                   PILCROW SIGN
0xB7  0x00B7  [·] [·]                                                   MIDDLE DOT
0xB8  0x00B8  [¸] [¸]      LATIN SMALL LETTER Z WITH CARON(ž)   9E      CEDILLA
0xB9  0x00B9  [¹] [¹]                                                   SUPERSCRIPT ONE
0xBA  0x00BA  [º] [º]                                                   MASCULINE ORDINAL INDICATOR
0xBB  0x00BB  [»] [»]                                                   RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0xBC  0x00BC  [¼] [¼]      LATIN CAPITAL LIGATURE OE(Œ)  8C             VULGAR FRACTION ONE QUARTER
0xBD  0x00BD  [½] [½]      LATIN SMALL LIGATURE OE(œ)  9C               VULGAR FRACTION ONE HALF
0xBE  0x00BE  [¾] [¾]      LATIN CAPITAL LETTER Y WITH DIAERESIS(Ÿ) 9F  VULGAR FRACTION THREE QUARTERS
0xBF  0x00BF  [¿] [¿]                                                   INVERTED QUESTION MARK
0xC0  0x00C0  [À] [À]                                                   LATIN CAPITAL LETTER A WITH GRAVE
0xC1  0x00C1  [Á] [Á]                                                   LATIN CAPITAL LETTER A WITH ACUTE
0xC2  0x00C2  [Â] [Â]                                                   LATIN CAPITAL LETTER A WITH CIRCUMFLEX
0xC3  0x00C3  [Ã] [Ã]                                                   LATIN CAPITAL LETTER A WITH TILDE
0xC4  0x00C4  [Ä] [Ä]                                                   LATIN CAPITAL LETTER A WITH DIAERESIS
0xC5  0x00C5  [Å] [Å]                                                   LATIN CAPITAL LETTER A WITH RING ABOVE
0xC6  0x00C6  [Æ] [Æ]                                                   LATIN CAPITAL LETTER AE
0xC7  0x00C7  [Ç] [Ç]                                                   LATIN CAPITAL LETTER C WITH CEDILLA
0xC8  0x00C8  [È] [È]                                                   LATIN CAPITAL LETTER E WITH GRAVE
0xC9  0x00C9  [É] [É]                                                   LATIN CAPITAL LETTER E WITH ACUTE
0xCA  0x00CA  [Ê] [Ê]                                                   LATIN CAPITAL LETTER E WITH CIRCUMFLEX
0xCB  0x00CB  [Ë] [Ë]                                                   LATIN CAPITAL LETTER E WITH DIAERESIS
0xCC  0x00CC  [Ì] [Ì]                                                   LATIN CAPITAL LETTER I WITH GRAVE
0xCD  0x00CD  [Í] [Í]                                                   LATIN CAPITAL LETTER I WITH ACUTE
0xCE  0x00CE  [Î] [Î]                                                   LATIN CAPITAL LETTER I WITH CIRCUMFLEX
0xCF  0x00CF  [Ï] [Ï]                                                   LATIN CAPITAL LETTER I WITH DIAERESIS
0xD0  0x00D0  [Ð] [Ð]                                                   LATIN CAPITAL LETTER ETH
0xD1  0x00D1  [Ñ] [Ñ]                                                   LATIN CAPITAL LETTER N WITH TILDE
0xD2  0x00D2  [Ò] [Ò]                                                   LATIN CAPITAL LETTER O WITH GRAVE
0xD3  0x00D3  [Ó] [Ó]                                                   LATIN CAPITAL LETTER O WITH ACUTE
0xD4  0x00D4  [Ô] [Ô]                                                   LATIN CAPITAL LETTER O WITH CIRCUMFLEX
0xD5  0x00D5  [Õ] [Õ]                                                   LATIN CAPITAL LETTER O WITH TILDE
0xD6  0x00D6  [Ö] [Ö]                                                   LATIN CAPITAL LETTER O WITH DIAERESIS
0xD7  0x00D7  [×] [×]                                                   MULTIPLICATION SIGN
0xD8  0x00D8  [Ø] [Ø]                                                   LATIN CAPITAL LETTER O WITH STROKE
0xD9  0x00D9  [Ù] [Ù]                                                   LATIN CAPITAL LETTER U WITH GRAVE
0xDA  0x00DA  [Ú] [Ú]                                                   LATIN CAPITAL LETTER U WITH ACUTE
0xDB  0x00DB  [Û] [Û]                                                   LATIN CAPITAL LETTER U WITH CIRCUMFLEX
0xDC  0x00DC  [Ü] [Ü]                                                   LATIN CAPITAL LETTER U WITH DIAERESIS
0xDD  0x00DD  [Ý] [Ý]                                                   LATIN CAPITAL LETTER Y WITH ACUTE
0xDE  0x00DE  [Þ] [Þ]                                                   LATIN CAPITAL LETTER THORN
0xDF  0x00DF  [ß] [ß]                                                   LATIN SMALL LETTER SHARP S
0xE0  0x00E0  [à] [à]                                                   LATIN SMALL LETTER A WITH GRAVE
0xE1  0x00E1  [á] [á]                                                   LATIN SMALL LETTER A WITH ACUTE
0xE2  0x00E2  [â] [â]                                                   LATIN SMALL LETTER A WITH CIRCUMFLEX
0xE3  0x00E3  [ã] [ã]                                                   LATIN SMALL LETTER A WITH TILDE
0xE4  0x00E4  [ä] [ä]                                                   LATIN SMALL LETTER A WITH DIAERESIS
0xE5  0x00E5  [å] [å]                                                   LATIN SMALL LETTER A WITH RING ABOVE
0xE6  0x00E6  [æ] [æ]                                                   LATIN SMALL LETTER AE
0xE7  0x00E7  [ç] [ç]                                                   LATIN SMALL LETTER C WITH CEDILLA
0xE8  0x00E8  [è] [è]                                                   LATIN SMALL LETTER E WITH GRAVE
0xE9  0x00E9  [é] [é]                                                   LATIN SMALL LETTER E WITH ACUTE
0xEA  0x00EA  [ê] [ê]                                                   LATIN SMALL LETTER E WITH CIRCUMFLEX
0xEB  0x00EB  [ë] [ë]                                                   LATIN SMALL LETTER E WITH DIAERESIS
0xEC  0x00EC  [ì] [ì]                                                   LATIN SMALL LETTER I WITH GRAVE
0xED  0x00ED  [í] [í]                                                   LATIN SMALL LETTER I WITH ACUTE
0xEE  0x00EE  [î] [î]                                                   LATIN SMALL LETTER I WITH CIRCUMFLEX
0xEF  0x00EF  [ï] [ï]                                                   LATIN SMALL LETTER I WITH DIAERESIS
0xF0  0x00F0  [ð] [ð]                                                   LATIN SMALL LETTER ETH
0xF1  0x00F1  [ñ] [ñ]                                                   LATIN SMALL LETTER N WITH TILDE
0xF2  0x00F2  [ò] [ò]                                                   LATIN SMALL LETTER O WITH GRAVE
0xF3  0x00F3  [ó] [ó]                                                   LATIN SMALL LETTER O WITH ACUTE
0xF4  0x00F4  [ô] [ô]                                                   LATIN SMALL LETTER O WITH CIRCUMFLEX
0xF5  0x00F5  [õ] [õ]                                                   LATIN SMALL LETTER O WITH TILDE
0xF6  0x00F6  [ö] [ö]                                                   LATIN SMALL LETTER O WITH DIAERESIS
0xF7  0x00F7  [÷] [÷]                                                   DIVISION SIGN
0xF8  0x00F8  [ø] [ø]                                                   LATIN SMALL LETTER O WITH STROKE
0xF9  0x00F9  [ù] [ù]                                                   LATIN SMALL LETTER U WITH GRAVE
0xFA  0x00FA  [ú] [ú]                                                   LATIN SMALL LETTER U WITH ACUTE
0xFB  0x00FB  [û] [û]                                                   LATIN SMALL LETTER U WITH CIRCUMFLEX
0xFC  0x00FC  [ü] [ü]                                                   LATIN SMALL LETTER U WITH DIAERESIS
0xFD  0x00FD  [ý] [ý]                                                   LATIN SMALL LETTER Y WITH ACUTE
0xFE  0x00FE  [þ] [þ]                                                   LATIN SMALL LETTER THORN
0xFF  0x00FF  [ÿ] [ÿ]                                                   LATIN SMALL LETTER Y WITH DIAERESIS
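
The table above can be cross-checked outside the database. The following is only a sketch: it assumes Python's `cp1252` and `iso8859_15` codecs match the Oracle character sets, and it treats the C1 control range as undefined, as the table does. The helper name `decode_table` is made up for this example.

```python
# Assumption: cp1252 ~ WE8MSWIN1252 and iso8859_15 ~ WE8ISO8859P15.

def decode_table(codec):
    """Map each byte 0x00-0xFF to its character, or None when undefined."""
    table = {}
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            ch = None  # the codec has no character at this code point
        # Python maps 0x80-0x9F in ISO 8859 codecs to C1 control characters;
        # count those as undefined, following the table above.
        if ch is not None and 0x80 <= ord(ch) <= 0x9F:
            ch = None
        table[byte] = ch
    return table

win1252 = decode_table("cp1252")
latin9 = decode_table("iso8859_15")

# Code points filled in WE8MSWIN1252 but undefined in WE8ISO8859P15.
only_in_1252 = [b for b in range(256) if win1252[b] and not latin9[b]]

# Code points defined in both sets but carrying different symbols.
different = [b for b in range(256)
             if win1252[b] and latin9[b] and win1252[b] != latin9[b]]

# Logical superset: every P15 character exists somewhere in 1252.
latin9_chars = {c for c in latin9.values() if c}
win1252_chars = {c for c in win1252.values() if c}
is_logical_superset = latin9_chars <= win1252_chars

print(len(only_in_1252), "code points are filled only in cp1252")
print([f"{b:02X}" for b in different])  # ['A4', 'A6', 'A8', 'B4', 'B8', 'BC', 'BD', 'BE']
print("logical superset:", is_logical_superset)  # logical superset: True
```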

Difference between WE8ISO8859P1 and WE8ISO8859P15 characterset

The characters and code points of the Oracle database character set WE8ISO8859P1 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305167.aspx.

The characters and code points of the Oracle database character set WE8ISO8859P15 are listed at http://msdn.microsoft.com/en-us/goglobal/cc305176.aspx.

WE8ISO8859P15 differs from WE8ISO8859P1 in only a few positions.

In WE8ISO8859P15, the euro sign and some national letters used in French and Finnish were introduced, and some rarely used special characters that exist in WE8ISO8859P1 were omitted.

The table below lists the code positions at which WE8ISO8859P1 and WE8ISO8859P15 differ.


Code   |  WE8ISO8859P1 (ISO Latin 1)      |  WE8ISO8859P15 (ISO Latin 9)
(hex)  |  name                            |  name
-------+----------------------------------+--------------------------------------------
A4     |  general currency symbol (¤)     |  euro sign (€)
A6     |  broken vertical bar (¦)         |  latin capital letter s with caron (Š)
A8     |  umlaut (diaeresis) accent (¨)   |  latin small letter s with caron (š)
B4     |  acute accent (´)                |  latin capital letter z with caron (Ž)
B8     |  cedilla (¸)                     |  latin small letter z with caron (ž)
BC     |  one fourth (one quarter) (¼)    |  latin capital ligature oe (Œ)
BD     |  one half (½)                    |  latin small ligature oe (œ)
BE     |  three quarters (¾)              |  latin capital letter y with diaeresis (Ÿ)

Apart from the characters above and the undefined code points, every character in WE8ISO8859P15 has the same code point in WE8ISO8859P1.

Note that in both WE8ISO8859P15 and WE8ISO8859P1 the code points from 0x80 to 0x9F are undefined. So when you look for differences between the two with the script below, the undefined code points also appear in the list.


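SQL> set serveroutput on
declare
  ch varchar2(1);
begin
  -- The loop index i is declared implicitly by the FOR loop.
  for i in 0..255 loop
    ch := chr(i);
    -- convert(value, destination_charset, source_charset):
    -- read the byte as WE8ISO8859P15 and convert it to WE8ISO8859P1.
    if convert(ch, 'WE8ISO8859P1', 'WE8ISO8859P15') != ch then
      dbms_output.put_line('Difference- Decimal:'|| i ||' Hexa:'|| to_char(i,'XXXX'));
    end if;
  end loop;
end;
/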
Difference- Decimal:128 Hexa:   80
Difference- Decimal:129 Hexa:   81
Difference- Decimal:130 Hexa:   82
Difference- Decimal:131 Hexa:   83
Difference- Decimal:132 Hexa:   84
Difference- Decimal:133 Hexa:   85
Difference- Decimal:134 Hexa:   86
Difference- Decimal:135 Hexa:   87
Difference- Decimal:136 Hexa:   88
Difference- Decimal:137 Hexa:   89
Difference- Decimal:138 Hexa:   8A
Difference- Decimal:139 Hexa:   8B
Difference- Decimal:140 Hexa:   8C
Difference- Decimal:141 Hexa:   8D
Difference- Decimal:142 Hexa:   8E
Difference- Decimal:143 Hexa:   8F
Difference- Decimal:144 Hexa:   90
Difference- Decimal:145 Hexa:   91
Difference- Decimal:146 Hexa:   92
Difference- Decimal:147 Hexa:   93
Difference- Decimal:148 Hexa:   94
Difference- Decimal:149 Hexa:   95
Difference- Decimal:150 Hexa:   96
Difference- Decimal:151 Hexa:   97
Difference- Decimal:152 Hexa:   98
Difference- Decimal:153 Hexa:   99
Difference- Decimal:154 Hexa:   9A
Difference- Decimal:155 Hexa:   9B
Difference- Decimal:156 Hexa:   9C
Difference- Decimal:157 Hexa:   9D
Difference- Decimal:158 Hexa:   9E
Difference- Decimal:159 Hexa:   9F
Difference- Decimal:164 Hexa:   A4
Difference- Decimal:166 Hexa:   A6
Difference- Decimal:168 Hexa:   A8
Difference- Decimal:180 Hexa:   B4
Difference- Decimal:184 Hexa:   B8
Difference- Decimal:188 Hexa:   BC
Difference- Decimal:189 Hexa:   BD
Difference- Decimal:190 Hexa:   BE

PL/SQL procedure successfully completed.

In both character sets, all the code points from 0x80 to 0x9F are undefined. The remaining 8 code points are the ones that actually differ between the two.

Also, neither WE8ISO8859P1 nor WE8ISO8859P15 is a binary superset of the other.
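
The eight differing positions can also be cross-checked outside the database. This is a rough sketch that assumes Python's `latin_1` and `iso8859_15` codecs correspond to WE8ISO8859P1 and WE8ISO8859P15. Python maps 0x80 to 0x9F to the same control characters in both codecs, so unlike the PL/SQL output above, that range does not show up here.

```python
import unicodedata

# Assumption: "latin_1" ~ WE8ISO8859P1 and "iso8859_15" ~ WE8ISO8859P15.
diff_rows = []
for byte in range(256):
    p1 = bytes([byte]).decode("latin_1")
    p15 = bytes([byte]).decode("iso8859_15")
    if p1 != p15:
        diff_rows.append((byte, p1, p15))

# Print one line per differing code point, with the Unicode names.
for byte, p1, p15 in diff_rows:
    print(f"{byte:02X} | {p1} {unicodedata.name(p1)} | {p15} {unicodedata.name(p15)}")
```

The first line printed is `A4 | ¤ CURRENCY SIGN | € EURO SIGN`.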

Tuesday, March 10, 2009

Difference between WE8ISO8859P1 and WE8MSWIN1252 characterset

Comparing the characters and code points of both character sets, every character defined in WE8ISO8859P1 also exists in WE8MSWIN1252, and WE8MSWIN1252 contains some additional characters. So we can say WE8MSWIN1252 is a logical superset of WE8ISO8859P1.

Looking further, 27 code points that are undefined in WE8ISO8859P1 are filled in (used) in WE8MSWIN1252.

Also, no code point has a different symbol in WE8MSWIN1252 than in WE8ISO8859P1, so WE8MSWIN1252 is also a binary superset of WE8ISO8859P1.
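
The binary superset claim can be checked with a short Python sketch. It assumes Python's `latin_1` and `cp1252` codecs correspond to WE8ISO8859P1 and WE8MSWIN1252 and treats the C1 control range as undefined. The helper `defined_char` is a name invented for this example.

```python
# Assumption: "latin_1" ~ WE8ISO8859P1 and "cp1252" ~ WE8MSWIN1252.

def defined_char(byte, codec):
    """Return the character at this code point, or None if undefined."""
    try:
        ch = bytes([byte]).decode(codec)
    except UnicodeDecodeError:
        return None
    # Count C1 control characters (U+0080-U+009F) as undefined.
    return None if 0x80 <= ord(ch) <= 0x9F else ch

conflicts = []  # P1 code points that cp1252 lacks or maps to another symbol
added = []      # code points defined only in cp1252
for byte in range(256):
    p1 = defined_char(byte, "latin_1")
    win = defined_char(byte, "cp1252")
    if p1 is not None and win != p1:
        conflicts.append(byte)
    elif p1 is None and win is not None:
        added.append(byte)

print("conflicting code points:", conflicts)        # conflicting code points: []
print("code points added by cp1252:", len(added))   # code points added by cp1252: 27
```

An empty conflict list means every byte that is valid in WE8ISO8859P1 reads as the same character in WE8MSWIN1252.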

In WE8MSWIN1252 the euro symbol (€) is defined at code point 80 (hex). In WE8ISO8859P1 the euro symbol is not defined at all. Below is the list of characters that are defined in WE8MSWIN1252 but undefined in WE8ISO8859P1.

In the list:
Column #1 is the WE8MSWIN1252 code point in hexadecimal.
Column #2 is the Unicode code point in hexadecimal.
Column #3 is the character, shown twice: once by numeric reference and once by its literal value.
Column #4 is the description of the character.


WIN-  Unicode Charac-  Description
1252          ters
----  ------  -------  ----------------------------
0x80  0x20AC  [€] [€]  EURO SIGN
0x82  0x201A  [‚] [‚]  SINGLE LOW-9 QUOTATION MARK
0x83  0x0192  [ƒ] [ƒ]  LATIN SMALL LETTER F WITH HOOK
0x84  0x201E  [„] [„]  DOUBLE LOW-9 QUOTATION MARK
0x85  0x2026  […] […]  HORIZONTAL ELLIPSIS
0x86  0x2020  [†] [†]  DAGGER
0x87  0x2021  [‡] [‡]  DOUBLE DAGGER
0x88  0x02C6  [ˆ] [ˆ]  MODIFIER LETTER CIRCUMFLEX ACCENT
0x89  0x2030  [‰] [‰]  PER MILLE SIGN
0x8A  0x0160  [Š] [Š]  LATIN CAPITAL LETTER S WITH CARON
0x8B  0x2039  [‹] [‹]  SINGLE LEFT-POINTING ANGLE QUOTATION MARK
0x8C  0x0152  [Œ] [Œ]  LATIN CAPITAL LIGATURE OE
0x8E  0x017D  [Ž] [Ž]  LATIN CAPITAL LETTER Z WITH CARON
0x91  0x2018  [‘] [‘]  LEFT SINGLE QUOTATION MARK
0x92  0x2019  [’] [’]  RIGHT SINGLE QUOTATION MARK
0x93  0x201C  [“] [“]  LEFT DOUBLE QUOTATION MARK
0x94  0x201D  [”] [”]  RIGHT DOUBLE QUOTATION MARK
0x95  0x2022  [•] [•]  BULLET
0x96  0x2013  [–] [–]  EN DASH
0x97  0x2014  [—] [—]  EM DASH
0x98  0x02DC  [˜] [˜]  SMALL TILDE
0x99  0x2122  [™] [™]  TRADE MARK SIGN
0x9A  0x0161  [š] [š]  LATIN SMALL LETTER S WITH CARON
0x9B  0x203A  [›] [›]  SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
0x9C  0x0153  [œ] [œ]  LATIN SMALL LIGATURE OE
0x9E  0x017E  [ž] [ž]  LATIN SMALL LETTER Z WITH CARON
0x9F  0x0178  [Ÿ] [Ÿ]  LATIN CAPITAL LETTER Y WITH DIAERESIS
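
The count of 27 can be cross-checked outside Oracle. The following is a minimal sketch in Python, assuming that Python's cp1252 and latin_1 codecs are acceptable stand-ins for WE8MSWIN1252 and WE8ISO8859P1 (the function names are made up for illustration). One difference to keep in mind: Python's latin_1 codec maps 0x80 to 0x9F to C1 control characters rather than leaving them unassigned, so the sketch only counts which of those positions cp1252 fills and separately checks that the two codecs agree everywhere else.

```python
def cp1252_only_code_points():
    """Return {byte: character} for positions 0x80-0x9F that cp1252 defines."""
    found = {}
    for byte in range(0x80, 0xA0):
        try:
            found[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # This position is unassigned in cp1252 as well.
            continue
    return found


def identical_outside_0x80_0x9f():
    """True if both codecs decode every byte outside 0x80-0x9F identically."""
    for byte in list(range(0x00, 0x80)) + list(range(0xA0, 0x100)):
        raw = bytes([byte])
        if raw.decode("cp1252") != raw.decode("latin_1"):
            return False
    return True


extra = cp1252_only_code_points()
print(len(extra))                     # number of filled positions in 0x80-0x9F
print(hex(ord(extra[0x80])))          # Unicode code point stored at 0x80
print(identical_outside_0x80_0x9f())  # do the codecs agree everywhere else?
```

With these stand-in codecs the sketch reports 27 filled positions, with 0x80 holding U+20AC (the euro sign), and the five positions 0x81, 0x8D, 0x8F, 0x90 and 0x9D remain unassigned in both.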

Below is the list of characters under WE8ISO8859P1 and WE8MSWIN1252 along with their code points.


Hex.  Unico.  Charac.   WE8ISO8859P1 Character   Description
----  ------  -------   (States if different)    -----------
0x00  0x0000  [ ] [ ]                            NULL
0x01  0x0001  [ ] [ ]                            START OF HEADING
0x02  0x0002  [ ] [ ]                            START OF TEXT
0x03  0x0003  [ ] [ ]                            END OF TEXT
0x04  0x0004  [ ] [ ]                            END OF TRANSMISSION
0x05  0x0005  [ ] [ ]                            ENQUIRY
0x06  0x0006  [ ] [ ]                            ACKNOWLEDGE
0x07  0x0007  [ ] [ ]                            BELL
0x08  0x0008  [ ] [ ]                            BACKSPACE
0x09  0x0009  [ ] [ ]                            HORIZONTAL TABULATION
0x0A  0x000A  [ ] [ ]                            LINE FEED
0x0B  0x000B  [ ] [ ]                            VERTICAL TABULATION
0x0C  0x000C  [ ] [ ]                            FORM FEED
0x0D  0x000D  [ ] [ ]                            CARRIAGE RETURN
0x0E  0x000E  [ ] [ ]                            SHIFT OUT
0x0F  0x000F  [ ] [ ]                            SHIFT IN
0x10  0x0010  [ ] [ ]                            DATA LINK ESCAPE
0x11  0x0011  [ ] [ ]                            DEVICE CONTROL ONE
0x12  0x0012  [ ] [ ]                            DEVICE CONTROL TWO
0x13  0x0013  [ ] [ ]                            DEVICE CONTROL THREE
0x14  0x0014  [ ] [ ]                            DEVICE CONTROL FOUR
0x15  0x0015  [ ] [ ]                            NEGATIVE ACKNOWLEDGE
0x16  0x0016  [ ] [ ]                            SYNCHRONOUS IDLE
0x17  0x0017  [ ] [ ]                            END OF TRANSMISSION BLOCK
0x18  0x0018  [ ] [ ]                            CANCEL
0x19  0x0019  [ ] [ ]                            END OF MEDIUM
0x1A  0x001A  [ ] [ ]                            SUBSTITUTE
0x1B  0x001B  [ ] [ ]                            ESCAPE
0x1C  0x001C  [ ] [ ]                            FILE SEPARATOR
0x1D  0x001D  [ ] [ ]                            GROUP SEPARATOR
0x1E  0x001E  [ ] [ ]                            RECORD SEPARATOR
0x1F  0x001F  [ ] [ ]                            UNIT SEPARATOR
0x20  0x0020  [ ] [ ]                            SPACE
0x21  0x0021  [!] [!]                            EXCLAMATION MARK
0x22  0x0022  ["] ["]                            QUOTATION MARK
0x23  0x0023  [#] [#]                            NUMBER SIGN
0x24  0x0024  [$] [$]                            DOLLAR SIGN
0x25  0x0025  [%] [%]                            PERCENT SIGN
0x26  0x0026  [&] [&]                            AMPERSAND
0x27  0x0027  ['] [']                            APOSTROPHE
0x28  0x0028  [(] [(]                            LEFT PARENTHESIS
0x29  0x0029  [)] [)]                            RIGHT PARENTHESIS
0x2A  0x002A  [*] [*]                            ASTERISK
0x2B  0x002B  [+] [+]                            PLUS SIGN
0x2C  0x002C  [,] [,]                            COMMA
0x2D  0x002D  [-] [-]                            HYPHEN-MINUS
0x2E  0x002E  [.] [.]                            FULL STOP
0x2F  0x002F  [/] [/]                            SOLIDUS
0x30  0x0030  [0] [0]                            DIGIT ZERO
0x31  0x0031  [1] [1]                            DIGIT ONE
0x32  0x0032  [2] [2]                            DIGIT TWO
0x33  0x0033  [3] [3]                            DIGIT THREE
0x34  0x0034  [4] [4]                            DIGIT FOUR
0x35  0x0035  [5] [5]                            DIGIT FIVE
0x36  0x0036  [6] [6]                            DIGIT SIX
0x37  0x0037  [7] [7]                            DIGIT SEVEN
0x38  0x0038  [8] [8]                            DIGIT EIGHT
0x39  0x0039  [9] [9]                            DIGIT NINE
0x3A  0x003A  [:] [:]                            COLON
0x3B  0x003B  [;] [;]                            SEMICOLON
0x3C  0x003C  [<] [<]                            LESS-THAN SIGN  
0x3D  0x003D  [=] [=]                            EQUALS SIGN         
0x3E  0x003E  [>] [>]                            GREATER-THAN SIGN
0x3F  0x003F  [?] [?]                            QUESTION MARK
0x40  0x0040  [@] [@]                            COMMERCIAL AT
0x41  0x0041  [A] [A]                            LATIN CAPITAL LETTER A
0x42  0x0042  [B] [B]                            LATIN CAPITAL LETTER B
0x43  0x0043  [C] [C]                            LATIN CAPITAL LETTER C
0x44  0x0044  [D] [D]                            LATIN CAPITAL LETTER D
0x45  0x0045  [E] [E]                            LATIN CAPITAL LETTER E
0x46  0x0046  [F] [F]                            LATIN CAPITAL LETTER F
0x47  0x0047  [G] [G]                            LATIN CAPITAL LETTER G
0x48  0x0048  [H] [H]                            LATIN CAPITAL LETTER H
0x49  0x0049  [I] [I]                            LATIN CAPITAL LETTER I
0x4A  0x004A  [J] [J]                            LATIN CAPITAL LETTER J
0x4B  0x004B  [K] [K]                            LATIN CAPITAL LETTER K
0x4C  0x004C  [L] [L]                            LATIN CAPITAL LETTER L
0x4D  0x004D  [M] [M]                            LATIN CAPITAL LETTER M
0x4E  0x004E  [N] [N]                            LATIN CAPITAL LETTER N
0x4F  0x004F  [O] [O]                            LATIN CAPITAL LETTER O
0x50  0x0050  [P] [P]                            LATIN CAPITAL LETTER P
0x51  0x0051  [Q] [Q]                            LATIN CAPITAL LETTER Q
0x52  0x0052  [R] [R]                            LATIN CAPITAL LETTER R
0x53  0x0053  [S] [S]                            LATIN CAPITAL LETTER S
0x54  0x0054  [T] [T]                            LATIN CAPITAL LETTER T
0x55  0x0055  [U] [U]                            LATIN CAPITAL LETTER U
0x56  0x0056  [V] [V]                            LATIN CAPITAL LETTER V
0x57  0x0057  [W] [W]                            LATIN CAPITAL LETTER W
0x58  0x0058  [X] [X]                            LATIN CAPITAL LETTER X
0x59  0x0059  [Y] [Y]                            LATIN CAPITAL LETTER Y
0x5A  0x005A  [Z] [Z]                            LATIN CAPITAL LETTER Z
0x5B  0x005B  [[] [[]                            LEFT SQUARE BRACKET
0x5C  0x005C  [\] [\]                            REVERSE SOLIDUS
0x5D  0x005D  []] []]                            RIGHT SQUARE BRACKET
0x5E  0x005E  [^] [^]                            CIRCUMFLEX ACCENT
0x5F  0x005F  [_] [_]                            LOW LINE
0x60  0x0060  [`] [`]                            GRAVE ACCENT
0x61  0x0061  [a] [a]                            LATIN SMALL LETTER A
0x62  0x0062  [b] [b]                            LATIN SMALL LETTER B
0x63  0x0063  [c] [c]                            LATIN SMALL LETTER C
0x64  0x0064  [d] [d]                            LATIN SMALL LETTER D
0x65  0x0065  [e] [e]                            LATIN SMALL LETTER E
0x66  0x0066  [f] [f]                            LATIN SMALL LETTER F
0x67  0x0067  [g] [g]                            LATIN SMALL LETTER G
0x68  0x0068  [h] [h]                            LATIN SMALL LETTER H
0x69  0x0069  [i] [i]                            LATIN SMALL LETTER I
0x6A  0x006A  [j] [j]                            LATIN SMALL LETTER J
0x6B  0x006B  [k] [k]                            LATIN SMALL LETTER K
0x6C  0x006C  [l] [l]                            LATIN SMALL LETTER L
0x6D  0x006D  [m] [m]                            LATIN SMALL LETTER M
0x6E  0x006E  [n] [n]                            LATIN SMALL LETTER N
0x6F  0x006F  [o] [o]                            LATIN SMALL LETTER O
0x70  0x0070  [p] [p]                            LATIN SMALL LETTER P
0x71  0x0071  [q] [q]                            LATIN SMALL LETTER Q
0x72  0x0072  [r] [r]                            LATIN SMALL LETTER R
0x73  0x0073  [s] [s]                            LATIN SMALL LETTER S
0x74  0x0074  [t] [t]                            LATIN SMALL LETTER T
0x75  0x0075  [u] [u]                            LATIN SMALL LETTER U
0x76  0x0076  [v] [v]                            LATIN SMALL LETTER V
0x77  0x0077  [w] [w]                            LATIN SMALL LETTER W
0x78  0x0078  [x] [x]                            LATIN SMALL LETTER X
0x79  0x0079  [y] [y]                            LATIN SMALL LETTER Y
0x7A  0x007A  [z] [z]                            LATIN SMALL LETTER Z
0x7B  0x007B  [{] [{]                            LEFT CURLY BRACKET
0x7C  0x007C  [|] [|]                            VERTICAL LINE
0x7D  0x007D  [}] [}]                            RIGHT CURLY BRACKET
0x7E  0x007E  [~] [~]                            TILDE
0x7F  0x007F  [ ] [ ]                            DELETE
0x80  0x20AC  [€] [€]      UNDEFINED             EURO SIGN
0x81          [ ] [ ]      UNDEFINED             UNDEFINED
0x82  0x201A  [‚] [‚]      UNDEFINED             SINGLE LOW-9 QUOTATION MARK
0x83  0x0192  [ƒ] [ƒ]      UNDEFINED             LATIN SMALL LETTER F WITH HOOK
0x84  0x201E  [„] [„]      UNDEFINED             DOUBLE LOW-9 QUOTATION MARK
0x85  0x2026  […] […]      UNDEFINED             HORIZONTAL ELLIPSIS
0x86  0x2020  [†] [†]      UNDEFINED             DAGGER
0x87  0x2021  [‡] [‡]      UNDEFINED             DOUBLE DAGGER
0x88  0x02C6  [ˆ] [ˆ]      UNDEFINED             MODIFIER LETTER CIRCUMFLEX ACCENT
0x89  0x2030  [‰] [‰]      UNDEFINED             PER MILLE SIGN
0x8A  0x0160  [Š] [Š]      UNDEFINED             LATIN CAPITAL LETTER S WITH CARON
0x8B  0x2039  [‹] [‹]      UNDEFINED             SINGLE LEFT-POINTING ANGLE QUOTATION MARK
0x8C  0x0152  [Œ] [Œ]      UNDEFINED             LATIN CAPITAL LIGATURE OE
0x8D          [ ] [ ]      UNDEFINED             UNDEFINED
0x8E  0x017D  [Ž] [Ž]      UNDEFINED             LATIN CAPITAL LETTER Z WITH CARON
0x8F          [ ] [ ]      UNDEFINED             UNDEFINED
0x90          [ ] [ ]      UNDEFINED             UNDEFINED
0x91  0x2018  [‘] [‘]      UNDEFINED             LEFT SINGLE QUOTATION MARK
0x92  0x2019  [’] [’]      UNDEFINED             RIGHT SINGLE QUOTATION MARK
0x93  0x201C  [“] [“]      UNDEFINED             LEFT DOUBLE QUOTATION MARK
0x94  0x201D  [”] [”]      UNDEFINED             RIGHT DOUBLE QUOTATION MARK
0x95  0x2022  [•] [•]      UNDEFINED             BULLET
0x96  0x2013  [–] [–]      UNDEFINED             EN DASH
0x97  0x2014  [—] [—]      UNDEFINED             EM DASH
0x98  0x02DC  [˜] [˜]      UNDEFINED             SMALL TILDE
0x99  0x2122  [™] [™]      UNDEFINED             TRADE MARK SIGN
0x9A  0x0161  [š] [š]      UNDEFINED             LATIN SMALL LETTER S WITH CARON
0x9B  0x203A  [›] [›]      UNDEFINED             SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
0x9C  0x0153  [œ] [œ]      UNDEFINED             LATIN SMALL LIGATURE OE
0x9D          [ ] [ ]      UNDEFINED             UNDEFINED
0x9E  0x017E  [ž] [ž]      UNDEFINED             LATIN SMALL LETTER Z WITH CARON
0x9F  0x0178  [Ÿ] [Ÿ]      UNDEFINED             LATIN CAPITAL LETTER Y WITH DIAERESIS
0xA0  0x00A0  [ ] [ ]                            NO-BREAK SPACE
0xA1  0x00A1  [¡] [¡]                            INVERTED EXCLAMATION MARK
0xA2  0x00A2  [¢] [¢]                            CENT SIGN
0xA3  0x00A3  [£] [£]                            POUND SIGN
0xA4  0x00A4  [¤] [¤]                            CURRENCY SIGN
0xA5  0x00A5  [¥] [¥]                            YEN SIGN
0xA6  0x00A6  [¦] [¦]                            BROKEN BAR
0xA7  0x00A7  [§] [§]                            SECTION SIGN
0xA8  0x00A8  [¨] [¨]                            DIAERESIS
0xA9  0x00A9  [©] [©]                            COPYRIGHT SIGN
0xAA  0x00AA  [ª] [ª]                            FEMININE ORDINAL INDICATOR
0xAB  0x00AB  [«] [«]                            LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0xAC  0x00AC  [¬] [¬]                            NOT SIGN
0xAD  0x00AD  [ ] [ ]                            SOFT HYPHEN
0xAE  0x00AE  [®] [®]                            REGISTERED SIGN
0xAF  0x00AF  [¯] [¯]                            MACRON
0xB0  0x00B0  [°] [°]                            DEGREE SIGN
0xB1  0x00B1  [±] [±]                            PLUS-MINUS SIGN
0xB2  0x00B2  [²] [²]                            SUPERSCRIPT TWO
0xB3  0x00B3  [³] [³]                            SUPERSCRIPT THREE
0xB4  0x00B4  [´] [´]                            ACUTE ACCENT
0xB5  0x00B5  [µ] [µ]                            MICRO SIGN
0xB6  0x00B6  [¶] [¶]                            PILCROW SIGN
0xB7  0x00B7  [·] [·]                            MIDDLE DOT
0xB8  0x00B8  [¸] [¸]                            CEDILLA
0xB9  0x00B9  [¹] [¹]                            SUPERSCRIPT ONE
0xBA  0x00BA  [º] [º]                            MASCULINE ORDINAL INDICATOR
0xBB  0x00BB  [»] [»]                            RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0xBC  0x00BC  [¼] [¼]                            VULGAR FRACTION ONE QUARTER
0xBD  0x00BD  [½] [½]                            VULGAR FRACTION ONE HALF
0xBE  0x00BE  [¾] [¾]                            VULGAR FRACTION THREE QUARTERS
0xBF  0x00BF  [¿] [¿]                            INVERTED QUESTION MARK
0xC0  0x00C0  [À] [À]                            LATIN CAPITAL LETTER A WITH GRAVE
0xC1  0x00C1  [Á] [Á]                            LATIN CAPITAL LETTER A WITH ACUTE
0xC2  0x00C2  [Â] [Â]                            LATIN CAPITAL LETTER A WITH CIRCUMFLEX
0xC3  0x00C3  [Ã] [Ã]                            LATIN CAPITAL LETTER A WITH TILDE
0xC4  0x00C4  [Ä] [Ä]                            LATIN CAPITAL LETTER A WITH DIAERESIS
0xC5  0x00C5  [Å] [Å]                            LATIN CAPITAL LETTER A WITH RING ABOVE
0xC6  0x00C6  [Æ] [Æ]                            LATIN CAPITAL LETTER AE
0xC7  0x00C7  [Ç] [Ç]                            LATIN CAPITAL LETTER C WITH CEDILLA
0xC8  0x00C8  [È] [È]                            LATIN CAPITAL LETTER E WITH GRAVE
0xC9  0x00C9  [É] [É]                            LATIN CAPITAL LETTER E WITH ACUTE
0xCA  0x00CA  [Ê] [Ê]                            LATIN CAPITAL LETTER E WITH CIRCUMFLEX
0xCB  0x00CB  [Ë] [Ë]                            LATIN CAPITAL LETTER E WITH DIAERESIS
0xCC  0x00CC  [Ì] [Ì]                            LATIN CAPITAL LETTER I WITH GRAVE
0xCD  0x00CD  [Í] [Í]                            LATIN CAPITAL LETTER I WITH ACUTE
0xCE  0x00CE  [Î] [Î]                            LATIN CAPITAL LETTER I WITH CIRCUMFLEX
0xCF  0x00CF  [Ï] [Ï]                            LATIN CAPITAL LETTER I WITH DIAERESIS
0xD0  0x00D0  [Ð] [Ð]                            LATIN CAPITAL LETTER ETH
0xD1  0x00D1  [Ñ] [Ñ]                            LATIN CAPITAL LETTER N WITH TILDE
0xD2  0x00D2  [Ò] [Ò]                            LATIN CAPITAL LETTER O WITH GRAVE
0xD3  0x00D3  [Ó] [Ó]                            LATIN CAPITAL LETTER O WITH ACUTE
0xD4  0x00D4  [Ô] [Ô]                            LATIN CAPITAL LETTER O WITH CIRCUMFLEX
0xD5  0x00D5  [Õ] [Õ]                            LATIN CAPITAL LETTER O WITH TILDE
0xD6  0x00D6  [Ö] [Ö]                            LATIN CAPITAL LETTER O WITH DIAERESIS
0xD7  0x00D7  [×] [×]                            MULTIPLICATION SIGN
0xD8  0x00D8  [Ø] [Ø]                            LATIN CAPITAL LETTER O WITH STROKE
0xD9  0x00D9  [Ù] [Ù]                            LATIN CAPITAL LETTER U WITH GRAVE
0xDA  0x00DA  [Ú] [Ú]                            LATIN CAPITAL LETTER U WITH ACUTE
0xDB  0x00DB  [Û] [Û]                            LATIN CAPITAL LETTER U WITH CIRCUMFLEX
0xDC  0x00DC  [Ü] [Ü]                            LATIN CAPITAL LETTER U WITH DIAERESIS
0xDD  0x00DD  [Ý] [Ý]                            LATIN CAPITAL LETTER Y WITH ACUTE
0xDE  0x00DE  [Þ] [Þ]                            LATIN CAPITAL LETTER THORN
0xDF  0x00DF  [ß] [ß]                            LATIN SMALL LETTER SHARP S
0xE0  0x00E0  [à] [à]                            LATIN SMALL LETTER A WITH GRAVE
0xE1  0x00E1  [á] [á]                            LATIN SMALL LETTER A WITH ACUTE
0xE2  0x00E2  [â] [â]                            LATIN SMALL LETTER A WITH CIRCUMFLEX
0xE3  0x00E3  [ã] [ã]                            LATIN SMALL LETTER A WITH TILDE
0xE4  0x00E4  [ä] [ä]                            LATIN SMALL LETTER A WITH DIAERESIS
0xE5  0x00E5  [å] [å]                            LATIN SMALL LETTER A WITH RING ABOVE
0xE6  0x00E6  [æ] [æ]                            LATIN SMALL LETTER AE
0xE7  0x00E7  [ç] [ç]                            LATIN SMALL LETTER C WITH CEDILLA
0xE8  0x00E8  [è] [è]                            LATIN SMALL LETTER E WITH GRAVE
0xE9  0x00E9  [é] [é]                            LATIN SMALL LETTER E WITH ACUTE
0xEA  0x00EA  [ê] [ê]                            LATIN SMALL LETTER E WITH CIRCUMFLEX
0xEB  0x00EB  [ë] [ë]                            LATIN SMALL LETTER E WITH DIAERESIS
0xEC  0x00EC  [ì] [ì]                            LATIN SMALL LETTER I WITH GRAVE
0xED  0x00ED  [í] [í]                            LATIN SMALL LETTER I WITH ACUTE
0xEE  0x00EE  [î] [î]                            LATIN SMALL LETTER I WITH CIRCUMFLEX
0xEF  0x00EF  [ï] [ï]                            LATIN SMALL LETTER I WITH DIAERESIS
0xF0  0x00F0  [ð] [ð]                            LATIN SMALL LETTER ETH
0xF1  0x00F1  [ñ] [ñ]                            LATIN SMALL LETTER N WITH TILDE
0xF2  0x00F2  [ò] [ò]                            LATIN SMALL LETTER O WITH GRAVE
0xF3  0x00F3  [ó] [ó]                            LATIN SMALL LETTER O WITH ACUTE
0xF4  0x00F4  [ô] [ô]                            LATIN SMALL LETTER O WITH CIRCUMFLEX
0xF5  0x00F5  [õ] [õ]                            LATIN SMALL LETTER O WITH TILDE
0xF6  0x00F6  [ö] [ö]                            LATIN SMALL LETTER O WITH DIAERESIS
0xF7  0x00F7  [÷] [÷]                            DIVISION SIGN
0xF8  0x00F8  [ø] [ø]                            LATIN SMALL LETTER O WITH STROKE
0xF9  0x00F9  [ù] [ù]                            LATIN SMALL LETTER U WITH GRAVE
0xFA  0x00FA  [ú] [ú]                            LATIN SMALL LETTER U WITH ACUTE
0xFB  0x00FB  [û] [û]                            LATIN SMALL LETTER U WITH CIRCUMFLEX
0xFC  0x00FC  [ü] [ü]                            LATIN SMALL LETTER U WITH DIAERESIS
0xFD  0x00FD  [ý] [ý]                            LATIN SMALL LETTER Y WITH ACUTE
0xFE  0x00FE  [þ] [þ]                            LATIN SMALL LETTER THORN
0xFF  0x00FF  [ÿ] [ÿ]                            LATIN SMALL LETTER Y WITH DIAERESIS

Saturday, March 7, 2009

CSSCAN fails with ORA-00600, CSS-00152, CSS-00120

Problem Description
While running csscan, it fails with ORA-00600, CSS-00152 (failed to enumerate all tables) and CSS-00120, as shown below.

$ csscan system/a FULL=Y FROMCHAR=WE8ISO8859P1 TOCHAR=WE8MSWIN1252 LOG=csscanwin1252 ARRAY=1000000 PROCESS=2

Character Set Scanner v2.1 : Release 10.2.0.0.0 - Production on Sat Mar 7 21:10:05 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options

Enumerating tables to scan...
Warning: Entry/Exit code is optimized. Cannot restore context (UNWIND 22)

ORA-00600: internal error code, arguments: [15160], [], [], [], [], [], [], []
CSS-00152: failed to enumerate all tables
CSS-00120: failed to enumerate tables to scan

Scanner terminated unsuccessfully.

Cause of the Problem
The scan fails because tables exist in the recyclebin.

Solution of the Problem
1)Purge Recyclebin Objects: Query dba_recyclebin and make sure you will never need those objects again. If you do not need them, purge them. To do this, connect as sys as sysdba and issue,

SQL>conn sys as sysdba

SQL>purge dba_recyclebin;

2)Run csscan again.
$ csscan system/a FULL=Y FROMCHAR=WE8ISO8859P1 TOCHAR=WE8MSWIN1252 LOG=csscanwin1252 ARRAY=1000000 PROCESS=2

Related Documents
CSSCAN fails with CSS-00151: failed to enumerate user tables CSS-00120
CSSCAN fails with error while loading shared libraries: libclntsh.so.10.1
How to run csscan in the background as a sysdba
CSSCAN fails with CSS-00107: Character set migration utility schema not installed
ORA-00904: "CNVTYPE" CSS-08888: failed to update conversion type

CSSCAN fails with error while loading shared libraries: libclntsh.so.10.1

Problem Description
While running csscan to check all character data in the database and test for the effects and problems of changing the character set, it fails with an error while loading shared libraries: libclntsh.so.10.1: cannot open shared object file, as shown below.

[oracle@dbsoft ~]$ csscan sys/a as sysdba full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
csscan: error while loading shared libraries: libclntsh.so.10.1: cannot open shared object file: No such file or directory

Cause and solution of the Problem
The problem happens because the LD_LIBRARY_PATH environment variable is missing an entry for the Oracle libraries. Setting the variable properly solves the problem. On my 32-bit Red Hat Linux system, setting
$ export LD_LIBRARY_PATH=$ORACLE_HOME/lib
solves the problem.
Details about this problem are discussed at,

http://arjudba.blogspot.com/2008/09/on-solaris-64-bit-rman-fails-with.html

How to run csscan in the background as a sysdba

With a simple command,
csscan system/test full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
you can run csscan to check all character data in the database and test for the effects and problems of changing the character set encoding.

By default csscan runs in the foreground, so if you exit the terminal from which you started csscan, csscan stops as well. This is quite painful whenever you run csscan on a remote computer via ssh or other terminal software, because you can't guarantee network connectivity: if the network goes down, your terminal session terminates and csscan terminates with it.

To solve the problem, the unix nohup tool is a great relief. With the help of nohup we can run the process in the background and send the output to a text file, so the process keeps running after we exit the terminal. After hours or days we can check whether the process has completed.

To run csscan in the background issue the following command,
$ nohup csscan system/a full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4 &

Note that you have to append an ampersand at the end to send the process to the background.

Later we can check the status of csscan with,
$ ps -ef | grep csscan
to see whether scanning has completed.

The character set scan has to cover the full database, and sys is the most powerful user. So, to be able to access everything, always run csscan as "sys as sysdba". Oracle also recommends running csscan as the sys user. However, when running csscan as the sys user you might face difficulties like these,

[oracle@dbsoft ~]$ csscan sys/a as sysdba full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
LRM-00108: invalid positional parameter value 'as'
failed to process command line parameters

Scanner terminated unsuccessfully.

[oracle@dbsoft ~]$ csscan userid="sys/a as sysdba" full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
[1] 8042
LRM-00112: multiple values not allowed for parameter 'userid'

[oracle@dbsoft ~]$ csscan sys/a full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4

Character Set Scanner v2.1 : Release 10.2.0.0.0 - Production on Sat Mar 7 18:03:59 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

ORA-28009: connection as SYS should be as SYSDBA or SYSOPER

Scanner terminated unsuccessfully.
You can avoid the last error by following http://arjudba.blogspot.com/2008/05/ora-28009-connection-as-sys-should-be.html, that is, by setting O7_DICTIONARY_ACCESSIBILITY=TRUE, but this is not recommended.

So the question is how to run csscan as "sys as sysdba" while the process runs in the background. Below are the steps.

Step 01: Run csscan with nohup, but without any userid parameter (username+password).
$ nohup csscan <All options except username/password go here>

For example:
$ nohup csscan full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4

Note that at the end there is no ampersand.

Step 02: Press the Enter key.

Step 03: At this step all terminal output is being redirected to nohup.out, so you can't see it, but your terminal is waiting for username and password input.
So give the password of sys and connect as sysdba.
For example, enter the following,
sys/a as sysdba
where a is the password of user sys.

Step 04: Press the Enter key.

Step 05: At this stage csscan should be running in the foreground, and all terminal output is being redirected to nohup.out.

You still see your shell session there, but it accepts no keyboard input. Just press ctrl+z.

Step 06: At the shell prompt type,
bg

You are done. Now csscan runs in the background and you may close the current window, disconnect from the network or log off the terminal. The process keeps running and the terminal output goes to the file nohup.out. Check the status of the process with,

$ ps -ef | grep csscan

Note: Make sure that you type "bg" at the shell after the ctrl+z. If you don't, the process will remain suspended and will not do anything.

Here is a sample session from my terminal.

[oracle@dbsoft ~]$ nohup csscan full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4
nohup: appending output to `nohup.out'
sys/a as sysdba
bg

After this I pressed the cross button to close the window.
In a new session I logged in and found the progress inside nohup.out.
[oracle@dbsoft ~]$ cat nohup.out
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

Enumerating tables to scan...

. process 2 scanning MAXIMSG.SUBSCRIBERS_CARDS[AAAM6NAAGAAAXwJAAA]
. process 1 scanning MAXIMSG.FIRST_LEG_ACC[AAAM4IAAGAAAD+JAAA]

There is also an alternative. You can do the above task by simply escaping the quote characters around the userid value, as below.

$ nohup $ORACLE_HOME/bin/csscan userid=\'sys/a as sysdba\' full=y tochar=WE8MSWIN1252 ARRAY=1024000 process=4 &
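
One way to see what the escaping changes is to look at the words the shell hands to csscan. The sketch below uses Python's shlex.split as a stand-in for POSIX shell word splitting (shell_words is a name made up here, and the command lines are shortened). How csscan's own parameter parser then treats those words is not shown in this post, so that part remains an assumption.

```python
import shlex


def shell_words(command_line):
    """Split a command line into words roughly as a POSIX shell would."""
    return shlex.split(command_line)


# Plain double quotes: the shell removes them and passes one word.
double_quoted = shell_words('csscan userid="sys/a as sysdba" full=y')

# Backslash-escaped single quotes: the quote characters themselves survive.
escaped = shell_words(r"csscan userid=\'sys/a as sysdba\' full=y")

print(double_quoted)
print(escaped)
```

In the first case csscan receives userid=sys/a as sysdba with no quote characters left. In the second case the literal quote characters reach csscan, which presumably lets it treat the quoted text as a single userid value.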

Enjoy the post. Keep reading my blog.

Related Documents
CSSCAN fails with CSS-00151: failed to enumerate user tables CSS-00120
CSSCAN fails with error while loading shared libraries: libclntsh.so.10.1
CSSCAN fails with CSS-00107: Character set migration utility schema not installed
ORA-00904: "CNVTYPE" CSS-08888: failed to update conversion type
CSSCAN fails with ORA-00600, CSS-00152, CSS-00120

Wednesday, March 4, 2009

Unicode characterset in Oracle database.

Before starting this post let's get an idea about Unicode. Unicode is a universal encoding scheme designed to include far more characters than a normal character set; in fact, Unicode aims to list ALL characters. So, with Unicode support in Oracle, data in any language can be stored in and retrieved from Oracle.

Oracle has supported Unicode through several character sets, starting from Oracle 7.

Below is the list of character sets used to support Unicode in Oracle.

1)AL24UTFFSS: This was the first Unicode character set supported by Oracle. The AL24UTFFSS encoding scheme was based on the Unicode 1.1 standard, which is now obsolete. This Unicode character set was used from Oracle version 7.2 to 8.1.

2)UTF8: UTF8 was the UTF-8 encoded character set in Oracle8 and 8i. It followed the Unicode 2.1 standard between Oracle 8.0 and 8.1.6, and was upgraded to Unicode version 3.0 for Oracle versions 8.1.7, 9i, 10g and 11g. If supplementary characters are inserted into a UTF8 database encoded with Unicode version 3.0, then each one is treated as 2 separate undefined characters, occupying 6 bytes in storage. So for full support of supplementary characters use the AL32UTF8 character set instead of UTF8.

3)UTFE: UTFE is the Unicode character set for EBCDIC platforms, where it has the same properties as UTF8 has on ASCII based platforms. Like UTF8, it is used in different Oracle versions.

4)AL32UTF8: This is the UTF-8 encoded character set introduced in Oracle9i.
In Oracle 9.2 AL32UTF8 implemented Unicode 3.1,
in 10.1 it implemented the Unicode 3.2 standard,
in Oracle 10.2 it supports the Unicode 4.01 standard and
in Oracle 11.1 it supports Unicode 5.0.

AL32UTF8 was introduced to provide support for the newly defined supplementary characters. All supplementary characters are stored as 4 bytes in AL32UTF8. Because the concept of supplementary characters did not exist when UTF8 was designed, UTF8 has a maximum of 3 bytes per character.

5)AL16UTF16: This is the first UTF-16 encoded character set in Oracle. It was introduced in Oracle9i as the default national character set (NLS_NCHAR_CHARACTERSET). It also provides support for the newly defined supplementary characters. All supplementary characters are stored as 4 bytes.
As with AL32UTF8, the plan is to keep enhancing AL16UTF16 as necessary to support future versions of the Unicode standard.

AL16UTF16 cannot be used as a database character set (NLS_CHARACTERSET); it is only used as the national character set (NLS_NCHAR_CHARACTERSET).

As with AL32UTF8,
in Oracle 9.0 AL16UTF16 implemented Unicode 3.0,
in Oracle 9.2 it implemented Unicode 3.1,
in 10.1 it implemented the Unicode 3.2 standard,
in Oracle 10.2 it supports the Unicode 4.01 standard and
in Oracle 11.1 it supports Unicode 5.0.
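
The byte counts quoted above for supplementary characters (6 bytes in UTF8, 4 bytes in AL32UTF8 and AL16UTF16) can be reproduced with a small sketch. It assumes that Python's utf-8 and utf-16-be codecs behave like AL32UTF8 and AL16UTF16 for this purpose, and it models the UTF8 case by encoding each UTF-16 surrogate separately, as the description above suggests. The function names are made up for illustration.

```python
def utf8_style_length(ch):
    """Bytes used when each UTF-16 code unit is encoded on its own (old UTF8 style)."""
    utf16 = ch.encode("utf-16-be")
    total = 0
    for i in range(0, len(utf16), 2):
        unit = int.from_bytes(utf16[i:i + 2], "big")
        # "surrogatepass" lets Python encode a lone surrogate code unit.
        total += len(chr(unit).encode("utf-8", "surrogatepass"))
    return total


def al32utf8_style_length(ch):
    """Bytes used by standard UTF-8."""
    return len(ch.encode("utf-8"))


def al16utf16_style_length(ch):
    """Bytes used by UTF-16 (big endian, no byte order mark)."""
    return len(ch.encode("utf-16-be"))


supplementary = "\U00010400"  # a character outside the Basic Multilingual Plane
print(utf8_style_length(supplementary))       # 6
print(al32utf8_style_length(supplementary))   # 4
print(al16utf16_style_length(supplementary))  # 4
```

For a character inside the Basic Multilingual Plane, such as the euro sign, the first two functions agree (3 bytes), which is why the difference only shows up for supplementary characters.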

Related Documents
What is NLS_LANG environmental variable?
What is database character set and how to check it
Different ways to set up NLS parameters
What is national character set / NLS_NCHAR_CHARACTERSET?
Which datatypes use the National Character Set?
What is character set and character set encoding

Saturday, February 28, 2009

What is NLS_LANG environmental variable?

NLS_LANG is a client-side environment variable. Setting the NLS_LANG environment parameter is the simplest way to specify locale behavior.

The NLS_LANG setting on the client machine specifies the language, territory and character set used by the client application. Because NLS_LANG also specifies the client character set, Oracle knows which character set is used for data entered or displayed by a client program, and can convert (if needed) between the client's character set and the database character set.

On a UNIX machine NLS_LANG is an environment variable, and on a Windows machine its value comes from registry settings.

The NLS_LANG parameter has the following format.
NLS_LANG=[Language]_[Territory].[client character set]
The default value of NLS_LANG is AMERICAN_AMERICA.US7ASCII, which indicates that
the language is AMERICAN,
the territory is AMERICA, and
the character set is US7ASCII.

The first part of the NLS_LANG parameter is the language, and it is used for Oracle Database messages, sorting, day names, and month names. Each language has a unique name.
The language specifies default values for territory and character set, so if the language is specified then the other two arguments can be omitted. The language can have a value like AMERICAN, GERMAN, FRENCH, JAPANESE etc. The default value is AMERICAN.

The second part of the NLS_LANG parameter is the territory, and it is used for default date, monetary, and numeric formats. Each territory has a unique name. The territory can have a value like AMERICA, FRANCE, JAPAN, CANADA etc. If the territory is not specified, then the value is derived from the language value.

The third part of the NLS_LANG parameter is the client character set. It specifies the character set that is used by the client application. The client character set used for Oracle should be equivalent to the character set supported by the client machine. This character set should also be equivalent to or a subset of the character set used for your database, so that every character input through the terminal has a matching character to map to in the database. Examples of client character sets are US7ASCII, WE8ISO8859P1, WE8DEC, WE8MSWIN1252 etc.

It is important to note that all three parts of the NLS_LANG environment variable/parameter are optional. If a part is not specified, a default value is used, which may be a derived value. You can specify the territory and/or the character set without a language value; in that case you must include the preceding delimiter: underscore (_) for the territory and period (.) for the character set. If you omit the delimiter, the whole value is parsed as a language name.

For example, you can set only the territory portion with,
NLS_LANG=_FRANCE

You can set only the client character set portion with,
NLS_LANG=.WE8MSWIN1252
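
The delimiter rules above can be sketched in a few lines of Python. This is not Oracle's parser, only a simplified illustration; the names NlsLang and parse_nls_lang are hypothetical, and a missing part is returned as None rather than being filled with a default or derived value.

```python
from typing import NamedTuple, Optional


class NlsLang(NamedTuple):
    language: Optional[str]
    territory: Optional[str]
    charset: Optional[str]


def parse_nls_lang(value: str) -> NlsLang:
    """Split 'LANGUAGE_TERRITORY.CHARSET' into its three optional parts."""
    # Everything after the first period is the client character set.
    rest, _, charset = value.strip().partition(".")
    # Everything after the first underscore is the territory.
    language, _, territory = rest.partition("_")
    # Empty strings mean "not specified"; report them as None.
    return NlsLang(language or None, territory or None, charset or None)


print(parse_nls_lang("AMERICAN_AMERICA.US7ASCII"))
print(parse_nls_lang("_FRANCE"))
print(parse_nls_lang(".WE8MSWIN1252"))
print(parse_nls_lang("FRANCE"))  # no delimiter: the whole value is the language
```

The last call shows why the delimiter matters: without the leading underscore, FRANCE ends up in the language slot instead of the territory slot.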

The three parts of NLS_LANG can be specified in many combinations, but not every combination works properly. For example,
NLS_LANG = JAPANESE_JAPAN.WE8ISO8859P1

This combination will not work properly, because it tries to support Japanese through a Western European character set, and the WE8ISO8859P1 character set does not contain any Japanese characters.

So if you set the NLS_LANG environment variable as above, you cannot store or display Japanese characters.
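
Why this combination fails can be seen with a small Python sketch. It assumes that Python's iso8859_1 and euc_jp codecs stand in for WE8ISO8859P1 and JA16EUC; the helper can_encode is a made-up name. Python raises an error for a character it cannot encode, which is used here simply as a yes/no test.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


japanese_text = "日本語"  # the word for "Japanese language"

# euc_jp is assumed to correspond to JA16EUC, iso8859_1 to WE8ISO8859P1.
print("JA16EUC can hold it:", can_encode(japanese_text, "euc_jp"))
print("WE8ISO8859P1 can hold it:", can_encode(japanese_text, "iso8859_1"))
print("bytes needed in EUC-JP:", len(japanese_text.encode("euc_jp")))
```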

Some logical combinations,
NLS_LANG = AMERICAN_AMERICA.WE8MSWIN1252
NLS_LANG = FRENCH_CANADA.WE8ISO8859P1
NLS_LANG = JAPANESE_JAPAN.JA16EUC

On the server machine there is no need to set the NLS_LANG environment variable; it is needed only on the client machine. The character set named in NLS_LANG should be a subset of, or equal to, the database character set so that every client character has a counterpart in the database and Oracle can convert the client data correctly. It is also important that the character set value of NLS_LANG reflects the character set the client machine actually supports, so that the client can display the data properly. For example, if Japanese character support is not installed on the client machine but NLS_LANG is set to JAPANESE_JAPAN.JA16EUC, the client will not be able to display Japanese characters properly.

Important Notes About NLS_LANG Parameter
1)NLS_LANG lets Oracle know which character set the client's OS is using so that Oracle can do (if needed) conversion from the client's character set to the database character set.

2)NLS_LANG does not need to be the same as the database character set.

3)The character set defined with the NLS_LANG parameter does not change your client's character set. You cannot change the character set of your client by using a different NLS_LANG setting; NLS_LANG only tells Oracle which character set you are using on the client side.

4)If you do not set NLS_LANG on the client, the client does not use the NLS_LANG of the server. Instead, the default NLS_LANG described earlier in this post is used.

5)If the character set in NLS_LANG matches the database character set, Oracle performs no conversion or validation of the character data; an incorrect NLS_LANG setting can therefore let garbage data into the database.
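
Note 5 can be illustrated with a small Python model. This is only a sketch under assumptions: the client terminal is assumed to use code page 850 while NLS_LANG claims WE8MSWIN1252 (Python codec cp1252), the same as the database, and the function store_without_conversion is a made-up stand-in for the pass-through that happens when the two settings match.

```python
def store_without_conversion(client_bytes: bytes) -> bytes:
    """Model the pass-through case: NLS_LANG equals the database character
    set, so the bytes are stored exactly as the client sent them."""
    return client_bytes


typed = "é"

# The terminal really produces cp850 bytes, but NLS_LANG says WE8MSWIN1252.
stored = store_without_conversion(typed.encode("cp850"))

# A correctly configured WE8MSWIN1252 client later reads the same bytes.
read_back = stored.decode("cp1252")

print("typed:    ", repr(typed))
print("stored:   ", stored)
print("read back:", repr(read_back))
```

The byte that meant é on the first client is a different character for every client that interprets it as WE8MSWIN1252, and nothing in the database flags the problem.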

Related Documents
Unicode characterset in Oracle database.
What is database character set and how to check it
Different ways to set up NLS parameters
What is national character set / NLS_NCHAR_CHARACTERSET?
Which datatypes use the National Character Set?
What is character set and character set encoding

Different ways to set up NLS parameters

NLS stands for National Language Support. The NLS_* parameters determine the locale-specific behavior on both the client and the server, where the * in NLS_* stands for the various suffixes that name the individual NLS parameters.

There are many NLS_* parameters, such as NLS_SORT, NLS_LANGUAGE, NLS_CHARACTERSET, and NLS_DATE_LANGUAGE. In this post I will show the ways NLS parameters can be set, in order of priority.

1)In SQL functions:
If you set NLS_* parameters inside SQL functions, that setting has the highest priority.

You can set them in SQL functions like this,
TO_CHAR(sysdate, 'DD/MON/YYYY', 'nls_date_language = FRENCH')

Below is an example. Note that French is not installed on my client machine, so the output might not display properly.


SQL> select sysdate from dual;

SYSDATE
---------
07-FEB-09

SQL> select TO_CHAR(sysdate, 'DD/MON/YYYY', 'nls_date_language = FRENCH') from dual;

TO_CHAR(SYSDA
-------------
07/F╔VR./2009

A setting made this way (inside SQL functions) overrides the default values that are set for the session in the initialization parameter file, set for the client with environment variables, or set for the session by the ALTER SESSION statement.

2)With the ALTER SESSION statement:
A setting made through the ALTER SESSION statement has the second highest priority. It overrides the default values that are set for the session in the initialization parameter file or set by the client with environment variables.

Below is an example. Because Japanese is not installed on my client machine, the Japanese characters might not display properly.


SQL> select sysdate from dual;

SYSDATE
---------
07-FEB-09

SQL> alter session set NLS_DATE_LANGUAGE=JAPANESE;

Session altered.

SQL> select sysdate from dual;

SYSDATE
----------
07-2┐  -09

3)Through environment variables on the client machine:
This setting has the third highest priority. You can set NLS_* parameters through OS environment variables. How an environment variable is set is platform specific. On a Windows machine you can set it with,
C:\>set NLS_*=value
On a UNIX machine,
$ export NLS_*=value (bash shell)
% setenv NLS_* value (c shell)

Below is an example on my Windows client machine.
C:\>set NLS_SORT=FRENCH

4)As initialization parameters on the server:
You can set the NLS_* parameters on the server machine inside the initialization parameter file. Settings in the initialization parameter file specify a default session NLS environment. They have no effect on the client side; they control only the server's behavior.
For example, if you use an spfile, you can set the NLS_TERRITORY parameter as below,

SQL> ALTER SYSTEM SET NLS_TERRITORY = "CZECH REPUBLIC" scope=spfile;

System altered.
Then bounce (restart) the database for the change to take effect.

Summarized as a table by priority, the ways to set NLS parameters are,


Priority         How the parameter is set
-----------      -----------------------------------------
1 (highest)      Set in SQL functions
2                Set by an ALTER SESSION statement
3                Set as an environment variable
4                Specified in the initialization parameter file
5 (lowest)       Default
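
The priority order in the table can be modelled as a simple lookup chain. The Python sketch below is only a simplified illustration, not how Oracle resolves the values internally; the source labels and the function resolve_nls are made-up names, and the sample settings are hypothetical.

```python
from typing import Mapping, Optional, Tuple

# Sources ordered from highest to lowest priority, as in the table above.
PRIORITY = ("sql_function", "alter_session", "environment", "init_parameter")


def resolve_nls(name: str,
                settings: Mapping[str, Mapping[str, str]],
                default: Optional[str] = None) -> Tuple[Optional[str], str]:
    """Return (value, source) for the highest-priority source that sets name."""
    for source in PRIORITY:
        value = settings.get(source, {}).get(name)
        if value is not None:
            return value, source
    return default, "default"


# Hypothetical settings: the server default and a later ALTER SESSION.
settings = {
    "init_parameter": {"NLS_DATE_LANGUAGE": "AMERICAN"},
    "alter_session": {"NLS_DATE_LANGUAGE": "JAPANESE"},
}

print(resolve_nls("NLS_DATE_LANGUAGE", settings))
print(resolve_nls("NLS_SORT", settings, default="BINARY"))
```

In this model the ALTER SESSION value wins over the initialization parameter, and a parameter that no source sets falls through to the default.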

How to find out whether there are N-type columns in the database

The query below returns the owner and table name of every table in the database that has N-type (NCHAR, NVARCHAR2 or NCLOB) columns.


SQL> select distinct OWNER, TABLE_NAME from DBA_TAB_COLUMNS where DATA_TYPE
in ('NCHAR','NVARCHAR2', 'NCLOB') order by 1;

OWNER                          TABLE_NAME
------------------------------ ------------------------------
SYS                            ALL_REPPRIORITY
SYS                            DBA_AUDIT_EXISTS
SYS                            DBA_AUDIT_OBJECT
SYS                            DBA_AUDIT_STATEMENT
SYS                            DBA_AUDIT_TRAIL
SYS                            DBA_COMMON_AUDIT_TRAIL
SYS                            DBA_FGA_AUDIT_TRAIL
SYS                            DBA_REPPRIORITY
SYS                            DEFLOB
SYS                            STREAMS$_DEF_PROC
SYS                            USER_AUDIT_OBJECT
SYS                            USER_AUDIT_STATEMENT
SYS                            USER_AUDIT_TRAIL
SYS                            USER_REPPRIORITY
SYSTEM                         DEF$_LOB
SYSTEM                         DEF$_TEMP$LOB
SYSTEM                         REPCAT$_PRIORITY

17 rows selected.

DBA_FGA_AUDIT_TRAIL belongs to Fine Grained Auditing.

ALL_REPPRIORITY, DBA_REPPRIORITY, USER_REPPRIORITY, DEF$_LOB, DEF$_TEMP$LOB and REPCAT$_PRIORITY belong to Advanced Replication.

DEFLOB belongs to the Deferred Transactions functionality.

STREAMS$_DEF_PROC belongs to Oracle Streams.

Friday, February 27, 2009

Which datatypes use the National Character Set?

There are three datatypes that can store data in the national character set.

1)NCHAR: A fixed-length national character set character datatype. It uses CHAR length semantics, that is, the length of an NCHAR column is defined in characters.

2)NVARCHAR2: A variable-length national character set character datatype. It uses CHAR length semantics, that is, the length of an NVARCHAR2 column is defined in characters.

3)NCLOB: Stores national character set data of up to four gigabytes. The data is always stored in UCS2 or AL16UTF16, even if NLS_NCHAR_CHARACTERSET is UTF8.

If you use the NCHAR/NVARCHAR2/NCLOB datatypes, use the N'...' syntax when coding literals for them; the prefix letter N denotes that the literal is in the national character set.

Below is an example.


SQL> create table t_test(col1 NVARCHAR2(30));

Table created.

SQL> insert into t_test values(N'This is NLS_NCHAR_CHARACTERSET');

1 row created.

What is national character set / NLS_NCHAR_CHARACTERSET?

The national character set is a character set defined in an Oracle database in addition to the normal (database) character set.

The normal character set is defined by the parameter NLS_CHARACTERSET and the national character set is defined by the parameter NLS_NCHAR_CHARACTERSET.

The national character set is used for data stored in NCHAR, NVARCHAR2 and NCLOB columns, while the normal character set is used for data stored in CHAR, VARCHAR2 and CLOB columns.

You can get the value of the national character set, NLS_NCHAR_CHARACTERSET, with,


SQL> select value from nls_database_parameters where parameter='NLS_NCHAR_CHARACTERSET';

VALUE
----------------------------------------
AL16UTF16

SQL> select value$ from sys.props$ where name='NLS_NCHAR_CHARACTERSET';

VALUE$
--------------------------------------------------------------------------------
AL16UTF16

SQL> select property_value from database_properties where property_name
='NLS_NCHAR_CHARACTERSET';

PROPERTY_VALUE
--------------------------------------------------------------------------------
AL16UTF16

NLS_NCHAR_CHARACTERSET is defined when the database is created; it is specified in the CREATE DATABASE command.

The default value of NLS_NCHAR_CHARACTERSET is AL16UTF16.

From Oracle 9i onwards NLS_NCHAR_CHARACTERSET can have only two values, UTF8 or AL16UTF16, and both are Unicode character sets.

National character set columns are always defined in CHAR length semantics; you cannot define them in BYTE. That means if you define NCHAR(5), a maximum of 5 characters can be stored regardless of how many bytes they occupy.
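
The difference between counting characters and counting bytes can be seen with a short Python sketch. It assumes that Python's utf-16-be and utf-8 codecs give the same byte counts as AL16UTF16 and UTF8 for the sample strings used here (plain Western European letters); the helper name length_report is made up for this example.

```python
def length_report(text: str) -> dict:
    """Count the characters in text and the bytes each encoding needs."""
    return {
        "characters": len(text),
        "AL16UTF16 bytes": len(text.encode("utf-16-be")),  # assumed equivalent codec
        "UTF8 bytes": len(text.encode("utf-8")),           # assumed equivalent codec
    }


# Both strings have 5 characters, so both would fit into NCHAR(5),
# even though their byte lengths differ from encoding to encoding.
print(length_report("abcde"))
print(length_report("Größe"))
```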

Many people think that they need to use NLS_NCHAR_CHARACTERSET to have Unicode support in Oracle, but this is not true. You can use Unicode in either of two ways: store the data in NCHAR, NVARCHAR2 or NCLOB columns, or use "normal" CHAR and VARCHAR2 columns in a database whose NLS_CHARACTERSET is AL32UTF8 / UTF8.

Thursday, February 26, 2009

What is Oracle Globalization Support

The term Oracle Globalization Support refers to the Oracle database's ability to store, process, and retrieve data in all languages. It also ensures that database utilities, error messages, date, time, monetary, numeric, and calendar conventions automatically adapt to any native language and locale.

Before 9i, Oracle Globalization Support was referred to as National Language Support (NLS) features. From 9i onwards, NLS is a subset of globalization support: the ability to choose a national language and store data in a specific character set.

The Oracle globalization support feature enables you to develop multilingual applications and software products that can be accessed from anywhere in the world and in any language. You can now store data in any language you wish in the database.

What is database character set and how to check it

Note that the database character set refers to a character set encoding; in Oracle database the terms character set and character set encoding are often used interchangeably.

The database character set determines which characters can be stored in the database. It is also the character set used for object identifiers, PL/SQL variables, and stored PL/SQL program source.

The database character set information is stored in the data dictionary table SYS.PROPS$.

You can get the character set used in the database from the SYS.PROPS$ table or from the views (such as database_properties / nls_database_parameters) that exist in the database. The value of the NLS_CHARACTERSET parameter contains the database character set name. Get it from,


SQL> select value$ from sys.props$ where name='NLS_CHARACTERSET';

VALUE$
--------------------------------------------------------------------------------
WE8MSWIN1252

SQL> select property_value from database_properties where property_name=
'NLS_CHARACTERSET';

PROPERTY_VALUE
--------------------------------------------------------------------------------
WE8MSWIN1252

SQL> select value from nls_database_parameters where parameter='NLS_CHARACTERSET';

VALUE
----------------------------------------
WE8MSWIN1252
