Oracle in World: September 2010

Wednesday, September 29, 2010

How to install Oracle 10g on Windows Vista or Windows 7

Step 01: Download the appropriate Oracle software version.
The first step is to download the Oracle software for Windows Vista or Windows 7. Check your hardware and operating system to determine whether you need the 32-bit or the 64-bit version of Oracle. The post http://arjudba.blogspot.com/2008/05/how-to-identify-os-or-oracle-64-bit-or.html explains how to determine which version to download.

You can get the Oracle software in two ways.
Either download it from
OTN (technet.oracle.com)
or from
edelivery (edelivery.oracle.com)

I am showing the download from OTN (technet.oracle.com).
You can go directly to the page http://www.oracle.com/technetwork/database/database10g/downloads/index.html to download Oracle 10g.

If you want to download 64-bit Oracle for Windows Vista or Windows 7, go to http://www.oracle.com/technetwork/database/10204-winx64-vista-win2k8-082253.html.

If you want to download 32-bit Oracle for Windows Vista or Windows 7, go to http://www.oracle.com/technetwork/database/10203vista-087538.html.

In summary,
For 32bit Windows: Oracle Database 10g Release 2 (10.2.0.3/10.2.0.4) for Microsoft Windows Vista and Windows 2008

For 64bit Windows: Oracle Database 10g Release 2 (10.2.0.4) for Microsoft Windows Vista x64 and Microsoft Windows Server 2008 x64

With the 32-bit download you can install on Windows Vista, Windows Server 2008 and Windows 7.
With the 64-bit download you can install on Windows Vista x64, Windows Server 2008 x64, Windows 7 x64 and Windows Server 2008 R2 x64.

Step 02: Run the installer (OUI).
Double click on setup.exe. Ignore the following prerequisite errors displayed by Oracle Universal Installer and complete the installation:
i) Checking operating system requirements
ii) Checking service pack requirements
(Click on the Ignore Errors Check Box)

Another option is to run the installer from the command line.
Change to the directory where the Oracle files were extracted.
Then run the command:
> setup.exe -ignoreSysPrereqs
(the option is case sensitive)
This skips the prerequisite checks.

Renaming global database name hangs with row cache enqueue lock

Problem Description
Renaming the global database name hangs and never completes. That is, the following statement hangs:

SQL> alter database rename global_name to orcl.world;

From the alert log we see that a trace file is generated containing the following entry.

>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! <<<
row cache enqueue: session: 0x11d4fd018, mode: N, request: X

Cause of the Problem
The database was using database links, which rely on the database GLOBAL_NAME. ALTER DATABASE RENAME GLOBAL_NAME requires an exclusive lock, and it was waiting for the database link sessions to end and release their locks on the underlying data dictionary table.

Solution of the Problem
Ensure that no active database link sessions currently exist, and then issue the rename global_name command.

Alternatively, you can do the following (new_name is a placeholder for the new global name):

SQL> shutdown immediate
SQL> startup restrict
SQL> alter database rename global_name to new_name;
SQL> alter system disable restricted session;

Primary DB freezes with waited too long for a row cache enqueue lock

Problem Description
On Oracle Database 10.2.0.3.0 with a Data Guard Broker configuration, whenever there is an attempt to restart the standby in read-only mode, the following errors occur:

ERROR: WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
ORA-12170: TNS:Connect timeout occurred
PING[ARC6]: Error 3113 when pinging standby
ARC6: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (1089)

The errors occur on the standby database, and the primary database hangs!

Cause of the Problem
The above error is caused by the LGWR process in an Oracle Data Guard configuration with RAC, and it is in fact an Oracle bug. The bug number is 7487408.
The bug fires whenever Data Guard is managed by Broker and the standby database is shut down or opened in read-only mode. The primary still tries to ship redo to the standby in SYNC mode and eventually hangs.

Solution of the Problem
- Download patch 7487408 from Oracle Support/Metalink to fix the bug.
- The bug is also fixed in Oracle patchset 10.2.0.5, so upgrading your Oracle database to 10.2.0.5 resolves it as well.

Tuesday, September 28, 2010

WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!

This article explains some reasons why you may encounter the "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" message in the alert log file. When row cache contention occurs and the enqueue cannot be obtained within a certain time period, a trace file is generated in the trace location with some trace details.

The trace file tends to contain the following words:
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! <<<

/opt/oracle/admin/EAIAPP/udump/eaiapp1_ora_23288.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /opt/oracle/product/dbs
System name: Linux
Node name: db1-eai.prod.stl.cw.intraisp.com
Release: 2.6.18-128.4.1.el5
Version: #1 SMP Thu Sep 23 19:59:19 EDT 2010
Machine: x86_64
Instance name: EAIAPP1
Redo thread mounted by this instance: 1
Oracle process number: 43
Unix process pid: 23288, image: oracle@db1-eai.prod.stl.cw.intraisp.com

*** 2010-09-25 17:08:05.532
*** ACTION NAME:() 2010-09-25 17:08:05.532
*** MODULE NAME:(OMS) 2010-09-25 17:08:05.532
*** SERVICE NAME:(EAIAPP) 2010-09-25 17:08:05.532
*** SESSION ID:(120.29614) 2010-09-25 17:08:05.532
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! <<<
row cache enqueue: session: 70000001b542d78, mode: N, request: S
row cache parent object: address=700000036f27328 cid=0(dc_tablespaces)
hash=a6840ab5 typ=9 transaction=0 flags=00008000

The trace will often contain a systemstate dump, although the most useful information is in the header section. Typically a session holding the row cache resource will either be on CPU or blocked by another session. If it is on CPU, then errorstacks are likely to be required to diagnose the problem, unless tuning can be done to reduce the enqueue hold time. Remember that in a RAC environment the holder may be on another node, so systemstates from each node will be required.

For each enqueue type, there are a limited number of operations that require each enqueue.

1) DC_TABLESPACES : The most likely cause is allocation of new extents. If extent sizes are set low then the application may constantly be requesting new extents and causing contention. Do you have objects with small extent sizes that are rapidly growing? (You may be able to spot these by looking for objects with large numbers of extents.) Check the trace for insert/update activity, and check the number of extents of the objects being inserted into.

2) DC_SEQUENCES : Check for appropriate caching of sequences for the application requirements.

3) DC_USERS : Deadlock and resulting "WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!" can occur if a session issues a GRANT to a user, and that user is in the process of logging on to the database.

4) DC_OBJECTS : Look for any object compilation activity which might require an exclusive lock and thus block online activity.

5) DC_SEGMENTS : This is likely to be down to segment allocation. Identify what the session holding the enqueue is doing and use errorstacks to diagnose.

6) In many cases no operation is reported in the trace file; only the session ID is given. In that case we need to investigate that session ID.
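When several trace files have piled up, it helps to see which dictionary cache shows up most often before working through the list above. The following is a minimal sketch, assuming the traces contain "row cache parent object" lines in the format shown earlier; the function name row_cache_summary is my own and not an Oracle utility.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): count "row cache parent object"
# entries per cache name. Reads the trace files given as arguments, or
# standard input when no file is given.
row_cache_summary() {
    grep -h 'row cache parent object' "$@" \
        | sed -n 's/.*cid=[0-9]*(\([a-z_]*\)).*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $2, $1 }'
}

# Example (path taken from the trace header above):
#   row_cache_summary /opt/oracle/admin/EAIAPP/udump/*.trc
```

Each output line is a cache name followed by the number of times it appears, most frequent first, which tells you which of the cases above to look at first.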

Index on a partitioned table waits on 'row cache lock' in RAC environment

Problem Description
Creating an index on a partitioned table fails with Oracle error ORA-04021. This is a 2-node Oracle RAC environment and the table has a large number of partitions and sub-partitions. Index creation waits on 'row cache lock'. The same happens while dropping a tablespace: DROP TABLESPACE hangs on a row cache enqueue.

Problem Investigation
From the trace file we get following information.

PROCESS 24:

    SO: 0xb9e505908, type: 4, owner: 0xb9f351708, flag: INIT/-/-/0x00
    (session) sid: 477 trans: 0xb7ce39798, creator: 0xb9f351708, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0001-0018-00000018, short-term DID: 0001-0018-00000019
              txn branch: (nil)
              oct: 9, prv: 0, sql: 0xacafc75e8, psql: 0xb0266cc18, user: 55/TA
    O/S info: user: oracle, term: pts/1, ospid: 16322, machine: rac2
              program: sqlplus@rac2 (TNS V1-V3)
    application name: SQL*Plus, hash value=3669949024
    waiting for 'row cache lock' blocking sess=0x(nil) seq=4750 wait_time=0 seconds since wait started=18
                cache id=2, mode=0, request=5

Process 24 is executing sql: 0xacafc75e8 which is:

LIBRARY OBJECT HANDLE: handle=acafc75e8 mtx=0xacafc7718(1) cdp=1
      name=CREATE INDEX USER_ACTIVITY_IDX ON PROD.USER_ACTIVITY(IMSI, CALLING_IMSI) TABLESPACE HISTORY_IDX_01  LOCAL PARALLEL 32 UNUSABLE

It seems this process is waiting for an exclusive-mode row cache enqueue on dc_segments:

----------------------------------------
        SO: 0xafb5989f0, type: 50, owner: 0xb7ce39798, flag: INIT/-/-/0x00
        row cache enqueue: count=1 session=0xb9e505908 object=0xb004f3690, request=X
        savepoint=0x25a3749
        row cache parent object: address=0xb004f3690 cid=2(dc_segments)
        hash=2997173a typ=11 transaction=(nil) flags=00000000
        own=0xb004f3760[0xb004f3760,0xb004f3760] wat=0xb004f3770[0xafb598a20,0xafb598a20] mode=N
        status=-/-/-/-/-/-/-/-/-
        request=X release=FALSE flags=2
        instance lock id=QC 1cd04d4c 3ab28c03
        data=
        0000006a 000001da 0039918b 00000000 00000000 00000000 00000000 00000000
        00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        00000000 00000000
        ----------------------------------------

From the trace file it is not clear who is holding this row cache enqueue.

Cause of the Problem
Index creation waits on 'row cache lock' in a RAC environment because of an Oracle bug.
Oracle tracks this bug as 'Bug 6321551' in version 10.2.0.2,
'Bug 8417354' in version 10.2.0.4, and
'Bug 6004916' in version 11.0.0.0.

Solution of the Problem
This is an unpublished Oracle internal bug.
- The bug is fixed in Oracle version 10.2.0.5, so apply the 10.2.0.5 patchset if you are running Oracle 10g.

- If your Oracle database version is 10.2.0.2 or 10.2.0.3, you can apply one-off patch 6004916.

- Another workaround is to create the index with only one instance running (single instance).
i) Stop all RAC instances but one.
ii) Run the index creation script on the remaining node.
iii) After the process completes, start up the rest of the instances.

Monday, September 27, 2010

How to check used or free disk space in ASM Instance

The asmcmd lsdg command lists the mounted disk groups and their information. By default, lsdg queries V$ASM_DISKGROUP_STAT.

The syntax of lsdg command is as follows,
lsdg [-gH][--discovery][pattern]

If the --discovery flag is specified, then the V$ASM_DISKGROUP view is queried instead. The output also includes notification of any current rebalance operation for a disk group. If a disk group is specified, then lsdg returns only information about that disk group.

If the -g option is specified, then lsdg selects from GV$ASM_DISKGROUP_STAT, or from GV$ASM_DISKGROUP if the --discovery flag is also specified.

From the lsdg command,

Total_MB indicates the size of the disk group in megabytes.

Free_MB indicates free space in the disk group in megabytes, without regard to redundancy. This is the FREE_MB column from the V$ASM_DISKGROUP view.

Req_mir_free_MB indicates the amount of space that must be available in the disk group to restore full redundancy after the most severe failure that can be tolerated by the disk group. This is the REQUIRED_MIRROR_FREE_MB column from the V$ASM_DISKGROUP view.

Usable_file_MB indicates the amount of free space, adjusted for mirroring, that is available for new files.

You can query the v$asm_diskgroup or V$ASM_DISKGROUP_STAT view directly, or you can issue lsdg within the asmcmd tool.

From the V$ASM_DISKGROUP view,

SYS@EAI1>SELECT name, free_mb, total_mb, free_mb/total_mb*100 "%" FROM v$asm_diskgroup;

NAME                              FREE_MB   TOTAL_MB          %
------------------------------ ---------- ---------- ----------
DISKGROUP1                         112168     245754 45.6423904

From ASMCMD,

ASMCMD> lsdg
State    Type    Rebal  Unbal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Name
MOUNTED  EXTERN  N      N         512   4096  1048576    245754   112168                0          112168              0  DISKGROUP1/
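The percentage that the SQL query computes can also be derived from the lsdg output itself. Below is a small sketch, assuming the header line names the Total_MB, Free_MB and Name columns as in the output above and that every column has a value on each row; lsdg_pct_free is a hypothetical name of my own, not an asmcmd feature.

```shell
#!/bin/sh
# Hypothetical helper: read "asmcmd lsdg" output on standard input and print
# "<disk group> <percent free>" for each disk group. Column positions are
# taken from the header line, so the column order does not matter.
lsdg_pct_free() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        NF && $(col["Total_MB"]) > 0 {
            printf "%s %.2f\n", $(col["Name"]), 100 * $(col["Free_MB"]) / $(col["Total_MB"])
        }
    '
}

# Example, assuming asmcmd is on the PATH of the ASM software owner:
#   asmcmd lsdg | lsdg_pct_free
```

For the disk group shown above this gives the same figure as the query against V$ASM_DISKGROUP, rounded to two decimals.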

ORA-01110 ORA-01187: cannot read from file because it failed verification tests

Problem Description
The alert log file of the physical standby database shows the following errors.
Errors in file /u01/app/oracle/diag/rdbms/bdafisdrs/bdafisdc1/trace/bdafisdc1_m001_19676.trc:
ORA-01187: cannot read from file because it failed verification tests
ORA-01110: data file 201: '+DATA/bdafisdrs/tempfile/temp'

Following is the contents found in the generated trace file.

Trace file /u01/app/oracle/diag/rdbms/bdafisdrs/bdafisdc1/trace/bdafisdc1_m001_19676.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security,
OLAP, Data Mining, Oracle Database Vault and Real Application Testing option
ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1
System name: Linux
Node name: DRS-DB-01
Release: 2.6.18-92.el5
Version: #1 SMP Tue Apr 29 13:16:15 EDT 2008
Machine: x86_64
Instance name: bdafisdc1
Redo thread mounted by this instance: 1
Oracle process number: 77
Unix process pid: 19676, image: oracle@DRS-DB-01 (M001)

*** 2010-09-19 11:35:51.595
*** SESSION ID:(103.167) 2010-09-19 11:35:51.595
*** CLIENT ID:() 2010-09-19 11:35:51.595
*** SERVICE NAME:(SYS$BACKGROUND) 2010-09-19 11:35:51.595
*** MODULE NAME:(MMON_SLAVE) 2010-09-19 11:35:51.595
*** ACTION NAME:(Autotask Slave Action) 2010-09-19 11:35:51.595

DDE: Problem Key 'ORA 1110' was flood controlled (0x5) (no incident)
ORA-01110: data file 201: '+DATA/bdafisdrs/tempfile/temp'
ORA-01187: cannot read from file because it failed verification tests
ORA-01110: data file 201: '+DATA/bdafisdrs/tempfile/temp'
Dump of memory from 0x0000000363A3B078 to 0x0000000363A3B3EA
...
...
...
ket_get_active: error 1187
ket_aba_main[2] : error 1187
ket_aba_slave: clearing error 1187

Cause of the Problem
Scenario 01: If your Oracle database version is 8i and you are able to activate the standby database with no errors, but each attempt to open the standby database in read-only mode fails with the above errors, then the cause is possibly the READ_ONLY_OPEN_DELAYED parameter setting.

Scenario 02:
If the database files reside on OCFS and the datafiles appear corrupt to nodes other than "node1" (instance1) in a RAC, then the problem occurred because the file system was mounted with the "reclaimid" option.

Solution of the Problem
Solution for Scenario 01:
Set READ_ONLY_OPEN_DELAYED parameter to false and then you will be able to open the database in read only mode.
SQL> Alter system set READ_ONLY_OPEN_DELAYED = FALSE scope=spfile;
SQL> shut immediate
SQL> startup

Solution for Scenario 02:
While mounting do not use the reclaimid mount option for "normal" mounts. This option should only be used once after having to regenerate the guid of a node after ip address change.

Step by Step Oracle 11gR2 RAC Installation on Linux

Step by step RAC installation

I want to divide the total installation work into four parts.

A. Preinstallation Configuration.

B. Installing Oracle Grid Infrastructure.

C. Installing the Oracle Database Software.

D. Creating Oracle Database.

Before I go through each individual step, let us discuss some RAC concepts so that you can easily follow the rest of this document.

Concepts & New things related to Oracle 11gR2

Starting with Oracle Database 11g Release 2, Oracle Clusterware and Oracle ASM are installed into a single home directory, which is called the Grid home. Oracle grid infrastructure refers to the installation of the combined products. Oracle Clusterware and Oracle ASM are still individual products, and are referred to by those names.

Oracle Clusterware enables servers, referred to as hosts or nodes, to operate together as if they are one server, commonly referred to as a cluster. Although the servers are standalone servers, each server has additional processes that communicate with other servers. In this way the separate servers appear as if they are one server to applications and end users. Oracle Clusterware provides the infrastructure necessary to run Oracle RAC.

A single-instance Oracle database has a one-to-one relationship between the data files and the instance. An Oracle RAC database, however, has a one-to-many relationship between data files and instances: multiple instances access a single set of database files. That is why the datafiles must be kept on shared storage that every instance can access. Each instance has its own memory structures and background processes.

Oracle Automatic Storage Management (ASM) is an integrated, high-performance volume manager and file system. With Oracle Database 11g Release 2, Oracle ASM adds support for storing the Oracle Clusterware OCR and voting disk files. The OCR and voting disk files are two components of Oracle Clusterware.

Oracle provides several tools to install and configure Oracle RAC. Here are their names.

i) Oracle Universal Installer (OUI) - GUI tool which installs the Oracle grid infrastructure software (which consists of Oracle Clusterware and Oracle ASM).

ii) Cluster Verification Utility (CVU) - A command-line tool that you can use to verify your environment. It is used for both preinstallation and postinstallation checks of your cluster environment.

iii) Oracle Enterprise Manager (OEM) - GUI tool for managing single-instance and Oracle RAC environments.

iv) SQL*Plus - Command-line interface that enables you to perform database management operations for a database.

v) Server Control (SRVCTL) - A command-line interface that you can use to manage the resources defined in the Oracle Cluster Registry (OCR).

vi) Cluster Ready Services Control (CRSCTL) - A command-line tool that you can use to manage Oracle Clusterware daemons. These daemons include Cluster Synchronization Services (CSS), Cluster-Ready Services (CRS), and Event Manager (EVM).

vii) Database Configuration Assistant (DBCA) - A GUI utility that is used to create and configure Oracle databases.

viii) Oracle Automatic Storage Management Configuration Assistant (ASMCA) - A utility that supports installing and configuring Oracle ASM instances, disk groups, and volumes. It has both a GUI and a non-GUI interface.

ix) Oracle Automatic Storage Management Command Line utility (ASMCMD) - A command-line utility that you can use to manage Oracle ASM instances, Oracle ASM disk groups, file access control for disk groups, files and directories within Oracle ASM disk groups, templates for disk groups, and Oracle ASM volumes.

x) Listener Control (LSNRCTL) - A command-line interface that you use to administer listeners.

If you use ASMCMD, srvctl, sqlplus, or lsnrctl to manage Oracle ASM or its listener, then use the binaries located in the Grid home, not the binaries located in the Oracle Database home, and set the ORACLE_HOME environment variable to the location of the Grid home.

If you use srvctl, sqlplus, or lsnrctl to manage a database instance or its listener, then use the binaries located in the Oracle home where the database instance or listener is running, and set the ORACLE_HOME environment variable to the location of that Oracle home.

OUI no longer supports installation of Oracle Clusterware files on block or raw devices.

A. Preinstallation Requirements.

Hardware Requirements.
Network Hardware Requirements.
IP Address Requirements.
OS and software Requirements.
Preparing the server to install Grid Infrastructure.

Hardware Requirements:

The minimum required RAM is 1.5 GB for grid infrastructure for a cluster, or 2.5 GB for grid infrastructure for a cluster and Oracle RAC. To check the amount of RAM, issue:

# grep MemTotal /proc/meminfo

The minimum required swap space is 1.5 GB. Oracle recommends that you set swap space as follows:

- For systems with 2 GB of RAM or less, use 1.5 times the amount of RAM.

- For systems with 2 GB to 16 GB of RAM, use swap space equal to RAM.

- For systems with more than 16 GB of RAM, use 16 GB of swap space.

To check swap space, issue:

# grep SwapTotal /proc/meminfo
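The swap recommendation above is easy to turn into a quick check. This is only a sketch of the three rules as listed; the function names recommended_swap_mb and meminfo_kb are mine, and the result is a guideline to compare against, not an installer requirement.

```shell
#!/bin/sh
# Hypothetical helper: recommended swap size in MB for a given amount of
# RAM in MB, following the three rules listed above.
recommended_swap_mb() {
    local ram_mb="$1"
    if [ "$ram_mb" -le 2048 ]; then
        echo $(( ram_mb * 3 / 2 ))   # 1.5 x RAM for 2 GB or less
    elif [ "$ram_mb" -le 16384 ]; then
        echo "$ram_mb"               # equal to RAM between 2 GB and 16 GB
    else
        echo 16384                   # 16 GB above that
    fi
}

# Hypothetical helper: print a field such as MemTotal or SwapTotal (in kB)
# from /proc/meminfo, or from another meminfo-style file given as $2.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Example on a Linux host:
#   ram_mb=$(( $(meminfo_kb MemTotal) / 1024 ))
#   swap_mb=$(( $(meminfo_kb SwapTotal) / 1024 ))
#   echo "RAM ${ram_mb} MB, swap ${swap_mb} MB, recommended $(recommended_swap_mb "$ram_mb") MB"
```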

You need at least 1 GB of temp space in /tmp; having more does no harm.

To check your temp space, issue:

# df -h /tmp

You will need at least 4.5 GB of available disk space for the Grid home directory, which includes both the binary files for Oracle Clusterware and Oracle Automatic Storage Management (Oracle ASM) and their associated log files, and at least 4 GB of available disk space for the Oracle Database home directory.

To check the space on the OS partitions, issue:

# df -h
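If you prefer a pass/fail answer over reading the df output by eye, a small wrapper can compare the available space with the minimums mentioned above. This is a rough sketch: has_free_mb is a name of my own, and it assumes the POSIX output format of df -P -k with no spaces in the file system name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file system holding directory $1 has
# at least $2 MB available. Uses the POSIX "df -P -k" format (1024-byte
# blocks, one line per file system).
has_free_mb() {
    local dir="$1" min_mb="$2" avail_kb
    avail_kb=$(df -P -k "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ $(( avail_kb / 1024 )) -ge "$min_mb" ]
}

# Example with the 1 GB minimum for /tmp mentioned above:
#   has_free_mb /tmp 1024 || echo "/tmp has less than 1 GB free"
```

The same call with 4608 (4.5 GB) or 4096 (4 GB) can be pointed at the file systems that will hold the Grid home and the database home.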

Network Hardware Requirements:

Each node must have at least two network interface cards (NIC), or network adapters. One adapter is for the public network interface and the other adapter is for the private network interface (the interconnect).

You need to install additional network adapters on a node if that node does not have at least two network adapters or has two network interface cards but is using network attached storage (NAS). You should have a separate network adapter for NAS.

Public interface names must be the same for all nodes. If the public interface on one node uses the network adapter eth0, then you must configure eth0 as the public interface on all nodes.

You should configure the same private interface names for all nodes as well. If eth1 is the private interface name for the first node, then eth1 should be the private interface name for your second node.

The private network adapters must support the user datagram protocol (UDP) using high-speed network adapters and a network switch that supports TCP/IP (Gigabit Ethernet or better). Oracle recommends that you use a dedicated network switch.

IP Address Requirements.

You must have a DNS server for the SCAN listener to work. So, before you proceed with the installation, prepare your DNS server. You must manually add the following entries to your DNS server:

i) A public IP address for each node

ii) A virtual IP address for each node

iii) Three single client access name (SCAN) addresses for the cluster

During installation a SCAN for the cluster is configured, which is a domain name that resolves to all the SCAN addresses allocated for the cluster. The IP addresses used for the SCAN addresses must be on the same subnet as the VIP addresses. The SCAN must be unique within your network. The SCAN addresses should not respond to ping commands before installation.

OS and software Requirements.

To determine which distribution and version of Linux is installed as root user issue,

# cat /proc/version

Be sure your Linux version is supported by Oracle Database 11gR2.

To determine which chip architecture each server is using and which version of the software you should install, as the root user issue,

# uname -m

This command displays the processor type. For a 64-bit architecture, the output would be "x86_64".

To determine if the required errata level is installed, as the root user issue,

# uname -r

2.6.9-55.0.0.0.2.ELsmp

In this output the kernel version is 2.6.9, and the errata level (EL) is 55.0.0.0.2.ELsmp.
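The split between kernel version and errata level is just the text before and after the first hyphen in the uname -r output. A minimal sketch with two helper names of my own choosing:

```shell
#!/bin/sh
# Hypothetical helpers: split a kernel release string at its first hyphen.
kernel_version() { echo "${1%%-*}"; }   # part before the first hyphen
errata_level()   { echo "${1#*-}"; }    # part after the first hyphen

# Example:
#   rel=$(uname -r)
#   echo "kernel $(kernel_version "$rel"), errata level $(errata_level "$rel")"
```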

To determine whether the required packages are installed, enter commands similar to the following:

# rpm -q package_name

The Cluster Verification Utility, as well as OUI itself, will tell you whether any packages required to install Grid Infrastructure are missing. If a package is missing, you can install it with:

# rpm -Uvh package_name

Preparing the server to install Grid Infrastructure.

i) Synchronize the time between all RAC nodes:

Oracle Clusterware 11g release 2 (11.2) requires time synchronization across all nodes within a cluster when Oracle RAC is deployed. The Linux command

# dateconfig

provides a GUI through which you can set the same time on all nodes. But for accurate time synchronization across the nodes you have two options: an operating system configured network time protocol (NTP), or Oracle Cluster Time Synchronization Service (CTSS).

I recommend using Oracle Cluster Time Synchronization Service because it can synchronize time among cluster members without contacting an external time server.

Note that if you use NTP, then the Oracle Cluster Time Synchronization daemon (ctssd) starts up in observer mode. If you do not have NTP daemons, then ctssd starts up in active mode.

If you have NTP daemons on your server but you cannot configure them to synchronize time with a time server, and you want to use Cluster Time Synchronization Service to provide synchronization service in the cluster, then deactivate and deinstall the Network Time Protocol (NTP).

To deactivate it, do the following:

# /sbin/service ntpd stop

# chkconfig ntpd off

# rm /etc/ntp.conf

or, instead of removing the file, rename it:

# mv /etc/ntp.conf /etc/ntp.conf.org

Also remove the following file:

/var/run/ntpd.pid

ii) Create the required OS users and groups.

# groupadd -g 1000 oinstall

# groupadd -g 1200 dba

# useradd -u 1100 -g oinstall -G dba oracle

# mkdir -p /u01/app/11.2.0/grid

# mkdir -p /u01/app/oracle

# chown -R oracle:oinstall /u01

# chmod -R 775 /u01/

# passwd oracle

iii) Modify the linux kernel parameters.

Open the /etc/sysctl.conf file and change the values as shown below.

# vi /etc/sysctl.conf

kernel.shmall = 2097152
kernel.shmmax = 536870912
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144

Note that the GUI does not give any warning about the kernel.sem parameter setting; if you do not set this parameter manually, you may hit unexpected errors later.
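Since the installer stays silent about kernel.sem, it is worth confirming what actually ended up in the file. The sketch below reads a value back from a sysctl.conf-style file; sysctl_conf_value is a hypothetical helper of mine, and it only inspects the file, not the values currently loaded in the running kernel.

```shell
#!/bin/sh
# Hypothetical helper: print the value assigned to key $1 in a
# sysctl.conf-style file ($2, default /etc/sysctl.conf). Comment lines are
# skipped and the last assignment of the key wins.
sysctl_conf_value() {
    local key="$1" file="${2:-/etc/sysctl.conf}"
    awk -F= -v key="$key" '
        /^[ \t]*#/ { next }
        {
            k = $1
            gsub(/[ \t]/, "", k)
            if (k == key) {
                v = $0
                sub(/^[^=]*=[ \t]*/, "", v)
                sub(/[ \t]+$/, "", v)
                val = v
            }
        }
        END { if (val != "") print val }
    ' "$file"
}

# Example, using the value from the listing above:
#   [ "$(sysctl_conf_value kernel.sem)" = "250 32000 100 128" ] \
#       || echo "kernel.sem is not set as expected in /etc/sysctl.conf"
```

After editing the file, running /sbin/sysctl -p as root loads the values without a reboot.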

iv) Configure the network.

Determine the cluster name. We set the cluster name as dc-db-cluster

Determine the public, private and virtual host name for each node in the cluster.

It is determined as,

For host dc-db-01 public host name as dc-db-01

For host dc-db-02 public host name as dc-db-02

For host dc-db-01 private host name as dc-db-01-priv

For host dc-db-02 private host name as dc-db-02-priv

For host dc-db-01 virtual host name as dc-db-01-vip

For host dc-db-02 virtual host name as dc-db-02-vip

Identify the interface names and associated IP addresses for all network adapters by executing the following command on each node:

# /sbin/ifconfig

On each node in the cluster, assign a public IP address with an associated network name to one network adapter. The public name for each node should be registered with your domain name system (DNS).

Also configure a private IP address on each cluster member node, on a subnet separate from the public network.

Also determine the virtual IP address for each node in the cluster. These addresses and names should be registered in your DNS server. The virtual IP address must be on the same subnet as your public IP address.

Note that you do not need to configure these private, public, virtual addresses manually in the /etc/hosts file.

You can test whether or not an interconnect interface is reachable using a ping command.

Define a SCAN that resolves to three IP addresses in your DNS.

My full IP address assignment table is as follows.

Identity	Host Node	Name	Type	Address	Address static or dynamic	Resolved by
Node 1 Public	dc-db-01	dc-db-01	Public	192.168.100.101	Static	DNS
Node 1 virtual	Selected by oracle clusterware	dc-db-01-vip	Virtual	192.168.100.103	Static	DNS and/ or hosts file
Node 1 private	dc-db-01	dc-db-01-priv	Private	192.168.200.101	Static	DNS, hosts file, or none
Node 2 Public	dc-db-02	dc-db-02	Public	192.168.100.102	Static	DNS
Node 2 virtual	Selected by oracle clusterware	dc-db-02-vip	Virtual	192.168.100.104	Static	DNS and/ or hosts file
Node 2 private	dc-db-02	dc-db-02-priv	Private	192.168.200.102	Static	DNS, hosts file, or none
SCAN vip 1	Selected by oracle clusterware	dc-db-cluster	Virtual	192.168.100.105	Static	DNS
SCAN vip 2	Selected by oracle clusterware	dc-db-cluster	Virtual	192.168.100.106	Static	DNS
SCAN vip 3	Selected by oracle clusterware	dc-db-cluster	Virtual	192.168.100.107	Static	DNS
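The subnet rules (VIP and SCAN addresses on the public subnet, private addresses on a different one) can be checked with a little integer arithmetic. This is a sketch only: the helper names are mine, and the 255.255.255.0 netmask is an assumption, since the table above does not state the mask.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted IPv4 address to an integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Hypothetical helper: succeed if addresses $1 and $2 are on the same
# subnet. The netmask $3 defaults to 255.255.255.0 (an assumption).
same_subnet() {
    local mask
    mask=$(ip_to_int "${3:-255.255.255.0}")
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

# Example with addresses from the table above:
#   same_subnet 192.168.100.101 192.168.100.103 && echo "VIP is on the public subnet"
#   same_subnet 192.168.100.101 192.168.200.101 || echo "private is on its own subnet"
```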

Enter your DNS nameserver address in your /etc/resolv.conf file.

# vi /etc/resolv.conf

nameserver 192.168.100.1

Verify the network configuration by using the ping command to test the connection from each node in your cluster to all the other nodes.

$ ping -c 3 dc-db-01

$ ping -c 3 dc-db-02

v) Configure shared storage

a. Oracle RAC is a shared-everything database. All datafiles, clusterware files, and database files must share a common space. Oracle strongly recommends using ASM for shared storage.

b. When using Oracle ASM for either the Oracle Clusterware files or Oracle Database files, Oracle creates one Oracle ASM instance on each node in the cluster, regardless of the number of databases.

c. You need to prepare the storage for Oracle Automatic Storage Management (ASM). This preparation is necessary because, when you reboot the server, unless you have configured special files for device persistence, a disk that appeared as /dev/sdg before the shutdown can appear as /dev/sdh, and the permissions on the device can change after the restart.

d. Installing the Linux ASMLIB RPMs is the simplest solution for storage administration. ASMLIB provides persistent paths and permissions for storage devices used with Oracle ASM, eliminating the need to update udev or devlabel files with storage device paths and permissions.

e. You can download the ASMLIB RPMs by browsing to http://www.oracle.com/technetwork/topics/linux/downloads/index.html, selecting the Downloads tab, and clicking on "Linux Drivers for Automatic Storage Management". You will see ASMLIB RPMs for various operating systems such as SuSE Linux Enterprise Server 11, SuSE Linux Enterprise Server 10, Red Hat Enterprise Linux 5 AS, Red Hat Enterprise Linux 4 AS, SuSE Linux Enterprise Server 9, Red Hat Enterprise Linux 3 AS, SuSE Linux Enterprise Server 8 SP3, and Red Hat Advanced Server 2.1.

Select your OS from the list and download the oracleasmlib and oracleasm-support packages for your version of Linux. Then you must download the appropriate package for the kernel you are running. Use the uname -r command to determine the version of the kernel on your server. For example, if your kernel version is 2.6.18-194.8.1.el5, then you need to download the oracleasm driver for kernel 2.6.18-194.8.1.el5.

ASMLib 2.0 is delivered as a set of three Linux packages:

i) oracleasmlib-2.0 - the Oracle ASM libraries

ii) oracleasm-support-2.0 - utilities needed to administer ASMLib

iii) oracleasm - a kernel module for the Oracle ASM library
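Because the kernel driver package carries the kernel release in its name, the three package names to look for can be derived from uname -r. A small sketch, with asmlib_packages being a hypothetical helper name and the names printed without their version suffixes:

```shell
#!/bin/sh
# Hypothetical helper: print the three ASMLib package names (without
# version suffixes) for kernel release $1, defaulting to the running kernel.
asmlib_packages() {
    local kernel="${1:-$(uname -r)}"
    printf '%s\n' oracleasm-support oracleasmlib "oracleasm-${kernel}"
}

# Example: check with rpm -q whether they are already installed.
#   for p in $(asmlib_packages); do rpm -q "$p"; done
```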

f. As a root user, install these three packages.

# rpm -Uvh oracleasm-support-2.1.3-1.el4.x86_64.rpm

# rpm -Uvh oracleasmlib-2.0.4-1.el4.x86_64.rpm

# rpm -Uvh oracleasm-2.6.9-55.0.12.ELsmp-2.0.3-1.x86_64.rpm

g. To configure ASMLIB issue,

# oracleasm configure -i

If you enter the command oracleasm configure without the -i flag, then you are shown the current configuration.

After you issue oracleasm configure -i you will be prompted to provide

Default user to own the driver interface (example: oracle),

Default group to own the driver interface (example: dba),

Start Oracle Automatic Storage Management Library driver on boot (y/n): (provide: y),

Fix permissions of Oracle ASM disks on boot? (y/n): (provide: y)

After the command runs, it does the following:

Creates the /etc/sysconfig/oracleasm configuration file

Creates the /dev/oracleasm mount point

Mounts the ASMLIB driver file system

Enter the following command to load the oracleasm kernel module:

# /usr/sbin/oracleasm init

Repeat the above steps on all nodes.
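Because the answers are stored in /etc/sysconfig/oracleasm, you can compare that file across nodes to confirm each node was configured the same way. Below is a small Python sketch that reads simple KEY=value lines. The sample contents are illustrative only, and the key names shown are an assumption about what your ASMLib version writes, so check them against your own file.

```python
def parse_sysconfig(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional double quotes around the value.
        settings[key.strip()] = value.strip().strip('"')
    return settings


# Illustrative sample, not copied from a real server.
SAMPLE = """\
# answers given to oracleasm configure -i
ORACLEASM_ENABLED=true
ORACLEASM_UID=oracle
ORACLEASM_GID=dba
ORACLEASM_SCANBOOT=true
"""
```

Running parse_sysconfig on the file contents from each node and comparing the resulting dictionaries shows quickly whether one node was configured with a different owner or group.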

h. To mark a disk for use by Oracle ASM, enter the following command syntax, where ASM_DISK_NAME is the name (label) you want to give the Oracle ASM disk, and candidate_disk is the name of the disk device that you want to mark:

# oracleasm createdisk ASM_DISK_NAME candidate_disk

In other words,

# /usr/sbin/oracleasm createdisk ASM_DISK_NAME device_partition_name

For example,

# oracleasm createdisk data1 /dev/sdf

Note that for Oracle Enterprise Linux and Red Hat Enterprise Linux version 5, when scanning, the kernel sees the devices as /dev/mapper/XXX entries. By default, the 2.6 kernel device file naming scheme udev creates the /dev/mapper/XXX names for human readability. Any configuration using ORACLEASM_SCANORDER should use the /dev/mapper/XXX entries.

So your command would look like,

# oracleasm createdisk data1 /dev/mapper/mpath1

You need to issue a createdisk command for each disk that will be used for Oracle ASM.

If you need to unmark a disk that was used in a createdisk command, use,

# /usr/sbin/oracleasm deletedisk disk_name

After you have created all the ASM disks for your cluster, use the listdisks command to verify their availability:

# /usr/sbin/oracleasm listdisks

Note that you need to create the ASM disks on only one node. On all the other nodes in the cluster, use the scandisks command to view the newly created ASM disks.

# /usr/sbin/oracleasm scandisks

After scanning for ASM disks, display the available ASM disks on each node to verify their availability:

# /usr/sbin/oracleasm listdisks
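To verify that every node sees the same disks, you can collect the listdisks output from each node and compare it. The sketch below is one hypothetical way to do that in Python. It assumes you have already captured the text printed by oracleasm listdisks on each node (one disk label per line); the node names and labels in the example are made up.

```python
def missing_disks(listdisks_by_node: dict) -> dict:
    """Map each node to the ASM disk labels that it does not see.

    listdisks_by_node maps a node name to the text printed by
    `oracleasm listdisks` on that node. Nodes that see every disk
    are left out of the result.
    """
    seen = {node: set(out.split()) for node, out in listdisks_by_node.items()}
    all_disks = set().union(*seen.values())
    return {
        node: sorted(all_disks - disks)
        for node, disks in seen.items()
        if all_disks - disks
    }
```

An empty result means all nodes report the same set of disks. A non-empty result names the nodes where scandisks should be run again.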

i. If you use ASMLIB, then you do not need to ensure permissions and device path persistency in udev. If you do not use ASMLib, then you must create a custom rules file in the path /etc/udev/rules.d/.

Preinstallation configuration is complete at this stage. Now we will move on to installing Oracle Grid Infrastructure.

B. Installing Oracle Grid Infrastructure.

As the oracle user, run runInstaller from the Oracle Grid Infrastructure CD-ROM.

$ ./runInstaller

If you don't have the CD, download the software from http://download.oracle.com/otn/linux/oracle11g/R2/linux_11gR2_grid.zip, unzip it and then run runInstaller.

I am providing a screenshot of each step instead of discussing them at length.

At this stage the Oracle Grid Infrastructure installation is successful. Now we need to install the Oracle software and create the database.

C. Installing the Oracle Database Software.

Insert the CD-ROM that contains the Oracle Database software, and then simply run runInstaller as the oracle user. To reduce the size of the document I have pasted screenshots of only the first and last steps.

$ ./runInstaller

At this stage both the Oracle Grid Infrastructure installation and the Oracle Database software installation are done. Now you need to create the database.

D. Creating Oracle Database.

Simply log in as the oracle user.

Set ORACLE_HOME to your Oracle software home, not the grid home.

$ export ORACLE_HOME=ORACLE_INSTALLATION_HOME_HERE

$ $ORACLE_HOME/bin/dbca

The steps are so simple that I will not paste screenshots here.

When it prompts for the Global database name, enter,

DB_NAME.world

TNS-12542: TNS:address already in use Linux Error: 98

Problem Description
After changing the listener entry, starting the Oracle listener fails with error TNS-12542 as shown below.

[grid@DC-DB-01 ~]$ lsnrctl start listener2

LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 19-SEP-2010 11:41:48

Copyright (c) 1991, 2009, Oracle.  All rights reserved.

Starting /u01/app/11.2.0/grid/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 11.2.0.1.0 - Production
System parameter file is /u01/app/11.2.0/grid/network/admin/listener.ora
Log messages written to /u01/app/11.2.0/grid/log/diag/tnslsnr/DC-DB-01/listener2/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER2)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=DC-DB-01)(PORT=1522)))
Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=DC-DB-01-vip)(PORT=1522)))
TNS-12542: TNS:address already in use
TNS-12560: TNS:protocol adapter error
TNS-00512: Address already in use
Linux Error: 98: Address already in use

Listener failed to start. See the error message(s) above...

The XML log file shows the following messages.

<msg time='2010-09-19T11:42:38.026+06:00' org_id='oracle' comp_id='tnslsnr'
 type='UNKNOWN' level='16' host_id='DC-DB-01'
 host_addr='192.168.100.101'>
<txt>Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=DC-DB-01-vip)(PORT=1522)))
</txt>
</msg>
<msg time='2010-09-19T11:42:38.026+06:00' org_id='oracle' comp_id='tnslsnr'
 type='UNKNOWN' level='16' host_id='DC-DB-01'
 host_addr='192.168.100.101'>
<txt>TNS-12542: TNS:address already in use
TNS-12560: TNS:protocol adapter error
TNS-00512: Address already in use
Linux Error: 98: Address already in use
</txt>
</msg>
<msg time='2010-09-19T11:42:38.027+06:00' org_id='oracle' comp_id='tnslsnr'
 type='UNKNOWN' level='16' host_id='DC-DB-01'
 host_addr='192.168.100.101'>
<txt>No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER2)))
</txt>
</msg>
<msg time='2010-09-19T11:42:38.027+06:00' org_id='oracle' comp_id='tnslsnr'
 type='UNKNOWN' level='16' host_id='DC-DB-01'
 host_addr='192.168.100.101'>
<txt>No longer listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=DC-DB-01)(PORT=1522)))
</txt>
</msg>

Here are the contents of the listener.ora file.

LISTENER2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER2))
      (ADDRESS = (PROTOCOL = TCP)(HOST =DC-DB-01)(PORT = 1522))
      (ADDRESS = (PROTOCOL = TCP)(HOST =DC-DB-01-vip)(PORT = 1522))      
    )
  )

Cause of the Problem
The "TNS-00512: Address already in use" error occurs because a duplicate port is used in the same LISTENER2 entry. It is not possible to start a listener using duplicate TCP port or IPC KEY values in a single listener.ora file configuration.

Solution of the Problem
The solution is to edit the listener entry so that there is no duplicate listener TCP port or IPC key. You can either change the port or IPC key, or remove the duplicate entry. So either of the following two entries is valid.

LISTENER2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER2))
      (ADDRESS = (PROTOCOL = TCP)(HOST =DC-DB-01)(PORT = 1522))
  
    )
  )

or,

LISTENER2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER2))
      (ADDRESS = (PROTOCOL = TCP)(HOST =DC-DB-01)(PORT = 1522))
      (ADDRESS = (PROTOCOL = TCP)(HOST =DC-DB-01-vip)(PORT = 1523))      
    )
  )
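The check described above can be automated before starting the listener. Here is a minimal Python sketch that scans listener.ora text for TCP ports or IPC keys used by more than one ADDRESS. It follows this post's diagnosis (a port counts as a duplicate even when the hosts differ), treats the whole text as one listener definition, and uses simple regular expressions rather than a full listener.ora parser. The function and constant names are hypothetical.

```python
import re
from collections import Counter

# One (ADDRESS = (...)(...)...) group with its inner (KEY = value) pairs.
ADDRESS_RE = re.compile(r"\(\s*ADDRESS\s*=\s*((?:\([^()]*\)\s*)+)\)", re.IGNORECASE)
PAIR_RE = re.compile(r"\(\s*(\w+)\s*=\s*([^()\s]+)\s*\)")

# The failing entry from this post.
BROKEN_LISTENER_ORA = """
LISTENER2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER2))
      (ADDRESS = (PROTOCOL = TCP)(HOST =DC-DB-01)(PORT = 1522))
      (ADDRESS = (PROTOCOL = TCP)(HOST =DC-DB-01-vip)(PORT = 1522))
    )
  )
"""


def duplicate_endpoints(listener_ora: str) -> list:
    """Return (protocol, port or key) pairs that occur in more than one ADDRESS."""
    counts = Counter()
    for match in ADDRESS_RE.finditer(listener_ora):
        fields = {k.upper(): v for k, v in PAIR_RE.findall(match.group(1))}
        protocol = fields.get("PROTOCOL", "").upper()
        if protocol == "TCP" and "PORT" in fields:
            counts[("TCP", fields["PORT"])] += 1
        elif protocol == "IPC" and "KEY" in fields:
            counts[("IPC", fields["KEY"])] += 1
    return sorted(endpoint for endpoint, n in counts.items() if n > 1)
```

For the failing entry this reports TCP port 1522 as a duplicate. For either of the two corrected entries above it returns an empty list.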

Sunday, September 26, 2010

ORA-12532: TNS:invalid argument

Problem Description
While connecting to the Oracle database, the connection fails with ORA-12532: TNS:invalid argument as shown below.

$ sqlplus system@bddip

SQL*Plus: Release 11.1.0.6.0 - Production on Sun Sep 26 13:18:56 2010

Copyright (c) 1982, 2007, Oracle. All rights reserved.

Enter password:
ERROR:
ORA-12532: TNS:invalid argument

Cause of the Problem
If you look up the Oracle error message for ORA-12532 you will see,

ORA-12532:TNS:invalid argument
Cause:  An internal function received an invalid parameter.
Action:  Not normally visible to the user. For further details, turn on tracing and reexecute the operation. If error persists, contact Oracle Customer Support.

It sounds like an Oracle bug. If you run tnsping, it also fails as shown below,

$tnsping ddip
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)
(HOST = 192.168.100.1)(PORT = 1521))) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = bddip.com)))
TNS-12532: TNS:invalid argument

If you enable client tracing you will see,

ntt2err: soc  error - operation=1, ntresnt[0]=502, ntresnt[1]=113, ntresnt[2]=0 
ntt2err: exit
nttcni: exit 
nttcon: exit 
nserror: entry 
nserror: nsres: id=0, op=65, ns=12532, ns2=12560; nt[0]=502, nt[1]=113, nt[2]=0; ora[0]=0, ora[1]=0, ora[2]=0
nsopen: unable to open transport

OS error 113 on Linux means "No route to host".

Based on the trace, we can say the problem is caused by firewall settings or a network issue.

Note that the problem can also happen because of the password of the user you are connecting as, for example when the password contains the '@' symbol.

Solution of the Problem
Ensure that a firewall is not blocking the connection. You can easily test this from the client machine by using telnet.
$ telnet {database server IP} {listener port}

If your database server IP address is 192.168.100.1 and listener port is 1521 then issue,

$ telnet 192.168.100.1 1521

If the connection succeeds you will get a blank screen with a blinking cursor. It will fail to connect if there is either a firewall or a network transport issue.
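If telnet is not installed on the client, the same reachability test can be done with a few lines of Python. This is only a sketch of the telnet check above using the standard socket module; the function name and the 3 second default timeout are my own choices.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and "No route to host"
        # (OS error 113 on Linux, as seen in the trace).
        return False
```

With the values from this post you would call can_connect("192.168.100.1", 1521). A False result points to the firewall or the network rather than to the database itself.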

If a firewall is blocking the listener port, remove it or allow the port through.

Also, if the password contains the '@' character, remove that character.

ORA-12709: error while loading create database character set

Problem Description
While mounting the Oracle database, it fails with ORA-12709 as shown below.

SQL> alter database mount;
alter database mount
*
ERROR at line 1:
ORA-12709: error while loading create database character set

Cause of the Problem
The error ORA-12709 is returned because the environment variable NLS_LANG or ORA_NLS33 is set incorrectly.

Solution of the Problem
Check your NLS_LANG environment variable with,
$ echo $NLS_LANG
If it is set to the wrong value and your database character set is WE8ISO8859P1, then set it with,
$ export NLS_LANG=American_America.WE8ISO8859P1
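The NLS_LANG value has the form LANGUAGE_TERRITORY.CHARSET, so a quick script can pull out the character set part and compare it with the one you expect. The Python sketch below handles only the full three-part form shown above; the function names are hypothetical, and the comparison simply follows this post's advice of setting the character set part to the database character set.

```python
def parse_nls_lang(value: str):
    """Split NLS_LANG of the form LANGUAGE_TERRITORY.CHARSET into its parts."""
    locale_part, dot, charset = value.partition(".")
    language, underscore, territory = locale_part.partition("_")
    if not (dot and underscore and language and territory and charset):
        raise ValueError(f"expected LANGUAGE_TERRITORY.CHARSET, got {value!r}")
    return language, territory, charset


def charset_matches(nls_lang: str, database_charset: str) -> bool:
    """Compare the character set part of NLS_LANG, ignoring case."""
    return parse_nls_lang(nls_lang)[2].upper() == database_charset.upper()
```

For example, parse_nls_lang("American_America.WE8ISO8859P1") splits the value into American, America and WE8ISO8859P1.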

Note that for Oracle7 V7.3.2 the ORA_NLS33 environment variable is called ORA_NLS,
for Oracle7 V7.3.3 and V7.3.4 it is called ORA_NLS32,
and for Oracle8 it is called ORA_NLS33, because of the NLS library version.

When using both Oracle8 V8.x and Developer/2000 V1.6.1 in the same Oracle Home, ORA_NLS33 needs to be set to $ORACLE_HOME/ocommon/nls/admin/datad2k.
The environment variable for each database version is given below.
RDBMS 7.2.x -> ORA_NLS
RDBMS 7.3.x -> ORA_NLS32
RDBMS 8.0.x -> ORA_NLS33
RDBMS 8.1.x -> ORA_NLS33
RDBMS 9.X.X -> ORA_NLS33
RDBMS 10.X -> ORA_NLS10

After you set both variables correctly, log in as the sys user.

$ sqlplus '/ as sysdba'

SQL*Plus: Release 9.2.0.8.0 - Production on Sun Sep 26 00:40:18 2010

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

Shut down and start up the database.

SQL> startup force;
ORACLE instance started.

Total System Global Area 320300808 bytes
Fixed Size 734984 bytes
Variable Size 285212672 bytes
Database Buffers 33554432 bytes
Redo Buffers 798720 bytes
Database mounted.
Database opened.
SQL>

Adding client to server's list failed, CORBA error: IDL:omg.org/CORBA/COMM_FAILURE:1.0

Problem Description
Whenever you log in to the GNOME desktop on your Linux machine, several pop-ups appear. If you click on Details you will notice the same message in all of the windows.

Adding client to server's list failed, CORBA error: IDL:omg.org/CORBA/COMM_FAILURE:1.0
Adding client to server's list failed, CORBA error: IDL:omg.org/CORBA/COMM_FAILURE:1.0

Problem Investigation
If you look at the OS log in /var/log/messages you will see the following entries.

Sep 23 12:34:44 DC-DB-01 gconfd (oracle-23060): Failed to get lock for daemon, exiting: Directory /tmp/gconfd-oracle has a problem, gconfd can't use it
Sep 23 12:34:44 DC-DB-01 gconfd (oracle-23062): starting (version 2.14.0), pid 23062 user 'oracle'
Sep 23 12:34:44 DC-DB-01 gconfd (oracle-23062): Bad permissions 777 on directory /tmp/gconfd-oracle
Sep 23 12:34:44 DC-DB-01 gconfd (oracle-23062): Failed to get lock for daemon, exiting: Directory /tmp/gconfd-oracle has a problem, gconfd can't use it
Sep 23 12:34:44 DC-DB-01 gconfd (oracle-23064): starting (version 2.14.0), pid 23064 user 'oracle'
Sep 23 12:34:44 DC-DB-01 gconfd (oracle-23064): Bad permissions 777 on directory /tmp/gconfd-oracle
Sep 23 12:34:44 DC-DB-01 gconfd (oracle-23064): Failed to get lock for daemon, exiting: Directory /tmp/gconfd-oracle has a problem, gconfd can't use it
Sep 23 12:35:26 DC-DB-01 pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found
Sep 23 12:35:26 DC-DB-01 last message repeated 4 times
Sep 23 12:35:26 DC-DB-01 gconfd (oracle-23229): starting (version 2.14.0), pid 23229 user 'oracle'
Sep 23 12:35:26 DC-DB-01 gconfd (oracle-23228): starting (version 2.14.0), pid 23228 user 'oracle'

Cause of the Problem
From the OS log messages, we see the problem is caused by wrong permissions on the directory /tmp/gconfd-oracle. gconfd needs to lock the directory /tmp/gconfd-oracle exclusively, but because of the 777 permissions it refuses to use the directory and cannot get the lock.

Solution of the Problem
The solution is to set 700 permissions and correct the ownership on the directory /tmp/gconfd-$USER.
If the user is root then just do the following,
# chmod 700 /tmp/gconfd-root/
# chown -R root:root /tmp/gconfd-root/
Now log out and log in again.
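To check the directory before and after the fix, you can inspect its mode and owner. The following Python sketch is only an illustration of that check. It assumes, following the fix above, that the directory should have mode 700 and belong to the logged-in user; the function name is hypothetical.

```python
import os
import stat


def gconfd_dir_problems(path: str, expected_uid: int) -> list:
    """Return a list of permission or ownership problems found on path."""
    problems = []
    info = os.stat(path)
    mode = stat.S_IMODE(info.st_mode)  # permission bits only, e.g. 0o777
    if mode != 0o700:
        problems.append(f"mode is {mode:o}, expected 700")
    if info.st_uid != expected_uid:
        problems.append(f"owner uid is {info.st_uid}, expected {expected_uid}")
    return problems
```

For the oracle user you could call gconfd_dir_problems("/tmp/gconfd-oracle", os.getuid()) while logged in as oracle. An empty list means the mode and owner are as expected.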

You can also try removing /tmp/orbit- and then restarting the X Window System.
# rm -rf /tmp/orbit-root