Thursday, September 18, 2008

How we can minimize our database recovery time greatly

Now a days storage space is no matter in terms of data availability. Everybody always wants or expects the least downtime if any problem happens in the database. Suppose one of my datafile got corrupted. How quickly we can recover data. An efficient backup strategy should always satisfy least downtime in case of any possible problems. In this topic I will give an idea about how we can minimize our recovery time.

We know with RMAN we can take two types of backup. 1)Backupset and 2)Image copy.
By default whenever we give backup database command then RMAN create backupset. We can also take image copy of backup by giving command as BACKUP AS COPY DATABASE. If flash recovery area is enabled then those backups reside in flash recovery area and maintained there based on retention policy of RMAN setting.

Let's now think about the scenario. We have terabytes size of database and one datafile of 100G is corrupted. Now if we would took our backup as backupset then our approach would be:

1)Take the tablespace containing affected datafile offline.
2)Restore the datafile from backupset.
3)Recover the datafile.
4)Make the tablespace online.

Between these four steps step 2 consumes most of the time. If the backupset contains in slow media then restore operation can take more than a day if datafile size is near 100G.

We can minimize our recovery time from days to seconds easily. If we use image copy (bit by bit copy or OS copy of datafile) can we can acheive our goal.

To take an image copy backup of database through RMAN just issue,
RMAN>run {
backup as copy
database;
}

With this command RMAN will create the copies of the datafiles in the Flash Recovery Area with an Oracle-generated name such as o1_mf_users_2rqnthy_.dbf for users tablespace.

Suppose now our users tablespace datafile got corrupted. Then our procedure will be following which will minimize our recovery time greatly.

1)Just get a list of affected file_id by querying,
SQL>file_id from users tablespace by,
select file_id, file_name
from dba_data_files
where tablespace_name = 'USERS';


2)Connect to RMAN and take the affected tablespace offline.
$rman target /
RMAN>sql 'alter tablespace users offline';


3)Switch the affected datafile to the copy in the FRA. We get the file_id of affected datafile in step 1. Suppose we get file_id 4 is corrupted. Then our RMAN command will just point to the image copy of file_id 4 in the flash recovery area.
Which is,
RMAN> switch datafile 4 to copy;

4)Recover the datafile.
RMAN> recover datafile 4;

5)Make tablespace online.
RMAN> sql 'alter tablespace users online';

This way of recovering a datafile just saves restoring time as we did in case of backup of backupsets.

After making tablespace online if you check the location of file_id 4 you will see location is pointed to flash recovery area.
You can check it by,
SQ>select name from v$datafile where file# = 4;

Recovery area may contain slow disk storage. If you want to get back your datafile as of original location then just do the following task.
1. Make an image copy of the datafile at the original location.
RMAN> backup as copy datafile 4 format '/oradata/PROD/users01.dbf';

2. Take the tablespace offline.
RMAN> sql 'alter tablespace users offline';

3. Switch the datafile to the "copy" (Here "copy" points to the original location).
RMAN> switch datafile 4 to copy;

4. Recover the tablespace.
RMAN> recover datafile 4;
or RMAN>recover tablespace users;

5. Place the tablespace online.
RMAN> sql 'alter tablespace users online';

In above example I just shown an example of using flash recovery area and RMAN. However if you don't use RMAN and or flash recovery area then still you can achieve the same goal.

1 comment:

  1. BLOCKRECOVERY with 10g R2 and onwards will be faster.

    Roger N.

    ReplyDelete