ORACLE DATABASE ADMINISTRATOR: 2012

Tuesday, December 11, 2012

Recover archivelog gap between primary to standby using RMAN incremental backups

A Physical Standby database synchs with Primary by continuous apply of archive logs from a Primary Database. In case of an archive log gone missing or corrupt, We have to bring back the standby to sync with the primary.

When the logs missing or corrupt is less in number (say below 15), we can ship the logs which were missing in the standby site from the primary site (scp/sftp/ftp) and then we can register the log file in the standby so that the gap can be resolved.

Find the archives which are missing by issueing the following command.

SQL> select * from v$archive_gap;

This would give the gap sequences. Or you can use the v$managed_standby view to find where the log apply stuck.

SQL> select sequence#,process,status from v$managed_standby;

Here u can see status as wait for log for say sequence# 100 but your primary would've proceeded to sequence# 110

At primary

SQL> select max(sequence#) from v$archived_log; ---> This would show you 110

Copy the logs to the standby site from the primary site

$ scp log_file_name_n.arc oracle@standby:/log/file/location/log_file_name_n.arc

At standby site

SQL> alter database register logfile '/log/file/location/log_file_name_n.arc';

logfile registered

Do the log file registration at the standby site until all the missing log files are registered. Now apply would take place and your standby will become sync with the primary.

This is easy process if you have missing or corrupt logs in lesser number. But when the difference is huge (say around 500 logs) this method is very time consuming and not a proper approach.

Else you have to rebuild the standby database from scratch.

As an enhancement from 10g, an incremental backup created with BACKUP INCREMENTAL... FROM SCN can be used to refresh the standby database with changes at the primary database since the last SCN at Standby and then managed recovery can resume i.e. Compensate for the missing archive logs.

Let us see the steps involved.

Step 1: On the primary:

SQL> select current_scn from v$database;

CURRENT_SCN

-----------

144710998

On the standby:

SQL> select current_scn from v$database;

CURRENT_SCN

-----------

130158742

Clearly there is a difference. But this by itself does not indicate a problem; since the standby is expected to lag behind the primary (this is an asynchronous non-real time apply setup). The real question is how much it is lagging in the terms of wall clock.

To know that use the scn_to_timestamp function to translate the SCN to a timestamp:

SQL> select scn_to_timestamp(144710998) from dual;

SCN_TO_TIMESTAMP(1447102)

-------------------------------

18-AUG-11 08.54.28.000000000 AM

Run the same query to know the timestamp associated with the SCN of the standby database as well

SQL> select scn_to_timestamp(130158742) from dual;

SCN_TO_TIMESTAMP(1301571)

-------------------------------

13-AUG-11 07.19.27.000000000 PM

Note: Run it on the primary database, since it will fail in the standby in a mounted mode

This shows that the standby is four and half days lagging!

Step 2: [Standby] Stop the managed standby apply process:

SQL> alter database recover managed standby database cancel;

Database altered.

Step 3: [Standby] Shutdown the standby database

SQL> shut immediate

Step 4: [Primary] On the primary, take an incremental backup from the SCN number where the standby has been stuck:

RMAN> run {

2> allocate channel c1 type disk format '/u01/backup/%U.bkp';

3> backup incremental from scn 130158740 database;

4> }

Step 5: [Primary] On the primary, create a new standby controlfile:

SQL> alter database create standby controlfile as '/u01/backup/for_standby.ctl';

Database altered.

Step 6: [Primary] Copy these files to standby host:

oracle@dba1 /u01/backup]$ scp * oracle@dba2:/u01/backup

Step 7: [Standby] Bring up the instance in nomount mode:

SQL> startup nomount

Step 8: [Standby] Check the location of the controlfile:

SQL> show parameter control_files

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

control_files string /u01/oradata/standby_cntfile.ctl

Step 9: [Standby] Replace the controlfile with the one you just created in primary.

$ cp /u01/backup/for_standby.ctl /u01/oradata/standby_cntfile.ctl

Step 10: [Standby] Mount the standby database:

SQL> alter database mount standby database;

Step 11: [Standby] Connect to RMAN. RMAN does not know about these files yet; so you must let it know – by a process called cataloging. Catalog these files:

$ rman target=/

RMAN> catalog start with '/u01/backup';

Step 12: Recover these files:

RMAN> recover database;

Step 13: After some time, the recovery fails with the message:

archive log filename=/u01/oradata/1_18108_697108460.dbf thread=1 sequence=18109

ORA-00310: archived log contains sequence 18108; sequence 18109 required

This happens because we have come to the last of the archived logs. The expected archived log with sequence# 18108 has not been generated yet.

Step 14: At this point exit RMAN and start managed recovery process:

SQL> alter database recover managed standby database disconnect from session;

Database altered.

Step 15: Check the SCN’s in primary and standby:

[Standby] SQL> select current_scn from v$database;

CURRENT_SCN

-----------

144747125

[Primary] SQL> select current_scn from v$database;

CURRENT_SCN

-----------

144747111

Now they are very close to each other. The standby has now caught up.

Monday, November 19, 2012

crsctl - Cluster Ready Service Control

To Get help
$crsctl -h

To Get Active Cluster Version

$crsctl query crs activeversion

To Get cluster version of given node

$crsctl query crs softwareversion [node_name]

Ex:$crsctl query crs softwareversion rac1

#crsctl start crs

#crsctl stop crs

(or)

#/etc/init.d/init.crs start

#/etc/init.d/init.crs stop

#crsctl enable crs

#crsctl disable crs

(or)

#/etc/init.d/init.crs enable

#/etc/init.d/init.crs disable

To get cluster stack health status

$crsctl check crs

To get the viability of CSS across nodes

$crsctl check cluster [-node node_name] -- 11gR2 RACcommand

#crsctl start cluster -n HostName -- 11g R2

#crsctl stop cluster -n HostName -- 11g R2

#crsctl stop cluster -all -- 11g R2

$ps -ef | grep d.bin

$crsctl check cssd

$crsctl check crsd

$crsctl check evmd

$crsctl check oprocd

$crsctl check ctss

#/etc/init.d/init.cssd stop

#/etc/init.d/init.cssd start

#/etc/rc.d/init.d/init.evmd

#/etc/rc.d/init.d/init.cssd

#/etc/rc.d/init.d/init.crsd

#mv /etc/rc3.d/S96init.cssd /etc/rc3.d/_S96init.cssd -- to stop cssd from autostarting after reboot

#crsctl check css votedisk

#crsctl query css votedisk -- lists the voting disks used by CSS command in 11gR2

#crsctl add css votedisk PATH

#crsctl add css votedisk PATH -force -- if Clusterware is not running

#crsctl delete css votedisk PATH

#crsctl delete css votedisk PATH -force -- if Clusterware is not running

#crsctl set css parameter_name value -- set parameters on OCR

#crsctl set css misscount 100

#crsctl unset css parameter_name -- sets CSS parameter to its default

#crsctl unset css misscount

#crsctl get css parameter_name -- gets the value of a CSS parameter

#crsctl get css disktimeout

#crsctl get css misscount

#crsctl get css reboottime

#crsctl start resources -- starts Clusterware resources

#crsctl start resource ora.DATA.dg
#crsctl stop resources -- stops Clusterware resources

$crsctl status resource

$crsctl status resource -t

$crsctl stat resource -t

#crsctl lsmodules crs -- lists CRS modules that can be used for debugging( need root privilage)

List CRSD Debug Module: AGENT

List CRSD Debug Module: AGFW

List CRSD Debug Module: CLSFRAME

List CRSD Debug Module: CLSVER

List CRSD Debug Module: CLUCLS

List CRSD Debug Module: COMMCRS

List CRSD Debug Module: COMMNS

List CRSD Debug Module: CRSAPP

List CRSD Debug Module: CRSCCL

List CRSD Debug Module: CRSCEVT

List CRSD Debug Module: CRSCOMM

List CRSD Debug Module: CRSD

List CRSD Debug Module: CRSEVT

List CRSD Debug Module: CRSMAIN

List CRSD Debug Module: CRSOCR

List CRSD Debug Module: CRSPE

List CRSD Debug Module: CRSPLACE

List CRSD Debug Module: CRSRES

List CRSD Debug Module: CRSRPT

List CRSD Debug Module: CRSRTI

List CRSD Debug Module: CRSSE

List CRSD Debug Module: CRSSEC

List CRSD Debug Module: CRSTIMER

List CRSD Debug Module: CRSUI

List CRSD Debug Module: CSSCLNT

List CRSD Debug Module: OCRAPI

List CRSD Debug Module: OCRASM

List CRSD Debug Module: OCRCAC

List CRSD Debug Module: OCRCLI

List CRSD Debug Module: OCRMAS

List CRSD Debug Module: OCRMSG

List CRSD Debug Module: OCROSD

List CRSD Debug Module: OCRRAW

List CRSD Debug Module: OCRSRV

List CRSD Debug Module: OCRUTL

List CRSD Debug Module: SuiteTes

List CRSD Debug Module: UiServer

$crsctl lsmodules css -- lists CSS modules that can be used for debugging

The following are the Cluster Synchronization Services modules::

CSSD

COMMCRS

COMMNS

CLSF

SKGFD

$crsctl lsmodules evm -- lists EVM modules that can be used for debugging

The following are the Cluster Synchronization Services modules::

CSSD

COMMCRS

COMMNS

CLSF

SKGFD

$crsctl start has (HAS - High Availability Services)
$crsctl stop has

$crsctl check has

CRS-4638 Oracle High Availability Service is online

OCR Modules -- cannot be listed with crsctl lsmodules command

OCRAPI

OCRCLI

OCRSRV

OCRMAS

OCRMSG

OCRCAC

OCRRAW

OCRUTL

OCROSD

#crsctl debug statedump crs -- dumps state info for crs objects

#crsctl debug statedump css -- dumps state info for css objects

#crsctl debug statedump evm -- dumps state info for evm objects

#crsctl debug log crs [module:level]{,module:level} ...

-- Turns on debugging for CRS

#crsctl debug log crs CRSEVT:5,CRSAPP:5,CRSTIMER:5,CRSRES:5,CRSRTI:1,CRSCOMM:2

#crsctl debug log css [module:level]{,module:level} ...

-- Turns on debugging for CSS

#crsctl debug log css CSSD:1

#crsctl debug log evm [module:level]{,module:level} ...

-- Turns on debugging for EVM

#crsctl debug log evm EVMCOMM:1

#crsctl debug trace crs -- dumps CRS in-memory tracing cache

#crsctl debug trace css -- dumps CSS in-memory tracing cache

#crsctl debug trace evm -- dumps EVM in-memory tracing cache

#crsctl debug log res resource_name:level -- turns on debugging for resources

#crsctl debug log res "ora.lnx04.vip:1"

#crsctl trace all_the_above_commands -- tracing by adding a "trace" argument.

#crsctl trace check css

#crsctl backup -h

#crsctl backup css votedisk

Here is the list of the options for CRSCTL in 11gR2:

crsctl add - add a resource, type or other entity

crsctl backup - back up voting disk for CSS

crsctl check - check a service, resource or other entity

crsctl config - output autostart configuration

crsctl debug - obtain or modify debug state

crsctl delete - delete a resource, type or other entity

crsctl disable - disable autostart

crsctl discover - discover DHCP server

crsctl enable - enable autostart

crsctl get - get an entity value

crsctl getperm - get entity permissions

crsctl lsmodules - list debug modules

crsctl modify - modify a resource, type or other entity

crsctl query - query service state

crsctl pin - Pin the nodes in the nodelist

crsctl relocate - relocate a resource, server or other entity

crsctl replace - replaces the location of voting files

crsctl release - release a DHCP lease

crsctl request - request a DHCP lease

crsctl setperm - set entity permissions

crsctl set - set an entity value

crsctl start - start a resource, server or other entity

crsctl status - get status of a resource or other entity

crsctl stop - stop a resource, server or other entity

crsctl unpin - unpin the nodes in the nodelist

crsctl unset - unset a entity value, restoring its default

How do I identify the voting disk/file location?
#crsctl query css votedisk

How to take backup of voting file/disk?
crsctl backup css votedisk

Initialization parameters in Oracle RAC 11gR2

SPECIAL parameters in RAC:
instance_number
instance_group
thread
cluster_database
cluster_database_instances
cluster_interconnects
remote_listener
local_listener
parallel_instance
max_commit_propagation_delay

Parameters with SAME VALUE across all instances:
active_instance_count
archive_lag_target
compatible
cluster_database
cluster_database_instances
cluster_interconnects
control_files
db_block_size
db_domain
db_files
db_name
db_recovery_file_dest
db_recovery_file_dest_size
db_unique_name
dml_locks                     -- when 0
instance_type                -- rdbms or asm
max_commit_propagation_delay
parallel_max_servers
remote_login_password_file
trace_enabled
undo_management

For example,
*.cluster_database=true
*.cluster_database_instances=2
*.compatible='11.2.0.2'
*.undo_management='AUTO'

Parameters with UNIQUE Values across all instances:
instance_number
instance_name
thread
undo_tablespace/rollback_segments

For example,
INST1.instance_number=1
INST1.instance_name=rac1
INST1.thread=1
INST1.undo_tablespace='UNDOTBS1'

INST2.instance_number=2
INST2.instance_name=rac2
INST2.thread=2
INST2.undo_tablespace='UNDOTBS2'

OCRCHECK Command in 11gR2 RAC

$ ocrcheck -h
Name:
        ocrcheck - Displays health of Oracle Cluster/Local Registry.

Synopsis:
        ocrcheck [-config] [-local]

-config       Displays the configured locations of the Oracle Cluster Registry.
                This can be used with the -local option to display the configured
                location of the Oracle Local Registry
-local        The operation will be performed on the Oracle Local Registry.

Notes:
        A log file will be created in
        $ORACLE_HOME/log/<hostname>/client/ocrcheck_<pid>.log.
        File creation privileges in the above directory are needed
        when running this tool.

$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2736
         Available space (kbytes) :     259384
         ID                       :   33615009
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured

         Cluster registry integrity check succeeded
         Logical corruption check succeeded

$ocrcheck -local
Status of Oracle Local Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2172
         Available space (kbytes) :     259948
         ID                       : 1558172427
         Device/File Name         : /u01/app/11.2.0/grid/cdata/rac3.olr
                                    Device/File integrity check succeeded

         Local registry integrity check succeeded
         Logical corruption check succeeded

$ ocrcheck -config
Oracle Cluster Registry configuration is :
         Device/File Name         :      +DATA

$ ocrcheck -local -config
Oracle Local Registry configuration is :
         Device/File Name         : /u01/app/11.2.0/grid/cdata/rac3.olr

NOTE: Debugging can be controlled through $CRS_HOME/srvm/admin/ocrlog.in

$ ocrcheck -local    ---(need root permission)
PROTL-602: Failed to retrieve data from the local registry
PROCL-26: Error while accessing the physical storage Operating System error [Permission denied] [13]

# ocrcheck -local
Status of Oracle Local Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2172
         Available space (kbytes) :     259948
         ID                       : 1558172427
         Device/File Name         : /u01/app/11.2.0/grid/cdata/rak3.olr
                                    Device/File integrity check succeeded

         Local registry integrity check succeeded
         Logical corruption check succeeded

ORACLE DATABASE ADMINISTRATOR

Search

Tuesday, December 11, 2012

Recover archivelog gap between primary to standby using RMAN incremental backups

Monday, November 19, 2012

crsctl - Cluster Ready Service Control

Initialization parameters in Oracle RAC 11gR2

OCRCHECK Command in 11gR2 RAC

Followers