
Sunday, June 19, 2016

To find segment name from Corrupt Block

Message from Alert Log :


Fri Apr 01 13:56:27 2016
Errors in file /u01/app/oracle/diag/rdbms/prod/PROD/trace/PROD_pr04_4915260.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/u02/flash_recovery_area/PROD/PROD/onlinelog/o1_mf_1_chsdf09j_.log'
Errors in file /u01/app/oracle/diag/rdbms/prod/PROD/trace/PROD_pr04_4915260.trc (incident=4243):
ORA-00600: internal error code, arguments: [kdBlkCheckError], [80], [1288987], [6264], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/PROD/PROD/incident/incdir_4243/PROD_pr04_4915260_i4243.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Slave exiting with ORA-10562 exception
Errors in file /u01/app/oracle/diag/rdbms/prod/PROD/trace/PROD_pr04_4915260.trc:
ORA-10562: Error occurred while applying redo to data block (file# 80, block# 1288987)
ORA-10564: tablespace REGISTRATION
ORA-01110: data file 80: '/u02/flash_recovery_area/PROD/ORADATA/hr01.dbf'
ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 89532
ORA-00600: internal error code, arguments: [kdBlkCheckError], [80], [1288987], [6264], [], [], [], [], [], [], [], []
............................


In the alert log we found that datafile 80 (hr01.dbf) has a corrupt block; the corrupted block is "1288987".

Then run the queries below to find the segment name:

1) Query-1: Find segment name with owner

SQL> select * from dba_extents where file_id = 80 and 1288987 between block_id and block_id + blocks -1 ; 


2) Query-2: Find object details (use the SEGMENT_NAME returned by Query-1; here it was MLOG$_PATIENT)

SQL> select * from dba_objects where object_name='MLOG$_PATIENT' and object_type='TABLE';
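As a shortcut, the trace in the alert log already names the data object (the ORA-10561 line reports data object# 89532), so the owner and object can also be looked up directly from DBA_OBJECTS; the ID below is from this example, substitute your own:

```sql
-- Look up the corrupt object by the data object# printed in ORA-10561.
-- 89532 is taken from the alert-log example above; replace it with yours.
SELECT owner, object_name, object_type
  FROM dba_objects
 WHERE data_object_id = 89532;
```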





Fix : DBV-00008: USERID must be specified for OSM files

One of my friends got the below error while running the dbv command against a datafile stored in an ASM disk group.

Err:   DBV-00008: USERID must be specified for OSM files

Here is sample error:

$ dbv file='+DATA/PROD/datafile/sample01.dbf'

DBVERIFY: Release 11.2.0.4.0 - Production on Fri Apr 1 14:02:30 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


DBV-00008: USERID must be specified for OSM files


So, supply the username and password via the USERID parameter, and it will work fine.

Sample Example :

$ dbv userid=registration/reg40 file=+DATA/PROD/datafile/sample01.dbf 

DBVERIFY: Release 11.2.0.4.0 - Production on Fri Apr 1 14:05:27 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = +DATA/PROD/datafile/sample01.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 1365512
Total Pages Processed (Data) : 836672
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 84780
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 360647
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 83413
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 0 (0.0)




Fix block corruption via RMAN - Pains & Gains of NOLOGGING mode in Oracle database

Fix block corruptions via RMAN using Oracle 11gR2
Pains & Gains of NOLOGGING mode in Oracle database

a) About NOLOGGING
b) FORCE LOGGING Options
c) Gains of NO LOGGING
d) Detection of Nologging Operations On the Primary and Standby Databases
e) A workaround to fix block corruption via RMAN

f) Best practice to avoid block corruptions
------------------------------------------------------------------------



a) About NOLOGGING:


The FORCE LOGGING option is the safest method to ensure that all changes made in the database are captured in the redo logs and available for recovery. FORCE LOGGING is a later addition to the family of logging attributes.

Before FORCE LOGGING existed, Oracle provided only the LOGGING and NOLOGGING options. These options have higher precedence at the schema-object level than at the tablespace level; therefore, it was possible to override a tablespace-level LOGGING setting with a NOLOGGING setting on a schema object.

The FORCE LOGGING option can be set at the database level or the tablespace level. The precedence is from database to tablespace. If a tablespace is created or altered to have FORCE LOGGING enabled, any change in that tablespace will go into the redo log and be usable for recovery.

Similarly, if a database is created or altered to have the FORCE LOGGING enabled, any change across the database, with exception of temporary segments and temporary tablespace, will be available in redo logs for recovery. The FORCE LOGGING option can be set at database creation time or later using the alter database command.

To enable FORCE LOGGING after the database is created, use the following command:

ALTER DATABASE FORCE LOGGING;

b) FORCE LOGGING Options:

1) Database level:
  -- Enable
SQL> ALTER DATABASE FORCE LOGGING;
  -- Disable
SQL> ALTER DATABASE NO FORCE LOGGING;


2) Tablespace level:
-- Disable
SQL> ALTER TABLESPACE <tablespace_name> NO FORCE LOGGING;
-- Enable
SQL> ALTER TABLESPACE <tablespace name> FORCE LOGGING;

3) Table level:
-- Disable
SQL> ALTER TABLE <table_name> NOLOGGING;
-- Enable
SQL> ALTER TABLE <table_name> LOGGING;
Note:

Temporary tablespaces and temporary segments are unaffected by FORCE LOGGING mode because these objects do not generate any redo. Undo tablespaces are always fully logged, so they cannot be explicitly placed into FORCE LOGGING mode. Oracle will generate an error if an attempt is made to put a temporary tablespace or undo tablespace into FORCE LOGGING mode.

The FORCE_LOGGING column of v$database view can be queried to verify that the database is in FORCE LOGGING mode. Similarly, the FORCE_LOGGING column of dba_tablespaces view provides the same logging information for each tablespace.

select force_logging from v$database;
select force_logging from dba_tablespaces;



Warning !!!

Putting a database in FORCE LOGGING mode will have some performance impact.


c) Gains of NO LOGGING:

Nologging operations indicate that the database operation is not logged in the online redo log file. Even though a small invalidation redo record is still written to the online redo log file, nologging operations skip the redo generation of the corresponding DML data.  Nologging can be extremely beneficial for the following reasons:
  • data written to the redo is minimized dramatically
  • time to insert into a large table or index or LOB can be reduced dramatically
  • performance improves for parallel creation of large tables or indices
However, NOLOGGING is intended for configurations in which media recovery or the recovery of the corresponding object is not important. Thus, if the disk or tape or storage media fails, you will not be able to recover your changes from the redo because the changes were never logged. 

d) Detection of Nologging Operations On the Primary and Standby Databases

On the primary database, you can monitor for the most recent nologging operation that occurred in the database by issuing the following query:

         SELECT NAME, UNRECOVERABLE_CHANGE#,               
         TO_CHAR (UNRECOVERABLE_TIME,'DD-MON-YYYY HH:MI:SS')
         FROM V$DATAFILE;


The query above shows when the most recent nologging operation occurred and when the invalidation redo was written. Once Redo Apply (media recovery) processes the invalidation redo, it marks all the corresponding data blocks corrupt. You will encounter the corrupted blocks on the physical standby database when you query any data that references them, receiving the following errors:

                ORA-01578: ORACLE data block corrupted (file # 3, block # 514)
                ORA-01110: data file 3: '/oracle/oradata/PROD/users.dbf'
                ORA-26040: Data block was loaded using the NOLOGGING option


You can proactively catch some of these corrupted blocks on the Redo Apply (media recovery) instance by running DBVERIFY against the data files.

Example :
           
$ cd /oracle/oradata/PROD
$ dbv file=users01.dbf

DBVERIFY: Release 11.2.0.4.0 - Production on Tue Aug 25 16:41:53 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oracle/oradata/PROD/users01.dbf

DBV-00201: Block, DBA 17162242, marked corrupt for invalid redo application
DBV-00201: Block, DBA 17162244, marked corrupt for invalid redo application
...
...
DBV-00201: Block, DBA 17162493, marked corrupt for invalid redo application


DBVERIFY - Verification complete

Total Pages Examined         : 407680
Total Pages Processed (Data) : 221830
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 1257
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 175354
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 9239
Total Pages Marked Corrupt   : 111
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 2355429844 (29.2355429844)
 

SQL Apply ignores the invalidation redo since it cannot convert it to any reasonable SQL, so the logical standby will not raise any immediate errors. If future transactions reference the missing data, the apply slave will report an ORA-01403 in the alert.log.

For example, an UPDATE statement will fail on the logical standby if it references 'nologged' rows that do not exist on the logical standby database.
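On a logical standby, recent apply errors (including the ORA-01403 case above) can be reviewed through the DBA_LOGSTDBY_EVENTS view; a minimal sketch:

```sql
-- List the most recent SQL Apply events/errors on the logical standby.
SELECT event_time, status, event
  FROM dba_logstdby_events
 ORDER BY event_time DESC;
```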

-- Repair of Nologged Changes on the Physical and Logical Standby Databases:

After a nologged operation on the primary is detected, create a backup immediately if you want to be able to recover past this operation in the future. Additional steps are required if you have an existing physical or logical standby database; these are crucial to preserve the data integrity of your standby databases.
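A minimal RMAN backup taken right after the nologged load might look like the following (a sketch; adjust channels, format, and scope to your environment):

```sql
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;
```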

For a physical standby database, Redo Apply will process the invalidation redo and mark the corresponding data blocks corrupt. Follow these steps to reinstantiate the relevant datafiles:

    1. stop Redo Apply (recover managed standby database cancel)
    2. offline corresponding datafile(s) (alter database datafile <NAME> offline drop;)
    3. start Redo Apply (recover managed standby database disconnect)
    4. copy the appropriate backup datafiles over from the primary database (e.g. use RMAN to backup datafiles and copy them)
    5. stop Redo Apply (recover managed standby database cancel)
    6. online corresponding data files (alter database datafile <NAME> online;)
    7. start Redo Apply (recover managed standby database disconnect)
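The seven steps above can be sketched as follows (run on the physical standby; the datafile path is a placeholder):

```sql
-- 1. Stop Redo Apply
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
-- 2. Offline the affected datafile (placeholder path)
ALTER DATABASE DATAFILE '/oracle/oradata/STBY/users01.dbf' OFFLINE DROP;
-- 3. Restart Redo Apply while the file is refreshed
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
-- 4. Copy a fresh backup of the datafile over from the primary (outside SQL)
-- 5. Stop Redo Apply again
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
-- 6. Online the datafile
ALTER DATABASE DATAFILE '/oracle/oradata/STBY/users01.dbf' ONLINE;
-- 7. Restart Redo Apply
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
```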

======================
Solution with a workaround:
======================

As described in the section above, this procedure can be used only if the RMAN backup did not fail while reading the corrupt block. To determine this, run a backup of the datafile containing the corrupt block:

RMAN> backup check logical datafile 7 format '/u03/backup/%U' tag 'CORRUPT_BLK_FILE_BKP';

Starting backup at 24-AUG-15
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/oracle/oradata/PROD/demo01.dbf
channel ORA_DISK_1: starting piece 1 at 24-AUG-15
channel ORA_DISK_1: finished piece 1 at 24-AUG-15
piece handle=/u03/backup/1jnbhl5c_1_1 tag=CORRUPT_BLK_FILE_BKP comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 24-AUG-15


Ensure that the location where the backup piece will be created has sufficient space; you can change this destination using the FORMAT clause. Also, even though the backup appears successful, cross-check with the query below that the corrupt block is not contained in the backup:

SQL> select BP.HANDLE, BP.COMPLETION_TIME, BC.FILE#, BC.BLOCK#, BC.BLOCKS, BC.MARKED_CORRUPT, BC.CORRUPTION_TYPE
    from V$BACKUP_PIECE BP, V$BACKUP_CORRUPTION BC
    where BP.SET_COUNT = BC.SET_COUNT and
          BP.SET_STAMP = BC.SET_STAMP and
          BP.TAG = 'CORRUPT_BLK_FILE_BKP';


no rows selected.

If the above query returns rows for the corrupt block, this procedure cannot be used. In this case it returned no rows, so we can be sure that RMAN skipped the corrupt block thanks to its unused-block-optimization algorithm. If the datafile is now restored from this backup, RMAN will format and restore an empty copy of the corrupt block, which can then be used for RMAN block recovery as below.

1. Restore the datafile to alternate location:

RMAN> run {
2> set newname for datafile 7 to '/oracle/oradata/PROD/demo01_RESTORED.dbf';
3> restore datafile 7 from tag 'CORRUPT_BLK_FILE_BKP';
4> }

executing command: SET NEWNAME

Starting restore at 24-AUG-15
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00007 to /oracle/oradata/PROD/demo01_RESTORED.dbf
channel ORA_DISK_1: reading from backup piece /u03/backup/1jnbhl5c_1_1
channel ORA_DISK_1: piece handle=/u03/backup/1jnbhl5c_1_1 tag=CORRUPT_BLK_FILE_BKP
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished restore at 24-AUG-15


2. Run DBV on the restored datafile to verify it is free from corruption:

$ dbv file=/oracle/oradata/PROD/demo01_RESTORED.dbf blocksize=8192


DBVERIFY: Release 11.2.0.4.0 - Production on Tue Aug 25 17:01:13 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oracle/oradata/PROD/demo01_RESTORED.dbf

DBVERIFY - Verification complete

Total Pages Examined         : 12800
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 12799
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 775154 (0.775154)


3. Run BLOCKRECOVER command as below to repair the corrupt block.

The corrupt block will be replaced by the empty, formatted block from the restored datafile:

RMAN> blockrecover datafile 7 block 150 FROM DATAFILECOPY;

Starting recover at 24-AUG-15
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=22 device type=DISK

channel ORA_DISK_1: restoring block(s) from datafile copy /oracle/oradata/PROD/demo01_RESTORED.dbf

starting media recovery
media recovery complete, elapsed time: 00:00:01


Finished recover at 24-AUG-15

4. Run DBV on the original file to confirm that it is now free from corruption:


$ dbv file=/oracle/oradata/PROD/demo01.dbf blocksize=8192

DBVERIFY: Release 11.2.0.4.0 - Production on Tue Aug 25 17:11:33 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oracle/oradata/PROD/demo01.dbf

DBVERIFY - Verification complete

Total Pages Examined         : 12800
Total Pages Processed (Data) : 356
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 152
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 12292
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 775154 (0.775154)


5. Additionally, you can run VALIDATE in RMAN to verify further:

RMAN> backup validate check logical datafile 7;

Starting backup at 24-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=22 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/oracle/oradata/PROD/demo01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
7    OK     0              12292        12801           775154
  File Name: /oracle/oradata/PROD/demo01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              356
  Index      0              0
  Other      0              152

Finished backup at 24-AUG-15


f) Best practices to avoid block corruptions:

Regular and proactive checking of the database for corruptions:

1. Run the DBVerify utility against the datafiles at periodic intervals to check for any physical corruption.

e.g.,
dbv file=system01.dbf blocksize=8192

2. Run the object-level ANALYZE command at periodic intervals. This checks for logical inconsistencies and can even detect physical corruption. Note that this command locks the object being analyzed, so it should be run during off-peak hours only; the ONLINE option of ANALYZE relaxes this restriction to some extent.

e.g.,
SQL> Analyze table <user>.<table_name> validate structure cascade [online];
SQL> Analyze index <user>.<index_name/cluster_name> validate structure;
     
For a partitioned table, the utlvalid.sql script must be run first to create the invalid_rows table; then run the analyze command below instead:

SQL> Analyze table <user>.<table_name> partition <partition_name> validate structure cascade into invalid_rows;

3. Always take a full database export backup (use EXPDP/EXP).

e.g.,
expdp directory=dir_name dumpfile=dump_name.dmp logfile=log_name.log full=y

Note : You can add many more parameters as required. Use expdp -help for details.


4. Ensure the data is free of corruption when doing a bulk load. After the load, perform the normal validations (select/analyze/export) to detect any corruption introduced while loading.

5. Use RMAN to check the existence of physical and logical corruption.

e.g., to validate the complete database:

RMAN> BACKUP CHECK LOGICAL VALIDATE DATABASE;

Note: See the RMAN workaround section above for more about RMAN validation.
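Corruptions found by BACKUP VALIDATE are recorded in the control file and can be queried afterwards:

```sql
-- Blocks flagged by the last RMAN validate/backup run.
SELECT file#, block#, blocks, corruption_type
  FROM v$database_block_corruption;
```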

6. Use the DBMS_REPAIR package to verify corruption in an object; the package can also mark blocks as soft corrupt.

DBMS_REPAIR.CHECK_OBJECT: the CHECK_OBJECT procedure checks the specified object and populates the repair table with information about corruptions and repair directives. Validation consists of block-checking all blocks in the object; blocks previously marked corrupt are skipped.
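A minimal sketch of checking one table with DBMS_REPAIR (the schema and table names are placeholders):

```sql
-- Create the repair table once (repair table names must begin with REPAIR_).
BEGIN
  DBMS_REPAIR.ADMIN_TABLES(
    table_name => 'REPAIR_TABLE',
    table_type => DBMS_REPAIR.REPAIR_TABLE,
    action     => DBMS_REPAIR.CREATE_ACTION);
END;
/

-- Check a table and report how many corrupt blocks were found.
SET SERVEROUTPUT ON
DECLARE
  l_corrupt_count PLS_INTEGER := 0;
BEGIN
  DBMS_REPAIR.CHECK_OBJECT(
    schema_name       => 'SCOTT',        -- placeholder owner
    object_name       => 'EMP',          -- placeholder table
    repair_table_name => 'REPAIR_TABLE',
    corrupt_count     => l_corrupt_count);
  DBMS_OUTPUT.PUT_LINE('Corrupt blocks found: ' || l_corrupt_count);
END;
/
```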

7. Run Memory/Hardware diagnostics periodically.

8. Set block checking parameters.

DB_BLOCK_CHECKING = TRUE (FULL from 10.2 onwards)
DB_BLOCK_CHECKSUM = TRUE (FULL from 10.2 onwards)
_DB_BLOCK_CHECK_FOR_DEBUG = TRUE

Note : Enabling these parameters has a performance impact. Do not set them in production until load testing confirms the overhead is acceptable. See the Oracle documentation for more details.

11g specific:

* DB_ULTRA_SAFE = { OFF | DATA_ONLY | DATA_AND_INDEX }
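For reference, DB_BLOCK_CHECKING and DB_BLOCK_CHECKSUM can be changed dynamically (DB_ULTRA_SAFE requires an instance restart); a sketch, to be load-tested first as noted above:

```sql
ALTER SYSTEM SET DB_BLOCK_CHECKSUM = FULL SCOPE=BOTH;
ALTER SYSTEM SET DB_BLOCK_CHECKING = FULL SCOPE=BOTH;
```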

See "Best Practices for Avoiding and Detecting Corruption" (Doc ID 428570.1) on My Oracle Support.

Fix : ORA-00399: corrupt change description in redo log : a workaround


In one of my test instances I found the below issue and fixed it with the given workaround. You can try this.

SQL> startup;
ORACLE instance started.

Total System Global Area 1.7103E+10 bytes
Fixed Size                  2238616 bytes
Variable Size            2415921000 bytes
Database Buffers         1.4663E+10 bytes
Redo Buffers               21716992 bytes
Database mounted.
ORA-00399: corrupt change description in redo log
ORA-00353: log corruption near block 113746 change 123790201307 time 08/07/2015 10:23:06
ORA-00312: online log 5 thread 1: '/u10/EHISHYD/ORADATA/EHISHYD/redo05.log'


SQL>


SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open


SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u10/EHISHYD/ORADATA/EHISHYD/system01.dbf'

Reason and Prerequisites:

In general, the above errors suggest that the redo log is corrupt. A repair for these types of inconsistencies is not supported. In almost all cases, hardware problems trigger these errors. If the process of recovering a redo log terminates with one of the above errors, you can no longer apply the subsequent redo logs, for the sake of transaction consistency. In the worst case, you can only perform a partial recovery up to the redo log in question; all subsequent changes are then lost.

If you receive one of the above errors when archiving an online redo log, you can usually solve this problem without any data loss.  In this case, additional errors such as ORA-00255 or ORA-16038 usually occur.

Note 540463 contains information about how to proactively check the redo log consistency.
Note 1016173 describes the redo log validation using RMAN.


My workaround:

SQL> select name,open_mode from v$database;

NAME      OPEN_MODE
--------- --------------------
PROD   MOUNTED

SQL> col FIRST_CHANGE# format 99999999999999;
SQL>
SQL> select group#,status,archived,sequence#,first_change# from v$log;

    GROUP# STATUS           ARC  SEQUENCE#   FIRST_CHANGE#
---------- ---------------- --- ---------- ---------------
         5 CURRENT          NO       29551    123790176390
         6 ACTIVE           NO       29550    123790099280
         7 INACTIVE         YES          1    120240973534
         8 UNUSED           YES          0               0

SQL>


Proceed as below, according to the "status" column:

1) If 'status' = INACTIVE you are lucky, you can clear the group:

SQL> alter database clear [unarchived] logfile group <n>;
(include UNARCHIVED when the 'archived' column shows 'NO').
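For example, if the corrupt group had shown status INACTIVE with 'archived' = NO, it could be cleared with (group number is illustrative):

```sql
SQL> alter database clear unarchived logfile group 5;
```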


2) If 'status' = ACTIVE, try

SQL> alter system checkpoint;

If this command executes successfully, the logfile becomes INACTIVE and you can clear it as above; if not, you have to handle the logfile as CURRENT.


3) If 'status' = CURRENT you must do an incomplete recovery up to the latest usable SCN:

restore database until scn <first_change#_of_current_logfile>;
recover database until scn <first_change#_of_current_logfile>;
alter database open resetlogs;

e.g.,

RMAN> restore database until scn 123790176390;

RMAN> recover database until scn 123790176390;

SQL> alter database open resetlogs;