The following information is important about Physical Data Guard Redo Apply performance:
11g Media Recovery performance improvements include:
===========================================
•More parallelism by default
•More efficient asynchronous redo read, parse, and apply
•Fewer synchronization points in the parallel apply algorithm
•The media recovery checkpoint at a redo log boundary no longer blocks the apply of the next log
In 11g, when tuning redo apply consider following:
=====================================
•By default recovery parallelism = CPU Count-1. Do not use any other values.
•Keep PARALLEL_EXECUTION_MESSAGE_SIZE >= 8192
•Keep DB_CACHE_SIZE >= Primary value
•Keep DB_BLOCK_CHECKING = FALSE (if you have to)
•System Resources Needs to be assessed
•Query what MRP process is waiting
select a.event, a.wait_time, a.seconds_in_wait from gv$session_wait a, gv$session b where a.sid=b.sid and b.sid=(select SID from v$session where PADDR=(select PADDR from v$bgprocess where NAME='MRP0'))
When tuning redo transport service, consider following:
==============================================
1 - Tune LOG_ARCHIVE_MAX_PROCESSES parameter on the primary.
•Specifies the parallelism of redo transport
•Default value is 2 in 10g, 4 in 11g
•Increase if there is high redo generation rate and/or multiple standbys
•Must be increased up to 30 in some cases.
•Significantly increases redo transport rate.
2 - Consider using Redo Transport Compression:
•In 11.2.0.2 redo transport compression can be always on
•Use if network bandwidth is insufficient
•and CPU power is available
Also consider:
===========
3 - Configuring TCP Send / Receive Buffer Sizes (RECV_BUF_SIZE / SEND_BUF_SIZE)
4 - Increasing SDU Size
5 - Setting TCP.NODELAY to YES
Problem: Recovery service has stopped for a while and there has been a gap between primary and standby side. After recovery process was started again, standby side is not able to catch primary side because of low log applying performance. Disk I/O and memory utilization on standby server are nearly 100%.
Solution:
1 – Rebooting the standby server reduced memory utilization a little.
2 – ALTER DATABASE RECOVER MANAGED STANDBY DATABASE PARALLEL 8 DISCONNECT FROM SESSION;
In general, using the parallel recovery
option is most effective at reducing recovery time when several
datafiles on several different disks are being recovered concurrently.
The performance improvement from the parallel recovery option is also
dependent upon whether the operating system supports asynchronous I/O.
If asynchronous I/O is not supported, the parallel recovery option can
dramatically reduce recovery time. If asynchronous I/O is supported, the
recovery time may be only slightly reduced by using parallel recovery.
3 – SQL>alter system Set PARALLEL_EXECUTION_MESSAGE_SIZE = 4096 scope = spfile;
Set PARALLEL_EXECUTION_MESSAGE_SIZE = 4096
When using parallel media recovery or
parallel standby recovery,
increasing the PARALLEL_EXECUTION_MESSAGE_SIZE database parameter to 4K (4096) can
improve parallel recovery by as much as 20 percent.
Set this parameter
on both the primary and standby databases in preparation for switchover
operations. Increasing this parameter requires more memory from the
shared pool by each parallel execution slave process.
4 – Kernel parameters that changed in order to reduce file system cache size.
dbc_max_pct 10 10 Immed
dbc_min_pct 3 3 Immed
5 – For secure path (HP) load balancing, SQL Shortest Queue Length is chosen.
autopath set -l 6005-08B4-0007-4D25-0000-D000-025F-0000 -b SQL