DB file sequential reads
Possible Causes :
· Use of an unselective index
· Fragmented Indexes
· High I/O on a particular disk or mount point
· Bad application design
· Index read performance can be affected by a slow I/O subsystem and/or a poor database file layout, which results in a higher average wait time
Actions :
· Check indexes on the table to ensure that the right index is being used
· Check the column order of the index against the WHERE clause of the Top SQL statements
· Rebuild tables (and then their indexes) where the index has a high clustering factor; rebuilding the index alone does not change its clustering factor
· Use partitioning to reduce the number of blocks being visited
· Make sure optimizer statistics are up to date
· Relocate ‘hot’ datafiles
· Consider the usage of multiple buffer pools and cache frequently used indexes/tables in the KEEP pool
· Inspect the execution plans of the SQL statements that access data through indexes
· Is it appropriate for the SQL statements to access data through index lookups?
· Would full table scans be more efficient?
· Do the statements use the right driving table?
· The optimization goal is to minimize the number of both logical and physical I/Os.
Remarks:
· The Oracle process wants a block that is currently not in the SGA, and it is waiting for the database block to be read into the SGA from disk.
· Significant db file sequential read wait time is most likely an application issue.
· If the DBA_INDEXES.CLUSTERING_FACTOR of the index approaches the number of blocks in the table, then most of the rows in the table are ordered. This is desirable.
· However, if the clustering factor approaches the number of rows in the table, it means the rows in the table are randomly ordered and thus it requires more I/Os to complete the operation. You can improve the index’s clustering factor by rebuilding the table so that rows are ordered according to the index key and rebuilding the index thereafter.
· The OPTIMIZER_INDEX_COST_ADJ and OPTIMIZER_INDEX_CACHING initialization parameters can influence the optimizer to favour the nested loops operation and choose an index access path over a full table scan.
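To make the clustering factor remarks concrete, here is a minimal Python sketch of one way to think about the figure: walk the index entries in key order and count how often the table block changes from one entry to the next. The function name and the toy block lists are illustrative assumptions, not Oracle internals.

```python
def clustering_factor(block_ids_in_key_order):
    """Count block changes while walking index entries in key order.

    Hypothetical helper for illustration: each element is the table block
    that holds the row for the next index key.
    """
    factor = 0
    previous_block = None
    for block_id in block_ids_in_key_order:
        if block_id != previous_block:
            factor += 1
            previous_block = block_id
    return factor


# 12 rows stored in 3 blocks, 4 rows per block.
ordered_rows = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]    # rows stored in key order
scattered_rows = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]  # consecutive keys sit in different blocks

print(clustering_factor(ordered_rows))    # 3, the number of blocks
print(clustering_factor(scattered_rows))  # 12, the number of rows
```

With the ordered layout, a range scan over all keys touches each table block once; with the scattered layout, every key sends the scan to a different block than the previous one.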
DB file scattered reads
Possible Causes :
· The Oracle session has requested and is waiting for multiple contiguous database blocks (up to DB_FILE_MULTIBLOCK_READ_COUNT) to be read into the SGA from disk.
· Full Table scans
· Fast Full Index Scans
Actions :
· Optimize multi-block I/O by setting the parameter DB_FILE_MULTIBLOCK_READ_COUNT
· Use partition pruning to reduce the number of blocks visited
· Consider the usage of multiple buffer pools and cache frequently used indexes/tables in the KEEP pool
· Optimize the SQL statement that initiated most of the waits. The goal is to minimize the number of physical and logical reads.
· Should the statement access the data by a full table scan or index FFS? Would an index range or unique scan be more efficient? Does the query use the right driving table?
· Are the SQL predicates appropriate for hash or merge join?
· If full scans are appropriate, can parallel query improve the response time?
· The objective is to reduce the demands for both the logical and physical I/Os, and this is best achieved through SQL and application tuning.
· Make sure all statistics are representative of the actual data. Check the LAST_ANALYZED date.
Remarks:
· If an application that has been running fine for a while suddenly clocks a lot of time on the db file scattered read event and there hasn’t been a code change, check whether one or more indexes have been dropped or become unusable.
· Also check whether the optimizer statistics have become stale.
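As a rough illustration of why DB_FILE_MULTIBLOCK_READ_COUNT matters for scattered reads, the sketch below estimates how many read calls a full scan needs for a given setting. It is an idealized model under stated assumptions: every read fetches the full count, and extent boundaries and already-cached blocks are ignored. The function name and the 10,000-block table are hypothetical.

```python
import math


def multiblock_read_calls(table_blocks, multiblock_read_count):
    """Idealized number of read calls needed to scan table_blocks blocks."""
    return math.ceil(table_blocks / multiblock_read_count)


# Hypothetical 10,000-block table scanned with different settings.
for mbrc in (8, 16, 128):
    print(mbrc, multiblock_read_calls(10_000, mbrc))
# 8 1250
# 16 625
# 128 79
```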
Log file parallel write
Possible Causes :
· LGWR waits while writing the contents of the redo log buffer to the online redo log files on disk
· I/O wait on the subsystem holding the online redo log files
Actions :
· Reduce the amount of redo being generated
· Do not leave tablespaces in hot backup mode for longer than necessary
· Do not use RAID 5 for redo log files
· Use faster disks for redo log files
· Ensure that the disks holding the archived redo log files and the online redo log files are separate so as to avoid contention
· Consider using NOLOGGING or UNRECOVERABLE options in SQL statements, where losing the ability to recover those changes from redo is acceptable
Log file sync:
Possible Causes :
· Oracle foreground processes are waiting for a COMMIT or ROLLBACK to complete
Actions :
· Tune LGWR to get good throughput to disk, e.g. do not put redo logs on RAID 5
· Reduce the overall number of commits by batching transactions so that there are fewer distinct COMMIT operations
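The commit-batching advice can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in database (an assumption, so that the example runs anywhere), and the table `t`, the helper names and the batch sizes are hypothetical. The same pattern applies to any driver that exposes `executemany` and `commit`.

```python
import sqlite3


def insert_rows(conn, rows, batch_size):
    """Insert rows, committing once per batch; return the number of commits."""
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("INSERT INTO t (v) VALUES (?)", batch)
        conn.commit()
        commits += 1
    return commits


def demo(batch_size, total_rows=1000):
    conn = sqlite3.connect(":memory:")  # stand-in database, not Oracle
    conn.execute("CREATE TABLE t (v INTEGER)")
    rows = [(i,) for i in range(total_rows)]
    commits = insert_rows(conn, rows, batch_size)
    stored = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return commits, stored


print(demo(batch_size=1))    # (1000, 1000): one commit per row
print(demo(batch_size=100))  # (10, 1000): same rows, far fewer commits
```

Each commit is a point at which a foreground session would wait on log file sync, so fewer commits for the same work means fewer such waits.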
Buffer busy waits:
Possible Causes :
· Buffer busy waits are common in an I/O-bound Oracle system.
· The two main cases where this can occur are:
· Another session is reading the block into the buffer
· Another session holds the buffer in an incompatible mode to our request
· These waits indicate read/read, read/write, or write/write contention.
· The Oracle session is waiting to pin a buffer. A buffer must be pinned before it can be read or modified. Only one process can pin a buffer at any one time.
· This wait can be intensified by a large block size, as more rows can be contained within the block
· This wait happens when a session wants to access a database block in the buffer cache but cannot, because the buffer is “busy”
· It is also often due to several processes repeatedly reading the same blocks (e.g. lots of people scanning the same index or data block)
Actions :
· The main way to reduce buffer busy waits is to reduce the total I/O on the system
· Depending on the block type, the actions will differ
Data Blocks
· Eliminate HOT blocks from the application. Check for repeatedly scanned / unselective indexes.
· Try rebuilding the object with a higher PCTFREE so that you reduce the number of rows per block.
· Check for ‘right-hand indexes’ (indexes that get inserted into at the same point by many processes).
· Increase INITRANS and MAXTRANS and reduce PCTUSED. This will make the table less dense.
· Reduce the number of rows per block
Segment Header
· Increase the number of FREELISTs and FREELIST GROUPs
Undo Header
· Increase the number of Rollback Segments
Free buffer waits:
Possible Causes :
· This means we are waiting for a free buffer but there are none available in the cache because there are too many dirty buffers in the cache
· Either the buffer cache is too small or the DBWR is slow in writing modified buffers to disk
· DBWR is unable to keep up with the write requests
· Checkpoints happening too fast – maybe due to high database activity and under-sized online redo log files
· Large sorts and full table scans are filling the cache with modified blocks faster than the DBWR is able to write to disk
· If the number of dirty buffers that need to be written to disk is larger than the number that DBWR can write per batch, then these waits can be observed
Actions :
· Reduce checkpoint frequency – increase the size of the online redo log files
· Examine the size of the buffer cache – consider increasing the size of the buffer cache in the SGA
· Set disk_asynch_io = true
· If not using asynchronous I/O, increase the number of db writer processes or dbwr slaves
· Ensure hot spots do not exist by spreading datafiles over disks and disk controllers
· Pre-sorting or reorganizing data can help
Enqueue waits
Possible Causes :
· This wait event indicates a wait for a lock that is held by another session (or sessions) in an incompatible mode to the requested mode.
TX Transaction Lock
· Generally due to table or application set up issues
· This indicates contention for row-level lock. This wait occurs when a transaction tries to update or delete rows that are currently locked by another transaction.
· This usually is an application issue.
TM DML enqueue lock
· Generally due to application issues, particularly if foreign key constraints have not been indexed.
ST lock
· Database actions that modify the UET$ (used extent) and FET$ (free extent) tables require the ST lock, which includes actions such as drop, truncate, and coalesce.
· Contention for the ST lock indicates there are multiple sessions actively performing dynamic disk space allocation or deallocation in dictionary managed tablespaces
Actions :
· Reduce waits and wait times
· The action to take depends on the lock type which is causing the most problems
· Whenever you see an enqueue wait event for the TX enqueue, the first step is to find out who the blocker is and if there are multiple waiters for the same resource
· Waits for TM enqueue in Mode 3 are primarily due to unindexed foreign key columns.
· Create indexes on foreign keys (< 10g)
· Following are some of the things you can do to minimize ST lock contention in your database:
· Use locally managed tablespaces
· Recreate all temporary tablespaces using the CREATE TEMPORARY TABLESPACE TEMPFILE… command.
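For the TM enqueue advice on unindexed foreign keys, the check can be sketched in Python as below. The inputs are hypothetical stand-ins for what you would read from the data dictionary for one child table, and the rule used (the foreign key columns must form the leading columns of some index, in any order) is stated here as a working assumption rather than a definitive test.

```python
def unindexed_foreign_keys(foreign_keys, indexes):
    """Return names of foreign keys with no index leading on their columns.

    foreign_keys: dict of constraint name -> tuple of FK columns
    indexes: list of tuples, each the ordered column list of one index
    Both inputs describe a single child table and are hypothetical.
    """
    missing = []
    for name, fk_columns in foreign_keys.items():
        width = len(fk_columns)
        covered = any(
            set(index_columns[:width]) == set(fk_columns)
            for index_columns in indexes
        )
        if not covered:
            missing.append(name)
    return sorted(missing)


# Hypothetical ORDER_LINES table with two foreign keys and two indexes.
fks = {
    "LINES_ORDER_FK": ("ORDER_ID",),
    "LINES_PRODUCT_FK": ("PRODUCT_ID",),
}
idx = [
    ("ORDER_ID", "LINE_NO"),    # leads with ORDER_ID, so it covers LINES_ORDER_FK
    ("LINE_NO", "PRODUCT_ID"),  # PRODUCT_ID is not the leading column
]

print(unindexed_foreign_keys(fks, idx))  # ['LINES_PRODUCT_FK']
```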
Cache buffers lru chain latch
Possible Causes :
· Processes need to get this latch when they need to move buffers based on the LRU block replacement policy in the buffer cache
· The cache buffers lru chain latch is acquired in order to introduce a new block into the buffer cache and when writing a buffer back to disk, specifically when trying to scan the LRU (least recently used) chain containing all the dirty blocks in the buffer cache.
· Competition for the cache buffers lru chain latch is symptomatic of intense buffer cache activity caused by inefficient SQL statements. Statements that repeatedly scan large unselective indexes or perform full table scans are the prime culprits.
· Heavy contention for this latch is generally due to heavy buffer cache activity, which can be caused, for example, by repeatedly scanning large unselective indexes
Actions :
· Contention for this latch can be avoided by implementing multiple buffer pools or by increasing the number of LRU latches with the parameter DB_BLOCK_LRU_LATCHES (the default value is generally sufficient for most systems).
· It is possible to reduce contention for the cache buffers lru chain latch by increasing the size of the buffer cache, thereby reducing the rate at which new blocks are introduced into the buffer cache.
Direct Path Reads
Possible Causes :
· These waits are associated with direct read operations, which read data directly into the session’s PGA, bypassing the SGA
· The “direct path read” and “direct path write” wait events are related to operations that are performed in the PGA, like sorting, group by operations, and hash joins
· In DSS type systems, or during heavy batch periods, waits on “direct path read” are quite normal. However, for an OLTP system these waits are significant
· These wait events can occur during sorting operations which is not surprising as direct path reads and writes usually occur in connection with temporary segments
· SQL statements with functions that require sorts, such as ORDER BY, GROUP BY, UNION, DISTINCT, and ROLLUP, write sort runs to the temporary tablespace when the input size is larger than the work area in the PGA
Actions :
· Ensure the OS asynchronous IO is configured correctly.
· Check for IO heavy sessions / SQL and see if the amount of IO can be reduced.
· Ensure no disks are IO bound.
· Set your PGA_AGGREGATE_TARGET to an appropriate value (if the parameter WORKAREA_SIZE_POLICY = AUTO), or set the *_area_size parameters manually (like sort_area_size), in which case you have to set WORKAREA_SIZE_POLICY = MANUAL.
· Whenever possible use UNION ALL instead of UNION, and where applicable use HASH JOIN instead of SORT MERGE and NESTED LOOPS instead of HASH JOIN.
· Make sure the optimizer selects the right driving table. Check to see if the composite index’s columns can be rearranged to match the ORDER BY clause to avoid the sort entirely.
· Also, consider automating the SQL work areas using PGA_AGGREGATE_TARGET in Oracle9i Database.
· Query V$SESSTAT to identify sessions with high “physical reads direct”
Remark:
· Default size of HASH_AREA_SIZE is twice that of SORT_AREA_SIZE
· Larger HASH_AREA_SIZE will influence the optimizer to go for hash joins instead of nested loops
· Hidden parameter DB_FILE_DIRECT_IO_COUNT can impact the direct path read performance. It sets the maximum I/O buffer size of direct read and write operations. Default is 1M in 9i
Direct Path Writes:
Possible Causes :
· These are waits that are associated with direct write operations that write data from users’ PGAs to data files or temporary tablespaces
· Direct load operations (eg: Create Table as Select (CTAS) may use this)
· Parallel DML operations
· Sort IO (when a sort does not fit in memory)
Actions :
· If the file indicates a temporary tablespace, check for unexpected disk sort operations.
· Ensure DISK_ASYNCH_IO is TRUE. This is unlikely to reduce wait times from the wait event timings but may reduce sessions’ elapsed times (as synchronous direct IO is not accounted for in wait event timings).
· Ensure the OS asynchronous IO is configured correctly.
· Ensure no disks are IO bound
Latch Free Waits
Possible Causes :
· This wait indicates that the process is waiting for a latch that is currently busy (held by another process).
· When you see a latch free wait event in the V$SESSION_WAIT view, it means the process failed to obtain the latch in the willing-to-wait mode after spinning _SPIN_COUNT times and went to sleep. When processes compete heavily for latches, they will also consume more CPU resources because of spinning. The result is a higher response time
Actions :
· If the TIME spent waiting for latches is significant then it is best to determine which latches are suffering from contention.
Remark:
· A latch is a kind of low level lock. Latches apply only to memory structures in the SGA. They do not apply to database objects. An Oracle SGA has many latches, and they exist to protect various memory structures from potential corruption by concurrent access.
· The time spent on latch waits is an effect, not a cause; the cause is that you are doing too many block gets, and block gets require cache buffer chain latching
Library cache latch
Possible Causes :
· The library cache latches protect the cached SQL statements and object definitions held in the library cache within the shared pool. The library cache latch must be acquired in order to add a new statement to the library cache.
· The application is making heavy use of literal SQL; using bind variables will reduce contention for this latch considerably
Actions :
· The aim is to ensure that the application reuses SQL statement representations as much as possible. Use bind variables whenever possible in the application.
· You can reduce the library cache latch hold time by properly setting the SESSION_CACHED_CURSORS parameter.
· Consider increasing shared pool.
Remark:
· Larger shared pools tend to have long free lists and processes that need to allocate space in them must spend extra time scanning the long free lists while holding the shared pool latch
· If your database is not yet on Oracle9i Database, an oversized shared pool can increase the contention for the shared pool latch.
Shared pool latch
Possible Causes :
· The shared pool latch is used to protect critical operations when allocating and freeing memory in the shared pool.
· Contention for the shared pool and library cache latches is mainly due to intense hard parsing. A hard parse applies to new cursors and to cursors that have been aged out and must be re-executed.
· Parsing a new SQL statement is expensive both in terms of CPU requirements and the number of times the library cache and shared pool latches may need to be acquired and released.
Actions :
· To reduce contention for the shared pool latch, avoid hard parses when possible: parse once, execute many.
· Eliminating literal SQL is also useful to avoid the shared pool latch. The size of the shared_pool and use of MTS (shared server option) also greatly influence the shared pool latch.
· The workaround is to set the initialization parameter CURSOR_SHARING to FORCE. This allows statements that differ in literal values but are otherwise identical to share a cursor and therefore reduces latch contention, memory usage, and hard parsing.
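The effect of literal SQL on cursor sharing can be illustrated with a small Python sketch. It only loosely imitates the idea behind CURSOR_SHARING = FORCE by replacing literals with a generic placeholder; the regular expression, the `:b` marker and the sample statements are assumptions made for the illustration, not how Oracle implements it.

```python
import re

# Matches single-quoted strings and standalone numbers (a simplification).
_LITERAL = re.compile(r"'[^']*'|\b\d+\b")


def with_bind_placeholders(sql):
    """Replace quoted strings and bare numbers with a generic bind marker."""
    return _LITERAL.sub(":b", sql)


# 100 statements that differ only in the literal value.
literal_sql = [f"SELECT name FROM emp WHERE empno = {n}" for n in range(1, 101)]

distinct_literal = len(set(literal_sql))
distinct_shared = len({with_bind_placeholders(s) for s in literal_sql})

print(distinct_literal)  # 100 distinct texts, each a candidate for a hard parse
print(distinct_shared)   # 1 shareable text
```

Writing the statement with a bind variable in the application gives the same single shareable text without relying on the workaround.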
Row cache objects latch
Possible Causes :
This latch comes into play when user processes are attempting to access the cached data dictionary values.
Actions :
· It is not common to have contention in this latch, and the main way to reduce contention for this latch is by increasing the size of the shared pool (SHARED_POOL_SIZE).
· Use Locally Managed tablespaces for your application objects, especially indexes
· Review and amend your database logical design; a good example is to merge or decrease the number of indexes on tables with heavy inserts
Remark:
· Configuring the library cache to an acceptable size usually ensures that the data dictionary cache is also properly sized. So tuning Library Cache will tune Row Cache indirectly.