1. DB File Scattered Read
This event signifies that the user process is reading buffers into the SGA buffer cache and is waiting for a physical I/O call to return. A db file scattered read issues a scattered read to read the data into multiple discontinuous memory locations. A scattered read is usually a multiblock read. It can occur for a fast full scan (of an index) in addition to a full table scan.
The db file scattered read wait event identifies that a full scan is occurring. When performing a full scan into the buffer cache, the blocks are read into memory locations that are not physically adjacent to each other. Such reads are called scattered read calls because the blocks are scattered throughout memory, which is why the corresponding wait event is named 'db file scattered read'. Multiblock reads (up to DB_FILE_MULTIBLOCK_READ_COUNT blocks) caused by full scans into the buffer cache show up as waits on this event.
In one line: the Oracle session has requested, and is waiting for, multiple contiguous database blocks to be read into the SGA from disk.
Causes:
· Full Table scans
· Fast Full Index Scans
· Missing, unselective, or unusable indexes
· Table not analyzed after index creation, or a lack of accurate statistics for the optimizer
If an application that has been running fine for a while suddenly clocks a lot of time on the db file scattered read event and there hasn’t been a code change, you might want to check whether one or more indexes have been dropped or become unusable.
SELECT p1 "file#", p2 "block#", p3 "class#",event
FROM v$session_wait
WHERE event IN ('db file scattered read');
Where P1, P2, P3 are:
P1 - The absolute file number
P2 - The block being read
P3 - The number of blocks (should be greater than 1)
SELECT relative_fno, owner, segment_name, segment_type
FROM DBA_EXTENTS
WHERE file_id = &FILE
AND &BLOCK BETWEEN block_id
AND block_id + blocks - 1;
SELECT
s.p1 file_id, s.p2 block_id,o.object_name obj,
o.object_type otype,
s.SQL_ID,
w.CLASS,event
FROM v$session s,
( SELECT ROWNUM CLASS#, CLASS FROM v$waitstat ) w,
ALL_OBJECTS o
WHERE
event IN ('db file scattered read')
AND
w.CLASS#(+)=s.p3
AND o.object_id (+)= s.row_wait_OBJ#
ORDER BY 1;
Find the related SQL statement using the sql_id:
SELECT sql_text FROM v$sqltext WHERE sql_id = '&sql_id' ORDER BY piece;
You can also find the objects using the SQL below:
Finding the SQL Statement executed by Sessions Waiting for I/O
SELECT SQL_ADDRESS, SQL_HASH_VALUE
FROM V$SESSION
WHERE EVENT ='read by other session';
Finding the Object Requiring I/O
SELECT row_wait_obj#
FROM V$SESSION
WHERE EVENT = 'db file scattered read';
To identify the object and object type contended for, query DBA_OBJECTS using the value for ROW_WAIT_OBJ# that is returned from V$SESSION. For example:
SELECT owner, object_name, subobject_name, object_type
FROM DBA_OBJECTS
WHERE data_object_id = &row_wait_obj;
Once you identify the hot blocks, the segments they belong to, and the related queries, you can reduce the waits using the following solutions:
1. Optimize the SQL statement that initiated most of the waits. The goal is to minimize the number of physical and logical reads.
2. Optimize multiblock I/O by setting the parameter DB_FILE_MULTIBLOCK_READ_COUNT appropriately; a value that is too high favors full scans, so consider reducing it.
3. Use partition pruning to reduce the number of blocks visited. Consider multiple buffer pools and cache frequently used indexes/tables in the KEEP pool. Make sure the query uses the right driving table.
4. Are the SQL predicates appropriate for a hash or merge join?
5. If full scans are appropriate, can parallel query improve the response time?
6. The objective is to reduce the demand for both logical and physical I/O, and this is best achieved through SQL and application tuning.
7. Gather statistics on the related objects if they are missing.
8. Check the LAST_ANALYZED date in the USER_TABLES view.
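Points 7 and 8 above can be sketched as follows; the 7-day threshold and the table name are illustrative assumptions:

```sql
-- Find tables with missing or old statistics (7 days is an arbitrary threshold)
SELECT table_name, last_analyzed, num_rows
FROM   user_tables
WHERE  last_analyzed IS NULL
   OR  last_analyzed < SYSDATE - 7;

-- Re-gather statistics on one stale table (table name is illustrative)
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'SALES', cascade => TRUE);
```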
2. DB File Sequential Read
An Oracle session logs the db file sequential read wait event when it has to wait for a single-block I/O read request to complete. Oracle issues single-block I/O read requests when reading from indexes, rollback segments, sort segments, control files, datafile headers and tables (when tables are accessed via rowids). A sequential read is usually a single-block read, although it is possible to see sequential reads for more than one block (see P3). The actual object being waited on can be determined from the p1, p2, p3 information in v$session_wait.
SELECT p1 "file#", p2 "block#", p3 "class#",event
FROM v$session_wait
WHERE event IN ('db file sequential read');
Select * from v$session_event
where event = 'db file sequential read'
order by time_waited;
Select segment_name, partition_name, segment_type, tablespace_name
from dba_extents a, v$session_wait b
where b.p2 between a.block_id and (a.block_id + a.blocks - 1)
and a.file_id = b.p1
and b.event = 'db file sequential read';
Select a.sid, a.serial#, a.username, a.osuser, b.sql_text
from v$session a, v$sqltext b
where a.sql_hash_value = b.hash_value
and a.sql_address = b.address and a.sid in (select sid from v$session_wait
where event = 'db file sequential read')
order by a.sid, b.hash_value, b.piece;
Note: Where P1 = file#, P2 = block#, P3 = blocks (should be 1)
Some amount of this wait event is normal in any database, since index scans require I/O. But if the seconds in wait are consistently greater than 0, you should tune index I/O.
To reduce this wait event follow the below points:
1. Tuning SQL statements to reduce unnecessary I/O requests is the only guaranteed way to reduce "db file sequential read" wait time.
2. Distribute indexes across different file systems to reduce I/O contention.
3. Tune the physical devices; spread the data across different disks to reduce I/O contention.
4. Use faster disks to reduce the time spent on the I/O requests that remain.
5. Increasing the buffer cache (db_block_buffers, or db_cache_size in later releases) can sometimes help.
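To see which segments are driving the single-block reads, v$segment_statistics can be ranked by physical reads (a sketch; this view exists from 9i Release 2 onwards):

```sql
SELECT *
FROM ( SELECT owner, object_name, object_type, value AS physical_reads
       FROM   v$segment_statistics
       WHERE  statistic_name = 'physical reads'
       ORDER  BY value DESC )
WHERE ROWNUM <= 10;
```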
3. Read by other session wait event
Read by other session (or buffer busy) waits occur when another session is reading the block into the buffer, or when another session holds the buffer in a mode incompatible with our request. This wait was part of the buffer busy waits event before Oracle 10g.
These waits indicate read/read, read/write, or write/write contention. The Oracle session is waiting to pin a buffer. A buffer must be pinned before it can be read or modified. Only one process can pin a buffer at any one time.
The "read by other session" wait event indicates a wait for another session to finish reading the data from disk into the Oracle buffer cache. The main cause is contention for "hot" blocks or objects, i.e. several processes repeatedly reading the same blocks, for example many sessions scanning the same index or performing full table scans on the same table.
When a query requests data, Oracle first reads the data from disk into the SGA. If two or more sessions request the same information, the first session reads the data into the buffer cache while the other sessions wait.
+---------Find the file_id and block_id using below query---------+
SELECT p1 "file#", p2 "block#", p3 "class#"
FROM v$session_wait
WHERE event = 'read by other session';
+--------Find the object name using below query----------+
SELECT relative_fno, owner, segment_name, segment_type
FROM dba_extents
WHERE file_id = &FILE
AND &BLOCK BETWEEN block_id
AND block_id + blocks - 1;
+----------You can also find the sql id and sql statement using below queries ---------+
SELECT
s.p1 file_id, s.p2 block_id,o.object_name obj,
o.object_type otype,
s.SQL_ID,
w.CLASS,event
FROM v$session s,
( SELECT ROWNUM CLASS#, CLASS FROM v$waitstat ) w,
all_objects o
WHERE
event IN ('read by other session')
AND
w.CLASS#(+)=s.p3
AND o.object_id (+)= s.row_wait_OBJ#
ORDER BY 1;
SELECT sql_text FROM v$sqltext WHERE sql_id = '&sql_id' ORDER BY piece;
Once you identify the hot blocks, the segments they belong to, and the related queries, you can reduce the waits using the following solutions:
+---------Tune the query---------+
The contended block can be a data block, a segment header, or an undo block.
The main way to reduce buffer busy waits is to reduce the total I/O on the system by tuning the query. Depending on the block type, the actions differ:
Data Blocks:
-Eliminate HOT blocks from the application.
-Reduce the number of rows per block (by moving the table to a tablespace with a smaller block size, or by the techniques below).
-Try rebuilding the object with a higher PCTFREE so that you reduce the number of rows per block.
-Increase INITRANS and MAXTRANS and reduce PCTUSED. This will make the table less dense.
-Check for repeatedly scanned / unselective indexes.
-Check for 'right-hand indexes' (indexes that get inserted into at the same point by many processes).
INDEX: then the insert index leaf block is probably hot,
solutions:- Hash partition the index, Use reverse key index.
TABLE: then insert block is hot,
solutions:- Use free lists, Put Object in ASSM tablespace
Segment Header:
Use automatic segment space management (ASSM, bitmap-managed segments), or increase/add FREELISTs and FREELIST GROUPs.
If "segment header" waits occur at the same time as CLASS= "data block" on the same object, and the object is of OTYPE= "TABLE", this is just confirmation that the table needs to use free lists or ASSM.
This is most likely an extent allocation problem: look at the extent size on the tablespace and increase it so that there are fewer extent allocations and less contention on the File Header Block.
Undo Header: Increase the number of rollback segments.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Block contention wait events are also recorded in the V$WAITSTAT view. Since 10g, V$SESSION integrates the wait event data along with the row wait information, so the V$SESSION query shown above can also be used to find the SQL statements.
SELECT SQL_FULLTEXT FROM V$SQL WHERE sql_id = '&sql_id';
SQL_FULLTEXT is a CLOB column that contains the full query text.
ASH samples active sessions every second, so v$active_session_history can also be queried for buffer busy / read by other session waits.
SELECT
p1 file_id , p2 block_id ,o.object_name obj,
o.object_type otype,
ash.SQL_ID,
w.CLASS
FROM v$active_session_history ash,
( SELECT ROWNUM CLASS#, CLASS FROM v$waitstat ) w,
all_objects o
WHERE event='read by other session'
AND w.CLASS#(+)=ash.p3
AND o.object_id (+)= ash.CURRENT_OBJ#
AND ash.sample_time > SYSDATE - &MIN/(60*24)
ORDER BY 1;
+------- Segment Contention (Table /Index)----------+
SELECT object_name, obj#, statistic_name, value
FROM v$segment_statistics
WHERE statistic_name LIKE '%waits%' AND value > 0
ORDER BY statistic_name, value DESC;
select object_name, value from v$segment_statistics
where owner='JW' and object_type='INDEX' and statistic_name='buffer busy waits';
------------------------------------------------------------------------------------------------------------
Scenario:-
------------------------------------------------------------------------------------------------------------
Recently I was involved in resolving an ongoing degrading performance issue for an application. It was at a state whereby the users were just frustrated and the business was not happy at all.
When it ended up in my court, the first thing I did was to identify the window when the user experience was really bad. I then liaised with the app vendors to understand the application a bit and what else was running on it.
It turned out that a small ETL job runs on the database regularly during business hours, four times a day. It fulfilled a reporting requirement, so stopping it was out of the question. The application slowed down when the ETL kicked off, and response time recovered when the ETL finished.
After analysing the statspack report of the database (it was 9i), I found that the top wait event during the slowness window, apart from I/O, was 'buffer busy waits':
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~
                                                           % Total
Event                              Waits      Time (s)    Ela Time
------------------------------ -----------  ----------  --------
db file sequential read          2,198,322      13,940     40.34
buffer busy waits                1,273,203       9,852     28.51
CPU time                                         4,241     12.27
enqueue                              1,170       3,414      9.88
latch free                          97,147       1,542      4.46
Buffer busy waits were not in the top 5 wait events when the performance issue was absent.
I checked whether there was a link between the periods when users complained about database performance and the buffer busy waits, and there was: users complained exactly when buffer busy waits were high. The contention was on data blocks.
After running some monitoring scripts to find out which object was affected, a massive table was identified. The database default block size was 32K, meaning a lot of rows per block. To make it worse, the table was sitting in a locally managed tablespace with manual segment space management. Another interesting finding during the analysis was that the reason code for the buffer busy wait event was 300
(Block is being read by another session, and no other suitable block image was found, so we wait until the read is completed. This may also occur after a buffer cache assumed deadlock. The kernel can't get a buffer in a certain amount of time and assumes a deadlock. Therefore it will read the CR version of the block. )
It simply indicated we were having hot block contention.
Hence, my recommendation was to create another tablespace with an 8K block size and automatic segment space management, and to move the problem table into it. Once the migration was done, the buffer busy waits disappeared from the top 5 wait events, and users found the application response time normal even while the ETL was running. The business was happy as well.
The buffer busy wait statistics also showed better results after the change.
------------------------------------------------------------------------------------------------------------
To reduce contention for table blocks due to delete, select or update statements, reduce the number of rows per block. This can be done by using a smaller block size.
To reduce contention for table blocks due to insert statements, increase the number of freelists, or buffer frames.
To reduce contention for index blocks the best strategy is to implement a Reverse index.
In most situations the goal is to spread queries over a greater number of blocks, to avoid concentrating on any single one.
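The smaller-block-size approach above can be sketched as follows. The tablespace name, datafile path, sizes, and object names are illustrative assumptions; a non-default 8K block size also requires DB_8K_CACHE_SIZE to be set first, and moving the table invalidates its indexes:

```sql
CREATE TABLESPACE data_8k
  DATAFILE '/u01/oradata/ORCL/data_8k01.dbf' SIZE 1G
  BLOCKSIZE 8K
  EXTENT MANAGEMENT LOCAL
  SEGMENT SPACE MANAGEMENT AUTO;

-- Move the hot table, then rebuild its now-invalid indexes
ALTER TABLE scott.hot_table MOVE TABLESPACE data_8k;
ALTER INDEX scott.hot_table_pk REBUILD;
```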
SQL> select * from v$waitstat where class='data block';
CLASS COUNT TIME
------------------ ---------- ----------
data block 1676599 115212971
4. Free Buffer Wait
Possible Causes:
- This means we are waiting for a free buffer but there are none available in the cache because there are too many dirty buffers in the cache.
- Either the buffer cache is too small or the DBWR is slow in writing modified buffers to disk.
- DBWR is unable to keep up with the write requests.
- Checkpoints happening too fast – maybe due to high database activity and under-sized online redo log files.
- Large sorts and full table scans are filling the cache with modified blocks faster than the DBWR is able to write to disk.
- If the number of dirty buffers that need to be written to disk is larger than the number that DBWR can write per batch, then these waits can be observed.
Actions:
- Reduce checkpoint frequency.
- Increase the size of the online redo log files.
- Examine the size of the buffer cache; consider increasing it in the SGA.
- Set DISK_ASYNCH_IO = TRUE.
- If not using asynchronous I/O increase the number of db writer processes or dbwr slaves.
- Ensure hot spots do not exist by spreading datafiles over disks and disk controllers.
- Pre-sorting or reorganizing data can help.
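The severity can be gauged from v$system_event, and where asynchronous I/O is unavailable, more writer processes may help (the value 2 is illustrative; the change requires an instance restart):

```sql
SELECT event, total_waits, time_waited
FROM   v$system_event
WHERE  event = 'free buffer waits';

-- Only if asynchronous I/O cannot be used; takes effect after restart
ALTER SYSTEM SET db_writer_processes = 2 SCOPE = SPFILE;
```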
Remarks:
- Understanding and Tuning Buffer Cache and DBWR Note# 62172.1
- How to Identify a Hot Block within the database Buffer Cache. Note# 163424.1
5. Log file sync
Possible Causes:
- Oracle foreground processes are waiting for a COMMIT or ROLLBACK to complete.
Actions:
- Tune LGWR to get good throughput to disk, e.g. do not put redo logs on RAID 5.
- Reduce overall number of commits by batching transactions so that there are fewer distinct COMMIT operations.
- Check that the redo logs are large enough; enlarge them so that log switches occur every 15 to 20 minutes.
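A quick way to see whether log file sync time is mostly LGWR I/O is to compare it with log file parallel write (a sketch; time_waited is in centiseconds):

```sql
SELECT event, total_waits, time_waited,
       ROUND(time_waited / NULLIF(total_waits, 0), 2) AS avg_wait_cs
FROM   v$system_event
WHERE  event IN ('log file sync', 'log file parallel write');
-- If the averages are close, the time is dominated by LGWR I/O; if
-- 'log file sync' is much higher, look at commit frequency instead.
```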
Remarks:
- Reference Note# 34592.1
- High Waits on log file sync Note# 125269.1
- Tuning the Redolog Buffer Cache and Resolving Redo Latch Contention Note# 147471.1
6. Log file parallel write
Possible Causes:
- LGWR waits while writing contents of the redo log buffer cache to the online log files on disk
- I/O wait on sub system holding the online redo log files
Actions:
- Reduce the amount of redo being generated
- Do not leave tablespaces in hot backup mode for longer than necessary
- Do not use RAID 5 for redo log files
- Use faster disks for redo log files
- Ensure that the disks holding the archived redo log files and the online redo log files are separate so as to avoid contention
- Consider using NOLOGGING or UNRECOVERABLE options in SQL statements
Remarks:
- Reference Note# 34583.1
7. Log File Switch
Possible Causes:
All commit requests are waiting for 'log file switch (archiving needed)' or 'log file switch (checkpoint incomplete)'. Ensure that the archive disk is not full or slow. DBWR may be too slow because of I/O. You may need to add more or larger redo logs, and potentially add database writers if DBWR is the problem.
Actions:
1. Optimize the number of Redo log Switches - As a general rule of thumb, it should not be more than 3-4 per hour during peak load.
2. Optimize the Redo Log Switch Time.
Addressing point number 1 was very straightforward. Our redo logs were 256MB in size. We increased them to 512MB, and redo log switches dropped to 3-4 per hour. This reduced the number of failures per hour by half, but further tuning was still required.
For point number 2, after some research, we learnt that during a redo log switch Oracle writes the required information to all the control files on the system SEQUENTIALLY, not in parallel; this happens on both primary and standby. We reviewed the setup and found that we had 3 control files on mount points with internal hardware-level RAID (1+0).
Since we already had enough Fault Tolerance with respect to our setup and our control files are backed up on a frequent basis, we decided to keep just one control file instead of 3 on both Primary and Standby.
This reduced our redo log switch time by about 60%.
One thing to note is that our System did not have a superfast IO. For systems with High end SAN storage and enough Write Cache, point number 2 might not be required.
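Log switch frequency can be checked from v$log_history, for example per hour over the last day:

```sql
SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour,
       COUNT(*)                               AS switches
FROM   v$log_history
WHERE  first_time > SYSDATE - 1
GROUP  BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
ORDER  BY 1;
```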
8. Log Buffer Space
Possible Causes:
This wait occurs because you are writing the log buffer faster than LGWR can write it to the redo logs, or because log switches are too slow. To address this problem, increase the size of the log files, or increase the size of the log buffer, or get faster disks to write to. You might even consider using solid-state disks, for their high speed.
Actions:
- Put log files on faster disks.
- Look at tuning checkpoint or archive processes (log files switches).
- Review application design, use NOLOGGING operations where appropriate, and avoid changing more data than required.
- Finally, check refresh methods when using Materialized Views to reduce logging activity.
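Two v$sysstat statistics help confirm this diagnosis (a sketch):

```sql
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('redo buffer allocation retries', 'redo log space requests');
-- 'redo buffer allocation retries' should stay near zero; if it keeps growing,
-- consider a larger LOG_BUFFER. 'redo log space requests' points at slow or
-- undersized online redo logs.
```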
9. Direct Path Writes
Possible Causes:
- These are waits that are associated with direct write operations that write data from users’ PGAs to data files or temporary tablespaces.
- Direct load operations (e.g. Create Table As Select (CTAS)) may use this.
- Parallel DML operations.
- Sort IO (when a sort does not fit in memory).
Actions:
- If the file indicates a temporary tablespace check for unexpected disk sort operations.
- Ensure DISK_ASYNCH_IO is TRUE. This is unlikely to reduce the wait event timings themselves but may reduce sessions' elapsed times (as synchronous direct I/O is not accounted for in wait event timings).
- Ensure the OS asynchronous IO is configured correctly.
- Ensure no disks are I/O bound.
10. Direct Path Reads
Possible Causes:
- These waits are associated with direct read operations which read data directly into the sessions PGA bypassing the SGA.
- The "direct path read" and "direct path write" wait events are related to operations performed in the PGA, such as sorting, GROUP BY operations, and hash joins.
- In DSS type systems, or during heavy batch periods, waits on "direct path read" are quite normal.
- However, for an OLTP system these waits are significant.
- These wait events can occur during sorting operations, which is not surprising as direct path reads and writes usually occur in connection with temporary segments.
- SQL statements with functions that require sorts, such as ORDER BY, GROUP BY, UNION, DISTINCT, and ROLLUP, write sort runs to the temporary tablespace when the input size is larger than the work area in the PGA.
Actions:
- Ensure the OS asynchronous IO is configured correctly.
- Check for IO heavy sessions / SQL and see if the amount of IO can be reduced.
- Ensure no disks are IO bound.
- Set PGA_AGGREGATE_TARGET to an appropriate value (if WORKAREA_SIZE_POLICY = AUTO), or set the *_AREA_SIZE parameters manually (such as SORT_AREA_SIZE) together with WORKAREA_SIZE_POLICY = MANUAL.
- Whenever possible use UNION ALL instead of UNION, and where applicable use HASH JOIN instead of SORT MERGE and NESTED LOOPS instead of HASH JOIN.
- Make sure the optimizer selects the right driving table. Check to see if the composite index’s columns can be rearranged to match the ORDER BY clause to avoid sort entirely.
- Also, consider automating the SQL work areas using PGA_AGGREGATE_TARGET in Oracle9i Database.
- Query V$SESSTAT to identify sessions with high "physical reads direct".
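Whether sorts are spilling to disk can be checked from v$sysstat (the workarea statistics exist when automatic PGA management is in use):

```sql
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('sorts (memory)', 'sorts (disk)',
                'workarea executions - optimal',
                'workarea executions - onepass',
                'workarea executions - multipass');
-- Non-zero onepass/multipass counts suggest PGA_AGGREGATE_TARGET is too small.
```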
Remarks:
- Default size of HASH_AREA_SIZE is twice that of SORT_AREA_SIZE.
- Larger HASH_AREA_SIZE will influence optimizer to go for hash joins instead of nested loops.
- The hidden parameter DB_FILE_DIRECT_IO_COUNT can impact direct path read performance. It sets the maximum I/O buffer size of direct read and write operations. The default is 1M in 9i.
- How to identify resource intensive SQL statements?
11. Shared pool latch
Possible Causes:
- The shared pool latch is used to protect critical operations when allocating and freeing memory in the shared pool.
- Contention for the shared pool and library cache latches is mainly due to intense hard parsing. A hard parse applies to new cursors and to cursors that are aged out and must be re-executed.
- The cost of parsing a new SQL statement is expensive both in terms of CPU requirements and the number of times the library cache and shared pool latches may need to be acquired and released.
Actions:
- To reduce shared pool latch contention, avoid hard parses when possible: parse once, execute many.
- Eliminating literal SQL is also useful to avoid the shared pool latch. The size of the shared_pool and use of MTS (shared server option) also greatly influences the shared pool latch.
- A workaround is to set the initialization parameter CURSOR_SHARING to FORCE. This allows statements that differ only in literal values to share a cursor, and therefore reduces latch contention, memory usage, and hard parsing.
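The hard parse rate can be checked from v$sysstat (a sketch):

```sql
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('parse count (total)', 'parse count (hard)', 'execute count');
-- A hard parse count close to the total parse count, with few executions per
-- parse, usually means literal SQL and a candidate for bind variables or
-- CURSOR_SHARING = FORCE.
```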
Remarks:
- <Note 62143.1> explains how to identify and correct problems with the shared pool, and shared pool latch
12. Library cache latch
Possible Causes:
- The library cache latches protect the cached SQL statements and objects definitions held in the library cache within the shared pool. The library cache latch must be acquired in order to add a new statement to the library cache.
- The application is making heavy use of literal SQL; using bind variables will reduce this latch contention considerably.
Actions:
- The goal is to ensure that the application reuses shared SQL representations as much as possible; use bind variables whenever possible in the application.
- You can reduce the library cache latch hold time by properly setting the SESSION_CACHED_CURSORS parameter.
- Consider increasing shared pool.
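Literal SQL candidates can be spotted by grouping statements on a text prefix; this is a rough heuristic (the prefix length and threshold are arbitrary), not an exact match:

```sql
SELECT SUBSTR(sql_text, 1, 40) AS sql_prefix, COUNT(*) AS copies
FROM   v$sql
GROUP  BY SUBSTR(sql_text, 1, 40)
HAVING COUNT(*) > 10
ORDER  BY copies DESC;
```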
Remarks:
- Larger shared pools tend to have long free lists and processes that need to allocate space in them must spend extra time scanning the long free lists while holding the shared pool latch.
- If your database is not yet on Oracle9i Database, an oversized shared pool can increase contention for the shared pool latch.
13. Latch Free Waits
Possible Causes:
- This wait indicates that the process is waiting for a latch that is currently busy (held by another process).
- When you see a latch free wait event in the V$SESSION_WAIT view, it means the process failed to obtain the latch in the willing-to-wait mode after spinning _SPIN_COUNT times and went to sleep. When processes compete heavily for latches, they will also consume more CPU resources because of spinning. The result is a higher response time.
Actions:
· If the TIME spent waiting for latches is significant then it is best to determine which latches are suffering from contention.
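The latches suffering from contention can be ranked by sleeps:

```sql
SELECT name, gets, misses, sleeps
FROM   v$latch
WHERE  sleeps > 0
ORDER  BY sleeps DESC;
```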
Remarks:
- A latch is a kind of low level lock.
- Latches apply only to memory structures in the SGA. They do not apply to database objects. An Oracle SGA has many latches, and they exist to protect various memory structures from potential corruption by concurrent access.
- The time spent on latch waits is an effect, not a cause; the cause is that you are doing too many block gets, and block gets require cache buffer chain latching.
- What are Latches and what causes Latch contention
- Database Lock and Latch Information Knowledge Browser Product Page
14. Cache Buffer LRU Chain Latch
Possible Causes:
- Processes need to get this latch when they need to move buffers based on the LRU block replacement policy in the buffer cache
- The cache buffer lru chain latch is acquired in order to introduce a new block into the buffer cache and when writing a buffer back to disk, specifically when trying to scan the LRU (least recently used) chain containing all the dirty blocks in the buffer cache.
- Competition for the cache buffers lru chain latch is symptomatic of intense buffer cache activity caused by inefficient SQL statements. Statements that repeatedly scan large unselective indexes or perform full table scans are the prime culprits.
- Heavy contention for this latch is generally due to heavy buffer cache activity which can be caused, for example, by: Repeatedly scanning large unselective Indexes.
Actions:
· Contention on this latch can be avoided by implementing multiple buffer pools or by increasing the number of LRU latches with the parameters _DB_BLOCK_LRU_LATCHES and _db_block_hash_buckets (the default value is generally sufficient for most systems).
· It is possible to reduce contention for the cache buffers lru chain latch by increasing the size of the buffer cache, thereby reducing the rate at which new blocks are introduced into it.
select count(*) child_count, sum(gets) sum_gets,
sum(misses) sum_misses, sum(sleeps) sum_sleeps
from v$latch_children where name = 'cache buffers chains';
Finding a hot block:
select p1 from v$session_wait where event = 'latch: cache buffers chains';
If you are not using ASSM (bitmap freelists), you can easily relieve the buffer chain latch wait by adding freelists, up to your high-water mark of concurrent DML on the object:
alter index hot_idx freelists 4;
MOSC has this script to locate a hot block:
select /*+ RULE */
e.owner ||'.'|| e.segment_name segment_name,
e.extent_id extent#,
x.dbablk - e.block_id + 1 block#,
x.tch,
l.child#
from sys.v$latch_children l, sys.x$bh x, sys.dba_extents e
where x.hladdr = 'ADDR' and
e.file_id = x.file# and
x.hladdr = l.addr and
x.dbablk between e.block_id and e.block_id + e.blocks -1
order by x.tch desc;
15. Cache Buffers Chains Latch
Possible Causes:
· This latch is acquired when searching for data blocks. The buffer cache is organized as chains of buffers, and each chain is protected by a child latch when it needs to be scanned.
- Hot blocks are another common cause of cache buffers chains latch contention. This happens when multiple sessions repeatedly access one or more blocks that are protected by the same child cache buffers chains latch.
- SQL statements with high BUFFER_GETS (logical reads) per EXECUTIONS are the main culprits.
- Multiple concurrent sessions are executing the same inefficient SQL that is going after the same data set.
- CPU utilization continuously high.
Actions:
- Reducing contention for the cache buffer chains latch will usually require reducing logical I/O rates by tuning and minimizing the I/O requirements of the SQL involved. High I/O rates could be a sign of a hot block (meaning a block highly accessed).
- Exporting the table, increasing the PCTFREE significantly, and importing the data. This minimizes the number of rows per block, spreading them over many blocks. Of course, this is at the expense of storage and full table scans operations will be slower.
- Minimizing the number of records per block in the table.
- For indexes, you can rebuild them with higher PCTFREE values, bearing in mind that this may increase the height of the index.
- Consider reducing the block size.
- Starting in Oracle9i Database, Oracle supports multiple block sizes. If the current block size is 16K, you may move the table or recreate the index in a tablespace with an 8K block size. This too will negatively impact full table scans operations. Also, various block sizes increase management complexity.
- Find the hot block and its TCH (touch count). If no hot block can be identified (all TCH values are low), check who is holding the latch by looking up the latch address.
-- Find sessions waiting on the latch (the latch address is in P1):
select sid, event, p1text, p1, p2text, p2, p3text, p3
from v$session_wait
where event = 'latch: cache buffers chains';
Reference:
High CPU utilization caused by cache buffers chains contention:
https://orainternals.wordpress.com/2008/07/30/tuning-latch-contention-cache-buffers-chain-latches/
http://arup.blogspot.in/2014/11/cache-buffer-chains-demystified.html
Remarks:
- The default number of hash latches is usually 1024.
- The number of hash latches can be adjusted by the parameter _DB_BLOCKS_HASH_LATCHES.
- What are latches and what causes latch contention
16. Enqueue Wait
Possible Causes:
· This wait event indicates a wait for a lock that is held by another session (or sessions) in an incompatible mode to the requested mode.
- TX Transaction Lock
o Generally due to table or application set up issues.
o This indicates contention for row-level lock.
o This wait occurs when a transaction tries to update or delete rows that are currently locked by another transaction.
o This usually is an application issue.
- TM DML enqueue lock
o Generally due to application issues, particularly if foreign key constraints have not been indexed.
· ST lock
o Database actions that modify the UET$ (used extent) and FET$ (free extent) tables require the ST lock, which includes actions such as drop, truncate, and coalesce.
o Contention for the ST lock indicates there are multiple sessions actively performing dynamic disk space allocation or deallocation in dictionary managed tablespaces.
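The TX row-lock scenario above can be illustrated with a toy Python sketch (names and timings are illustrative; a `threading.Lock` stands in for the TX enqueue on one row): the second session blocks on the row until the first session commits.

```python
import threading
import time

row_lock = threading.Lock()   # stands in for the TX enqueue on one row
events = []

def session_1():
    with row_lock:                      # UPDATE ... (row is now locked)
        events.append("s1: row locked")
        time.sleep(0.2)                 # transaction stays open
        events.append("s1: commit")     # leaving the block = COMMIT releases the lock

def session_2():
    time.sleep(0.05)                    # arrives after session_1 holds the row
    events.append("s2: waiting on enq: TX - row lock contention")
    with row_lock:                      # blocks until session_1 commits
        events.append("s2: row locked")

t1 = threading.Thread(target=session_1)
t2 = threading.Thread(target=session_2)
t1.start(); t2.start()
t1.join(); t2.join()
print(events)
```

The fix is almost always in the application: commit sooner, or avoid having two transactions update the same rows concurrently.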
Actions:
· Reduce waits and wait times.
· The action to take depends on the lock type which is causing the most problems
· Whenever you see an enqueue wait event for the TX enqueue, the first step is to find out who the blocker is and if there are multiple waiters for the same resource.
· Waits for TM enqueue in Mode 3 are primarily due to unindexed foreign key columns.
· Create indexes on foreign key columns (especially important in releases before 10g).
· Following are some of the things you can do to minimize ST lock contention in your database:
· Use locally managed tablespaces.
· Recreate all temporary tablespaces using the CREATE TEMPORARY TABLESPACE TEMPFILE… command.
Remarks:
· Maximum number of enqueue resources that can be concurrently locked is controlled by the ENQUEUE_RESOURCES parameter.
17 Row cache objects latch
Possible Causes:
- This latch comes into play when user processes are attempting to access the cached data dictionary values.
Actions:
- It is not common to have contention in this latch and the only way to reduce contention for this latch is by increasing the size of the shared pool (SHARED_POOL_SIZE).
- Use locally managed tablespaces for your application objects, especially indexes.
- Review and amend your database logical design; a good example is to merge or decrease the number of indexes on tables with heavy inserts.
Remarks:
- Configuring the library cache to an acceptable size usually ensures that the data dictionary cache is also properly sized. So tuning Library Cache will tune Row Cache indirectly
18 Idle Event
Possible Causes:
There are several idle wait events listed after the output; you can ignore them. Idle events are generally listed at the bottom of each section and include such things as SQL*Net message to/from client and other background-related timings. Idle events are listed in the stats$idle_event table
19 RDBMS IPC Message
Possible Causes:
It is typical of Oracle Database background processes to wait for more work.
For example, background processes such as LGWR, DBWn, and LMS0 wait until another (foreground or background) process posts them with work, such as a request to LGWR for a log flush. On UNIX platforms, the wait mechanism is implemented as a sleep on a specific semaphore associated with the process, and this wait time is accounted to the 'rdbms ipc message' wait event. Note that semaphore-based waits are also used in other wait scenarios, not just 'rdbms ipc message'.
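The sleep-until-posted mechanism can be sketched as follows (a rough analogue using a Python queue in place of the semaphore; `lgwr` and the messages are illustrative names): the background thread blocks until a foreground thread posts work, and that blocking time is what the 'rdbms ipc message' event accounts for.

```python
import threading
import queue

work = queue.Queue()
results = []

def lgwr():
    """Background process: sleep on its 'semaphore' until posted with work."""
    while True:
        item = work.get()      # blocks here: this is the 'rdbms ipc message' wait
        if item is None:       # shutdown message
            break
        results.append(f"flushed {item}")

writer = threading.Thread(target=lgwr)
writer.start()
work.put("redo batch 1")       # a foreground session posts the background process
work.put("redo batch 2")
work.put(None)
writer.join()
print(results)
```

Time spent blocked in `work.get()` is idle waiting for work, which is why this event is normally ignored during tuning.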
20 Row Lock Contention
Tuning idea:
------------------
1. I'm thinking of reorganizing the hot tables as well as their indexes, but my instinct is that this won't do much to avoid the huge row lock wait times.
2. I'm also looking at reducing the number of rows per block by increasing PCTFREE and decreasing PCTUSED, so the contention spreads over many blocks instead of one hot block.
Question
As the SQL statements related to those locked tables are SELECT ... FOR UPDATE, how could I tune this kind of statement?
Does someone have another idea for dealing with this row lock contention?
Ans
Increase the INITRANS value of the object referenced in the SELECT statement.
------------------------------------------------------------------------------------------------------
But first I have to know the current INITRANS setting of the targeted table.
To do that, I need to do the following:
1. record one block number
SELECT DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid),
       COUNT(*) NUMBER_ROWS_PER_BLOCK
FROM   SIMANG_D.INWARD
GROUP  BY DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)
ORDER  BY COUNT(*) DESC
/
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) NUMBER_ROWS_PER_BLOCK
------------------------------------ ---------------------
226534 5
226530 5
226504 5
226533 5
226538 5
226535 5
226525 5
226526 5
226532 5
228092 4
2. Dump the contents of a block through the following command
ALTER SYSTEM DUMP DATAFILE 4 BLOCK 226525;
3. see the result
Block header dump: 0x010374dd
Object id on Block? Y
seg/obj: 0xd863 csc: 0x00.cafe8 itc: 169 flg: E typ: 1 - DATA (max itl=169) brn: 1 bdba: 0x10374d1 ver: 0x01 opc: 0
inc: 0 exflg: 0
From this example, the block has 169 ITL slots (itc: 169), i.e. an effective INITRANS of 169.
So I need to check the current value first, before changing it.
Furthermore, as far as I know, a new INITRANS value does not affect existing blocks, only newly allocated ones, so I have to rebuild the table after changing INITRANS.
Question.
What is the overhead when dumping a contents of a block from the production database as I specified above?
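To make the INITRANS/ITL reasoning above concrete, here is a toy Python model (illustrative only; it ignores dynamic ITL growth into free block space and the MAXTRANS cap): a transaction must claim a free ITL slot before locking rows in a block, and when all slots are taken, later transactions wait.

```python
class Block:
    """Toy model of a data block's interested transaction list (ITL)."""

    def __init__(self, initrans):
        self.itl = [None] * initrans   # ITL slots, sized by INITRANS

    def begin_update(self, xid):
        """Claim a free ITL slot; return False if the caller must wait."""
        for i, slot in enumerate(self.itl):
            if slot is None:
                self.itl[i] = xid
                return True
        return False                   # all slots busy -> ITL wait

    def commit(self, xid):
        """Release the slot held by this transaction."""
        self.itl = [None if s == xid else s for s in self.itl]

blk = Block(initrans=2)
assert blk.begin_update("tx1")
assert blk.begin_update("tx2")
assert not blk.begin_update("tx3")   # no free ITL slot: tx3 must wait
blk.commit("tx1")
assert blk.begin_update("tx3")       # slot freed by tx1's commit
```

This is why raising INITRANS (and rebuilding so existing blocks are reformatted) reduces waits when many transactions update rows in the same block.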
21 Detect Index Leaf Block Contention
Index block contention can be tricky to diagnose and repair, especially since Oracle does not have many direct tuning knobs for tuning for contention.
The wait "enq: TX - index contention" indicates a wait on an index, and in a RAC environment you might also see "gc" (global cache) waits on indexes:
gc buffer busy waits on Index Branch Blocks
gc buffer busy waits on Index Leaf Blocks
gc current block busy on Remote Undo Headers
gc current split
gcs ast xid
gcs refuse xid
There are many ways to fix Oracle index contention (RAC and non-RAC environments).
Four techniques are commonly used to relieve this index contention issue:
1. Reverse key indexes
2. Sequences with the cache and noorder options
3. Using hash partitioned global indexes
4. Adjusting the index block size
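The first technique, reverse key indexes, relieves the rightmost-leaf-block hot spot because reversing the key bytes makes consecutive sequence values diverge in their leading byte. A small Python sketch of the idea (the fixed 8-byte width is an illustrative simplification of Oracle's number format):

```python
def reverse_key(n, width=8):
    """Reverse the byte order of a fixed-width key, as a reverse key index does."""
    return bytes(reversed(n.to_bytes(width, "big")))

# Sequential keys share their high-order bytes, so in a normal index
# they all sort next to each other on the rightmost leaf block.
seq = [1000001, 1000002, 1000003]
print([k.to_bytes(8, "big").hex() for k in seq])

# After reversal the fast-changing low-order byte leads, so consecutive
# inserts scatter across many different leaf blocks.
print([reverse_key(k).hex() for k in seq])
```

The trade-off is that range scans on the indexed column can no longer use the index efficiently, since adjacent key values are no longer stored together.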
select s.sid, q.sql_text
from v$session s, v$sql q
where s.state = 'WAITING'
and s.wait_class != 'Idle'
and s.event = 'enq: TX - index contention'
and (q.sql_id = s.sql_id or q.sql_id = s.prev_sql_id);