Search

Tuesday, June 28, 2016

AWR and ASH commands

- Automatic Workload Repository (AWR)
- Basic ideas
- dba_hist% views
- 11.2 enhancements
- AWR use cases & limitations

- Active Session History (ASH)
- Basic ideas
- 11.2 enhancements
- Typical use cases

Oracle Database performance diagnostics tools evolution:
- UTLBSTAT/UTLESTAT
- STATSPACK (starting from Oracle 8i)
- AWR (starting from Oracle 10g)

Automatic Workload Repository

SQL> select inst_id, name, value from gv$sysstat
where name in ('physical reads','session logical reads','user commits','user rollbacks')
order by name, inst_id;

INST_ID NAME VALUE
---------- ---------------------- ----------
1 physical reads 219525005
2 physical reads 905189162
1 session logical reads 5925054313
2 session logical reads 1.7457E+10
1 user commits 5294595
2 user commits 28113236
1 user rollbacks 8455066
2 user rollbacks 150792358


-- All stattistics:

select * from v$statname;
10.2.0.4 = 387; 10.2.0.5 = 395
11.2.0.2 = 628


SQL> select inst_id, event, total_waits, time_waited, average_wait,
sum(time_waited_micro) over (partition by event) twm_total
from gv$system_event
where wait_class='User I/O'
order by twm_total desc, event, inst_id;


INST_ID EVENT TOTAL_WAITS TIME_WAITED AVERAGE_WAIT TWM_TOTAL
---------- ------------------------- ----------- ----------- ------------ ----------
1 db file sequential read 27577537 40274551 1.46 3.8091E+12
2 db file sequential read 159856155 340633564 2.13 3.8091E+12
1 direct path read temp 44837941 8919281 .2 2.9431E+11
2 direct path read temp 122808964 20511343 .17 2.9431E+11
1 direct path read 1435296 1511142 1.05 9.9823E+10
2 direct path read 12075539 8471152 .7 9.9823E+10


-- All events:
select count(*) from v$event_name;
10.2.0.4 = 889; 10.2.0.5 = 916
11.2.0.2 = 1142


SQL> select sql_id, execs, rw, LIOs, PhIOs from ( select sql_id, /*sql_text,*/ sum(executions) execs, sum(rows_processed) rw, sum(buffer_gets) LIOs, sum(disk_reads) PhIOs from gv$sql where module='SQL Developer' and rownum>=1 group by sql_id/*, sql_text*/ order by PhIOs desc) where rownum<=5;



SQL_ID EXECS RW LIOS PHIOS
------------- ---------- ---------- ---------- ----------
3w1fk8r93360m 1 1 8156 348
davt444np8yr7 6 6 3583 289
7b775q3tc69a9 22 22 2899 213
58kz178ugwm87 29 29 2982 96
ayaq4d3bbt7k8 1 3 106 88



AWR just makes a copy of system views with scheduled time interval…
… and gives a possibility to make a report (diff) based on two snapshots.



AWR configuration.
 
- Oracle Diagnostic Pack is required (Oracle EE additional option).
- STATISTICS_LEVEL = TYPICAL or ALL
- (11g) CONTROL_MANAGEMENT_PACK_ACCESS = DIAGNOSTIC or DIAGNOSTIC+TUNING



AWR configuration.
 
SQL> select * from dba_hist_wr_control;
DBID SNAP_INTERVAL RETENTION TOPNSQL
---------- ------------------ ------------------ ----------
1486911308 +00000 01:00:00.0 +00007 00:00:00.0 DEFAULT


To change:
dbms_workload_repository.modify_snapshot_settings(
retention=>129600, -- New retention time (in minutes). 60*24*90=129600
interval=>30, -- New interval setting between each snapshot, in units of minutes
topnsql=>50 – Number 30..50000; 'DEFAULT', 'MAXIMUM') 



Create (additional) snapshot:
dbms_workload_repository.create_snapshot('ALL') -- flush_level = 'TYPICAL' or 'ALL‘


SQL> select dbms_workload_repository.create_snapshot from dual;


CREATE_SNAPSHOT
---------------
7762


SQL> select dbms_workload_repository.create_snapshot('ALL') from dual;


DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT('ALL')
-----------------------------------------------
7763


SQL> select snap_id, count(*) from dba_hist_sqlstat where snap_id in (7762,7763) group by snap_id;


SNAP_ID COUNT(*)
---------- ----------
7763 103
7762 59


Generate report across two snapshots:
 

select * from table(dbms_workload_repository.awr_report_text(2334238573, -- DBID
1, -- instance number
38392, -- ‘Begin Snapshot’ ID
38393, -- ‘End Snapshot’ ID
0) -- flag = 0 or 8 (display ADDM specific parts));




-- ADDM sections

Buffer Cache Advisory (for ADDM); 
SGA Target Advisory (for ADDM); 
Shared Pool Advisory (for ADDM); 
Streams Pool Advisory (for ADDM); 
PGA Target Advisory (for ADDM)
There is a similar awr_report_html() function.



AWR report (fragments) 1/8:
WORKLOAD REPOSITORY report for


DB Name DB Id Instance Inst Num Startup Time Release RAC
------------ ----------- ------------ -------- --------------- ----------- ---
XXXXX 207063651 ###### 1 09-Oct-11 20:43 11.2.0.2.0 YES
Host Name Platform CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
###### Linux x86 64-bit 8 8 4 31.38


Snap Id Snap Time Sessions Curs/Sess
--------- ------------------- -------- ---------
Begin Snap: 46095 03-Feb-12 08:00:02 248 5.0
End Snap: 46096 03-Feb-12 09:00:12 249 5.0
Elapsed: 60.17 (mins)
DB Time: 183.68 (mins)
11.2 new data/format


AWR reports 2/8:
 
Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ --------------- --------------- ---------- ----------
DB Time(s): 3.1 0.4 0.05 0.02
DB CPU(s): 1.1 0.1 0.02 0.01
Redo size: 167,379.4 20,585.9
Logical reads: 70,689.2 8,694.0
Block changes: 769.8 94.7
Physical reads: 3,480.6 428.1
Physical writes: 126.5 15.6
User calls: 138.9 17.1
Parses: 46.4 5.7
Hard parses: 0.1 0.0
W/A MB processed: 4.9 0.6
Logons: 0.1 0.0
Executes: 63.2 7.8
Rollbacks: 0.3 0.0
Transactions: 8.1


AWR report 3/8:
 
Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                  Avg
                                 wait %   DB
Event Waits Time(s)  (ms)      time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
db file sequential read 618,214 4,667 8 42.3 User I/O
DB CPU 4,085 37.1
direct path read temp 810,891 1,726 2 15.7 User I/O
enq: TX - row lock contention 693 208 300 1.9 Application
log file sync 29,137 104 4 .9 Commit
Host CPU (CPUs: 8 Cores: 8 Sockets: 4)
~~~~~~~~ Load Average
Begin End %User %System %WIO %Idle
--------- --------- --------- --------- --------- ---------
4.45 4.17 27.9 1.1 14.1 70.7
Instance CPU
~~~~~~~~~~~~
% of total CPU for Instance: 15.0
% of busy CPU for Instance: 51.3 




AWR reports 4/8:
Wait Event Histogram DB/Inst: ######/###### Snaps: 46095-46096
<…>
            % of Waits
            -----------------------------------------------
Total
Event Waits <1ms <2ms <4ms <8ms <16ms <32ms <=1s >1s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
db file sequential read 619.1 22.0 1.8 9.4 37.0 25.0 3.5 1.5 .0
<…>
            Waits 64ms
Event to 2s <32ms <64ms <1/8s <1/4s <1/2s <1s <2s >=2s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
db file sequential read 9020 98.5 1.0 .3 .1 .1 .0 .0 .0
<…>
           Waits 4s
Event to 2m <2s <4s <8s <16s <32s < 1m < 2m >=2m
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
db file sequential read 1 100.0 .0 





AWR report 5/8:
SQL ordered by Elapsed Time DB/Inst: ######/##### Snaps: 46095-46096
<…>
Elapsed Elapsed Time
Time (s) Executions per Exec (s) %Total %CPU %IO SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
3,610.8 4 902.69 32.8 3.8 96.6 3c8sz91kt214t
Module: JDBC Thin Client
MERGE INTO …
3,609.9 0 N/A 32.8 52.0 47.8 0q694rv7gzkzs
Module: SQL*Plus
SELECT count(*) FROM …




AWR report 6/8:
SQL ordered by Elapsed Time
SQL ordered by CPU Time
SQL ordered by User I/O Wait Time
SQL ordered by Gets
SQL ordered by Reads
SQL ordered by Physical Reads (UnOptimized)
-> UnOptimized Read Reqs = Physical Read Reqts - Optimized Read Reqs

SQL ordered by Executions
SQL ordered by Parse Calls
SQL ordered by Sharable Memory
SQL ordered by Version Count
SQL ordered by Cluster Wait Time




AWR report 7/8:
IOStat by Function summary DB/Inst: #####/##### Snaps: 46095-46096
Reads: Reqs Data Writes: Reqs Data Waits: Avg
Function Name Data per sec per sec Data per sec per sec Count Tm(ms)
--------------- ------- ------- ------- ------- ------- ------- ------- -------
Direct Reads 45.5G 228.3 12.9145 0M 0.0 0M 0 N/A
Buffer Cache Re 2.4G 171.7 .681144 0M 0.0 0M 619.1K 7.1
DBWR 0M 0.0 0M 1.7G 71.4 .493891 0 N/A
<…>
TOTAL: 48.9G 409.7 13.8572 2.8G 89.3 .800808 716.7K 6.3
IOStat by Filetype summary DB/Inst: #####/##### Snaps: 46095-46096
Reads: Reqs Data Writes: Reqs Data Small Large
Filetype Name Data per sec per sec Data per sec per sec Read Read
--------------- ------- ------- ------- ------- ------- ------- ------- -------
Temp File 39.6G 224.9 11.2232 0M 0.0 0M 1.6 N/A
Data File 8.4G 175.2 2.37223 1.7G 71.4 .493060 7.1 29.1
<…>
TOTAL: 48.9G 409.7 13.8569 2.8G 89.3 .799977 3.9 31.7
IOStat by Function/Filetype summary





AWR report 8/8:
Segments by Physical Reads
Segments by Physical Read Requests
Segments by UnOptimized Reads
Segments by Optimized Reads
Segments by Direct Physical Reads
Segments by Physical Writes
Segments by Physical Write Requests
Segments by Direct Physical Writes
Segments by Table Scans
Segments by DB Blocks Changes
+ enhanced RAC (interconnect, DRM) statistics





AWR baselines.
Baseline is a pair of snapshots…
Oracle10g: fixed baselines only:
DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE(
start_snap_id,
end_snap_id,
baseline_name...
);
Oracle 11g: baseline templates for fixed date or for repeated timeframe.



AWR tables and views.
AWR data is stored in SYS schema in SYSAUX tablespace.
Looks like there are WRH$% and WRM$% tables.
DBA_HIST% views are created above the tables.
10.2.0.4 – 78 views; 11.2.0.2 – 111 views
 




select <...>
from dba_hist_snapshot, dba_hist_sys_time_model
<...>
Date            min_snap_id max_id DB time DB CPU
31.01.2012 121851 121955 3582,46 1899,31
30.01.2012 121747 121850 3942,73 1787,58
29.01.2012 121643 121746 3225,62 1587,83
28.01.2012 121538 121642 2626,14 1410,22
27.01.2012 121435 121537 -705799,94 -284112,92
26.01.2012 121329 121434 3524,76 1934,44
25.01.2012 121226 121328 3935,66 2004,01




select <...>
from dba_hist_snapshot, dba_hist_system_event
<...>
Event_name   waits_1 duration_1 waits_2 duration_2
gc buffer busy       1378135         1,9 1    6189        0,4
gc cr grant 2-way     5830652  0,2         97805       0,2
gc current block 2-way 2783997 0,3     365330     0,3
gc cr multi block request 1339350 0,2    1850738   0,2
gc cr block 2-way 39967                0,3   948740    0,3
gc current grant busy 298901        0,7      20273     0,3
gc current grant 2-way 1027710    0,2    1164        0,3 
 



Example 1. Average and Skew.
So four executions took 278 seconds (69,5*4).
How long did each take?
69,5 in average.
 





Example 2. TOPNSQL.
 
A query is the same (9g7dr4k9qj1r1). Usage pattern is the same.  



“Heavy” executions:
DD EXECS ELA_PER_EXEC GETS_PER_EXEC
-------------------- ---------- ------------ -------------
28.12.2011 14 1053 379809 21161
28.12.2011 13 12336 9043237 21148
28.12.2011 12 942 379609 21296
28.12.2011 11 1097 378855 21060
28.12.2011 10 1505 382856 21168
28.12.2011 09 6102 7229608 21604



“Light” executions:
-------------------- ---------- ------------ -------------
29.12.2011 14 5860 2003 89
29.12.2011 13 6723 2016 91
29.12.2011 10 6435 2008 86
There were no executions in some snaps.
ADD_COLORED_SQL/REMOVE_COLORED_SQL to mark sql_id.
 



A kind of conclusion about AWR.
- It’s very useful tool for analyzing performance in past periods.
- Oracle11g AWR is better (more comprehensive and more convenient) than 10g.
- Selecting from AWR views gives even more powerful and flexible reports.
 


But…
AWR requires addition licensing.
It gives you only aggregated information.
Some data can be ‘omitted’.
 



- Automatic Workload Repository (AWR)
- Basic ideas
- dba_hist% views
- 11.2 enhancements
- AWR use cases & limitations



- Active Session History (ASH)

- Basic ideas
- 11.2 enhancements
- Typical use cases
 


Active Session History (ASH) is a new tool introduced in Oracle10g.
 
Once a second it makes snapshot of active sessions (on CPU or in non-Idle events).
Something close to…


select ... from gv$session
where wait_class<>'Idle';


ASH data is not complete but..
It is not aggregated (like other AWR data).
 



dba_hist_active_sess_history
10.2.0.5 – 52 columns; 11.2.0.2 – 97 columns.
 

Some of the columns:
 
SNAP_ID, DBID, INSTANCE_NUMBER, SAMPLE_ID, SAMPLE_TIME – sample identification;
 

SESSION_ID, SESSION_SERIAL#, USER_ID, PROGRAM, MODULE, MACHINE, PORT – session 
identification;
 

SQL_ID, SQL_PLAN_HASH_VALUE, SQL_PLAN_LINE_ID, SQL_EXEC_ID – SQL information;
 

EVENT, P1, P2, P3, TIME_WAITED – wait event information;
 

BLOCKING_SESSION, BLOCKING_SESSION_SERIAL#, BLOCKING_INST_ID – locking information;
 

TM_DELTA_CPU_TIME, TM_DELTA_DB_TIME, DELTA_READ_IO_REQUESTS. DELTA_WRITE_IO_REQUESTS, DELTA_READ_IO_BYTES, DELTA_WRITE_IO_BYTES, DELTA_INTERCONNECT_IO_BYTES – session activity info. 

 






ASH in memory – fixed memory consumption, variable time range.
SQL> select inst_id, bytes/(1024*1024) MB from gv$sgastat where name='ASH buffers';
 

INST_ID MB
---------- ----------
1 16
3 16
2 16
 


SQL> select inst_id, (max(sample_time)-min(sample_time)) ASH_window
from gv$active_session_history group by inst_id;
 

INST_ID ASH_WINDOW
---------- -------------------------------
1 +000000000 07:18:12.038
2 +000000000 07:01:26.268
3 +000000000 10:15:25.558
 

Samples for every second

ASH on disk – fixed time range controlled by AWR retention.
 
SQL> select instance_number, (max(sample_time)-min(sample_time)) ASH_window
from dba_hist_active_sess_history group by instance_number;


INSTANCE_NUMBER ASH_WINDOW
--------------- ----------------------------------
1 +000000032 04:59:50.479
2 +000000032 04:59:45.743
3 +000000032 04:59:40.818


SQL> select 'ASH in memory' what, count(distinct sample_id) samples_per_hour from v$active_session_history where sample_time>=trunc(sysdate) and sample_time<trunc(sysdate)+1/24 union all select 'ASH on disk', count(distinct sample_id) sample_id from dba_hist_active_sess_history where instance_number=1 and sample_time>=trunc(sysdate) and sample_time<trunc(sysdate)+1/24;



WHAT SAMPLES_PER_HOUR
------------- ----------------
ASH in memory 3592
ASH on disk 359
1/10 of samples from ‘ASH in memory’.


Each sample in ASH data represents a second of database activity.
Theoretically:
- Count of rows for one sample is DB Time.
- Count of rows with session_state=‘ON CPU’ is DB CPU.
In my experience the accuracy of the calculation hardly ever better than 15-20% on production systems with significant load.



Example 1. AWR report:
Snap Id Snap Time Sessions Curs/Sess
--------- ------------------- -------- ---------
Begin Snap: 118264 27-Dec-11 13:30:23 1,410 49.8
Elapsed: 15.06 (mins)
DB Time: 2,438.18 (mins)
<…>
 

Event Waits Time (s) (ms) Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
latch: cache buffers chains 409,488 51,930 127 35.5 Concurrenc
CPU time 6,341 4.3
db file sequential read 164,900 1,113 7 0.8 User I/O
<…>
 

Elapsed CPU Elap per % Total
Time (s) Time (s) Executions Exec (s) DB Time SQL Id
---------- ---------- ------------ ---------- ------- -------------
139,282 4,307 11,574 12.0 95.2 9g7dr4k9qj1r1
140 000 (ela for the query) / 900 (15 minutes) =
= 155 concurrently executed queries.



Example 1. ASH content (1/3):
 
SQL> select to_char(trunc(sample_time,'mi'),'hh24:mi') dd, count(*) c, count(distinct h.session_id||':'||h.session_serial#) s from dba_hist_active_sess_history h where h.instance_number=1 and h.snap_id between 118265 and 118265 and h.sample_time>=to_date('27.12.2011 13:30','dd.mm.yyyy hh24:mi') and h.sample_time <to_date('27.12.2011 13:45','dd.mm.yyyy hh24:mi') group by trunc(sample_time,'mi') order by trunc(sample_time,'mi');



DD C S
--------------- ---------- ----------
13:30 33 24
13:31 3134 1044
13:32 4846 914
13:33 55 48
13:34 37 24
13:35 28 17
13:36 25 17
13:37 239 230
13:38 5242 1016
13:39 1188 873
13:40 28 26
13:41 19 19
<…>



Example 1. ASH content (2/3):
SQL> select sample_time, count(*) c, sum(case when sql_id='9g7dr4k9qj1r1' then 1 else 0 end) s from dba_hist_active_sess_history h where h.instance_number=1 and h.snap_id between 118265 and 118265 and h.sample_time>=to_date('27.12.2011 13:30','dd.mm.yyyy hh24:mi') and h.sample_time <to_date('27.12.2011 13:45','dd.mm.yyyy hh24:mi') group by sample_time order by sample_time;



Example 1. ASH content (3/3):
 
SAMPLE_TIME C S
------------------------------ ---------- ----------
27-DEC-11 01.30.05.551 PM 6 0
27-DEC-11 01.30.15.591 PM 5 0
<…>
27-DEC-11 01.31.15.850 PM 4 0
27-DEC-11 01.31.25.959 PM 410 395
27-DEC-11 01.31.36.069 PM 980 963
27-DEC-11 01.31.46.652 PM 887 857
27-DEC-11 01.31.56.951 PM 848 822
27-DEC-11 01.32.06.994 PM 862 844
27-DEC-11 01.32.17.056 PM 852 831
27-DEC-11 01.32.27.118 PM 869 842
27-DEC-11 01.32.37.165 PM 855 840
27-DEC-11 01.32.47.225 PM 800 776
27-DEC-11 01.32.57.516 PM 608 570
27-DEC-11 01.33.07.566 PM 25 10
27-DEC-11 01.33.17.613 PM 8 1
27-DEC-11 01.33.27.658 PM 5 1
27-DEC-11 01.33.37.690 PM 4 0
<…> 




Example 2. sql_exec_id (Oracle 11g) (1/2).
 
SQL_EXEC_ID is counter of SQL executions on a given instance.
(See Tanel Poder blog entry.)
 

SQL> declare j number;
begin
dbms_job.submit(job=>j,
what=>'declare p number; begin for i in 1..3 loop select count(*) into p from positions e_test where mrp=1; end loop; end;',
instance=>1);
end;
/


PL/SQL procedure successfully completed.


SQL> declare j number;
begin
dbms_job.submit(job=>j,
what=>'declare p number; begin for i in 1..2 loop select count(*) into p from positions e_test where mrp=1; end loop; end;',
instance=>2);
end;
/
 



Example 2. sql_exec_id (Oracle 11g) (2/2).

SQL> select sql_id, sql_text from v$sql where sql_id='2dxypmkd8p3qk';
-------------
2dxypmkd8p3qk

SELECT COUNT(*) FROM POSITIONS E_TEST WHERE MRP=1

SQL> select inst_id, sql_exec_id, to_char(sql_exec_id,'XXXXXXXX') sql_exec_id_h, count(*) s
from gv$active_session_history
where sql_id='2dxypmkd8p3qk'
group by inst_id, sql_exec_id
order by 1,2;


INST_ID SQL_EXEC_ID SQL_EXEC_ S
---------- ----------- --------- ----------
1 16777216 1000000 4
1 16777217 1000001 4
1 16777218 1000002 5
2 33554432 2000000 5
2 33554433 2000001 4
 


ASH is great tool for:
- Investigate load spikes within an AWR snapshot.
- Analyze ‘long’ queries executions (especially in 11g).
- Investigate locking/blocking issues (if locking time not too short).


Limitations:
- The information is not complete. ASH on disk is more incomplete.
- DB activity spikes makes ‘retention window’ of ASH in memory shorter (sometimes dramatically).
- Tends to sample ‘long’ events.
- Licensing as part of AWR.
 




References 
Oracle® Database Documentation, 11g Release 2 (11.2)
Oracle® Database Performance Tuning Guide
5 Automatic Performance Statistics
http://docs.oracle.com/cd/E11882_01/server.112/e16638/autostat.htm


Oracle® Database PL/SQL Packages and Types Reference
161 DBMS_WORKLOAD_REPOSITORY
http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/d_workload_repos.htm


Mike Ault. AWR Report Detailed Analysis.
http://www.nyoug.org/Presentations/2008/Sep/Ault_AWR.pdf


John Beresniewicz. Practical Active Session History
Tanel Poder. What the heck is the SQL Execution ID – SQL_EXEC_ID?
http://blog.tanelpoder.com/2011/10/24/what-the-heck-is-the-sql-execution-id-sql_exec_id/