db file sequential read - single block read
db file scattered read - multi block read
db file parallel read - non-contiguous multi block read
read by other session - wait for another session to do the I/O
select parameter1, parameter2, parameter3 from v$event_name where ...
NAME P1 P2 P3
----------------------- ----------- --------- --------
db file sequential read file# block# blocks
db file scattered read file# block# blocks
db file parallel read files blocks requests
read by other session file# block# class#
db file sequential read
Example
- Top reported wait
- Single block read
- block accessed via
- index
- rowid
- rollback
select * from emp where empno=99;
where there is an index on emp(empno)
algorithm
search buffer cache for block by rowid,
if fail, read block off disk via file# and block#
Note: "sequential" means a sequence as in rowid
db file scattered read
Example
- Multi Block Read
- Full Table Scan
- Index Fast Full Scan
algorithm
select * from emp;
search buffer cache for block by rowid,
if fail, read block off disk via file# and block# and # of blocks
Note: "scattered" means blocks are scattered throughout the buffer cache (even though they are laid out sequentially on disk)
db_file_multiblock_read_count sets the target multiblock read size
db file parallel read
Process issues multiple single block reads in parallel
Examples
- Documentation says only for recovery
- seems to happen for normal read ops
- Async call - wait for all reads to complete
Example
- Non-contiguous multi block read
- Index full or range scan
- WHERE ... IN (value list)
Select * from emp where empno in (1,2,3,4,5,6);
read by other session
The block a session wants is currently being read from disk by another session, so this session waits for that I/O to complete rather than issuing a duplicate read.
direct path reads and writes
This mechanism lets the client bypass the buffer cache for I/O-intensive operations. The disk blocks are written into and read from process-private memory.
select parameter1, parameter2, parameter3 from v$event_name
NAME P1 P2 P3
----------------------- ----------- --------- --------
direct path read file number first dba block cnt
direct path read temp file number first dba block cnt
direct path write file number first dba block cnt
direct path write temp file number first dba block cnt
Direct I/O read and write size are controlled by the parameters
PARAMETER INSTANCE_VALUE
_smm_auto_min_io_size 56
_smm_auto_max_io_size 248
direct path read
parallel query
(or using _serial_direct_read = true)
further reading:
http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-reads-what-is.html
http://antognini.ch/2009/10/hints-for-direct-path-insert-statements/
http://dioncho.wordpress.com/2009/07/21/disabling-direct-path-read-for-the-serial-full-table-scan-11g/
Grouping by Query Coordinator:
select ash.sql_id,
       qc_session_id qsid,
       count(*) cnt,
       count(distinct session_id) deg,
       nvl(o.object_name, to_char(current_obj#)) obj,
       o.object_type otype,
       decode(session_state, 'WAITING', event, 'CPU') event
from v$active_session_history ash,
     all_objects o
where o.object_id (+) = ash.current_obj#
  and qc_session_id is not null
group by qc_session_id, sql_id, o.object_name,
         o.object_type, current_obj#, event, session_state
order by qc_session_id, sql_id;
SQL_ID QSID CNT DEG OBJ OTYPE EVENT
------------- ---- ---- --- ------------ ----- --------------------------
7p3jt75phub2d 144 386 4 WAIT_OBJECTS TABLE PX Deq Credit: send blkd
144 4 3 WAIT_OBJECTS TABLE PX qref latch
144 37 1 WAIT_OBJECTS TABLE db file sequential read
144 3 2 WAIT_OBJECTS TABLE direct path read
144 70 1 WAIT_OBJECTS TABLE CPU
144 21 4 0 PX Deq Credit: send blkd
144 12 4 0 db file sequential read
direct path write
sqlldr
loading LOBs
parallel DMLs
create table as select
create index
Seems to be a bug on 10g where direct path write waits are incorrectly recorded as CPU
(from metalink)
"If asynchronous IO is not being used then the IO write request blocks until completed but this does not show as a wait at the time the IO is issued. The session returns later to pick up the completed IO data but can then show a wait on "direct path write" even though this wait will return immediately."
11g: There is a hidden parameter _direct_path_insert_features which seems to control direct path behaviour. alter session set "_direct_path_insert_features" = 1 turns off direct path for insert .. values().
further reading:
http://afatkulin.blogspot.com/2009/01/insert-append-into-values.html
direct path read temp
This event happens when sorting overflows the memory buffers and has to be written to disk.
If it's a problem, consider increasing the parameter
pga_aggregate_target
The explain plan gives estimated sort size requirements, and extended row source statistics give the actual sort size used by the SQL statement.
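One way to see the actual work area usage per plan step is the gather_plan_statistics hint plus DBMS_XPLAN (a sketch, 10g and up; the emp table is just the example table used earlier):

```sql
select /*+ gather_plan_statistics */ * from emp order by ename;

-- ALLSTATS LAST shows actual rows plus OMem/1Mem/Used-Mem
-- and Used-Tmp columns for sort and hash steps
select *
from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```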
select s.sid, -- s.osuser, s.process,
       s.sql_id, tmp.segtype,
       (tmp.blocks*8)/1024 mb, tmp.tablespace
from v$tempseg_usage tmp,
     v$session s
where tmp.session_num = s.serial#
  and segtype in ('HASH','SORT')
order by blocks desc;
SID SQL_ID SEGTYPE MB TABLESPACE
--- ------------- ------- -- ----------
90 gm5s0vc0vzppc SORT 2 TEMP03
105 gm5s0vc0vzppc SORT 2 TEMP03
80 gm5s0vc0vzppc SORT 2 TEMP03
105 gm5s0vc0vzppc HASH 2 TEMP03
102 gm5s0vc0vzppc SORT 2 TEMP03
90 gm5s0vc0vzppc HASH 2 TEMP03
The following information is from Jonathan Lewis, Randolf Geist and Joze Senegacnik:
On Oracle 10.2.0.3 through 11.1.0.6, this doesn't work:
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 104857600;
(broke in 10.2.0.3, fixed in 11.1.0.7)
Workaround for serial connections, just run the command twice:
alter session set sort_area_size = 104857600;
alter session set sort_area_size = 104857600;
But this doesn't work for parallel sessions, which don't pick up the manual size at all and instead use the system default size (64K). You can change the system default, but that is a one-size-fits-all workaround. This does not seem to affect HASH_AREA_SIZE. The hash_area_size parameter is an integer in bytes, so it is probably limited to approximately 2GB.
On 9i the upper limit for sort area size using PGA_AGGREGATE_TARGET for a single session was fixed at 100M unless using undocumented parameters:
_pga_max_size: maximum PGA size for a single process. Defaults to 200M.
_smm_max_size: maximum workarea size for one process. Defaults to min(0.05 * pga_aggregate_target, 0.5 * _pga_max_size, 100M). From testing, I believe this parameter is in KILOBYTES; see also the comments on http://dba-blog.blogspot.com/2005/08/pgamaxsize-hidden-parameter.html
In 10gR2 the 100M limit no longer applies when PGA_AGGREGATE_TARGET > 1000M. In 10.2 it is limited to 200M as long as pga_aggregate_target is smaller than 1GB.
When pga_aggregate_target is larger than 1GB, _pga_max_size = 20% of pga_aggregate_target. The pga_aggregate_target now drives _smm_max_size in most cases:
pga_aggregate_target <= 500MB: _smm_max_size = 20% * pga_aggregate_target
pga_aggregate_target between 500MB and 1000MB: _smm_max_size = 100MB
pga_aggregate_target > 1000MB: _smm_max_size = 10% * pga_aggregate_target
and _smm_max_size in turn now drives _pga_max_size: _pga_max_size = 2 * _smm_max_size
A pga_aggregate_target larger than 1000MB therefore allows much higher default thresholds in 10gR2: pga_aggregate_target set to 5GB allows an _smm_max_size of 500MB (was 100MB before) and a _pga_max_size of 1000MB (was 200MB).
set linesize 100
col block_type for a18
col objn for a25
col otype for a15
col event for a25
col blockn for 999999
col p1 for 9999
col aas for 999.99
col f_minutes new_value v_minutes
select &minutes f_minutes from dual;

select io.cnt,
       round(io.cnt/(&v_minutes*60),2) aas,
       io.event,
       io.p1 p1,
       f.tablespace_name
from (select count(*) cnt,
             substr(event,0,25) event,
             ash.p1 p1
      from v$active_session_history ash
      where ( event like 'db file s%' or event like 'direct%' )
        and sample_time > sysdate - &v_minutes/(60*24)
      group by event, ash.p1
     ) io,
     dba_data_files f
where f.file_id = io.p1
order by io.cnt;
CNT AAS EVENT P1 TABLESPACE
---- --- ----------------------- --- ----------
1 .00 db file sequential read 1 SYSTEM
2 .00 db file sequential read 3 SYSAUX
38 .06 db file sequential read 6 SOE
179 .30 db file sequential read 7 SOEINDEX
select count(*) cnt,
       current_obj#||' '||o.object_name objn,
       o.object_type otype
from v$active_session_history ash,
     all_objects o
where ( event like 'db file s%' or event like 'direct%' )
  and o.object_id (+) = ash.current_obj#
  and sample_time > sysdate - &minutes/(60*24)
  and session_state = 'WAITING'
group by current_obj#, o.object_name, o.object_type
order by count(*);
CNT AAS OBJN OTYPE
---- ------- ------------------------- ---------------
79 .00 52949 ORDER_ITEMS TABLE PARTITION
97 .00 -1
130 .00 53117 ORD_STATUS_IX INDEX
498 .01 53120 CUST_EMAIL_IX INDEX
512 .01 0
1632 .03 53112 ITEM_ORDER_IX INDEX
col block_type for a18
col obj for a20
col otype for a15
col event for a15
col blockn for 999999
col f_minutes new_value v_minutes
col p1 for 9999
col tablespace_name for a15
select &minutes f_minutes from dual;

select io.cnt cnt,
       io.aas aas,
       io.event event,
       substr(io.obj,1,20) obj,
       io.p1 p1,
       f.tablespace_name tablespace_name
from (select count(*) cnt,
             round(count(*)/(&v_minutes*60),2) aas,
             substr(event,0,15) event,
             nvl(o.object_name, decode(current_obj#,-1,0,current_obj#)) obj,
             ash.p1,
             o.object_type otype
      from v$active_session_history ash,
           all_objects o
      where ( event like 'db file s%' or event like 'direct%' )
        and o.object_id (+) = ash.current_obj#
        and sample_time > sysdate - &v_minutes/(60*24)
      group by substr(event,0,15), current_obj#,
               o.object_name, o.object_type, ash.p1
     ) io,
     dba_data_files f
where f.file_id = io.p1
order by io.cnt;
CNT AAS EVENT OBJ P1 TABLESPACE_NAME
--- --- --------------- ----------------- --- ---------------
1 .02 db file sequent ORDER_ITEMS_PK 7 SOEINDEX
1 .02 db file sequent 0 3 SYSAUX
2 .03 db file scatter WAIT_OBJECTS 1 SYSTEM
2 .03 db file sequent ORDER_ITEMS 6 SOE
3 .05 db file sequent CUST_EMAIL_IX 7 SOEINDEX
4 .07 db file sequent CUST_LNAME_IX 7 SOEINDEX
6 .10 db file sequent ORD_ORDER_DATE_IX 7 SOEINDEX
13 .22 db file sequent ITEM_ORDER_IX 7 SOEINDEX
25 .42 db file sequent 0 2 UNDOTBS1
TCNT SQL_ID CNT AAS OBJN OBJ P1 TABLESPAC
---- ------------- --- --- ----- --------------- -- ----------
30 0yas01u2p9ch4 1 .01 53113 ITEM_PRODUCT_IX 7 SOEINDEX
1 .01 53079 ORDER_ITEMS_UK 7 SOEINDEX
28 .16 53112 ITEM_ORDER_IX 7 SOEINDEX
58 6v6gm0fd1rgrz 4 .02 54533 WAIT_OBJECTS 1 SYSTEM
54 .30 0 0 2 UNDOTBS1
AAS SQL_ID PCT OBJ SUB_OBJ OTYPE EVENT F# TABLESPAC CONTENTS
---- ----------------- ----------- ------- ---------- ---------- -- --------- ---------
.00 f9u2k84v884y7 33 CUSTOMERS SYS_P27 TABLE PART sequentia 1 SYSTEM PERMANENT
33 ORDER_PK INDEX sequentia 4 USERS PERMANENT
33 sequentia 2 UNDOTBS1 UNDO
.01 0tvtamt770hqz 100 TOTO1 TABLE scattered 7 NO_ASSM PERMANENT
.06 75621g9y3xmvd 3 CUSTOMERS SYS_P36 TABLE PART sequentia 4 USERS PERMANENT
3 CUSTOMERS SYS_P25 TABLE PART sequentia 4 USERS PERMANENT
3 CUSTOMERS SYS_P22 TABLE PART sequentia 4 USERS PERMANENT
3 CUSTOMERS SYS_P29 TABLE PART sequentia 4 USERS PERMANENT
3 CUSTOMERS SYS_P21 TABLE PART sequentia 4 USERS PERMANENT
3 CUSTOMERS SYS_P35 TABLE PART sequentia 4 USERS PERMANENT
3 CUSTOMERS SYS_P32 TABLE PART sequentia 4 USERS PERMANENT
3 CUSTOMERS SYS_P27 TABLE PART sequentia 4 USERS PERMANENT
3 CUSTOMERS SYS_P34 TABLE PART sequentia 4 USERS PERMANENT
4 CUSTOMERS SYS_P23 TABLE PART sequentia 4 USERS PERMANENT
5 CUSTOMERS SYS_P33 TABLE PART sequentia 4 USERS PERMANENT
50 CUSTOMERS_PK INDEX sequentia 4 USERS
select owner, segment_name, segment_type
  from dba_extents
 where &file = file_id    -- File  = P1
   and &block between     -- Block = P2
       block_id and block_id + blocks - 1;
Create table my_extents as select * from DBA_EXTENTS;
Create index my_extents_i on my_extents(FILE_ID,BLOCK_ID);
Enter value for file: 3
Enter value for block: 6619
OWNER SEGMENT_NAME SEGMENT_TYPE
--------------- -------------------- ------------------
WMSYS LOG_TAB_PK INDEX
Elapsed: 00:00:41.25
create table myextents as select * from dba_extents
Table created.
Elapsed: 00:01:25.73
col own for a7
col segment_name for a40
col segment_type for a15
col cnt for 9999
select count(*) cnt,
       owner own,
       segment_name,
       segment_type
from v$active_session_history ash,
     myextents ext
where ( event like 'db file s%' or event like 'direct%' )
  and ( current_obj# in (0,-1) or current_obj# is null )
  and sample_time > sysdate - &v_minutes/(60*24)
  and session_state = 'WAITING'
  and p1 = file_id
  and p2 between block_id and block_id + blocks - 1
group by owner, segment_name, segment_type
/
CNT OWN SEGMENT_NAME SEGMENT_TYPE
--- ---- ---------------- ------------
11 SYS SMON_SCN_TO_TIME CLUSTER
993 SYS _SYSSMU7$ TYPE2 UNDO
...
150 rows selected.
Elapsed: 00:00:01.03
Check average read times per file
- Should be between 5-20 ms
- ADDM expects I/Os to be under 20ms
- The ADDM parameter DBIO_EXPECTED defaults to 10000 microseconds (10ms)
- Change the behavior with DBMS_ADVISOR.SET_DEFAULT_TASK_PARAMETER('ADDM', 'DBIO_EXPECTED', <new value>)
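For example, on storage where 20ms reads are normal, the ADDM expectation can be raised accordingly (a sketch; 20000 microseconds is just an illustrative value):

```sql
-- tell ADDM to treat 20ms (20000 microseconds) as the
-- expected average single-block read time
exec DBMS_ADVISOR.SET_DEFAULT_TASK_PARAMETER('ADDM', 'DBIO_EXPECTED', 20000);
```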
Check time in
- Statspack under "File IO Stats" (8i and up)
- DBA_HIST_FILESTATXS for historical data (10g)
- V$FILEMETRIC_HISTORY for recent data (10g)
Check buffer cache hit ratio
- Check db_cache_advice (9i and higher)
- Data in Statspack under "Buffer Pool Advisory"
- Want to optimize data caching
Tune high-IO SQL
Check reads off of UNDO
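Average read times per file can also be computed online from v$filestat joined to v$datafile (a sketch; readtim is in centiseconds, so the ratio is multiplied by 10 to get milliseconds):

```sql
select d.name,
       f.phyrds,
       round(f.readtim * 10 / nullif(f.phyrds, 0), 2) avg_read_ms
from v$filestat f,
     v$datafile d
where f.file# = d.file#
order by avg_read_ms;
```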
File IO Stats DB/Inst:labsf03 Snaps: 1-2
Av Mx Av
Av Rd Rd Av Av Buffer BufWt
Tablespace Filename Reads Reads/s (ms) Bkt Blks/Rd Writes Writes/s Waits (ms)
---------- ----------------- ----- ------- ---- ---- ------- ------- -------- ------- ------
SYSTEM /u01/system01.dbf 445 15 0.4 16 1.0 1,157 39 2,744 93.3
USERS /u01/users01.dbf 223 7 0.5 ### 1.0 9,725 324 4 100.0
Buffer Cache Hit Ratio (BCHR) has a sordid history and it's tempting to just not mention it. For years the BCHR has been used to identify IO performance bottlenecks despite the fact that it can be completely unreliable. Connor McDonald even wrote a script (http://www.oracledba.co.uk/tips/choose.htm) that allows you to set the BCHR as high as you want without changing the size of the cache. The point being that IO bottlenecks should be identified with IO wait events and not BCHR, but if an IO bottleneck does happen, it is worth knowing if the buffer cache is supporting the IO load sufficiently. The question of whether the Buffer Cache is sufficiently sized can be addressed with Oracle's "DB Cache Advice".
Buffer Pool Advisory
Size for Size Buffers for Read Estimated
P Est (M) Factor Estimate Factor Physical Reads
--- -------- ------ ------------ ------ --------------
D 56 .1 6,986 2.3 58,928
D 112 .2 13,972 1.6 42,043
D 224 .4 27,944 1.0 25,772
D 336 .6 41,916 1.0 25,715
D 448 .8 55,888 1.0 25,715
D 596 1.0 74,351 1.0 25,715
D 728 1.2 90,818 1.0 25,715
D 840 1.4 104,790 1.0 25,715
D 952 1.6 118,762 1.0 25,715
D 1,064 1.8 132,734 1.0 25,715
The Buffer Pool Advisory gives expected physical reads at different cache sizes. The column "Factor" shows the size relative to the current cache size; a "Factor" of 1.0 is the current size. In the above case we could drop the cache size to 0.4 of the current size without significantly increasing physical IO, while increasing it to 1.8 times the current size wouldn't help at all. Here the buffer cache size is probably fine, and resizing it won't address IO performance issues on this particular database.
If the IO response time of the disks is good and if the buffer cache is correctly sized, then at this point it's clear it's time to tune the SQL with the most IO wait time:
select count(*),sql_id, event
from v$active_session_history
where event in ('db file sequential read',
'db file scattered read',
'db file parallel read')
and session_state='WAITING'
group by sql_id, event
order by count(*);
COUNT(*) SQL_ID EVENT
--------- ------------- ------------------------
10 8hk7xvhua40va db file sequential read
335 3hatpjzrqvfn7 db file sequential read
343 0uuqgjq7k12nf db file sequential read
549 75621g9y3xmvd db file sequential read
1311 0bzhqhhj9mpaa db file sequential read
1523 41zu158rqf4kf db file sequential read
1757 0yas01u2p9ch4 db file sequential read
The direct path read wait event has three parameters: file#, first block#, and block count. In Oracle Database 10g, this wait event falls under the User I/O wait class. Keep the following key thoughts in mind when dealing with the direct path read wait event.
These are waits that are associated with direct read operations. An Oracle direct read operation reads data directly into the session’s PGA (Program Global Area), bypassing the SGA. The data in the PGA is not shared with other sessions.
Direct reads may be performed in synchronous or asynchronous mode, depending on the platform and the value of the DISK_ASYNCH_IO parameter. The systemwide direct path read wait event statistics can be very misleading when asynchronous I/O is used.
A significant number of direct path read waits is most likely an application issue.
The direct path read waits are driven by SQL statements that perform direct read operations from the temporary or regular tablespaces.
SQL statements with functions that require sorts, such as ORDER BY, GROUP BY, UNION, DISTINCT, and ROLLUP, write sort runs to the temporary tablespace when the input size is larger than the work area in the PGA. The sort runs in the temporary tablespace are subsequently read and merged to provide the final result. An Oracle session waits on the direct path read wait event while the sort runs are being read. SQL statements that employ the merge join, either through a hint or as decided by the optimizer, also require sort.
SQL statements that employ the hash join, either as directed by a hint or the optimizer, flush the hash partitions that don’t fit in memory to the temporary tablespace. The hash partitions in the temporary tablespace are read back into the memory to be probed for rows that match the SQL predicates. An Oracle session waits on the direct path read wait event while the hash partitions are being read.
SQL statements that employ parallel scans also contribute to the systemwide direct path read waits. In a parallel execution, the direct path read waits are associated with the query slaves and not the parent query. The session that runs the parent query waits mostly on the PX Deq: Execute Reply wait event (in Oracle8i Database and above).
It is highly unlikely that the direct path read wait event will show up as the leading bottleneck within a session, even though it may actually be the real bottleneck.
Therefore you shouldn’t evaluate direct path read waits based on the TOTAL_WAITS or TIME_WAITED in the V$SESSION_EVENT view. Instead, you can find current sessions that perform a lot of direct read operations from the V$SESSTAT view using the following query. The physical reads direct is comprised of the direct reads that are originated by the parent session itself as well as the sum of all direct reads that are originated by the query slaves that the parent session employs. The direct reads that are initiated by query slaves are only reflected in the parent session when the slaves complete their jobs. The downside to this approach is that there is no time element.
select a.name, b.sid, b.value,
round((sysdate - c.logon_time) * 24) hours_connected
from v$statname a, v$sesstat b, v$session c
where b.sid = c.sid
and a.statistic# = b.statistic#
and b.value > 0
and a.name = 'physical reads direct'
order by b.value;
NAME SID VALUE HOURS_CONNECTED
------------------------- ---- ---------- ---------------
physical reads direct 2 41 980
physical reads direct 4 41 980
physical reads direct 5 445186 980
Apart from finding the sessions that perform a lot of direct reads, you must also find out where the sessions are reading from (temporary tablespace or data files), the SQL statements that initiate the waits, and the type of segments being read. The following query gives you the answers. Sessions that read from the temporary tablespace may be reading sort or hash segments. Sessions that read from data files normally belong to parallel query slaves.
select a.event,
a.sid,
c.sql_hash_value hash_value,
decode(d.ktssosegt,
1,'SORT', 2,'HASH', 3,'DATA',
4,'INDEX',5,'LOB_DATA',6,'LOB_INDEX',
null) as segment_type,
b.tablespace_name,
b.file_name
from v$session_wait a, dba_data_files b, v$session c, x$ktsso d
where c.saddr = d.ktssoses(+)
and c.serial# = d.ktssosno(+)
and d.inst_id(+) = userenv('instance')
and a.sid = c.sid
and a.p1 = b.file_id
and a.event = 'direct path read'
union all
select a.event,
a.sid,
d.sql_hash_value hash_value,
decode(e.ktssosegt,
1,'SORT', 2,'HASH', 3,'DATA',
4,'INDEX',5,'LOB_DATA',6,'LOB_INDEX',
null) as segment_type,
b.tablespace_name,
b.file_name
from v$session_wait a, dba_temp_files b, v$parameter c,
v$session d, x$ktsso e
where d.saddr = e.ktssoses(+)
and d.serial# = e.ktssosno(+)
and e.inst_id(+) = userenv('instance')
and a.sid = d.sid
and b.file_id = a.p1 - c.value
and c.name = 'db_files'
and a.event = 'direct path read'
order by 1,2;
-- File name output is edited to fit page.
EVENT SID HASH_VALUE SEGMENT TABLESPACE_N FILE_NAME
------------------ --- ---------- ------- ------------ -----------------
direct path read 8 511952958 SORT TEMP_BATCH temp_batch_01.dbf
direct path read 9 3138787393 ORDERS orders_01.dbf
direct path read 11 3138787393 ORDERS orders_01.dbf
direct path read 12 3138787393 ORDERS orders_01.dbf
direct path read 14 3138787393 ORDERS orders_01.dbf
If you catch a session reading sort segments from the temporary tablespace, this indicates the SORT_AREA_SIZE (or work area size if you use the PGA_AGGREGATE_TARGET in Oracle9i Database) is not large enough to accommodate a cache sort (or in memory sort). This is fine. It is unrealistic to expect all SQL statements to perform cache sorts. However, you should avoid multipass sorts because they create a lot of I/Os to the temporary tablespace and are very slow. How can you tell if a SQL statement is doing a multipass sort? Well, it is not that easy in versions prior to Oracle9i Database. You have to trace the session with the event 10032 and examine the trace file. However, beginning in Oracle9i Database, you can simply query the V$SQL_WORKAREA or V$SQL_WORKAREA_ACTIVE views with the SQL hash value that performs the sort.
The goal of tuning in this case is to minimize the number of sorts as a whole and, more specifically, disk sorts. Increasing the SORT_AREA_SIZE (or PGA_AGGREGATE_TARGET) may help reduce the number of disk sorts, but that's usually a workaround rather than the cure, unless your SORT_AREA_SIZE is unreasonably small to begin with. You should examine the application and the SQL statements to see if sorts are really necessary. Applications have a tendency to abuse the DISTINCT and UNION functions. Whenever possible use UNION ALL instead of UNION, and where applicable use HASH JOIN instead of SORT MERGE and NESTED LOOPS instead of HASH JOIN. Make sure the optimizer selects the right driving table. Check to see if the composite index's columns can be rearranged to match the ORDER BY clause to avoid the sort entirely. Also, consider automating the SQL work areas using PGA_AGGREGATE_TARGET in Oracle9i Database. Statistically, automatic memory management delivers a higher percentage of cache sorts.
Note: Be careful when switching from UNION to UNION ALL as this can produce different results depending on the data. The UNION operator returns only distinct rows that appear in either result, while the UNION ALL operator returns all rows.
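A minimal illustration of the difference, using dual so it can be run anywhere:

```sql
-- UNION removes duplicates: returns a single row
select 1 as n from dual
union
select 1 as n from dual;

-- UNION ALL keeps duplicates: returns two rows
select 1 as n from dual
union all
select 1 as n from dual;
```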
Note: By default, the HASH_AREA_SIZE is twice the SORT_AREA_SIZE. A larger HASH_AREA_SIZE will influence the optimizer toward hash joins (full table scan) rather than nested loops operations.
Similarly, if you catch a session reading hash segments from the temporary tablespace, all that tells you is that the HASH_AREA_SIZE (or work area size in case of the PGA_AGGREGATE_TARGET in Oracle9i Database) is not big enough to accommodate the hash table in memory. The solution is similar to the one just mentioned: cure it from the application and SQL tuning before adjusting the HASH_AREA_SIZE (or PGA_AGGREGATE_TARGET). Unless, of course, your HASH_AREA_SIZE is too small to begin with.
If you discover that the direct reads belong to parallel query slaves, you should verify whether parallel scans are appropriate for the parent SQL statement and that the degree of parallelism is right. Make sure the query slaves do not saturate your CPUs or disks. Identifying the parent SQL statement can be a bit tricky because the hash value of the parent statement is not the same as the hash value of the child statements that the query slaves execute. This query can help you identify the parent SQL statements when parallel queries are involved:
select decode(a.qcserial#, null, 'PARENT', 'CHILD') stmt_level,
a.sid,
a.serial#,
b.username,
b.osuser,
b.sql_hash_value,
b.sql_address,
a.degree,
a.req_degree
from v$px_session a, v$session b
where a.sid = b.sid
order by a.qcsid, stmt_level desc;
STMT_L SID SERIAL# USERNAME OSUSER HASH_VALUE SQL_ADDR DEG REQ_DEG
------ --- ------- -------- ------- ---------- -------- --- -------
PARENT 20 779 INTREPID cdh8455 3663187692 A0938E54
CHILD 17 3617 INTREPID cdh8455 817802256 A092E1CC 4 4
CHILD 21 370 INTREPID cdh8455 817802256 A092E1CC 4 4
CHILD 12 841 INTREPID cdh8455 817802256 A092E1CC 4 4
CHILD 14 2241 INTREPID cdh8455 817802256 A092E1CC 4 4
The DB_FILE_DIRECT_IO_COUNT initialization parameter can impact the direct path read performance. It sets the maximum I/O buffer size for direct reads and writes operations. Up to Oracle8i Database, the default value on most platforms is 64 blocks. So if the DB_BLOCK_SIZE is 8K, the maximum I/O buffer size for direct reads and writes operations is 512K. This number is further subject to hardware limits.
The DB_FILE_DIRECT_IO_COUNT parameter is hidden in Oracle9i Database, and the value is expressed in bytes instead of blocks. The default value in Oracle9i Database is 1MB. The actual direct I/O size depends on your hardware configuration and limits.
You can discover the actual direct read I/O size in three ways:
Trace the Oracle session that performs direct read operations using trace event 10046 at level 8. The P3 parameter indicates the number of blocks read. Based on the following example, the direct path read I/O size is 64K since the block size is 8K. Alternatively, you can query the V$SESSION_WAIT view for the P3 value of the direct path read event.
WAIT #1: nam='direct path read' ela= 4 p1=4 p2=86919 p3=8
WAIT #1: nam='direct path read' ela= 5 p1=4 p2=86927 p3=8
WAIT #1: nam='direct path read' ela= 10 p1=4 p2=86935 p3=8
WAIT #1: nam='direct path read' ela= 39 p1=4 p2=86943 p3=8
WAIT #1: nam='direct path read' ela= 5 p1=4 p2=86951 p3=8
WAIT #1: nam='direct path read' ela= 38 p1=4 p2=86959 p3=8
. . .
Trace the Unix session that performs direct read or write operations using an operating system trace facility such as truss, tusc, trace, or strace. This snippet of a truss report from an Oracle9i Database reveals the direct I/O size is 65536 bytes or 64K:
9218/6: kaio(AIONOTIFY, -14712832) = 0
9218/1: kaio(AIOWAIT, 0xFFBECE98) = 1
9218/1: lwp_cond_signal(0xFEB7BFA0) = 0
9218/3: pread64(256, "0602\0\001\0 ~13C19AEE }".., 65536, 0x0FC26000) = 65536
9218/1: lwp_cond_signal(0xFEB69FA0) = 0
9218/4: pread64(256, "0602\0\001\0 ~1BC19AEFE7".., 65536, 0x0FC36000) = 65536
. . .
Enable the debug information for the session that performs direct I/O operations using the trace event 10357 at level 1. Example: alter session set events '10357 trace name context forever, level 1'. The snippet of the trace file is provided here:
Unix process pid: 4375, image: oracle@kccdeds73 (P000)
*** SESSION ID:(9.18) 2004-02-08 21:47:01.908
DBA Range Initialized: length is 1570, start dba is 0100602b
kcblin: lbs=fc86c1cc flag=8 slot_cnt=32 slot_size=65536 state obj=24321224
kcblin: state objects are: Call=243a2210,Current Call=243a2210, Session=24321224
kdContigDbaDrCbk:starting from tsn 5
kdContigDbaDrCbk:starting from rdba 0100602b
kdContigDbaDrCbk:returning 1570 blocks
kcblrs:issuing read on slot : 0
kcbldio:lbs=fc86c1cc slt=fc86408c typ=0 async=1 afn=4 blk=602b cnt=8 buf=fc87fe00
kcblrs:issuing read on slot : 1
kcbldio:lbs=fc86c1cc slt=fc864210 typ=0 async=1 afn=4 blk=6033 cnt=8 buf=fc89fe00
kcblcio: lbs=fc86c1cc slt=fc86408c type=0 afn=4 blk=602b cnt=8 buf=fc87fe00
. . .
In the preceding example, the trace file belongs to query slave #0 (P000). There are 32 I/O slots available for the direct read operation (slot_cnt=32). A slot is a unit of I/O, and each slot is 65536 bytes (slot_size=65536). Asynchronous I/O is enabled during the read operation (async=1). The query slave reads data file #4 (afn=4). The number of blocks read is 8 (cnt=8). Since the block size is 8K, this translates to 65536 bytes.
In this case, the direct I/O slot size prevents the process from achieving the full 1MB, which is the default limit of the _DB_FILE_DIRECT_IO_COUNT parameter. The slot size can be modified by event 10351. The number of slots can be modified by event 10353.
Caution: The preceding information gives you a sense of the direct I/O throughput in your system. Don't simply change the default slot size or the number of direct I/O slots. You need to know your hardware limits before doing so. Besides, you should focus on optimizing the application and SQL statements first.
Lastly, in Oracle8i Database, direct reads can also be enabled for serial scans with the _SERIAL_DIRECT_READ initialization parameter. Earlier releases may do the same by setting event 10355. Doing so will cause data from full scans to be read into the session’s PGA and not shared with other sessions. This can sharply increase memory usage.
SQL*Net message to client - time to pack a message (no network time included); possibly tune SDU
SQL*Net more data from client - possible network issues; possibly tune SDU
SQL*Net more data to client - time to pack a message (no network time included); possibly tune SDU
SQL*Net message to dblink
SQL*Net more data from dblink - possible network issues; possibly tune SDU
SQL*Net more data to dblink
SQL*Net message from client
- Idle event: waiting for work from the client
- Includes network transmission times for messages coming from the shadow process
- Typically indicative of client "think time" or "processing time"
Example from Egor Starostin, http://oracledba.ru
From a 10046 trace:
=====================
PARSING IN CURSOR #1 len=43 dep=0 uid=0 oct=3 lid=0 tim=1304096237 hv=2707617103 ad='89a03e18'
select * from all_objects where rownum < 20
END OF STMT
PARSE #1:c=0,e=143,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1304096209
EXEC #1:c=0,e=744,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1304097036
WAIT #1: nam='SQL*Net message to client' ela= 3 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304097096
FETCH #1:c=10000,e=6903,p=0,cr=9,cu=0,mis=0,r=1,dep=0,og=1,tim=1304104057
1->WAIT #1: nam='SQL*Net message from client' ela= 721 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304104865 # [non-idle]
WAIT #1: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304105319
FETCH #1:c=0,e=627,p=0,cr=21,cu=0,mis=0,r=15,dep=0,og=1,tim=1304105524
2->WAIT #1: nam='SQL*Net message from client' ela= 253 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304105818 # [non-idle]
WAIT #1: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1304105867
FETCH #1:c=0,e=63,p=0,cr=6,cu=0,mis=0,r=3,dep=0,og=1,tim=1304105900
3->WAIT #1: nam='SQL*Net message from client' ela= 1960753 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1306066946 # [idle]
=====================
PARSING IN CURSOR #1 len=21 dep=0 uid=0 oct=3 lid=0 tim=1306069444 hv=2200891488 ad='89913b50'
select user from dual
END OF STMT
PARSE #1:c=0,e=60,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1306069440
...
The first two "SQL*Net message from client" waits are in the middle of cursor processing and are considered non-idle waits. The third "SQL*Net message from client" is between cursors and is considered an idle event, i.e. we are waiting for the next command from the client.
Doesn't include network timing.
See Tanel Poder's analysis of SQL*Net message to client.
The only SQL*Net wait that can indicate a possible NETWORK problem
- Client is sending data to the shadow process that spans packets (think large data inserts, possibly large code blocks, large SQL statements)
- The shadow process waits for the next packet
- Can indicate network latency
- Can indicate a problem with the client tool
Here is an example with ASHMON where the application server died mid-stream on inserts. The shadow processes were left waiting for completion of the message. You can see the regular load on the database on the left, then just past the middle the load crashes, and all that's left is waits on "SQL*Net more data from client"
Possibly set SDU=32768 as well as setting RECV_BUF_SIZE and SEND_BUF_SIZE to 65536.
CREATE TABLE T1 (C1 NUMBER);
ALTER TABLE T1 ADD (CONSTRAINT T1_CHECK1 CHECK (C1 IN ('J','N')));
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';
INSERT INTO T1 VALUES (1);
Trace File
PARSING IN CURSOR #2 len=25 dep=0 uid=0 oct=2 lid=0 tim=5009300581224 hv=981683409 ad='8e6a7c10'
INSERT INTO T1 VALUES (1)
END OF STMT
PARSE #2:c=0,e=2770,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,tim=5009300581220
BINDS #2:
EXEC #2:c=0,e=128,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=5009300581418
ERROR #2:err=1722 tim=512952379
WAIT #2: nam='SQL*Net break/reset to client' ela= 31 driver id=1650815232 break?=1 p3=0 obj#=-1 tim=5009300581549
WAIT #2: nam='SQL*Net break/reset to client' ela= 92 driver id=1650815232 break?=0 p3=0 obj#=-1 tim=5009300581662
Unfortunately Oracle doesn't give much debugging information unless you are tracing. If you don't trace, the SQL won't be captured because, from Oracle's point of view, the problem statement isn't an acceptable SQL statement, so there is no SQL ID to track down.
These waits are the same as
SQL*Net message to dblink
SQL*Net more data from dblink
SQL*Net more data to dblink
SQL*Net break/reset to dblink
The default SDU can be set in sqlnet.ora. If it's not set, the default is 2048; the max is 32768. The default, or the value in sqlnet.ora, can be overridden in tnsnames.ora and listener.ora. The client and server negotiate the size, agreeing on the smaller of the two settings. (TDU, Transmission Data Unit: see note 44694.1. The TDU parameter has been deprecated in Oracle Net v8.0 and beyond and is ignored; it is mentioned here only for backward compatibility.)
tnsnames.ora
V10G = (DESCRIPTION =
          (SDU=32768)
          (ADDRESS = (PROTOCOL = TCP)(HOST = fuji)(PORT = 1522))
          (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = v10g))
       )
listener.ora
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SDU=32768)
      (SID_NAME = v10g)
      (ORACLE_HOME = /export/home/oracle10)))
Tracing
sqlnet.ora
trace_level_client=16
trace_directory_client=/tmp
trace_file_client=client.trc
trace_unique_client=true
trace_level_server=16
trace_directory_server=/tmp
trace_file_server=server.trc
client.trc
client_3582.trc:[12-JAN-2008 11:37:39:237] nsconneg: vsn=313, gbl=0xa01, sdu=32768, tdu=32767
more from Jonathan Lewis at http://www.jlcomp.demon.co.uk/sdu.html
The recommended size for these buffers (from Oracle's docs) is at least:
Network bandwidth * roundtrip = buffer min size
For example, if the network bandwidth is 100 Mb/s and the round trip time (from ping) is 5 ms, then:
100,000,000 bits   1 byte    5 seconds
---------------- x ------ x --------- = 62,500 bytes
    1 second       8 bits     1000
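The arithmetic above is just the bandwidth-delay product. A small sketch for checking the numbers (the function name is mine, not from any Oracle tool):

```python
# Minimum socket buffer = bandwidth-delay product:
# convert bits/sec to bytes/sec (divide by 8), then multiply by the
# round-trip time converted from milliseconds to seconds.
def min_buffer_bytes(bandwidth_bps: float, rtt_ms: float) -> float:
    return bandwidth_bps / 8 * (rtt_ms / 1000)

# 100 Mb/s link, 5 ms ping round trip -> the 62,500 bytes computed above
print(min_buffer_bytes(100_000_000, 5))  # 62500.0
```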
tnsnames.ora
V10G = (DESCRIPTION =
          (SEND_BUF_SIZE=65536)
          (RECV_BUF_SIZE=65536)
          (ADDRESS = (PROTOCOL = TCP)(HOST = fuji)(PORT = 1522))
          (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = v10g))
       )
listener.ora
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SEND_BUF_SIZE=65536)
      (RECV_BUF_SIZE=65536)
      (SID_NAME = v10g)
      (ORACLE_HOME = /export/home/oracle10)))
sqlnet.ora
RECV_BUF_SIZE=65536
SEND_BUF_SIZE=65536
log buffer space
log file switch (archiving needed)
log file switch (checkpoint incomplete)
log file switch (private strand flush incomplete)
log file switch completion
Redo is written to disk when
User commits
Log buffer 1/3 full (_log_io_size)
Log buffer fills 1M
Every 3 seconds
DBWR asks LGWR to flush redo
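The flush triggers listed above can be sketched as a simple predicate. This is a conceptual sketch only, not Oracle internals; the fixed 1/3, 1 MB, and 3-second thresholds are simplifications of what _log_io_size actually controls:

```python
# Conceptual sketch: when would LGWR flush the log buffer to disk?
def lgwr_should_flush(commit_pending: bool,
                      buffer_used: int,        # bytes of unflushed redo
                      buffer_size: int,        # total log buffer size in bytes
                      secs_since_flush: float,
                      dbwr_request: bool) -> bool:
    return (commit_pending                       # a session committed (or rolled back)
            or buffer_used * 3 >= buffer_size    # log buffer 1/3 full
            or buffer_used >= 1024 * 1024        # 1 MB of pending redo
            or secs_since_flush >= 3             # 3-second timeout
            or dbwr_request)                     # DBWR asks LGWR to flush first
```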
Sessions committing wait for LGWR on:
Commit
Rollback
Arguments
P1 = buffer# in log buffer that needs to be flushed
P2 = not used
P3 = not used
Often caused by loops that commit on every iteration.
Commit every 50 or 100 iterations instead.
Batching vs Single row operations
Single row operations, lots of communication: slowest.
-- slowest : single row inserts with commit
insert into foo values ...;
insert into foo values ...;
insert into foo values ...;
insert into foo values ...;
Send all the info over the wire to the database in one communication:
-- FOR loop
FOR i IN 1..n LOOP
INSERT INTO foo VALUES (id(i), data(i));
END LOOP;
COMMIT;
Send all the info over the wire to the database in one communication, as a bulk insert:
-- FORALL
FORALL i IN 1..n
INSERT INTO foo VALUES (id(i), data(i));
COMMIT;
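The same batching principle applies from client code in any language. As an illustration only (using Python's stdlib sqlite3 rather than Oracle; the table and data are made up), one bulk call plus a single commit replaces many per-row round trips and per-row log flushes:

```python
import sqlite3

# In-memory database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER, data TEXT)")
rows = [(i, f"row-{i}") for i in range(100)]

# Slow pattern: one INSERT and one COMMIT per row (a log flush wait each time).
for r in rows[:50]:
    conn.execute("INSERT INTO foo VALUES (?, ?)", r)
    conn.commit()

# Batched pattern, analogous to FORALL: one bulk call, one commit.
conn.executemany("INSERT INTO foo VALUES (?, ?)", rows[50:])
conn.commit()

print(conn.execute("SELECT count(*) FROM foo").fetchone()[0])  # 100
```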
Put redo on a dedicated disk
Use raw devices or direct IO
More radical: consider RAM disks
Can stripe if redo writes are comparable to stripe size
  Striping shouldn't hurt
  Striping can help
  Ex: imp – can have large redo writes – can improve by 10-30%
Alternate disks for redo and archiving of redo
Possibly, on 10gR2:
  ALTER SYSTEM SET COMMIT_WRITE = BATCH, NOWAIT
  Commit could be lost if the machine crashes, or on an IO error
Manly-Men Only Use Solid State Disk for Redo Logging - Kevin Closson
High log file sync because of high CPU demand:
Tuning Log File Sync Wait Events - Riyaj Shamsudeen
Solution
Increase log_buffer parameter in init.ora
Above 3M, increasing log_buffer has little effect; if the waits persist, the bottleneck is at the disk level.
Improve disk IO for redo
Faster disk
Raw file
Direct IO
Dedicated disk
p1, p2, p3 – no values
No p1, p2, p3
Database "hangs" for transactions
archive log stop;
-- make room in log_archive_dest
archive log start;
Wait for checkpoint to complete because all log files are full.
Solutions:
Add more log files
Increase size of log files
New wait in 10g
Like a "log file switch completion"
Wait for LGWR to switch log files when generating redo.
Solution:
Increase redo log file size
Reduces frequency of switches
What happens when a log file switch occurs:
Get next log file from control file
Get Redo Copy and Redo Allocation latch
Flush redo
Close file
Update controlfile
Set new file to Current
Set old file to Active
If in Archivelog mode, add file to archive list
Open all members of new logfile group
Write the SCN to the headers
Enable redo log generation
DBWR makes a list of blocks that need to be written out before the redo log file can be overwritten.
Same as log file switch completion, but the switch is executed by the DBA:
alter system switch logfile;
alter system set fast_start_mttr_target=17 scope=both;
SQL> select ESTIMATED_MTTR from V$INSTANCE_RECOVERY;
ESTIMATED_MTTR
--------------
21
select o.object_name obj,
       o.object_type otype,
       ash.SQL_ID,
       w.class
from   v$active_session_history ash,
       ( select rownum class#, class from v$waitstat ) w,
       all_objects o
where  event='buffer busy waits'
and    w.class#(+)=ash.p3
and    o.object_id (+)= ash.CURRENT_OBJ#
Order by sample_time;
OBJ    OTYPE  SQL_ID        CLASS
------ ------ ------------- ------------------
TOTO1  TABLE  8gz51m9hg5yuf data block
TOTO1  TABLE  8gz51m9hg5yuf data block
TOTO1  TABLE  8gz51m9hg5yuf segment header
TOTO1  TABLE  8gz51m9hg5yuf data block
set linesize 120
col block_type for a20
col objn for a25
col otype for a15
col filen for 9999
col blockn for 9999999
col obj for a20
col tbs for a10
select
bbw.cnt,
bbw.obj,
bbw.otype,
bbw.sql_id,
bbw.block_type,
nvl(tbs.name,to_char(bbw.p1)) TBS,
tbs_defs.assm ASSM
from (
select
count(*) cnt,
nvl(object_name,CURRENT_OBJ#) obj,
o.object_type otype,
ash.SQL_ID sql_id,
nvl(w.class,'usn '||to_char(ceil((ash.p3-18)/2))||' '||
decode(mod(ash.p3,2),
1,'header',
0,'block')) block_type,
--nvl(w.class,to_char(ash.p3)) block_type,
ash.p1 p1
from v$active_session_history ash,
( select rownum class#, class from v$waitstat ) w,
all_objects o
where event='buffer busy waits'
and w.class#(+)=ash.p3
and o.object_id (+)= ash.CURRENT_OBJ#
and ash.session_state='WAITING'
and ash.sample_time > sysdate - &minutes/(60*24)
--and w.class# > 18
group by o.object_name, ash.current_obj#, o.object_type,
ash.sql_id, w.class, ash.p3, ash.p1
) bbw,
(select file_id,
tablespace_name name
from dba_data_files
) tbs,
(select
tablespace_name NAME,
extent_management LOCAL,
allocation_type EXTENTS,
segment_space_management ASSM,
initial_extent
from dba_tablespaces
) tbs_defs
where tbs.file_id(+) = bbw.p1
and tbs.name=tbs_defs.name
Order by bbw.cnt
/
CNT OBJ OTYPE SQL_ID BLOCK_TYPE TBS ASSM
----- ------- ------- ------------- ---------------- ---------- ------
3 TOTO1 TABLE 8gz51m9hg5yuf segment header NO_ASSM MANUAL
59 TOTO1 TABLE 8gz51m9hg5yuf data block NO_ASSM MANUAL
select
tablespace_name,
extent_management LOCAL,
allocation_type EXTENTS,
segment_space_management ASSM,
initial_extent
from dba_tablespaces;
TABLESPACE_NAME LOCAL EXTENTS ASSM
--------------- ---------- --------- ------
SYSTEM LOCAL SYSTEM MANUAL
UNDOTBS1 LOCAL SYSTEM MANUAL
SYSAUX LOCAL SYSTEM AUTO
TEMP LOCAL UNIFORM MANUAL
USERS LOCAL SYSTEM AUTO
EXAMPLE LOCAL SYSTEM AUTO
DATA LOCAL SYSTEM MANUAL
create tablespace data2
datafile '/d3/kyle/data2_01.dbf'
size 200M
segment space management auto;
Select CURRENT_OBJ#||' '||o.object_name objn,
o.object_type otype,
CURRENT_FILE# filen,
CURRENT_BLOCK# blockn,
ash.SQL_ID,
w.class ||' '||to_char(ash.p3) block_type
from v$active_session_history ash,
(select rownum class#, class from v$waitstat ) w,
all_objects o
where event='buffer busy waits'
and w.class#(+)=ash.p3
and o.object_id (+)= ash.CURRENT_OBJ#
Order by sample_time;
If the block is of class > 18, there will be no object name, so we have to look it up ourselves to be sure:

OBJN        OTYPE  FILEN BLOCKN SQL_ID        BLOCK_TYPE
----------- ------ ------ ------ ------------- ------------
54962 TOTO1 TABLE 16 45012 8gz51m9hg5yuf data block
54962 TOTO1 TABLE 16 161 8gz51m9hg5yuf segment header
0 14 9 8gz51m9hg5yuf 87
0 14 9 8gz51m9hg5yuf 87
select segment_name,
segment_type
from dba_extents
where
&P2 between
block_id and block_id + blocks - 1
and
file_id = &P1 ;
Plug in 14 for P1 the file # and 9 for P2 the block number:
SEGMENT_NAME SEGMENT_TYPE
-------------- --------------
R2 ROLLBACK
alter system set undo_management=auto scope=spfile;
Time P1 P2 OBJN OTYPE FN BLOCKN BLOCK_TYPE
----- --- --- ---- ----- -- ------ -----------------
11:44 202 2 -1 0 0 file header block
11:44 202 2 TOTO TABLE 1 60218 file header block
11:44 202 2 TOTO TABLE 1 60218 file header block
11:44 202 2 TOTO TABLE 1 60218 file header block
11:44 202 2 TOTO TABLE 1 60218 file header block