This will decide the mode in which the optimizer engine should run in. Valid values are RULE, CHOOSE, ALL_ROWS, FIRST_ROWS (_n).
Optionally, the OPTIMIZER_MODE can be set to CHOOSE. This is kind of an intermediate option between RBO and CBO. In fact, it tries to run the query in either CBO or RBO depending on the availability or unavailability of statistics. Therefore, if the tables present in the query have statistics generated on them, CBO (ALL_ROWS only) is preferred or RBO is taken up.
e.g.: optimizer_mode = first_rows
For an OLTP system, FIRST_ROWS would be the ideal option for fast response time. ALL_ROWS is meant for batch-oriented applications. Note, a plan that produces the first n rows with the fastest response time might not be an optimal plan if requirement is to obtain the entire result, so decide as per the need of the application
Optimizer_Index_Cost_Ad takes a value between 1 and 10000 (default 100). Setting this value makes the index more or less friendly to the optimizer. For example setting it to 50 tells the optimizer that an execution plan that uses index(es) is half as expensive as other execution plans that are not using them. The lower the value (less than 100), the less full table scan executions will take place in the system.
Use of indexes in OLTP is a crucial factor for deciding the response time, but sometimes, full table scans provide better throughput. Nested loop joins are influenced by this parameter value. Set this parameter to a value that is optimal for your application. I recommend set this parameter between 10 - 20 for OLTP and 50 for DSS Systems..
If you are having slow performance because the CBO first_rows optimizer mode is favoring too many full-table scans, you can reset the optimizer_index_cost_adj parameter to immediately tune all of the SQL in your database to favor index scans over full-table scans. This is a "silver bullet" that can improve the performance of an entire database in cases where the database is OTLP and you have verified that the full-table scan costing is too low.
This tells optimizer to favor nested loops and IN-list iterators over sort-merge and hash joins. The default value is 100 and makes nested loops and IN-list operator look less expensive and more appealing to Optimizer. The value basically indicates the percentage of index blocks that the optimizer should assume are present in cache. I recommend set this parameter to 85.
This parameter determines the number of database blocks read in one input/output operation during a full table scan. The value set is used in computing the cost of full table scans. OLTP systems will not benefit by setting a high value for this parameter as the computed cost for full table scan execution plans would reduced. The maximum size is Operating system dependent.
e.g.: db_file_multiblock_read_count = 1 (may be set to a higher value)
This parameter defines the maximum memory space allocated per user process that requires sorting of data and for insert and updates to bitmap indexes. Optimizer uses the set value for determining the cost of sorting in a query. This defaults to 64K. Normally a size of 64K to 1M is appropriate for OLTP systems.
Sort-merge joins are influenced by this parameter value. The bigger the size the more appealing will be sort-merge joins over nested-loop and hash joins. The lower the size the more sorting will take place in temporary segments. So assign a value that has been evaluated for your setup.
Oracle recommends the use of PGA_AGGREGATE_TARGET instead of this parameter from Oracle 9i.
This specifies the number of database blocks to be read each time a sort performs a read from a temporary segment. The default value is 2. General recommendation is to keep the default value. However, if you intend to hike up the SORT_AREA_SIZE considerably, you may evaluate the improvement by changing this parameter.
e.g.: sort_multiblock_read_count = 2
Hash joins are available only in CBO. In hash joins, a hash table is created on the join key of the smallest sized table. It then joins the other tables to find the match. Hash joins may prove to be faster than other type of joins in some conditions, especially when the index is missing or search criteria is not very selective. Hash joins require a large amount of memory as the hash tables are retained; this may sometimes result in memory swapping.
Nested-loop joins return the first row faster than sort-merge and hash joins and are preferred for OLTP, but other types of joins cannot be ignored for running other aspects of the applications.
e.g.: hash_join_enabled = true
This specifies the maximum amount of memory in bytes to be used for a hash join per process. It is defaulted to 2 times SORT_AREA_SIZE.
Oracle recommends the use of PGA_AGGREGATE_TARGET instead of this parameter from Oracle 9i.
e.g.: hash_area_size = 2097152
Setting this to a very low number may sometimes result in the following error.
ORA-6580: Hash Join ran out of memory while keeping large rows in memory.
This specifies how many sequential blocks a hash join reads and writes in one Input/Output activity. Oracle recommends not changing or assigning a value to this parameter; this will let oracle decide on the appropriate value for each individual query. In such casees, the value of the parameter will appear as 0 in the V$PARAMETER view.
This parameter is renamed to _HASH_MULTIBLOCK_IO_COUNT as of Oracle 9i.
e.g.: hash_multi_block_io_count = 0
This parameter is relevant for systems using bitmap indexes. It specifies the amount of memory Oracle uses to merge bitmaps retrieved from a range scan of a bitmap index. The default value is 1 MB, which is considered sufficient for most setups.
Oracle recommends use of PGA_AGGREGATE_TARGET instead of this parameter from Oracle 9i.
e.g.: bitmap_merge_area_size = 1048576
This parameter is relevant for systems using Materialized views, Function based indexes and stored outlines. Setting this parameter enables query rewrite for the database. The materialized views should also be query rewrite enabled to allow the optimizer to redirect a query to it rather than the actual tables listed in the FROM clause. Query rewriting is internally done by the optimizer based on what results are expected and whether these are already present in materialized view form.
e.g.: query_rewrite_enabled = true
This parameter determines the extent to which the optimizer must enforce query rewriting to Materialized views. This determines the accuracy of the query rewrite. It can be set to ENFORCED, TRUSTED or STALE_TOLERATED. ENFORCED option uses Materialized views if they contain fresh data, thus guarantying the integrity and consistency of data. TRUSTED uses relationships that have been defined in the system for rewriting the query. STALE_TOLERATED allows the use of relationships that are not enforced and may use stale data. For OLTP systems, ENFORCED would be the ideal setup, as reports need to be up-to-date.
e.g.: query_rewrite_integrity = enforced
This parameter specifies the join method for anti-joins, for example when a NOT IN operator is present in your query. It can be set to NESTED_LOOPS, MERGE or HASH. It is defaulted to NESTED_LOOPS in Oracle 8i and CHOOSE in Oracle 9i.
This parameter is renamed to _ALWAYS_ANTI_JOIN as of Oracle 9i.
e.g.: always_anti_join = nested_loops
This parameter specifies the join method for semi-joins. These types of joins are carried out by Optimizer after transforming a query. In such joins, duplicate values from the inner table are removed and then the type of join specified in the parameter is used to perform a semi-join. It can be set to NESTED_LOOPS, MERGE or HASH. In Oracle 8i, it is defaulted to STANDARD and in Oracle 9i it is defaulted to CHOOSE, to pick up an appropriate join.
This parameter is renamed to _ALWAYS_SEMI_JOIN as of Oracle 9i.
e.g.: always_semi_join = nested_loops
This specifies whether query transformation will be applied to star queries. It can be set to TRUE, FALSE or TEMP_DISABLE (transformation will take place but will not use temporary tables). I presently set it to FALSE due to some known issues of sub-optimal queries being generated. If you intend to use this, please upgrade your version to 188.8.131.52 and above.
e.g.: star_transformation_enabled = false
This parameter refers to parallel executions in cluster databases. It is meant for improving hash and sort-merge join operations where a very large result set is joined with a very small result set. When this option is enabled, the optimizer broadcasts a copy of all rows in the smaller result set to all cluster databases that are processing some rows of the larger result set.
It is obsolete in release 9.2.0.
e.g.: parallel_broadcast_enabled = false
This parameter is introduced in release 9i. It is meant for situations where tables are not analyzed. As CBO depends heavily on statistics, the parameter tells the optimizer to sample the unanalyzed tables that are being used in a query. A level of 0 to 10 can be specified, the higher the value the more time optimizer spends in sampling.
e.g.: optimizer_dynamic_sampling = 1
This parameter is meant for backward compatibility to support partition views. Oracle recommends use of partition tables rather than partition views. If you are migrating to CBO, chances are that you may not be using partition views.
e.g.: partition_view_enabled = false
This parameter determines what kind of SQL statements can share the same cursor. It can be set to FORCE, SIMILAR or EXACT. FORCE will try to squeeze statements that may differ in some literals to share the same cursor. SIMILAR is somewhat the same but will try to maintain the plan optimization for identical statements. EXACT allows statements with exact identical text to share a cursor.
Using FORCE may sometimes result in unexpected results.
e.g.: cursor_sharing = exact
METHOD_OPTargument to the
REBUILDenables this gathering of statistics.
ESTIMATE_PERCENTparameter of the
DBMS_STATSgathering procedures to
AUTO_SAMPLE_SIZEto maximize performance gains while achieving necessary statistical accuracy.
AUTO_SAMPLE_SIZElets Oracle determine the best sample size for good statistics.
SELECT INDEX_NAME "NAME", NUM_ROWS, DISTINCT_KEYS "DISTINCT",
LEAF_BLOCKS, CLUSTERING_FACTOR "CF", BLEVEL "LEVEL",
WHERE OWNER = 'SH'
ORDER BY INDEX_NAME;
Typical output is:
NAME NUM_ROWS DISTINCT LEAF_BLOCKS CF LEVEL ALFBPKEYVerifying Column Statistics
-------------------------- -------- -------- ----------- ------- ------- ----------
CUSTOMERS_PK 50000 50000 454 4405 2 1
PRODUCTS_PK 10000 10000 90 1552 1 1
PRODUCTS_PROD_CAT_IX 10000 4 99 4422 1 24
PRODUCTS_PROD_SUBCAT_IX 10000 37 170 6148 2 4
SALES_PROD_BIX 6287 909 1480 6287 1 1
SALES_PROMO_BIX 4727 459 570 4727 1 1
dbms_stats.get_index_stats(ownname => 'SYS', indname => 'AM01_N1',
numrows => l_numrows, numlblks => l_numlblks,
numdist => l_numdist, avglblk => l_avglblk,
avgdblk => l_avgdblk, clstfct => l_clstfct,
indlevel => l_indlevel);
dbms_output.put_line('No. of rows: ' || l_numrows);
dbms_output.put_line('No. of blks: ' || l_numlblks);
dbms_output.put_line('No. of distinct values: ' || l_numdist);
dbms_output.put_line('Avg leaf blocks for distinct keys: ' || l_avglblk);
dbms_output.put_line('Avg data blocks pointed to in the table: ' || l_avgdblk);
dbms_output.put_line('Clustering factor: ' || l_clstfct);
dbms_output.put_line('Index height: ' || l_indlevel);
No. of rows: 3819711
No. of blks: 11092
No. of distinct values: 1
Avg leaf blocks for distinct keys: 11092
Avg data blocks pointed to in the table: 14616
Clustering factor: 14616
Index height: 2
SELECT COLUMN_NAME, NUM_DISTINCT, NUM_NULLS, NUM_BUCKETS, DENSITY
WHERE TABLE_NAME ="PA_EXPENDITURE_ITEMS_ALL"
ORDER BY COLUMN_NAME;
This returns the following data:
COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITYExporting and importing statistics with DBMS_STATS
------------------------------ ------------ ---------- ----------- ----------
BURDEN_COST 4300 71957 1 .000232558
BURDEN_COST_RATE 675 7376401 1 .001481481
CONVERTED_FLAG 1 16793903 1 1 poor
COST_BURDEN_DISTRIBUTED_FLAG 2 15796 1 .5 poor
COST_DISTRIBUTED_FLAG 2 0 1 .5 poor
COST_IND_COMPILED_SET_ID 87 6153143 1 .011494253
EXPENDITURE_ID 1171831 0 1 8.5337E-07 GOOD
TASK_ID 8648 0 1 .000115634
TRANSFERRED_FROM_EXP_ITEM_ID 1233787 15568891 1 8.1051E-07 GOOD
dbms_stats.get_column_stats(ownname => 'SYS', tabname => 'AM01',
colname => 'COL1', distcnt => l_distcnt, density => l_density,
nullcnt => l_nullcnt, srec => l_srec, avgclen => l_avgclen);
dbms_output.put_line('No. of distinct values: ' || l_distcnt);
dbms_output.put_line('Density: ' || l_density);
dbms_output.put_line('Count of nulls: ' || l_nullcnt);
dbms_output.put_line('Avg. column length: ' || l_avgclen);
No. of distinct values: 2
Count of nulls: 0
Avg. column length: 3
DBMS_STATS also includes routines for gathering statistics and storing them outside the dictionary. This does not influence the optimizer. Most of the procedures in this package as mentioned above have three common parameters - STATID, STATTAB and STATOWN that are related to user processing of statistics.
Advantages of this feature:
1. Estimated statistics at different percentages could be stored and used for testing.
Statistics generated on one database could be transferred to another
The cost-based optimizer can use data value histograms to get accurate estimates of the distribution of column data. A histogram partitions the values in the column into bands, so that all column values in a band fall within the same range. Histograms provide improved selectivity estimates in the presence of data skew, resulting in optimal execution plans with nonuniform data distributions.
One of the fundamental tasks of the cost-based optimizer is determining the selectivity of predicates that appear in queries. Selectivity estimates are used to decide when to use an index and the order in which to join tables. Some attribute domains (a table's columns) are not uniformly distributed.
The cost-based optimizer uses height-based histograms on
specified attributes to describe the distributions of nonuniform
domains. In a height-based histogram, the column values are divided
into bands so that each band contains approximately the same number of
values. The useful information that the histogram provides, then, is
where in the range of values the endpoints fall.
Histograms can affect performance and should be used only
when they substantially improve query plans. Histogram statistics data
is persistent, so the space required to save the data depends on the
sample size. In general, create histograms on columns that are used
WHERE clauses of queries and have a highly
skewed data distribution. For uniformly distributed data, the
cost-based optimizer can make fairly accurate guesses about the cost of
executing a particular statement without the use of histograms.
Histograms, like all other optimizer statistics, are static. They are useful only when they reflect the current data distribution of a given column. (The data in the column can change as long as the distribution remains constant.) If the data distribution of a column changes frequently, you must recompute its histogram frequently.
Histograms are not useful for columns with the following characteristics:
You generate histograms by using the
package. You can generate histograms for columns of a table or
partition. For example, to create a 10-bucket histogram on the
column of the
emp table, issue the following statement:
SIZE keyword declares the maximum
number of buckets for the histogram. You would create a histogram on
column if there were an unusually high number of employees with the
same salary and few employees with other salaries. You can also collect
histograms for a single partition of a table.
Oracle Corporation recommends using the
package to have the database automatically decide which columns need
histograms. This is done by specifying
This query shows the skewed distribution of data values
s6. In this case, there are two distinct non-null
values: 10 and 18. The majority of the rows consists of
= 10 (1,589,464), while a small number of rows consist of
= 18 (13,091).
The selectivity of column
If No Histogram is Used: The selectivity of column
is assumed to be 50%, uniformly distributed across 10 and 18. This is
not selective; therefore,
s6 is not an ideal choice for
use as an index.
If a Histogram is Used: The data distribution information
is stored in the dictionary. This allows the optimizer to use this
information and compute the correct selectivity based on the data
distribution. In this example,
the selectivity, based on the histogram data, is 0.008. This is a
relatively high, or good, selectivity, which leads the optimizer to use
an index on column
s6 in the execution plan.
The view v$sysstat is the accumulated statistics for the complete system since the instance has started except for what is active now and shown in v$sesstat. The numbers in this view are held in "tens of milliseconds," (sounds like hundredths of seconds to me). So to convert to seconds, we divide the value by 100. Note that TIMED_STATISTICS must be set to true in the init.ora for these views to have timing numbers. So to see overall parsing and read consistency waits, we can use the following:
column parsing heading 'Parsing|(seconds)'We can see from this output that this instance has consumed a lot of CPU seconds, but a relatively small amount of that was used for parsing. Of the parsing, about 20 percent of the time was waiting for read consistency, but since the parsing percent is so small, this is not impacting our system performance.
column total_cpu heading 'Total CPU|(seconds)'
column waiting heading 'Read Consistency|Wait (seconds)'
column pct_parsing heading 'Percent|Parsing'
select total_CPU, parse_CPU parsing,
from (select value/100 total_CPU
from v$sysstat where name = 'CPU used by this session'),
(select value/100 parse_CPU
from v$sysstat where name = 'parse time cpu'),
(select value/100 parse_elapsed
from v$sysstat where name = 'parse time elapsed');
Total CPU Parsing Read Consistency Percent
(seconds) (seconds) Wait (seconds) Parsing
---------- ---------- ---------------- ----------
1236263327 21072.43 4277.93 0
If we need to scan buffers to find a free one, then we are wasting CPU resource. This can be checked with the script below:
select inspected,requested,(inspected/requested) ratio
from (select value inspected from v$sysstat where name = 'free buffer inspected') a,
(select value requested from v$sysstat where name = 'free buffer requested') b;
INSPECTED REQUESTED RATIO
---------- ---------- ----------
5100 12288112 .000415035
that, for the vast majority of the time, the first buffer we look at is
available (low ratio). So there is no problem with getting free buffers
need them. Let's look further.
Waiting and Latch
Active waiting can be compared to waiting at the betting window with your two dollars in your hand. You are not just passively standing there, but are focused on all the windows to see when a resource becomes available. We can see these active waits in v$system_event with the following:
select event,time_waited/100 seconds
where event not in
('rdbms ipc message'
,'SQL*Net more data to client'
,'SQL*Net message from client'
,'SQL*Net message to client'
,'wakeup time manager')
order by seconds;
buffer deadlock 0
. . .
latch free 3422.98
. . .
control file parallel write 80356.43
log buffer space 106812.85
The previous code shows me that we have waited the most time for log buffer space. I will check that I have the log buffer sized correctly. This is still a small fraction of the CPU seconds used by the instance as we saw above; it was over one billion, so we are way under one percent of the time spent waiting. This also shows us that we have been waiting for latches. So let's see which latch we are not getting when needed with the following code:
Column name format a35
SELECT name, gets, misses, immediate_gets, immediate_misses
FROM v$latch l
WHERE (misses > .1*gets)
or (immediate_misses > .1*(immediate_misses+immediate_gets));
NAME GETS MISSES IMMEDIATE_GETS IMMEDIATE_MISSES
--------------------------------- ------ -------- -------------- ----------------
channel operations parent latch 86845 6 43300 78881
a latch that we can't tune and we also see that the number of misses
is not increasing as we rerun this query, so we will leave it alone.
Given this, we have to conclude that the instance is not using too much CPU resource for its own maintenance — so let's check the users.
User Code Execution
In our instance, we have checked that the environment for the instance is not forcing us to consume more CPU resources. So we will check the code that is being executed. Most often, the CPU consumption problem is caused by what a user is doing.
We can see which user is consuming the most CPU at this time by creating a table with the cumulative values so we can pull out the current usage:
drop table t_CPU;
create table t_CPU storage
(initial 64k next 64k pctincrease 0) pctfree 0
as select a.sid,a.value
from v$sesstat a,v$statname b
where b.name = 'CPU used by this session'
and a.statistic# = b.statistic#;
column name format a25
column username format a20
column CPU heading 'CPU|Seconds'
select a.sid, c.username,b.name,(a.value-t.value)/100 CPU
from v$session c,v$sesstat a,v$statname b,t_CPU t
where c.sid = a.sid
and b.name = 'CPU used by this session'
and a.statistic# = b.statistic#
and a.value > 0
and a.value - t.value > 0
order by a.value-t.value;
drop table t_CPU;
SID USERNAME NAME Seconds
------ -------------------- ------------------------- ----------
52 CPU used by this session .01
. . .
23 SYSTEM CPU used by this session 35.82
177 ORADBA CPU used by this session 38.26
167 ORADBA CPU used by this session 52.07
You can see that SID 167 is the current largest consumer. What we are looking for, then, is code that is in a tight loop in memory, just processing over and over. If this CPU is being consumed by normal usage, then no changes are needed. So let's see what SID 167 is running:
select c.executions,trunc(a.last_call_et/60) min,b.sql_text
from v$session a, v$sqltext b, v$sqlarea c
and a.sql_address = b.address
and a.sql_address = c.address
order by b.piece;
EXECUTIONS MIN SQL_TEXT
---------- ------ ----------------------------------------------------------------
94784 0 select daily_notes_id, center_code, TO_CHAR(day,'MM/DD/YYYY')
day, notes from daily_notes where center_code = :1
and day = TO_DATE(:2,'MM/DD/YYYY') order by 1
There is nothing wrong with this code and after watching for a while we see that this user is not running any code that jumps out as bad. We conclude that this session is just doing the most work.
We can also check for the statements with the most buffer gets. These are probably working the most in memory:
set linesize 140
column executions format 999,999,999 heading "Executions"
column disk_reads format 999,999,999 heading "Block Reads"
column buffer_gets format 999,999,999 heading "Buffer Reads"
column sql_text format a90 heading "Statement"
break on executions on buffer_gets skip 1
--we want the 2 largest consumers
variable max_buffer_read number
select max(buffer_gets) into :max_buffer_read
where parsing_user_id >0;
select max(buffer_gets) into :max_buffer_read
where parsing_user_id >0
and buffer_gets < :max_buffer_read;
from v$sqlarea a,v$sqltext b
where a.buffer_gets >= :max_buffer_read
and a.address = b.address
and a.parsing_user_id > 0
order by a.buffer_gets desc, piece asc;
Executions Buffer Reads Statement
---------- ------------ -------------------------------------------------------------
6,300,634 51,221,671 SELECT DISTINCT B.FIRST_NAME || ' ' || B.LAST_NAME
FROM BLOCK_PHYSICIAN A,PHYSICIAN B
WHERE A.PHYSICIAN_FID=B.HYSICIAN_ID AND A.BLOCK_TIME_FID=:b1
75,265 27,227,798 select client_id,greeting_lu,first_name,last_name
,TO_CHAR(start_time, 'MM/DD/YYYY HH:MI AM') start_datetime
,appointment_type_lu, appointment_id , number_attending
,offsite, offsite_location,treatment_info, bgcolor
,TO_CHAR(start_time, 'HH:MI AM') start_display_time,note
where center_code = :1
and start_time between TO_DATE(:2,'MM,DD,YYYY')
and appointment_type_lu in
and (current_status_lu not in
order by start_time,last_name,first_name
that these numbers are those that are emerging since the instance has
We see that the first statement is doing less than 10 buffer reads per execution on an average. Because I know that these tables are in the keep buffers, this makes sense. The second statement is doing about 361 buffer gets per execution. This will be one to investigate and see how this view is joining. Maybe we can make this more efficient.