All About Oracle Latches
What are latches
Latches vs Enqueues
How Latches work
Causes of Contention for specific latches
Reducing Contention for Internal Latches
Measuring Latch Contention
Tuning Applications to avoid Latch contention
Identifying Hot Blocks

What Are Latches?
Latches are serialization mechanisms that protect areas of Oracle’s shared memory (the SGA). In simple terms, latches prevent two processes from simultaneously updating — and possibly corrupting — the same area of the SGA. A latch is a type  of a lock that can be very quickly acquired and freed. Latches are typically used to prevent more than one process from executing the same  piece of  code at  a given time.
Oracle sessions need to update or read from the SGA for almost all database operations. For instance:
• When a session reads a block from disk, it must modify a free block in the buffer cache and adjust the buffer cache LRU chain
• When a session reads a block from the SGA, it will modify the LRU chain.
• When a new SQL statement is parsed, it will be added to the library cache within the SGA.
• As modifications are made to blocks, entries are placed in the redo buffer.
• The database writer periodically writes buffers from the cache to disk (and must update their status from “dirty” to “clean”).
• The redo log writer writes entries from the redo buffer to the redo logs.
Latches prevent any of these operations from colliding and possibly corrupting the SGA.

Latches vs Enqueues
Enqueues are another type of locking mechanism used in Oracle. An enqueue is a more sophisticated mechanism which permits several concurrent processes to have varying degree of sharing of "known" resources. Any object which can be concurrently used, can be protected with enqueues. A good example is of locks on tables. We allow varying levels of sharing on tables e.g. two processes can lock a table in share mode or in share update mode etc. One difference is that the enqueue is obtained using an OS specific locking mechanism. An enqueue allows the user to store a value in the lock, i.e the mode in which we are requesting it. The OS lock manager keeps track of the resources locked. If a process cannot be granted the lock because it is incompatible with the mode requested and the lock is requested with wait, the OS puts the requesting process on a wait queue which is serviced in FIFO. Another difference between latches and enqueues is that in latches there is no ordered queue of waiters like in enqueues. Latch waiters may either use timers to wakeup and retry or spin (only in multiprocessors). Since all waiters are concurrently retrying (depending on the scheduler), anyone might get the latch and conceivably the first one to try might be the last one to get

How Latches Work
Because the duration of operations against memory is very small (typically in the order of nanoseconds) and the frequency of latch requests very high, the latching mechanism needs to be very lightweight.
If the latch is already in use, Oracle can assume that it will not be in use for long, so rather than go into a passive wait (e.g., relinquish the CPU and go to sleep) Oracle will retry the operation a number of times before giving up and going to passive wait. This algorithm is called acquiring a spinlock and the number of “spins” before sleeping is controlled by the Oracle initialization parameter “_spin_count”.
The first time the session fails to acquire the latch by spinning, it will attempt to awaken after 10 milliseconds. Subsequent waits will increase in duration and in extreme circumstances may exceed one second. In a system suffering from intense contention for latches, these waits will have a severe impact on response time and throughput.

Causes of contention for specific latches
If a required latch is busy, the process requesting it spins, tries again and if still not available, spins again. The loop is repeated up to a maximum number of times determined by the initialization parameter _SPIN_COUNT. If after this entire loop, the latch is still not available, the process must yield the CPU and go to sleep. Initially is sleeps for one centisecond. This time is doubled in every subsequent sleep. This causes a slowdown to occur and results in additional CPU usage, until a latch is available. The CPU usage is a consequence of the "spinning" of the process. "Spinning" means that the process continues to look for the availability of the latch after certain intervals of time, during which it sleeps.
The latches that most frequently affect performance are those protecting the buffer cache, areas of the shared pool and the redo buffer.
Library cache and shared pool latches:  These latches protect the library cache in which sharable SQL is stored.  In a well defined application there should be little or no contention for these latches, but in an application that uses literals instead of bind variables (for instance “WHERE surname=’HARRISON’” rather that “WHERE surname=:surname”, library cache contention is common.  
Redo copy/redo allocation latches: These latches protect the redo log buffer, which buffers entries made to the redo log. Recent improvements (from Oracle 7.3 onwards) have reduced the frequency and severity of contention for these latches.
Cache buffers chain latches (Row cache objects latch, Library cache latch, Shared pool latch): These latches are held when sessions read or write to buffers in the buffer cache.   There are typically a very large number of these latches each of which protects only a handful of blocks.  Contention on these latches is typically caused by concurrent access to a very “hot” block and the most common type of such a hot block is an index root or branch block (since any index based query must access the root block).  

Reducing contention for internal latches
We can reduce contention for these latches and tune them by adjusting certain init.ora parameters.

Cache buffer chain latch:
Contention in this latch might be related with the Buffer cache size, but it might be present due to a "hot block" (meaning a block highly accessed). Before incrementing the parameter DB_BLOCK_BUFFERS check that specific blocks are not causing the contention avoiding memory wasting.

Cache buffers LRU chain latch:
Multiple Buffer pools and adjusting the parameter DB_BLOCK_LRU_LATCHES to have multiple LRU latches will help on reducing latch contention.

Redo Allocation Latch:
Contention for this latch in Oracle7 can be reduced by decreasing the value of LOG_SMALL_ENTRY_MAX_SIZE on multi-cpu systems to force the use of the redo copy latch. In Oracle8i this parameter is obsolete, so you need to consider to increase the size of the LOG_BUFFER or reduce the load of the log buffer using NOLOGGING features when possible.

Redo copy latch:
This latch is waited for on both single and multi-cpu systems. On multi-cpu systems, contention can be reduced by increasing the value of LOG_SIMULTANEOUS_COPIES (Hidden in Oracle8i) and/or increasing LOG_ENTRY_PREBUILD_THRESHOLD (undocumented in Oracle7).

Row cache objects latch:
In order to reduce contention for this latch, we need to tune the data dictionary cache. In Oracle7 this basically means increasing the size of the shared pool (SHARED_POOL_SIZE) as the dictionary cache is a part of the shared pool.

Library cache and Shared pool latches
The first resource to reduce contention on this latch is to ensure that the application is reusing as mush as possible SQL statement representation. If the application is already tuned the SHARED_POOL_SIZE can be increased. Be aware that if the application is not using appropriately the library cache the contention might be worst with a larger structure to be handled.

Measuring Latch Contention

A - Ratio-based Techniques (old wrong way)
We see that for each latch, the number of gets (requests for the latch), misses (number of times the first request fails) and sleeps (number of times a session failed to obtain a latch by spinning) are recorded. In the past, queries such as the following were often used to determine latch health:

select substr(name,1,45) name, gets, misses, misses*100/gets misspct
from v$latch
where gets > 0
order by 4 desc;

NAME                                     GETS      MISSES MISSPCT
--------------------------------- ----------- ----------- -------
latch wait list                            32           1    3.13
process allocation                         28           0     .00
session allocation                  1,223,068          84     .01
session switching                       8,009           0     .00
process group creation                     40           0     .00
session idle bit                    2,426,940           1     .00
shared java pool                        1,188           0     .00
event group latch                          28           0     .00
messages                            2,128,851         461     .02
enqueues                            3,168,279           7     .00
enqueue hash chains                 1,747,312           2     .00
channel handle pool latch                  40           0     .00
checkpoint queue latch              1,459,929          26     .00
cache buffers chains               13,851,179      16,254     .12

This approach was flawed on a number of levels:
• It is actually the number of sleeps that most accurately influences the impact of the latch contention on response time.
• A high miss rate is expected for certain latches.
• A latch with a high miss rate (or sleep rate) that is not frequently accessed is probably not impacting performance.
• Even if a latch is experiencing a high sleep rate, we can’t determine the impact on performance without taking into account waits for other resources. So if sessions are waiting 90% for IO, 8% for CPU and 2% for latch, expending effort on halving the latch sleep wait only provides a 1% improvement in response time – probably not noticeable.

In the above example the “latch wait list” latch has the highest miss rate. However, this is totally irrelevant since it was only requested 26 times, while the “cache buffer chains” latch appears to have only a moderate miss rate, but has been requested almost three million times and — as we shall see — is the latch most affecting performance.

B - Wait interface-based techniques (better way)
A better approach to estimating the impact of latch contention is to consider the relative amount of time being spent waiting for latches. The following query gives us some indication of this:
SELECT substr(event,1,50) event,
       round(time_waited*100/ SUM (time_waited) OVER(),2) wait_pct
  FROM (SELECT event, time_waited
         FROM v$system_event
         WHERE event NOT IN
                  ('Null event',
                   'client message',
                   'rdbms ipc reply',
                   'smon timer',
                   'rdbms ipc message',
                   'PX Idle Wait',
                   'PL/SQL lock timer',
                   'file open',
                   'pmon timer',
                   'WMON goes to sleep',
                   'virtual circuit status',
                   'dispatcher timer',
                   'SQL*Net message from client',
                   'parallel query dequeue wait',
                   'pipe get'
              FROM v$sysstat
              WHERE NAME LIKE 'CPU used when call started'))

VENT                          TIME_WAITED WAIT_PCT
------------------------------ ----------- ----------
latch free                           40144     31.67
CPU used when call started           30341     23.94
control file sequential read         12341      9.74
direct path read                     11933      9.41
control file parallel write           6487      5.12
file identify                         5666      4.47
log file sync                         3492      2.75
log file parallel write               3213      2.53
instance state change                 3064      2.42
log file switch completion            3049      2.41
db file sequential read               2290      1.81

Now we can look at the sleeps in v$latch to determine which latches are likely to be contributing most to this problem:

select substr(name,1,35) name, gets, sleeps,
       sleeps*100/sum(sleeps) over() sleep_pct, sleeps*100/gets sleep_rate
from v$latch
where gets > 0
order by sleeps desc;

NAME                                 GETS        SLEEPS SLEEP_PCT SLEEP_RATE
------------------------------ ----------- ------------ --------- ----------
cache buffers chains            13,863,552       38,071     99.48      .2746
session allocation               1,223,982          110       .29      .0090
checkpoint queue latch           1,461,039           39       .10      .0027
library cache                    9,239,751           22       .06      .0002
shared pool                        869,652           16       .04      .0018
messages                         2,130,515            6       .02      .0003
redo writing                     1,330,987            6       .02      .0005
latch wait list                         33            0       .00      .0000
session switching                    8,014            0       .00      .0000
session idle bit                 2,428,851            0       .00      .0000
enqueues                         3,171,018            0       .00      .0000
channel handle pool latch               40            0       .00      .0000
message pool operations parent           3            0       .00      .0000

Now we are in a position to make some reasonable conclusions:
• Latch sleeps contribute to about 30% of database response time (very excessive),
• It’s the cache buffers chains latches that contributes to the vast majority of these waits.
Note that if we had used the conventional “ratio based” analysis outlined in the previous section we would have discounted cache buffers chains latches as a problem because the miss rate was “only” 0.15%.

Tuning the Application to Avoid Latch Contention
There are some things we can do within our application design that can reduce contention for latches.

Using Bind Variables
As noted earlier, failure to use bind variables within an application is the major cause of library cache latch contention. All Oracle applications should make use of bind variables whenever possible.
However, all is not lost if you are unable to modify your application code. From 8.1.6 onwards you can use the “CURSOR_SHARING” parameter to cause Oracle to modify SQL on the fly to use bind variables. A setting of FORCE causes all literals to be converted to bind variables. A setting of SIMILAR causes statements to be rewritten only if it would not cause the statements execution plan today (which can happen if there are histogram statistics defined on a column referenced in the WHERE clause). The default value for this option is EXACT.

Avoiding Hot Blocks
Cache buffers chains latch contention is one of the most intractable types of latch contention. There are a couple of things you can do at the application level to reduce the severity of this type of contention.
Firstly, identify the blocks that are “hot.” Metalink note 163424.1, “How to Identify a Hot Block Within The Database” describes how to do this. Having identified the identity of the hot block, you will most likely find that it is an index root or branch block. If this is the case, there are two application design changes that may help.
1) Consider partitioning the table and using local indexes. This might allow you to spread the heat amongst multiple indexes (you will probably want to use a hash partition to ensure an even spread of load amongst the partitions).
2) Consider converting the table to a hash cluster keyed on the columns of the index. This allows the index to be bypassed completely and may also result in some other performance improvements. However, hash clusters are suitable only for tables of relatively static size, and determining an optimal setting for the SIZE and HASHKEYS storage parameters are essential.

Investigating _spin_count
Prior to Oracle 8.1, the spin count parameter (_spin_count or latch_spin_count) was a documented parameter and many DBAs attempted to adjust it to resolve latch contention. However, as of Oracle8i the parameter is “undocumented” (e.g., does not appear in v$parameter and is not documented in the Oracle reference manual). Why did Oracle do this?
The official Oracle Corporate line is that the value of _spin_count is correct for almost all systems and that adjusting it can cause degraded performance. For instance, Metalink Note:30832.1 says: “If a system is not tight on CPU resource _spin_count can be left at higher values but anything above 2000 is unlikely to be of any benefit.” However, I believe that higher values of _spin_count can relieve latch contention in many circumstances and I think Oracle depreciated the parameter incorrectly.

Identifying HOT BLocks
Metalink Note:163424.1
How to identify blocks which cause latch contention on the 'cache buffers chains' latch.
How to identify a hot block within the database buffer cache

Possible hot blocks in the buffer cache normally can be identified by a high or rapid increasing wait count on the CACHE BUFFERS CHAINS latch.
This latch is acquired  when searching for data blocks cached in the buffer cache. Since the Buffer cache is implemented as a sum of chains of blocks, each of those chains is protected by a child of this latch when needs to be scanned. Contention in this latch can be caused by very heavy access to a single block. This can require the application to be reviewed.

By examining the waits on this latch, information about the segment and the specific block can be obtained using the following queries.

First determine which latch id(ADDR) are interesting by examining the number of  sleeps for this latch. The higher the sleep count, the more interesting the  latch id(ADDR) is:

select CHILD#  "cCHILD", ADDR    "sADDR",     
       GETS    "sGETS" , MISSES  "sMISSES",
       SLEEPS  "sSLEEPS"
from v$latch_children
where name = 'cache buffers chains'
order by 4, 1, 2, 3;

Run the above query a few times to to establish the id(ADDR) that has the most consistent amount of sleeps. Once the id(ADDR) with the highest sleep count is found then this latch address can be used to get more details about the blocks currently in the buffer cache protected by this latch. The query below should be run just after determining the ADDR with the highest sleep count.

column segment_name format a35
select /*+ RULE */
  e.owner ||'.'|| e.segment_name  segment_name,
  e.extent_id  extent#,
  x.dbablk - e.block_id + 1  block#,
  sys.v$latch_children  l,
  sys.x$bh  x,
  sys.dba_extents  e
  x.hladdr  = 'ADDR' and
  e.file_id = x.file# and
  x.hladdr = l.addr and
  x.dbablk between e.block_id and e.block_id + e.blocks -1
  order by x.tch desc ;

SEGMENT_NAME               EXTENT#       BLOCK#    TCH     CHILD#
--------------------- ------------ ------------ ------ ----------
SCOTT.EMP_PK                     5          474     17      7,668
SCOTT.EMP                        1          449      2      7,668

Depending on the TCH column (The number of times the block is hit by a SQL statement), you can identify a hotblock. The higher the value of the TCH column,  the more frequent the block is accessed by SQL statements.

In order to reduce contention for this object the following mechanisms can be put in place:

   1)Examine the application to see if the execution of certain DML and SELECT statements can be reorganized to eliminate contention on the object.
   2)Decrease the buffer cache -although this may only help in a small amount of cases.
   3)DBWR throughput may have a factor in this as well. If using multiple DBWR's then increase the number of DBWR's
   4)Increase the PCTUSED / PCTFREE for the table storage parameters via ALTER TABLE or rebuild. This will result in less rows per block.
   5)Consider implementing reverse key indexes (if range scans aren't commonly used against the segment)