Najeebuddin: WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=61

Row Cache Lock Wait

Definition
In order for DDL to execute, it must acquire a row cache lock to lock the Data Dictionary information. The shared pool contains a cache of rows from the data dictionary that helps reduce physical I/O to the data dictionary tables and allows locking of individual data dictionary rows. The locks on the data dictionary rows are called row cache enqueue locks. The enqueue lock structures are allocated from the shared pool as needed but when these requests wait and time out is when we see the row cache lock wait.

Analysis
Each row cache lock will be on a specific data dictionary object. This is called the enqueue type and can be found in the v$rowcache view. In this sample select from v$rowcache you can find the enqueue types and the type of activity being performed within the dictionary cache.

PARAMETER COUNT GETS GETMISSES MODIFICATIONS
--------------------- ----- ------------ ---------- -------------
dc_free_extents 0 0 0 0
dc_used_extents 0 0 0 0
dc_segments 5927 131379921 4142831 693734
dc_tablespaces 22 188609668 2436 0
dc_tablespace_quotas 12 22779303 3843 0
dc_files 0 165961 22493 21
dc_users 19 145681559 2078 21
dc_rollback_segments 67 3906307 66 232
dc_objects 1927 70725250 2247804 74803
dc_sequences 4 142714 1599 142714

Common Enqueue Types

The tuning of the row cache lock wait is dependant upon the activity for each of the enqueue types. Of these, the most common are:

DC_SEQUENCES, this row cache lock wait may occur during the use of sequences. Tune by checking sequences to see if they have the cache option specified and if that cache value is reflective of the anticipated simultaneous inserts by the application.
DC_USED_EXTENTS and DC_FREE_EXTENTS, this row cache lock wait may occur during space management operations where tablespaces are fragmented or have inadequate extent sizes. Tune by checking whether tablespaces are fragmented, extent sizes are too small, or tablespaces are managed manually.
DC_TABLESPACES, this row cache lock wait may occur during the allocation of new extents. If extent sizes are set too low the application may frequently request new extents which could cause contention. Tune by checking for rapidly increasing number of extents.
DC_OBJECTS, this row cache lock wait may occur during the recompilation of objects. If object compiles are occurring this can require an exclusive lock which will block other activity. Tune by examining invalid objects and dependencies.

The row cache lock wait event is associated with a specific enqueue type on a data dictionary row. Checking activity within the v$rowcache view is a good place to start for understanding this relationship as tuning can only be accomplished with analysis of the enqueue type. If a trace file is available you may also see the following error of “>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! <<”. Also realize that the row cache lock wait event may appear more frequently when using RAC. This is because the library cache and the row cache are global in RAC—causing the row cache lock wait to be more pronounced.

Trace File Dump Analysis

Sequence values are cached in instance memory and by default 20 values are cached. As instance cache is transient, loss of an instance can result in loss of cached sequence values. Permanent record of highest possible value from any instance is kept track in SEQ$ table. SQLs accessing this sequence within an instance will access instance SGA. As each instance caches its own sequence values it is highly likely that SQLs accessing this sequence from different instance will create gaps in sequence values. Common knee-jerk reaction to this issue is to set nocache or cache 1 for these sequences. In a single instance environment, this approach will backfire due to massive updates to SEQ$ tables, buffer busy waits, latch free waits etc. In a RAC environment, this issue is magnified and almost hangs the instance. I had the privilege of working with a client to resolve one of their performance issues. Problem Instances were hung. It was not possible to login to the database. Many existing connections are working fine though. We were lucky enough that one of the DBAs had active connection to the database. So, we took systemstate dump from that session to see whey there is an hang (or slowness). Alert log was printing following lines at this time frame. So, we know that there is a problem with row cache enqueue. Wed Feb 18 06:31:31 2008 >>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=45
Wed Feb 18 08:59:31 2008
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=44
Wed Feb 18 08:59:31 2008
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=43
Wed Feb 18 09:01:00 2008
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=46
Wed Feb 18 09:01:00 2008
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=27
systemstate dump

We took a systemstate dump with following command.

alter session set events 'immediate trace name systemstate level 4';

I won’t bore you with all the details. But let us review just necessary sections of the systemstate dump file. We know that processes were waiting for row cache enqueue and so I searched for row cache enqueue in the trace file. From the trace file, we see that many sessions are holding NULL mode row cache enqueue and requesting X mode.

row cache enqueue: session: 6126dfea0, mode: N, request: X
row cache parent object: address=65f5a2ce8 cid=13(dc_sequences) <<<<<< hash=5a31f94b typ=9 transaction=0 flags=00000000 own=65f5a2dc8[65f5a2dc8,65f5a2dc8] wat=65f5a2dd8[63c841c18,63c841e48] mode=N status=-/-/-/-/-/-/-/-/- request=X release=FALSE flags=2 instance lock id=QN 0010b020 00000000 <<<<<<< data= 0010b020 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 Of course, cid=13 indicating that it is for a sequence [ dc_sequence is the row cache area with cache_id=13]. It is also visible that row cache enqueue is maped to instance lock with QN as type and managed by Global enqueue Services. Wait is for that row cache enqueue lock protecting dc_sequences. We need to identify specific sequence though. Looking for state objects in the same session shows one sequence. owners: SO: 63e1d04b8, type: 46, owner: 63c29edb8, flag: INIT/-/-/0x00 Sequence State Object: kdnssflg = kdnsshiw = INACTIVE kdnsssqn = 0x64cdeba58 KGL Sequence Object #1093664: <<<<<<<<< kdnsqflg = kqdsnflg = KQDSNNOC kdnsqcus = 0 kdnsqisp INACTIVE Increment = +1 Minvalue = +1 Maxvalue = +999999999999999999999999999 Cachesize = +0 Highwater = +323649926 Nextvalue = ############################### Clearly above section shows that these lines are for sequence with object id 1093664. Querying dba_objects for that object_id shows sequence_name. select owner, object_name , object_type, created from dba_objects where object_id=1093664 SQL> /

OWNER OBJECT_NAME OBJECT_TYPE CREATED
------------- ----------- ------------ ----------
SCOTT STG__SEQ SEQUENCE 12-OCT-07
There is also library cache object accessing that sequence, confirming our analysis.

SO: 640cc9230, type: 51, owner: 6136d45b8, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=640cc9230 handle=64cdf5868 mode=N
call pin=63c28f6f8 session pin=0
htl=640cc92a0[643a01130,643a01130] htb=643a01130
user=6136d45b8 session=6136d45b8 count=1 flags=[00] savepoint=58
LIBRARY OBJECT HANDLE: handle=64cdf5868
name=SELECT SCOTT.STG__SEQ.NEXTVAL FROM DUAL
hash=5dbaf014 timestamp=02-18-2008 08:00:46
namespace=CRSR flags=RON/KGHP/TIM/PN0/SML/[12010000]
kkkk-dddd-llll=0000-0001-0001 lock=N pin=0 latch#=20
lwt=64cdf5898[64cdf5898,64cdf5898] ltm=64cdf58a8[64cdf58a8,64cdf58a8]
pwt=64cdf58c8[64cdf58c8,64cdf58c8] ptm=64cdf5958[64cdf5958,64cdf5958]
ref=64cdf5878[64cdf5878, 64cdf5878] lnd=64cdf5970[64cdb02f8,64ce05748]
LIBRARY OBJECT: object=64cdf4b40
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
Sequence cache

In summary, we know that sessions were waiting for row cache enqueues with QN type. That row cache enqueue was protecting a parent row cache entry for dc_sequences. SO [State Objects] for both sequences and cursors accessing that sequence is also exists in the systemstate dump file.

Let’s query properties of that sequence.

SQL> select * from dba_sequences where sequence_name='STG__SEQ';

SEQUENCE_OWNER SEQUENCE_NAME MIN_VALUE
------------------------------ ------------------------------ ----------
MAX_VALUE INCREMENT_BY C O CACHE_SIZE LAST_NUMBER
---------- ------------ - - ---------- -----------
SCOTT STG__SEQ 1
1.0000E+27 1 N N 0 338585872
Evidently cache size is set to 0. High last_number also indicates that this is an heavily accessed sequence.

Essentially, keeping sequence with nocache in RAC for highly accessed sequences is pretty bad idea. Login to the database accesses sequences too, so if row cache enqueues are not available, then logins will hang. This is why existing connections were working fine, but new logins were simply hung. We are able to pinpoint all these using systemstate dump.

Najeebuddin

Tuesday, March 23, 2010

WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=61

No comments:

Blog Archive

Pages

Projects