Blog dedicated to Oracle Applications (E-Business Suite) Technology; covers Apps Architecture, Administration and third party bolt-ons to Apps

Thursday, July 19, 2007

E-biz instance slow

Sometime back, Jimmy reported that the E-biz instance which had gone live today was seeming slow. These were the symptoms:

1. If you launched a forms session from self service it would not launch and display this on status bar:

waiting for http://justanexample.com/OA_HTML/....

2. We had blocking locks where the blocking program was FNDSCAUS and waiting program were self service OA Framework pages

3. Alert log stopped showing new entries after a new tablespace was added for installation of STATSPACK:

Tue Jul 10 19:04:53 2007
create tablespace PERFSTAT_TBLSPC
datafile '$ORACLE_HOME/dbdata/perfstat01.dbf' size 500M
Tue Jul 10 19:05:15 2007
Completed: create tablespace PERFSTAT_TBLSPC
datafile '/

4. SQL> select object_name from dba_objects
2 where object_id in (select object_id from v$locked_object);

OBJECT_NAME
--------------------------------------------------------------------------------
AQ$_WF_CONTROL_H
AQ$_WF_CONTROL_I
AQ$_WF_CONTROL_T
FND_CONCURRENT_PROCESSES
FND_CONCURRENT_QUEUES
FND_SVC_COMPONENTS
PA_COMMITMENT_TXNS_TMP
WF_JAVA_DEFERRED
AQ$_WF_JAVA_DEFERRED_T
AQ$_WF_JAVA_DEFERRED_H
AQ$_WF_JAVA_DEFERRED_I

OBJECT_NAME
--------------------------------------------------------------------------------
AQ$_FND_CP_GSM_OPP_AQTBL_T
AQ$_FND_CP_GSM_OPP_AQTBL_H
AQ$_FND_CP_GSM_OPP_AQTBL_I
AP_WEB_PROXY_ASSIGNMENTS
GEPS_TCARD_EXT_TEMP

16 rows selected.

5. We had waits on these events:
90 library cache lock
10 library cache pin

6. adapcctl.sh stop command went into hang state.

After about 2 hours, everything seemed to come back to life suddenly. The hanging adapcctl.sh stop command succeeded, The waiting forms session launched itself, blocking locks disappeared and after a 2 hour gap, entries started appearing in the alert log:

Tue Jul 10 19:05:15 2007
Completed: create tablespace PERFSTAT_TBLSPC
datafile '/
Tue Jul 10 20:58:08 2007
Beginning log switch checkpoint up to RBA [0x2f8.2.10], SCN: 0x08f3.2555e155
Thread 1 advanced to log sequence 760

Research on metalink revealed note 287059.1, which describes an instance hang situation when there is a library cache pin/lock pile, which is similar to our situation.


Subject: Library Cache Pin/Lock Pile Up hangs the application

Note:287059.1 Type: PROBLEM
Last Revision Date: 09-MAY-2007 Status: PUBLISHED

The information in this document applies to:
Oracle Server - Enterprise Edition - Version: <8.1.7.4 to 10.2.0.0
This problem can occur on any platform.
Symptoms
SystemState Dump shows 2 session on a library cache pin deadlock or a single session on a "library cache pin" self deadlock.
The Call stack Trace of one of the session shows the following pattern :

kqdgtc -> ... -> kkscls -> kxstcls -> ... -> kglpin

The sessions are doing a recursive operation.

Here is an example of the self deadlock.
It is requesting a Pin in S mode on handle 5eb723ec and at the same time it has the already acquired the pin in X mode.

----------------------------------------
SO: 0x6614b2ec, type: 52, owner: 0x54fc1324, flag: INIT/-/-/0x00
LIBRARY OBJECT PIN: pin=6614b2ec handle=5eb723ec request=S lock=8c04b694
user=54fc1324 session=54fc1324 count=0 mask=0000 savepoint=558512 flags=[00]
----------------------------------------
SO: 0x78093280, type: 52, owner: 0x54fc1324, flag: INIT/-/-/0x00
LIBRARY OBJECT PIN: pin=78093280 handle=5eb723ec mode=X lock=804ad2e4
user=54fc1324 session=54fc1324 count=0 mask=0041 savepoint=558293 flags=[00]
----------------------------------------
Changes
TIMED_STATISTICS=TRUE
Cause
Bug 3661076 INSTANCE HANG DUE TO DEADLOCK WAITING ON LIBRARY CACHE PIN

A session is in need of a row cache cursor from the LRU and it could pick any dictionary cursor to map which it has previously used.
Now consider a second process (or session) concurrently doing some different operation parsing a different row cache cursor but needing to close one from its LRU could end up deadlocked with a similar stack segment.

The X mode when closing the cursor is needed when updating the statistics of the cursor generated while having some tracing enabled like (but not limited to) TIMED_STATISTICS=TRUE, or SQL_TRACE.
The deadlock and the pile up will not happen until there is a request in X mode.

Some of the reasons to close a row cache cursor are

- Lots of Recursive operations
- Shared Pool too small
- runs out of cache

There is currently a limitation to detect deadlock at the row cache level.
Fix
Identify the blocking session and get 3 errorstack dumps to confirm.

SELECT s.sid, kglpnmod "Mode", kglpnreq "Req", SPID "OS Process"
FROM v$session_wait w, x$kglpn p, v$session s ,v$process o
WHERE p.kglpnuse=s.saddr
AND kglpnhdl=w.p1raw
and w.event like '%library cache pin%'
and s.paddr=o.addr
/


To free up the instance kill the blocking session at the OS level.

To prevent it from happening do any, some or all of the following :

- set TIMED_STATISTICS=FALSE
- set _row_cache_cursors=20 or more (10 default)
- don't do any kind of tracing
References
Note 34579.1 - WAITEVENT: "library cache pin" Reference Note
Bug 3661076 - Instance Hang Due To Deadlock Waiting On Library Cache Pin
Bug 3679080 - Deadlock On Library Cache Pin
Note 122793.1 - HOW TO FIND THE SESSION HOLDING A LIBRARY CACHE LOCK


Later on we were informed by Production Support team that this is a known issue with STATSPACK installation, and it should be done only during downtime and not when all apps services are up. Difficult to swallow this, but will go deeper in this one, another day.

No comments: