Blog dedicated to Oracle Applications (E-Business Suite) Technology; covers Apps Architecture, Administration and third party bolt-ons to Apps

Saturday, February 21, 2009

AppsLocalLogin.jsp page takes forever to appear, browser shows hourglass

After upgrading to JDK 1.6.0_11 on both of our Extranet application tiers, we could not reach the AppsLocalLogin.jsp page. On invoking /oa_servlets/AppsLogin or /OA_HTML/AppsLocalLogin.jsp, the browser would show an hourglass indefinitely, with the status bar message "Waiting for extranet.justanexample.com". There were no errors in the Apache or JServ logs. We ruled out load balancer and network issues because we could reach the RapidInstall page after commenting out url_fw.conf in httpd.conf.

After trying a lot of things, I recalled a similar problem from 2005, when 11.5.10 was newly released. In jserv.properties, we set wrapper.bin.parameters=-DLONG_RUNNING_JVM=false. This resolved the issue.

The context file variable s_long_running_jvm controls the value of -DLONG_RUNNING_JVM in jserv.properties. If you want this to be permanent, then change the value of s_long_running_jvm in your context file.
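Before changing anything, it can help to confirm which cache-related flags are actually active in jserv.properties. The following is a minimal sketch in Python, not an Oracle-supplied tool: the function name jvm_flags and the sample fragment are made up for illustration, and it assumes the flags appear on uncommented wrapper.bin.parameters lines as shown in this post.

```python
import re


def jvm_flags(properties_text):
    """Collect -DNAME=value flags from active wrapper.bin.parameters lines."""
    flags = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip comments and unrelated properties.
        if line.startswith("#") or not line.startswith("wrapper.bin.parameters="):
            continue
        value = line.split("=", 1)[1]
        for name, val in re.findall(r"-D([\w.]+)=(\S+)", value):
            flags[name] = val
    return flags


# Illustrative fragment only; a real jserv.properties has many more lines.
sample = """\
# sample jserv.properties fragment
wrapper.bin.parameters=-DLONG_RUNNING_JVM=false
#wrapper.bin.parameters=-DCACHEMODE=DISTRIBUTED
"""

print(jvm_flags(sample))
```

In practice you would read the real file from the application tier and compare the reported LONG_RUNNING_JVM value with s_long_running_jvm in the context file.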

I dug up the TAR from my old notes and got this:

Q1) What is the effect of setting

a)wrapper.bin.parameters=-DLONG_RUNNING_JVM=false
           OR
b)wrapper.bin.parameters=-DCACHEMODE=LOCAL in jserv.properties?

Q2) Why did this resolve the NoClassDefFound or Internal Server Error? Is this the permanent solution?

Solution

A1) Distributed Caching was introduced in Framework 11.5.10. From Note 275879.1, Oracle Applications Java Caching Framework Developer's Guide Release 11i (11.5.10):

The setting of the Distributed Mode option is optional. This is the default configuration of the caching framework. LONG_RUNNING_JVM=true is set, and it ensures that the caching framework runs in distributed mode. For backward compatibility the same can be ensured by setting -DCACHEMODE=DISTRIBUTED. The above settings determine whether the updates and invalidations to the objects in the Component Cache are distributed across other JVMs. This allows Component Caches where the data updates need to be seen instantaneously across the JVMs. If the flag is not checked, then updates to the data in the same JVM are seen right away, but updates are not seen in other JVMs until the "Time to Live" or "Idle Time" expires, if the other JVMs happen to cache the same object. Distributed mode has network overhead and there are some additional steps required to enable this mode for the JVM.
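To make the difference between the two modes concrete, here is a toy simulation in Python. It is only a sketch of the behavior the note describes, not the Java Object Cache implementation: ToyCache, stale_read and the 60-second "Time to Live" are all invented for illustration.

```python
class ToyCache:
    """One JVM's component cache: key -> (value, expiry time)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}
        self.peers = []  # other JVMs' caches, notified only in distributed mode

    def get(self, key, now, source):
        entry = self.entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # still within Time to Live: serve the cached copy
        value = source[key]  # expired or missing: reload from the source
        self.entries[key] = (value, now + self.ttl)
        return value

    def update(self, key, value, now, source, distributed):
        source[key] = value
        self.entries[key] = (value, now + self.ttl)
        if distributed:
            for peer in self.peers:
                peer.entries.pop(key, None)  # invalidate the remote copies


def stale_read(distributed):
    """Return what a second JVM sees shortly after, and long after, an update."""
    source = {"profile": "old"}
    jvm1, jvm2 = ToyCache(ttl=60), ToyCache(ttl=60)
    jvm1.peers.append(jvm2)
    jvm2.get("profile", now=0, source=source)  # jvm2 caches "old"
    jvm1.update("profile", "new", now=1, source=source, distributed=distributed)
    soon = jvm2.get("profile", now=2, source=source)
    later = jvm2.get("profile", now=61, source=source)
    return soon, later


print(stale_read(distributed=False))  # ('old', 'new')
print(stale_read(distributed=True))   # ('new', 'new')
```

In local mode the second JVM keeps serving the old value until its entry expires; in distributed mode the invalidation reaches it immediately, at the cost of the extra cross-JVM traffic the note mentions.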

Although turning off Distributed Caching in 11.5.10 will defeat Cache Invalidation, there have been known issues when trying to implement Distributed Caching in E-Business Suite. Until properly patched, Distributed Caching has caused problems with performance and/or rampant NoClassDefFound errors and/or other strange behaviors.

In summary

Cache Invalidation is a feature where some of the middle tier Java Caches for critical reference data are kept in sync when the reference information is changed. This feature was developed for Function/Data Security and Profile Option values. In 11.5.10 the Distributed Caching feature is used to propagate the changes to all "long running" JVMs. If one turns off Distributed Caching, then reference data changes will not be propagated out and one will be forced to bounce the middle tiers to pick up any changes, as done previously.
A2) The caches were out of sync between the load-balanced web servers, and with the additional network overhead, ajp12 (Apache/JServ connection protocol) errors started to appear. Once the first server started hanging, the problem propagated to the next server, and the next, faster and faster until all the servers eventually hung. Forcing the cache to reside locally reduced the network traffic and removed the need to keep all the caches in sync with one another.

This is not the permanent solution but a test to confirm whether the issue is caused by Java Object Caching (JOC).
Distributed Caching issues tend to be very sporadic across different customers, because they surface depending on which product patches try to take advantage of distributed caching or implement new long-running queries.

If this is an issue with JOC, then the best suggestion is to make sure the latest JOC patches (or the patches that supersede them) have been applied:
Patch 5639951 REHOST ORACLE JAVA OBJECT CACHE (FORMERLY OCS4J) FOR ORACLE APPLICATIONS 11i (Present in Foxtrot)
Patch 5455628 CACHE DIAGNOSTIC ARU ON TOP OF 11.5.10.3RUP (Not present in Foxtrot)
Patch 6047864 REHOST JOC FIXES (BASED ON JOC 10.1.2.2) FOR APPS 11i  (Not present in Foxtrot)
Also make sure that the profile option Self Service Personal Home Page Mode is set to Framework Only, as this was written to be used with the new Framework code and not with the old OSSWA code that is being phased out.
Note: Disabling Distributed Java Caching

1.  Because of the ATG_PF.H Rollup 3 (RUP 3) Patch 4334965, turning off the Java Object Cache (JOC) by setting
    "-DLONG_RUNNING_JVM=false" still calls underlying Apps JOC code.

To disable Java Cache completely follow these steps:

Edit the IAS_ORACLE_HOME/Apache/Jserv/etc/jserv.properties

a) Set  wrapper.bin.parameters=-DLONG_RUNNING_JVM=false

b) Add the line  wrapper.bin.parameters=-DCACHEMODE=LOCAL

c) Restart Apache for these changes to take effect

If the customer's problem stops reproducing after these steps, then the issue is certainly related to JOC.
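Steps a) and b) above can be sketched as a small text transformation. This Python snippet is an illustration under assumptions, not a supported procedure: the function name disable_joc is made up, it operates on the file contents as a string, and a manual edit like this may be overwritten the next time AutoConfig regenerates jserv.properties. Step c), restarting Apache, still has to be done separately.

```python
import re


def disable_joc(properties_text):
    """Set LONG_RUNNING_JVM=false and CACHEMODE=LOCAL in jserv.properties text."""
    out = []
    has_long_running = False
    has_cachemode = False
    for line in properties_text.splitlines():
        if line.lstrip().startswith("wrapper.bin.parameters="):
            if "-DLONG_RUNNING_JVM=" in line:
                # Step a: force the existing flag to false.
                line = re.sub(r"-DLONG_RUNNING_JVM=\S+", "-DLONG_RUNNING_JVM=false", line)
                has_long_running = True
            if "-DCACHEMODE=" in line:
                # Step b, when the flag already exists: force it to LOCAL.
                line = re.sub(r"-DCACHEMODE=\S+", "-DCACHEMODE=LOCAL", line)
                has_cachemode = True
        out.append(line)
    # Add whichever flag was missing as its own wrapper.bin.parameters line.
    if not has_long_running:
        out.append("wrapper.bin.parameters=-DLONG_RUNNING_JVM=false")
    if not has_cachemode:
        out.append("wrapper.bin.parameters=-DCACHEMODE=LOCAL")
    return "\n".join(out) + "\n"


before = "wrapper.bin.parameters=-DLONG_RUNNING_JVM=true\n"
print(disable_joc(before))
```

Take a backup of jserv.properties before writing the result back to the file.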

2. Alternatively, on an ATG RUP4 (Patch 4676589) environment, to confirm whether it's a JOC-related issue or not:

Completely REMOVE the system properties -DLONG_RUNNING_JVM and -DCACHEMODE from jserv.properties (either comment out these two system properties or remove them completely, but do NOT set them to false and LOCAL respectively). This allows the cache to run in JOC local mode (not in JOC distributed mode). If the problem goes away, then it is a JOC distributed caching issue.
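For the RUP4 variant, the two flags are removed rather than set. A minimal Python sketch of that edit follows; the name strip_cache_flags is hypothetical, and the choice to keep any other flags that share a line is my own assumption about how a real jserv.properties may be laid out.

```python
import re

# Matches either cache flag, together with any whitespace before it.
CACHE_FLAG = re.compile(r"\s*-D(?:LONG_RUNNING_JVM|CACHEMODE)=\S+")


def strip_cache_flags(properties_text):
    """Remove -DLONG_RUNNING_JVM and -DCACHEMODE from active wrapper.bin.parameters lines."""
    out = []
    for line in properties_text.splitlines():
        if line.lstrip().startswith("wrapper.bin.parameters=") and CACHE_FLAG.search(line):
            key, value = line.split("=", 1)
            remaining = CACHE_FLAG.sub("", value).strip()
            # Keep other flags on the line; comment the line out if nothing is left.
            line = f"{key}={remaining}" if remaining else "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


before = (
    "wrapper.bin.parameters=-Xmx512M -DLONG_RUNNING_JVM=true\n"
    "wrapper.bin.parameters=-DCACHEMODE=DISTRIBUTED\n"
)
print(strip_cache_flags(before))
```

As with the other edits, Apache needs a restart afterwards, and AutoConfig may put the flags back unless the context file is changed as well.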

4 comments:

Raptor Engine said...

Did you try any third-party middle-tier caching solution like NCache? It might save you a lot of time.

Vikram Das said...

Hi Raptor,

This is a bug in E-Business Suite and I am not sure using a third party solution would be a feasible detour.

- Vikram

Shyam Enuganti said...

Your comments were helpful in resolving my issue

Later on I found the real culprit: the issue was fixed by changing the port number in the "s_java_object_cache_port" tag and running AutoConfig.

Anonymous said...

Hi Vikram,

We've set up a test site for our DR testing.

In that site, we faced this issue of AppsLocalLogin.jsp taking way too long to open up the login page.

I monitored the access.log and from the time it opened the main page (aplogon.html) it took around 10 minutes to bring up the login page.

Even after logging on and off a few times, it continues.

I understand that, very first time, it may take some time to cache the jsps, but is it the expected behavior every time ?

In our regular production site, this is not the case, though. It brings up the login page within a few seconds.

Thank you
Kumar