Blog dedicated to Oracle Applications (E-Business Suite) Technology; covers Apps Architecture, Administration and third party bolt-ons to Apps

Thursday, February 28, 2008

FND_ATTACHED_DOCUMENTS

FND_ATTACHED_DOCUMENTS is one of the largest tables in terms of number of rows. It contains reference of all attachments from all the iModules like iRecruitment, iSupplier, iSourcing etc. Here's the structure of the table

SQL> desc fnd_attached_documents
Name Null? Type
----------------------------------------- --------
ATTACHED_DOCUMENT_ID NOT NULL NUMBER
DOCUMENT_ID NOT NULL NUMBER
CREATION_DATE NOT NULL DATE
CREATED_BY NOT NULL NUMBER
LAST_UPDATE_DATE NOT NULL DATE
LAST_UPDATED_BY NOT NULL NUMBER
LAST_UPDATE_LOGIN NUMBER
SEQ_NUM NOT NULL NUMBER
ENTITY_NAME NOT NULL VARCHAR2(40)
PK1_VALUE VARCHAR2(100)
PK2_VALUE VARCHAR2(100)
PK3_VALUE VARCHAR2(100)
PK4_VALUE VARCHAR2(150)
PK5_VALUE VARCHAR2(150)
AUTOMATICALLY_ADDED_FLAG NOT NULL VARCHAR2(1)
PROGRAM_APPLICATION_ID NUMBER
PROGRAM_ID NUMBER
PROGRAM_UPDATE_DATE DATE
REQUEST_ID NUMBER
ATTRIBUTE_CATEGORY VARCHAR2(30)
ATTRIBUTE1 VARCHAR2(150)
ATTRIBUTE2 VARCHAR2(150)
ATTRIBUTE3 VARCHAR2(150)
ATTRIBUTE4 VARCHAR2(150)
ATTRIBUTE5 VARCHAR2(150)
ATTRIBUTE6 VARCHAR2(150)
ATTRIBUTE7 VARCHAR2(150)
ATTRIBUTE8 VARCHAR2(150)
ATTRIBUTE9 VARCHAR2(150)
ATTRIBUTE10 VARCHAR2(150)
ATTRIBUTE11 VARCHAR2(150)
ATTRIBUTE12 VARCHAR2(150)
ATTRIBUTE13 VARCHAR2(150)
ATTRIBUTE14 VARCHAR2(150)
ATTRIBUTE15 VARCHAR2(150)
COLUMN1 VARCHAR2(30)
APP_SOURCE_VERSION VARCHAR2(255)
CATEGORY_ID NUMBER
STATUS VARCHAR2(30)

In our largest environment, this table has 12752217 rows. We have had performance issues with patches which try to update this table. A datafix patch specifically developed by Oracle for our environment hangs while executing this statement:


UPDATE fnd_attached_documents fad
SET (created_by, last_updated_by) =
(SELECT nvl(paha.created_by,fad.created_by), nvl(paha.created_by,fad.last_updated_by)
FROM pon_auction_headers_all paha
WHERE fad.pk1_value = paha.auction_header_id
)
WHERE fad.entity_name = 'PON_AUCTION_ITEM_PRICES_ALL'
AND fad.created_by = 1
AND exists
(SELECT '1'
FROM pon_auction_item_prices_all pai,
pon_auction_headers_all paha1
WHERE fad.pk1_value = to_char(pai.auction_header_id)
AND fad.pk2_value = to_char(pai.line_number)
AND paha1.auction_header_id = pai.auction_header_id
AND pai.rowid BETWEEN l_start_rowid AND l_end_rowid);

We have waited for 24 hours before aborting the patch. Oracle is yet to give us a good reason why this query hangs.

Tuesday, February 26, 2008

Forms doesn't launch from self service

This is one of my favorite problems. I have posted about this problem before. The problem exhibited is same but the cause is always different. Here's the latest one I found. In one of the releases last year, I had tried to push Sun JRE client and replace Jinitiator. At that time, Sun JRE client was an early adopter program and was not certified. During testing, the team faced the lost focus problem and there was so much hue and cry that it was abandoned. In 2008, a new release and a new clone, I am told about a problem where Jinitiator doesn't launch on clicking any of the forms based options. Strangely it works on some PCs and doesn't work on others. It works on all DBA's PCs but doesn't work on the PCs of the people who have to do the actual testing. One common thing I noticed is that on the PCs where this was working, it was working with IE but not with Firefox. Launching with Firefox would give the message, you require to install Sun JRE 1.5.0_07. After almost half a day of trying out different stuff, Akhilesh suggested that we check the appsweb.cfg file and found that it had the plugin=jdk instead of plugin=jinit. Once this was changed to plugin=jinit and Apache was bounced, it started working for everyone. And yes it started working in Firefox browser too. A quick comparison of the context files revealed the variable s_sun_plugin whose value was set to jinit in a working instance and set to jdk in case of this instance.

Sametime meeting doesn't launch with browser error code -11

Sametime is an instant messenger from IBM/Lotus which is used in many organizations as the chat application on the intranet.

Akhilesh pinged me today as he was unable to launch sametime meeting. Sametime would error out with browser error code -11. I googled for this and found the answer in Sametime forums. The problem occurs if you have changed your default browser from Internet Explorer to something else, say Firefox. Since older versions of sametime do not work with firefox browser, they complain about it by throwing browser error code -11. If you reset your default browser as Internet Explorer, then the problem is resolved. To do this, open Internet Explorer - Tools menu - Internet Options - Programs tab. Here you'll find options to make IE the default browser. In IE7 you have a Make Default button and in IE6 there is a checkbox for checking whether IE is the default browser every time it is started. Use these options to make IE the default browser. Try launching the sametime meeting after you have made IE the default browser.

Monday, February 25, 2008

Workflow mails not moving after a fresh clone

After a fresh clone, notification mails sent through Workflow Notification Mailer were not going out. These were the steps taken to fix this issue:

a. Shut down the workflow listeners.

b. Follow the Note 372933.1 and recreate the wf_notification_out queue as follows:

sqlplus apps/ @$FND_TOP/patch/115/sql/wfntfqup.sql APPS
APPLSYS

c. Start up the workflow listeners.

The note 372933.1 describes How to purge e-mail notifications from the workflow queue so the e-mail is not sent.

I found another metalink note 432663.1 (Best Practice for Enabling Email Events in an Oracle EBS Testing Environment) authored by an Oracle customer which recommends:

Infrastructure Setups (One-time)

1. Created a generic email account TestOracle@yourdomain.com. In our case, we use Microsoft Exchange/Outlook with Active Directory.

2. Set the permissions of the INBOX folder of the generic email account to allow a development team distribution list (DL) to have OWNER privilege.

a. It's easier to use a distribution list to open up and manage access to the INBOX folder. This is an added benefit as it exposes the whole team to email activity generated during testing.

b. The OWNER privilege will allow any test team member of the distribution list to read and reply to Oracle generated email notifications.

3. If running Purchasing, we take the additional steps of setting up these default usernames in the system. By doing this in production, we establish default testing accounts. In production these accounts are disabled and re-enabled only in our testing environments. How to structure these accounts as people within the Purchasing Hierarchy is beyond the scope of this document. Listed below are some suggested people and levels:

TEST_APPROVER I
TEST_APPROVER II
TEST_APPROVER III
TEST_BUYER
TEST_REQ

Post Cloning Setups (After Each Clone)

There are numerous post cloning activities that are performed on an instance taken from the production copy. Below are the specific steps we used to enable email events and the use of a generic email account.

1. Update wf_notifications

set status = 'CLOSED',
mail_status = 'SENT',
end_date = sysdate;

2. Update fnd_user

set email_address = 'TestOracle@yourdomain.com ';

3. Update per_people_f

set email_address = 'TestOracle@yourdomain.com ';

4. Update po_vendor_sites_all =

email_address = 'TestOracle@yourdomain.com',
remittance_email = 'TestOracle@yourdomain.com';

5. Purge Workflow data :

Run the "Purge Obsolete Workflow Runtime Data" concurrent program

6. If test usernames are implemented, re-enable these accounts with the following code snippet:

 UPDATE apps.fnd_user
SET last_update_date = SYSDATE, -- set update date
last_updated_by = -1, -- set to ANONYMOUS
password_accesses_left = NULL,
password_lifespan_accesses = NULL,
password_lifespan_days = NULL,
end_date = NULL -- activate user account
WHERE user_name = 'TEST_APPROVER I'
OR user_name = 'TEST_APPROVER II'
OR user_name = 'TEST_APPROVER III'
OR user_name = 'TEST_BUYER'
OR user_name = 'TEST_REQ'

7. Alerts can be selectively disabled or changed based on your preferences and the type of alert.

Using the generic email account

Based on the version of Microsoft Outlook, the command to open another user's folder might be different.

File > Open > Other User's Folder

Summary

These simple steps allow us full use of email services in our testing environments and provide our developers with testing accounts for use in Requisition/PO creation, approval and generation.

I am pretty sure if the DBAs had purged the workflow data and cleaned up workflow tables as suggested above, this problem would not have occurred.


Thursday, February 21, 2008

Reverse patch 5307530

Patch 5307530 is a Receivables patch which is recommended in April 2006 Critical Patch Update. This patch can be applied on an instance where you have already applied Financials Family pack G. As an oversite, this patch was applied on an instance where Financials Family pack F was installed. As a result this patch had to be reversed. Now we all know that there is no official way to undo a patch other than restoring from backup. However in the case of this patch, we found that it had only two file which were being executed. The first file was a .sql which was creating a view. The second file was a .pls file which was creating a package body. The easiest way to reverse this patch is to run the .sql and .pls file it has kept in the patch/backup directory. However there was no .sql file in backup directory. Only the .pls file was present.

This means that the view could be dropped. We ran the .pls and it failed complaining about a procedure init which was present in the package specification but not in the package body. So we compared the versions of specification and body with a fresh 11.5.10.2 instance. The package body version in the backup directory matched with the fresh instance. So we took the package specification from the fresh 11.5.10 instance and the package and its body compiled ok. This way the patch was reversed.

The incorrect extranet site mystery

Many enterprise applications encounter a major obstacle when they deploy multiple servers for scalability: the challenge of maintaining session persistence. A transaction typically consists of several TCP connections between the client browser and the servers. Once multiple servers are deployed, connections for a given transaction could go to any of the servers. While many load balancers solve the problem of balancing load across multiple servers, not every one supports the different persistence needs. Because persistence by definition requires the load balancer to ignore the load conditions on the servers and select a server based on persistence rules, the trick for the load balancing products is to ensure that the required level of persistence is met without breaking load balancing as much as possible.

For the last few days, Akhilesh contacted me regularly for an issue in one extranet environment. In this environment, we have iSupplier running in extranet with the URL pon.justanexample.com which connects to the E-Business Suite. This has two app tiers on extranet which are running Oracle Apps configured for DMZ. There is another Java application which is called supplier.justanexample.com. This is also load balanced on two servers. The BigIP box is same for both but their webs (pool) are different. The following steps can be taken to reproduce the problem:

From your home machine which is connected to internet, if you access pon.justanexample.com the E-Business Suite login page appears. After this if you access supplier.justanexample.com, then also you are directed to pon.justanexample.com

Go to another machine with a different public IP and try to access supplier.justanexample.com. The site opens with its login page. Now try accessing pon.justanexample.com, you'll get directed to supplier.justanexample.com

This was a consistent behaviour which could be reproduced easily. We had network guys, DNS guys scratching their heads, trying to figure this one out. Finally we reached out to a network expert who had previously solved such tricky problems. He did a trace of the network calls and found that the global IP was resolving correctly and requests were coming in on the correct IP address, but it was opening the incorrect application. He said maybe it is being done by the Application itself. Application teams denied this. So he checked from the other side, that is from the load balancer inside the DMZ and found that it was indeed going to the incorrect server. It was very strange. He checked the persistence setting in the load balancer and saw that it was set to source IP based persistence. Since all other environments had cookie based persistence, he changed the persistence to cookie based on a hunch. Voila, the problem was solved. After this, if you typed http://pon.justanexample.com, it would take you to E-Business Suite and if you typed http://supplier.justanexample.com, it would take you to that application's username/password page.

He said that it is possible that source IP based persistence was not taking the URL into account and was routing traffic solely on the IP. The BigIp load balancer would check the source IP from which the your first request came for a particular application, and would always send any subsequent request coming from your IP to the same application disregarding the URL you were trying to actually reach. This could be a bug in the load balancer. We are not really sure about this. Changing the persistence method to cookie based, fixed the issue.

There are two ways to do cookie-based persistence: cookie based switching and cookie hashing. In the first approach, the real server sets a cookie value that indicates to the load-balancing switch which real server a connection must be directed to. In the second approach, the load balancer can hash on the entire cookie string to select a real server. Once the load balancer selects a real server for a given hashing value, it will stick with that real server for all such traffic.

I am not sure which method is used by BigIP to do the cookie based persistence, but it sure avoided the problem we had with source ip based persistence.

Tuesday, February 19, 2008

Adventures with ASCP Direct loader

For most of the day today we were trying to figure out why the ASCP Direct loader was failing with this error:

APP-MRP-22130: Cannot connect to database

Cause: The current routine cannot connect to the database.

Action: Contact your system administrator or customer support representative.

Concurrent program returned no reason for failure.
ORACLE error 1012 in FDPCLS

Cause: FDPCLS failed due to ORA-01012: not logged on
.

The SQL statement being executed at the time of the error was: and was executed from the file .
$MSC_TOP/bin/MSCPLD.sh
Program exited with status 1

We had configured direct load as per metalink note 307443.1:
Set profile options in ASCP instance:
MRP:Use Direct Load Option = Yes
MSC: Launch System Sql Loader = Yes

Uncomment these lines from $MSC_TOP/bin/MSCPLD.sh and ensure that they are pointing to the 10.2.0 RDBMS ORACLE_HOME. 10g Database doesn't accept a loader from an Oracle home below 9.0

export ORACLE_HOME=10.2.0 ORACLE_HOME
export LD_LIBRARY_PATH=$ORACLE_HOME/lib;$ORACLE_HOME/ctx/lib:$LD_LIBRARY_PATH
export PATH=$ORACLE_HOME/bin:$PATH

However the error was occuring even after we relinked the MSCSLD binary. The following environment variables were already set as per metalink note 254810.1

# APS64_ORACLE_HOME=$MSO_TOP/bin/nls8174
# APS64_ORA_NLS33=$MSO_TOP/bin/nls8174/ocommon/nls/admin/data

Finally a truss of MSCSLD binary revealed that it was looking for ORA_NLS33 environment variable. After we set the environment variable as

ORA_NLS33=$MSO_TOP/bin/nls8174/ocommon/nls/admin/data

in the customization section of adovars.env

the issue got resolved.

Monday, February 18, 2008

Query to identify locally managed tablespaces

Here's the query to identify locally managed tablespaces and dictionary managed tablespaces in your instance.

SQL> SELECT tablespace_name,extent_management
FROM dba_tablespaces;

TABLESPACE_NAME                EXTENT_MAN
------------------------------ ----------
SYSTEM DICTIONARY
TEMP LOCAL
USERS DICTIONARY
TAB DICTIONARY
IDX DICTIONARY
SYSAUX LOCAL
UNDO LOCAL
Ari Kaplan has written an excellent article describing the process of migrating dictionary managed tablespaces to locally managed.

Its possible that all your tablespaces may be locally managed. But it doesn't hurt to run this query and check. In one of our non-OATM environments I found these tablespaces still being dictionary managed:

1 SELECT tablespace_name,extent_management
2 FROM dba_tablespaces
3* where extent_management='DICTIONARY'
SQL> /

TABLESPACE_NAME EXTENT_MAN
------------------------------ ----------
SYSTEM DICTIONARY
AMWD DICTIONARY
AMWX DICTIONARY
ASND DICTIONARY
ASNX DICTIONARY
FPAD DICTIONARY
FPAX DICTIONARY
FUND DICTIONARY
FUNX DICTIONARY
GCSD DICTIONARY
GCSX DICTIONARY
IAD DICTIONARY
IAX DICTIONARY
LNSD DICTIONARY
ZPBX DICTIONARY
ZXD DICTIONARY
ZXX DICTIONARY
LNSX DICTIONARY
MSTD DICTIONARY
MSTX DICTIONARY
XLED DICTIONARY
XLEX DICTIONARY
ZPBD DICTIONARY

23 rows selected.

In OATM enabled instance only SYSTEM tablespace was dictionary managed:

SQL> SELECT tablespace_name,extent_management
FROM dba_tablespaces
where extent_management='DICTIONARY'
/
2 3 4
TABLESPACE_NAME EXTENT_MAN
------------------------------ ----------
SYSTEM DICTIONARY

Ari's article describes in detail how to migrate SYSTEM tablespace from dictionary managed to local.

Sunday, February 17, 2008

Prevent disk space issues during ASCP plan run

ASCP plan is one of the most likely things which fails after a fresh clone from Production. Most of the time it is due to space issues. Tablespaces which are populated during the plan run should be pre-checked for free space with this query (Run this on both OLTP (source) and ASCP(destination) instance):

select ts.tablespace_name, to_char(sum(nvl(fs.bytes,0))/1048576, '99,999,990.99') as MB_FREE,
count(*) as FRAGMENTS, to_char(max(nvl(fs.bytes,0))/1048576, '99,999,990.99') as BIGGEST_BIT
from dba_free_space fs, dba_tablespaces ts
where fs.tablespace_name(+) = ts.tablespace_name
and ts.tablespace_name in ('INVD','INVX','WSHD','WSHX','BOMD','BOMX','ONTD','ONTX','WSMD','WSMX',
'MRPD','MRPX','AHLD','AHLX','WIPD','WIPX','POD','POX')
group by ts.tablespace_name
/
TABLESPACE_NAME MB_FREE FRAGMENTS BIGGEST_BIT
------------------ -------------- ---------- --------------
AHLD 37.54 1 37.54
AHLX 5.12 1 5.12
BOMD 90,125.31 1339 1,225.20
BOMX 86,136.33 3677 914.88
INVD 1,344.96 3031 554.26
INVX 3,642.46 252 1,218.16
MRPD 7,521.95 322 228.24
MRPX 10,800.98 1226 845.20
ONTD 2,370.04 975 1,947.34
ONTX 1,222.62 2036 792.23
POD 13,663.32 2256 516.17
POX 1,881.17 1002 639.84
WIPD 177.07 2 176.99
WIPX 69.06 4 67.42
WSHD 3,369.14 96 1,625.55
WSHX 3,455.78 435 1,196.25
WSMD 181.41 1 181.41
WSMX 78.79 1 78.79

18 rows selected.

Add space to all the tablespaces where MB_FREE shows less than 200 MB. I usually advise adding a new 2GB datafile to each tablespace which has less than 200 MB. If you do not add space, plan may fail with these errors in a non-OATM instance:

Altering Snapshot : WIP_FLOW_SCHDS_SN
Index WIP_FLOW_SCHDS_SN_N1 does not exist...
ORA-01658: unable to create INITIAL extent for segment in tablespace WIPX
ORA-20001: Index Creation failed: ORA-01658: unable to create INITIAL extent for segment in tablespace WIPX

Also run this query:

select sum(bytes)/1024/1024/1024 from v$tempfile;

Result of the above query should be greater than or equal to 12

Which means you should have at least 12 GB of space in temp files for the plan to succeed.

Thursday, February 14, 2008

Redirect to third party site from extranet pages

In one of the iSupplier implementations, we have given a link to a third party site. Whenever this link http://www.anotherthirdpartysite.com is clicked, we got the following error:

Gone

Access to the requested URI has been blocked by the URL Firewall.

If you believe that you have reached this page while performing valid operations within the application, please send mail to applmgr@justanexample.com explaining what you were doing when you got this error.


This is expected because the url firewall file doesn't have this entry. To allow this we made the following change indicated in bold in the $IAS_ORACLE_HOME/Apache/Apache/conf/url_fw.conf file present on the extranet web tiers:

On line 617 of url_fw.conf:

#==========================================================
#Include all Custom files
#==========================================================
# Third party link
RewriteRule https://www.thirdpartysite.com - [L]

# If not allowed by list above - go away! [G] is '410 Gone'
RewriteRule .* - [G]
# END Positive list of URLs

After making this change, the Gone error stopped coming.

Wednesday, February 13, 2008

fatal: libm.so.2: open failed: No such file or directory

Anand called me today and informed me about the errors they were getting in a new clone:

Following errors appeared during adcfgclone:

adgendbc.sh started at Wed Feb 13 19:58:15 EST 2008


.end std out.
ld.so.1: adident: fatal: libm.so.2: open failed: No such file or directory
$COMMON_TOP/admin/install/$CONTEXT_NAME/adgendbc.sh: test: argument expected

.end err out.


Uploading Metadata file $AD_TOP/admin/template/adctxinf.tmp
Metadata file $AD_TOP/admin/template/adctxinf.tmp upload failed.
Check log file created by FNDCPUCF program.
ERRORCODE = 137 ERRORCODE_END
.end std out.
ld.so.1: FNDCPUCF: fatal: libm.so.2: open failed: No such file or directory
Killed
ld.so.1: FNDCPUCF: fatal: libm.so.2: open failed: No such file or directory
Killed

I did an ldd to see what libraries were being called by the binaries that were failing

$ ldd $AD_TOP/bin/adident
libclntsh.so.1.0 => $ORACLE_HOME/lib/libclntsh.so.1.0
libnsl.so.1 => /usr/lib/libnsl.so.1
libsocket.so.1 => /usr/lib/libsocket.so.1
libgen.so.1 => /usr/lib/libgen.so.1
libdl.so.1 => /usr/lib/libdl.so.1
libc.so.1 => /usr/lib/libc.so.1
libaio.so.1 => /usr/lib/libaio.so.1
libm.so.2 => (file not found)
libm.so.1 => /usr/lib/libm.so.1
libmp.so.2 => /usr/lib/libmp.so.2
/usr/platform/SUNW,Sun-Fire-15000/lib/libc_psr.so.1

On inquring about the OS version of the source instance and the target instance, I learned that source instance is on Solaris 8 OS and target instance is on Solaris 10.

The first workaround I suggested was to simply copy /usr/lib/libm.so.1 as /usr/lib/libm.so.2. This worked, but since we were not sure this was the correct method, I googled some more. This link on sunfreeware.com advises in red:
The SPARC/Solaris 8 package may need /usr/lib/libm.so.2 which may not exist on some systems. You can use libm.so.1 by just doing ln -s /usr/lib/libm.so.1 /usr/lib/libm.so.2.

Sun fixed bug 6380587 which has these comments:
Currently the official builds on Solaris 8 use -lm or /usr/lib/libm.so in links of libraries or programs that need libm (math library functions). On Solaris 8 this ends up being a dependency on libm.so.1 because libm.so is a softlink to libm.so.1. But when you build on Solaris 9/10/11, libm.so is a softlink to libm.so.2, which is a library that does not exist on Solaris 8.

On Solaris 8 and 9, /usr/lib/libm.so is a softlink to /usr/lib/libm.so.1.
On Solaris 10, /usr/lib/libm.so is a softlink to /usr/lib/libm.so.2.
This softlink is resolved at the time the library is built with -lm (which is 'as if' you listed /usr/lib/libm.so on your link line), so this dependency on either libm.so.1 or libm.so.2 becomes permanent in what you build.

libm.so.2 is actually the Standard C math library. (A new .2 version had to be added when adding C99 support in Solaris 10 since it was incompatible with some of the existing functions in libm.so.1.)

After learning so much, I realized that I had not searched on metalink yet. So I searched with keyword libm.so.2 and voila note 403543.1 appeared in the results and gives the best solution:

Symptoms

When starting services (e.g., webserver, concurrent manager server or report server), they fail with the following errors:

adapcctl.sh version 115.50

Apache Web Server Listener is not running.
Starting Apache Web Server Listener (dedicated HTTP) ...
Apache Web Server Listener (PLSQL) is not running.
Starting Apache Web Server Listener (dedicated PLSQL) ...

adapcctl.sh: exiting with status 0


.end std out.
ld.so.1: FNDSVCRG: fatal: libm.so.2: open failed: No such file or directory
Killed
ld.so.1: FNDSVCRG: fatal: libm.so.2: open failed: No such file or directory
Killed
ld.so.1: FNDSVCRG: fatal: libm.so.2: open failed: No such file or directory
Killed
ld.so.1: FNDSVCRG: fatal: libm.so.2: open failed: No such file or directory
Killed

Note : While trying to correct the iss it was found that ADADMIN utility was not working.

Changes

Cloned from Solaris 10 to solaris 8

Cause

Due to the different versions of the operating systems used, the libraries will not match.

Solution

To implement the solution, please execute the following steps:

1. adrelink.sh force=y "fnd FNDSVCRG"

2. adrelink.sh force=y "ad adadmin"

3. adrelink.sh force=y "ad all"

Based on the above note, we first ran adrelink.sh force=y "ad adadmin"

ldd command confirmed that libm.so.2 was gone from the library list being referenced:

$ ldd $AD_TOP/bin/adadmin
libclntsh.so.1.0 => $ORACLE_HOME/lib/libclntsh.so.1.0
libnsl.so.1 => /usr/lib/libnsl.so.1
libsocket.so.1 => /usr/lib/libsocket.so.1
libgen.so.1 => /usr/lib/libgen.so.1
libdl.so.1 => /usr/lib/libdl.so.1
libc.so.1 => /usr/lib/libc.so.1
libaio.so.1 => /usr/lib/libaio.so.1
libm.so.1 => /usr/lib/libm.so.1
libmp.so.2 => /usr/lib/libmp.so.2
/usr/platform/SUNW,Sun-Fire-15000/lib/libc_psr.so.1

Based on the above, we ran adadmin and relinked all binaries with force option.

Tuesday, February 12, 2008

11i sizing guidelines

Real World Recommendations for 11i sizing

1) CPU
2) Memory
3) Disk Sizing
4) Network

CPU
===
No. of CPU required = (No.of Business events per hr / 275 ) + 2 CPU
15 Active Users/CPU (super users)
50 Logged Users/CPU
1 CPU Dedicated – Database and Application Processes (DB, Forms,
Web, Concurrent Managers)

Memory
======
For OS = 1 Gb

For Apps
Basic = 2 GB
6 user = 128 mb
Apache = 20 mb / user
PL/SQL = 1 mb / user
25 mb per concurrent user

15MB for every active user
15MB for concurrent manager processes
20MB for database background processes
500MB SGA (pinning packages and sequences)

Disk Sizing
===========
Tier Space Required
Application tier file system (includes iAS 26 GB
ORACLE_HOME, 8.0.6 ORACLE_HOME,
COMMON_TOP, and APPL_TOP)

Database tier file system (fresh install with a 31 GB
production database)

Database tier file system (fresh install with a 65 GB
Vision Demo database)


Net Work
========
Data Entry = 1.4 kbps
Data Navigation = 2.4 kbps
Navigation = 4.8 kbps

ponUpgAucItemPrc.sql fails during 11.5.10.2 maintenance pack

While applying 11.5.10.2 maintenance pack 3480000, the worker running ponUpgAucItemPrc.sql failed with following error:

ORA-01407: cannot update
("APPLSYS"."FND_ATTACHED_DOCUMENTS"."CREATED_BY") to NULL
ORA-06512: at line 230

Oracle had originally provided a datafix patch 6490593, which used to fix the problem in previous iterations of the upgrade. However in the current iteration which is done on a latest clone of Production, the same error appears when the datafix patch 6490593 is applied. After a long SR, Oracle provided the following direction:

CAUSE DETERMINATION
===================
The cause of the issue is invalid / incorrect data in fnd_attached_documents wrt to auction lines and fnd_attached_documents wrt to bid lines.


0. Apply patch 6691493

1. Apply the following sql file:

-- Query to update fnd_attached_documents wrt to auction lines.
Update fnd_attached_documents fad
Set fad.created_by = -1
WHERE fad.entity_name = 'PON_AUCTION_ITEM_PRICES_ALL'
AND fad.pk1_value not in
(SELECT to_char(paha.auction_header_id)
FROM pon_auction_headers_all paha);

-- Query to update fnd_attached_documents wrt to bid lines.
Update fnd_attached_documents fad
Set fad.created_by = -1
WHERE fad.entity_name = 'PON_BID_ITEM_PRICES'
AND fad.pk2_value not in
(SELECT to_char(pbh.bid_number)
FROM pon_bid_headers pbh);

Commit;
exit;
.
2. Apply Patch 6490593.

Monday, February 11, 2008

This is a copy paste of "Testing Veritas Cluster" article from blacksheepnetworks.com, lest it disappears:

Testing Veritas Cluster

Actual commands are in bold black.

0. Check Veritas Licenses - for FileSystem, Volume Manager AND Cluster

vxlicense -p

If any licenses are not valid or expired -- get them FIXED before continuing! All licenses should say "No expiration". If ANY license has an actual expiration date, the test failed. Permenant licenses do NOT have an expiration date. Non-essential licenses may be moved -- however, a senior admin should do this.

1. Hand check SystemList & AutoStartList

On either machine:

    grep SystemList /etc/VRTSvcs/conf/config/main.cf
    You should get:
    SystemList = { system1, system2 }

    grep AutoStartList /etc/VRTSvcs/conf/config/main.cf
    You should get:
    AutoStartList = { system1, system2 }

Each list should contain both machines. If not, many of the next tests will fail.

    If your lists do NOT contain both systems, you will probably need to modify them with commands that follow.
      more /etc/VRTSvcs/conf/config/main.cf (See if it is reasonable. It is likely that the systems aren't fully set up)
      haconf -makerw (this lets you write the conf file)
      hagrp -modify oragrp SystemList system1 0 system2 1
      hagrp -modify oragrp AutoStartList system1system2
      haconf -dump -makero (this makes conf file read only again)

2. Verify Cluster is Running

First verify that veritas is up & running:

    hastatus -summary
      If this command could NOT be found, add the following to root's path in /.profile:
        vi /.profile
        add /opt/VRTSvcs/bin to your PATH variable
      If /.profile does not already exist, use this one:
        PATH=/usr/bin:/usr/sbin:/usr/ucb:/usr/local/bin:/opt/VRTSvcs/bin:/sbin:$PATH
        export PATH
      . /.profile
    Re-verify command now runs if you changed /.profile:
    hastatus -summary

Here is the expected result (your SYSTEMs/GROUPs may vary):

One system should be OFFLINE and one system should be ONLINE ie:
# hastatus -summary

  -- SYSTEM STATE
-- System State Frozen

A e4500a RUNNING 0
A e4500b RUNNING 0

-- GROUP STATE
-- Group System Probed AutoDisabled State

B oragrp e4500a Y N ONLINE
B oragrp e4500b Y N OFFLINE

If your systems do not show the above status, try these debugging steps:

  • If NO systems are up, run hastart on both systems and run hastatus -summary again.

  • If only one system is shown, start other system with hastart. Note: one system should ALWAYS be OFFLINE for the way we configure systems here. (If we ran oracle parallel server, this could change -- but currently we run standard oracle server)

  • If both systems are up but are OFFLINE and hastart did NOT correct the problem and oracle filesystems are not running on either system, the cluster needs to be reset. (This happens under strange network situations with GE Access.) [You ran hastart and that wasn't enough to get full cluster to work.]

      Verify that the systems have the following EXACT status (though your machine names will vary for other customers):

      gedb002# hastatus -summary

      -- SYSTEM STATE
      -- System State Frozen

      A gedb001 RUNNING 0
      A gedb002 RUNNING 0

      -- GROUP STATE
      -- Group System Probed AutoDisabled State


      B oragrp gedb001 Y N OFFLINE

      B oragrp gedb002 Y N OFFLINE

      gedb002# hares -display | grep ONLINE
      nic-qfe3 State gedb001 ONLINE
      nic-qfe3 State gedb002 ONLINE

      gedb002# vxdg list
      NAME STATE ID
      rootdg enabled 957265489.1025.gedb002

      gedb001# vxdg list
      NAME STATE ID
      rootdg enabled 957266358.1025.gedb001

      Recovery Commands:

        hastop -all
        on one machine hastart
        wait a few minutes
        on other machine hastart
        hastatus -summary (make sure one is OFFLINE && one is ONLINE)

    If none of these steps resolved the situation, contact Lorraine or Luke (possibly Russ Button or Jen Redman if they made it to Veritas Cluster class) or a Veritas Consultant.

3. Verify Services Can Switch Between Systems

Once, hastatus -summary works, note the GROUP name used. Usually, it will be "oragrp", but the installer can use any name, so please determine it's name.

First check if group can switch back and forth. On the system that is running (system1), switch veritas to other system (system2):

    hagrp -switch groupname -to system2 [ie: hagrp -switch oragrp -to e4500b]
Watch failover with hastatus -summary. Once it is failed over, switch it back:
    hagrp -switch groupname -to system1

4. Verify OTHER System Can Go Up & Down Smoothly For Maintanence

On system that is OFFLINE (should be system 2 at this point), reboot the computer.

    ssh system2
    /usr/sbin/shutdown -i6 -g0 -y
Make sure that the when the system comes up & is running after the reboot. That is, when the reboot is finished, the second system should say it is offline using hastatus.
    hastatus -summary
Once this is done, hagrp -switch groupname -to system2 and repeat reboot for the other system
    hagrp -switch groupname -to system2
    ssh system1
    /usr/sbin/shutdown -i6 -g0 -y
Verify that system1 is in cluster once rebooted
    hastatus -summary

5. Test Actual Failover For System 2 (and pray db is okay)

To do this, we will kill off the listener process, which should force a failover. This test SHOULD be okay for the db (that is why we choose LISTENER) but there is a very small chance things will go wrong .. hence the "pray" part :).

On system that is online (should be system2), kill off ORACLE LISTENER Process

    ps -ef | grep LISTENER
Output should be like:
  root  1415   600  0 20:43:58 pts/0    0:00 grep LISTENER
oracle 831 1 0 20:27:06 ? 0:00 /apps/oracle/product/8.1.5/bin/tnslsnr LISTENER -inherit
    kill -9 process-id (the first # in list - in this case 831)
Failover will take a few minutes

You will note that system 2 is faulted -- and system 1 is now online

You need to CLEAR the fault before trying to fail back over.

    hares -display | grep FAULT
    for the resource that is failed (in this case, LISTENER)
    Clear the fault
    hares -clear resource-name -sys faulted-system [ie: hares -clear LISTENER -sys e4500b]

6. Test Actual Failover For System 1 (and pray db is okay)

Now we do same thing for the other system first verify that the other system is NOT faulted

    hastatus -summary
Now do the same thing on this system... To do this, we will kill off the listener process, which should force a failover.

On system that is online (should be system2), kill off ORACLE LISTENER Process

    ps -ef | grep LISTENER
Output should be like:
  oracle   987     1  0 20:49:19 ?        0:00 /apps/oracle/product/8.1.5/bin/tnslsnr LISTENER -inherit
root 1330 631 0 20:58:29 pts/0 0:00 grep LISTENER
    kill -9 process-id (the first # in list - in this case 987)
Failover will take a few minutes

You will note that system 1 is faulted -- and system 1 is now online

You need to CLEAR the fault before trying to fail back over.

    hares -display | grep FAULT for the resource that is failed (in this case, LISTENER)
    Clear the fault

    hares -clear resource-name -sys faulted-system [ie: hares -clear LISTENER -sys e4500a]

Run:

    hastatus -summary
to make sure everything is okay.

How to find out the names of the nodes in a veritas cluster

Here is the command to find out the names of the nodes in a veritas cluster:

grep SystemList /etc/VRTSvcs/conf/config/main.cf

or

grep AutoStartList /etc/VRTSvcs/conf/config/main.cf

Here is some more information about Veritas Cluster commands:

Term/Cmd/Pkg Description Command / File
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
VRTSvcs VERITAS Cluster Server
VRTSvcswz VERITAS Cluster Server Wizard
VRTScsga VERITAS Cluster Server Graphical Administrator
VRTSgab VERITAS Group Membership and Atomic Broadcast
VRTSllt VERITAS Low Latency Transport
VRTSvcsor VERITAS Cluster Server Oracle Enterprise Extension
VRTSvcssy VERITAS Cluster Server Sybase Enterprise Extension
VRTSperl VERITAS Perl for VRTSvcs

Cluster Name of your HA environment
Nodes Physical systems that make up the cluster
Service group Abstract container of related resources
Resource Cluster components (i.e. NICs, IPs, disk groups, volumes, mounts, processes, etc...)
Attributes Parameter values that define the resources
Dependencies Links between resources or service groups

Cluster Mgr Cluster Monitor : Log in, add clusters, change preferences
Cluster Mgr Cluster Explorer: Monitor systems, service grps, resources, attributes & dependencies
Cluster Mgr Log Desk : Monitor log messages received from engine, view GUI commands
Cluster Mgr Command Center : Build VCS commands and send to engine
LLT Low Latency transport provides fast kernel-kernel comm. & monitors network connx.
GAB Grp membership & Atomic Broadcast maintains a synch. state & monitors disk comm.

Config files VCS etc directory $VCSETC=/etc/VRTSvcs
Config files VCS configuration directories $VCSCONF=/etc/VRTSvcs/conf/config
Binary files VCS opt directory $VCSOPT=/opt/VRTSvcs
Binary files VCS binary path $VCSBIN=/opt/VRTSvcs/bin
Log files VCS log path $VCSLOG=/var/VRTSvcs/log
Config files VCS configuration file /etc/VRTSvcs/conf/config/main.cf
LLT tab file LLT configuration file /etc/llttab
LLT hosts file LLT host name database /etc/llthosts
GAB file Grp membership & Atomic Broadcast file /etc/gabtab

quick-start VCS Quick-start wizard # $VCS_HOME/wizards/config/quick_start
quick-NFS VCS Quick-NFS wizard # $VCS_HOME/wizards/services/quick_nfs
llt Verify LLT # /sbin/llstat -n
llt Get interface MAC Address # /opt/VRTSllt/getmac device_name
llt Check network connectivity # /opt/VRTSllt/dlpiping -s|-c -v device_name
gab Verify GAB # /sbin/gabconfig -a ; /sbin/gabconfig -l
hasys List systems in cluster # /opt/VRTSvcs/bin/hasys -list
hasys Detailed info on each cluster node # /opt/VRTSvcs/bin/hasys -display (sysname)
hasys Increase system count in gabtab startup # /opt/VRTSvcs/bin/hasys -add (sysname)
hasys Delete a system # /opt/VRTSvcs/bin/hasys -delete (sysname)
hastart Start VCS cluster # /opt/VRTSvcs/bin/hastart
hastart Force start a stale VCS cluster # /opt/VRTSvcs/bin/hastart -force -stale
hastop Stop VCS on all systems # /opt/VRTSvcs/bin/hastop -all
hastop Stop VCS had, keep srvc-groups running # /opt/VRTSvcs/bin/hastop -local -force
hastop Stop VCS, migrate srvc-groups to sysname # /opt/VRTSvcs/bin/hastop -sys (sysname) -evacuate
hastatus Provide continual status of service grps # /opt/VRTSvcs/bin/hastatus
hastatus Verify status of service groups # /opt/VRTSvcs/bin/hastatus -summary
hacf Check for syntax errors in main.cf # /opt/VRTSvcs/bin/hacf -verify /etc/VRTSvcs/conf/config/main.cf
hacf Generate dependency tree in main.cf # /opt/VRTSvcs/bin/hacf -generate /etc/VRTSvcs/conf/config/main.cf

hares List all resources # /opt/VRTSvcs/bin/hares -list
hares List a resource's dependencies # /opt/VRTSvcs/bin/hares -dep (resource_name)
hares Get detailed info on a resource # /opt/VRTSvcs/bin/hares -display (resource)
hares Add a resource # /opt/VRTSvcs/bin/hares -add (resource_name (resource_type (service_group)
hares Modify attributes of the new resource # /opt/VRTSvcs/bin/hares -modify (resource_name (attribute_name (value)
hares Delete a resource # /opt/VRTSvcs/bin/hares -delete (resource_name)
hares Online a resource # /opt/VRTSvcs/bin/hares -online (resource_name) -sys (system_name)
hares Offline a resource # /opt/VRTSvcs/bin/hares -offline (resource_name) -sys (system_name)
hares Monitor resource on a system # /opt/VRTSvcs/bin/hares -probe (resource_name) -sys (system_name)
hares Clear a faulted resource # /opt/VRTSvcs/bin/hares -clear (resource_name) [-sys system_name]
hares Make a resource's attribute value local # /opt/VRTSvcs/bin/hares -local (resource_name) (attribute_name) value)
hares Make a resource's attribute value global # /opt/VRTSvcs/bin/hares -global (resource_name) (attribute_name) value)
hares Specify a dependency between 2 resources # /opt/VRTSvcs/bin/hares -link (parent_res) (child_res)
hares Remove dependency between 2 resources # /opt/VRTSvcs/bin/hares -unlink (parent_res) (child_res)
hares Modify a Share res. by adding options # /opt/VRTSvcs/bin/hares Share_cicgt-as4-p_apps Options "%-o rw,root=dcsa-cln1"

hagrp List all service groups # /opt/VRTSvcs/bin/hagrp -list
hagrp List a service group's resources # /opt/VRTSvcs/bin/hagrp -resources [service_group]
hagrp List a service group's dependencies # /opt/VRTSvcs/bin/hagrp -dep [service_group]
hagrp Detailed info about a service group # /opt/VRTSvcs/bin/hagrp -display [service_group]
hagrp Start service group, bring res. online # /opt/VRTSvcs/bin/hagrp -online (service_group) -sys (system_name)
hagrp Stop service group, bring res. offline # /opt/VRTSvcs/bin/hagrp -offline (service_group) -sys (system_name)

hagrp Switch service group between nodes # /opt/VRTSvcs/bin/hagrp -switch (service_group) -to (system_name)
hagrp Freeze svcgroup, (disable onl. & offl.) # /opt/VRTSvcs/bin/hagrp -freeze (service_group) [-persistent]
hagrp Thaw a svcgroup, (enable onl. & offl.) # /opt/VRTSvcs/bin/hagrp -unfreeze (service_group) [-persistent]
hagrp Enable a service group # /opt/VRTSvcs/bin/hagrp -enable (service_group) [-sys system_name]
hagrp Disable a service group # /opt/VRTSvcs/bin/hagrp -disable (service_group) [-sys system_name]
hagrp Enable all resources in a service group # /opt/VRTSvcs/bin/hagrp -enableresources (service_group)
hagrp Disable all resources in a service group # /opt/VRTSvcs/bin/hagrp -disableresources (service_group)
hagrp Specify dependenciy between 2 svc groups # /opt/VRTSvcs/bin/hagrp -link (parent_group) (child_group) (relationship)
hagrp Remove dependenciy between 2 svc groups # /opt/VRTSvcs/bin/hagrp -unlink (parent_group) (child_group)
hagrp Auto-Enable a servicegroup marked # /opt/VRTSvcs/bin/hagrp -autoenable (service_group) [-sys system_name]
disabled due to prob with system_name.

hatype List resource types # /opt/VRTSvcs/bin/hatype -list
hatype Detailed info on a resource type # /opt/VRTSvcs/bin/hatype -display (resource_type)
hatype List all resources of a part. type # /opt/VRTSvcs/bin/hatype -resources (resource_type)
hatype Add a resource type # /opt/VRTSvcs/bin/hatype -add (resource_type)
hatype Set static attribute values # /opt/VRTSvcs/bin/hatype -modify ...
hatype Delete a resource type # /opt/VRTSvcs/bin/hatype -delete (resource_type)
haattr Add Attribute to a Type definition # /opt/VRTSvcs/bin/haattr -add (resource_type) (attribute_name) (attribute_type -integer, -string, -vector)
haattr Delete a Entry in a Type definition # /opt/VRTSvcs/bin/haattr -delete (resource_type) (attribute_name)
haconf Set VCS configuration file to r/w mode # /opt/VRTSvcs/bin/haconf -makerw
haconf Set VCS configuration file to read mode # /opt/VRTSvcs/bin/haconf -dump -makero
hauser Add a user with r/w access to VCS # /opt/VRTSvcs/bin/hauser -add (user_name)
hauser Add a user with read access only to VCS # /opt/VRTSvcs/bin/hauser -add VCSGuest
hauser Update a user # /opt/VRTSvcs/bin/hauser -update (user_name)
hauser Delete a user # /opt/VRTSvcs/bin/hauser -delete (user_name)
hauser Display all users # /opt/VRTSvcs/bin/hauser -display
haagent Start agents manually # haagent -start (agent_name) -sys (system_name)
haagent Stop agents manually # haagent -stop (agent_name) -sys (system_name)
hagui Start Cluster Manager # /opt/VRTSvcs/bin/hagui
hagui Start Cluster Manager in debug mode # /opt/VRTSvcs/bin/hagui -D

Product Terminology comparisons
Sun SC 2.2 Veritas VCS 1.1
------------------------------------------------------
cluster name cluster name
admin workstation -
physical node A local system
physical node B remote system
physical node IP address maintenance IP address
logical host service group
logical host IP address service group IP address
- resources
disk group disk group
private heartbeats communication channels
- GAB disk (disk heartbeat)
Quorum disk -
Admin filesystem -
scinstall Quick-Start wizard
split-brain network partition

configuration files:
/etc/llthosts
/etc/llttab
/etc/gabtab
/etc/VRTSvcs/conf/config/main.cf
/etc/VRTSvcs/conf/config/sysname

Thursday, February 7, 2008

oracle.apps.fnd.sso.AppsLogin was not found

In an instance, when I clicked on the Oracle Applications Manager link: http://www.justanexample.com:8000/servlets/weboam/oam/oamLogin

The following error appeared:

HTTP 404 Not Found
The requested URL /oracle.apps.fnd.sso.AppsLogin was not found on this server.

The browser showed this in the site address field:

http://www.justanexample.com:8000/oracle.apps.fnd.sso.AppsLogin?requestUrl=/oa_servlets/weboam/oam/oamLogin&cancelUrl=http://11ioltp.justanexample.com:8069/oracle.apps.fnd.sso.AppsLogin

I compared this with a working instance where the address filed shows this:

http://www.anotherexample.com:8001/OA_HTML/AppsLocalLogin.jsp?requestUrl=
http%3A%2F%2Fwww.anotherexample.com%3A8068%2Foa_servlets%2Fweboam
%2Foam%2FoamLogin&cancelUrl=http%3A%2F%2Fwww.anotherexample.com
%3A8068%2Foa_servlets%2Foracle.apps.fnd.sso.AppsLogin&s2=
498FD27D425A66475CD8CD04DEE6BE5659B6DA334C5F84414D175249D8DFEB4C

Examining the Apache logs revealed this error:

[Thu Feb 7 23:25:54 2008] [error] [client 192.168.10.38] File does not exist: $COMMON_TOP/portal/$CONTEXT_NAME/oracle.apps.fnd.sso.AppsLogin

Why is it not giving the oa_servlets by default. Why is it going to portal ? It should go to http://justanexample.com:8000/oa_servlets/oracle.apps.fnd.sso.AppsLogin, but instead it is going to http://justanexample.com:8000/oracle.apps.fnd.sso.AppsLogin. This is resulting in apps looking for this class in $COMMON_TOP/portal instead of looking for it in $COMMON_TOP/java.

I logged an SR with Oracle for this and they suggested :

1. Please change the profile option "Apps Servlet Agent" at server level to have the same value a
s at Site level, i.e. it should be:
http://justanexample.com:8000/oa_servlets

2. Bounce Apache.

3. Retest the issue and advice the results.

Even without bouncing Apache, it started working. However the error still came when we try to logout of OAM or Self Service:

HTTP-404 Not Found
The requested URL /oracle.apps.fnd.sso.AppsLogout was not found on this server.

Oracle asked me to do this:

ACTION PLAN
============
1. Please review Note 364439.1 - "Tips and Queries for Troubleshooting Advanced Topologies", section "The Profile Options Query".

After I ran this query in the instance where everything was fine and the instance where it was failing, I realized that all the 130 profile values under the profile (D)-Apps Servlet Agent was missing /oa_servlets

Boiling it down to this query:

Instance where this error was occuring:

select profile_option_value
from fnd_profile_option_values
where profile_option_id=3804
and profile_option_value='http://justanexample.com:8000';

PROFILE_OPTION_VALUE
--------------------------------------
http://justanexample.com:8000
http://justanexample.com:8000
http://justanexample.com:8000
.
.
.
34 rows selected

Instance which was showing OAM page ok the result of query was:

PROFILE_OPTION_VALUE
----------------------------------------------------
http://www.anotherexample.com:8001/oa_servlets
http://www.anotherexample.com:8001/oa_servlets
http://www.anotherexample.com:8001/oa_servlets
http://www.anotherexample.com:8001/oa_servlets
http://www.anotherexample.com:8001/oa_servlets

5 rows selected

So the simple solution is to run an update command:

update fnd_profile_option_values
set profile_option_value='http://justanexample.com:8000/oa_servlets'
where profile_option_id=3804
and profile_option_value='http://justanexample.com:8000'

Bounce Apache after this and it works.

Wednesday, February 6, 2008

Unable to open file for reading: marker1.txt

While merging 255 functional patches with admrgpch, we got the following error:

AD Merge Patch error:
Unable to open file for reading:

/patchtop/source/5522715/marker1.txt

A truss of the admrgpch command showed that it is failing as soon as open files reaches 256:
openat(-3041965, "/stage/jul_2008/gpshpc76/func_patches/before/5522715", O_RDONLY|O_NDELAY|O_LARGEFILE) = 255
fcntl(255, F_SETFD, 0x00000001) = 0
fstat64(255, 0xFFBF55F0) = 0
close(255) = 0
openat(-3041965, "/stage/jul_2008/gpshpc76/func_patches/before/5522715", O_RDONLY|O_NDELAY|O_LARGEFILE) = 255
fcntl(255, F_SETFD, 0x00000001) = 0
fstat64(255, 0xFFBF1068) = 0
getdents64(255, 0xFE604000, 8192) = 264
open("/stage/jul_2008/gpshpc76/func_patches/before/5522715/marker1.txt", O_RDONLY) = 256
close(256) = 0
fstat64(2, 0xFFBF03F0) = 0

AD Merge Patch error:
Unable to open file for reading:
write(2, "\n A D M e r g e P a".., 56) = 56

write(2, "\n", 1) = 1
/stage/jul_2008/gpshpc76/func_patches/before/5522715/marker1.txtwrite(2, " / s t a g e / j u l _ 2".., 64) = 64

write(2, "\n", 1) = 1
write(3, "\n * * * * * * * * * * *".., 437) = 437
close(3) = 0
_exit(1)

However the soft and hard limit are set to 65535 on the OS.

So as a workaround we removed 6 patches from the 255 patches and it merged fine. I have logged an SR with Oracle for a resolution. Still not happy with the situation, I was googling for solution and found an article : Solaris OS Solutions to 32-Bit stdio's 256 File-Descriptors Limitation.

Historically, 32-bit applications running on the Solaris Operating System (OS) have been limited to using only the file descriptors 0 through 255 with the standard I/O functions in the C library. The extended FILE facility on Solaris 10 OS and later versions allows well-behaved 32-bit applications to use any valid file descriptor with the standard I/O functions.

A well-behaved application is one that meets all three of the following requirements:

* Does not directly access any member fields in the FILE structure pointed to by the FILE pointer associated with any standard I/O stream
* Checks all return values from standard I/O functions for error conditions
* Behaves appropriately when an error condition is reported

This article explains in detail the runtime and programming solutions that were introduced under the extended FILE facility. The following discussion is relevant only in 32-bit applications, as 64-bit applications are immune to the limitation to 256 file descriptors.

This is what I did as suggested in the article:

$ ulimit -n 65536 (My ulimit was already on 65536)

$ ulimit -n
65536

$ LD_PRELOAD_32=/usr/lib/extendedFILE.so.1
$ export LD_PRELOAD_32
$ admrgpch -s source -d destination -merge_name bigpatch

And it succeeded without issues:

0 copy drivers merged.
0 database drivers merged.
0 generate drivers merged.
257 mixed drivers merged to a unified driver.

Patch merge completed successfully

Now I'll close the SR I had opened with my observations.

Tuesday, February 5, 2008

How to find out which users are logged on to an Apps instance

FND_USER table stores the details of all the end users. If we give this query:

select user_name,to_char(last_logon_date,'DD-MON-YYYY HH24:MI:SS')
from apps.fnd_user
where to_char(last_logon_date,'DD-MON-YYYY')=to_char(sysdate,'DD-MON-YYYY');

USER_NAME TO_CHAR(LAST_LOGON_DATE,'D
------------------------------------
GUEST 05-FEB-2008 16:01:47
SYSADMIN 05-FEB-2008 16:02:06
USER1 05-FEB-2008 07:31:19
USER2 05-FEB-2008 04:58:12
USER3 05-FEB-2008 09:46:00
USER4 05-FEB-2008 05:00:38
USER5 05-FEB-2008 08:45:07
USER6 05-FEB-2008 09:09:17

The above query can give you a good idea who is logged on.

For a more accurate result, refer to metalink note 269799.1 which says:

You can run the Active Users Data Collection Test diagnostic script to get information about all active users currently logged into Oracle Applications.This diagnostic test will list users logged into Self Service Web Application, users logged into forms, and users running concurrent programs. Please note that to collect the audit information about the users who have logged into Forms, the "Sign-On:Audit Level" profile option should be set to Form. You can access this Diagnostic test via Metalink Note# 233871.1.

Monday, February 4, 2008

Can't modify constant item in scalar assignment

Akhilesh called me today and told me about a problem he was facing while running Software configuration manager:

$ORACLE_HOME/ccr/bin/emCCR collect
Oracle Configuration Manager - Release: 10.2.7.1.0 - Production
Copyright (c) 2005, 2007, Oracle. All rights reserved.
------------------------------------------------------------------
Error in processing ECM_ASSOCIATIONS metric collection for ebiz11i_apps_cm
Failed to execute command - "$CCR_HOME/engines/SunOS/perl/bin/perl" "$CCR_HOME/sysman/admin/scripts/ebs/oracle_apps_cm_assoc.pl"
Can't modify constant item in scalar assignment at $CCR_HOME/sysman/admin/scripts/ebs/ebs_assoc_common.pl line 171, near ");"
syntax error at $CCR_HOME/sysman/admin/scripts/ebs/ebs_assoc_common.pl line 177, near "col_values["
Compilation failed in require at $CCR_HOME/sysman/admin/scripts/ebs/oracle_apps_cm_assoc.pl line 38.

Error in processing cust_request_sets metric collection for ebiz11i_apps_db
Failed to execute command - "$CCR_HOME/engines/SunOS/perl/bin/perl" "$CCR_HOME/sysman/admin/scripts/ebs/ebsCustMetricCollector.pl" "$CCR_HOME/hosts/tsgsd1008/state/ebiz11i-apps_db.ll"
Undefined subroutine &main::getConfigProperty called at $CCR_HOME/sysman/admin/scripts/ebs/ebsCustMetricCollector.pl line 54.

Error in processing hpp_not_applied metric collection for ebiz11i_apps_db
Failed to execute command - "$CCR_HOME/engines/SunOS/perl/bin/perl" "$CCR_HOME/sysman/admin/scripts/ebs/ebsHPPNotAppliedCollector.pl" "$CCR_HOME/hosts/tsgsd1008/state/ebiz11i-apps_db.ll"
Undefined subroutine &main::getConfigProperty called at $CCR_HOME/sysman/admin/scripts/ebs/ebsHPPNotAppliedCollector.pl line 293.

Error in processing other_customizations metric collection for ebiz11i_apps_db
Failed to execute command - "$CCR_HOME/engines/SunOS/perl/bin/perl" "$CCR_HOME/sysman/admin/scripts/ebs/ebsCustMetricCollector.pl" "$CCR_HOME/hosts/tsgsd1008/state/ebiz11i-apps_db.ll"
Undefined subroutine &main::getConfigProperty called at $CCR_HOME/sysman/admin/scripts/ebs/ebsCustMetricCollector.pl line 54.


Collection and upload done.


For the first error, I did a googled for "Can't modify constant item in scalar assignment". The first hit was from Rohan's CGI documentation which said:

Can't modify constant item in scalar assignment...
You probably forgot to include a $ symbol before the variable. In Perl, scalar variables need to be referred to as $variable. The statement $variable = foo; is correct, but variable = foo; is not.

So accordingly, in $CCR_HOME/sysman/admin/scripts/ebs/ebs_assoc_common.pl, we changed:

Line 171 : col_values changed to $col_values
Line 177 : col_values changed to $col_values

After changing this when we ran the command again, the first error was gone and 3 errors remained:
======================================================
$ORACLE_HOME/ccr/bin/emCCR collect
Oracle Configuration Manager - Release: 10.2.7.1.0 - Production
Copyright (c) 2005, 2007, Oracle. All rights reserved.
------------------------------------------------------------------

Error in processing cust_request_sets metric collection for ebiz11i_apps_db
Failed to execute command - "$CCR_HOME/engines/SunOS/perl/bin/perl" "$CCR_HOME/sysman/admin/scripts/ebs/ebsCustMetricCollector.pl" "$CCR_HOME/hosts/tsgsd1008/state/ebiz11i-apps_db.ll"
Undefined subroutine &main::getConfigProperty called at $CCR_HOME/sysman/admin/scripts/ebs/ebsCustMetricCollector.pl line 54.

Error in processing hpp_not_applied metric collection for ebiz11i_apps_db
Failed to execute command - "$CCR_HOME/engines/SunOS/perl/bin/perl" "$CCR_HOME/sysman/admin/scripts/ebs/ebsHPPNotAppliedCollector.pl" "$CCR_HOME/hosts/tsgsd1008/state/ebiz11i-apps_db.ll"
Undefined subroutine &main::getConfigProperty called at $CCR_HOME/sysman/admin/scripts/ebs/ebsHPPNotAppliedCollector.pl line 293.

Error in processing other_customizations metric collection for ebiz11i_apps_db
Failed to execute command - "$CCR_HOME/engines/SunOS/perl/bin/perl" "$CCR_HOME/sysman/admin/scripts/ebs/ebsCustMetricCollector.pl" "$CCR_HOME/hosts/tsgsd1008/state/ebiz11i-apps_db.ll"
Undefined subroutine &main::getConfigProperty called at $CCR_HOME/sysman/admin/scripts/ebs/ebsCustMetricCollector.pl line 54.


Collection and upload done.
===============================================================

If you notice, all 3 errors are due to the fact that it is unable to find a subroutine called getConfigProperty. We searched for getConfigproperty string in the .pl files but could not find it. Most probably getConfigProperty is defined inside a different .pl file which is not present in the perl library path. Oracle has a few open bugs for this issue.

I got a call from Sadiq today for this problem. He is facing the very same problem in his environment. So I decided to revisit. I had noticed that this works in Configuration Managers installed by RDA and the Jan and April 2008 Critical patch update for database. I checked back with Akhilesh and found that he had raised an SR for this and Oracle had given a patch for this issue. The patch number is 6733104. However this patch is not created for all platforms. You could request this patch for your platform or follow this workaround:

1. Download patch 6854532 for your platform.
2. Unzip this patch. Patch will create a subdirectory called RDA
3. cd rda/ccr
4. cp ccr-Production-10.2.7.0.0*.zip $ORACLE_HOME
5. cd $ORACLE_HOME
6. unzip ccr-Production-10.2.7.0.0*
7. This will create a directory ccr
8. cd ccr/config
9. vi ccr.properties

Uncomment these lines and give the correct values:

http.proxyHost=proxy.justanexample.com
http.proxyPort=80

Use this ccr directory for your configuration manager tasks. You won’t get the perl errors.

I checked why the error doesn’t occur in this version:

cd $CCR_HOME/sysman/admin/scripts/ebs
grep getConfigProperty *

config_version.pl:my ($errMsg2, $collected_version) = getConfigProperty($configFile, "META_VER");

In the CCR_HOME in which this error was appearing, a grep on the very same directory returned no results. So Oracle has fixed their script config_version.pl which was missing this line in the version where it was erroring out:

my ($errMsg2, $collected_version) = getConfigProperty($configFile, "META_VER");