We had a problem in one of our environments where Mobile services would hang. We raised an SR with Oracle for it; here is the SR text:
Mobile services intermittently stop responding when logging in as a mobile user. Telnet to the service works fine, but when we try to log in to the mobile application with the mobile user and password, it hangs. This happens every 2-3 hours. We have to bounce the mobile services as a workaround. The error shown in the logs is:
MWA_PH_DEVCFG_NOTFOUND: 192.168.3.60
configuration file is not found. Will use default configuration for now
[Sun Sep 09 07:12:40 EDT 2007] (Thread-9328) MWA_PH_DEVCFG_NOTFOUND: 192.168.3.58
configuration file is not found. Will use default configuration for now
[Sun Sep 09 07:12:41 EDT 2007] (Thread-9329) MWA_PH_DEVCFG_NOTFOUND: 192.168.3.61
configuration file is not found. Will use default configuration for now
[Sun Sep 09 07:12:41 EDT 2007] (Thread-9327) PH: User null got disconnected...
[Sun Sep 09 07:12:41 EDT 2007] (Thread-9327) PH: caught IOException
### Steps to Reproduce ###
telnet 11111
This shows the following screen:
Device List
1 Default
2 Symbol Device
3 Intermec Device
4 GUI Client
Press Enter for 1
It then shows the following:
Login
--------------------
Oracle
Mobile
Applications
--------------------
User Name:
Password :
Database :production
--------------------
Once we key in the username and password, it hangs/doesn't respond.
Oracle responded with two initial suggestions and a 34-point action plan:
1. What is the version of JDK?
We would like to know the full version of JAVA/JDK that you are using.
Please go to your mwactl.sh file.
Find the "JAVA=..." parameter in this file.
cd to the directory that is referenced by this JAVA parameter.
do a java -version and it should return something like this:
$java -version
java version "1.4.2_04"
Please ensure that mwactl.sh is using JDK 1.4.2_20 or higher.
With the latest JDK 1.4.2_x and 1.5.x, the overall performance of the MWA Server increases. We were able to unofficially test 80 user connections on one Telnet Server for a week without any issues. Basically, the newer the JDK version, the better MWA performed with each JDK release.
2. Make sure Patch 4734840 - Oracle Inventory, Warehouse Management, Mobile Application Server (MWA), and Receiving (PO): Release 11.5.10, Rollup Patch 3 (INV/WMS/MWA/RCV 11.5.10 RUP3), has been applied.
OR
Patch 5855276 - Oracle Warehouse Management System: Release 11.5.10, Rollup Patch 4 (WMS 11.5.10 RUP4)
This is the single most important patch to have applied to an MWA Server. This patch provides numerous fixes to the MWA Server, the Dispatcher, and all the products that use the MWA Server, all in one shot. This patch will place the MWA Server on the latest Core Server file system versions.
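As a rough illustration of the JDK check in item 1, the sketch below parses the text printed by java -version and compares it with the 1.4.2_20 floor Oracle asks for. The function names are made up for this example, and it assumes the old "1.x.y_zz" version string format shown above.

```python
import re

# 1.4.2_20 is the minimum JDK level named in Oracle's reply.
MINIMUM = (1, 4, 2, 20)


def parse_java_version(output):
    """Return (major, minor, patch, update) from `java -version` text, or None."""
    match = re.search(r'version "(\d+)\.(\d+)\.(\d+)(?:_(\d+))?', output)
    if match is None:
        return None
    # A missing "_NN" update suffix is treated as update 0.
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(output, minimum=MINIMUM):
    """True when the reported version is at or above the minimum."""
    version = parse_java_version(output)
    return version is not None and version >= minimum


if __name__ == "__main__":
    sample = 'java version "1.4.2_04"'
    print(parse_java_version(sample), meets_minimum(sample))
```

Note that java -version writes to stderr, so that is the stream to capture when feeding real output to these functions.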
If the above two suggestions do not help with this issue, then please proceed with the action plan below.
ACTION PLAN
----------------------
1. When did this problem start?
2. How long have you been using MWA?
Have there been any changes that might have brought on this issue - patching, cloning, upgrades, ...?
3. Does this issue happen every time for the same transactions?
4. What are the particular transactions from Data Warehouse that have this issue?
5. What are the specific steps to reproduce this issue?
6. Does this issue happen every time for the same transactions?
7. Are there some MWA transactions that do not have this issue?
8. Are you using WMS, INV, or other Applications that use MWA?
9. How many MWA users do you have connected during the peak load?
10. Does this issue occur with telnet sessions as well as the hand-held devices?
11. What is the version of JDK?
We would like to know the full version of JAVA/JDK that you are using.
Please go to your mwactl.sh file.
Find the "JAVA=..." parameter in this file.
cd to the directory that is referenced by this JAVA parameter.
do a java -version and it should return something like this:
$java -version
java version "1.4.2_04"
Please ensure that mwactl.sh is using JDK 1.4.2_20 or higher.
With the latest JDK 1.4.2_x and 1.5.x, the overall performance of the MWA Server increases. We were able to unofficially test 80 user connections on one Telnet Server for a week without any issues. Basically, the newer the JDK version, the better MWA performed with each JDK release.
12. What are the hardware and network configurations?
13. How is your instance configured - how many nodes, what is on each node, ....?
Are the database server and the server hosting the MWA servers on different machines?
Are there any 3rd party Load Balancers configured?
14. Is there a firewall between the database server and the server hosting the MWA servers?
If firewall is used, what is the firewall timeout value?
15. Are you using Oracle's Dispatcher, or are you using a third-party product to dispatch the requests?
16. Does the issue still reproduce if the MWA server is bounced daily? If you have not checked, note that, as documented in the MWA Quick Reference Guide:
The MWA server should be stopped and started (bounced) at least daily so that memory can be flushed.
http://logistics.us.oracle.com/collat/repository/docs/MwaQuickRef.pdf
If you haven't already done so, please review the following note for some good information about bouncing your MWA and Dispatcher:
Note 198543.1 How To Rebounce the Mobile Application Server for Industrial Applications v1.0.8
17. Please upload the mwa.cfg file located under $MWA_TOP/secure and the $MWA_TOP/bin/mwactl.sh file.
18. How many MWA Telnet Servers are running?
19. If using Oracle Dispatcher:
How many Dispatchers are you running?
Double check the MWA Quick Start Guide for Dispatcher setup info:
MWA Quick Start Guide can also be found in this note:
72450.1 Ext/Pub Oracle Inventory White Papers Template:
Click on the link: MWA Server and Dispatcher Configuration Quick Start Reference
20. What are the ports that your MWA Listeners and Dispatcher are running on?
21. What command are you using to start and stop the MWA Server?
22. Make sure that the jserv.properties file is edited correctly for MWA:
wrapper.bin.parameters=-Doracle.apps.mwa=[full path to $MWA_TOP]
23. Make sure Patch 4734840 - Oracle Inventory, Warehouse Management, Mobile Application Server (MWA), and Receiving (PO): Release 11.5.10, Rollup Patch 3 (INV/WMS/MWA/RCV 11.5.10 RUP3), has been applied.
OR
Patch 5855276 - Oracle Warehouse Management System: Release 11.5.10, Rollup Patch 4 (WMS 11.5.10 RUP4)
This is the single most important patch to have applied to an MWA Server. This patch provides numerous fixes to the MWA Server, the Dispatcher, and all the products that use the MWA Server, all in one shot. This patch will place the MWA Server on the latest Core Server file system versions.
24. You can see the MWA DBsession-id for a particular user using the Server Manager.
Server Manager shows all the details of the users logged into the MWA server.
Please make sure that you download Patch 5166627, review its README and prerequisites, and apply it.
Then navigate: MWA Server Manager -->Supply chain --> Monitor to monitor the users connection.
25. Please make sure that you have the latest Autoconfig patch and that Autoconfig has been run. Ref. 5478710 TXK-O
26. Please provide a Performance trace and upload fresh set of logs
1. Clear the mobile log files
2. To enable trace-level logging, go to "$MWA_TOP/secure/" directory and in the file "mwa.cfg" edit the line starting with "mwa.LogLevel=" to contain "mwa.LogLevel=perform".
3. Search for mwa.SystemLog and note the string
4. Search for mwa.logdir and note the directory
5. Clear the existing *.INV.log and *.system.log files
6. After this you have to restart the server and reproduce the problem.
7. Upload the logs specified in the above parameters as soon as the issue is reproduced.
· system.log
· dispatcher.log
In mwa.cfg, the documented values for mwa.LogLevel are fatal, error, warning, debug, and trace; 'Perform' is a valid, although undocumented, value for this parameter. A trace file should be generated wherever the customer's normal trace dumps are written. The 'Perform' value should create the trace with performance statistics for the session. Bounce the MWA telnet server after making this change, then recreate the issue and upload the system.log file.
27. How much memory is allocated to the MWA server? (Hint: check the $MWA_TOP/bin/mwactl.sh script VM_CONFIG settings)
28. The INV_MOBILE_LOGIN_INFO_PVT.LOG_USER_INFO is called and should add a new record in the MTL_MOBILE_LOGIN_HIST table. Check the system.log to verify that INV_MOBILE_LOGIN_INFO_PVT.LOG_USER_INFO was called. Also, check the MTL_MOBILE_LOGIN_HIST table and verify that a row was added to the table for the users connection - query the table by USER_ID.
29. What version of apache are you running? If you are not sure, you can run the following command from your iAS_ORACLE_HOME/Apache/bin directory
httpd -v
30. What is your system profile option 'Self Service Personal Home Page Mode' set to?
31. What version of Forms are you on - 6.0.8.x - what is the 'x' version? Use Help > About Oracle Applications to find this version.
Or, from sqlplus you can run the following command:
For 6i forms:
!f60gen \? | grep Forms | grep Version | awk '{print $6}'
For 10g forms:
!frmbld \? | grep Forms | grep Version | awk '{print $6}'
32. What is the version of the following files:
ident $MWA_TOP/bin/MWADIS
mwacfg.lc
mwadis.oc
$MWA_TOP/admin/template/mwactl.sh
$MWA_TOP/admin/template/mwactl.cmd
$MWA_TOP/bin/mwactl.sh
$MWA_TOP/bin/mwactl.cmd
$MWA_TOP/oracle/apps/mwa/presentation/telnet/ProtocolHandler.class
$JAVA_TOP/oracle/apps/fnd/security/SessionManager.class
$JAVA_TOP/oracle/apps/mwa/container/ApplicationsObjectLibrary.class
33. In the mwa.cfg, what are the values of
mwa.DispatcherWorkerThreadCount
mwa.DispatcherClientsPerWorker
set to?
34. It is important that the file descriptors are set before starting MWA on a Unix Instance. Use the command "ulimit -n 1024" before using mwactl.sh to start MWA Server.
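Item 34's descriptor check can also be scripted. This is a minimal sketch, assuming a Unix host where Python's standard resource module is available. The helper names are hypothetical, and the limit reported is that of the process running the script, which matches the MWA server only if both are started from the same shell session.

```python
import resource

# Value used in Oracle's "ulimit -n 1024" suggestion.
REQUIRED_DESCRIPTORS = 1024


def soft_limit_ok(soft, required=REQUIRED_DESCRIPTORS):
    """True when a soft open-file limit is unlimited or at least `required`."""
    return soft == resource.RLIM_INFINITY or soft >= required


def current_soft_limit():
    """Return the soft open-file limit of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    limit = current_soft_limit()
    print("soft limit:", limit, "ok" if soft_limit_ok(limit) else "too low")
```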
Reference the following note for configuration tips and latest MWA patches: 269991.1 MWA Tips for Troubleshooting
Please make sure that you have the latest MWA patchset as noted in the 11.5.9 section of this note. The note also has sections on performance and configuration settings, so make sure to review those as well.
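Several items in the action plan (26 and 33 in particular) ask for values from mwa.cfg. The sketch below pulls the parameters named in the plan out of the file's text. It assumes plain key=value lines with # comments, which is an assumption about the file format, and the sample values are invented purely for illustration.

```python
# Parameter names taken from the action plan above.
KEYS_OF_INTEREST = (
    "mwa.LogLevel",
    "mwa.SystemLog",
    "mwa.logdir",
    "mwa.DispatcherWorkerThreadCount",
    "mwa.DispatcherClientsPerWorker",
)

# Hypothetical file contents; these are not values from the real environment.
SAMPLE_CFG = """\
# hypothetical values, for illustration only
mwa.LogLevel=perform
mwa.DispatcherWorkerThreadCount=4
mwa.DispatcherClientsPerWorker=10
"""


def parse_cfg(text):
    """Parse key=value lines, skipping blanks and lines starting with '#'."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def summarize(settings, keys=KEYS_OF_INTEREST):
    """Return the requested keys, with None for any that are not set."""
    return {key: settings.get(key) for key in keys}


if __name__ == "__main__":
    for key, value in summarize(parse_cfg(SAMPLE_CFG)).items():
        print(key, "=", value)
```

Reporting unset keys as None makes it obvious which answers are still missing before the questionnaire goes back to Support.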
The DBAs filled out this questionnaire. Oracle Support studied the logs we sent:
The system.log is still full of the broken pipe messages
[Fri Sep 21 04:05:34 EDT 2007] (Thread-3102) MWA_PH_DEVCFG_NOTFOUND: 192.168.1.1 configuration file is not found. Will use default configuration for now
[Fri Sep 21 04:05:34 EDT 2007] (Thread-3103) MWA_PH_DEVCFG_NOTFOUND: 192.168.2.1 configuration file is not found. Will use default configuration for now
[Fri Sep 21 04:05:35 EDT 2007] (Thread-3102) PH: User null got disconnected...
[Fri Sep 21 04:05:35 EDT 2007] (Thread-3102) PH: caught IOException
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java:336)
at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(StreamEncoder.java:404)
at sun.nio.cs.StreamEncoder$CharsetSE.implFlush(StreamEncoder.java:408)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:152)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:213)
at java.io.BufferedWriter.flush(BufferedWriter.java:230)
at oracle.apps.mwa.presentation.telnet.ProtocolHandler.enterHighlight(ProtocolHandler.java:1998)
at oracle.apps.mwa.presentation.telnet.ProtocolHandler.run(ProtocolHandler.java:731)
[Fri Sep 21 04:05:35 EDT 2007] (Thread-3103) PH: User null got disconnected...
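To see how often these messages occur across a whole system.log, a small tally helps. The sketch below counts the message types visible in the excerpt above and the client addresses named in the MWA_PH_DEVCFG_NOTFOUND lines. The line layout is inferred from this excerpt only, and the function name and counter labels are made up for the example.

```python
import re
from collections import Counter

# "[timestamp] (Thread-N) message", as seen in the excerpt above.
ENTRY = re.compile(
    r"^\[(?P<stamp>[^\]]+)\] \((?P<thread>Thread-\d+)\) (?P<message>.*)$"
)

SAMPLE = """\
[Fri Sep 21 04:05:34 EDT 2007] (Thread-3102) MWA_PH_DEVCFG_NOTFOUND: 192.168.1.1 configuration file is not found. Will use default configuration for now
[Fri Sep 21 04:05:34 EDT 2007] (Thread-3103) MWA_PH_DEVCFG_NOTFOUND: 192.168.2.1 configuration file is not found. Will use default configuration for now
[Fri Sep 21 04:05:35 EDT 2007] (Thread-3102) PH: User null got disconnected...
[Fri Sep 21 04:05:35 EDT 2007] (Thread-3102) PH: caught IOException
java.net.SocketException: Broken pipe
"""


def summarize_system_log(text):
    """Return (event counts, client address counts) for a system.log text."""
    counts = Counter()
    clients = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        # The exception line carries no timestamp prefix, so check it first.
        if "Broken pipe" in line:
            counts["broken_pipe"] += 1
            continue
        entry = ENTRY.match(line)
        if entry is None:
            continue  # stack frames and other continuation lines
        message = entry.group("message")
        if message.startswith("MWA_PH_DEVCFG_NOTFOUND:"):
            counts["devcfg_notfound"] += 1
            parts = message.split()
            if len(parts) > 1:
                clients[parts[1]] += 1
        elif "got disconnected" in message:
            counts["disconnected"] += 1
        elif "caught IOException" in message:
            counts["io_exception"] += 1
    return counts, clients


if __name__ == "__main__":
    events, addresses = summarize_system_log(SAMPLE)
    print(dict(events))
    print(dict(addresses))
```

If a couple of addresses account for nearly all of the connections, that points at something probing the port at regular intervals rather than at real handheld users.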
The inv.log is showing an error, but it is not on the same date as the error in the system.log so these don't appear to be related:
[Thu Sep 20 12:34:35 EDT 2007] (Thread-26) RCV: RcptGenFListner.nextItemExited 190 throwing exception
java.sql.SQLException: Exception while calling java.sql.SQLException: No more data to read from socket
at oracle.apps.inv.utilities.server.UtilFns.process(UtilFns.java:567)
at oracle.apps.inv.utilities.server.UtilFns.paramsProcessAPI(UtilFns.java:341)
at oracle.apps.inv.rcv.server.RcptGenFListener.nextItemExited(RcptGenFListener.java:5051)
at oracle.apps.inv.rcv.server.RcptGenFListener.doneButExited(RcptGenFListener.java:6001)
at oracle.apps.inv.rcv.server.RcptGenFListener.fieldExited(RcptGenFListener.java:421)
at oracle.apps.mwa.container.StateMachine.callListeners(StateMachine.java:1641)
at oracle.apps.mwa.container.StateMachine.handleEvent(StateMachine.java:526)
at oracle.apps.mwa.presentation.telnet.PresentationManager.handle(PresentationManager.java:690)
at oracle.apps.mwa.presentation.telnet.ProtocolHandler.run(ProtocolHandler.java:818)
[Thu Sep 20 12:37:45 EDT 2007] (Thread-26) RCV: RcvFListner - Could not clear globals
java.sql.SQLException: No more data to read from socket
at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:134)
at oracle.jdbc.dbaccess.DBError.throwSqlException(DBError.java:179)
at oracle.jdbc.dbaccess.DBError.check_error(DBError.java:1160)
at oracle.jdbc.ttc7.MAREngine.unmarshalUB1(MAREngine.java:961)
at oracle.jdbc.ttc7.MAREngine.unmarshalSB1(MAREngine.java:893)
at oracle.jdbc.ttc7.Oopen.receive(Oopen.java:109)
at oracle.jdbc.ttc7.TTC7Protocol.open(TTC7Protocol.java:584)
at oracle.jdbc.driver.OracleStatement.open(OracleStatement.java:584)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:2905)
at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:656)
at oracle.jdbc.driver.OraclePreparedStatement.execute(OraclePreparedStatement.java:734)
at oracle.apps.inv.rcv.server.RcvFListener.cancelButExited(RcvFListener.java:145)
at oracle.apps.inv.rcv.server.RcptInfoFListener.cancelButExited(RcptInfoFListener.java:235)
at oracle.apps.inv.rcv.server.RcptInfoFListener.fieldExited(RcptInfoFListener.java:92)
at oracle.apps.mwa.container.StateMachine.callListeners(StateMachine.java:1641)
at oracle.apps.mwa.container.StateMachine.handleEvent(StateMachine.java:526)
at oracle.apps.mwa.presentation.telnet.PresentationManager.handle(PresentationManager.java:690)
at oracle.apps.mwa.presentation.telnet.ProtocolHandler.run(ProtocolHandler.java:818)
The RcvFListener is an INV listener and is separate from the MWA listener.
There are no errors in the inv.log for Sept 21st.
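The "not on the same date" comparison can be repeated mechanically whenever fresh logs are uploaded. This is a sketch only: it assumes the bracketed timestamp layout seen in these excerpts, the function names are hypothetical, and it ignores the time zone field, so it compares calendar dates as printed.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}

# Matches e.g. "[Thu Sep 20 12:34:35 EDT 2007]" and captures month, day, year.
STAMP = re.compile(r"\[\w{3} (\w{3}) +(\d{1,2}) [\d:]+ \w+ (\d{4})\]")

SYSTEM_LOG = "[Fri Sep 21 04:05:35 EDT 2007] (Thread-3102) PH: caught IOException"
INV_LOG = (
    "[Thu Sep 20 12:34:35 EDT 2007] (Thread-26) RCV: RcptGenFListner.nextItemExited 190 throwing exception\n"
    "[Thu Sep 20 12:37:45 EDT 2007] (Thread-26) RCV: RcvFListner - Could not clear globals"
)


def log_dates(text):
    """Return the set of calendar dates that appear in bracketed timestamps."""
    found = set()
    for month, day, year in STAMP.findall(text):
        if month in MONTHS:
            found.add(date(int(year), MONTHS[month], int(day)))
    return found


def shared_dates(first, second):
    """Dates on which both logs contain at least one timestamped entry."""
    return log_dates(first) & log_dates(second)


if __name__ == "__main__":
    print(sorted(shared_dates(SYSTEM_LOG, INV_LOG)))
```

An empty result for the two excerpts matches Support's reading that the inv.log errors and the broken pipe messages are unrelated.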
UPDATE
======
We are still seeing the Broken Pipe error in the system.log, which still suggests there is either a network or a Java problem.
The problem turned out to be that they had switched on the Dispatcher service even though they were behind a BigIP load balancer. Once they shut down the Dispatcher service, the error disappeared.