Recently, at a customer site, I faced an issue where the WebLogic Admin Server crashed because it reached its Java Virtual Machine (JVM) heap limit. Xms and Xmx were both set to 1 GB. The expected behavior is that Node Manager detects the failure and restarts the Admin Server automatically, as you can see in the log:


The WebLogic Server encountered a critical failure
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3181)
at java.util.ArrayList.grow(ArrayList.java:265)
at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:23)
at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:21)
at java.util.ArrayList.add(ArrayList.java:462)
at garbage.Memory.<init>(Memory.java:7)
at garbage.Heap.<init>(Heap.java:18)
...
at weblogic.servlet.internal.ServletContext$ServletInvocationAction.run(ServletContext.java:1234)
Reason: There is a panic condition in the server. The server is configured to exit on panic
***************************************************************************
<Jun 17, 2022 8:52:13 AM CEST> <INFO> <NodeManager> <The server 'AdminServer' with process id 10181 is no longer alive; waiting for the process to die.>
<Jun 17, 2022 8:52:13 AM CEST> <FINEST> <NodeManager> <Process died.>
<Jun 17, 2022 8:52:13 AM CEST> <FINEST> <NodeManager> <get latest startup configuration before deciding/trying to restart the server>

Basic License

In the meantime, Pascal Brand ran the script provided in the Oracle Note "WebLogic Server Basic License Feature Usage Measurement Script" (Doc ID 885587.1).

One of the items that was not compliant was the panic action setting:

Checking server mode and overload actions:
Feature usage measurement error: Server AdminServer has an incorrect overload protection panic action: system-exit
WebLogic Server has features for detecting, avoiding, and recovering from overload conditions. WebLogic Server's overload protection features help prevent the negative consequences - degraded application performance and stability - that can result from continuing to accept requests when the system capacity is reached.
In the license for WebLogic Server Basic, the configuration of any overload protection scheme at either a cluster or server level is not permitted.
See the documentation: http://download.oracle.com/docs/cd/E12839_01/web.1111/e13701/overload.htm
1 error(s) detected

So, for our customer, we are not allowed to keep that option enabled, even though it was convenient in this scenario. This means that whenever an OOM happens, Node Manager will detect it but will not recover the Admin Server. Note that OOM is not the only event that triggers the panic action.

Solution

Part 1 – Disable Panic Action

To comply with the Basic license, we will disable the panic action. Simply log in to the WebLogic console, go to Servers, click AdminServer, and finally open the Overload tab.

Needless to say, you must “lock and edit” to change that option. Complete the change with a server restart.

If you have many servers (the setting must be applied to all managed servers as well), you can also do this through the REST API. This is faster and avoids errors and repetitive tasks. Here is the curl command:

# Set the panic action of AdminServer to "no-action" (the body must be valid JSON)
curl -H "X-Requested-By: MyClient" -H "Accept: application/json" \
-u "${WL_USER}:${WL_ADMPWD}" -H "Content-Type: application/json" \
-d '{"panicAction": "no-action"}' \
http://vm01:7001/management/weblogic/latest/edit/servers/AdminServer/overloadProtection
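
For several servers, the same call can be wrapped in a small loop. The following is only a sketch under assumptions: the server list, the DRY_RUN switch and the function names are hypothetical, and the explicit startEdit/activate calls on the changeManager resource assume that your WebLogic version wants configuration changes wrapped in an edit session (the REST counterpart of “lock and edit”). By default it only prints the requests it would send.

```shell
#!/bin/sh
# Hypothetical defaults: adjust the URL and the server list to your domain.
ADMIN_URL="${ADMIN_URL:-http://vm01:7001}"
SERVERS="${SERVERS:-AdminServer}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the requests, 0 = really send them

# POST a JSON body to a path below the REST edit tree.
wls_post() {
  path="$1"
  body="$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "POST ${ADMIN_URL}/management/weblogic/latest/edit/${path} ${body}"
  else
    curl -s -f -u "${WL_USER}:${WL_ADMPWD}" \
      -H "X-Requested-By: MyClient" \
      -H "Accept: application/json" \
      -H "Content-Type: application/json" \
      -d "${body}" \
      "${ADMIN_URL}/management/weblogic/latest/edit/${path}"
  fi
}

# Open an edit session, set panicAction on every server, then activate.
set_panic_action_all() {
  wls_post "changeManager/startEdit" '{}' || return 1
  for server in $SERVERS; do
    wls_post "servers/${server}/overloadProtection" \
      '{"panicAction": "no-action"}' || return 1
  done
  wls_post "changeManager/activate" '{}'
}
```

To apply it for real, set SERVERS to a space-separated list of your server names, set DRY_RUN=0 and call set_panic_action_all. Reviewing the dry-run output first is a cheap way to check the URLs before touching the domain configuration.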

Part 2 – Enable JVM Flag

One solution, among many, is to use a JVM parameter. The option I wanted to test is:

-XX:+CrashOnOutOfMemoryError

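If this option is enabled, the JVM crashes when an out-of-memory error occurs and produces text and binary crash files. Because the process really dies, Node Manager should be able to restart it as it does for any crashed server. This is even better than before, as it gives us some clues about what used so much memory in the JVM.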

For testing purposes, I used a WAR file that eats all the JVM memory until we see an OOM error. Once I had added this flag to “setUserOverrides.sh” in the $DOMAIN_HOME/bin folder and restarted the Admin Server, we were good to go.
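
As an illustration, the addition to setUserOverrides.sh could look like the fragment below. This is a sketch, not the exact file used here: the SERVER_NAME guard is my assumption (it relies on the start scripts having set SERVER_NAME before this file is sourced) and can be dropped if every server in the domain should get the flag.

```shell
# $DOMAIN_HOME/bin/setUserOverrides.sh
# Assumption: SERVER_NAME is already set by the WebLogic start scripts
# when this file is sourced; the guard limits the flag to the Admin Server.
if [ "${SERVER_NAME}" = "AdminServer" ]; then
  # Append to the existing options instead of overwriting them.
  JAVA_OPTIONS="${JAVA_OPTIONS} -XX:+CrashOnOutOfMemoryError"
  export JAVA_OPTIONS
fi
```

After the restart, the flag should be visible on the java command line of the server process, which is an easy way to confirm that the override was picked up.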

Let’s call the application and … bingo! Here is what we see in the log:

<Jun 17, 2022 12:24:29,963 PM CEST> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to RUNNING.>
Aborting due to java.lang.OutOfMemoryError: Java heap space
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (debug.cpp:308), pid=6366, tid=0x00007f2a7fda8700
#  fatal error: OutOfMemory encountered: Java heap space
#
# JRE version: Java(TM) SE Runtime Environment (8.0_202-b08) (build 1.8.0_202-b08)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.202-b08 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /u01/app/oracle/config/domains/oid_domain/hs_err_pid6366.log
[thread 139820755035904 also had an error]
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
/u01/app/oracle/config/domains/oid_domain/bin/startWebLogic.sh: line 205:  6366 Aborted

The error log is different from before, and we can immediately see that a report file has been generated (hs_err_pid6366.log). We can analyze this file to determine the root cause of the OOM error. Of course, in my case, the purpose of the application was to cause it, so it does not really make sense to analyze it.
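
As a first look at such a report, a few header lines are usually enough to confirm what killed the JVM. The helper below is a hypothetical sketch (the function name is mine); it assumes the report starts with the same '#'-prefixed header that was printed in the server log above.

```shell
# Print the header lines of a HotSpot error report that matter first:
# the failure reason and the JRE/VM versions.
# Usage: hs_err_summary /path/to/hs_err_pidNNNN.log
hs_err_summary() {
  file="$1"
  if [ ! -r "$file" ]; then
    echo "cannot read: $file" >&2
    return 1
  fi
  # These lines are '#'-prefixed in the header of the report.
  grep -E '^# +(fatal error:|Internal Error|JRE version:|Java VM:)' "$file"
}
```

For the crash shown above, it would be called as hs_err_summary /u01/app/oracle/config/domains/oid_domain/hs_err_pid6366.log. The rest of the report (threads, heap summary, loaded libraries) is where the actual investigation would continue.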

Next Steps

The WLST script from the Oracle note (named wls-basic-measurement.py) is a very convenient way to check whether a WebLogic domain complies with the WebLogic Server Basic license. This has never been an easy task for any Oracle product, and we can thank Oracle for providing it.

You should give it a try to see whether you comply with your Oracle WebLogic Basic license. Running it is as simple as this (once the environment has been set with setDomainEnv.sh):

java weblogic.WLST wls-basic-measurement.py \
username 'weblogic' password 'Password' \
url 't3://hostname:7002' output 'wlsreport.txt'