Cougaar Memory Profiler |
|
NOTE:
As of January 2007, the Cougaar memory profiler should be considered deprecated.
The Cougaar memory profiler was written to overcome limitations in Sun's JVMPI implementation, as used by hprof,
OptimizeIt(tm),
JProbe(tm),
etc. Most of these problems have been fixed in Java 1.6.
Instead of using the Cougaar memory profiler, we recommend that developers first
try Sun's new, free `jhat` profiler:
http://java.sun.com/javase/6/docs/technotes/tools/index.html#troubleshoot
See the above link's Developer's Guide for notes on how to use jhat and interpret the results.
For future reference, there may be interest in extending jhat to better
identify non-leak memory problems and potential solutions. For example, a developer
might want to estimate the potential memory savings of adding caching (e.g. intern
the most common Strings) or changing a data structure (e.g. change an "Integer" field
to a primitive "int"). These "identify potential memory saving" analysis methods would
best be implemented as enhancements to the post-processing jhat dump reader:
j2se/src/share/classes/com/sun/tools/hat/Main.java
The modified code could be a new Cougaar project (e.g. "jhat_extensions")
and offered back to Sun.
The remainder of this page describes the Cougaar memory profiler, which still works but
is more difficult to configure and use than jhat. Some of the concepts
may be of interest for use in future projects, e.g. bytecode instrumentation and weak
reference tables.
The Cougaar memory profiler is a tool for debugging memory usage
and leaks in any Java application. It features a scalable 100%
Java design that is lighter weight than existing
JVMPI-based profilers
(hprof,
OptimizeIt(tm),
JProbe(tm),
etc).
Memory leaks can arise in Java applications due to unintentional
object references. For example, an application may fail to remove
callback listeners or close streams, preventing these objects from
being garbage collected. Even minor leaks can grow to major
problems in deployed systems.
The profiler tracks memory usage within the application by using
tables of
WeakReferences.
The developer selects which classes should
be tracked and runs an automated classfile editor (using
BCEL) to
add profiling instructions to the constructors. The profiler
maintains pointers to the live instances and can display useful
debugging information:
An interactive servlet-based UI is included that displays all the
above profiling data. A servlet-based UI was developed instead of
a Swing UI since it matched the Cougaar Agent Architecture's
embedded servlet server. It'd be straight-forward to port
the servlet to Swing or create a JEditorPane/HTMLEditorKit adaptor.
The servlet has minimal Cougaar dependencies and can be modified
to run in any servlet server.
The developer controls the profiling overhead by selecting which
classes to profile and reducing the level of profiling (e.g.
turning off allocation stacktrace data). The runtime overhead is
less than JVMPI tools, partially because the JVMPI tools must
maintain similar heap tables (see the hprof source in
There are a couple noteworthy limitations of this approach vs JVMPI:
The main
project page has the latest release. The download contains
a "run" script that runs a simple example application.
Please send questions and comments to the
cougaar-developers@cougaar.org mailing list.
Here's an example HTML snapshot:
The above snapshot shows:
Here's a key for each table row:
For example, 1% of String allocations were tracked. Currently 56
of 28042 tracked Strings are live, so we can estimate that 5600
are live out of 2804200 allocated (0.2%). Each String costs 24
bytes excluding the char[] data:
If we click on the "java.lang.String" row a new page is generated:
This simply repeats the above data and allows the user to submit
a query on instances of "java.lang.String". Here we select
a decreasing view by "length()" and see:
In this run we disabled "java.*" allocation timestamps and
stacktraces, to minimize the performance overhead, but we can see
the "toString()" and other useful information. In this example
shows the largest 20 of all 30 Strings (of the 1% we tracked).
Additional snapshots and notes can be found here.
The release contains a "run" script that will run a simple example
application. A large HashMap is allocated, serialized, and
deserialized, and at each point the profiler is queried to
track this map.
BCEL 5.1 is required
to modify the classes but not required when running the profiler.
The "profiler_transform/bin/transform_jar" script can be used to
instrument complete jars (e.g. "lib/core.jar", "sys/tomcat_40.jar",
etc) or JDK packages (e.g. "java.lang", "java.util", etc).
For options run:
The "transform_jar" script allows the developer to specify an
"OptionsFactory" class that will define the profiling level at
runtime, e.g.:
The recommended usage is to specify a custom implementation that
you'll modify between runs. This will allow you to run the
"transform_jar" script on your jars only once (to instrument
the bytecode and specify your OptionsFactory) and in subsequent
runs you'll only need to change your OptionsFactory to change the
profiling config.
Here's an example:
This should be compiled, jar'ed (e.g. "profiler_config.jar"), and
placed at the head of the VM's bootclasspath along with the
"profiler_impl.jar" and other VM jar overrides, e.g.:
Custom "OptionsFactory" classes should minimize imports to avoid
VM loading errors. For example, a call to "new HashSet()" will
fail if HashSet is profiled, due to a stack overflow caused by
the circular reference. Similarily, if all of "java.lang" will
be profiled, then Strings should be carefully handled to avoid
string allocations, including any calls to "System.out"!
The class modifier can also be used within a ClassLoader at class
definition time. The "SelfProfiler" defines a standard BCEL
ClassLoader method:
Here are usage notes specific to ACME, which is Cougaar's automation
scripting environment (soon to be open-sourced). This is primarily
of interest to
Cougaar and
Ultra*Log developers.
Download the
latest profiler install and unzip it into your $CIP. It'll
overwrite:
To verify the basic install, run the
"$CIP/run" script. It should generate
output similar to "$CIP/run.log". After
this test, delete the temporary files:
First you must create an "OptionsFactory" Java class to control the
level of profiling. An example is
shown above named "foo.MyConfig". A modified zip containing
the source, jar, and compile script can be found
here. This implementation will enable minimal profiling for
all classes:
Next you must transform the jars that you will profile. You can
transform all the jars in sys and lib using:
At this point your $CIP/lib should look
something like:
Next you must modify your:
You should also modify the above "bootclasspath.rule" to uncomment
all the jars (e.g. uncomment the "javarmipatch", etc).
Lastly, you can optionally enable the "wget" action:
The config is now ready to run. You can run any "MOAS"
script and it'll profile the classes as defined in your
"foo.MyConfig" class, which in this case will track 1% of the instance
allocations. This is fine for the first run, since it'll provide a
rough idea of where to focus.
You should now have "/profiler" HTML snapshots to analyze. See the
above
snapshots and notes
for details on the "/profiler" servlet output and example analysis.
To enable offline javascript-based column sorting in the snapshots
you must save
this link
as "tablesort.js" and run the following script on "*.html":
Based upon your analysis, you may choose to:
The bytecode modification is done by the
SelfProfiler
class.
The basic idea is to transform a class like:
The full transform is more complex, since it must call call the
profiler in all constructors, "clone()", and "readObject()". The
class must also be tagged with the original serialVersionUID to
preserve serialization compatibility.
The "run()" method outlines the basic steps:
The note that "the default constructor always exists" refers to
javac's creation of a default constructor if one is not specified,
e.g.:
"clone" creates an object through "super.clone()", which doesn't call
the constructor. The result of "super.clone()" is tracked. If clone
doesn't call "super.clone()" then it presumable returns either an
existing object or uses a constructor to create a new instance.
"readObject" creates the object without calling the constructor.
Here's an example test-case:
Externalizable will call the zero-arg constructor. If there's no zero-arg
constructor then it'll throw a runtime InvalidClassException. Note that
non-static inner classes always pass the outer class as an implicit first
parameter, so they can't be Externalizable.
The profiler can't track primitive arrays since they're not created
through a class constructor. A possible workaround is to track
arrays created by a profiled class by tracking the result of
"newarray" bytecodes. This is different than the profiler's normal
operation, since arrays constructed by non-profiled classes won't
be tracked, as opposed to the current behavior where profiled classes
are tracked regardless of which code constructed them. A synthetic
"java.lang.Array" class would be ideal but would require VM changes.
A possible enhancement is to create a transform that tracks "new"
bytecodes to find allocations by a given class, e.g. track
all objects allocated by Xerces classes.
Overview
For example, find the largest HashMaps, ten newest "com.Foo",
oldest "com.Foo", etc.
For example, what's the most common allocation point for
ArrayLists, which are the most common "equals" values (for
potential caching), etc.
$JDK/j2se/src/share/tools/hprof for details).
We've found the JVMI overhead to be prohibitively expensive when
debugging large applications. Long-term
JSR-163
(Java(tm) Platform Profiling Architecture)
may address some of these performance issues, however the draft
specification suggests that bytecode-based object profiling may
be the recommended approach.
The advantages are noted above: scalability and Java-based APIs.
Also, since the profiler is 100% Java, it's easier to extend and
can be used from within the application itself.
Constant strings in class/interface APIs are also hidden within
the VM, as are the Classes themselves and some dynamic proxies.
Some odd JDK-internal classes may also be excluded (e.g.
"sun.misc.Ref").
Snapshots
Off to the right, out of view in the snapshot, are columns for:
1/sample1/sample
(elementData == null ? 0 : 12 + 4*elementData.length)
(elementData == null ? 0 : elementData.length)
header + fields =
header + "char[] value" + "int offset" + "int count" + "int hash" =
8 + 4 + 4 + 4 + 4 =
24
So there's about 24*5600 = 134400 bytes in Strings, excluding the
"char[] value". The "+Capacity Bytes" is 474800, so the sum of
all the "char[]" headers (12 bytes each) and chars
(2 bytes each) is 474800-134400 = 340400 bytes. This is an
estimate, since some of the Strings may point to the same "char[]"
(e.g. substrings and StringBuffers). The max "length()" of a String
is currently 49, and the max "value.length" is 98, where the
discrepency is likely due to substrings.
Usage
./profiler_impl/bin/transform_jar --help
./profiler_impl/bin/transform_jar --config=foo.MyConfig xerces
The default implementation is
DefaultOptionsFactory, which enables full profiling for all
classes. The class must contain a method matching this signature:
public static final Options getOptions(
String module,
String classname) {..}
package foo;
import org.cougaar.profiler.Options;
public class MyConfig {
/**
* Option that will track 100% of allocations,
* record allocation timestamp,
* record allocation stacktrace,
* record size/capacity/context.
* <p>
* This is the best detail but the most overhead.
*/
private static final Options FULL_DETAIL =
new Options(
true, // time
true, // stack
true, // size
true, // capacity
true, // context
1.00 // sampleRatio
);
/**
* Track only 1% of allocations,
* record only the instance itself,
* allow size/capacity.
* <p>
* This is very low overhead but only samples a small
* subset of the instances. This is ideal for the
* initial profiling runs, to get a rough idea of the
* allocation counts.
*/
private static final Options MIN_DETAIL =
new Options(
false, // time
false, // stack
true, // size
true, // capacity
false, // context
0.01 // sampleRatio
);
/*
* The profiled class will call this method to get its options.
*
* @param module optional module name, e.g. "xerces" or null
* @param classname non-null class name, e.g.
* "org.apache.xerces.dom.DocumentImpl$LEntry"
* @return profiling options for the class, or null if the
* class shouldn't be tracked.
*/
public static final Options getOptions(
String module,
String classname) {
if ("core".equals(module)) {
// we're interested in all classes in "lib/core.jar".
// we want to track every allocation, timestamp, and stack
return FULL_DETAIL;
} else if (classname.startsWith("java.")) {
// we're a little interested in "java.*", but
// not enough to suffer a big overhead. This is a good
// "scatter-shot" approach.
return MIN_DETAIL;
} else {
// everything else will be ignored
return null;
}
}
}
#!/bin/sh
..
BOOTCLASSPATH="\
-Xbootclasspath/p:$COUGAAR_INSTALL_PATH/lib/profiler_config.jar\
-Xbootclasspath/p:$COUGAAR_INSTALL_PATH/lib/profiler_impl.jar\
-Xbootclasspath/p:$COUGAAR_INSTALL_PATH/lib/javalangpatch.jar\
-Xbootclasspath/p:$COUGAAR_INSTALL_PATH/lib/javaiopatch.jar"
..
java $BOOTCLASSPATH ..
For details see the example "run" script included in the release.
public JavaClass modifyClass(JavaClass clazz) {..}
See BCEL's "org.apache.bcel.util.ClassLoader" for details. This
approach would avoid the "transform_jar" script but would add
runtime overhead.
ACME Usage
$CIP/build/data/default.build
to specify the compile dependencies of UL jars (e.g. "ar_mic.jar").
rm -rf $CIP/tmp
rm $CIP/lib/javautilpatch*
Also delete the example jar and config, which you won't need for testing:
rm $CIP/lib/profiler_examples.jar
rm $CIP/lib/profiler_config.jar
Track 1% of instance allocations at minimal detail
You may want to modify the "profiler_config/src/foo/MyConfig.java"
to profile more/less classes or capture more/less detail; see the above
usage section
for details.
#!/bin/sh
CONFIG="--config=foo.MyConfig"
cd $CIP || exit -1
# profile everything! This may take a while...
#
# The following errors are expected:
# icu4j: two class files are corrupt
# jena: missing jars (sesame, antlr, sleepycat)
# juddi: missing jars (axis)
# log4j: missing jars (jms, jmx)
# openmap: missing jars (j3d)
# tomcat_40: missing jars (jmx)
# vishnuServer: missing jars (vishnu, util)
# These errors and related NPEs can be ignored, since the classes
# that depend upon these missing jars will never be loaded at
# runtime.
JAVA_PKGS="applet awt beans io lang math net nio rmi security sql text util"
for x in $JAVA_PKGS; do ./transform_jar $CONFIG java.$x; done
for x in $CIP/lib/*.jar; do ./transform_jar $CONFIG $x; done
for x in $CIP/sys/*.jar; do ./transform_jar $CONFIG $x; done
This may take up to an hour on a slow machine (~500mhz). This is a
one-time step that is only required when/if you upgrade your jars.
After the transform you can delete the temporary files in:
rm -rf $CIP/tmp
In a profiling session you'll only need to modify the
"profiler_config" source and recompile the tiny
"$CIP/lib/profiler_config.jar"
to change the profiling level.
aggagent.jar -> /mnt/shared/socC/lib/aggagent.jar.new
aggagent.jar.new
aggagent.jar.old
...
The "*.jar.new" contains the instrumented classes and is signed
by the operator "sign" script.
$CIP/csmart/config/rules/isat/tic_env.rule
by commenting out line 55 (the Xbootclaspath). This allows ACME to
use the profiler's VM "-Xbootclasspath" rule:
$CIP/csmart/config/rules/isat/bootclasspath.rule
I'll submit an ISAT bug to split this create the basic
"bootclasspath.rule" in the CSMART module, so in the future you
won't need to modify your "tic_env.rule".
$CIP/csmart/acme_scripting/src/lib/cougaar/wget.rb
by modifying the very end of:
$CIP/csmart/acme_scripting/src/lib/cougaar/scripting.rb
to append:
require 'cougaar/wget'
Then add "wget" actions in your "MOAS" script, e.g. just before
stage-2 add:
do_action 'WGetAllNodes', '/profiler', 'profiler_beforeStage2'
to capture "/profiler" servlet snapshots to the archives. I'll
submit an ISAT bug to move this action into the CSMART module.
#!/bin/sh
# replace second line:
# <script LANGUAGE="JavaScript" SRC="/$~/profiler?action=script"></script>
# with:
# <script LANGUAGE="JavaScript" SRC="tablesort.js"></script>
for x in $*; do
cat <<"EOF" | ed $x
2c
<script LANGUAGE="JavaScript" SRC="tablesort.js"></script>
.
w
EOF
done
Design Details
public class Bar extends Foo {
public Bar() {
System.out.println("Hello, world!");
}
}
into:
import foo.MyConfig; // your config class
import org.cougaar.profiler.MemoryTracker;
public class Bar extends Foo {
public Bar() {
System.out.println("Hello, world!");
MY_PROFILER.add(this);
}
private static final MemoryTracker MY_PROFILER =
MemoryTracker.getInstance(
"Bar", // classname
8, // object header + fields size
MyConfig.getOptions("the_module", "Bar"));
}
private void run() {
// make sure the serialVersionUID field exists, so we don't
// break serialization
addSerialVersionUID();
if (ENABLE_SIZE_AND_CAPACITY) {
// add "$get_size()" and "$get_capacity()" methods
defineSize();
defineCapacity();
}
// add static profiler field
defineProfilerField();
// init profiler field in class init method
initProfilerField();
// override "profile_<super>" method to disable super's profiler
disableSuperProfiler();
// add our "profile_<class>" method
addClassProfiler();
// update constructors with "profile_<class>" calls
// (excluding constructors that call "this(..)")
//
// note that the default constructor always exists
callProfilerInConstructors();
// catch "readObject", which is a hidden constructor.
//
// we can ignore "readExternal" since it calls the no-arg
// constructor.
callProfilerInReadObject();
// handle "clone", which is yet another hidden constructor.
callProfilerInClone();
}
class Test {}
Javac will create a "public Test() { super(); }"
constructor. The VM requires at least one constructor for any class
that will be instantiated, so the bytecode instrumentor doesn't need
to add constructors.
import java.io.*;
class Test implements Serializable {
public static void main(String[] args) throws Exception {
Test t1 = new Test();
System.out.println("t1: "+t1);
Test t2 = (Test) testSerial(t1);
System.out.println("t2: "+t2);
}
public Test() {
System.out.println("constructor");
}
private void readObject(ObjectInputStream s) throws Exception {
System.out.println("readObject");
s.defaultReadObject();
}
private static Object testSerial(Object o) throws Exception {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream os = new ObjectOutputStream(bos);
os.writeObject(o);
os.flush();
o = null;
byte[] b = bos.toByteArray();
ByteArrayInputStream bis = new ByteArrayInputStream(b);
ObjectInputStream is = new ObjectInputStream(bis);
Object newO = is.readObject();
return newO;
}
}
It generates output similar to:
constructor
t1: Main@1a16869
readObject
t2: Main@19efb05
ObjectInputStream uses ObjectStreamClass "newInstance()", which uses:
import sun.reflect.ReflectionFactory;
private static final ReflectionFactory reflFactory = ..;
..
Constructor cons = reflFactory.newConstructorForSerialization(cl, cons);
..
return cons.newInstance(..);
From there it's muddy, since we're in Sun's private code, but the end
result is that "readObject" never calls the constructor.