blob: 80211456ec938775d279dcbf5f9935ce222508fd [file] [log] [blame]
<title>Dalvik VM Debug Monitor</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="" type="image/x-icon"
rel="shortcut icon">
<link href="../android.css" type="text/css" rel="stylesheet">
<script language="JavaScript1.2" type="text/javascript">
function highlight(name) {
if (document.getElementById) {
tags = [ 'span', 'div', 'tr', 'td' ];
for (i in tags) {
elements = document.getElementsByTagName(tags[i]);
if (elements) {
for (j = 0; j < elements.length; j++) {
elementName = elements[j].getAttribute("id");
if (elementName == name) {
elements[j].style.backgroundColor = "#C0F0C0";
} else if (elementName && elementName.indexOf("rev") == 0) {
elements[j].style.backgroundColor = "#FFFFFF";
<body onload="prettyPrint()">
<h1><a name="My_Project_"></a>Dalvik VM<br>Debug Monitor</h1>
<!-- Status is one of: Draft, Current, Needs Update, Obsolete -->
<p style="text-align:center"><strong>Status:</strong><em>Draft</em> &nbsp;
<small>(as of March 6, 2007)</small></p>
<!-- last modified date can be different to the "Status date." It automatically
whenever the file is modified. -->
<!-- this script automatically sets the modified date,you don't need to modify
it -->
<script type=text/javascript>
var lm = new Date(document.lastModified);
<p>It's extremely useful to be able to monitor the live state of the
VM. For Android, we need to monitor multiple VMs running on a device
connected through USB or a wireless network connection. This document
describes a debug monitor server that interacts with multiple VMs, and
an API that VMs and applications can use to provide information
to the monitor.
<p>Some things we can monitor with the Dalvik Debug Monitor ("DDM"):
<li> Thread states. Track thread creation/exit, busy/idle status.
<li> Overall heap status, useful for a heap bitmap display or
fragmentation analysis.
<p>It is possible for something other than a VM to act as a DDM client, but
that is a secondary goal. Examples include "logcat" log extraction
and system monitors for virtual memory usage and load average.
<p>It's also possible for the DDM server to be run on the device, with
the information presented through the device UI. However, the initial goal
is to provide a display tool that takes advantage of desktop tools and
screen real estate.
<p>This work is necessary because we are unable to use standard JVMTI-based
tools with Dalvik. JVMTI relies on bytecode insertion, which is not
currently possible because Dalvik doesn't support Java bytecode.
<p>The DDM server is written in the Java programming language
for portability. It uses a desktop
UI toolkit (SWT) for its interface.
<p>To take advantage of existing infrastructure we are piggy-backing the
DDM protocol on top of JDWP (the Java Debug Wire Protocol, normally spoken
between a VM and a debugger). To a
non-DDM client, the DDM server just looks like a debugger.
<p>The JDWP protocol is very close to what we want to use. In particular:
<li>It explicitly allows for vendor-defined packets, so there is no
need to "bend" the JDWP spec.
<li>Events may be posted from the VM at arbitrary points. Such
events do not elicit a response from the debugger, meaning the client
can post data and immediately resume work without worrying about the
eventual response.
<li>The basic protocol is stateless and asynchronous. Request packets
from the debugger side include a serial number, which the VM includes
in the response packet. This allows multiple simultaneous
conversations, which means the DDM traffic can be interleaved with
debugger traffic.
<p>There are a few issues with using JDWP for our purposes:
<li>The VM only expects one connection from a debugger, so you couldn't
attach the monitor and a debugger at the same time. This will be
worked around by connecting the debugger to the monitor and passing the
traffic through. (We're already doing the pass-through with "jdwpspy";
requires some management of our request IDs though.) This should
be more convenient than the current "guess the port
number" system when we're attached to a device.
<li>The VM behaves differently when a debugger is attached. It will
run more slowly, and any objects passed to the monitor or debugger are
immune to GC. We can work around this by not enabling the slow path
until non-DDM traffic is observed. We also want to have a "debugger
has connected/disconnected" message that allows the VM to release
debugger-related resources without dropping the net connection.
<li>Non-DDM VMs should not freak out when DDM connects. There are
no guarantees here for 3rd-party VMs (e.g. a certain mainstream VM,
which crashes instantly), but our older JamVM can be
configured to reject the "hello" packet.
<h3>Connection Establishment</h3>
<p>There are two basic approaches: have the server contact the VMs, and
have the VMs contact the server. The former is less "precise" than the
latter, because you have to scan for the clients, but it has some
<p>There are three interesting scenarios:
<li>The DDM server is started, then the USB-attached device is booted
or the simulator is launched.
<li>The device or simulator is already running when the DDM server
is started.
<li>The DDM server is running when an already-started device is
attached to USB.
<p>If we have the VMs connect to the DDM server on startup, we only handle
case #1. If the DDM server scans for VMs when it starts, we only handle
case #2. Neither handles case #3, which is probably the most important
of the bunch as the device matures.
<p>The plan is to have a drop-down menu with two entries,
"scan workstation" and "scan device".
The former causes the DDM server to search for VMs on "localhost", the
latter causes it to search for VMs on the other side of an ADB connection.
The DDM server will scan for VMs every few seconds, either checking a
range of known VM ports (e.g. 8000-8040) or interacting with some sort
of process database on the device. Changing modes causes all existing
connections to be dropped.
<p>When the DDM server first starts, it will try to execute "adb usb"
to ensure that the ADB server is running. (Note it will be necessary
to launch the DDM server from a shell with "adb" in the path.) If this
fails, talking to the device will still be possible so long as the ADB
daemon is already running.
<h4>Connecting a Debugger</h4>
<p>With the DDM server sitting on the JDWP port of all VMs, it will be
necessary to connect the debugger through the DDM server. Each VM being
debugged will have a separate port being listened to by the DDM server,
allowing you to connect a debugger to one or more VMs simultaneously.
<p>In the common case, however, the developer will only want to debug
a single VM. One port (say 8700) will be listened to by the DDM server,
and anything connecting to it will be connected to the "current VM"
(selected in the UI). This should allow developers to focus on a
single application, which may otherwise shift around in the ordering, without
having to adjust their IDE settings to a different port every time they
restart the device.
<h3>Packet Format</h3>
<p>Information is sent in chunks. Each chunk starts with:
u4 type
u4 length
and contains a variable amount of type-specific data.
Unrecognized types cause an empty response from the client and
are quietly ignored by the server. [Should probably return an error;
need an "error" chunk type and a handler on the server side.]
<p>The same chunk type may have different meanings when sent in different
directions. For example, the same type may be used for both a query and
a response to the query. For sanity the type must always be used in
related transactions.
<p>This is somewhat redundant with the JDWP framing, which includes a
4-byte length and a two-byte type code ("command set" and "command"; a
range of command set values is designated for "vendor-defined commands
and extensions"). Using the chunk format allows us to remain independent
of the underlying transport, avoids intrusive integration
with JDWP client code, and provides a way to send multiple chunks in a
single transmission unit. [I'm taking the multi-chunk packets into
account in the design, but do not plan to implement them unless the need
<p>Because we may be sending data over a slow USB link, the chunks may be
compressed. Compressed chunks are written as a chunk type that
indicates the compression, followed by the compressed length, followed
by the original chunk type and the uncompressed length. For zlib's deflate
algorithm, the chunk type is "ZLIB".
<p>Following the JDWP model, packets sent from the server to the client
are always acknowledged, but packets sent from client to server never are.
The JDWP error code field is always set to "no error"; failure responses
from specific requests must be encoded into the DDM messages.
<p>In what follows "u4" is an unsigned 32-bit value and "u1" is an
unsigned 8-bit value. Values are written in big-endian order to match
<h3>Initial Handshake</h3>
<p>After the JDWP handshake, the server sends a HELO chunk to the client.
If the client's JDWP layer rejects it, the server assumes that the client
is not a DDM-aware VM, and does not send it any further DDM queries.
<p>On the client side, upon seeing a HELO it can know that a DDM server
is attached and prepare accordingly. The VM should not assume that a
debugger is attached until a non-DDM packet arrives.
<h4>Chunk HELO (server --&gt; client)</h4>
<p>Basic "hello" message.
u4 DDM server protocol version
<h4>Chunk HELO (client --&gt; server, reply only)</h4>
Information about the client. Must be sent in response to the HELO message.
u4 DDM client protocol version
u4 pid
u4 VM ident string len (in 16-bit units)
u4 application name len (in 16-bit units)
var VM ident string (UTF-16)
var application name (UTF-16)
<p>If the client does not wish to speak to the DDM server, it should respond
with a JDWP error packet. This is the same behavior you'd get from a VM
that doesn't support DDM.
<h3>Debugger Management</h3>
<p>VMs usually prepare for debugging when a JDWP connection is established,
and release debugger-related resources when the connection drops. We want
to open the JDWP connection early and hold it open after the debugger
<p>The VM can tell when a debugger attaches, because it will start seeing
non-DDM JDWP traffic, but it can't identify the disconnect. For this reason,
we need to send a packet to the client when the debugger disconnects.
<p>If the DDM server is talking to a non-DDM-aware client, it will be
necessary to drop and re-establish the connection when the debugger goes away.
(This also works with DDM-aware clients; this packet is an optimization.)
<h4>Chunk DBGD (server --&gt; client)</h4>
<p>Debugger has disconnected. The client responds with a DBGD to acknowledge
receipt. No data in request, no response required.
<h3>VM Info</h3>
<p>Update the server's info about the client.
<h4>Chunk APNM (client --&gt; server)</h4>
<p>If a VM's application name changes -- possible in our environment because
of the "pre-initialized" app processes -- it must send up one of these.
u4 application name len (in 16-bit chars)
var application name (UTF-16)
<h4>Chunk WAIT (client --&gt; server)</h4>
<p>This tells DDMS that one or more threads are waiting on an external
event. The simplest use is to tell DDMS that the VM is waiting for a
debugger to attach.
u1 reason (0 = wait for debugger)
If DDMS is attached, the client VM sends this up when waitForDebugger()
is called. If waitForDebugger() is called before DDMS attaches, the WAIT
chunk will be sent up at about the same time as the HELO response.
<h3>Thread Status</h3>
<p>The client can send updates when their status changes, or periodically
send thread state info, e.g. 2x per
second to allow a "blinkenlights" display of thread activity.
<h4>Chunk THEN (server --&gt; client)</h4>
<p>Enable thread creation/death notification.
u1 boolean (true=enable, false=disable)
<p>The response is empty. The client generates THCR packets for all
known threads. (Note the THCR packets may arrive before the THEN
<h4>Chunk THCR (client --&gt; server)</h4>
<p>Thread Creation notification.
u4 VM-local thread ID (usually a small int)
u4 thread name len (in 16-bit chars)
var thread name (UTF-16)
<h4>Chunk THDE (client --&gt; server)</h4>
<p>Thread Death notification.
u4 VM-local thread ID
<h4>Chunk THST (server --&gt; client)</h4>
<p>Enable periodic thread activity updates.
Threads in THCR messages are assumed to be in the "initializing" state. A
THST message should follow closely on the heels of THCR.
u4 interval, in msec
<p>An interval of 0 disables the updates. This is done periodically,
rather than every time the thread state changes, to reduce the amount
of data that must be sent for an actively running VM.
<h4>Chunk THST (client --&gt; server)</h4>
<p>Thread Status, describing the state of one or more threads. This is
most useful when creation/death notifications are enabled first. The
overall layout is:
u4 count
var thread data
Then, for every thread:
u4 VM-local thread ID
u1 thread state
u1 suspended
<p>"thread state" must be one of:
<ul> <!-- don't use ol, we may need (-1) or sparse -->
<li> 1 - running (now executing or ready to do so)
<li> 2 - sleeping (in Thread.sleep())
<li> 3 - monitor (blocked on a monitor lock)
<li> 4 - waiting (in Object.wait())
<li> 5 - initializing
<li> 6 - starting
<li> 7 - native (executing native code)
<li> 8 - vmwait (waiting on a VM resource)
<p>"suspended" will be 0 if the thread is running, 1 if not.
<p>[Any reason not to make "suspended" be the high bit of "thread state"?
Do we need to differentiate suspend-by-GC from suspend-by-debugger?]
<p>[We might be able to send the currently-executing method. This is a
little risky in a running VM, and increases the size of the messages
considerably, but might be handy.]
<h3>Heap Status</h3>
<p>The client sends what amounts to a color-coded bitmap to the server,
indicating which stretches of memory are free and which are in use. For
compactness the bitmap is run-length encoded, and based on multi-byte
"allocation units" rather than byte counts.
<p>In the future the server will be able to correlate the bitmap with more
detailed object data, so enough information is provided to associate the
bitmap data with virtual addresses.
<p>Heaps may be broken into segments within the VM, and due to memory
constraints it may be desirable to send the bitmap in smaller pieces,
so the protocol allows the heap data to be sent in several chunks.
To avoid ambiguity, the client is required
to send explicit "start" and "end" messages during an update.
<p>All messages include a "heap ID" that can be used to differentiate
between multiple independent virtual heaps or perhaps a native heap. The
client is allowed to send information about different heaps simultaneously,
so all heap-specific information is tagged with a "heap ID".
<h4>Chunk HPIF (server --&gt; client)</h4>
<p>Request heap info.
u1 when to send
<p>The "when" values are:
0: never
1: immediately
2: at the next GC
3: at every GC
<h4>Chunk HPIF (client --&gt; server, reply only)</h4>
<p>Heap Info. General information about the heap, suitable for a summary
u4 number of heaps
For each heap:
u4 heap ID
u8 timestamp in ms since Unix epoch
u1 capture reason (same as 'when' value from server)
u4 max heap size in bytes (-Xmx)
u4 current heap size in bytes
u4 current number of bytes allocated
u4 current number of objects allocated
<p>[We can get some of this from HPSG, more from HPSO.]
<p>[Do we need a "heap overhead" stat here, indicating how much goes to
waste? e.g. (8 bytes per object * number of objects)]
<h4>Chunk HPSG (server --&gt; client)</h4>
<p>Request transmission of heap segment data.
u1 when to send
u1 what to send
<p>The "when" to send will be zero to disable transmission, 1 to send
during a GC. Other values are currently undefined. (Could use to pick
which part of the GC to send it, or cause periodic transmissions.)
<p>The "what" field is currently 0 for HPSG and 1 for HPSO.
<p>No reply is expected.
<h4>Chunk NHSG (server --&gt; client)</h4>
<p>Request transmission of native heap segment data.
u1 when to send
u1 what to send
<p>The "when" to send will be zero to disable transmission, 1 to send
during a GC. Other values are currently undefined.
<p>The "what" field is currently ignored.
<p>No reply is expected.
<h4>Chunk HPST/NHST (client --&gt; server)</h4>
<p>This is a Heap Start message. It tells the server to discard any
existing notion of what the client's heap looks like, and prepare for
new information. HPST indicates a virtual heap dump and must be followed
by zero or more HPSG/HPSO messages and an HPEN. NHST indicates a native
heap dump and must be followed by zero or more NHSG messages and an NHEN.
<p>The only data item is:
u4 heap ID
<h4>Chunk HPEN/NHEN (client --&gt; server)</h4>
<p>Heap End, indicating that all information about the heap has been sent.
A HPST will be paired with an HPEN and an NHST will be paired with an NHEN.
<p>The only data item is:
u4 heap ID
<h4>Chunk HPSG (client --&gt; server)</h4>
<p>Heap segment data. Each chunk describes all or part of a contiguous
stretch of heap memory.
u4 heap ID
u1 size of allocation unit, in bytes (e.g. 8 bytes)
u4 virtual address of segment start
u4 offset of this piece (relative to the virtual address)
u4 length of piece, in allocation units
var usage data
<p>The "usage data" indicates the status of each allocation unit. The data
is a stream of pairs of bytes, where the first byte indicates the state
of the allocation unit, and the second byte indicates the number of
consecutive allocation units with the same state.
<p>The bits in the "state" byte have the following meaning:
| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| P | U0 | K2 | K1 | K0 | S2 | S1 | S0 |
<li>'S': solidity
<li>1=has hard reference
<li>2=has soft reference
<li>3=has weak reference
<li>4=has phantom reference
<li>5=pending finalization
<li>6=marked, about to be swept
<li>'K': kind
<li>1=class object
<li>2=array of byte/boolean
<li>3=array of char/short
<li>4=array of Object/int/float
<li>5=array of long/double
<li>'P': partial flag (not used for HPSG)
<li>'U': unused, must be zero
<p>The use of the various 'S' types depends on when the information is
sent. The current plan is to send it either immediately after a GC,
or between the "mark" and "sweep" phases of the GC. For a fancy generational
collector, we may just want to send it up periodically.
<p>The run-length byte indicates the number of allocation units minus one, so a
length of 255 means there are 256 consecutive units with this state. In
some cases, e.g. arrays of bytes, the actual size of the data is rounded
up the nearest allocation unit.
<p>For HPSG, the runs do not end at object boundaries. It is not possible
to tell from this bitmap whether a run contains one or several objects.
(But see HPSO, below.)
<p>[If we find that we have many long runs, we can overload the 'P' flag
or dedicate the 'U' flag to indicate that we have a 16-bit length instead
of 8-bit. We can also use a variable-width integer scheme for the length,
encoding 1-128 in one byte, 1-16384 in two bytes, etc.]
<p>[Alternate plan for 'K': array of byte, array of char, array of Object,
array of miscellaneous primitive type]
<p>To parse the data, the server runs through the usage data until either
(a) the end of the chunk is reached, or (b) all allocation units have been
accounted for. (If these two things don't happen at the same time, the
chunk is rejected.)
<p>Example: suppose a VM has a heap at 0x10000 that is 0x2000 bytes long
(with an 8-byte allocation unit size, that's 0x0400 units long).
The client could send one chunk (allocSize=8, virtAddr=0x10000, offset=0,
length=0x0400) or two (allocSize=8, virtAddr=0x10000, offset=0, length=0x300;
then allocSize=8, virtAddr=0x10000, offset=0x300, length=0x100).
<p>The client must encode the entire heap, including all free space at
the end, or the server will not have an accurate impression of the amount
of memory in the heap. This refers to the current heap size, not the
maximum heap size.
<h4>Chunk HPSO (client --&gt; server)</h4>
<p>This is essentially identical to HPSG, but the runs are terminated at
object boundaries. If an object is larger than 256 allocation units, the
"partial" flag is set in all runs except the last.
<p>The resulting unpacked bitmap is identical, but the object boundary
information can be used to gain insights into heap layout.
<p>[Do we want to have a separate message for this? Maybe just include
a "variant" flag in the HPST packet. Another possible form of output
would be one that indicates the age, in generations, of each block of
memory. That would provide a quick visual indication of "permanent vs.
transient residents", perhaps with a 16-level grey scale.]
<h4>Chunk NHSG (client --&gt; server)</h4>
<p>Native heap segment data. Each chunk describes all or part of a
contiguous stretch of native heap memory. The format is the same as
for HPSG, except that only solidity values 0 (= free) and 1 (= hard
reference) are used, and the kind value is always 0 for free chunks
and 7 for allocated chunks, indicating a non-VM object.
u4 heap ID
u1 size of allocation unit, in bytes (e.g. 8 bytes)
u4 virtual address of segment start
u4 offset of this piece (relative to the virtual address)
u4 length of piece, in allocation units
var usage data
<h3>Generic Replies</h3>
The client-side chunk handlers need a common way to report simple success
or failure. By convention, an empty reply packet indicates success.
<h4>Chunk FAIL (client --&gt; server, reply only)</h4>
<p>The chunk includes a machine-readable error code and a
human-readable error message. Server code can associate the failure
with the original request by comparing the JDWP packet ID.
<p>This allows a standard way of, for example, rejecting badly-formed
request packets.
u4 error code
u4 error message len (in 16-bit chars)
var error message (UTF-16)
<h4>Chunk EXIT (server --&gt; client)</h4>
<p>Cause the client to exit with the specified status, using System.exit().
Useful for certain kinds of testing.
u4 exit status
<h4>Chunk DTRC (server --&gt; client)</h4>
<p>[TBD] start/stop dmtrace; can send the results back over the wire. For
size reasons we probably need "sending", "data", "key", "finished" as
4 separate chunks/packets rather than one glob.
<h2>Client API</h2>
<p>The API is written in the Java programming language
for convenience. The code is free to call native methods if appropriate.
<h3>Chunk Handler API</h3>
<p>The basic idea is that arbitrary code can register handlers for
specific chunk types. When a DDM chunk with that type arrives, the
appropriate handler is invoked. The handler's return value provides the
response to the server.
<p>There are two packages. android.ddm lives in the "framework" library,
and has all of the chunk handlers and registration code. It can freely
use Android classes. org.apache.harmony.dalvik.ddmc lives in the "core"
library, and has
some base classes and features that interact with the VM. Nothing should
need to modify the org.apache.harmony.dalvik.ddmc classes.
<p>The DDM classes pass chunks of data around with a simple class:
<pre class=prettyprint>
class Chunk {
int type;
byte[] data;
int offset, length;
<p>The chunk handlers accept and return them:
<pre class=prettyprint>
public Chunk handleChunk(Chunk request)
<p>The code is free to parse the chunk and generate a response in any
way it chooses. Big-endian byte ordering is recommended but not mandatory.
<p>Chunk handlers will be notified when a DDM server connects or disconnects,
so that they can perform setup and cleanup operations:
<pre class=prettyprint>
public void connected()
public void disconnected()
<p>The method processes the request, formulates a response, and returns it.
If the method returns null, an empty JDWP success message will be returned.
<p>The request/response interaction is essentially asynchronous in the
protocol. The packets are linked together with the JDWP message ID.
<p>[We could use ByteBuffer here instead of byte[], but it doesn't gain
us much. Wrapping a ByteBuffer around an array is easy. We don't want
to pass the full packet in because we could have multiple chunks in one
request packet. The DDM code needs to collect and aggregate the responses
to all chunks into a single JDWP response packet. Parties wanting to
write multiple chunks in response to a single chunk should send a null
response back and use "sendChunk()" to send the data independently.]
<h3>Unsolicited event API</h3>
<p>If a piece of code wants to send a chunk of data to the server at some
arbitrary time, it may do so with a method provided by
<pre class=prettyprint>
public static void sendChunk(Chunk chunk)
<p>There is no response or status code. No exceptions are thrown.
<h2>Server API</h2>
<p>This is similar to the client side in many ways, but makes extensive
use of ByteBuffer in a perhaps misguided attempt to use java.nio.channels
and avoid excessive thread creation and unnecessary data copying.
<p>Upon receipt of a packet, the server will identify it as one of:
<li>Message to be passed through to the debugger
<li>Response to an earlier request
<li>Unsolicited event packet
<p>To handle (2), when messages are sent from the server to the client,
the message must be paired with a callback method. The response might be
delayed for a while -- or might never arrive -- so the server can't block
waiting for responses from the client.
<p>The chunk handlers look like this:
<pre class=prettyprint>
public void handleChunk(Client client, int type,
ByteBuffer data, boolean isReply, int msgId)
<p>The arguments are:
<dd>An object representing the client VM that send us the packet.
<dd>The 32-bit chunk type.
<dd>The data. The data's length can be determined by calling data.limit().
<dd>Set to "true" if this was a reply to a message we sent earlier,
"false" if the client sent this unsolicited.
<dd>The JDWP message ID. Useful for connecting replies with requests.
<p>If a handler doesn't like the contents of a packet, it should log an
error message and return. If the handler doesn't recognize the packet at
all, it can call the superclass' handleUnknownChunk() method.
<p>As with the client, the server code can be notified when clients
connect or disconnect. This allows the handler to send initialization
code immediately after a connect, or clean up after a disconnect.
<p>Data associated with a client can be stored in a ClientData object,
which acts as a general per-client dumping around for VM and UI state.
<address>Copyright &copy; 2007 The Android Open Source Project</address>