blob: 2c9628f70cc9967a8bdd79098c98021577a983c8 [file] [log] [blame]
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<link rel="stylesheet" href="resources/doc.css" charset="UTF-8" type="text/css" />
<link rel="stylesheet" href="../coverage/jacoco-resources/prettify.css" charset="UTF-8" type="text/css" />
<link rel="shortcut icon" href="resources/report.gif" type="image/gif" />
<script type="text/javascript" src="../coverage/jacoco-resources/prettify.js"></script>
<title>JaCoCo - Implementation Design</title>
</head>
<body onload="prettyPrint()">
<div class="breadcrumb">
<a href="../index.html" class="el_report">JaCoCo</a> &gt;
<a href="index.html" class="el_group">Documentation</a> &gt;
<span class="el_source">Implementation Design</span>
</div>
<div id="content">
<h1>Implementation Design</h1>
<p>
This is a unordered list of implementation design decisions. Each topic tries
to follow this structure:
</p>
<ul>
<li>Problem statement</li>
<li>Proposed Solution</li>
<li>Alternatives and Discussion</li>
</ul>
<h2>Coverage Analysis Mechanism</h2>
<p class="intro">
Coverage information has to be collected at runtime. For this purpose JaCoCo
creates instrumented versions of the original class definitions. The
instrumentation process happens on-the-fly during class loading using so
called Java agents.
</p>
<p>
There are several different approaches to collect coverage information. For
each approach different implementation techniques are known. The following
diagram gives an overview with the techniques used by JaCoCo highlighted:
</p>
<img src="resources/implementation.png" alt="Coverage Implementation Techniques"/>
<p>
Byte code instrumentation is very fast, can be implemented in pure Java and
works with every Java VM. On-the-fly instrumentation with the Java agent
hook can be added to the JVM without any modification of the target
application.
</p>
<p>
The Java agent hook requires at least 1.5 JVMs. Class files compiled with
debug information (line numbers) allow for source code highlighting. Unluckily
some Java language constructs get compiled to byte code that produces
unexpected highlighting results, especially in case of implicitly generated
code like default constructors or control structures for finally statements.
</p>
<h2>Coverage Agent Isolation</h2>
<p class="intro">
The Java agent is loaded by the application class loader. Therefore the
classes of the agent live in the same name space like the application classes
which can result in clashes especially with the third party library ASM. The
JoCoCo build therefore moves all agent classes into a unique package.
</p>
<p>
The JaCoCo build renames all classes contained in the
<code>jacocoagent.jar</code> into classes with a
<code>org.jacoco.agent.rt_&lt;randomid&gt;</code> prefix, including the
required ASM library classes. The identifier is created from a random number.
As the agent does not provide any API, no one should be affected by this
renaming. This trick also allows that JaCoCo tests can be verified with
JaCoCo.
</p>
<h2>Minimal Java Version</h2>
<p class="intro">
JaCoCo requires Java 1.5.
</p>
<p>
The Java agent mechanism used for on-the-fly instrumentation became available
with Java 1.5 VMs. Coding and testing with Java 1.5 language level is more
efficient, less error-prone &ndash; and more fun than with older versions.
JaCoCo will still allow to run against Java code compiled for these.
</p>
<h2>Byte Code Manipulation</h2>
<p class="intro">
Instrumentation requires mechanisms to modify and generate Java byte code.
JaCoCo uses the ASM library for this purpose internally.
</p>
<p>
Implementing the Java byte code specification would be an extensive and
error-prone task. Therefore an existing library should be used. The
<a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
use and very efficient in terms of memory and CPU usage. It is actively
maintained and includes as huge regression test suite. Its simplified BSD
license is approved by the Eclipse Foundation for usage with EPL products.
</p>
<h2>Java Class Identity</h2>
<p class="intro">
Each class loaded at runtime needs a unique identity to associate coverage data with.
JaCoCo creates such identities by a CRC64 hash code of the raw class definition.
</p>
<p>
In multi-classloader environments the plain name of a class does not
unambiguously identify a class. For example OSGi allows to use different
versions of the same class to be loaded within the same VM. In complex
deployment scenarios the actual version of the test target might be different
from current development version. A code coverage report should guarantee that
the presented figures are extracted from a valid test target. A hash code of
the class definitions allows to differentiate between classes and versions of
classes. The CRC64 hash computation is simple and fast resulting in a small 64
bit identifier.
</p>
<p>
The same class definition might be loaded by class loaders which will result
in different classes for the Java runtime system. For coverage analysis this
distinction should be irrelevant. Class definitions might be altered by other
instrumentation based technologies (e.g. AspectJ). In this case the hash code
will change and identity gets lost. On the other hand code coverage analysis
based on classes that have been somehow altered will produce unexpected
results. The CRC64 code might produce so called <i>collisions</i>, i.e.
creating the same hash code for two different classes. Although CRC64 is not
cryptographically strong and collision examples can be easily computed, for
regular class files the collision probability is very low.
</p>
<h2>Coverage Runtime Dependency</h2>
<p class="intro">
Instrumented code typically gets a dependency to a coverage runtime which is
responsible for collecting and storing execution data. JaCoCo uses JRE types
only in generated instrumentation code.
</p>
<p>
Making a runtime library available to all instrumented classes can be a
painful or impossible task in frameworks that use their own class loading
mechanisms. Since Java 1.6 <code>java.lang.instrument.Instrumentation</code>
has an API to extends the bootsstrap loader. As our minimum target is Java 1.5
JaCoCo decouples the instrumented classes and the coverage runtime through
official JRE API types only. The instrumented classes communicate through the
<code>Object.equals(Object)</code> method with the runtime. A instrumented
class can retrieve its probe array instance with the following code. Note
that only JRE APIs are used:
</p>
<pre class="source lang-java linenums">
Object access = ... // Retrieve instance
Object[] args = new Object[3];
args[0] = Long.valueOf(8060044182221863588); // class id
args[1] = "com/example/MyClass"; // class name
args[2] = Integer.valueOf(24); // probe count
access.equals(args);
boolean[] probes = (boolean[]) args[0];
</pre>
<p>
The most tricky part takes place in line 1 and is not shown in the snippet
above. The object instance providing access to the coverage runtime through
its <code>equals()</code> method has to be obtained. Different approaches have
been implemented and tested so far:
</p>
<ul>
<li><b><code>SystemPropertiesRuntime</code></b>: This approach stores the
object instance under a system property. This solution breaks the contract
that system properties must only contain <code>java.lang.String</code>
values and therefore causes trouble in applications that rely on this
definition (e.g. Ant).</li>
<li><b><code>LoggerRuntime</code></b>: Here we use a shared
<code>java.util.logging.Logger</code> and communicate through the logging
parameter array instead of a <code>equals()</code> method. The coverage
runtime registers a custom <code>Handler</code> to receive the parameter
array. This approach might break environments that install their own log
managers (e.g. Glassfish).</li>
<li><b><code>URLStreamHandlerRuntime</code></b>: This runtime registers a
<code>URLStreamHandler</code> for a "jacoco-xxxxx" protocol. Instrumented
classes open a connection on this protocol. The returned connection object
is the one that provides access to the coverage runtime through its
<code>equals()</code> method. However to register the protocol the runtime
needs to access internal members of the <code>java.net.URL</code> class.</li>
<li><b><code>ModifiedSystemClassRuntime</code></b>: This approach adds a
public static field to an existing JRE class through instrumentation. Unlike
the other methods above this is only possible for environments where a Java
agent is active.</li>
</ul>
<p>
The current JaCoCo Java agent implementation uses the
<code>ModifiedSystemClassRuntime</code> adding a field to the class
<code>java.sql.Types</code>.
</p>
<h2>Memory Usage</h2>
<p class="intro">
Coverage analysis for huge projects with several thousand classes or hundred
thousand lines of code should be possible. To allow this with reasonable
memory usage the coverage analysis is based on streaming patterns and
"depth first" traversals.
</p>
<p>
The complete data tree of a huge coverage report is too big to fit into a
reasonable heap memory configuration. Therefore the coverage analysis and
report generation is implemented as "depth first" traversals. Which means that
at any point in time only the following data has to be held in working memory:
</p>
<ul>
<li>A single class which is currently processed.</li>
<li>The summary information of all parents of this class (package, groups).</li>
</ul>
<h2>Java Element Identifiers</h2>
<p class="intro">
The Java language and the Java VM use different String representation formats
for Java elements. For example while a type reference in Java reads like
<code>java.lang.Object</code>, the VM references the same type as
<code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
</p>
<p>
Using VM identifiers directly does not cause any transformation overhead at
runtime. There are several programming languages based on the Java VM that
might use different notations. Specific transformations should therefore only
happen at the user interface level, for example during report generation.
</p>
<h2>Modularization of the JaCoCo implementation</h2>
<p class="intro">
JaCoCo is implemented in several modules providing different functionality.
These modules are provided as OSGi bundles with proper manifest files. But
there are no dependencies on OSGi itself.
</p>
<p>
Using OSGi bundles allows well defined dependencies at development time and
at runtime in OSGi containers. As there are no dependencies on OSGi, the
bundles can also be used like regular JAR files.
</p>
</div>
<div class="footer">
<span class="right"><a href="@jacoco.home.url@">JaCoCo</a> @qualified.bundle.version@</span>
<a href="license.html">Copyright</a> &copy; @copyright.years@ Mountainminds GmbH &amp; Co. KG and Contributors
</div>
</body>
</html>