<html devsite>
<head>
<title>Evaluating Performance</title>
<meta name="project_path" value="/_project.yaml" />
<meta name="book_path" value="/_book.yaml" />
</head>
<body>
<!--
Copyright 2017 The Android Open Source Project
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p>Use
<a href="https://android.googlesource.com/platform/system/extras/+/master/simpleperf/doc/README.md" class="external">Simpleperf</a>
to evaluate the performance of a device. Simpleperf is a native profiling tool for both
applications and native processes on Android. Use
<a href="https://developer.android.com/studio/profile/cpu-profiler" class="external">CPU Profiler</a>
to inspect app CPU usage and thread activity in real time.</p>
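<p>For example, a minimal Simpleperf session (on a rooted or userdebug device)
might record system-wide samples for a few seconds and then summarize where CPU
time was spent; the duration and output path below are arbitrary example
values:</p>
<pre class="devsite-click-to-copy">
<code class="devsite-terminal">adb shell simpleperf record -a --duration 10 -o /data/local/tmp/perf.data</code>
<code class="devsite-terminal">adb shell simpleperf report -i /data/local/tmp/perf.data</code>
</pre>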
<p>There are two user-visible indicators of performance:</p>
<ul>
<li><strong>Predictable, perceptible performance</strong>. Does the user
interface (UI) drop frames or consistently render at 60FPS? Does audio play
without artifacts or popping? How long is the delay between the user touching
the screen and the effect showing on the display?</li>
<li><strong>Length of time required for longer operations</strong> (such as
opening applications).</li>
</ul>
<p>The first is more noticeable than the second. Users typically notice jank,
but they can't tell the difference between a 500ms and a 600ms application
startup time unless they are looking at two devices side by side. Touch latency
is immediately noticeable and significantly contributes to the perception of a
device.</p>
<p>As a result, in a fast device, the UI pipeline is the most important thing in
the system other than what is necessary to keep the UI pipeline functional. This
means that the UI pipeline should preempt any other work that is not necessary
for fluid UI. To maintain a fluid UI, background syncing, notification delivery,
and similar work must all be delayed if UI work can be run. It is
acceptable to trade the performance of longer operations (HDR+ runtime,
application startup, etc.) to maintain a fluid UI.</p>
<h2 id="capacity_vs_jitter">Capacity vs jitter</h2>
<p>When considering device performance, <em>capacity</em> and <em>jitter</em>
are two meaningful metrics.</p>
<h3 id="capacity">Capacity</h3>
<p>Capacity is the total amount of some resource that the device possesses over
some amount of time. This can be CPU resources, GPU resources, I/O resources,
network resources, memory bandwidth, or any similar metric. When examining
whole-system performance, it can be useful to abstract the individual components
and assume a single metric that determines performance (especially when tuning a
new device because the workloads run on that device are likely fixed).</p>
<p>The capacity of a system varies based on the computing resources online.
Changing CPU/GPU frequency is the primary means of changing capacity, but there
are others such as changing the number of CPU cores online. Accordingly, the
capacity of a system corresponds with power consumption; <strong>changing
capacity always results in a similar change in power consumption.</strong></p>
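<p>As a rough illustration, CPU capacity can be inspected and constrained
through the standard Linux cpufreq and hotplug sysfs nodes (root required); the
exact paths, available frequencies, and core numbering vary by device, so treat
the values below as placeholders:</p>
<pre class="devsite-click-to-copy">
<code class="devsite-terminal">adb shell "cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies"</code>
<code class="devsite-terminal">adb shell "echo 1036800 &gt; /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq"</code>
<code class="devsite-terminal">adb shell "echo 0 &gt; /sys/devices/system/cpu/cpu4/online"</code>
</pre>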
<p>The capacity required at a given time is overwhelmingly determined by the
running application. As a result, the platform can do little to adjust the
capacity required for a given workload, and the means to do so are limited to
runtime improvements (Android framework, ART, Bionic, GPU compiler/drivers,
kernel).</p>
<h3 id="jitter">Jitter</h3>
<p>While the required capacity for a workload is easy to see, jitter is a more
nebulous concept. For a good introduction to jitter as an impediment to fast
systems, refer to
<em><a href="http://permalink.lanl.gov/object/tr?what=info:lanl-repo/lareport/LA-UR-03-3116">THE
CASE OF THE MISSING SUPERCOMPUTER PERFORMANCE: ACHIEVING OPTIMAL PERFORMANCE ON
THE 8,192 PROCESSORS OF ASCI Q</a></em>. (It's an investigation of why the ASCI
Q supercomputer did not achieve its expected performance and is a great
introduction to optimizing large systems.)</p>
<p>This page uses the term jitter to describe what the ASCI Q paper calls
<em>noise</em>. Jitter is the random system behavior that prevents perceptible
work from running. It is often work that must be run, but it may not have strict
timing requirements that cause it to run at any particular time. Because it is
random, it is extremely difficult to disprove the existence of jitter for a
given workload. It is also extremely difficult to prove that a known source of
jitter was the cause of a particular performance issue. The tools most commonly
used for diagnosing causes of jitter (such as tracing or logging) can introduce
their own jitter.</p>
<p>Sources of jitter experienced in real-world implementations of Android
include:</p>
<ul>
<li>Scheduler delay</li>
<li>Interrupt handlers</li>
<li>Driver code running for too long with preemption or interrupts disabled</li>
<li>Long-running softirqs</li>
<li>Lock contention (application, framework, kernel driver, binder lock, mmap
lock)</li>
<li>File descriptor contention where a low-priority thread holds the lock on a
file, preventing a high-priority thread from running</li>
<li>Running UI-critical code in workqueues where it could be delayed</li>
<li>CPU idle transitions</li>
<li>Logging</li>
<li>I/O delays</li>
<li>Unnecessary process creation (e.g., CONNECTIVITY_CHANGE broadcasts)</li>
<li>Page cache thrashing caused by insufficient free memory</li>
</ul>
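<p>Most of these sources are visible in a systrace capture that includes the
scheduler, IRQ, frequency, idle, and workqueue categories. A typical invocation
(the duration and category list are illustrative; available categories depend
on the platform build) looks like:</p>
<pre class="devsite-click-to-copy">
<code class="devsite-terminal">python systrace.py --time=10 -o jitter_trace.html sched irq freq idle workq gfx view</code>
</pre>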
<p>The required amount of time for a given period of jitter may or may not
decrease as capacity increases. For example, if a driver leaves interrupts
disabled while waiting for a read from across an i2c bus, it will take a fixed
amount of time regardless of whether the CPU is at 384MHz or 2GHz. Increasing
capacity is not a feasible solution to improve performance when jitter is
involved. As a result, <strong>faster processors will not usually improve
performance in jitter-constrained situations.</strong></p>
<p>Finally, unlike capacity, jitter is almost entirely within the domain of the
system vendor.</p>
<h3 id="memory_consumption">Memory consumption</h3>
<p>Memory consumption is traditionally blamed for poor performance. While
consumption itself is not a performance issue, it can cause jitter via
lowmemorykiller overhead, service restarts, and page cache thrashing. Reducing
memory consumption can avoid the direct causes of poor performance, but there
may be other targeted improvements that avoid those causes as well (for example,
pinning the framework to prevent it from being paged out when it will be paged
in soon after).</p>
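<p>To check whether memory pressure is a plausible contributor, start by looking
at overall memory state and per-process usage (output formats differ across
Android releases):</p>
<pre class="devsite-click-to-copy">
<code class="devsite-terminal">adb shell cat /proc/meminfo</code>
<code class="devsite-terminal">adb shell dumpsys meminfo</code>
</pre>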
<h2 id="analyze_initial">Analyzing initial device performance</h2>
<p>Starting from a functional but poorly-performing system and attempting to fix
the system's behavior by looking at individual cases of user-visible poor
performance is <strong>not</strong> a sound strategy. Because poor performance
is usually either not easily reproducible (i.e., jitter) or an application
issue, the full system has too many variables for this strategy to be effective.
As a result, it's very easy to misidentify causes and make minor improvements
while missing systemic opportunities for fixing performance across the
system.</p>
<p>Instead, use the following general approach when bringing up a new
device:</p>
<ol>
<li>Get the system booting to UI with all drivers running and some basic
frequency governor settings (if you change the frequency governor settings,
repeat all steps below).</li>
<li>Ensure the kernel supports the <code>sched_blocked_reason</code> tracepoint
as well as other tracepoints in the display pipeline that denote when the frame
is delivered to the display.</li>
<li>Take long traces of the entire UI pipeline (from receiving input via an IRQ
to final scanout) while running a lightweight and consistent workload (e.g.,
<a href="https://android.googlesource.com/platform/frameworks/base.git/+/master/tests/UiBench/">UiBench</a>
or the ball test in <a href="#touchlatency">TouchLatency</a>); example capture
commands follow this list.</li>
<li>Fix the frame drops detected in the lightweight and consistent
workload.</li>
<li>Repeat steps 3-4 until you can run with zero dropped frames for 20+ seconds
at a time. </li>
<li>Move on to other user-visible sources of jank.</li>
</ol>
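<p>As a sketch of steps 2 and 3, verify that the
<code>sched_blocked_reason</code> tracepoint exists on the device and then take
a long trace with a large buffer while the workload runs; the buffer size,
duration, and category list below are illustrative, not required values:</p>
<pre class="devsite-click-to-copy">
<code class="devsite-terminal">adb shell ls /d/tracing/events/sched/sched_blocked_reason</code>
<code class="devsite-terminal">python systrace.py --time=20 -b 32768 -o ui_pipeline_trace.html sched irq freq idle gfx view input</code>
</pre>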
<p>Other simple things you can do early on in device bringup include:</p>
<ul>
<li>Ensure your kernel has the
<a href="https://android.googlesource.com/kernel/msm/+/c9f00aa0e25e397533c198a0fcf6246715f99a7b%5E!/">sched_blocked_reason
tracepoint patch</a>. This tracepoint is enabled with the sched trace category
in systrace and provides the function responsible for sleeping when that
thread enters uninterruptible sleep. It is critical for performance analysis
because uninterruptible sleep is a very common indicator of jitter.</li>
<li>Ensure you have sufficient tracing for the GPU and display pipelines. On
recent Qualcomm SOCs, tracepoints are enabled using:</li>
<pre class="devsite-click-to-copy">
<code class="devsite-terminal">adb shell "echo 1 &gt; /d/tracing/events/kgsl/enable"</code>
<code class="devsite-terminal">adb shell "echo 1 &gt; /d/tracing/events/mdss/enable"</code>
</pre>
<p>These events remain enabled when you run systrace so you can see additional
information in the trace about the display pipeline (MDSS) in the
<code>mdss_fb0</code> section. On Qualcomm SOCs, you won't see any additional
information about the GPU in the standard systrace view, but the results are
present in the trace itself (for details, see
<a href="/devices/tech/debug/systrace.html">Understanding
systrace</a>).</p>
<p>What you want from this kind of display tracing is a single event that
directly indicates a frame has been delivered to the display. From there, you
can determine if you've hit your frame time successfully; if event X<em>n</em>
occurs less than 16.7ms after event X<em>n-1</em> (assuming a 60Hz display),
then you know you did not jank. If your SOC does not provide such signals, work
with your vendor to get them. Debugging jitter is extremely difficult without a
definitive signal of frame completion.</p></ul>
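<p>One way to sanity-check these signals, assuming the tracepoints above are
enabled, is to pull the raw ftrace buffer and inspect the timestamps of
successive display events (the exact event names depend on the display driver);
consecutive events should be roughly 16.7ms apart on a 60Hz panel when no
frames are dropped:</p>
<pre class="devsite-click-to-copy">
<code class="devsite-terminal">adb shell "cat /d/tracing/trace" &gt; raw_trace.txt</code>
<code class="devsite-terminal">grep mdp raw_trace.txt | tail -20</code>
</pre>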
<h3 id="synthetic_benchmarks">Using synthetic benchmarks</h3>
<p>Synthetic benchmarks are useful for ensuring a device's basic functionality
is present. However, treating benchmarks as a proxy for perceived device
performance is not useful.</p>
<p>Based on experiences with SOCs, differences in synthetic benchmark
performance between SOCs is not correlated with a similar difference in
perceptible UI performance (number of dropped frames, 99th percentile frame
time, etc.). Synthetic benchmarks are capacity-only benchmarks; jitter impacts
the measured performance of these benchmarks only by stealing time from the bulk
operation of the benchmark. As a result, synthetic benchmark scores are mostly
irrelevant as a metric of user-perceived performance.</p>
<p>Consider two SOCs running Benchmark X that renders 1000 frames of UI and
reports the total rendering time (lower score is better).</p>
<ul>
<li>SOC 1 renders each frame of Benchmark X in 10ms and scores 10,000.</li>
<li>SOC 2 renders 99% of frames in 1ms but 1% of frames in 100ms and scores
1,990, a dramatically better score.</li>
</ul>
<p>If the benchmark is indicative of actual UI performance, SOC 2 would be
unusable. Assuming a 60Hz refresh rate, SOC 2 would have a janky frame every
1.5s of operation. Meanwhile, SOC 1 (the slower SOC according to Benchmark X)
would be perfectly fluid.</p>
<h3 id="bug_reports">Using bug reports</h3>
<p>Bug reports are sometimes useful for performance analysis, but because they
are so heavyweight, they are rarely useful for debugging sporadic jank issues.
They may provide some hints on what the system was doing at a given time,
especially if the jank was around an application transition (which is logged in
a bug report). Bug reports can also indicate when something is more broadly
wrong with the system that could reduce its effective capacity (such as thermal
throttling or memory fragmentation).</p>
<h3 id="touchlatency">Using TouchLatency</h3>
<p>Several examples of bad behavior come from TouchLatency, which is the
preferred periodic workload used for the Pixel and Pixel XL. It's available at
<code>frameworks/base/tests/TouchLatency</code> and has two modes: touch latency
and bouncing ball (to switch modes, click the button in the upper-right
corner).</p>
<p>The bouncing ball test is exactly as simple as it appears: A ball bounces
around the screen forever, regardless of user input. It is usually also
<strong>by far</strong> the hardest test to run perfectly, but the closer it
comes to running without any dropped frames, the better your device will be. The
bouncing ball test is difficult because it is a trivial but perfectly consistent
workload that runs at a very low clock (this assumes the device has a frequency
governor; if the device is instead running with fixed clocks, downclock the
CPU/GPU to near-minimum when running the bouncing ball test for the first time).
As the system quiesces and the clocks drop closer to idle, the required CPU/GPU
time per frame increases. You can watch the ball and see things jank, and you'll
be able to see missed frames in systrace as well.</p>
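<p>If you need to pin clocks low manually (for example, on a device running with
fixed clocks), the standard cpufreq sysfs nodes are one way to do it; the node
paths and the frequency value below are examples that vary by device:</p>
<pre class="devsite-click-to-copy">
<code class="devsite-terminal">adb shell "cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq"</code>
<code class="devsite-terminal">adb shell "echo 384000 &gt; /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq"</code>
</pre>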
<p>Because the workload is so consistent, you can identify most sources of
jitter much more easily than in most user-visible workloads by tracking exactly
what is running on the system during each missed frame instead of the UI
pipeline. <strong>The lower clocks amplify the effects of jitter by making it
more likely that any jitter causes a dropped frame.</strong> As a result, the
closer TouchLatency is to 60FPS, the less likely you are to have bad system
behaviors that cause sporadic, hard-to-reproduce jank in larger
applications.</p>
<p>As jitter is often (but not always) clockspeed-invariant, use a test that
runs at very low clocks to diagnose jitter for the following reasons:</p>
<ul>
<li>Not all jitter is clockspeed-invariant; many sources just consume CPU
time.</li>
<li>The governor should get the average frame time close to the deadline by
clocking down, so time spent running non-UI work can push it over the edge to
dropping a frame.</li>
</ul>
</body>
</html>