| page.title=Analyzing Rendering with Profile GPU |
| page.metaDescription=Use the Profile GPU tool to help you optimize your app's rendering performance. |
| |
| meta.tags="power" |
| page.tags="power" |
| |
| @jd:body |
| |
| <div id="qv-wrapper"> |
| <div id="qv"> |
| |
| <h2>In this document</h2> |
| <ol> |
| <li> |
| <a href="#visrep">Visual Representation</a></li> |
| </li> |
| |
| <li> |
| <a href="#sam">Stages and Their Meanings</a> |
| |
| <ul> |
| <li> |
| <a href="#sv">Input Handling</a> |
| </li> |
| <li> |
| <a href="#asd">Animation</a> |
| </li> |
| <li> |
| <a href="#asd">Measurement/Layout</a> |
| </li> |
| <li> |
| <a href="#asd">Drawing</a> |
| </li> |
| </li> |
| <li> |
| <a href="#asd">Sync/Upload</a> |
| </li> |
| <li> |
| <a href="#asd">Issuing Commands</a> |
| </li> |
| <li> |
| <a href="#asd">Processing/Swapping Buffer</a> |
| </li> |
| <li> |
| <a href="#asd">Miscellaneous</a> |
| </li> |
| </ul> |
| </li> |
| </ol> |
| </div> |
| </div> |
| |
| <p> |
| The <a href="/studio/profile/dev-options-rendering.html"> |
| Profile GPU Rendering</a> tool indicates the relative time that each stage of |
| the rendering pipeline takes to render the previous frame. This knowledge |
| can help you identify bottlenecks in the pipeline, so that you |
| can know what to optimize to improve your app's rendering performance. |
| </p> |
| |
| <p> |
| This page briefly explains what happens during each pipeline stage, and |
| discusses issues that can cause bottlenecks there. Before reading |
| this page, you should be familiar with the information presented in the |
| <a href="/studio/profile/dev-options-rendering.html">Profile GPU |
| Rendering Walkthrough</a>. In addition, to understand how all of the |
| stages fit together, it may be helpful to review |
| <a href="https://www.youtube.com/watch?v=we6poP0kw6E&index=64&list=PLWz5rJ2EKKc9CBxr3BVjPTPoDPLdPIFCE"> |
| how the rendering pipeline works.</a> |
| </p> |
| |
| <h2 id="#visrep">Visual Representation</h2> |
| |
| <p> |
| The Profile GPU Rendering tool displays stages and their relative times in the |
| form of a graph: a color-coded histogram. Figure 1 shows an example of |
| such a display. |
| </p> |
| |
| <img src="{@docRoot}topic/performance/images/bars.png"> |
| <p class="img-caption"> |
| <strong>Figure 1.</strong> Profile GPU Rendering Graph |
| </p> |
| |
| </p> |
| |
| <p> |
| Each segment of each vertical bar displayed in the Profile GPU Rendering |
| graph represents a stage of the pipeline and is highlighted using a specific |
| color in |
| the bar graph. Figure 2 shows a key to the meaning of each displayed color. |
| </p> |
| |
| <img src="{@docRoot}topic/performance/images/s-profiler-legend.png"> |
| <p class="img-caption"> |
| <strong>Figure 2.</strong> Profile GPU Rendering Graph Legend |
| </p> |
| |
| <p> |
| Once you understand what each color signfiies, |
| you can target specific aspects of your |
| app to try to optimize its rendering performance. |
| </p> |
| |
| <h2 id="sam">Stages and Their Meanings</a></h2> |
| |
| <p> |
| This section explains what happens during each stage corresponding |
| to a color in Figure 2, as well as bottleneck causes to look out for. |
| </p> |
| |
| |
| <h3 id="ih">Input Handling</h3> |
| |
| <p> |
| The input handling stage of the pipeline measures how long the app |
| spent handling input events. This metric indicates how long the app |
| spent executing code called as a result of input event callbacks. |
| </p> |
| |
| <h4>When this segment is large</h4> |
| |
| <p> |
| High values in this area are typically a result of too much work, or |
| too-complex work, occurring inside the input-handler event callbacks. |
| Since these callbacks always occur on the main thread, solutions to this |
| problem focus on optimizing the work directly, or offloading the work to a |
| different thread. |
| </p> |
| |
| <p> |
| It’s also worth noting that {@link android.support.v7.widget.RecyclerView} |
| scrolling can appear in this phase. |
| {@link android.support.v7.widget.RecyclerView} scrolls immediately when it |
| consumes the touch event. As a result, |
| it can inflate or populate new item views. For this reason, it’s important to |
| make this operation as fast as possible. Profiling tools like Traceview or |
| Systrace can help you investigate further. |
| </p> |
| |
| <h3 id="at">Animation</h3> |
| |
| <p> |
| The Animations phase shows you just how long it took to evaluate all the |
| animators that were running in that frame. The most common animators are |
| {@link android.animation.ObjectAnimator}, |
| {@link android.view.ViewPropertyAnimator}, and |
| <a href="/training/transitions/overview.html">Transitions</a>. |
| </p> |
| |
| <h4>When this segment is large</h4> |
| |
| <p> |
| High values in this area are typically a result of work that’s executing due |
| to some property change of the animation. For example, a fling animation, |
| which scrolls your {@link android.widget.ListView} or |
| {@link android.support.v7.widget.RecyclerView}, causes large amounts of view |
| inflation and population. |
| </p> |
| |
| <h3 id="ml">Measurement/Layout</h3> |
| |
| <p> |
| In order for Android to draw your view items on the screen, it executes |
| two specific operations across layouts and views in your view hierarchy. |
| </p> |
| |
| <p> |
| First, the system measures the view items. Every view and layout has |
| specific data that describes the size of the object on the screen. Some views |
| can have a specific size; others have a size that adapts to the size |
| of the parent layout container |
| </p> |
| |
| <p> |
| Second, the system lays out the view items. Once the system calculates |
| the sizes of children views, the system can proceed with layout, sizing |
| and positioning the views on the screen. |
| </p> |
| |
| <p> |
| The system performs measurement and layout not only for the views to be drawn, |
| but also for the parent hierarchies of those views, all the way up to the root |
| view. |
| </p> |
| |
| <h4>When this segment is large</h4> |
| |
| <p> |
| If your app spends a lot of time per frame in this area, it is |
| usually either because of the sheer volume of views that need to be |
| laid out, or problems such as |
| <a href="/topic/performance/optimizing-view-hierarchies.html#double"> |
| double taxation</a> at the wrong spot in your |
| hierarchy. In either of these cases, addressing performance involves |
| <a href="/topic/performance/optimizing-view-hierarchies.html">improving |
| the performance of your view hierarchies</a>. |
| </p> |
| |
| <p> |
| Code that you’ve added to |
| {@link android.view.View#onLayout(boolean, int, int, int, int)} or |
| {@link android.view.View#onMeasure(int, int)} |
| can also cause performance |
| issues. <a href="/studio/profile/traceview.html">Traceview</a> and |
| <a href="/studio/profile/systrace.html">Systrace</a> can help you examine |
| the callstacks to identify problems your code may have. |
| </p> |
| |
| <h3 id="draw">Drawing</h3> |
| |
| <p> |
| The draw stage translates a view’s rendering operations, such as drawing |
| a background or drawing text, into a sequence of native drawing commands. |
| The system captures these commands into a display list. |
| </p> |
| |
| <p> |
| The Draw bar records how much time it takes to complete capturing the commands |
| into the display list, for all the views that needed to be updated on the screen |
| this frame. The measured time applies to any code that you have added to the UI |
| objects in your app. Examples of such code may be the |
| {@link android.view.View#onDraw(android.graphics.Canvas) onDraw()}, |
| {@link android.view.View#dispatchDraw(android.graphics.Canvas) dispatchDraw()}, |
| and the various <code>draw ()methods</code> belonging to the subclasses of the |
| {@link android.graphics.drawable.Drawable} class. |
| </p> |
| |
| <h4>When this segment is large</h4> |
| |
| <p> |
| In simplified terms, you can understand this metric as showing how long it took |
| to run all of the calls to |
| {@link android.view.View#onDraw(android.graphics.Canvas) onDraw()} |
| for each invalidated view. This |
| measurement includes any time spent dispatching draw commands to children and |
| drawables that may be present. For this reason, when you see this bar spike, the |
| cause could be that a bunch of views suddenly became invalidated. Invalidation |
| makes it necessary to regenerate views' display lists. Alternatively, a |
| lengthy time may be the result of a few custom views that have some extremely |
| complex logic in their |
| {@link android.view.View#onDraw(android.graphics.Canvas) onDraw()} methods. |
| </p> |
| |
| <h3 id="su">Sync/Upload</h3> |
| |
| <p> |
| The Sync & Upload metric represents the time it takes to transfer |
| bitmap objects from CPU memory to GPU memory during the current frame. |
| </p> |
| |
| <p> |
| As different processors, the CPU and the GPU have different RAM areas |
| dedicated to processing. When you draw a bitmap on Android, the system |
| transfers the bitmap to GPU memory before the GPU can render it to the |
| screen. Then, the GPU caches the bitmap so that the system doesn’t need to |
| transfer the data again unless the texture gets evicted from the GPU texture |
| cache. |
| </p> |
| |
| <p class="note"><strong>Note:</strong> On Lollipop devices, this stage is |
| purple. |
| </p> |
| |
| <h4>When this segment is large</h4> |
| |
| <p> |
| All resources for a frame need to reside in GPU memory before they can be |
| used to draw a frame. This means that a high value for this metric could mean |
| either a large number of small resource loads or a small number of very large |
| resources. A common case is when an app displays a single bitmap that’s |
| close to the size of the screen. Another case is when an app displays a |
| large number of thumbnails. |
| </p> |
| |
| <p> |
| To shrink this bar, you can employ techniques such as: |
| </p> |
| |
| <ul> |
| <li> |
| Ensuring your bitmap resolutions are not much larger than the size at which they |
| will be displayed. For example, your app should avoid displaying a 1024x1024 |
| image as a 48x48 image. |
| </li> |
| |
| <li> |
| Taking advantage of {@link android.graphics.Bitmap#prepareToDraw()} |
| to asynchronously pre-upload a bitmap before the next sync phase. |
| </li> |
| </ul> |
| |
| <h3 id="ic">Issuing Commands</h3> |
| |
| <p> |
| The <em>Issue Commands</em> segment represents the time it takes to issue all |
| of the commands necessary for drawing display lists to the screen. |
| </p> |
| |
| <p> |
| For the system to draw display lists to the screen, it sends the |
| necessary commands to the GPU. Typically, it performs this action through the |
| <a href="/guide/topics/graphics/opengl.html">OpenGL ES</a> API. |
| </p> |
| |
| <p> |
| This process takes some time, as the system performs final transformation |
| and clipping for each command before sending the command to the GPU. Additional |
| overhead then arises on the GPU side, which computes the final commands. These |
| commands include final transformations, and additional clipping. |
| </p> |
| |
| <h4>When this segment is large</h4> |
| |
| <p> |
| The time spent in this stage is a direct measure of the complexity and |
| quantity of display lists that the system renders in a given |
| frame. For example, having many draw operations, especially in cases where |
| there's a small inherent cost to each draw primitive, could inflate this time. |
| For example: |
| </p> |
| |
| <pre> |
| for (int i = 0; i < 1000; i++) |
| canvas.drawPoint() |
| </pre> |
| |
| <p> |
| is a lot more expensive to issue than: |
| </p> |
| |
| <pre> |
| canvas.drawPoints(mThousandPointArray); |
| </pre> |
| |
| <p> |
| There isn’t always a 1:1 correlation between issuing commands and |
| actually drawing display lists. Unlike <em>Issue Commands</em>, |
| which captures the time it takes to send drawing commands to the GPU, |
| the <em>Draw</em> metric represents the time that it took to capture the issued |
| commands into the display list. |
| </p> |
| |
| <p> |
| This difference arises because the display lists are cached by |
| the system wherever possible. As a result, there are situations where a |
| scroll, transform, or animation requires the system to re-send a display |
| list, but not have to actually rebuild it—recapture the drawing |
| commands—from scratch. As a result, you can see a high “Issue |
| commands” bar without seeing a high <em>Draw commands</em> bar. |
| </p> |
| |
| <h3 id="psb">Processing/Swapping Buffers</h3> |
| |
| <p> |
| Once Android finishes submitting all its display list to the GPU, |
| the system issues one final command to tell the graphics driver that it's |
| done with the current frame. At this point, the driver can finally present |
| the updated image to the screen. |
| </p> |
| |
| <h4>When this segment is large</h4> |
| |
| <p> |
| It’s important to understand that the GPU executes work in parallel with the |
| CPU. The Android system issues draw commands to the GPU, and then moves on to |
| the next task. The GPU reads those draw commands from a queue and processes |
| them. |
| </p> |
| |
| <p> |
| In situations where the CPU issues commands faster than the GPU |
| consumes them, the communications queue between the processors can become |
| full. When this occurs, the CPU blocks, and waits until there is space in the |
| queue to place the next command. This full-queue state arises often during the |
| <em>Swap Buffers</em> stage, because at that point, a whole frame’s worth of |
| commands have been submitted. |
| </p> |
| |
| </p> |
| The key to mitigating this problem is to reduce the complexity of work occurring |
| on the GPU, in similar fashion to what you would do for the “Issue Commands” |
| phase. |
| </p> |
| |
| |
| <h3 id="mt">Miscellaneous</h3> |
| |
| <p> |
| In addition to the time it takes the rendering system to perform its work, |
| there’s an additional set of work that occurs on the main thread and has |
| nothing to do with rendering. Time that this work consumes is reported as |
| <em>misc time</em>. Misc time generally represents work that might be occurring |
| on the UI thread between two consecutive frames of rendering. |
| </p> |
| |
| <h4>When this segment is large</h4> |
| |
| <p> |
| If this value is high, it is likely that your app has callbacks, intents, or |
| other work that should be happening on another thread. Tools such as |
| <a href="/studio/profile/traceview.html">Method |
| Tracing</a> or <a href="/studio/profile/systrace.html">Systrace</a> can provide |
| visibility into the tasks that are running on |
| the main thread. This information can help you target performance improvements. |
| </p> |