Add "Avoiding priority inversion"

Bug: 8359315
Change-Id: I67db26f4fb8ac50ecd6fc090f1dcbd4b8c262a36
diff --git a/src/devices/audio_avoiding_pi.jd b/src/devices/audio_avoiding_pi.jd
new file mode 100644
index 0000000..184a150
--- /dev/null
+++ b/src/devices/audio_avoiding_pi.jd
@@ -0,0 +1,309 @@
+page.title=Avoiding Priority Inversion
+@jd:body
+
+<div id="qv-wrapper">
+  <div id="qv">
+    <h2>In this document</h2>
+    <ol id="auto-toc">
+    </ol>
+  </div>
+</div>
+
+<p>
+This article explains how the Android's audio system attempts to avoid
+priority inversion, as of the Android 4.1 (Jellybean) release,
+and highlights techniques that you can use too.
+</p>
+
+<p>
+These techniques may be useful to developers of high-performance
+audio apps, OEMs, and SoC providers who are implementing an audio
+HAL. Please note that implementing these techniques is not
+guaranteed to prevent glitches or other failures, particularly if
+used outside of the audio context.
+Your results may vary and you should conduct your own
+evaluation and testing.
+</p>
+
+<h2 id="background">Background</h2>
+
+<p>
+The Android audio server "AudioFlinger" and AudioTrack/AudioRecord
+client implementation are being re-architected to reduce latency.
+This work started in Android 4.1 (Jellybean), continued in 4.2
+(Jellybean MR1), and more improvements are likely in "K".
+</p>
+
+<p>
+The lower latency needed many changes throughout the system. One
+important change was to  assign CPU resources to time-critical
+threads with a more predictable scheduling policy. Reliable scheduling
+allows the audio buffer sizes and counts to be reduced, while still
+avoiding artifacts due to underruns.
+</p>
+
+<h2 id="priorityInversion">Priority Inversion</h2>
+
+<p>
+<a href="http://en.wikipedia.org/wiki/Priority_inversion">Priority inversion</a>
+is a classic failure mode of real-time systems,
+where a higher-priority task is blocked for an unbounded time waiting
+for a lower-priority task to release a resource such as [shared
+state protected by] a
+<a href="http://en.wikipedia.org/wiki/Mutual_exclusion">mutex</a>.
+</p>
+
+<p>
+In an audio system, priority inversion typically manifests as a
+<a href="http://en.wikipedia.org/wiki/Glitch">glitch</a>
+(click, pop, dropout),
+<a href="http://en.wikipedia.org/wiki/Max_Headroom_(character)">repeated audio</a>
+when circular buffers
+are used, or delay in responding to a command.
+</p>
+
+<p>
+In the Android audio implementation, priority inversion is most
+likely to occur in these places, and so we focus attention here:
+</p>
+
+<ul>
+
+<li>
+between normal mixer thread and fast mixer thread in AudioFlinger
+</li>
+
+<li>
+between application callback thread for a fast AudioTrack and
+fast mixer thread (they both have elevated priority, but slightly
+different priorities)
+</li>
+
+<li>
+within the audio HAL implementation, e.g. for telephony or echo cancellation
+</li>
+
+<li>
+within the audio driver in kernel
+</li>
+
+<li>
+between AudioTrack callback thread and other app threads (this is out of our control)
+</li>
+
+</ul>
+
+<p>
+As of this writing, reduced latency for AudioRecord is planned but
+not yet implemented. The likely priority inversion spots will be
+similar to those for AudioTrack.
+</p>
+
+<h2 id="commonSolutions">Common Solutions</h2>
+
+<p>
+The typical solutions listed in the Wikipedia article include:
+</p>
+
+<ul>
+
+<li>
+disabling interrupts
+</li>
+
+<li>
+priority inheritance mutexes
+</li>
+
+</ul>
+
+<p>
+Disabling interrupts is not feasible in Linux user space, and does
+not work for SMP.
+</p>
+
+
+<p>
+Priority inheritance
+<a href="http://en.wikipedia.org/wiki/Futex">futexes</a>
+(fast user-space mutexes) are available
+in Linux kernel, but are not currently exposed by the Android C
+runtime library
+<a href="http://en.wikipedia.org/wiki/Bionic_(software)">Bionic</a>.
+We chose not to use them in the audio system
+because they are relatively heavyweight, and because they rely on
+a trusted client.
+</p>
+
+<h2 id="androidTechniques">Techniques used by Android</h2>
+
+<p>
+We started with "try lock" and lock with timeout. These are
+non-blocking and bounded blocking variants of the mutex lock
+operation. Try lock and lock with timeout worked fairly well for
+us, but were susceptible to a couple of obscure failure modes: the
+server was not guaranteed to be able to access the shared state if
+the client happened to be busy, and the cumulative timeout could
+be too long if there was a long sequence of unrelated locks that
+all timed out.
+</p>
+
+
+<p>
+We also use
+<a href="http://en.wikipedia.org/wiki/Linearizability">atomic operations</a>
+such as:
+</p>
+
+<ul>
+<li>increment</li>
+<li>bitwise "or"</li>
+<li>bitwise "and"</li>
+</ul>
+
+<p>
+All of these return the previous value, and include the necessary
+SMP barriers. The disadvantage is they can require unbounded retries.
+In practice, we've found that the retries are not a problem.
+</p>
+
+<p>
+Note: atomic operations and their interactions with memory barriers
+are notoriously badly misunderstood and used incorrectly. We include
+these here for completeness, but recommend you also read the article
+<a href="https://developer.android.com/training/articles/smp.html">
+SMP Primer for Android</a>
+for further information.
+</p>
+
+<p>
+We still have and use most of the above tools, and have recently
+added these techniques:
+</p>
+
+<ul>
+
+<li>
+Use non-blocking single-reader single-writer
+<a href="http://en.wikipedia.org/wiki/Circular_buffer">FIFO queues</a>
+for data.
+</li>
+
+<li>
+Try to
+<i>copy</i>
+state rather than
+<i>share</i>
+state between high- and
+low-priority modules.
+</li>
+
+<li>
+When state does need to be shared, limit the state to the
+maximum-size
+<a href="http://en.wikipedia.org/wiki/Word_(computer_architecture)">word</a>
+that can be accessed atomically in one bus operation
+without retries.
+</li>
+
+<li>
+For complex multi-word state, use a state queue. A state queue
+is basically just a non-blocking single-reader single-writer FIFO
+queue used for state rather than data, except the writer collapses
+adjacent pushes into a single push.
+</li>
+
+<li>
+Pay attention to
+<a href="http://en.wikipedia.org/wiki/Memory_barrier">memory barriers</a>
+for SMP correctness.
+</li>
+
+<li>
+<a href="http://en.wikipedia.org/wiki/Trust,_but_verify">Trust, but verify</a>.
+When sharing
+<i>state</i>
+between processes, don't
+assume that the state is well-formed. For example, check that indices
+are within bounds. This verification isn't needed between threads
+in the same process, between mutual trusting processes (which
+typically have the same UID). It's also unnecessary for shared
+<i>data</i>
+such as PCM audio where a corruption is inconsequential.
+</li>
+
+</ul>
+
+<h2 id="nonBlockingAlgorithms">Non-Blocking Algorithms</h2>
+
+<p>
+<a href="http://en.wikipedia.org/wiki/Non-blocking_algorithm">Non-blocking algorithms</a>
+have been a subject of much recent study.
+But with the exception of single-reader single-writer FIFO queues,
+we've found them to be complex and error-prone.
+</p>
+
+<p>
+In Android 4.2 (Jellybean MR1), you can find our non-blocking,
+single-reader/writer classes in these locations:
+</p>
+
+<ul>
+
+<li>
+frameworks/av/include/media/nbaio/
+</li>
+
+<li>
+frameworks/av/media/libnbaio/
+</li>
+
+<li>
+frameworks/av/services/audioflinger/StateQueue*
+</li>
+
+</ul>
+
+<p>
+These were designed specifically for AudioFlinger and are not
+general-purpose. Non-blocking algorithms are notorious for being
+difficult to debug. You can look at this code as a model, but be
+aware there may be bugs, and the classes are not guaranteed to be
+suitable for other purposes.
+</p>
+
+<p>
+For developers, we may update some of the sample OpenSL ES application
+code to use non-blocking, or referencing a non-Android open source
+library.
+</p>
+
+<h2 id="tools">Tools</h2>
+
+<p>
+To the best of our knowledge, there are no automatic tools for
+finding priority inversion, especially before it happens. Some
+research static code analysis tools are capable of finding priority
+inversions if able to access the entire codebase. Of course, if
+arbitrary user code is involved (as it is here for the application)
+or is a large codebase (as for the Linux kernel and device drivers),
+static analysis may be impractical. The most important thing is to
+read the code very carefully and get a good grasp on the entire
+system and the interactions. Tools such as
+<a href="http://developer.android.com/tools/help/systrace.html">systrace</a>
+and
+<code>ps -t -p</code>
+are useful for seeing priority inversion after it occurs, but do
+not tell you in advance.
+</p>
+
+<h2 id="aFinalWord">A Final Word</h2>
+
+<p>
+After all of this discussion, don't be afraid of mutexes. Mutexes
+are your friend for ordinary use, when used and implemented correctly
+in ordinary non-time-critical use cases. But between high- and
+low-priority tasks and in time-sensitive systems mutexes are more
+likely to cause trouble.
+</p>
+
diff --git a/src/devices/devices_toc.cs b/src/devices/devices_toc.cs
index 70d6da5..84e838b 100644
--- a/src/devices/devices_toc.cs
+++ b/src/devices/devices_toc.cs
@@ -34,6 +34,7 @@
         <ul>
           <li><a href="<?cs var:toroot ?>devices/audio_latency.html">Latency</a></li>
           <li><a href="<?cs var:toroot ?>devices/audio_warmup.html">Warmup</a></li>
+          <li><a href="<?cs var:toroot ?>devices/audio_avoiding_pi.html">Avoiding Priority Inversion</a></li>
         </ul>
       </li>
       <li><a href="<?cs var:toroot ?>devices/camera.html">Camera v1</a></li>