Android async and non-blocking API guidelines

go/android-api-guidelines-async

Non-blocking APIs request work to happen and then yield control back to the calling thread so that it can perform other work before the completion of the requested operation. They are useful for cases where the requested work might be long-running or may require waiting for I/O, IPC, highly contended system resources to become available, or even user input before work can proceed. Especially well-behaved APIs will provide a way to cancel the operation in progress and stop work from being performed on the original caller's behalf, preserving system health and battery life when the operation is no longer needed.

Asynchronous APIs are one way of achieving non-blocking behavior. Async APIs accept some form of continuation or callback that will be notified when the operation is complete, or of other events during the operation's progress.

There are two primary motivations for writing an asynchronous API:

Executing multiple operations concurrently, where an Nth operation must be initiated before the N-1th operation completes
Avoiding blocking a calling thread until an operation is complete

Kotlin strongly promotes structured concurrency, a series of principles and APIs built on suspend functions that decouple synchronous/asynchronous execution of code from thread-blocking behavior. Suspend functions are non-blocking and synchronous.

Suspend functions:

Do not block their calling thread and instead yield their execution thread under the hood while awaiting the results of operations executing elsewhere
Execute synchronously and do not require the caller of a non-blocking API to continue executing concurrently with non-blocking work initiated by the API call.

This document details a minimum baseline of expectations developers may safely hold when working with non-blocking and asynchronous APIs, followed by a series of recipes for authoring APIs that meet these expectations in Kotlin or Java, in the Android platform or Jetpack libraries. When in doubt, consider the developer expectations as requirements for any new API surface.

Developer expectations for async APIs

The following expectations are written from the standpoint of non-suspend APIs unless otherwise noted.

APIs that accept callbacks are usually asynchronous

If an API accepts a callback that is not documented to only ever be called in-place (that is, called only by the calling thread before the API call itself returns), the API is assumed to be asynchronous and should meet all other expectations documented below.

An example of a callback that is only ever called in-place is a higher-order map or filter function that invokes a mapper or predicate on each item in a collection before returning.
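As a minimal sketch of what "in-place" means, the following plain-Java helper invokes its predicate only on the calling thread and only before it returns. The class name `InPlace` and its `filter` method are hypothetical names chosen for illustration and are not part of any platform API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper illustrating a callback that is only ever called in-place. */
final class InPlace {
    /**
     * Invokes predicate only on the calling thread and only before returning,
     * so callers may treat this method as synchronous despite the callback parameter.
     */
    static <T> List<T> filter(List<T> items, Predicate<? super T> predicate) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            if (predicate.test(item)) {
                kept.add(item);
            }
        }
        return kept; // the predicate is not retained or invoked after this point
    }
}
```

Because the callback never escapes the call, none of the async expectations below apply to an API shaped like this.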

Asynchronous APIs should return as quickly as possible

Developers expect async APIs to be non-blocking and return quickly after initiating the request for the operation. It should always be safe to call an async API at any time, and calling an async API should never result in janky frames or ANR.

Many operations and lifecycle signals can be triggered by the platform or libraries on-demand, and expecting a developer to hold global knowledge of all potential call sites for their code is unsustainable. For example, a Fragment can be added to the FragmentManager in a synchronous transaction in response to View measurement and layout when app content must be populated to fill available space (for example, in a RecyclerView). A LifecycleObserver responding to this fragment's onStart lifecycle callback may reasonably perform one-time startup operations here, and this may be on a critical code path for producing a frame of animation free of jank. A developer should always feel confident that calling any async API in response to these kinds of lifecycle callbacks will not be the cause of a janky frame.

This implies that the work performed by an async API before returning must be very lightweight: at most, creating a record of the request and associated callback and registering it with the execution engine that will perform the work. If registering for an async operation requires IPC, the API's implementation should take whatever measures are necessary to meet this developer expectation. This may include one or more of:

Implementing an underlying IPC as a oneway binder call
Making a two-way binder call into the system server where completing the registration does not require taking a highly contended lock
Posting the request to a worker thread in the app process to perform a blocking registration over IPC
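The last option above can be sketched in plain Java as follows. This is an illustration under stated assumptions rather than platform code: `RegistrationClient`, its `register` method, and the `blockingRegister` stand-in for a blocking IPC call are all hypothetical names, and the worker is passed in as an `Executor` so the example stays independent of any particular threading setup.

```java
import java.util.Objects;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical client whose async entry point only records and enqueues the request. */
final class RegistrationClient {
    private final Executor worker;           // worker thread(s) in the app process
    private final Runnable blockingRegister; // stand-in for a blocking registration over IPC

    RegistrationClient(Executor worker, Runnable blockingRegister) {
        this.worker = Objects.requireNonNull(worker);
        this.blockingRegister = Objects.requireNonNull(blockingRegister);
    }

    /** Returns right after enqueueing; the result is delivered on callbackExecutor. */
    void register(Executor callbackExecutor, Consumer<String> callback) {
        Objects.requireNonNull(callbackExecutor, "callbackExecutor");
        Objects.requireNonNull(callback, "callback");
        // The only work done on the calling thread is capturing the request.
        worker.execute(() -> {
            blockingRegister.run(); // may block, but only on the worker
            callbackExecutor.execute(() -> callback.accept("registered"));
        });
    }
}
```

In this sketch the caller's cost is bounded by one `execute` call, regardless of how long the registration itself takes.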

Asynchronous APIs should return void and only throw for invalid arguments

Async APIs should report all results of the requested operation to the provided callback. This allows the developer to implement a single code path for success and error handling.

Async APIs may check arguments for null and throw NullPointerException, or check that provided arguments are within a valid range and throw IllegalArgumentException. For example, a function that accepts a float in the range of 0-1f may check that the parameter is within this range and throw IllegalArgumentException if it is out of range. Similarly, a short String may be checked for conformance to a valid format such as alphanumerics-only. (Remember that the system server should never trust the app process! Any system service should duplicate these checks in the system service itself.)

All other errors should be reported to the provided callback. This includes, but is not limited to:

Terminal failure of the requested operation
Security exceptions for missing authorization/permissions required to complete the operation
Exceeded quota for performing the operation
App process is not sufficiently “foreground” to perform the operation
Required hardware has been disconnected
Network failures
Timeouts
Binder death/unavailable remote process
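The split between thrown argument errors and callback-reported failures can be sketched as follows. Everything here is a hypothetical illustration: `ResultCallback`, `VolumeController`, `setVolume`, and the boolean `hasPermission` flag (a stand-in for a real permission check) are assumptions made for the example, not existing APIs.

```java
import java.util.Objects;
import java.util.concurrent.Executor;

/** Hypothetical callback with a single code path each for success and failure. */
interface ResultCallback<T> {
    void onResult(T value);
    void onError(Exception error);
}

/** Hypothetical async API: returns void and throws only for invalid arguments. */
final class VolumeController {
    private final Executor worker;
    private final boolean hasPermission; // stand-in for a real permission check

    VolumeController(Executor worker, boolean hasPermission) {
        this.worker = Objects.requireNonNull(worker);
        this.hasPermission = hasPermission;
    }

    void setVolume(float level, Executor callbackExecutor, ResultCallback<Float> callback) {
        // Argument validation is the only thing that throws to the caller.
        Objects.requireNonNull(callbackExecutor, "callbackExecutor");
        Objects.requireNonNull(callback, "callback");
        if (!(level >= 0f && level <= 1f)) { // written this way so NaN is rejected too
            throw new IllegalArgumentException("level must be in [0, 1]: " + level);
        }
        worker.execute(() -> {
            if (!hasPermission) {
                // Reported through the callback, not thrown from setVolume().
                callbackExecutor.execute(
                        () -> callback.onError(new SecurityException("missing permission")));
                return;
            }
            callbackExecutor.execute(() -> callback.onResult(level));
        });
    }
}
```

A caller of this sketch handles every runtime outcome in the two callback methods and needs a try/catch only if it passes unvalidated arguments.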

Asynchronous APIs should provide a cancellation mechanism

Async APIs should provide a way to indicate to a running operation that the caller no longer cares about the result. This cancel operation should signal two things:

Hard references to callbacks provided by the caller should be released

Callbacks provided to async APIs may contain hard references to large object graphs, and ongoing work holding a hard reference to that callback can keep those object graphs from being garbage collected. By releasing these callback references on cancellation, these object graphs may become eligible for garbage collection much sooner than if the work were permitted to run to completion.

The execution engine performing work for the caller may stop that work

Work initiated by async API calls may carry a high cost in power consumption or other system resources. APIs that allow callers to signal when this work is no longer needed permit stopping that work before it can consume further system resources.
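Both effects of cancellation can be sketched in plain Java. The `CancelSignal` class below is a hypothetical stand-in for a platform cancellation signal, and `PendingRequest` is a hypothetical engine-side record; neither is a real Android class, and the design is one possible arrangement rather than a prescribed one.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical cancellation signal supplied by the caller of an async API. */
final class CancelSignal {
    private boolean cancelled;
    private Runnable onCancel;

    void setOnCancelListener(Runnable listener) {
        synchronized (this) {
            onCancel = listener;
            if (!cancelled) return;
        }
        listener.run(); // already cancelled: notify immediately
    }

    void cancel() {
        Runnable listener;
        synchronized (this) {
            if (cancelled) return; // idempotent
            cancelled = true;
            listener = onCancel;
        }
        if (listener != null) listener.run();
    }
}

/** Hypothetical engine-side record of one in-flight request. */
final class PendingRequest<T> {
    private final AtomicReference<Consumer<T>> callback;

    PendingRequest(Consumer<T> callback, CancelSignal signal) {
        this.callback = new AtomicReference<>(callback);
        // On cancel, drop the hard reference so the caller's object graph can be collected.
        signal.setOnCancelListener(() -> this.callback.set(null));
    }

    /** The engine can check this between units of work and stop early. */
    boolean isActive() {
        return callback.get() != null;
    }

    /** Delivers the result at most once, and never after cancellation. */
    void complete(T result) {
        Consumer<T> cb = callback.getAndSet(null);
        if (cb != null) cb.accept(result);
    }
}
```

In this sketch the engine holds the callback only through the `AtomicReference`, so clearing it on cancel releases the reference and also gives the work loop a cheap flag to poll.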

Special considerations for Cached or Frozen apps

When designing asynchronous APIs where callbacks originate in a system process and are delivered to apps, consider the following:

Processes and app lifecycle: the recipient app process may be in the cached state.
Cached Apps Freezer: the recipient app process may be frozen.

When an app process enters the cached state, this means that it's not currently hosting any user-visible components such as Activities and Services. The app is kept in memory in case it becomes user-visible again, but in the meantime should not be doing work. In most cases, you should pause dispatching app callbacks when that app enters the cached state and resume when the app exits the cached state, so as to not induce work in cached app processes.

A cached app may also be frozen. When an app is frozen, it receives zero CPU time and is not able to do any work at all. Any calls to that app's registered callbacks will be buffered and delivered when the app is unfrozen.

Buffered transactions to app callbacks may be stale by the time that the app is unfrozen and processes them. The buffer is finite, and overflowing it causes the recipient app to crash. To avoid overwhelming apps with stale events or overflowing their buffers, don't dispatch app callbacks while their process is frozen.

In review:

You should consider pausing dispatching app callbacks while the app's process is cached.
You MUST pause dispatching app callbacks while the app's process is frozen.

Registering for all states

To track when apps enter or exit the cached state:

mActivityManager.addOnUidImportanceListener(
    new UidImportanceListener() { ... },
    IMPORTANCE_CACHED);

For example, see ag/20754479 Defer sending display events to cached apps.

To track when apps are frozen or unfrozen:

IBinder binder = <...>;
binder.addFrozenStateChangeCallback(executor, callback);

Example change: ag/30850473 DisplayManagerService listens for frozen binder updates.

Strategies for resuming dispatching app callbacks

Whether you pause dispatching app callbacks when the app enters the cached state or the frozen state, you should resume dispatching the app's registered callbacks once the app exits the respective state, and continue until the app has unregistered its callback or the app process dies.

For example:

IBinder binder = <...>;
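// A lambda can only capture effectively final locals, so the flag is held in a
// java.util.concurrent.atomic.AtomicBoolean instead of a plain boolean.
AtomicBoolean shouldSendCallbacks = new AtomicBoolean(true);
binder.addFrozenStateChangeCallback(executor, (who, state) -> {
    if (state == IBinder.FrozenStateChangeCallback.STATE_FROZEN) {
        shouldSendCallbacks.set(false);
    } else if (state == IBinder.FrozenStateChangeCallback.STATE_UNFROZEN) {
        shouldSendCallbacks.set(true);
    }
});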

Alternatively, you can use RemoteCallbackList which takes care of not delivering callbacks to the target process when it is frozen.

For example:

RemoteCallbackList<IInterface> rc =
        new RemoteCallbackList.Builder<IInterface>(
                        RemoteCallbackList.FROZEN_CALLEE_POLICY_DROP)
                .setExecutor(executor)
                .build();
rc.register(callback);
rc.broadcast((callback) -> callback.foo(bar));

callback.foo() would only be invoked if the process is not frozen.

Apps often save updates they received via callbacks as a snapshot of the latest state. Consider a hypothetical API for apps to monitor the remaining battery percentage:

interface BatteryListener {
    void onBatteryPercentageChanged(int newPercentage);
}

Consider the scenario where multiple state change events happen while an app is frozen. When the app is unfrozen, you should deliver only the most recent state to the app and drop other stale state changes. This delivery should happen immediately when the app is unfrozen so the app can “catch up”. This can be achieved as follows:

RemoteCallbackList<IInterface> rc =
        new RemoteCallbackList.Builder<IInterface>(
                        RemoteCallbackList.FROZEN_CALLEE_POLICY_ENQUEUE_MOST_RECENT)
                .setExecutor(executor)
                .build();
rc.register(callback);
rc.broadcast((callback) -> callback.onBatteryPercentageChanged(value));

In some cases, you may track the last value delivered to the app so the app doesn't need to be notified of the same value once it is unfrozen.
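Tracking the last delivered value can be sketched in plain Java as follows. `LatestValueDispatcher` and its methods are hypothetical names for illustration only; the sketch does not use RemoteCallbackList, and it assumes `pause()` and `resume()` are driven by whatever frozen-state tracking the service already has.

```java
import java.util.function.IntConsumer;

/** Hypothetical per-listener dispatcher for a single int-valued state. */
final class LatestValueDispatcher {
    private final IntConsumer listener;
    private boolean paused;
    private Integer lastDelivered; // null until the first delivery
    private Integer pending;       // most recent value seen while paused

    LatestValueDispatcher(IntConsumer listener) {
        this.listener = listener;
    }

    synchronized void onValueChanged(int value) {
        if (paused) {
            pending = value; // overwrite: earlier paused values are stale
        } else {
            deliver(value);
        }
    }

    synchronized void pause() {
        paused = true;
    }

    /** Delivers the latest paused value, if any, as soon as dispatch resumes. */
    synchronized void resume() {
        paused = false;
        Integer value = pending;
        pending = null;
        if (value != null) deliver(value);
    }

    // Called with the lock held; a production version might dispatch on an executor instead.
    private void deliver(int value) {
        if (lastDelivered != null && lastDelivered == value) {
            return; // the listener already has this value
        }
        lastDelivered = value;
        listener.accept(value);
    }
}
```

With this arrangement, a value that changes and then returns to what the listener last saw produces no notification on resume.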

State may be expressed as more complex data. Consider a hypothetical API for apps to be notified of network interfaces:

interface NetworkListener {
    void onAvailable(Network network);
    void onLost(Network network);
    void onChanged(Network network);
}

When pausing notifications to an app, you should remember the set of networks and states that the app had last seen. Upon resuming, it's recommended to notify the app of old networks that were lost, of new networks that became available, and of existing networks whose state had changed, in this order.

Do not notify the app of networks that were made available and then lost while callbacks were paused. Apps should not receive a full account of events that happened while they were frozen, and API documentation should not promise to deliver event streams uninterrupted outside of explicit lifecycle states. In this example, if the app needs to continuously monitor network availability then it must remain in a lifecycle state that keeps it from becoming cached or frozen.
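The catch-up logic described above amounts to diffing two snapshots. The following sketch assumes, purely for illustration, that each network is identified by a String id and that its state is summarized as a String; `NetworkCatchUp` and `diff` are hypothetical names, and the returned strings merely name which listener method would be invoked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that computes catch-up notifications from two snapshots. */
final class NetworkCatchUp {
    /**
     * @param lastSeen network id -> state as last delivered to the app
     * @param current  network id -> state at the moment notifications resume
     */
    static List<String> diff(Map<String, String> lastSeen, Map<String, String> current) {
        List<String> events = new ArrayList<>();
        // 1. Networks the app knew about that no longer exist.
        for (String id : lastSeen.keySet()) {
            if (!current.containsKey(id)) events.add("onLost(" + id + ")");
        }
        // 2. Networks that appeared while paused and still exist.
        for (String id : current.keySet()) {
            if (!lastSeen.containsKey(id)) events.add("onAvailable(" + id + ")");
        }
        // 3. Networks present in both snapshots whose state differs.
        for (Map.Entry<String, String> e : current.entrySet()) {
            String old = lastSeen.get(e.getKey());
            if (old != null && !old.equals(e.getValue())) {
                events.add("onChanged(" + e.getKey() + ")");
            }
        }
        return events;
    }
}
```

A network that became available and was then lost while callbacks were paused appears in neither snapshot, so this approach never reports it.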

In review, you should coalesce events that happened after pausing and before resuming notifications, and deliver only the latest state to the registered app callbacks.

Considerations for developer documentation

Delivery of async events may be delayed, either because the sender paused delivery for a period of time as shown above or because the recipient app did not receive enough device resources to process the event in a timely way.

Discourage developers from making assumptions about the time between when their app is notified of an event and the time that the event actually happened.

Developer expectations for suspending APIs

Developers familiar with Kotlin's structured concurrency expect the following behaviors from any suspending API:

Suspend functions should complete all associated work before returning or throwing

Results of non-blocking operations are returned as normal function return values, and errors are reported by throwing exceptions. (This often means that callback parameters are unnecessary.)

Suspend functions should only invoke callback parameters in-place

Since suspend functions should always complete all associated work before returning, they should never invoke a provided callback or other function parameter or retain a reference to it after the suspend function has returned.

Suspend functions that accept callback parameters should be context-preserving unless otherwise documented

Calling a function in a suspend function causes it to run in the CoroutineContext of the caller. As suspend functions should complete all associated work before returning or throwing, and should only invoke callback parameters in-place, the default expectation is that any such callbacks are also run on the calling CoroutineContext using its associated dispatcher. If the API's purpose is to run a callback outside of the calling CoroutineContext, this behavior should be clearly documented.

Suspend functions should support kotlinx.coroutines Job cancellation

Any suspend function offered should cooperate with job cancellation as defined by kotlinx.coroutines. If the calling Job of an operation in progress is cancelled, the function should resume with a CancellationException as soon as possible so that the caller can clean up and continue promptly. This is handled automatically by suspendCancellableCoroutine and other suspending APIs offered by kotlinx.coroutines. Library implementations generally should not use suspendCoroutine directly, as it does not support this cancellation behavior by default.

Suspend functions that perform blocking work on a background thread (not the main or UI thread) must provide a way to configure the dispatcher used

It is not recommended to make a blocking function suspend solely to switch threads. For more information see Android API guidelines.

Calling a suspend function should not result in the creation of additional threads without permitting the developer to supply their own thread or thread pool to perform that work. For example, a constructor may accept a CoroutineContext that will be used to perform background work for the class's methods.

Suspend functions that would accept an optional CoroutineContext or Dispatcher parameter only to switch to that dispatcher to perform blocking work should instead expose the underlying blocking function and recommend that calling developers use their own call to withContext to direct the work to a desired dispatcher.

Classes launching coroutines

Classes that launch coroutines must have a CoroutineScope to perform those launch operations. Respecting structured concurrency principles implies the following structural patterns for obtaining and managing that scope.

Before writing a class that launches concurrent tasks into another scope, consider alternative patterns:

class MyClass {
    private val requests = Channel<MyRequest>(Channel.UNLIMITED)

    suspend fun handleRequests() {
        coroutineScope {
            for (request in requests) {
                // Allow requests to be processed concurrently;
                // alternatively, omit the [launch] and outer [coroutineScope]
                // to process requests serially
                launch {
                    processRequest(request)
                }
            }
        }
    }

    fun submitRequest(request: MyRequest) {
        requests.trySend(request).getOrThrow()
    }
}

Exposing a suspend fun to perform concurrent work allows the caller to invoke the operation in their own context, removing the need to have MyClass manage a CoroutineScope. Serializing the processing of requests becomes simpler and state can often exist as local variables of handleRequests instead of as class properties that would otherwise require additional synchronization.

Classes that manage coroutines should expose a `close()` and/or `cancel()` method

Classes that launch coroutines as implementation details must offer a way to cleanly shut down those ongoing concurrent tasks so that they do not leak uncontrolled concurrent work into a parent scope. Typically this takes the form of creating a child Job of a provided CoroutineContext:

private val myJob = Job(parent = coroutineContext[Job])
private val myScope = CoroutineScope(coroutineContext + myJob)

fun cancel() {
    myJob.cancel()
}

A join() method may also be provided to allow user code to await the completion of any outstanding concurrent work being performed by the object. (This may include cleanup work performed by cancelling an operation.)

suspend fun join() {
    myJob.join()
}

Naming terminal operations

The name of a method that cleanly shuts down an object's in-progress concurrent tasks should reflect the behavioral contract of how shutdown will occur:

Use close() when operations in progress will be allowed to complete but no new operations may begin after the call to close() returns.

Use cancel() when operations in progress may be cancelled before completing. No new operations may begin after the call to cancel() returns.

Class constructors accept `CoroutineContext`, not `CoroutineScope`

When objects are forbidden from launching directly into a provided parent scope, the suitability of CoroutineScope as a constructor parameter breaks down:

// Don't do this
class MyClass(scope: CoroutineScope) {
    private val myJob = Job(parent = scope.coroutineContext[Job])
    private val myScope = CoroutineScope(scope.coroutineContext + myJob)

    // ... the [scope] constructor parameter is never used again
}

The CoroutineScope becomes an unnecessary and misleading wrapper that in some use cases may be constructed solely to pass as a constructor parameter, only to be discarded:

// Don't do this; just pass the context
val myObject = MyClass(CoroutineScope(parentScope.coroutineContext + Dispatchers.IO))

`CoroutineContext` parameters default to `EmptyCoroutineContext`

When an optional CoroutineContext parameter appears in an API surface the default value must be the EmptyCoroutineContext sentinel. This allows for better composition of API behaviors, as an EmptyCoroutineContext value from a caller is treated in the same way as accepting the default:

class MyOuterClass(
    coroutineContext: CoroutineContext = EmptyCoroutineContext
) {
    private val innerObject = MyInnerClass(coroutineContext)

    // ...
}

class MyInnerClass(
    coroutineContext: CoroutineContext = EmptyCoroutineContext
) {
    private val job = Job(parent = coroutineContext[Job])
    private val scope = CoroutineScope(coroutineContext + job)

    // ...
}