blob: 1ac9d9f3dc519f4da403755fc2525a435b70245f [file] [log] [blame]
<html><body><pre>Android NDK &amp; ARM NEON instruction set extension support
--------------------------------------------------------
Introduction:
-------------
Android NDK r3 added support for the new 'armeabi-v7a' ARM-based ABI
that allows native code to use two useful instruction set extensions:
- Thumb-2, which provides performance comparable to 32-bit ARM
instructions with similar compactness to Thumb-1
- VFPv3, which provides hardware FPU registers and computations,
to boost floating point performance significantly.
More specifically, by default 'armeabi-v7a' only supports
VFPv3-D16 which only uses/requires 16 hardware FPU 64-bit registers.
More information about this can be read in docs/CPU-ARCH-ABIS.html
The ARMv7 Architecture Reference Manual also defines another optional
instruction set extension known as "ARM Advanced SIMD", nick-named
"NEON". It provides:
- A set of interesting scalar/vector instructions and registers
(the latter are mapped to the same chip area as the FPU ones),
comparable to MMX/SSE/3DNow! in the x86 world.
- VFPv3-D32 as a requirement (i.e. 32 hardware FPU 64-bit registers,
instead of the minimum of 16).
Not all ARMv7-based Android devices will support NEON, but those that
do may benefit in significant ways from the scalar/vector instructions.
The NDK supports the compilation of modules or even specific source
files with support for NEON. What this means is that a specific compiler
flag will be used to enable the use of GCC ARM Neon intrinsics and
VFPv3-D32 at the same time. The intrinsics are described here:
http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html
LOCAL_ARM_NEON:
---------------
Define LOCAL_ARM_NEON to 'true' in your module definition, and the NDK
will build all its source files with NEON support. This can be useful if
you want to build a static or shared library that specifically contains
NEON code paths.
Using the .neon suffix:
-----------------------
When listing sources files in your LOCAL_SRC_FILES variable, you now have
the option of using the .neon suffix to indicate that you want to
corresponding source(s) to be built with Neon support. For example:
LOCAL_SRC_FILES := foo.c.neon bar.c
Will only build 'foo.c' with NEON support.
Note that the .neon suffix can be used with the .arm suffix too (used to
specify the 32-bit ARM instruction set for non-NEON instructions), but must
appear after it.
In other words, 'foo.c.arm.neon' works, but 'foo.c.neon.arm' does NOT.
Build Requirements:
------------------
Neon support only works when targeting the 'armeabi-v7a' ABI, otherwise the
NDK build scripts will complain and abort. It is important to use checks like
the following in your Android.mk:
# define a static library containing our NEON code
ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
include $(CLEAR_VARS)
LOCAL_MODULE := mylib-neon
LOCAL_SRC_FILES := mylib-neon.c
LOCAL_ARM_NEON := true
include $(BUILD_STATIC_LIBRARY)
endif # TARGET_ARCH_ABI == armeabi-v7a
Runtime Detection:
------------------
As said previously, NOT ALL ARMv7-BASED ANDROID DEVICES WILL SUPPORT NEON !
It is thus crucial to perform runtime detection to know if the NEON-capable
machine code can be run on the target device.
To do that, use the 'cpufeatures' library that comes with this NDK. To learn
more about it, see docs/CPU-FEATURES.html.
You should explicitly check that android_getCpuFamily() returns
ANDROID_CPU_FAMILY_ARM, and that android_getCpuFeatures() returns a value
that has the ANDROID_CPU_ARM_FEATURE_NEON flag set, as in:
#include &lt;cpu-features.h&gt;
...
...
if (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &amp;&amp;
(android_getCpuFeatures() &amp; ANDROID_CPU_ARM_FEATURE_NEON) != 0)
{
// use NEON-optimized routines
...
}
else
{
// use non-NEON fallback routines instead
...
}
...
Sample code:
------------
Look at the source code for the "hello-neon" sample in this NDK for an example
on how to use the 'cpufeatures' library and Neon intrinsics at the same time.
This implements a tiny benchmark for a FIR filter loop using a C version, and
a NEON-optimized one for devices that support it.
</pre></body></html>