The ArmNN Delegate can be found within the ArmNN repository but it is a standalone piece of software. However, it makes use of the ArmNN library. For this reason we have added two options to build the delegate. The first option allows you to build the delegate together with the ArmNN library, the second option is a standalone build of the delegate.
This tutorial uses an Aarch64 machine with Ubuntu 18.04 installed that can build all components natively (no cross-compilation required). This is to keep this guide simple.
Build Dependencies:
Required Tools:
Our first step is to build all the build dependencies I have mentioned above. We will have to create quite a few directories. To make navigation a bit easier define a base directory for the project. At this stage we can also install all the tools that are required during the build.
export BASEDIR=/home cd $BASEDIR apt-get update && apt-get install git wget unzip zip python git cmake scons
Tensorflow has a few dependencies on it's own. It requires the python packages pip3, numpy, wheel, keras_preprocessing and also bazel which is used to compile Tensoflow. A description on how to build bazel can be found here. There are multiple ways. I decided to compile from source because that should work for any platform and therefore adds the most value to this guide. Depending on your operating system and architecture there might be an easier way.
# Install the python packages pip3 install -U pip numpy wheel pip3 install -U keras_preprocessing --no-deps # Bazel has a dependency on JDK apt-get install openjdk-11-jdk # Build Bazel wget -O bazel-3.1.0-dist.zip https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-dist.zip unzip -d bazel bazel-3.1.0-dist.zip cd bazel env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh # This creates an "output" directory where the bazel binary can be found # Download Tensorflow cd $BASEDIR git clone https://github.com/tensorflow/tensorflow.git cd tensorflow/ git checkout tags/v2.3.1 # Minimum version required for the delegate
Before tensorflow can be built, targets need to be defined in the BUILD
file that can be found in the root directory of Tensorflow. Append the following two targets to the file:
cc_binary( name = "libtensorflow_all.so", linkshared = 1, deps = [ "//tensorflow/core:framework", "//tensorflow/core:tensorflow", "//tensorflow/cc:cc_ops", "//tensorflow/cc:client_session", "//tensorflow/cc:scope", "//tensorflow/c:c_api", ], ) cc_binary( name = "libtensorflow_lite_all.so", linkshared = 1, deps = [ "//tensorflow/lite:framework", "//tensorflow/lite/kernels:builtin_ops", ], )
Now the build process can be started. When calling “configure”, as below, a dialog shows up that asks the user to specify additional options. If you don't have any particular needs to your build, decline all additional options and choose default values. Building libtensorflow_all.so
requires quite some time. This might be a good time to get yourself another drink and take a break.
PATH="$BASEDIR/bazel/output:$PATH" ./configure $BASEDIR/bazel/output/bazel build --define=grpc_no_ares=true --config=opt --config=monolithic --strip=always --config=noaws libtensorflow_all.so $BASEDIR/bazel/output/bazel build --config=opt --config=monolithic --strip=always libtensorflow_lite_all.so
Flatbuffers is a memory efficient cross-platform serialization library as described here. It is used in tflite to store models and is also a dependency of the delegate. After downloading the right version it can be built and installed using cmake.
cd $BASEDIR wget -O flatbuffers-1.12.0.zip https://github.com/google/flatbuffers/archive/v1.12.0.zip unzip -d . flatbuffers-1.12.0.zip cd flatbuffers-1.12.0 mkdir install && mkdir build && cd build # I'm using a different install directory but that is not required cmake .. -DCMAKE_INSTALL_PREFIX:PATH=$BASEDIR/flatbuffers-1.12.0/install make install
The ArmNN library depends on the Arm Compute Library (ACL). It provides a set of functions that are optimized for both Arm CPUs and GPUs. The Arm Compute Library is used directly by ArmNN to run machine learning workloads on Arm CPUs and GPUs.
It is important to have the right version of ACL and ArmNN to make it work. Luckily, ArmNN and ACL are developed very closely and released together. If you would like to use the ArmNN version “20.11” you can use the same “20.11” version for ACL too.
To build the Arm Compute Library on your platform, download the Arm Compute Library and check the branch out that contains the version you want to use and build it using scons
.
cd $BASEDIR git clone https://review.mlplatform.org/ml/ComputeLibrary cd ComputeLibrary/ git checkout <branch_name> # e.g. branches/arm_compute_20_11 # The machine used for this guide only has a Neon CPU which is why I only have "neon=1" but if # your machine has an arm Gpu you can enable that by adding `opencl=1 embed_kernels=1 to the command below scons arch=arm64-v8a neon=1 extra_cxx_flags="-fPIC" benchmark_tests=0 validation_tests=0
After building ACL we can now continue building ArmNN. To do so, download the repository and checkout the same version as you did for ACL. Create a build directory and use cmake to build it.
cd $BASEDIR git clone "https://review.mlplatform.org/ml/armnn" cd armnn git checkout <branch_name> # e.g. branches/armnn_20_11 mkdir build && cd build # if you've got an arm Gpu add `-DARMCOMPUTECL=1` to the command below cmake .. -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary -DARMCOMPUTENEON=1 -DBUILD_UNIT_TESTS=0 make
The delegate as well as ArmNN is built using cmake. Create a build directory as usual and build the Delegate with the additional cmake arguments shown below
cd $BASEDIR/armnn/delegate && mkdir build && cd build cmake .. -DTENSORFLOW_LIB_DIR=$BASEDIR/tensorflow/bazel-bin \ # Directory where tensorflow libraries can be found -DTENSORFLOW_ROOT=$BASEDIR/tensorflow \ # The top directory of the tensorflow repository -DTFLITE_LIB_ROOT=$BASEDIR/tensorflow/bazel-bin \ # In our case the same as TENSORFLOW_LIB_DIR -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-1.12.0/install \ # The install directory -DArmnn_DIR=$BASEDIR/armnn/build \ # Directory where the ArmNN library can be found -DARMNN_SOURCE_DIR=$BASEDIR/armnn # The top directory of the ArmNN repository. # Required are the includes for ArmNN make
To ensure that the build was successful you can run the unit tests for the delegate that can be found in the build directory for the delegate. Doctest was used to create those tests. Using test filters you can filter out tests that your build is not configured for. In this case, because ArmNN was only built for Cpu acceleration (CpuAcc), we filter for all test suites that have CpuAcc
in their name.
cd $BASEDIR/armnn/delegate/build ./DelegateUnitTests --test-suite=*CpuAcc*
If you have built for Gpu acceleration as well you might want to change your test-suite filter:
./DelegateUnitTests --test-suite=*CpuAcc*,*GpuAcc*
In the introduction it was mentioned that there is a way to integrate the delegate build into ArmNN. This is pretty straight forward. The cmake arguments that were previously used for the delegate have to be added to the ArmNN cmake arguments. Also another argument BUILD_ARMNN_TFLITE_DELEGATE
needs to be added to instruct ArmNN to build the delegate as well. The new commands to build ArmNN are as follows:
cd $BASEDIR git clone "https://review.mlplatform.org/ml/armnn" cd armnn git checkout <branch_name> # e.g. branches/armnn_20_11 mkdir build && cd build # if you've got an arm Gpu add `-DARMCOMPUTECL=1` to the command below cmake .. -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary \ -DARMCOMPUTENEON=1 \ -DBUILD_UNIT_TESTS=0 \ -DBUILD_ARMNN_TFLITE_DELEGATE=1 \ -DTENSORFLOW_LIB_DIR=$BASEDIR/tensorflow/bazel-bin \ -DTENSORFLOW_ROOT=$BASEDIR/tensorflow \ -DTFLITE_LIB_ROOT=$BASEDIR/tensorflow/bazel-bin \ -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-1.12.0/install make
The delegate library can then be found in build/armnn/delegate
.
The delegate can be integrated into your c++ project by creating a TfLite Interpreter and instructing it to use the ArmNN delegate for the graph execution. This should look similiar to the following code snippet.
// Create TfLite Interpreter std::unique_ptr<Interpreter> armnnDelegateInterpreter; InterpreterBuilder(tfLiteModel, ::tflite::ops::builtin::BuiltinOpResolver()) (&armnnDelegateInterpreter) // Create the ArmNN Delegate armnnDelegate::DelegateOptions delegateOptions(backends); std::unique_ptr<TfLiteDelegate, decltype(&armnnDelegate::TfLiteArmnnDelegateDelete)> theArmnnDelegate(armnnDelegate::TfLiteArmnnDelegateCreate(delegateOptions), armnnDelegate::TfLiteArmnnDelegateDelete); // Instruct the Interpreter to use the armnnDelegate armnnDelegateInterpreter->ModifyGraphWithDelegate(theArmnnDelegate.get());
For further information on using TfLite Delegates please visit the tensorflow website