NNXSW-1853 Change SubgraphViewSelector algorithm

The current algorithm in SubgraphViewSelector has a bug that can lead to
it producing subgraphs which have a dependency cycle (see the newly
added test case 'ValidMerge' for a repro). It also fails to merge
subgraphs in some cases where it could, which leads to smaller subgraphs.
In the case of FSRCNN, the NPU cannot support these smaller subgraphs and
so this is blocking us from supporting that network.

This commit changes the algorithm to fix the dependency bug and to
merge subgraphs in the cases that were previously missed. It also adds
unit tests to cover the cases that were problematic before and to
extend coverage for the new algorithm.
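
To illustrate the kind of dependency cycle described above, here is a
minimal sketch of a validity check for a proposed merge. It is not the
SubgraphViewSelector implementation: the types LayerId, SubgraphId and
Edge and the function MergeCreatesCycle are hypothetical names invented
for this example. The idea is to treat the two subgraphs as one, build
the graph of subgraphs, and run a topological sort (Kahn's algorithm).
If not every subgraph can be sorted, the merge would introduce a cycle.

```cpp
#include <cstddef>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not part of the Arm NN API.
using LayerId = std::size_t;
using SubgraphId = std::size_t;
using Edge = std::pair<LayerId, LayerId>; // producer layer -> consumer layer

// Returns true if treating subgraphs `a` and `b` as a single subgraph would
// make the graph of subgraphs cyclic. `subgraphOf[i]` is the subgraph that
// layer `i` currently belongs to; it is taken by value so it can be relabelled.
bool MergeCreatesCycle(const std::vector<Edge>& edges,
                       std::vector<SubgraphId> subgraphOf,
                       SubgraphId a,
                       SubgraphId b,
                       std::size_t numSubgraphs)
{
    // Relabel every layer of subgraph b as belonging to subgraph a.
    for (SubgraphId& s : subgraphOf)
    {
        if (s == b)
        {
            s = a;
        }
    }

    // Collect the edges between different subgraphs (internal edges are ignored).
    std::set<std::pair<SubgraphId, SubgraphId>> subgraphEdges;
    for (const Edge& e : edges)
    {
        const SubgraphId from = subgraphOf[e.first];
        const SubgraphId to = subgraphOf[e.second];
        if (from != to)
        {
            subgraphEdges.insert({from, to});
        }
    }

    // Kahn's algorithm: repeatedly remove subgraphs with no unprocessed inputs.
    std::vector<std::size_t> inDegree(numSubgraphs, 0);
    for (const auto& e : subgraphEdges)
    {
        ++inDegree[e.second];
    }

    std::queue<SubgraphId> ready;
    for (SubgraphId s = 0; s < numSubgraphs; ++s)
    {
        if (inDegree[s] == 0)
        {
            ready.push(s);
        }
    }

    std::size_t processed = 0;
    while (!ready.empty())
    {
        const SubgraphId s = ready.front();
        ready.pop();
        ++processed;
        for (const auto& e : subgraphEdges)
        {
            if (e.first == s && --inDegree[e.second] == 0)
            {
                ready.push(e.second);
            }
        }
    }

    // Any subgraph left unprocessed is part of a cycle.
    return processed != numSubgraphs;
}
```

For a chain of three layers 0 -> 1 -> 2, each in its own subgraph,
merging the first and last subgraphs is rejected because the middle
subgraph would both consume from and produce for the merged one, while
merging the first two is accepted.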

The new algorithm has two downsides compared to the previous one:

1. Disjoint subgraphs are not merged. This can never lead to a failed
compilation by the NPU and so I believe this is less of an issue than
the previous algorithm's "missed merges". This could however lead to a
runtime performance loss in some cases as the NPU will be unable
to parallelise as many operations. There are some unit tests that cover
this which I have disabled.
2. The performance is worse. I have spent some time analysing this and
for a graph with ~1000 layers the new algorithm takes 20ms vs. the
old algorithm's 4ms (on my desktop PC). I believe the performance is
still within acceptable limits. I also compared Inception V3 (the
network that caused performance issues with the original version of
the splitting algorithm) and the new algorithm has not regressed there
(200-300us in both cases).

Change-Id: I1dd64a779f272723621e04d203b5a2752a6af2ef
Signed-off-by: Robert Hughes <robert.hughes@arm.com>
4 files changed
tree: 5643a27f8fa35e058aaf3656b4952d48e4645ee2
README.md

Arm NN

Arm NN is a key component of the machine learning platform, which is part of the Linaro Machine Intelligence Initiative. For more information on the machine learning platform and Arm NN, see https://mlplatform.org/. Further Arm NN information is available from https://developer.arm.com/products/processors/machine-learning/arm-nn

There is a getting started guide here using TensorFlow: https://developer.arm.com/technologies/machine-learning-on-arm/developer-material/how-to-guides/configuring-the-arm-nn-sdk-build-environment-for-tensorflow

There is a getting started guide here using TensorFlow Lite: https://developer.arm.com/technologies/machine-learning-on-arm/developer-material/how-to-guides/configuring-the-arm-nn-sdk-build-environment-for-tensorflow-lite

There is a getting started guide here using Caffe: https://developer.arm.com/technologies/machine-learning-on-arm/developer-material/how-to-guides/configuring-the-arm-nn-sdk-build-environment-for-caffe

There is a getting started guide here using ONNX: https://developer.arm.com/technologies/machine-learning-on-arm/developer-material/how-to-guides/configuring-the-arm-nn-sdk-build-environment-for-onnx

There is a guide for backend development: Backend development guide

Build Instructions

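Arm tests the build system of Arm NN with several build environments, including those described in the build guides in the repository root (BuildGuideAndroidNDK.md and BuildGuideCrossCompilation.md).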

Arm NN is written using portable C++14 and the build system uses CMake, therefore it is possible to build for a wide variety of target platforms, from a wide variety of host environments.

The armnn/tests directory contains tests used during Arm NN development. Many of them depend on third-party IP, model protobufs and image files not distributed with Arm NN. The dependencies of some of the tests are available freely on the Internet, for those who wish to experiment.

The ‘armnn/samples’ directory contains SimpleSample.cpp, a very basic example of the Arm NN SDK API in use.

The ‘ExecuteNetwork’ program, in armnn/tests/ExecuteNetwork, has no additional dependencies beyond those required by Arm NN and the model parsers. It takes any model and any input tensor, and simply prints out the output tensor. Run it with no arguments to see command-line help.

The ‘ArmnnConverter’ program, in armnn/src/armnnConverter, has no additional dependencies beyond those required by Arm NN and the model parsers. It takes a model in TensorFlow format and produces a serialized model in Arm NN format. Run it with no arguments to see command-line help. Note that this program can only convert models for which all operations are supported by the serialization tool src/armnnSerializer.

The ‘ArmnnQuantizer’ program, in armnn/src/armnnQuantizer, has no additional dependencies beyond those required by Arm NN and the model parsers. It takes a 32-bit float network and converts it into a quantized asymmetric 8-bit or quantized symmetric 16-bit network. Static quantization is supported by default, but dynamic quantization can be enabled if a CSV file of raw input tensors is specified. Run it with no arguments to see command-line help.
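
As background on what asymmetric 8-bit quantization means, the following is a minimal sketch of one common scheme: a float range [min, max] is mapped onto the integers 0 to 255 using a scale and an integer offset (zero point). This is an illustration under stated assumptions, not the ArmnnQuantizer implementation; the names QuantParams, ComputeAsymm8Params, Quantize and Dequantize are hypothetical, and the exact rounding and range handling used by the tool may differ.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper type for illustration; not part of the Arm NN API.
struct QuantParams
{
    float scale;  // size of one quantization step in float units
    int offset;   // quantized value that represents 0.0f (the zero point)
};

// Derives scale and offset so that [min, max] maps onto [0, 255].
// Assumption: the range is widened to include 0.0f so that zero is
// exactly representable.
QuantParams ComputeAsymm8Params(float min, float max)
{
    min = std::min(min, 0.0f);
    max = std::max(max, 0.0f);

    float scale = (max - min) / 255.0f;
    if (scale == 0.0f)
    {
        scale = 1.0f; // degenerate range [0, 0]; any non-zero scale works
    }

    int offset = static_cast<int>(std::lround(-min / scale));
    offset = std::min(255, std::max(0, offset));
    return {scale, offset};
}

// Maps a float value to an 8-bit value, saturating at the ends of the range.
std::uint8_t Quantize(float value, const QuantParams& params)
{
    const long q = std::lround(value / params.scale) + params.offset;
    return static_cast<std::uint8_t>(std::min(255L, std::max(0L, q)));
}

// Maps an 8-bit value back to the float value it represents.
float Dequantize(std::uint8_t q, const QuantParams& params)
{
    return static_cast<float>(static_cast<int>(q) - params.offset) * params.scale;
}
```

With a range of [-128, 127], for example, this sketch gives a scale of 1 and an offset of 128, so 0.0f is stored as 128 and values outside the range saturate at 0 or 255.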

Note that Arm NN needs to be built against a particular version of Arm's Compute Library. The get_compute_library.sh script in the scripts subdirectory clones the Compute Library from the review.mlplatform.org git repository into a directory alongside armnn named ‘clframework’ and checks out the correct revision.

License

Arm NN is provided under the MIT license. See LICENSE for more information. Contributions to this project are accepted under the same license.

Individual files contain the following tag instead of the full license text.

SPDX-License-Identifier: MIT

This enables machine processing of license information based on the SPDX License Identifiers that are available here: http://spdx.org/licenses/
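
As a small illustration of the machine processing this tag makes possible, the sketch below extracts the identifier from the text of a source file. The function name FindSpdxLicenseId is hypothetical and the parsing is deliberately simple (it takes the rest of the line after the first tag it finds); it is not a full SPDX expression parser.

```cpp
#include <cstddef>
#include <string>

// Returns the license identifier that follows the first
// "SPDX-License-Identifier:" tag in `text`, or an empty string if the
// tag is absent or has nothing after it on the same line.
std::string FindSpdxLicenseId(const std::string& text)
{
    const std::string tag = "SPDX-License-Identifier:";
    const std::size_t pos = text.find(tag);
    if (pos == std::string::npos)
    {
        return "";
    }

    // Skip spaces and tabs between the tag and the identifier.
    const std::size_t begin = text.find_first_not_of(" \t", pos + tag.size());
    if (begin == std::string::npos)
    {
        return "";
    }

    // The identifier runs to the end of the line.
    const std::size_t end = text.find_first_of("\r\n", begin);
    const std::string id =
        text.substr(begin, end == std::string::npos ? std::string::npos : end - begin);

    // Trim trailing spaces and tabs.
    const std::size_t last = id.find_last_not_of(" \t");
    return last == std::string::npos ? "" : id.substr(0, last + 1);
}
```

Called on a header comment such as "// SPDX-License-Identifier: MIT", the function returns "MIT".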

Contributions

The Arm NN project welcomes contributions. For more details on contributing to Arm NN see the Contributing page on the MLPlatform.org website, or see the Contributor Guide.