# Setting up QNN Backend

This is a tutorial for building and running the Qualcomm AI Engine Direct backend,
including compiling a model on an x64 host and running the inference
on an Android device.


## Prerequisite

Please finish the tutorial [Setting up executorch](../../docs/source/getting-started-setup.md).

## Conventions

`$QNN_SDK_ROOT` refers to the root of the Qualcomm AI Engine Direct SDK,
i.e., the directory containing `QNN_README.txt`.

`$ANDROID_NDK` refers to the root of the Android NDK.

`$EXECUTORCH_ROOT` refers to the root of the executorch git repository.

## Environment Setup

### Download Qualcomm AI Engine Direct SDK

Navigate to [Qualcomm AI Engine Direct SDK](https://developer.qualcomm.com/software/qualcomm-ai-engine-direct-sdk) and click the download button.

You might need to apply for a Qualcomm account to download the SDK.

After logging in, search for Qualcomm AI Stack in the *Tool* panel.
You can find the Qualcomm AI Engine Direct SDK under the AI Stack group.

Please download the Linux version, and follow the instructions on the page to
extract the file.

The SDK should be installed to somewhere like `/opt/qcom/aistack/qnn` by default.

### Download Android NDK

Please navigate to [Android NDK](https://developer.android.com/ndk) and download
a version of the NDK. We recommend the LTS version, currently r25c.

### Setup environment variables

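The variables defined in the Conventions section are used throughout the
following steps. If you have not exported them yet, you can do so now. The
paths below are examples only; adjust them to your actual install locations:

```bash
export QNN_SDK_ROOT=/opt/qcom/aistack/qnn   # default install location from the step above
export ANDROID_NDK=$HOME/android-ndk-r25c   # example path; wherever you extracted the NDK
export EXECUTORCH_ROOT=$HOME/executorch     # example path; your executorch checkout
```
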
We need to make sure the Qualcomm AI Engine Direct libraries can be found by
the dynamic linker on x64, hence we set `LD_LIBRARY_PATH`. In production,
we recommend putting the libraries in the default search path or using `rpath`
to indicate their location.

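For example, an `rpath` can be embedded at link time so that the runtime linker
finds the libraries without `LD_LIBRARY_PATH`. Below is a minimal sketch with a
hypothetical object file `app.o`; it is not a command from this project's build:

```bash
# Link against libQnnHtp.so and record its directory in the binary's rpath.
clang++ app.o -o app \
    -L"$QNN_SDK_ROOT/lib/x86_64-linux-clang" -lQnnHtp \
    -Wl,-rpath,"$QNN_SDK_ROOT/lib/x86_64-linux-clang"
```
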
Further, we set `$PYTHONPATH` because it makes developing and importing the
executorch Python APIs easier. Alternatively, users can build and install
executorch as a regular Python package.

```bash
export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/x86_64-linux-clang/:$LD_LIBRARY_PATH
export PYTHONPATH=$EXECUTORCH_ROOT/..
```
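
As an optional sanity check, verify that one of the libraries used later in
this tutorial is visible under the directory just added to `LD_LIBRARY_PATH`:

```bash
ls "$QNN_SDK_ROOT/lib/x86_64-linux-clang/libQnnHtp.so"
```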


## End to End Inference

### Step 1: Build Python APIs for AOT compilation on x64

Python APIs on x64 are required to compile models to Qualcomm AI Engine Direct binaries.
Make sure `buck2` is under a directory in `PATH`.

```bash
cd $EXECUTORCH_ROOT
mkdir build_x86_64
cd build_x86_64
cmake .. -DEXECUTORCH_BUILD_QNN=ON -DQNN_SDK_ROOT=${QNN_SDK_ROOT}
cmake --build . -t "PyQnnManagerAdaptor" "PyQnnWrapperAdaptor" -j8

# Install the Python APIs to the correct import path.
# The filename might vary depending on your Python and host version.
cp -f backends/qualcomm/PyQnnManagerAdaptor.cpython-310-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
cp -f backends/qualcomm/PyQnnWrapperAdaptor.cpython-310-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
```
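
Since the exact filename depends on your Python version, a glob makes the copy
step version-agnostic (a small sketch, assuming exactly one matching build
artifact per adaptor):

```bash
cp -f backends/qualcomm/PyQnnManagerAdaptor.cpython-*-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
cp -f backends/qualcomm/PyQnnWrapperAdaptor.cpython-*-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
```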


### Step 2: Build `qnn_executor_runner` for Android

`qnn_executor_runner` is an executable that runs the compiled model.

You might want to make sure the correct `flatc` is used. `flatc` is built along
with the step above; for example, it can be found in
`build_x86_64/third-party/flatbuffers/`. We can prepend
`$EXECUTORCH_ROOT/build_x86_64/third-party/flatbuffers` to `PATH` so that the
cross-compilation below finds the correct flatbuffer compiler.

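For example (prepending the freshly built `flatc` and checking which version is
picked up):

```bash
export PATH=$EXECUTORCH_ROOT/build_x86_64/third-party/flatbuffers:$PATH
flatc --version
```
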
Commands to build `qnn_executor_runner` for Android:

```bash
cd $EXECUTORCH_ROOT
mkdir build_android
cd build_android
# build executorch & qnn_executorch_backend
cmake .. \
    -DBUCK2=buck2 \
    -DCMAKE_INSTALL_PREFIX=$PWD \
    -DEXECUTORCH_BUILD_QNN=ON \
    -DQNN_SDK_ROOT=$QNN_SDK_ROOT \
    -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
    -DANDROID_ABI='arm64-v8a' \
    -DANDROID_NATIVE_API_LEVEL=23 \
    -B$PWD

cmake --build $PWD -j16 --target install

cmake ../examples/qualcomm \
    -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
    -DANDROID_ABI='arm64-v8a' \
    -DANDROID_NATIVE_API_LEVEL=23 \
    -DCMAKE_PREFIX_PATH="$PWD/lib/cmake/ExecuTorch;$PWD/third-party/gflags;" \
    -DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=BOTH \
    -Bexamples/qualcomm

cmake --build examples/qualcomm -j16
```
**Note:** If you want to build for release, add `-DCMAKE_BUILD_TYPE=Release` to the `cmake` command options.

You can find `qnn_executor_runner` under `build_android/examples/qualcomm/`.
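
To confirm the runner was cross-compiled for the device (an optional check),
`file` should report an AArch64 ELF executable:

```bash
file $EXECUTORCH_ROOT/build_android/examples/qualcomm/qnn_executor_runner
```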


### Step 3: Compile a model

```bash
python -m examples.qualcomm.scripts.export_example --model_name mv2
```

Then the generated `mv2.pte` can be run on the device by
`build_android/examples/qualcomm/qnn_executor_runner` with the Qualcomm AI Engine
Direct backend.

**Note:** To get proper accuracy, please apply calibration with a representative
dataset; you can learn more from the examples under `examples/qualcomm/`.
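
A quick check that the export step produced the program file (assuming it is
written to the current working directory):

```bash
ls -lh mv2.pte
```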


### Step 4: Model Inference

The backend relies on the Qualcomm AI Engine Direct SDK libraries.

You might want to follow the docs in the Qualcomm AI Engine Direct SDK to set up
the device environment, or see below for a quick setup for testing:

```bash
# Make sure you have write permission on the path below.
DEVICE_DIR=/data/local/tmp/executorch_test/
adb shell "mkdir -p ${DEVICE_DIR}"
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV69Stub.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV73Stub.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnSystem.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/hexagon-v69/unsigned/libQnnHtpV69Skel.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/hexagon-v73/unsigned/libQnnHtpV73Skel.so ${DEVICE_DIR}
```
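
To verify the push succeeded (an optional check), list the device directory:

```bash
adb shell "ls -l ${DEVICE_DIR}"
```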

We also need to tell the dynamic linkers on Android and Hexagon where to find
these libraries by setting `ADSP_LIBRARY_PATH` and `LD_LIBRARY_PATH`.

So, we can run `qnn_executor_runner` like:
```bash
adb push mv2.pte ${DEVICE_DIR}
adb push ${EXECUTORCH_ROOT}/build_android/examples/qualcomm/qnn_executor_runner ${DEVICE_DIR}
adb shell "cd ${DEVICE_DIR} \
    && export LD_LIBRARY_PATH=${DEVICE_DIR} \
    && export ADSP_LIBRARY_PATH=${DEVICE_DIR} \
    && ./qnn_executor_runner --model_path ./mv2.pte"
```

You should see the following result.
Note that no output file will be generated in this example.
```
I 00:00:00.133366 executorch:qnn_executor_runner.cpp:156] Method loaded.
I 00:00:00.133590 executorch:util.h:104] input already initialized, refilling.
I 00:00:00.135162 executorch:qnn_executor_runner.cpp:161] Inputs prepared.
I 00:00:00.136768 executorch:qnn_executor_runner.cpp:278] Model executed successfully.
[INFO][Qnn ExecuTorch] Destroy Qnn backend parameters
[INFO][Qnn ExecuTorch] Destroy Qnn context
[INFO][Qnn ExecuTorch] Destroy Qnn device
[INFO][Qnn ExecuTorch] Destroy Qnn backend
```