An easy-to-use tool for generating / executing .pte programs from pre-built model libraries / context binaries produced by Qualcomm AI Engine Direct. The tool has been verified with a host environment.
This tool is aimed at users who want to leverage the ExecuTorch runtime framework with existing artifacts generated by QNN; it lets them produce a .pte program in a few steps.
If users are interested in well-known applications, Qualcomm AI HUB is a great option, providing many optimized, state-of-the-art models ready for deployment. All of them can be downloaded in model library or context binary format.
Model libraries (.so) generated by `qnn-model-lib-generator` | AI HUB, or context binaries (.bin) generated by `qnn-context-binary-generator` | AI HUB, can be fed to the tool directly:

```bash
python export.py compile
python export.py execute
```
```bash
mkdir $MY_WS && cd $MY_WS
# target chipset is `SM8650`
python -m qai_hub_models.models.quicksrnetlarge_quantized.export --target-runtime qnn --chipset qualcomm-snapdragon-8gen3
```
The model library will be located at `$MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so`. This model library maps to the artifacts generated by the SDK tools mentioned in the Integration workflow section of the Qualcomm AI Engine Direct document.

```bash
# `pip install pydot` if the package is missing
# Note that device serial & hostname might not be required if the given artifact is in context binary format
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py compile -a $MY_WS/build/quicksrnetlarge_quantized/quicksrnetlarge_quantized.so -m SM8650 -s $DEVICE_SERIAL -b $EXECUTORCH_ROOT/build-android
```
The IO tensor information of the model can be checked in the generated artifacts:
- `output_pte/quicksrnetlarge_quantized/quicksrnetlarge_quantized.json`
- `output_pte/quicksrnetlarge_quantized/quicksrnetlarge_quantized.svg`

Download a test image:

```bash
cd $MY_WS
wget https://user-images.githubusercontent.com/12981474/40157448-eff91f06-5953-11e8-9a37-f6b5693fa03f.png -O baboon.png
```

Execute the following python script to generate input data:
```python
import torch
import torchvision.transforms as transforms
from PIL import Image

img = Image.open('baboon.png').resize((128, 128))
transform = transforms.Compose([transforms.PILToTensor()])
# convert (C, H, W) to (N, H, W, C)
# IO tensor info. could be checked with quicksrnetlarge_quantized.json | .svg
img = transform(img).permute(1, 2, 0).unsqueeze(0)
torch.save(img, 'baboon.pt')
```
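If torchvision or the test image is not at hand, the layout transformation above can be sanity-checked with a random tensor instead (a minimal sketch; the `(3, 128, 128)` shape stands in for what `PILToTensor` would produce from the 128x128 RGB image):

```python
import torch

# stand-in for the (C, H, W) uint8 tensor PILToTensor would produce from a 128x128 RGB image
chw = torch.randint(0, 256, (3, 128, 128), dtype=torch.uint8)

# same transform as above: (C, H, W) -> (H, W, C) -> (N, H, W, C)
nhwc = chw.permute(1, 2, 0).unsqueeze(0)
print(nhwc.shape)  # torch.Size([1, 128, 128, 3])
```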
```bash
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py execute -p output_pte/quicksrnetlarge_quantized -i baboon.pt -s $DEVICE_SERIAL -b $EXECUTORCH_ROOT/build-android
```
The execution results will be stored under `output_data`:

```bash
cd output_data
```

Execute the following python script to generate the output image:

```python
import io
import torch
import torchvision.transforms as transforms

# IO tensor info. could be checked with quicksrnetlarge_quantized.json | .svg
# generally we would have the same layout for input / output tensors: e.g. either NHWC or NCHW
# this might not be true under different converter configurations
# learn more with the converter tool from Qualcomm AI Engine Direct documentation
# https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/tools.html#model-conversion
with open('output__142.pt', 'rb') as f:
    buffer = io.BytesIO(f.read())
img = torch.load(buffer, weights_only=False)
transform = transforms.Compose([transforms.ToPILImage()])
img_pil = transform(img.squeeze(0))
img_pil.save('baboon_upscaled.png')
```

You can check the upscaled result now!
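If your converter configuration yields NCHW outputs instead of NHWC (the comments above note the layout is configuration-dependent), a permute back to channels-first may be needed before further processing. A minimal sketch with a dummy tensor (the 512x512 size is only illustrative):

```python
import torch

# dummy NHWC output tensor, e.g. an upscaled RGB image (sizes are illustrative)
nhwc = torch.rand(1, 512, 512, 3)

# NHWC -> NCHW: move the channel axis in front for utilities expecting (N, C, H, W)
nchw = nhwc.permute(0, 3, 1, 2)
print(nchw.shape)  # torch.Size([1, 3, 512, 512])
```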
Please check the help messages for more information:

```bash
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py -h
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py compile -h
PYTHONPATH=$EXECUTORCH_ROOT/.. python $EXECUTORCH_ROOT/examples/qualcomm/qaihub_scripts/utils/export.py execute -h
```