backends/apple/mps/setup.md - platform/external/executorch - Git at Google

 # Building and Running ExecuTorch with MPS Backend

 In this tutorial we will walk you through the process of getting setup to build the MPS backend for ExecuTorch and running a simple model on it.

 The MPS backend device maps machine learning computational graphs and primitives on the [MPS Graph](https://developer.apple.com/documentation/metalperformanceshadersgraph/mpsgraph?language=objc) framework and tuned kernels provided by [MPS](https://developer.apple.com/documentation/metalperformanceshaders?language=objc).

 ::::{grid} 2
 :::{grid-item-card}  What you will learn in this tutorial:
 :class-card: card-prerequisites
 * In this tutorial you will learn how to export [MobileNet V3](https://pytorch.org/vision/main/models/mobilenetv3.html) model to the MPS delegate.
 * You will also learn how to compile and deploy the ExecuTorch runtime with the MPS delegate on macOS and iOS.
 :::
 :::{grid-item-card}  Tutorials we recommend you complete before this:
 :class-card: card-prerequisites
 * [Introduction to ExecuTorch](intro-how-it-works.md)
 * [Setting up ExecuTorch](getting-started-setup.md)
 * [Building ExecuTorch with CMake](runtime-build-and-cross-compilation.md)
 * [ExecuTorch iOS Demo App](demo-apps-ios.md)
 * [ExecuTorch iOS LLaMA Demo App](llm/llama-demo-ios.md)
 :::
 ::::


 ## Prerequisites (Hardware and Software)

 In order to be able to successfully build and run a model using the MPS backend for ExecuTorch, you'll need the following hardware and software components:

 ### Hardware:
  - A [mac](https://www.apple.com/mac/) for tracing the model

 ### Software:

   - **Ahead of time** tracing:
     - [macOS](https://www.apple.com/macos/) 12

   - **Runtime**:
     - [macOS](https://www.apple.com/macos/) >= 12.4
     - [iOS](https://www.apple.com/ios) >= 15.4
     - [Xcode](https://developer.apple.com/xcode/) >= 14.1

 ## Setting up Developer Environment

 ***Step 1.*** Please finish tutorial [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup).

 ***Step 2.*** Install dependencies needed to lower MPS delegate:

   ```bash
   ./backends/apple/mps/install_requirements.sh
   ```

 ## Build

 ### AOT (Ahead-of-time) Components

 **Compiling model for MPS delegate**:
 - In this step, you will generate a simple ExecuTorch program that lowers MobileNetV3 model to the MPS delegate. You'll then pass this Program (the `.pte` file) during the runtime to run it using the MPS backend.

 ```bash
 cd executorch
 # Note: `mps_example` script uses by default the MPSPartitioner for ops that are not yet supported by the MPS delegate. To turn it off, pass `--no-use_partitioner`.
 python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --bundled --use_fp16

 # To see all options, run following command:
 python3 -m examples.apple.mps.scripts.mps_example --help
 ```

 ### Runtime

 **Building the MPS executor runner:**
 ```bash
 # In this step, you'll be building the `mps_executor_runner` that is able to run MPS lowered modules:
 cd executorch
 ./examples/apple/mps/scripts/build_mps_executor_runner.sh
 ```

 ## Run the mv3 generated model using the mps_executor_runner

 ```bash
 ./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program
 ```

 - You should see the following results. Note that no output file will be generated in this example:
 ```
 I 00:00:00.003290 executorch:mps_executor_runner.mm:286] Model file mv3_mps_bundled_fp16.pte is loaded.
 I 00:00:00.003306 executorch:mps_executor_runner.mm:292] Program methods: 1
 I 00:00:00.003308 executorch:mps_executor_runner.mm:294] Running method forward
 I 00:00:00.003311 executorch:mps_executor_runner.mm:349] Setting up non-const buffer 1, size 606112.
 I 00:00:00.003374 executorch:mps_executor_runner.mm:376] Setting up memory manager
 I 00:00:00.003376 executorch:mps_executor_runner.mm:392] Loading method name from plan
 I 00:00:00.018942 executorch:mps_executor_runner.mm:399] Method loaded.
 I 00:00:00.018944 executorch:mps_executor_runner.mm:404] Loading bundled program...
 I 00:00:00.018980 executorch:mps_executor_runner.mm:421] Inputs prepared.
 I 00:00:00.118731 executorch:mps_executor_runner.mm:438] Model executed successfully.
 I 00:00:00.122615 executorch:mps_executor_runner.mm:501] Model verified successfully.
 ```

 ### [Optional] Run the generated model directly using pybind
 1. Make sure `pybind` MPS support was installed:
 ```bash
 ./install_requirements.sh --pybind mps
 ```
 2. Run the `mps_example` script to trace the model and run it directly from python:
 ```bash
 cd executorch
 # Check correctness between PyTorch eager forward pass and ExecuTorch MPS delegate forward pass
 python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --check_correctness
 # You should see following output: `Results between ExecuTorch forward pass with MPS backend and PyTorch forward pass for mv3_mps are matching!`

 # Check performance between PyTorch MPS forward pass and ExecuTorch MPS forward pass
 python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --bench_pytorch
 ```

 ### Profiling:
 1. [Optional] Generate an [ETRecord](./sdk-etrecord.rst) while you're exporting your model.
 ```bash
 cd executorch
 python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b
 ```
 2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./sdk-etdump.md).
 ```
 ./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program --dump-outputs
 ```
 3. Create an instance of the Inspector API by passing in the ETDump you have sourced from the runtime along with the optionally generated ETRecord from step 1.
 ```bash
 python3 -m sdk.inspector.inspector_cli --etdump_path etdump.etdp --etrecord_path etrecord.bin
 ```

 ## Deploying and Running on Device

 ***Step 1***. Create the ExecuTorch core and MPS delegate frameworks to link on iOS
 ```bash
 cd executorch
 ./build/build_apple_frameworks.sh --mps
 ```

 `mps_delegate.xcframework` will be in `cmake-out` folder, along with `executorch.xcframework` and `portable_delegate.xcframework`:
 ```bash
 cd cmake-out && ls
 ```

 ***Step 2***. Link the frameworks into your XCode project:
 Go to project Target’s  `Build Phases`  -  `Link Binaries With Libraries`, click the **+** sign and add the frameworks: files located in  `Release` folder.
 - `executorch.xcframework`
 - `portable_delegate.xcframework`
 - `mps_delegate.xcframework`

 From the same page, include the needed libraries for the MPS delegate:
 - `MetalPerformanceShaders.framework`
 - `MetalPerformanceShadersGraph.framework`
 - `Metal.framework`

 In this tutorial, you have learned how to lower a model to the MPS delegate, build the mps_executor_runner and run a lowered model through the MPS delegate, or directly on device using the MPS delegate static library.


 ## Frequently encountered errors and resolution.

 If you encountered any bugs or issues following this tutorial please file a bug/issue on the [ExecuTorch repository](https://github.com/pytorch/executorch/issues), with hashtag **#mps**.
	# Building and Running ExecuTorch with MPS Backend

	In this tutorial we will walk you through the process of getting setup to build the MPS backend for ExecuTorch and running a simple model on it.

	The MPS backend device maps machine learning computational graphs and primitives on the [MPS Graph](https://developer.apple.com/documentation/metalperformanceshadersgraph/mpsgraph?language=objc) framework and tuned kernels provided by [MPS](https://developer.apple.com/documentation/metalperformanceshaders?language=objc).

	::::{grid} 2
	:::{grid-item-card} What you will learn in this tutorial:
	:class-card: card-prerequisites
	* In this tutorial you will learn how to export [MobileNet V3](https://pytorch.org/vision/main/models/mobilenetv3.html) model to the MPS delegate.
	* You will also learn how to compile and deploy the ExecuTorch runtime with the MPS delegate on macOS and iOS.
	:::
	:::{grid-item-card} Tutorials we recommend you complete before this:
	:class-card: card-prerequisites
	* [Introduction to ExecuTorch](intro-how-it-works.md)
	* [Setting up ExecuTorch](getting-started-setup.md)
	* [Building ExecuTorch with CMake](runtime-build-and-cross-compilation.md)
	* [ExecuTorch iOS Demo App](demo-apps-ios.md)
	* [ExecuTorch iOS LLaMA Demo App](llm/llama-demo-ios.md)
	:::
	::::


	## Prerequisites (Hardware and Software)

	In order to be able to successfully build and run a model using the MPS backend for ExecuTorch, you'll need the following hardware and software components:

	### Hardware:
	- A [mac](https://www.apple.com/mac/) for tracing the model

	### Software:

	- Ahead of time tracing:
	- [macOS](https://www.apple.com/macos/) 12

	- Runtime:
	- [macOS](https://www.apple.com/macos/) >= 12.4
	- [iOS](https://www.apple.com/ios) >= 15.4
	- [Xcode](https://developer.apple.com/xcode/) >= 14.1

	## Setting up Developer Environment

	*Step 1.* Please finish tutorial [Setting up ExecuTorch](https://pytorch.org/executorch/stable/getting-started-setup).

	*Step 2.* Install dependencies needed to lower MPS delegate:

	```bash
	./backends/apple/mps/install_requirements.sh
	```

	## Build

	### AOT (Ahead-of-time) Components

	Compiling model for MPS delegate:
	- In this step, you will generate a simple ExecuTorch program that lowers MobileNetV3 model to the MPS delegate. You'll then pass this Program (the `.pte` file) during the runtime to run it using the MPS backend.

	```bash
	cd executorch
	# Note: `mps_example` script uses by default the MPSPartitioner for ops that are not yet supported by the MPS delegate. To turn it off, pass `--no-use_partitioner`.
	python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --bundled --use_fp16

	# To see all options, run following command:
	python3 -m examples.apple.mps.scripts.mps_example --help
	```

	### Runtime

	Building the MPS executor runner:
	```bash
	# In this step, you'll be building the `mps_executor_runner` that is able to run MPS lowered modules:
	cd executorch
	./examples/apple/mps/scripts/build_mps_executor_runner.sh
	```

	## Run the mv3 generated model using the mps_executor_runner

	```bash
	./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program
	```

	- You should see the following results. Note that no output file will be generated in this example:
	```
	I 00:00:00.003290 executorch:mps_executor_runner.mm:286] Model file mv3_mps_bundled_fp16.pte is loaded.
	I 00:00:00.003306 executorch:mps_executor_runner.mm:292] Program methods: 1
	I 00:00:00.003308 executorch:mps_executor_runner.mm:294] Running method forward
	I 00:00:00.003311 executorch:mps_executor_runner.mm:349] Setting up non-const buffer 1, size 606112.
	I 00:00:00.003374 executorch:mps_executor_runner.mm:376] Setting up memory manager
	I 00:00:00.003376 executorch:mps_executor_runner.mm:392] Loading method name from plan
	I 00:00:00.018942 executorch:mps_executor_runner.mm:399] Method loaded.
	I 00:00:00.018944 executorch:mps_executor_runner.mm:404] Loading bundled program...
	I 00:00:00.018980 executorch:mps_executor_runner.mm:421] Inputs prepared.
	I 00:00:00.118731 executorch:mps_executor_runner.mm:438] Model executed successfully.
	I 00:00:00.122615 executorch:mps_executor_runner.mm:501] Model verified successfully.
	```

	### [Optional] Run the generated model directly using pybind
	1. Make sure `pybind` MPS support was installed:
	```bash
	./install_requirements.sh --pybind mps
	```
	2. Run the `mps_example` script to trace the model and run it directly from python:
	```bash
	cd executorch
	# Check correctness between PyTorch eager forward pass and ExecuTorch MPS delegate forward pass
	python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --check_correctness
	# You should see following output: `Results between ExecuTorch forward pass with MPS backend and PyTorch forward pass for mv3_mps are matching!`

	# Check performance between PyTorch MPS forward pass and ExecuTorch MPS forward pass
	python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --bench_pytorch
	```

	### Profiling:
	1. [Optional] Generate an [ETRecord](./sdk-etrecord.rst) while you're exporting your model.
	```bash
	cd executorch
	python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b
	```
	2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./sdk-etdump.md).
	```
	./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program --dump-outputs
	```
	3. Create an instance of the Inspector API by passing in the ETDump you have sourced from the runtime along with the optionally generated ETRecord from step 1.
	```bash
	python3 -m sdk.inspector.inspector_cli --etdump_path etdump.etdp --etrecord_path etrecord.bin
	```

	## Deploying and Running on Device

	*Step 1*. Create the ExecuTorch core and MPS delegate frameworks to link on iOS
	```bash
	cd executorch
	./build/build_apple_frameworks.sh --mps
	```

	`mps_delegate.xcframework` will be in `cmake-out` folder, along with `executorch.xcframework` and `portable_delegate.xcframework`:
	```bash
	cd cmake-out && ls
	```

	*Step 2*. Link the frameworks into your XCode project:
	Go to project Target’s `Build Phases` - `Link Binaries With Libraries`, click the + sign and add the frameworks: files located in `Release` folder.
	- `executorch.xcframework`
	- `portable_delegate.xcframework`
	- `mps_delegate.xcframework`

	From the same page, include the needed libraries for the MPS delegate:
	- `MetalPerformanceShaders.framework`
	- `MetalPerformanceShadersGraph.framework`
	- `Metal.framework`

	In this tutorial, you have learned how to lower a model to the MPS delegate, build the mps_executor_runner and run a lowered model through the MPS delegate, or directly on device using the MPS delegate static library.


	## Frequently encountered errors and resolution.

	If you encountered any bugs or issues following this tutorial please file a bug/issue on the [ExecuTorch repository](https://github.com/pytorch/executorch/issues), with hashtag #mps.