commit | 2b7eb62ddfdb534fbf581a7aeb23dd61032b737b | |
---|---|---|
author | Max Ren <maxren@meta.com> | Mon Sep 18 13:22:21 2023 -0700 |
committer | Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com> | Mon Sep 18 13:22:21 2023 -0700 |
tree | 9396aab2bffafd6db13781f38ec3c54f80cca6f1 | |
parent | 51b238579cc21cd7a9f85eab25cbb9e2ffaae3fe | |
remove transpose addmm weights hack (#358)

Summary:
Pull Request resolved: https://github.com/pytorch/executorch/pull/358

### Background
A common pattern we see when encountering addmm is that the weights are permuted before being passed to addmm. This is because, for torch.nn.Linear, the input shape and weight shape are generally:
```
input: (*, in_features)
weight: (out_features, in_features)
```
while the input shape and weight shape of addmm are the following:
```
input1 (input): (*, in_features)
input2 (weight): (in_features, out_features)
```
So when decomposing nn.Linear to addmm, the weights go through a permute node to comply with addmm's shapes.

### XNNPACK Status
XNNPACK can handle both the transposed and the normal weight shape; however, it requires a flag indicating whether or not the weights are transposed. So an easy optimization is to skip the permute node and use the flag.

### Change and Motivation
Currently, we have hardcoded some of this optimization logic directly into serialization. I believe that serialization should not be aware of these optimizations, which is why I am removing this logic from serialization. Instead, this logic should be performed entirely by the addmm --> linear pass, which recomposes permute + addmm into a single linear. We should no longer rely on serialization to perform this optimization (right now it is erroneous and causes a bug).

Reviewed By: kirklandsign

Differential Revision: D49129704

fbshipit-source-id: 1134c33f76eb27ac05a90b29c6dc057c8c647b58
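The shape relationship described in the Background section can be illustrated with a small pure-Python sketch. The helpers `transpose`, `addmm`, and `linear` below are hypothetical stand-ins written only for this illustration; they are not the PyTorch operators, the ExecuTorch pass, or any XNNPACK API. The sketch only shows that permute + addmm on an `(out_features, in_features)` weight should give the same result as a single linear on the unpermuted weight, which is the equivalence the addmm --> linear pass relies on.

```python
# Illustrative sketch only: plain nested lists instead of tensors.
# All helper names are hypothetical and exist just for this example.

def transpose(m):
    """Swap rows and columns; stands in for the permute node on the weights."""
    return [list(col) for col in zip(*m)]

def addmm(bias, mat1, mat2):
    """bias + mat1 @ mat2, with mat1: (n, in_features), mat2: (in_features, out_features)."""
    columns = list(zip(*mat2))  # one tuple per output feature
    return [
        [b + sum(x * w for x, w in zip(row, col)) for b, col in zip(bias, columns)]
        for row in mat1
    ]

def linear(x, weight, bias):
    """x @ weight^T + bias, with weight: (out_features, in_features)."""
    return [
        [b + sum(xi * wi for xi, wi in zip(row, w_row)) for b, w_row in zip(bias, weight)]
        for row in x
    ]

x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]        # (2, in_features=3)
weight = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]  # (out_features=2, in_features=3)
bias = [0.1, -0.2]

# Decomposed form: the weights pass through a transpose before addmm.
decomposed = addmm(bias, x, transpose(weight))

# Recomposed form: one linear that consumes the weights as stored.
recomposed = linear(x, weight, bias)
```

Under these assumptions both forms produce a `(2, out_features)` result with matching values, so a backend that accepts a "weights are transposed" flag can skip the permute entirely.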
A unified ML software stack within the PyTorch platform for edge devices. It defines new compiler entry points as well as a state-of-the-art runtime.
Compared to the legacy Lite Interpreter, there are some major benefits:
executorch
├── backends # 1st party backend implementations.
|   ├── xnnpack
|   ├── vulkan
├── build # Utilities for managing the build system.
├── bundled_program # Utilities for attaching reference inputs and outputs to models. TODO move to extension
├── codegen # Tooling to autogenerate bindings between kernels and the runtime. TODO move to tool
├── configurations # TODO delete this
├── docs # Static docs tooling
├── examples # Examples of various user flows, such as model export, delegates, and runtime execution.
|   ├── executor_runner
|   ├── export
|   ├── models
├── exir # Ahead of time library, model capture and lowering apis.
|   ├── backend # Backend delegate ahead of time APIs
|   ├── capture # Program capture.
|   ├── dialects # Op sets for various dialects in the export process.
|   ├── emit # Conversion from ExportedProgram to Executorch execution instructions.
|   ├── program # Export artifacts.
|   ├── serialize # Serialize final export artifact.
├── extension # Extensions built on top of the runtime.
|   ├── aten_util
|   ├── data_loader # 1st party data loader implementations.
|   ├── memory_allocator # 1st party memory allocator implementations.
|   ├── pybindings # Python api for executorch runtime.
|   ├── pytree # C++ and Python flattening and unflattening lib for pytrees.
|   ├── testing_util
├── kernels # 1st party kernel implementations.
|   ├── aten
|   ├── optimized
|   ├── portable # Reference implementations of ATen operators.
|   ├── prim_ops # Special ops used in executorch runtime for control flow and symbolic primitives.
|   ├── quantized
├── profiler # Utilities for profiling. TODO delete in favor of ETDump in sdk/
├── runtime # core cpp runtime of executorch
|   ├── backend # Backend delegate runtime APIs
|   ├── core # Core structures used across all levels of the runtime
|   ├── executor # Model loading, initialization, and execution.
|   ├── kernel # Kernel registration and management.
|   ├── platform # Layer between architecture specific code and user calls.
├── schema # Executorch program definition, TODO move under serialization/
├── scripts # Utility scripts for size management, dependency management, etc.
├── sdk # Model profiling, debugging, and introspection: NOT READY YET FOR OSS USE
├── shim # Compatibility layer between OSS and Internal builds
├── test # Broad scoped end2end tests
├── third-party # third-party dependencies
├── util # TODO delete this
ExecuTorch is BSD licensed, as found in the LICENSE file.