aten.avg_pool2d (#3770)

Summary:
Pull Request resolved: https://github.com/pytorch/executorch/pull/3770

## The Operator
`nn.Module` invocations of [`torch.nn.AvgPool2d`](https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html) get compiled to `aten.avg_pool2d.default` in the Edge Dialect, which carries the following signature.
```
- func: avg_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, bool ceil_mode=False, bool count_include_pad=True, int? divisor_override=None) -> Tensor
```
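
As a rough illustration of how `kernel_size`, `stride`, `padding`, and `ceil_mode` in this signature determine the output size, here is a minimal pure-Python sketch. It follows the commonly documented PyTorch pooling shape rule as we understand it. The helper names `pool_output_size` and `avg_pool2d_output_shape` are hypothetical and do not appear in this change.

```python
def pool_output_size(in_size, kernel, stride, padding, ceil_mode=False):
    """Output length along one spatial axis (hypothetical helper, dilation = 1)."""
    # Adding stride - 1 before the floor division rounds the result up.
    extra = stride - 1 if ceil_mode else 0
    out = (in_size + 2 * padding - kernel + extra) // stride + 1
    # In ceil mode, drop the last window if it would start entirely in the
    # right/bottom padding instead of inside the input.
    if ceil_mode and (out - 1) * stride >= in_size + padding:
        out -= 1
    return out


def avg_pool2d_output_shape(in_hw, kernel_size, stride=(), padding=(0, 0),
                            ceil_mode=False):
    """(H_out, W_out) for an (H, W) input; an empty stride means stride = kernel_size."""
    stride = tuple(stride) or tuple(kernel_size)
    return tuple(
        pool_output_size(i, k, s, p, ceil_mode)
        for i, k, s, p in zip(in_hw, kernel_size, stride, padding)
    )
```

For example, under these assumptions a 5x5 input with a 2x2 kernel yields a 2x2 output in floor mode and a 3x3 output with `ceil_mode=True`.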

## Implementation
This is a full C-packing implementation, including dynamic shape support. We start from [LiteInterpreter's `avg_pool2d.glsl` logic](https://github.com/pytorch/pytorch/blob/9257a0698b57acc5607ee6fe31a16fdd93af1731/aten/src/ATen/native/vulkan/glsl/avg_pool2d.glsl), which is incomplete, and extend it to cover the `ceil_mode=True`, `count_include_pad=True`, and `divisor_override` cases for full support. As a result, the divisor computation is now somewhat complex. If needed, we can split it into separate, simpler shaders in the future.
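
To show why the divisor depends on all three options, here is a hedged pure-Python sketch of the per-window divisor. It mirrors the PyTorch reference semantics as we understand them and is not the shader code from this change. The function name `avg_pool2d_divisor` and its argument layout are hypothetical.

```python
def avg_pool2d_divisor(oh, ow, in_hw, kernel_size, stride, padding,
                       count_include_pad=True, divisor_override=None):
    """Divisor for output element (oh, ow). Hypothetical reference helper."""
    if divisor_override is not None:
        # An explicit override wins over every other rule.
        return divisor_override

    (in_h, in_w), (k_h, k_w) = in_hw, kernel_size
    (s_h, s_w), (p_h, p_w) = stride, padding

    # Window bounds in padded coordinates. The end is clipped to input + padding,
    # so the overhang introduced by ceil_mode is never counted.
    h0, w0 = oh * s_h - p_h, ow * s_w - p_w
    h1, w1 = min(h0 + k_h, in_h + p_h), min(w0 + k_w, in_w + p_w)
    if count_include_pad:
        return (h1 - h0) * (w1 - w0)

    # Otherwise count only the elements that lie inside the real input.
    h0, w0 = max(h0, 0), max(w0, 0)
    h1, w1 = min(h1, in_h), min(w1, in_w)
    return (h1 - h0) * (w1 - w0)
```

Under these assumptions, the top-left window of a 4x4 input with a 3x3 kernel, stride 1, and padding 1 divides by 9 when `count_include_pad=True` and by 4 when it is `False`.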
ghstack-source-id: 228476264

Reviewed By: copyrightly

Differential Revision: D57918523

fbshipit-source-id: 8069c4a2dcc5d46da7221d58661e57bf2055b521
8 files changed
tree: 6268f66bac8c6fbae5903512bcf2b974a9e30d22
README.md

ExecuTorch

ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.

Key value propositions of ExecuTorch are:

  • Portability: Compatibility with a wide variety of computing platforms, from high-end mobile phones to highly constrained embedded systems and microcontrollers.
  • Productivity: Enabling developers to use the same toolchains and SDK from PyTorch model authoring and conversion, to debugging and deployment to a wide variety of platforms.
  • Performance: Providing end users with a seamless and high-performance experience due to a lightweight runtime and utilizing full hardware capabilities such as CPUs, NPUs, and DSPs.

For a comprehensive technical overview of ExecuTorch and step-by-step tutorials, please visit our documentation website for the latest release (or the main branch).

Feedback

We welcome any feedback, suggestions, and bug reports from the community to help us improve our technology. Please use the PyTorch Forums for discussion and feedback about ExecuTorch using the ExecuTorch category, and our GitHub repository for bug reporting.

We recommend using the latest release tag from the Releases page when developing.

Directory Structure

executorch
├── backends                        #  Backend delegate implementations.
├── build                           #  Utilities for managing the build system.
├── bundled_program                 #  Utilities for attaching reference inputs and outputs to models.
├── codegen                         #  Tooling to autogenerate bindings between kernels and the runtime.
├── configurations
├── docs                            #  Static docs tooling
├── examples                        #  Examples of various user flows, such as model export, delegates, and runtime execution.
├── exir                            #  Ahead-of-time library: model capture and lowering APIs.
|   ├── _serialize                  #  Serialize final export artifact.
|   ├── backend                     #  Backend delegate ahead of time APIs
|   ├── capture                     #  Program capture.
|   ├── dialects                    #  Op sets for various dialects in the export process.
|   ├── emit                        #  Conversion from ExportedProgram to ExecuTorch execution instructions.
|   ├── passes                      #  Built-in compiler passes.
|   ├── program                     #  Export artifacts.
|   ├── verification                #  IR verification.
├── extension                       #  Extensions built on top of the runtime.
|   ├── aten_util
|   ├── data_loader                 #  1st party data loader implementations.
|   ├── memory_allocator            #  1st party memory allocator implementations.
|   ├── pybindings                  #  Python API for the ExecuTorch runtime.
|   ├── pytree                      #  C++ and Python flattening and unflattening lib for pytrees.
|   ├── testing_util
├── kernels                         #  1st party kernel implementations.
|   ├── aten
|   ├── optimized
|   ├── portable                    #  Reference implementations of ATen operators.
|   ├── prim_ops                    #  Special ops used in the ExecuTorch runtime for control flow and symbolic primitives.
|   ├── quantized
├── profiler                        #  Utilities for profiling.
├── runtime                         #  Core C++ runtime.
|   ├── backend                     #  Backend delegate runtime APIs
|   ├── core                        #  Core structures used across all levels of the runtime
|   ├── executor                    #  Model loading, initialization, and execution.
|   ├── kernel                      #  Kernel registration and management.
|   ├── platform                    #  Layer between architecture specific code and user calls.
├── schema                          #  ExecuTorch program definition
├── scripts                         #  Utility scripts for size management, dependency management, etc.
├── sdk                             #  Model profiling, debugging, and introspection.
├── shim                            #  Compatibility layer between OSS and Internal builds
├── test                            #  Broad scoped end2end tests
├── third-party                     #  Third-party dependencies
├── util

License

ExecuTorch is BSD licensed, as found in the LICENSE file.