[JAX] Refactor jax_jit to avoid DevicePut on pruned args.

name                                  old cpu/op  new cpu/op  delta
eager_unary_dispatch                  35.7µs ± 2%  35.9µs ± 3%     ~     (p=0.841 n=5+5)
eager_unary                           36.4µs ± 2%  36.6µs ± 3%     ~     (p=0.421 n=5+5)
eager_binary_dispatch                 45.6µs ± 1%  46.1µs ± 2%     ~     (p=0.421 n=5+5)
eager_binary                          46.6µs ± 2%  47.0µs ± 5%     ~     (p=1.000 n=5+5)
jit_trivial_dispatch                  41.4µs ± 1%  41.4µs ± 0%     ~     (p=0.690 n=5+5)
jit_trivial                           42.4µs ± 1%  42.3µs ± 1%     ~     (p=0.841 n=5+5)
jit_simple_dispatch                   8.85µs ± 3%  9.15µs ± 3%     ~     (p=0.095 n=5+5)
jit_simple                            9.77µs ± 1%  9.82µs ± 2%     ~     (p=0.548 n=5+5)
jit_simple_many_args_dispatch_10      13.4µs ± 1%  13.6µs ± 3%     ~     (p=0.222 n=5+5)
jit_simple_many_args_10               14.0µs ± 2%  14.1µs ± 1%     ~     (p=0.421 n=5+5)
jit_simple_pruned_args_dispatch_10    8.05µs ± 3%  8.07µs ± 4%     ~     (p=0.841 n=5+5)
jit_simple_pruned_args_10             9.53µs ± 2%  9.43µs ± 2%     ~     (p=0.222 n=5+5)
jit_simple_many_args_dispatch_100     55.2µs ± 1%  54.8µs ± 2%     ~     (p=0.310 n=5+5)
jit_simple_many_args_100              55.8µs ± 1%  55.8µs ± 1%     ~     (p=0.841 n=5+5)
jit_simple_pruned_args_dispatch_100   14.3µs ± 4%  12.6µs ± 1%  -11.41%  (p=0.016 n=5+4)
jit_simple_pruned_args_100            14.8µs ± 1%  13.3µs ± 2%  -10.06%  (p=0.008 n=5+5)
jit_simple_many_args_dispatch_1000     489µs ± 1%   477µs ± 3%     ~     (p=0.056 n=5+5)
jit_simple_many_args_1000              495µs ± 3%   493µs ± 3%     ~     (p=0.841 n=5+5)
jit_simple_pruned_args_dispatch_1000  85.0µs ± 3%  65.3µs ± 3%  -23.13%  (p=0.008 n=5+5)
jit_simple_pruned_args_1000           86.0µs ± 3%  66.4µs ± 3%  -22.78%  (p=0.008 n=5+5)
jit_simple_many_args_dispatch_2000    1.09ms ± 4%  1.03ms ± 3%   -5.97%  (p=0.016 n=5+5)
jit_simple_many_args_2000             1.07ms ± 3%  1.04ms ± 5%     ~     (p=0.095 n=5+5)
jit_simple_pruned_args_dispatch_2000   190µs ± 3%   144µs ± 3%  -23.96%  (p=0.008 n=5+5)
jit_simple_pruned_args_2000            195µs ± 4%   147µs ± 3%  -24.29%  (p=0.008 n=5+5)
jit_dispatch_without_transfer         76.0µs ± 1%  77.2µs ± 6%     ~     (p=0.310 n=5+5)
jit_dispatch_with_transfer            82.1µs ± 5%  81.3µs ± 2%     ~     (p=0.421 n=5+5)
sda_index_1                           8.83µs ± 1%  8.73µs ± 2%     ~     (p=0.222 n=5+5)
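
The delta column can be read as the relative change of the new mean against the old mean. The sketch below redoes that arithmetic for the jit_simple_pruned_args_dispatch_1000 row, using only the rounded values printed in the table. The helper name percent_delta is made up for illustration. Because the inputs are rounded, the result differs slightly from the -23.13% in the table, which was presumably computed from the unrounded samples.

```python
def percent_delta(old, new):
    """Relative change of `new` against `old`, in percent (hypothetical helper)."""
    return (new - old) / old * 100.0

# Rounded means (in microseconds) copied from the
# jit_simple_pruned_args_dispatch_1000 row of the cpu/op table.
old_us, new_us = 85.0, 65.3

delta = percent_delta(old_us, new_us)
print(f"{delta:.2f}%")  # -23.18%
```

A negative value means the new code is faster; rows marked ~ are those where the difference was not statistically significant for the given p-value.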

name                                  old time/op             new time/op             delta
eager_unary_dispatch                  35.7µs ± 2%             35.9µs ± 3%     ~             (p=0.841 n=5+5)
eager_unary                           36.5µs ± 2%             37.1µs ± 4%     ~             (p=0.222 n=5+5)
eager_binary_dispatch                 45.6µs ± 1%             46.1µs ± 2%     ~             (p=0.421 n=5+5)
eager_binary                          46.8µs ± 3%             47.1µs ± 5%     ~             (p=1.000 n=5+5)
jit_trivial_dispatch                  41.4µs ± 1%             41.4µs ± 0%     ~             (p=0.690 n=5+5)
jit_trivial                           42.4µs ± 1%             42.3µs ± 1%     ~             (p=0.841 n=5+5)
jit_simple_dispatch                   8.86µs ± 3%             9.15µs ± 3%     ~             (p=0.095 n=5+5)
jit_simple                            9.82µs ± 1%             9.91µs ± 0%     ~             (p=0.190 n=5+4)
jit_simple_many_args_dispatch_10      13.4µs ± 1%             13.6µs ± 4%     ~             (p=0.310 n=5+5)
jit_simple_many_args_10               14.1µs ± 2%             14.2µs ± 1%     ~             (p=0.421 n=5+5)
jit_simple_pruned_args_dispatch_10    8.07µs ± 4%             8.07µs ± 4%     ~             (p=0.841 n=5+5)
jit_simple_pruned_args_10             9.59µs ± 2%             9.48µs ± 2%     ~             (p=0.222 n=5+5)
jit_simple_many_args_dispatch_100     55.2µs ± 1%             54.8µs ± 2%     ~             (p=0.310 n=5+5)
jit_simple_many_args_100              55.9µs ± 1%             55.9µs ± 1%     ~             (p=0.841 n=5+5)
jit_simple_pruned_args_dispatch_100   14.3µs ± 5%             12.6µs ± 1%  -11.75%          (p=0.016 n=5+4)
jit_simple_pruned_args_100            14.8µs ± 2%             13.3µs ± 2%  -10.19%          (p=0.008 n=5+5)
jit_simple_many_args_dispatch_1000     489µs ± 1%              477µs ± 3%     ~             (p=0.056 n=5+5)
jit_simple_many_args_1000              495µs ± 3%              493µs ± 3%     ~             (p=0.841 n=5+5)
jit_simple_pruned_args_dispatch_1000  85.0µs ± 3%             65.3µs ± 3%  -23.13%          (p=0.008 n=5+5)
jit_simple_pruned_args_1000           86.1µs ± 3%             66.5µs ± 2%  -22.72%          (p=0.008 n=5+5)
jit_simple_many_args_dispatch_2000    1.09ms ± 4%             1.03ms ± 3%   -5.96%          (p=0.016 n=5+5)
jit_simple_many_args_2000             1.07ms ± 3%             1.04ms ± 5%     ~             (p=0.095 n=5+5)
jit_simple_pruned_args_dispatch_2000   190µs ± 3%              144µs ± 3%  -23.97%          (p=0.008 n=5+5)
jit_simple_pruned_args_2000            195µs ± 4%              147µs ± 3%  -24.31%          (p=0.008 n=5+5)
jit_dispatch_without_transfer         1.41ms ± 1%             1.40ms ± 1%     ~             (p=0.095 n=5+5)
jit_dispatch_with_transfer            1.40ms ± 2%             1.40ms ± 2%     ~             (p=0.841 n=5+5)
sda_index_1                           8.83µs ± 1%             8.73µs ± 2%     ~             (p=0.222 n=5+5)
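
The jit_simple_pruned_args_* rows are the ones this change targets. As a rough illustration of the idea in the commit title, the pure-Python sketch below models a dispatch path that transfers every argument versus one that transfers only the arguments the compiled computation keeps. This is not the jax_jit implementation, which is C++. Every name here (device_put as a stand-in, call_without_pruning, call_with_pruning, kept_indices) is hypothetical, and the assumption that the benchmark function reads only its first argument is mine.

```python
# Hypothetical model of argument pruning; none of these names come from jax_jit.

def device_put(value, log):
    """Stand-in for a host-to-device transfer; records each call in `log`."""
    log.append(value)
    return ("on_device", value)

def call_without_pruning(args, kept_indices, log):
    # Transfer every argument, then drop the ones the computation never reads.
    placed = [device_put(a, log) for a in args]
    return [placed[i] for i in kept_indices]

def call_with_pruning(args, kept_indices, log):
    # Transfer only the arguments the compiled computation actually reads.
    return [device_put(args[i], log) for i in kept_indices]

args = list(range(1000))
kept_indices = [0]  # assumption: the function only uses its first argument

log_before, log_after = [], []
out_before = call_without_pruning(args, kept_indices, log_before)
out_after = call_with_pruning(args, kept_indices, log_after)

print(len(log_before), len(log_after))  # 1000 1
```

Both paths hand the same inputs to the computation, but the pruned path does work proportional to the number of kept arguments. That is consistent with the savings in the table growing with the argument count while the many_args rows stay roughly flat.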

PiperOrigin-RevId: 374468578
Change-Id: I0a45af35b936a72f8271bd3e3a66e0d778619132
3 files changed
tree: 35530e2a44bd78a4807ed0bb8e01623e5a6e966b
  1. .github/
  2. tensorflow/
  3. third_party/
  4. tools/
  5. .bazelrc
  6. .bazelversion
  7. .gitignore
  8. ACKNOWLEDGMENTS
  9. arm_compiler.BUILD
  10. AUTHORS
  11. BUILD
  12. CODE_OF_CONDUCT.md
  13. CODEOWNERS
  14. configure
  15. configure.cmd
  16. configure.py
  17. CONTRIBUTING.md
  18. ISSUE_TEMPLATE.md
  19. ISSUES.md
  20. LICENSE
  21. models.BUILD
  22. README.md
  23. RELEASE.md
  24. SECURITY.md
  25. WORKSPACE
README.md

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.

TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research organization to conduct machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains, as well.

TensorFlow provides stable Python and C++ APIs, as well as APIs for other languages that carry no backward-compatibility guarantee.

Keep up-to-date with release announcements and security updates by subscribing to announce@tensorflow.org. See all the mailing lists.

Install

See the TensorFlow install guide for instructions on installing the pip package, enabling GPU support, using a Docker container, and building from source.

To install the current release, which includes support for CUDA-enabled GPU cards (Ubuntu and Windows):

$ pip install tensorflow

A smaller CPU-only package is also available:

$ pip install tensorflow-cpu

To update TensorFlow to the latest version, add the --upgrade flag to the commands above.

Nightly binaries are available for testing using the tf-nightly and tf-nightly-cpu packages on PyPI.

Try your first TensorFlow program

$ python
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
b'Hello, TensorFlow!'

For more examples, see the TensorFlow tutorials.

Contribution guidelines

If you want to contribute to TensorFlow, be sure to review the contribution guidelines. This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.

We use GitHub issues for tracking requests and bugs. Please see TensorFlow Discuss for general questions and discussion, and direct specific questions to Stack Overflow.

The TensorFlow project strives to abide by generally accepted best practices in open-source software development:

Fuzzing Status, CII Best Practices, Contributor Covenant

Continuous build status

Official Builds

Build Type                 Status                          Artifacts
Linux CPU                  Status                          PyPI
Linux GPU                  Status                          PyPI
Linux XLA                  Status                          TBA
macOS                      Status                          PyPI
Windows CPU                Status                          PyPI
Windows GPU                Status                          PyPI
Android                    Status                          Download
Raspberry Pi 0 and 1       Status                          Py3
Raspberry Pi 2 and 3       Status                          Py3
Libtensorflow MacOS CPU    Status Temporarily Unavailable  Nightly Binary, Official GCS
Libtensorflow Linux CPU    Status Temporarily Unavailable  Nightly Binary, Official GCS
Libtensorflow Linux GPU    Status Temporarily Unavailable  Nightly Binary, Official GCS
Libtensorflow Windows CPU  Status Temporarily Unavailable  Nightly Binary, Official GCS
Libtensorflow Windows GPU  Status Temporarily Unavailable  Nightly Binary, Official GCS

Community Supported Builds

See TensorFlow SIG Build to find our list of community-supported TensorFlow builds.

Resources

Learn more about the TensorFlow community and how to contribute.

License

Apache License 2.0