Change BlockMap; in particular, add Hilbert-curve fractal traversal above a certain size threshold.

Rename cache_friendly_traversal_threshold to local_data_cache_size to make explicit what it represents in practice. Introduce shared_data_cache_size, which is needed to decide whether to use the Hilbert curve. The Hilbert curve is more expensive to decode and is only worth it if it reduces DRAM accesses, which depends on shared_data_cache_size. Centralize defaults in a new :cpu_cache_size library. Centralize the reading of these defaults in Spec so that users can override them consistently by passing their own spec (either to provide more accurate or runtime values, or for test coverage purposes).
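
As an illustration only, the traversal decision and the Hilbert-curve decoding described above could be sketched as follows. This is not the code in this change: the Spec class here is a stand-in, its default values are placeholders rather than the values in :cpu_cache_size, and the working-set formula, choose_traversal, and hilbert_decode are hypothetical names and rules chosen for the sketch. The decoder is the standard textbook index-to-coordinates conversion.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    # Placeholder sizes for illustration; NOT the library's actual defaults.
    local_data_cache_size: int = 32 * 1024
    shared_data_cache_size: int = 2 * 1024 * 1024


def choose_traversal(rows, cols, depth, elem_size, spec):
    """Pick a traversal order from an assumed working-set estimate."""
    # Assumed estimate: bytes of the LHS and RHS data touched by the multiply.
    working_set = (rows + cols) * depth * elem_size
    if working_set <= spec.local_data_cache_size:
        return "linear"
    if working_set <= spec.shared_data_cache_size:
        return "cache_friendly"
    # Only pay the Hilbert decoding cost when data spills past the shared cache.
    return "hilbert"


def hilbert_decode(index, order):
    """Map a 1-D index to (x, y) on a 2**order by 2**order Hilbert curve."""
    x = y = 0
    t = index
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant before transposing it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# Consecutive indices land on adjacent blocks, which is what keeps
# recently used data close by in the traversal.
blocks = [hilbert_decode(i, 2) for i in range(16)]
```

Passing a custom Spec, for example Spec(shared_data_cache_size=16 * 1024 * 1024), changes the decision without touching the defaults, which mirrors the override path described above.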

On Pixel4, this does not significantly affect latencies, apart from a 1%-2% latency improvement with 4 threads on very large matrix sizes.

The motivation for this is that it reduces DRAM accesses: the PMU typically observes a 10% reduction, up to 20%, in 'L3 data cache refill' events on very large matrix multiplications (1000x1000 and above). DRAM accesses should be an increasing function of that event count, perhaps even roughly proportional to it, so this indicates that this change will significantly reduce DRAM accesses and thus power usage. This was observed consistently on all 2x2=4 combinations of {1, 4} threads on {little, big} cores on Pixel4.

PiperOrigin-RevId: 291531754
Change-Id: I810264f691f2cb884eca59942b957b5e79456a37
11 files changed
tree: 396bc8f256ccaa25c54473648dce2e393726f8f3
  1. .github/
  2. tensorflow/
  3. third_party/
  4. tools/
  5. .bazelrc
  6. .bazelversion
  7. .gitignore
  8. ACKNOWLEDGMENTS
  9. ADOPTERS.md
  10. arm_compiler.BUILD
  11. AUTHORS
  12. BUILD
  13. CODE_OF_CONDUCT.md
  14. CODEOWNERS
  15. configure
  16. configure.cmd
  17. configure.py
  18. CONTRIBUTING.md
  19. ISSUE_TEMPLATE.md
  20. ISSUES.md
  21. LICENSE
  22. models.BUILD
  23. README.md
  24. RELEASE.md
  25. SECURITY.md
  26. WORKSPACE
README.md

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.

TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research organization to conduct machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains, as well.

TensorFlow provides stable Python and C++ APIs, as well as APIs for other languages that come without a backward compatibility guarantee.

Keep up-to-date with release announcements and security updates by subscribing to announce@tensorflow.org. See all the mailing lists.

Install

See the TensorFlow install guide for the pip package, to enable GPU support, use a Docker container, and build from source.

To install the current release, which includes support for CUDA-enabled GPU cards (Ubuntu and Windows):

$ pip install tensorflow

A smaller CPU-only package is also available:

$ pip install tensorflow-cpu

To update TensorFlow to the latest version, add the --upgrade flag to the above commands.

Nightly binaries are available for testing using the tf-nightly and tf-nightly-cpu packages on PyPI.

Try your first TensorFlow program

$ python
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
b'Hello, TensorFlow!'

For more examples, see the TensorFlow tutorials.

Contribution guidelines

If you want to contribute to TensorFlow, be sure to review the contribution guidelines. This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.

We use GitHub issues for tracking requests and bugs. Please see TensorFlow Discuss for general questions and discussion, and direct specific questions to Stack Overflow.

The TensorFlow project strives to abide by generally accepted best practices in open-source software development:

CII Best Practices Contributor Covenant

Continuous build status

Official Builds

| Build Type | Status | Artifacts |
| --- | --- | --- |
| Linux CPU | Status | PyPI |
| Linux GPU | Status | PyPI |
| Linux XLA | Status | TBA |
| macOS | Status | PyPI |
| Windows CPU | Status | PyPI |
| Windows GPU | Status | PyPI |
| Android | Status | Download |
| Raspberry Pi 0 and 1 | Status Status | Py2 Py3 |
| Raspberry Pi 2 and 3 | Status Status | Py2 Py3 |

Community Supported Builds

| Build Type | Status | Artifacts |
| --- | --- | --- |
| Linux AMD ROCm GPU Nightly | Build Status | Nightly |
| Linux AMD ROCm GPU Stable Release | Build Status | Release 1.15 / 2.x |
| Linux s390x Nightly | Build Status | Nightly |
| Linux s390x CPU Stable Release | Build Status | Release |
| Linux ppc64le CPU Nightly | Build Status | Nightly |
| Linux ppc64le CPU Stable Release | Build Status | Release 1.15 / 2.x |
| Linux ppc64le GPU Nightly | Build Status | Nightly |
| Linux ppc64le GPU Stable Release | Build Status | Release 1.15 / 2.x |
| Linux CPU with Intel® MKL-DNN Nightly | Build Status | Nightly |
| Linux CPU with Intel® MKL-DNN Stable Release | Build Status | Release 1.15 / 2.x |
| Red Hat® Enterprise Linux® 7.6 CPU & GPU, Python 2.7, 3.6 | Build Status | 1.13.1 PyPI |

Resources

Learn more about the TensorFlow community and how to contribute.

License

Apache License 2.0