| commit | e46c8323be5f4a9033e3e061473a3ff794422804 | |
|---|---|---|
| author | Nicolas Vasilache <ntv@google.com> | Wed Sep 04 19:16:32 2019 -0700 |
| committer | TensorFlower Gardener <gardener@tensorflow.org> | Wed Sep 04 19:23:29 2019 -0700 |
| tree | f83734bb847d025830c3994dc54eebd95be787aa | |
| parent | 2d8a56aa4384f9c7d30cdd83fcacfea08b172f25 | |
Use transform function on llvm::Module in the ExecutionEngine
The refactoring of ExecutionEngine dropped the usage of the irTransform function used to pass -O3 and other options to LLVM. As a consequence, the proper optimizations do not kick in on the LLVM side.
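The regression can be pictured with a toy model. This is only a sketch in Python, not the MLIR code: `ToyEngine`, `optimize`, and the list-of-strings "module" are hypothetical stand-ins for the ExecutionEngine, the transform function, and an llvm::Module.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for an llvm::Module: just a list of IR lines.
Module = List[str]
Transformer = Callable[[Module], Module]


def optimize(module: Module) -> Module:
    """Toy 'optimization': drop lines marked as dead code."""
    return [line for line in module if "dead" not in line]


class ToyEngine:
    """Hypothetical engine that optionally transforms a module before compiling."""

    def __init__(self, transformer: Optional[Transformer] = None):
        self.transformer = transformer

    def compile(self, module: Module) -> Module:
        # The regression described above amounts to skipping this call,
        # so the module reaches code generation unoptimized.
        if self.transformer is not None:
            module = self.transformer(module)
        return module


ir = ["%a = add 1, 2", "%dead = mul 3, 4", "ret %a"]
without_transform = ToyEngine().compile(ir)
with_transform = ToyEngine(optimize).compile(ir)
```

Here `without_transform` keeps all three lines, while `with_transform` drops the dead one. Accepting a transformer is not enough; it has to be invoked on the module.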
This CL makes use of the transform function and allows producing avx512 instructions, on an internal example, when using:
`mlir-cpu-runner -dump-object-file=1 -object-filename=foo.o` combined with `objdump -D foo.o`.
Assembly produced resembles:
```
2b2e: 62 72 7d 48 18 04 0e vbroadcastss (%rsi,%rcx,1),%zmm8
2b35: 62 71 7c 48 28 ce vmovaps %zmm6,%zmm9
2b3b: 62 72 3d 48 a8 c9 vfmadd213ps %zmm1,%zmm8,%zmm9
2b41: 62 f1 7c 48 28 cf vmovaps %zmm7,%zmm1
2b47: 62 f2 3d 48 a8 c8 vfmadd213ps %zmm0,%zmm8,%zmm1
2b4d: 62 f2 7d 48 18 44 0e vbroadcastss 0x4(%rsi,%rcx,1),%zmm0
2b54: 01
2b55: 62 71 7c 48 28 c6 vmovaps %zmm6,%zmm8
2b5b: 62 72 7d 48 a8 c3 vfmadd213ps %zmm3,%zmm0,%zmm8
2b61: 62 f1 7c 48 28 df vmovaps %zmm7,%zmm3
2b67: 62 f2 7d 48 a8 da vfmadd213ps %zmm2,%zmm0,%zmm3
2b6d: 62 f2 7d 48 18 44 0e vbroadcastss 0x8(%rsi,%rcx,1),%zmm0
2b74: 02
2b75: 62 f2 7d 48 a8 f5 vfmadd213ps %zmm5,%zmm0,%zmm6
2b7b: 62 f2 7d 48 a8 fc vfmadd213ps %zmm4,%zmm0,%zmm7
```
etc.
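One way to confirm that AVX-512 code was generated is to scan the `objdump -D` output for instructions that touch `zmm` registers. The following is a minimal sketch, assuming AT&T-syntax disassembly laid out like the excerpt above; the helper name `zmm_mnemonics` is hypothetical and not part of any tool.

```python
import re

# AVX-512 vector registers as printed in AT&T syntax, e.g. %zmm8.
ZMM = re.compile(r"%zmm\d{1,2}\b")
HEX_BYTE = re.compile(r"[0-9a-f]{2}")


def zmm_mnemonics(disassembly: str) -> dict:
    """Count mnemonics of instructions that reference a zmm register."""
    counts = {}
    for line in disassembly.splitlines():
        if not ZMM.search(line):
            continue
        # Drop the "address:" prefix, then skip the hex byte columns;
        # the first token that is not a two-digit hex byte is the mnemonic.
        _, _, rest = line.partition(":")
        for token in rest.split():
            if not HEX_BYTE.fullmatch(token):
                counts[token] = counts.get(token, 0) + 1
                break
    return counts


sample = """\
2b2e: 62 72 7d 48 18 04 0e vbroadcastss (%rsi,%rcx,1),%zmm8
2b35: 62 71 7c 48 28 ce vmovaps %zmm6,%zmm9
2b3b: 62 72 3d 48 a8 c9 vfmadd213ps %zmm1,%zmm8,%zmm9
"""
print(zmm_mnemonics(sample))
```

Continuation lines that carry only spilled instruction bytes (such as `2b54: 01`) contain no register operand and are skipped.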
Fixes #120
PiperOrigin-RevId: 267281097
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications.
TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research organization for the purposes of conducting machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains, as well.
TensorFlow provides stable Python and C++ APIs, as well as APIs for other languages that carry no backward-compatibility guarantee.
Keep up-to-date with release announcements and security updates by subscribing to announce@tensorflow.org. See all the mailing lists.
See the TensorFlow install guide for instructions on installing the pip package, enabling GPU support, using a Docker container, and building from source.
To install the current release for CPU-only:
$ pip install tensorflow
Use the GPU package for CUDA-enabled GPU cards:
$ pip install tensorflow-gpu
Nightly binaries are available for testing using the tf-nightly and tf-nightly-gpu packages on PyPI.
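To see which of these packages is present in the current environment, the installed distributions can be queried from Python. This is a small sketch using the standard library's `importlib.metadata` (Python 3.8 or later, which is an assumption about the interpreter in use); the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata

# Package names taken from the install instructions above.
PACKAGES = ("tensorflow", "tensorflow-gpu", "tf-nightly", "tf-nightly-gpu")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Querying package metadata avoids importing TensorFlow itself, so the check stays fast and works even when the installed package fails to import.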
$ python
>>> import tensorflow as tf
>>> tf.enable_eager_execution()
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
'Hello, TensorFlow!'
For more examples, see the TensorFlow tutorials.
If you want to contribute to TensorFlow, be sure to review the contribution guidelines. This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.
We use GitHub issues for tracking requests and bugs. Please see TensorFlow Discuss for general questions and discussion, and direct specific questions to Stack Overflow.
The TensorFlow project strives to abide by generally accepted best practices in open-source software development.
Official builds:

| Build Type | Status | Artifacts |
|---|---|---|
| Linux CPU | | pypi |
| Linux GPU | | pypi |
| Linux XLA | | TBA |
| MacOS | | pypi |
| Windows CPU | | pypi |
| Windows GPU | | pypi |
| Android | | |
| Raspberry Pi 0 and 1 | | Py2, Py3 |
| Raspberry Pi 2 and 3 | | Py2, Py3 |

Community-supported builds:

| Build Type | Status | Artifacts |
|---|---|---|
| Linux AMD ROCm GPU Nightly | | Nightly |
| Linux AMD ROCm GPU Stable Release | | Release |
| Linux s390x Nightly | | Nightly |
| Linux s390x CPU Stable Release | | Release |
| Linux ppc64le CPU Nightly | | Nightly |
| Linux ppc64le CPU Stable Release | | Release |
| Linux ppc64le GPU Nightly | | Nightly |
| Linux ppc64le GPU Stable Release | | Release |
| Linux CPU with Intel® MKL-DNN Nightly | | Nightly |
| Linux CPU with Intel® MKL-DNN (supports Python 2.7, 3.4, 3.5, and 3.6) | | 1.13.1 pypi |
| Red Hat® Enterprise Linux® 7.6 CPU & GPU (Python 2.7, 3.6) | | 1.13.1 pypi |
Learn more about the TensorFlow community and how to contribute.