commit | 564db5520880732331f13907408b21fb3d0eac21 | [log] [tgz] |
---|---|---|
author | Christian Sigg <csigg@google.com> | Thu Sep 09 23:13:35 2021 -0700 |
committer | TensorFlower Gardener <gardener@tensorflow.org> | Thu Sep 09 23:18:09 2021 -0700 |
tree | 7e95f0401cadc6c14f989a46f567f661015b2eb2 | |
parent | 20d98b02a1ee89673833ce9665e2f9e1f855a9e7 [diff] |
Adding `async.execute` lowering to `tfrt_test.do.async`. This allows scheduling GPU work on multiple streams.

The implementation splits the `gpu-to-tfrt-gpu` pass into 4 separate passes. This is necessary because the conversion of doubly-nested (func and async.execute) code (gpu.token to chain+stream within blocks and to chain+event across blocks) became too complex, with the different patterns (including ones rewriting the op and the tail of the op's region) interacting in unexpected ways.

The passes now adhere to the recommendations in the 'Type Conversions the Not-So-Hard Way' [talk](https://mlir.llvm.org/talks): each pass rewrites a set of ops and inserts source materializations (casts from the illegal type to the legal one) and target materializations (casts the other way) before and after. This keeps the rewrites in flight smaller, gets back to valid IR more quickly, and makes the conversion pipeline more controllable and understandable. It has the [disadvantage](https://llvm.discourse.group/t/partial-lowering-with-type-conversions/1166/6) of inserting a significant number of intermediate casts, but that should be fine for converting these relatively infrequent gpu ops.

Passes that make up the `gpu-to-tfrt-gpu` pipeline:

- `func-tfrt-streamify`: add chain and stream to func
- `async-tfrt-streamify`: convert `async.execute` to chain+event
- `gpu-tfrt-streamify`: convert gpu ops to chain+stream, synchronize streams
- `async-to-tfrt`: rewrite `async.execute` to `tfrt_test.do.async`

No more `tfrt.new_chain`: the chains are now correctly threaded through the stream synchronizations (previously a new chain was created in each `async.execute` region), which is necessary because `tfrt_test.do.async` is non-strict.

Only one `tfrt_gpu.stream.get_context`: getting the context is lifted to the function body (previously there was one per `async.execute` region), which is sufficient because all streams use the same context.
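To make the materialization scheme concrete, here is a hedged MLIR-style pseudocode sketch, not taken from the commit itself: the value names and the `!tfrt_gpu.event` type spelling are illustrative assumptions. It shows how one partial-conversion pass can leave a `builtin.unrealized_conversion_cast` bridging the illegal `!gpu.async.token` type and the legal chain+event pair, which a later pass folds away once it converts the token's producer:

```
// Hypothetical sketch: after one partial conversion pass, a cast bridges
// the not-yet-converted token type and the converted chain+event types.
%ch, %ev = builtin.unrealized_conversion_cast %token
    : !gpu.async.token to !tfrt.chain, !tfrt_gpu.event
// A later pass in the pipeline rewrites the producer of %token to yield
// a chain and event directly, at which point the cast becomes dead.
```

This is the standard mechanism the MLIR conversion framework uses for source and target materializations; keeping each pass small means more of these intermediate casts, the trade-off the commit message accepts.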
PiperOrigin-RevId: 395861586
Change-Id: I74a1008fc251b892cc873d1f8a7d76864c2cd9aa
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.
TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research organization to conduct machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains, as well.
TensorFlow provides stable Python and C++ APIs, as well as non-guaranteed backward-compatible APIs for other languages.
Keep up-to-date with release announcements and security updates by subscribing to announce@tensorflow.org. See all the mailing lists.
See the TensorFlow install guide for the pip package, to enable GPU support, use a Docker container, and build from source.
To install the current release, which includes support for CUDA-enabled GPU cards (Ubuntu and Windows):
$ pip install tensorflow
A smaller CPU-only package is also available:
$ pip install tensorflow-cpu
To update TensorFlow to the latest version, add the --upgrade flag to the above commands.
Nightly binaries are available for testing using the tf-nightly and tf-nightly-cpu packages on PyPI.
$ python
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
b'Hello, TensorFlow!'
For more examples, see the TensorFlow tutorials.
If you want to contribute to TensorFlow, be sure to review the contribution guidelines. This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.
We use GitHub issues for tracking requests and bugs; please see TensorFlow Discuss for general questions and discussion, and direct specific questions to Stack Overflow.
The TensorFlow project strives to abide by generally accepted best practices in open-source software development.
You can find more community-supported platforms and configurations in the TensorFlow SIG Build community builds table.
Build Type | Status | Artifacts |
---|---|---|
Linux CPU | PyPI | |
Linux GPU | PyPI | |
Linux XLA | TBA | |
macOS | PyPI | |
Windows CPU | PyPI | |
Windows GPU | PyPI | |
Android | Download | |
Raspberry Pi 0 and 1 | Py3 | |
Raspberry Pi 2 and 3 | Py3 | |
Libtensorflow macOS CPU | Status Temporarily Unavailable | Nightly Binary Official GCS |
Libtensorflow Linux CPU | Status Temporarily Unavailable | Nightly Binary Official GCS |
Libtensorflow Linux GPU | Status Temporarily Unavailable | Nightly Binary Official GCS |
Libtensorflow Windows CPU | Status Temporarily Unavailable | Nightly Binary Official GCS |
Libtensorflow Windows GPU | Status Temporarily Unavailable | Nightly Binary Official GCS |
Learn more about the TensorFlow community and how to contribute.