Migration docstring: Add TF2 compatibility information to `tf.compat.v1.train.Optimizer`.
PiperOrigin-RevId: 403483286
Change-Id: Ia8179e6defebc6338a182a5891ed53a215cbe7f6
diff --git a/tensorflow/python/training/optimizer.py b/tensorflow/python/training/optimizer.py
index 0bff8c5..8d7da1a 100644
--- a/tensorflow/python/training/optimizer.py
+++ b/tensorflow/python/training/optimizer.py
@@ -307,6 +307,79 @@
This can be useful if you want to log or debug a training algorithm, report stats
about the slots, etc.
+
+ @compatibility(TF2)
+ `tf.compat.v1.train.Optimizer` can be used in eager mode and `tf.function`,
+ but it is not recommended. Please use the subclasses of
+ `tf.keras.optimizers.Optimizer` instead in TF2. Please see [Basic training
+ loops](https://www.tensorflow.org/guide/basic_training_loops) or
+  [Writing a training loop from
+  scratch](https://www.tensorflow.org/guide/keras/writing_a_training_loop_from_scratch)
+ for examples.
+
+  If your TF1 code contains a `tf.compat.v1.train.Optimizer` symbol, whether it
+  is used with or without a `tf.estimator.Estimator`, you cannot simply replace
+  it with the corresponding `tf.keras.optimizers.Optimizer`. To migrate to TF2,
+  it is recommended to migrate the whole `Estimator`-based training program to
+  Keras `Model.fit`-based training or to a TF2 custom training loop.
+
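+  For instance, a minimal `Model.fit`-based sketch (the model, data, and
+  hyperparameters below are purely illustrative):
+
+  ```python
+  model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
+  model.compile(optimizer=tf.keras.optimizers.SGD(0.01), loss='mse')
+  model.fit(tf.ones((8, 4)), tf.ones((8, 1)), epochs=1)
+  ```
+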
+ #### Structural Mapping to Native TF2
+
+ Before:
+
+ ```python
+ sgd_op = tf.compat.v1.train.GradientDescentOptimizer(3.0)
+ opt_op = sgd_op.minimize(cost, global_step, [var0, var1])
+ opt_op.run(session=session)
+ ```
+
+ After:
+
+ ```python
+ sgd = tf.keras.optimizers.SGD(3.0)
+ sgd.minimize(cost_fn, [var0, var1])
+ ```
+
+ #### How to Map Arguments
+
+ | TF1 Arg Name | TF2 Arg Name | Note |
+ | :-------------------- | :-------------- | :------------------------- |
+ | `use_locking` | Not supported | - |
+  | `name`                | `name`          | -                           |
+
+ #### Before & After Usage Example
+
+ Before:
+
+ >>> g = tf.compat.v1.Graph()
+ >>> with g.as_default():
+ ... var0 = tf.compat.v1.Variable([1.0, 2.0])
+ ... var1 = tf.compat.v1.Variable([3.0, 4.0])
+ ... cost = 5 * var0 + 3 * var1
+ ... global_step = tf.compat.v1.Variable(
+ ... tf.compat.v1.zeros([], tf.compat.v1.int64), name='global_step')
+ ... init_op = tf.compat.v1.initialize_all_variables()
+ ... sgd_op = tf.compat.v1.train.GradientDescentOptimizer(3.0)
+ ... opt_op = sgd_op.minimize(cost, global_step, [var0, var1])
+ >>> session = tf.compat.v1.Session(graph=g)
+ >>> session.run(init_op)
+ >>> opt_op.run(session=session)
+ >>> print(session.run(var0))
+ [-14. -13.]
+
+  After:
+
+ >>> var0 = tf.Variable([1.0, 2.0])
+ >>> var1 = tf.Variable([3.0, 4.0])
+ >>> cost_fn = lambda: 5 * var0 + 3 * var1
+ >>> sgd = tf.keras.optimizers.SGD(3.0)
+ >>> sgd.minimize(cost_fn, [var0, var1])
+ >>> print(var0.numpy())
+ [-14. -13.]
+
+ @end_compatibility
+
+
"""
# Values for gate_gradients.
@@ -429,6 +502,23 @@
`IndexedSlices`, or `None` if there is no gradient for the
given variable.
+ @compatibility(TF2)
+ `tf.keras.optimizers.Optimizer` in TF2 does not provide a
+ `compute_gradients` method, and you should use a `tf.GradientTape` to
+ obtain the gradients:
+
+    ```python
+    @tf.function
+    def train_step(inputs):
+      batch_data, labels = inputs
+      with tf.GradientTape() as tape:
+        predictions = model(batch_data, training=True)
+        loss = tf.keras.losses.CategoricalCrossentropy(
+            reduction=tf.keras.losses.Reduction.NONE)(labels, predictions)
+      gradients = tape.gradient(loss, model.trainable_variables)
+      optimizer.apply_gradients(zip(gradients, model.trainable_variables))
+    ```
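+
+    The sketch above assumes that `model` and `optimizer` already exist in the
+    enclosing scope, for instance (the model and data here are purely
+    illustrative):
+
+    ```python
+    model = tf.keras.Sequential(
+        [tf.keras.layers.Dense(3, activation='softmax')])
+    optimizer = tf.keras.optimizers.SGD(0.1)
+    train_step((tf.ones((2, 4)), tf.one_hot([0, 1], 3)))
+    ```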
+
+    @end_compatibility
+
Args:
loss: A Tensor containing the value to minimize or a callable taking
no arguments which returns the value to minimize. When eager execution
@@ -538,6 +628,15 @@
This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
+ @compatibility(TF2)
+ #### How to Map Arguments
+
+ | TF1 Arg Name | TF2 Arg Name | Note |
+ | :-------------------- | :-------------- | :------------------------- |
+ | `grads_and_vars` | `grads_and_vars`| - |
+ | `global_step` | Not supported. | Use `optimizer.iterations` |
+    | `name`                | `name`          | -                          |
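+
+    For example, a minimal TF2 sketch (the variable, gradient, and learning
+    rate below are purely illustrative):
+
+    ```python
+    var = tf.Variable([1.0, 2.0])
+    grad = tf.constant([0.1, 0.1])
+    opt = tf.keras.optimizers.SGD(learning_rate=1.0)
+    opt.apply_gradients([(grad, var)])
+    print(opt.iterations.numpy())  # 1; the counter that replaces `global_step`
+    ```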
+
+    @end_compatibility
+
Args:
grads_and_vars: List of (gradient, variable) pairs as returned by
`compute_gradients()`.