If you have a feedforward or convolutional model for classification that is converging too slowly, K-FAC is for you.
Using K-FAC requires three steps:

1. Registering layer inputs, weights, and pre-activations with a `LayerCollection`.
2. Minimizing the loss with a `KfacOptimizer`.
3. Running the optimizer's covariance and inverse update ops alongside training.
```python
# Build model.
w = tf.get_variable("w", ...)
b = tf.get_variable("b", ...)
logits = tf.matmul(x, w) + b
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))

# Register layers.
layer_collection = LayerCollection()
layer_collection.register_fully_connected((w, b), x, logits)
layer_collection.register_categorical_predictive_distribution(logits)

# Construct training ops.
optimizer = KfacOptimizer(..., layer_collection=layer_collection)
train_op = optimizer.minimize(loss)

# Minimize loss.
with tf.Session() as sess:
  ...
  sess.run([train_op, optimizer.cov_update_op, optimizer.inv_update_op])
```
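For intuition about what `optimizer.cov_update_op` and `optimizer.inv_update_op` maintain, here is a minimal NumPy sketch (not the library's implementation; all names are illustrative) of Kronecker-factored preconditioning for a single fully connected layer. K-FAC approximates the layer's Fisher block by a Kronecker product of an input second-moment factor and a pre-activation-gradient factor, so applying the inverse Fisher approximation to a weight gradient `dW` reduces to two small matrix solves:

```python
import numpy as np

def kfac_precondition(dW, a, g, damping=1e-2):
    """Kronecker-factored preconditioning for one fully connected layer.

    Args (illustrative names, not the library's API):
      dW: loss gradient w.r.t. the weight matrix, shape (in_dim, out_dim).
      a:  batch of layer inputs, shape (batch, in_dim).
      g:  batch of loss gradients w.r.t. pre-activations, shape (batch, out_dim).
    """
    n = a.shape[0]
    # Second-moment ("covariance") factors -- the analogue of cov_update_op.
    A = a.T @ a / n          # input factor, (in_dim, in_dim)
    G = g.T @ g / n          # pre-activation factor, (out_dim, out_dim)
    # Damping keeps the factors well-conditioned; inverting them is the
    # analogue of inv_update_op.
    A_d = A + damping * np.eye(A.shape[0])
    G_d = G + damping * np.eye(G.shape[0])
    # Applying the inverse Kronecker product to dW in matrix form:
    # preconditioned update = A^{-1} dW G^{-1}.
    return np.linalg.solve(A_d, dW) @ np.linalg.inv(G_d)
```

In the library these factors and their inverses are kept as TensorFlow ops and refreshed periodically, rather than recomputed from scratch at every step as above.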
See `examples/` for runnable, end-to-end illustrations.