{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "post-training--integer-quant.ipynb",
"version": "0.3.2",
"provenance": [],
"private_outputs": true,
"collapsed_sections": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 2",
"name": "python2"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "6Y8E0lw5eYWm"
},
"source": [
"# Post Training Integer Quantization"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "CIGrZZPTZVeO"
},
"source": [
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/tutorials/post_training_integer_quant.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tutorials/post_training_integer_quant.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
" </td>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "BTC1rDAuei_1"
},
"source": [
"## Overview\n",
"\n",
"[TensorFlow Lite](https://www.tensorflow.org/lite/) now supports\n",
"converting an entire model (weights and activations) to 8-bit integers during model conversion from TensorFlow to TensorFlow Lite's FlatBuffer format. This results in a 4x reduction in model size and a 3 to 4x improvement in CPU inference speed. In addition, the fully quantized model can be consumed by integer-only hardware accelerators.\n",
"\n",
"In contrast to [post-training \"on-the-fly\" quantization](https://colab.sandbox.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/tutorials/post_training_quant.ipynb)\n",
", which stores only the weights as 8-bit integers, this technique statically quantizes all weights *and* activations during model conversion.\n",
"\n",
"In this tutorial, we train an MNIST model from scratch, check its accuracy in TensorFlow, and then convert the saved model into a TensorFlow Lite FlatBuffer\n",
"with full quantization. Finally, we check the\n",
"accuracy of the converted model and compare it to the original saved model. We\n",
"run the training script [mnist.py](https://github.com/tensorflow/models/blob/master/official/mnist/mnist.py) from\n",
"the [TensorFlow official MNIST tutorial](https://github.com/tensorflow/models/tree/master/official/mnist).\n"
]
},
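{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"As a rough sketch of what 8-bit quantization means here: TensorFlow Lite represents each real-valued tensor with an affine mapping from 8-bit integers,\n",
"\n",
"$$r = S \\times (q - Z)$$\n",
"\n",
"where $r$ is the real value, $q$ is the stored 8-bit integer, $S$ is a per-tensor float scale, and $Z$ is an integer zero-point. For weights, these parameters are computed from the stored values; for activations, they are estimated from the representative dataset provided later in this tutorial."
]
},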
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "2XsEP17Zelz9"
},
"source": [
"## Building an MNIST model"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "dDqqUIZjZjac"
},
"source": [
"### Setup"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "gyqAw1M9lyab",
"colab": {}
},
"source": [
"! pip uninstall -y tensorflow\n",
"! pip install -U tf-nightly"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "WsN6s5L1ieNl",
"colab": {}
},
"source": [
"import tensorflow as tf\n",
"tf.enable_eager_execution()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "00U0taBoe-w7",
"colab": {}
},
"source": [
"! git clone --depth 1 https://github.com/tensorflow/models"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "4XZPtSh-fUOc",
"colab": {}
},
"source": [
"import sys\n",
"import os\n",
"\n",
"if sys.version_info.major >= 3:\n",
" import pathlib\n",
"else:\n",
" import pathlib2 as pathlib\n",
"\n",
"# Add `models` to the python path.\n",
"models_path = os.path.join(os.getcwd(), \"models\")\n",
"sys.path.append(models_path)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "eQ6Q0qqKZogR"
},
"source": [
"### Train and export the model"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "eMsw_6HujaqM",
"colab": {}
},
"source": [
"saved_models_root = \"/tmp/mnist_saved_model\""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "hWSAjQWagIHl",
"colab": {}
},
"source": [
"# The above path addition is not visible to subprocesses; add the path for the subprocess as well.\n",
"# Note: channels_last is required here, or the conversion may fail.\n",
"!PYTHONPATH={models_path} python models/official/mnist/mnist.py --train_epochs=1 --export_dir {saved_models_root} --data_format=channels_last"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "5NMaNZQCkW9X"
},
"source": [
"For this example, we trained the model for only a single epoch, so it reaches only ~96% accuracy.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "xl8_fzVAZwOh"
},
"source": [
"### Convert to a TensorFlow Lite model\n",
"\n",
"The exported `SavedModel` directory is named with a timestamp. Select the most recent one:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "Xp5oClaZkbtn",
"colab": {}
},
"source": [
"saved_model_dir = str(sorted(pathlib.Path(saved_models_root).glob(\"*\"))[-1])\n",
"saved_model_dir"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "AT8BgkKmljOy"
},
"source": [
"Using the [Python `TFLiteConverter`](https://www.tensorflow.org/lite/convert/python_api), the saved model can be converted into a TensorFlow Lite model.\n",
"\n",
"First load the model using the `TFLiteConverter`:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "_i8B2nDZmAgQ",
"colab": {}
},
"source": [
"import tensorflow as tf\n",
"tf.enable_eager_execution()\n",
"tf.logging.set_verbosity(tf.logging.DEBUG)\n",
"\n",
"converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)\n",
"tflite_model = converter.convert()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "F2o2ZfF0aiCx"
},
"source": [
"Write it out to a `.tflite` file:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "vptWZq2xnclo",
"colab": {}
},
"source": [
"tflite_models_dir = pathlib.Path(\"/tmp/mnist_tflite_models/\")\n",
"tflite_models_dir.mkdir(exist_ok=True, parents=True)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "Ie9pQaQrn5ue",
"colab": {}
},
"source": [
"tflite_model_file = tflite_models_dir/\"mnist_model.tflite\"\n",
"tflite_model_file.write_bytes(tflite_model)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "7BONhYtYocQY"
},
"source": [
"To instead quantize the model on export, first set the `optimizations` flag to optimize for size:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "HEZ6ET1AHAS3",
"colab": {}
},
"source": [
"tf.logging.set_verbosity(tf.logging.INFO)\n",
"converter.optimizations = [tf.lite.Optimize.DEFAULT]"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "rTe8avZJHMDO",
"colab_type": "text"
},
"source": [
"Now, construct and provide a representative dataset. The converter uses it to estimate the dynamic range of the activations."
]
},
{
"cell_type": "code",
"metadata": {
"id": "FiwiWU3gHdkW",
"colab_type": "code",
"colab": {}
},
"source": [
"mnist_train, _ = tf.keras.datasets.mnist.load_data()\n",
"images = tf.cast(mnist_train[0], tf.float32)/255.0\n",
"mnist_ds = tf.data.Dataset.from_tensor_slices((images)).batch(1)\n",
"def representative_data_gen():\n",
" for input_value in mnist_ds.take(100):\n",
" yield [input_value]\n",
"\n",
"converter.representative_dataset = representative_data_gen"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "xW84iMYjHd9t",
"colab_type": "text"
},
"source": [
"Finally, convert the model as usual. Note that, by default, the converted model still uses float inputs and outputs for invocation convenience."
]
},
{
"cell_type": "code",
"metadata": {
"id": "yuNfl3CoHNK3",
"colab_type": "code",
"colab": {}
},
"source": [
"tflite_quant_model = converter.convert()\n",
"tflite_model_quant_file = tflite_models_dir/\"mnist_model_quant.tflite\"\n",
"tflite_model_quant_file.write_bytes(tflite_quant_model)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "PhMmUTl4sbkz"
},
"source": [
"Note how the resulting file is approximately `1/4` the size."
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "JExfcfLDscu4",
"colab": {}
},
"source": [
"!ls -lh {tflite_models_dir}"
],
"execution_count": 0,
"outputs": []
},
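{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"As a quick sanity check, the size ratio can also be computed directly in Python. This is a minimal sketch using only the standard library; the two `.tflite` paths are the ones written above:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"colab": {}
},
"source": [
"import os\n",
"\n",
"def size_ratio(path_a, path_b):\n",
"  # Ratio of on-disk file sizes; a value of roughly 4 is expected for\n",
"  # the float model vs. the fully quantized model.\n",
"  return os.path.getsize(path_a) / float(os.path.getsize(path_b))\n",
"\n",
"print(size_ratio(str(tflite_model_file), str(tflite_model_quant_file)))"
],
"execution_count": 0,
"outputs": []
},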
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "L8lQHMp_asCq"
},
"source": [
"## Run the TensorFlow Lite models"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "-5l6-ciItvX6"
},
"source": [
"We can run the TensorFlow Lite model using the Python TensorFlow Lite\n",
"Interpreter.\n",
"\n",
"### Load the test data\n",
"\n",
"First, let's load the MNIST test data to feed to the model:"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "eTIuU07NuKFL",
"colab": {}
},
"source": [
"import numpy as np\n",
"_, mnist_test = tf.keras.datasets.mnist.load_data()\n",
"images, labels = tf.cast(mnist_test[0], tf.float32)/255.0, mnist_test[1]\n",
"\n",
"mnist_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(1)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Ap_jE7QRvhPf"
},
"source": [
"### Load the model into the interpreters"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "Jn16Rc23zTss",
"colab": {}
},
"source": [
"interpreter = tf.lite.Interpreter(model_path=str(tflite_model_file))\n",
"interpreter.allocate_tensors()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "J8Pztk1mvNVL",
"colab": {}
},
"source": [
"interpreter_quant = tf.lite.Interpreter(model_path=str(tflite_model_quant_file))\n",
"interpreter_quant.allocate_tensors()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "2opUt_JTdyEu"
},
"source": [
"### Test the models on one image"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "AKslvo2kwWac",
"colab": {}
},
"source": [
"for img, label in mnist_ds:\n",
" break\n",
"\n",
"interpreter.set_tensor(interpreter.get_input_details()[0][\"index\"], img)\n",
"interpreter.invoke()\n",
"predictions = interpreter.get_tensor(\n",
" interpreter.get_output_details()[0][\"index\"])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "XZClM2vo3_bm",
"colab": {}
},
"source": [
"import matplotlib.pylab as plt\n",
"\n",
"plt.imshow(img[0])\n",
"template = \"True:{true}, predicted:{predict}\"\n",
"_ = plt.title(template.format(true=str(label[0].numpy()),\n",
" predict=str(predictions[0])))\n",
"plt.grid(False)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "3gwhv4lKbYZ4",
"colab": {}
},
"source": [
"interpreter_quant.set_tensor(\n",
" interpreter_quant.get_input_details()[0][\"index\"], img)\n",
"interpreter_quant.invoke()\n",
"predictions = interpreter_quant.get_tensor(\n",
" interpreter_quant.get_output_details()[0][\"index\"])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "CIH7G_MwbY2x",
"colab": {}
},
"source": [
"plt.imshow(img[0])\n",
"template = \"True:{true}, predicted:{predict}\"\n",
"_ = plt.title(template.format(true=str(label[0].numpy()),\n",
" predict=str(predictions[0])))\n",
"plt.grid(False)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "LwN7uIdCd8Gw"
},
"source": [
"### Evaluate the models"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "05aeAuWjvjPx",
"colab": {}
},
"source": [
"def eval_model(interpreter, mnist_ds):\n",
" total_seen = 0\n",
" num_correct = 0\n",
"\n",
" input_index = interpreter.get_input_details()[0][\"index\"]\n",
" output_index = interpreter.get_output_details()[0][\"index\"]\n",
" for img, label in mnist_ds:\n",
" total_seen += 1\n",
" interpreter.set_tensor(input_index, img)\n",
" interpreter.invoke()\n",
" predictions = interpreter.get_tensor(output_index)\n",
" if predictions == label.numpy():\n",
" num_correct += 1\n",
"\n",
" if total_seen % 500 == 0:\n",
" print(\"Accuracy after %i images: %f\" %\n",
" (total_seen, float(num_correct) / float(total_seen)))\n",
"\n",
" return float(num_correct) / float(total_seen)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "T5mWkSbMcU5z",
"colab": {}
},
"source": [
"print(eval_model(interpreter, mnist_ds))"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Km3cY9ry8ZlG"
},
"source": [
"We can repeat the evaluation on the fully quantized model to obtain:\n"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "-9cnwiPp6EGm",
"colab": {}
},
"source": [
"# NOTE: Colab runs on server CPUs. At the time of writing, TensorFlow Lite\n",
"# doesn't have highly optimized server CPU kernels, so the quantized model may\n",
"# run slower here than the float interpreter above. On mobile CPUs, however,\n",
"# a considerable speedup can be observed.\n",
"print(eval_model(interpreter_quant, mnist_ds))\n"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "L7lfxkor8pgv"
},
"source": [
"In this example, we have fully quantized the model with no loss of accuracy."
]
}
]
}