armnn: sync to 20.11 release

Sync to 20.11 release.

Bug: 175039528
Test: build

Merge commit 'fc55a19c73240236d30482a6189f438ffb2e2cc4' into HEAD

Change-Id: I9d5712d100cd3f8621356fcfb337f299ccc0e0ac
diff --git a/README.md b/README.md
index f76085c..e6959eb 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,8 @@
 
 There is a guide for installation of ArmNN, Tensorflow Lite Parser and PyArmnn via our Apt Repository: [Installation via Apt Repository](InstallationViaAptRepository.md)
 
+There is a getting started guide for our ArmNN TfLite Delegate: [Build the TfLite Delegate natively](delegate/BuildGuideNative.md)
+
 API Documentation is available at https://github.com/ARM-software/armnn/wiki/Documentation.
 
 Dox files to generate Arm NN doxygen files can be found at armnn/docs/. Following generation the xhtml files can be found at armnn/documentation/
diff --git a/cmake/GlobalConfig.cmake b/cmake/GlobalConfig.cmake
index 843ad6b..921fe72 100644
--- a/cmake/GlobalConfig.cmake
+++ b/cmake/GlobalConfig.cmake
@@ -353,6 +353,10 @@
     add_definitions(-DARMNNREF_ENABLED)
 endif()
 
+if(ETHOSN_SUPPORT)
+    add_definitions(-DETHOSN_SUPPORT_ENABLED)
+endif()
+
 # This is the root for the dynamic backend tests to search for dynamic
 # backends. By default it will be the project build directory.
 add_definitions(-DDYNAMIC_BACKEND_BUILD_DIR="${PROJECT_BINARY_DIR}")
diff --git a/delegate/BuildGuideNative.md b/delegate/BuildGuideNative.md
new file mode 100644
index 0000000..0f591d1
--- /dev/null
+++ b/delegate/BuildGuideNative.md
@@ -0,0 +1,237 @@
+# Introduction
+
+The ArmNN Delegate can be found within the ArmNN repository, but it is a standalone piece of software that
+makes use of the ArmNN library. For this reason we provide two options to build the delegate: the first option
+builds the delegate together with the ArmNN library, the second option is a standalone build
+of the delegate.
+
+To keep this guide simple, it uses an AArch64 machine with Ubuntu 18.04 installed that can build all components
+natively (no cross-compilation required).
+
+1. [Dependencies](#dependencies)
+   * [Build Tensorflow for C++](#build-tensorflow-for-c)
+   * [Build Flatbuffers](#build-flatbuffers)
+   * [Build the Arm Compute Library](#build-the-arm-compute-library)
+   * [Build the ArmNN Library](#build-the-armnn-library)
+2. [Build the TfLite Delegate (Stand-Alone)](#build-the-tflite-delegate-stand-alone)
+3. [Build the Delegate together with ArmNN](#build-the-delegate-together-with-armnn)
+4. [Integrate the ArmNN TfLite Delegate into your project](#integrate-the-armnn-tflite-delegate-into-your-project)
+
+# Dependencies
+
+Build Dependencies:
+ * Tensorflow and Tensorflow Lite version 2.3.1
+ * Flatbuffers 1.12.0
+ * ArmNN 20.11 or higher
+
+Required Tools:
+ * Git
+ * pip
+ * wget
+ * zip
+ * unzip
+ * cmake 3.7.0 or higher
+ * scons
+ * bazel 3.1.0
+
+The first step is to build all the build dependencies mentioned above. We will have to create quite a few
+directories. To make navigation a bit easier, define a base directory for the project. At this stage we can also
+install all the tools that are required during the build.
+```bash
+export BASEDIR=/home
+cd $BASEDIR
+apt-get update && apt-get install git wget unzip zip python python3-pip cmake scons
+```
+
+## Build Tensorflow for C++
+Tensorflow has a few dependencies of its own. It requires the python packages pip3, numpy, wheel, keras_preprocessing
+and also bazel, which is used to compile Tensorflow. A description of how to build bazel can be
+found [here](https://docs.bazel.build/versions/master/install-compile-source.html). There are multiple ways to install it;
+this guide compiles bazel from source because that works on any platform and therefore adds the most value
+to the guide. Depending on your operating system and architecture there might be an easier way.
+```bash
+# Install the python packages
+pip3 install -U pip numpy wheel
+pip3 install -U keras_preprocessing --no-deps
+
+# Bazel has a dependency on JDK
+apt-get install openjdk-11-jdk
+# Build Bazel
+wget -O bazel-3.1.0-dist.zip https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-dist.zip
+unzip -d bazel bazel-3.1.0-dist.zip
+cd bazel
+env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh 
+# This creates an "output" directory where the bazel binary can be found
+ 
+# Download Tensorflow
+cd $BASEDIR
+git clone https://github.com/tensorflow/tensorflow.git
+cd tensorflow/
+git checkout tags/v2.3.1 # Minimum version required for the delegate
+```
+Before tensorflow can be built, targets need to be defined in the `BUILD` file that can be 
+found in the root directory of Tensorflow. Append the following two targets to the file:
+```
+cc_binary(
+     name = "libtensorflow_all.so",
+     linkshared = 1,
+     deps = [
+         "//tensorflow/core:framework",
+         "//tensorflow/core:tensorflow",
+         "//tensorflow/cc:cc_ops",
+         "//tensorflow/cc:client_session",
+         "//tensorflow/cc:scope",
+         "//tensorflow/c:c_api",
+     ],
+)
+cc_binary(
+     name = "libtensorflow_lite_all.so",
+     linkshared = 1,
+     deps = [
+         "//tensorflow/lite:framework",
+         "//tensorflow/lite/kernels:builtin_ops",
+     ],
+)
+```
+Now the build process can be started. When calling "configure", as below, a dialog shows up that asks the 
+user to specify additional options. If you don't have any particular needs for your build, decline all 
+additional options and choose default values. Building `libtensorflow_all.so` requires quite some time. 
+This might be a good time to get yourself another drink and take a break.
+```bash
+PATH="$BASEDIR/bazel/output:$PATH" ./configure
+$BASEDIR/bazel/output/bazel build --define=grpc_no_ares=true --config=opt --config=monolithic --strip=always --config=noaws libtensorflow_all.so
+$BASEDIR/bazel/output/bazel build --config=opt --config=monolithic --strip=always libtensorflow_lite_all.so
+```
+
+## Build Flatbuffers
+
+Flatbuffers is a memory-efficient cross-platform serialization library, as
+described [here](https://google.github.io/flatbuffers/). It is used by TfLite to store models and is also a dependency 
+of the delegate. After downloading the right version it can be built and installed using cmake.
+```bash
+cd $BASEDIR
+wget -O flatbuffers-1.12.0.zip https://github.com/google/flatbuffers/archive/v1.12.0.zip
+unzip -d . flatbuffers-1.12.0.zip
+cd flatbuffers-1.12.0 
+mkdir install && mkdir build && cd build
+# I'm using a different install directory but that is not required
+cmake .. -DCMAKE_INSTALL_PREFIX:PATH=$BASEDIR/flatbuffers-1.12.0/install 
+make install
+```
+
+## Build the Arm Compute Library
+
+The ArmNN library depends on the Arm Compute Library (ACL), which provides a set of functions that are optimized
+for both Arm CPUs and GPUs. ArmNN uses the Arm Compute Library directly to run machine learning workloads on
+Arm CPUs and GPUs.
+
+It is important to use matching versions of ACL and ArmNN. Luckily, ArmNN and ACL are developed 
+very closely and released together: if you would like to use ArmNN version "20.11", you can use the same "20.11"
+version for ACL too.
+
+To build the Arm Compute Library on your platform, download it, check out the branch
+that contains the version you want to use, and build it using `scons`.
+```bash
+cd $BASEDIR
+git clone https://review.mlplatform.org/ml/ComputeLibrary 
+cd ComputeLibrary/
+git checkout <branch_name> # e.g. branches/arm_compute_20_11
+# The machine used for this guide only has a Neon CPU, which is why only "neon=1" is set. If
+# your machine has an Arm GPU you can enable it by adding `opencl=1 embed_kernels=1` to the command below
+scons arch=arm64-v8a neon=1 extra_cxx_flags="-fPIC" benchmark_tests=0 validation_tests=0 
+```
+
+## Build the ArmNN Library
+
+After building ACL we can now continue with building ArmNN. To do so, download the repository and check out the same
+version as you did for ACL. Then create a build directory and use cmake to build it.
+```bash
+cd $BASEDIR
+git clone "https://review.mlplatform.org/ml/armnn" 
+cd armnn
+git checkout <branch_name> # e.g. branches/armnn_20_11
+mkdir build && cd build
+# If you've got an Arm GPU, add `-DARMCOMPUTECL=1` to the command below
+cmake .. -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary -DARMCOMPUTENEON=1 -DBUILD_UNIT_TESTS=0 
+make
+```
+
+# Build the TfLite Delegate (Stand-Alone)
+
+The delegate, like ArmNN, is built using cmake. Create a build directory as usual and build the delegate
+with the additional cmake arguments shown below.
+```bash
+cd $BASEDIR/armnn/delegate && mkdir build && cd build
+# -DTENSORFLOW_LIB_DIR: directory where the tensorflow libraries can be found
+# -DTENSORFLOW_ROOT:    the top directory of the tensorflow repository
+# -DTFLITE_LIB_ROOT:    in our case the same as TENSORFLOW_LIB_DIR
+# -DFLATBUFFERS_ROOT:   the flatbuffers install directory
+# -DArmnn_DIR:          directory where the ArmNN library can be found
+# -DARMNN_SOURCE_DIR:   the top directory of the ArmNN repository (required for the ArmNN includes)
+cmake .. -DTENSORFLOW_LIB_DIR=$BASEDIR/tensorflow/bazel-bin \
+         -DTENSORFLOW_ROOT=$BASEDIR/tensorflow \
+         -DTFLITE_LIB_ROOT=$BASEDIR/tensorflow/bazel-bin \
+         -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-1.12.0/install \
+         -DArmnn_DIR=$BASEDIR/armnn/build \
+         -DARMNN_SOURCE_DIR=$BASEDIR/armnn
+make
+```
+
+To ensure that the build was successful you can run the unit tests for the delegate, which can be found in 
+the delegate's build directory. [Doctest](https://github.com/onqtam/doctest) was used to create those tests. Using test filters you can
+exclude tests that your build is not configured for. In this case, because ArmNN was only built for Cpu 
+acceleration (CpuAcc), we filter for all test suites that have `CpuAcc` in their name.
+```bash
+cd $BASEDIR/armnn/delegate/build
+./DelegateUnitTests --test-suite=*CpuAcc* 
+```
+If you have built for Gpu acceleration as well, you might want to change your test-suite filter:
+```bash
+./DelegateUnitTests --test-suite=*CpuAcc*,*GpuAcc*
+```
+
+
+# Build the Delegate together with ArmNN
+
+As mentioned in the introduction, the delegate build can be integrated into the ArmNN build. This is
+pretty straightforward: the cmake arguments that were previously used for the delegate just have to be added
+to the ArmNN cmake arguments, along with the additional argument `BUILD_ARMNN_TFLITE_DELEGATE`, which
+instructs ArmNN to build the delegate as well. The new commands to build ArmNN are as follows:
+```bash
+cd $BASEDIR
+git clone "https://review.mlplatform.org/ml/armnn" 
+cd armnn
+git checkout <branch_name> # e.g. branches/armnn_20_11
+mkdir build && cd build
+# If you've got an Arm GPU, add `-DARMCOMPUTECL=1` to the command below
+cmake .. -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary \
+         -DARMCOMPUTENEON=1 \
+         -DBUILD_UNIT_TESTS=0 \
+         -DBUILD_ARMNN_TFLITE_DELEGATE=1 \
+         -DTENSORFLOW_LIB_DIR=$BASEDIR/tensorflow/bazel-bin \
+         -DTENSORFLOW_ROOT=$BASEDIR/tensorflow \
+         -DTFLITE_LIB_ROOT=$BASEDIR/tensorflow/bazel-bin \
+         -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-1.12.0/install
+make
+```
+The delegate library can then be found in `build/armnn/delegate`.
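+
+As a rough illustration of how another project could consume this library, the command below compiles and links a
+hypothetical source file `my_inference.cpp` (for example, containing the code shown in the next section) against the
+artefacts built in this guide. The file name, include paths, library names and locations are assumptions based on the
+directory layout used above; adapt them to your setup.
+```bash
+# Sketch only: paths follow this guide's directory layout
+g++ my_inference.cpp -std=c++14 \
+    -I$BASEDIR/armnn/include \
+    -I$BASEDIR/armnn/delegate/include \
+    -I$BASEDIR/tensorflow \
+    -I$BASEDIR/flatbuffers-1.12.0/install/include \
+    -L$BASEDIR/armnn/build -larmnn \
+    -L$BASEDIR/armnn/build/armnn/delegate -larmnnDelegate \
+    -L$BASEDIR/tensorflow/bazel-bin -ltensorflow_lite_all \
+    -o my_inference
+```
+At runtime the directories containing `libarmnn.so`, `libarmnnDelegate.so` and `libtensorflow_lite_all.so` will
+typically also need to be added to `LD_LIBRARY_PATH`.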
+
+
+# Integrate the ArmNN TfLite Delegate into your project
+
+The delegate can be integrated into your C++ project by creating a TfLite Interpreter and 
+instructing it to use the ArmNN delegate for graph execution. This should look similar
+to the following code snippet.
+```cpp
+// Create TfLite Interpreter
+std::unique_ptr<Interpreter> armnnDelegateInterpreter;
+InterpreterBuilder(tfLiteModel, ::tflite::ops::builtin::BuiltinOpResolver())
+                  (&armnnDelegateInterpreter);
+
+// Create the ArmNN Delegate
+armnnDelegate::DelegateOptions delegateOptions(backends);
+std::unique_ptr<TfLiteDelegate, decltype(&armnnDelegate::TfLiteArmnnDelegateDelete)>
+                    theArmnnDelegate(armnnDelegate::TfLiteArmnnDelegateCreate(delegateOptions),
+                                     armnnDelegate::TfLiteArmnnDelegateDelete);
+
+// Instruct the Interpreter to use the armnnDelegate
+armnnDelegateInterpreter->ModifyGraphWithDelegate(theArmnnDelegate.get());
+```
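+
+The snippet above assumes that `tfLiteModel` and `backends` have already been created. A minimal sketch of that
+setup is shown below; the header paths, the model file name and the chosen backends are illustrative assumptions
+and will vary with your project and build layout.
+```cpp
+// TfLite headers for the model loader, interpreter and builtin op resolver
+#include <tensorflow/lite/interpreter.h>
+#include <tensorflow/lite/kernels/register.h>
+#include <tensorflow/lite/model.h>
+
+// ArmNN delegate headers (install location depends on your build)
+#include <armnn_delegate.hpp>
+#include <DelegateOptions.hpp>
+
+// Load a .tflite model from disk ("model.tflite" is a placeholder path)
+auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
+const tflite::FlatBufferModel& tfLiteModel = *model;
+
+// List the ArmNN backends to run on, in order of preference
+std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc, armnn::Compute::CpuRef };
+```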
+For further information on using TfLite Delegates,
+please visit the [TensorFlow website](https://www.tensorflow.org/lite/guide).
+
diff --git a/delegate/CMakeLists.txt b/delegate/CMakeLists.txt
index 5b64587..38f7bd1 100644
--- a/delegate/CMakeLists.txt
+++ b/delegate/CMakeLists.txt
@@ -124,6 +124,8 @@
         src/test/Pooling2dTestHelper.hpp
         src/test/QuantizationTest.cpp
         src/test/QuantizationTestHelper.hpp
+        src/test/RedefineTestHelper.hpp
+        src/test/ReshapeTest.cpp
         src/test/ResizeTest.cpp
         src/test/ResizeTestHelper.hpp
         src/test/SoftmaxTest.cpp
diff --git a/delegate/README.md b/delegate/README.md
new file mode 100644
index 0000000..7430f19
--- /dev/null
+++ b/delegate/README.md
@@ -0,0 +1,7 @@
+# The Arm NN TensorFlow Lite delegate
+
+'armnnDelegate' is a library for accelerating certain TensorFlow Lite operators on Arm hardware by providing
+the TensorFlow Lite interpreter with an alternative implementation of the operators via its delegation mechanism.
+
+For more information about the TensorFlow Lite operators that are supported,
+see [TensorFlowLiteDelegateSupport.md](./TensorFlowLiteDelegateSupport.md).
diff --git a/delegate/TensorFlowLiteDelegateSupport.md b/delegate/TensorFlowLiteDelegateSupport.md
new file mode 100644
index 0000000..a9f548e
--- /dev/null
+++ b/delegate/TensorFlowLiteDelegateSupport.md
@@ -0,0 +1,85 @@
+# TensorFlow Lite operators that the Arm NN TensorFlow Lite Delegate supports
+
+This reference guide provides a list of TensorFlow Lite operators the Arm NN SDK currently supports.
+
+## Fully supported
+
+The Arm NN SDK TensorFlow Lite delegate currently supports the following operators:
+
+* ABS
+
+* ADD
+
+* AVERAGE_POOL_2D, Supported Fused Activation: RELU, RELU6, TANH, NONE
+
+* CONCATENATION, Supported Fused Activation: RELU, RELU6, TANH, NONE
+
+* CONV_2D, Supported Fused Activation: RELU, RELU6, TANH, NONE
+
+* DEPTHWISE_CONV_2D, Supported Fused Activation: RELU, RELU6, TANH, NONE
+
+* DEQUANTIZE
+
+* DIV
+
+* EQUAL
+
+* EXP
+
+* FULLY_CONNECTED
+
+* GREATER
+
+* GREATER_OR_EQUAL
+
+* LESS
+
+* LESS_OR_EQUAL
+
+* LOGISTIC
+
+* LOG_SOFTMAX
+
+* L2_POOL_2D
+
+* MAXIMUM
+
+* MAX_POOL_2D, Supported Fused Activation: RELU, RELU6, TANH, NONE
+
+* MEAN
+
+* MINIMUM
+
+* MUL
+
+* NEG
+
+* NOT_EQUAL
+
+* QUANTIZE
+
+* RESHAPE
+
+* RESIZE_BILINEAR
+
+* RESIZE_NEAREST_NEIGHBOR
+
+* RELU
+
+* RELU6
+
+* RSQRT
+
+* SOFTMAX
+
+* SQRT
+
+* SUB
+
+* TANH
+
+* TRANSPOSE
+
+* TRANSPOSE_CONV
+
+More machine learning operators will be supported in future releases.
diff --git a/delegate/cmake/Modules/FindTensorflow.cmake b/delegate/cmake/Modules/FindTensorflow.cmake
index 8b47d30..8f90011 100644
--- a/delegate/cmake/Modules/FindTensorflow.cmake
+++ b/delegate/cmake/Modules/FindTensorflow.cmake
@@ -18,7 +18,7 @@
         NAMES
             tensorflow_all
         HINTS
-            ${TENSORFLOW_ROOT})
+            ${TENSORFLOW_LIB_DIR})
 
 ## Set TENSORFLOW_FOUND
 find_package_handle_standard_args(Tensorflow DEFAULT_MSG Tensorflow_INCLUDE_DIR Tensorflow_LIB)
diff --git a/delegate/cmake/Modules/FindTfLite.cmake b/delegate/cmake/Modules/FindTfLite.cmake
index 96e15db..9bb117e 100644
--- a/delegate/cmake/Modules/FindTfLite.cmake
+++ b/delegate/cmake/Modules/FindTfLite.cmake
@@ -11,7 +11,7 @@
             tensorflow/lite
             third_party
         HINTS
-            ${TENSORFLOW_ROOT}/..)
+            ${TENSORFLOW_ROOT})
 
 find_library(TfLite_LIB
         NAMES
diff --git a/delegate/src/Convolution.hpp b/delegate/src/Convolution.hpp
index fed084e..2d9fdba 100644
--- a/delegate/src/Convolution.hpp
+++ b/delegate/src/Convolution.hpp
@@ -340,6 +340,13 @@
         biasTensorInfo = armnn::TensorInfo(armnn::TensorShape({1}), GetDataType(tfLiteInputTensor));
     }
 
+    std::vector<uint8_t> swizzledData(filterTensorInfo.GetNumBytes());
+    auto filter =
+        CreateConstTensor(&tfLiteFilterTensor,
+                          filterTensorInfo,
+                          armnn::Optional<armnn::PermutationVector&>(permutationVector),
+                          swizzledData.data());
+
     if (!delegateData.m_Network)
     {
         bool isSupported = false;
@@ -351,18 +358,13 @@
                                    inputTensorInfo,
                                    outputTensorInfo,
                                    descriptor,
-                                   filterTensorInfo,
+                                   filter.GetInfo(),
                                    armnn::Optional<armnn::TensorInfo>(biasTensorInfo));
         return isSupported ? kTfLiteOk : kTfLiteError;
     }
 
     armnn::IConnectableLayer* layer = nullptr;
-    std::vector<uint8_t> swizzledData(filterTensorInfo.GetNumBytes());
-    auto filter =
-        CreateConstTensor(&tfLiteFilterTensor,
-                          filterTensorInfo,
-                          armnn::Optional<armnn::PermutationVector&>(permutationVector),
-                          swizzledData.data());
+
     if(biasEnabled)
     {
         auto biases =
diff --git a/delegate/src/FullyConnected.hpp b/delegate/src/FullyConnected.hpp
index b79f6a2..53251f7 100644
--- a/delegate/src/FullyConnected.hpp
+++ b/delegate/src/FullyConnected.hpp
@@ -129,6 +129,27 @@
         biasTensorInfo = armnn::TensorInfo(armnn::TensorShape({1}), GetDataType(tfLiteInputTensor));
     }
 
+    armnn::TensorInfo reshapedTensorInfo = GetTensorInfoForTfLiteTensor(tfLiteInputTensor);
+
+    if (inputTensorInfo.GetNumDimensions() > 2)
+    {
+        // Calculate reshape to flatten to 2D [batch_size, input_size]
+        std::vector<unsigned int> reshapedDimensions(2);
+        reshapedDimensions[1] = weightsTensorInfo.GetShape()[1];
+        reshapedDimensions[0] = inputTensorInfo.GetNumElements() / reshapedDimensions[1];
+
+        if (inputTensorInfo.GetNumElements() % reshapedDimensions[1] != 0)
+        {
+            TF_LITE_MAYBE_KERNEL_LOG(
+                tfLiteContext,
+                "TfLiteArmnnDelegate: Failed to deduce input tensor shape from filter size #%d #%d node #%d: ",
+                reshapedDimensions[1], operatorCode, nodeIndex);
+            return kTfLiteError;
+        }
+
+        reshapedTensorInfo.SetShape(armnn::TensorShape{ 2, reshapedDimensions.data() });
+    }
+
     armnn::FullyConnectedDescriptor descriptor;
     descriptor.m_TransposeWeightMatrix = true;
     descriptor.m_BiasEnabled           = biasEnabled;
@@ -141,7 +162,7 @@
                                    IsFullyConnectedSupported,
                                    delegateData.m_Backends,
                                    isSupported,
-                                   inputTensorInfo,
+                                   reshapedTensorInfo,
                                    outputTensorInfo,
                                    weightsTensorInfo,
                                    biasTensorInfo,
@@ -184,22 +205,6 @@
     if (inputTensorInfo.GetNumDimensions() > 2)
     {
         // Add reshape to flatten to 2D [batch_size, input_size]
-        std::vector<unsigned int> reshapedDimensions(2);
-        reshapedDimensions[1] = weightsTensorInfo.GetShape()[1];
-        reshapedDimensions[0] = inputTensorInfo.GetNumElements() / reshapedDimensions[1];
-
-        if (inputTensorInfo.GetNumElements() % reshapedDimensions[1] != 0)
-        {
-            TF_LITE_MAYBE_KERNEL_LOG(
-                tfLiteContext,
-                "TfLiteArmnnDelegate: Failed to deduce input tensor shape from filter size #%d #%d node #%d: ",
-                reshapedDimensions[1], operatorCode, nodeIndex);
-            return kTfLiteError;
-        }
-
-        armnn::TensorInfo reshapedTensorInfo = GetTensorInfoForTfLiteTensor(tfLiteInputTensor);
-        reshapedTensorInfo.SetShape(armnn::TensorShape{ 2, reshapedDimensions.data() });
-
         armnn::ReshapeDescriptor reshapeDescriptor;
         reshapeDescriptor.m_TargetShape = reshapedTensorInfo.GetShape();
         reshapeLayer = delegateData.m_Network->AddReshapeLayer(reshapeDescriptor);
@@ -210,7 +215,6 @@
         // Connect
         delegateData.m_OutputSlotForNode[tfLiteNode->inputs->data[0]]->Connect(reshapeLayer->GetInputSlot(0));
         reshapeLayer->GetOutputSlot(0).Connect(layer->GetInputSlot(0));
-        armnn::IOutputSlot& outputSlot = layer->GetOutputSlot(0);
         delegateData.m_OutputSlotForNode[tfLiteNode->outputs->data[0]] = &outputSlot;
     }
 
diff --git a/delegate/src/Redefine.hpp b/delegate/src/Redefine.hpp
index 755bb97..e880383 100644
--- a/delegate/src/Redefine.hpp
+++ b/delegate/src/Redefine.hpp
@@ -7,27 +7,195 @@
 
 #include <armnn/utility/IgnoreUnused.hpp>
 
+#include "DelegateUtils.hpp"
+
 #include <tensorflow/lite/builtin_ops.h>
 #include <tensorflow/lite/c/builtin_op_data.h>
 #include <tensorflow/lite/c/common.h>
 #include <tensorflow/lite/minimal_logging.h>
+#include <numeric>
 
 namespace armnnDelegate
 {
 
+TfLiteStatus CreateOutputTensorShape(const armnn::TensorInfo& inputTensorInfo,
+                                           const std::vector<int32_t>& targetShape,
+                                           armnn::ReshapeDescriptor& reshapeDesc)
+{
+    std::vector<unsigned int> outputDims(targetShape.begin(), targetShape.end());
+    const auto stretchDim = std::find(targetShape.begin(), targetShape.end(), -1);
+
+    if (stretchDim != targetShape.end())
+    {
+        if (std::find(std::next(stretchDim), targetShape.end(), -1) != targetShape.end())
+        {
+            // Return kTfLiteError and log the error after returning
+            return kTfLiteError;
+        }
+
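+        // Note: seeding std::accumulate with -1 cancels out the single -1 stretch value in targetShape,
+        // so targetNumElements ends up as the product of the explicitly specified dimensions.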
+        auto targetNumElements =
+            armnn::numeric_cast<unsigned int>(
+                std::accumulate(targetShape.begin(), targetShape.end(), -1, std::multiplies<int32_t>()));
+
+        auto stretchIndex = static_cast<size_t>(std::distance(targetShape.begin(), stretchDim));
+        outputDims[stretchIndex] = inputTensorInfo.GetNumElements() / targetNumElements;
+    }
+
+    armnn::TensorShape outputShape = armnn::TensorShape(static_cast<unsigned int>(outputDims.size()),
+                                                        outputDims.data());
+    reshapeDesc.m_TargetShape = outputShape;
+    return kTfLiteOk;
+}
+
 TfLiteStatus VisitReshapeOperator(DelegateData& delegateData,
                                   TfLiteContext* tfLiteContext,
                                   TfLiteNode* tfLiteNode,
                                   int nodeIndex,
                                   int32_t operatorCode)
 {
-    armnn::IgnoreUnused(delegateData,
-                        tfLiteContext,
-                        tfLiteNode,
-                        nodeIndex,
-                        operatorCode);
+    auto numInputs = tfLiteNode->inputs->size;
 
-    return kTfLiteError;
+    if (numInputs == 2)
+    {
+        TF_LITE_ENSURE_STATUS(ValidateNumInputs(tfLiteContext, tfLiteNode, 2, nodeIndex));
+    }
+    else
+    {
+        TF_LITE_ENSURE_STATUS(ValidateNumInputs(tfLiteContext, tfLiteNode, 1, nodeIndex));
+    }
+    TF_LITE_ENSURE_STATUS(ValidateNumOutputs(tfLiteContext, tfLiteNode, 1, nodeIndex));
+
+    const TfLiteTensor* tfLiteTensors = tfLiteContext->tensors;
+    const TfLiteTensor& tfLiteInputTensor0 = tfLiteTensors[tfLiteNode->inputs->data[0]];
+    if (IsDynamicTensor(tfLiteInputTensor0))
+    {
+        TF_LITE_MAYBE_KERNEL_LOG(tfLiteContext,
+                                 "TfLiteArmnnDelegate: Dynamic input tensors are not supported in "
+                                 "operator #%d node #%d: ",
+                                 operatorCode, nodeIndex);
+        return kTfLiteError;
+    }
+
+    const TfLiteTensor& tfLiteOutputTensor = tfLiteTensors[tfLiteNode->outputs->data[0]];
+    if (IsDynamicTensor(tfLiteOutputTensor))
+    {
+        TF_LITE_MAYBE_KERNEL_LOG(tfLiteContext,
+                                 "TfLiteArmnnDelegate: Dynamic output tensors are not supported in "
+                                 "operator #%d node #%d: ",
+                                 operatorCode, nodeIndex);
+        return kTfLiteError;
+    }
+
+    const armnn::TensorInfo& inputTensorInfo0 = GetTensorInfoForTfLiteTensor(tfLiteInputTensor0);
+    const armnn::TensorInfo& outputTensorInfo = GetTensorInfoForTfLiteTensor(tfLiteOutputTensor);
+
+    armnn::ReshapeDescriptor reshapeDesc;
+    std::vector<int32_t> targetShape;
+
+    // The new shape can be defined by either a second input tensor or by a builtin option, we need to check for both.
+    if (numInputs == 2)
+    {
+        // Get shape from the second input tensor
+        const TfLiteTensor& tfLiteShapeInputTensor = tfLiteTensors[tfLiteNode->inputs->data[1]];
+        if (IsDynamicTensor(tfLiteShapeInputTensor))
+        {
+            TF_LITE_MAYBE_KERNEL_LOG(tfLiteContext,
+                                     "TfLiteArmnnDelegate: Dynamic input tensors are not supported in "
+                                     "operator #%d node #%d: ",
+                                     operatorCode, nodeIndex);
+            return kTfLiteError;
+        }
+
+        if (tfLiteShapeInputTensor.dims->size != 1)
+        {
+            TF_LITE_MAYBE_KERNEL_LOG(tfLiteContext,
+                                     "TfLiteArmnnDelegate: Target 'shape' input is not a 1D tensor in "
+                                     "operator #%d node #%d: ",
+                                     operatorCode, nodeIndex);
+            return kTfLiteError;
+        }
+
+        // Get the shape data out of the input tensor
+        auto* shapeTensorDataPtr = tflite::GetTensorData<int32_t>(&tfLiteShapeInputTensor);
+        auto shapeTensorNumValues = tfLiteShapeInputTensor.dims->data[0];
+        for (auto i=0; i < shapeTensorNumValues; ++i)
+        {
+            targetShape.push_back(*(shapeTensorDataPtr+i));
+        }
+    }
+    else
+    {
+        // Get shape from the builtin data
+        TfLiteReshapeParams* reshapeOptions = reinterpret_cast<TfLiteReshapeParams*>(tfLiteNode->builtin_data);
+
+        if (reshapeOptions != nullptr)
+        {
+            // Options might be set without valid data. We need to check that the dimensions are in a valid range.
+            if (reshapeOptions->num_dimensions > 0 && reshapeOptions->num_dimensions <= 8)
+            {
+                for (int i=0; i < reshapeOptions->num_dimensions; ++i)
+                {
+                    targetShape.push_back(reshapeOptions->shape[i]);
+                }
+            }
+        }
+        else
+        {
+            TF_LITE_MAYBE_KERNEL_LOG(tfLiteContext,
+                                     "Target shape not defined in reshape parameters or input tensor. "
+                                     "At least one method required in operator #%d node #%d: ",
+                                     operatorCode, nodeIndex);
+            return kTfLiteError;
+        }
+    }
+
+    // Use the data to create the required tensor shape.
+    if (CreateOutputTensorShape(inputTensorInfo0, targetShape, reshapeDesc) != kTfLiteOk)
+    {
+        TF_LITE_MAYBE_KERNEL_LOG(tfLiteContext,
+                                 "TfLiteArmnnDelegate: At most one component of shape can be -1 in: "
+                                 "operator #%d node #%d: ",
+                                 operatorCode, nodeIndex);
+        return kTfLiteError;
+    }
+
+    if (reshapeDesc.m_TargetShape.GetNumElements() != inputTensorInfo0.GetNumElements())
+    {
+        TF_LITE_MAYBE_KERNEL_LOG(
+            tfLiteContext,
+            "TfLiteArmnnDelegate: Reshape, number of elements in output shape does not match input "
+            "operator #%d node #%d: ",
+            operatorCode, nodeIndex);
+        return kTfLiteError;
+    }
+
+    bool isSupported = false;
+    auto validateFunc = [&](const armnn::TensorInfo& outInfo, bool& isSupported)
+    {
+        FORWARD_LAYER_SUPPORT_FUNC(__func__,
+                                   tfLiteContext,
+                                   IsReshapeSupported,
+                                   delegateData.m_Backends,
+                                   isSupported,
+                                   inputTensorInfo0,
+                                   outInfo,
+                                   reshapeDesc);
+    };
+
+    if (!delegateData.m_Network)
+    {
+        validateFunc(outputTensorInfo, isSupported);
+        return isSupported ? kTfLiteOk : kTfLiteError;
+    }
+
+    armnn::IConnectableLayer* layer = delegateData.m_Network->AddReshapeLayer(reshapeDesc);
+    ARMNN_ASSERT(layer != nullptr);
+
+    armnn::IOutputSlot& outputSlot = layer->GetOutputSlot(0);
+    outputSlot.SetTensorInfo(outputTensorInfo);
+
+    // Connect
+    return Connect(layer, tfLiteNode, delegateData);
 }
 
 TfLiteStatus VisitSqueezeOperator(DelegateData& delegateData,
diff --git a/delegate/src/test/ComparisonTest.cpp b/delegate/src/test/ComparisonTest.cpp
index 0826535..95bfe21 100644
--- a/delegate/src/test/ComparisonTest.cpp
+++ b/delegate/src/test/ComparisonTest.cpp
@@ -497,90 +497,156 @@
                             expectedOutputValues);
 }
 
-TEST_SUITE("ComparisonTest")
+TEST_SUITE("Comparison_CpuRefTests")
+{
+
+TEST_CASE ("EQUAL_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    EqualFP32Test(backends);
+}
+
+TEST_CASE ("EQUAL_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    EqualBroadcastTest(backends);
+}
+
+TEST_CASE ("EQUAL_INT32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    EqualInt32Test(backends);
+}
+
+TEST_CASE ("NOT_EQUAL_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    NotEqualFP32Test(backends);
+}
+
+TEST_CASE ("NOT_EQUAL_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    NotEqualBroadcastTest(backends);
+}
+
+TEST_CASE ("NOT_EQUAL_INT32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    NotEqualInt32Test(backends);
+}
+
+TEST_CASE ("GREATER_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    GreaterFP32Test(backends);
+}
+
+TEST_CASE ("GREATER_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    GreaterBroadcastTest(backends);
+}
+
+TEST_CASE ("GREATER_INT32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    GreaterInt32Test(backends);
+}
+
+TEST_CASE ("GREATER_EQUAL_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    GreaterEqualFP32Test(backends);
+}
+
+TEST_CASE ("GREATER_EQUAL_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    GreaterEqualBroadcastTest(backends);
+}
+
+TEST_CASE ("GREATER_EQUAL_INT32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    GreaterEqualInt32Test(backends);
+}
+
+TEST_CASE ("LESS_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    LessFP32Test(backends);
+}
+
+TEST_CASE ("LESS_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    LessBroadcastTest(backends);
+}
+
+TEST_CASE ("LESS_INT32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    LessInt32Test(backends);
+}
+
+TEST_CASE ("LESS_EQUAL_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    LessEqualFP32Test(backends);
+}
+
+TEST_CASE ("LESS_EQUAL_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    LessEqualBroadcastTest(backends);
+}
+
+TEST_CASE ("LESS_EQUAL_INT32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    LessEqualInt32Test(backends);
+}
+} // End TEST_SUITE("Comparison_CpuRefTests")
+
+
+
+TEST_SUITE("Comparison_GpuAccTests")
 {
 
 TEST_CASE ("EQUAL_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    EqualFP32Test(backends);
-}
-
-TEST_CASE ("EQUAL_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     EqualFP32Test(backends);
 }
 
 TEST_CASE ("EQUAL_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    EqualBroadcastTest(backends);
-}
-
-TEST_CASE ("EQUAL_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     EqualBroadcastTest(backends);
 }
 
 TEST_CASE ("EQUAL_INT32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    EqualInt32Test(backends);
-}
-
-TEST_CASE ("EQUAL_INT32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     EqualInt32Test(backends);
 }
 
 TEST_CASE ("NOT_EQUAL_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    NotEqualFP32Test(backends);
-}
-
-TEST_CASE ("NOT_EQUAL_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     NotEqualFP32Test(backends);
 }
 
 TEST_CASE ("NOT_EQUAL_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    NotEqualBroadcastTest(backends);
-}
-
-TEST_CASE ("NOT_EQUAL_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     NotEqualBroadcastTest(backends);
 }
 
 TEST_CASE ("NOT_EQUAL_INT32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    NotEqualInt32Test(backends);
-}
-
-TEST_CASE ("NOT_EQUAL_INT32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     NotEqualInt32Test(backends);
 }
 
@@ -591,13 +657,6 @@
     GreaterFP32Test(backends);
 }
 
-TEST_CASE ("GREATER_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    GreaterFP32Test(backends);
-}
-
 TEST_CASE ("GREATER_Broadcast_GpuAcc_Test")
 {
     std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
@@ -605,13 +664,6 @@
     GreaterBroadcastTest(backends);
 }
 
-TEST_CASE ("GREATER_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    GreaterBroadcastTest(backends);
-}
-
 TEST_CASE ("GREATER_INT32_GpuAcc_Test")
 {
     std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
@@ -619,136 +671,174 @@
     GreaterInt32Test(backends);
 }
 
-TEST_CASE ("GREATER_INT32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    GreaterInt32Test(backends);
-}
 TEST_CASE ("GREATER_EQUAL_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    GreaterEqualFP32Test(backends);
-}
-
-TEST_CASE ("GREATER_EQUAL_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     GreaterEqualFP32Test(backends);
 }
 
 TEST_CASE ("GREATER_EQUAL_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    GreaterEqualBroadcastTest(backends);
-}
-
-TEST_CASE ("GREATER_EQUAL_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     GreaterEqualBroadcastTest(backends);
 }
 
 TEST_CASE ("GREATER_EQUAL_INT32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     GreaterEqualInt32Test(backends);
 }
 
-TEST_CASE ("GREATER_EQUAL_INT32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    GreaterEqualInt32Test(backends);
-}
 TEST_CASE ("LESS_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    LessFP32Test(backends);
-}
-
-TEST_CASE ("LESS_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     LessFP32Test(backends);
 }
 
 TEST_CASE ("LESS_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    LessBroadcastTest(backends);
-}
-
-TEST_CASE ("LESS_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     LessBroadcastTest(backends);
 }
 
 TEST_CASE ("LESS_INT32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     LessInt32Test(backends);
 }
 
-TEST_CASE ("LESS_INT32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    LessInt32Test(backends);
-}
 TEST_CASE ("LESS_EQUAL_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    LessEqualFP32Test(backends);
-}
-
-TEST_CASE ("LESS_EQUAL_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     LessEqualFP32Test(backends);
 }
 
 TEST_CASE ("LESS_EQUAL_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    LessEqualBroadcastTest(backends);
-}
-
-TEST_CASE ("LESS_EQUAL_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     LessEqualBroadcastTest(backends);
 }
 
 TEST_CASE ("LESS_EQUAL_INT32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     LessEqualInt32Test(backends);
 }
 
+} // End TEST_SUITE("Comparison_GpuAccTests")
+
+
+TEST_SUITE("Comparison_CpuAccTests")
+{
+
+TEST_CASE ("EQUAL_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    EqualFP32Test(backends);
+}
+
+TEST_CASE ("EQUAL_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    EqualBroadcastTest(backends);
+}
+
+TEST_CASE ("EQUAL_INT32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    EqualInt32Test(backends);
+}
+
+TEST_CASE ("NOT_EQUAL_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    NotEqualFP32Test(backends);
+}
+
+TEST_CASE ("NOT_EQUAL_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    NotEqualBroadcastTest(backends);
+}
+
+TEST_CASE ("NOT_EQUAL_INT32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    NotEqualInt32Test(backends);
+}
+
+TEST_CASE ("GREATER_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    GreaterFP32Test(backends);
+}
+
+TEST_CASE ("GREATER_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    GreaterBroadcastTest(backends);
+}
+
+TEST_CASE ("GREATER_INT32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    GreaterInt32Test(backends);
+}
+
+TEST_CASE ("GREATER_EQUAL_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    GreaterEqualFP32Test(backends);
+}
+
+TEST_CASE ("GREATER_EQUAL_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    GreaterEqualBroadcastTest(backends);
+}
+
+TEST_CASE ("GREATER_EQUAL_INT32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    GreaterEqualInt32Test(backends);
+}
+
+TEST_CASE ("LESS_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    LessFP32Test(backends);
+}
+
+TEST_CASE ("LESS_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    LessBroadcastTest(backends);
+}
+
+TEST_CASE ("LESS_INT32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    LessInt32Test(backends);
+}
+
+TEST_CASE ("LESS_EQUAL_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    LessEqualFP32Test(backends);
+}
+
+TEST_CASE ("LESS_EQUAL_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    LessEqualBroadcastTest(backends);
+}
+
 TEST_CASE ("LESS_EQUAL_INT32_CpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
     LessEqualInt32Test(backends);
 }
 
-} // End TEST_SUITE("ComparisonTest")
+} // End TEST_SUITE("Comparison_CpuAccTests")
 
 } // namespace armnnDelegate
\ No newline at end of file
diff --git a/delegate/src/test/ComparisonTestHelper.hpp b/delegate/src/test/ComparisonTestHelper.hpp
index 0011c76..21fc3a8 100644
--- a/delegate/src/test/ComparisonTestHelper.hpp
+++ b/delegate/src/test/ComparisonTestHelper.hpp
@@ -5,6 +5,8 @@
 
 #pragma once
 
+#include "TestUtils.hpp"
+
 #include <armnn_delegate.hpp>
 
 #include <flatbuffers/flatbuffers.h>
@@ -225,12 +227,9 @@
     auto armnnDelegateOutputId = armnnDelegateInterpreter->outputs()[0];
     auto armnnDelegateOutputData = armnnDelegateInterpreter->typed_tensor<bool>(armnnDelegateOutputId);
 
-    for (size_t i = 0; i < expectedOutputValues.size(); i++)
-    {
-        CHECK(expectedOutputValues[i] == armnnDelegateOutputData[i]);
-        CHECK(tfLiteDelageOutputData[i] == expectedOutputValues[i]);
-        CHECK(tfLiteDelageOutputData[i] == armnnDelegateOutputData[i]);
-    }
+    armnnDelegate::CompareData(expectedOutputValues  , armnnDelegateOutputData, expectedOutputValues.size());
+    armnnDelegate::CompareData(expectedOutputValues  , tfLiteDelageOutputData , expectedOutputValues.size());
+    armnnDelegate::CompareData(tfLiteDelageOutputData, armnnDelegateOutputData, expectedOutputValues.size());
 }
 
 } // anonymous namespace
\ No newline at end of file
diff --git a/delegate/src/test/Convolution2dTest.cpp b/delegate/src/test/Convolution2dTest.cpp
index 4e9377a..2ce2944 100644
--- a/delegate/src/test/Convolution2dTest.cpp
+++ b/delegate/src/test/Convolution2dTest.cpp
@@ -73,7 +73,7 @@
                                  biasValues);
 }
 
-void Conv2DWithBiasesUint8Test(std::vector<armnn::BackendId>& backends)
+void Conv2DWithBiasesInt8Test(std::vector<armnn::BackendId>& backends)
 {
     // Set input data
     std::vector<int32_t> inputShape { 1, 2, 2, 1 };
@@ -81,13 +81,13 @@
     std::vector<int32_t> biasShape { 1 };
     std::vector<int32_t> outputShape { 1, 2, 2, 1 };
 
-    static std::vector<uint8_t> inputValues = { 1, 2, 3, 4 };
+    static std::vector<int8_t> inputValues = { 1, 2, 3, 4 };
 
-    std::vector<uint8_t> filterValues = { 2, 1, 0, 6 };
+    std::vector<int8_t> filterValues = { 2, 1, 0, 6 };
 
     std::vector<int32_t> biasValues = { 10 };
 
-    std::vector<uint8_t> expectedOutputValues =
+    std::vector<int8_t> expectedOutputValues =
         {
             (1 * 2 + 2 * 1 + 3 * 0 + 4 * 6 + 10) / 2, // 19
             (2 * 2 + 0 * 1 + 4 * 0 + 0 * 6 + 10) / 2, // 7
@@ -97,8 +97,8 @@
 
     tflite::Padding padding = tflite::Padding_SAME;
 
-    ConvolutionTest<uint8_t, int32_t>(tflite::BuiltinOperator_CONV_2D,
-                                            ::tflite::TensorType_UINT8,
+    ConvolutionTest<int8_t, int32_t>(tflite::BuiltinOperator_CONV_2D,
+                                            ::tflite::TensorType_INT8,
                                             1, // strideX
                                             1, // strideY
                                             1, // dilationX
@@ -220,7 +220,7 @@
                                             biasValues);
 }
 
-TEST_SUITE("Convolution2dTest_CpuRef")
+TEST_SUITE("Convolution2dTest_CpuRefTests")
 {
 
 TEST_CASE ("Conv2DWithBiases_Fp32_CpuRef_Test")
@@ -229,27 +229,15 @@
     Conv2DWithBiasesFp32Test(backends);
 }
 
-TEST_CASE ("Conv2DWithBiases_Uint8_CpuRef_Test")
+TEST_CASE ("Conv2DWithBiases_Int8_CpuRef_Test")
 {
     std::vector <armnn::BackendId> backends = {armnn::Compute::CpuRef};
-    Conv2DWithBiasesUint8Test(backends);
-}
-
-TEST_CASE ("Conv2DWithBiases_Relu_Uint8_CpuRef_Test")
-{
-    std::vector <armnn::BackendId> backends = {armnn::Compute::CpuRef};
-    Conv2DWithBiasesReluUint8Test(backends);
-}
-
-TEST_CASE ("Conv2DWithBiases_Relu6_Uint8_CpuRef_Test")
-{
-    std::vector <armnn::BackendId> backends = {armnn::Compute::CpuRef};
-    Conv2DWithBiasesRelu6Uint8Test(backends);
+    Conv2DWithBiasesInt8Test(backends);
 }
 
 } //End of TEST_SUITE("Convolution2dTest_CpuRef")
 
-TEST_SUITE("Convolution2dTest_CpuAcc")
+TEST_SUITE("Convolution2dTest_CpuAccTests")
 {
 
 TEST_CASE ("Conv2DWithBiases_Fp32_CpuAcc_Test")
@@ -258,27 +246,15 @@
 Conv2DWithBiasesFp32Test(backends);
 }
 
-TEST_CASE ("Conv2DWithBiases_Uint8_CpuAcc_Test")
+TEST_CASE ("Conv2DWithBiases_Int8_CpuAcc_Test")
 {
 std::vector <armnn::BackendId> backends = {armnn::Compute::CpuAcc};
-Conv2DWithBiasesUint8Test(backends);
-}
-
-TEST_CASE ("Conv2DWithBiases_Relu_Uint8_CpuAcc_Test")
-{
-std::vector <armnn::BackendId> backends = {armnn::Compute::CpuAcc};
-Conv2DWithBiasesReluUint8Test(backends);
-}
-
-TEST_CASE ("Conv2DWithBiases_Relu6Uint8_CpuAcc_Test")
-{
-std::vector <armnn::BackendId> backends = {armnn::Compute::CpuAcc};
-Conv2DWithBiasesRelu6Uint8Test(backends);
+Conv2DWithBiasesInt8Test(backends);
 }
 
 } //End of TEST_SUITE("Convolution2dTest_CpuAcc")
 
-TEST_SUITE("Convolution2dTest_GpuAcc")
+TEST_SUITE("Convolution2dTest_GpuAccTests")
 {
 
 TEST_CASE ("Conv2DWithBiases_Fp32_GpuAcc_Test")
@@ -287,27 +263,15 @@
 Conv2DWithBiasesFp32Test(backends);
 }
 
-TEST_CASE ("Conv2DWithBiases_Uint8_GpuAcc_Test")
+TEST_CASE ("Conv2DWithBiases_Int8_GpuAcc_Test")
 {
 std::vector <armnn::BackendId> backends = {armnn::Compute::GpuAcc};
-Conv2DWithBiasesUint8Test(backends);
-}
-
-TEST_CASE ("Conv2DWithBiases_Relu_Uint8_GpuAcc_Test")
-{
-std::vector <armnn::BackendId> backends = {armnn::Compute::GpuAcc};
-Conv2DWithBiasesReluUint8Test(backends);
-}
-
-TEST_CASE ("Conv2DWithBiases_Relu_Uint8_GpuAcc_Test")
-{
-std::vector <armnn::BackendId> backends = {armnn::Compute::GpuAcc};
-Conv2DWithBiasesRelu6Uint8Test(backends);
+Conv2DWithBiasesInt8Test(backends);
 }
 
 } //End of TEST_SUITE("Convolution2dTest_GpuAcc")
 
-void TransposeConvUint8Test(std::vector<armnn::BackendId>& backends)
+void TransposeConvInt8Test(std::vector<armnn::BackendId>& backends)
 {
     // Set input data
     std::vector<int32_t> transposeTensorShape { 4 };
@@ -316,9 +280,9 @@
     std::vector<int32_t> outputShape { 1, 3, 3, 1 };
 
     std::vector<int32_t> transposeData = { 1, 3, 3, 1 };
-    static std::vector<uint8_t> inputValues = { 1, 2, 3, 4 };
-    std::vector<uint8_t> filterValues = { 0, 1, 2, 4 };
-    std::vector<uint8_t> expectedOutputValues =
+    static std::vector<int8_t> inputValues = { 1, 2, 3, 4 };
+    std::vector<int8_t> filterValues = { 0, 1, 2, 4 };
+    std::vector<int8_t> expectedOutputValues =
         {
             0, 1,  2,
             2, 11, 12,
@@ -326,8 +290,8 @@
         };
 
     tflite::Padding padding = tflite::Padding_VALID;
-    TransposeConvTest<uint8_t>(backends,
-                             ::tflite::TensorType_UINT8,
+    TransposeConvTest<int8_t>(backends,
+                             ::tflite::TensorType_INT8,
                              1, // strideX
                              1, // strideY
                              padding,
@@ -383,10 +347,10 @@
     TransposeConvFp32Test(backends);
 }
 
-TEST_CASE ("TransposeConv_Uint8_Test")
+TEST_CASE ("TransposeConv_Int8_Test")
 {
     std::vector <armnn::BackendId> backends = {armnn::Compute::CpuRef};
-    TransposeConvUint8Test(backends);
+    TransposeConvInt8Test(backends);
 }
 
 } // End of  TEST_SUITE(TransposeConv_CpuRef_Test)
@@ -396,14 +360,14 @@
 
 TEST_CASE ("TransposeConv_Fp32_Test")
 {
-std::vector <armnn::BackendId> backends = {armnn::Compute::CpuAcc};
-TransposeConvFp32Test(backends);
+    std::vector <armnn::BackendId> backends = {armnn::Compute::CpuAcc};
+    TransposeConvFp32Test(backends);
 }
 
-TEST_CASE ("TransposeConv_Uint8_Test")
+TEST_CASE ("TransposeConv_Int8_Test")
 {
-std::vector <armnn::BackendId> backends = {armnn::Compute::CpuAcc};
-TransposeConvUint8Test(backends);
+    std::vector <armnn::BackendId> backends = {armnn::Compute::CpuAcc};
+    TransposeConvInt8Test(backends);
 }
 
 } // End of  TEST_SUITE(TransposeConv_CpuAcc_Test)
@@ -413,14 +377,14 @@
 
 TEST_CASE ("TransposeConv_Fp32_Test")
 {
-std::vector <armnn::BackendId> backends = {armnn::Compute::GpuAcc};
-TransposeConvFp32Test(backends);
+    std::vector <armnn::BackendId> backends = {armnn::Compute::GpuAcc};
+    TransposeConvFp32Test(backends);
 }
 
-TEST_CASE ("TransposeConv_Uint8_Test")
+TEST_CASE ("TransposeConv_Int8_Test")
 {
-std::vector <armnn::BackendId> backends = {armnn::Compute::GpuAcc};
-TransposeConvUint8Test(backends);
+    std::vector <armnn::BackendId> backends = {armnn::Compute::GpuAcc};
+    TransposeConvInt8Test(backends);
 }
 
 } // End of  TEST_SUITE(TransposeConv_GpuAcc_Test)
diff --git a/delegate/src/test/ConvolutionTestHelper.hpp b/delegate/src/test/ConvolutionTestHelper.hpp
index b7705cc..b317517 100644
--- a/delegate/src/test/ConvolutionTestHelper.hpp
+++ b/delegate/src/test/ConvolutionTestHelper.hpp
@@ -91,7 +91,7 @@
                               filterQuantizationParameters);
 
     auto biasTensorType = ::tflite::TensorType_FLOAT32;
-    if (tensorType == ::tflite::TensorType_UINT8)
+    if (tensorType == ::tflite::TensorType_INT8 || tensorType == ::tflite::TensorType_UINT8)
     {
         biasTensorType = ::tflite::TensorType_INT32;
     }
diff --git a/delegate/src/test/ElementwiseBinaryTest.cpp b/delegate/src/test/ElementwiseBinaryTest.cpp
index 2a8c91b..cc447d9 100644
--- a/delegate/src/test/ElementwiseBinaryTest.cpp
+++ b/delegate/src/test/ElementwiseBinaryTest.cpp
@@ -657,289 +657,370 @@
                                    expectedOutputValues, 1.0f, 0);
 }
 
-TEST_SUITE("ElementwiseBinaryTest")
+TEST_SUITE("ElementwiseBinary_GpuAccTests")
 {
 
 TEST_CASE ("ADD_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    AddFP32Test(backends);
-}
-
-TEST_CASE ("ADD_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     AddFP32Test(backends);
 }
 
 TEST_CASE ("ADD_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    AddBroadcastTest(backends);
-}
-
-TEST_CASE ("ADD_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     AddBroadcastTest(backends);
 }
 
 TEST_CASE ("ADD_Activation_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    AddActivationTest(backends);
-}
-
-TEST_CASE ("ADD_Actiation_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     AddActivationTest(backends);
 }
 
 TEST_CASE ("ADD_UINT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    AddUint8Test(backends);
-}
-
-TEST_CASE ("ADD_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     AddUint8Test(backends);
 }
 
 TEST_CASE ("DIV_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    DivFP32Test(backends);
-}
-
-TEST_CASE ("DIV_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     DivFP32Test(backends);
 }
 
 TEST_CASE ("DIV_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     DivBroadcastTest(backends);
 }
 
-TEST_CASE ("DIV_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    DivBroadcastTest(backends);
-}
-
-TEST_CASE ("DIV_UINT8_GpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    DivUint8Test(backends);
-}
-
-TEST_CASE ("DIV_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    DivUint8Test(backends);
-}
-
 TEST_CASE ("MAX_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MaxFP32Test(backends);
-}
-
-TEST_CASE ("MAX_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MaxFP32Test(backends);
 }
 
 TEST_CASE ("MAX_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MaxBroadcastTest(backends);
-}
-
-TEST_CASE ("MAX_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MaxBroadcastTest(backends);
 }
 
 TEST_CASE ("MAX_UINT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MaxUint8Test(backends);
-}
-
-TEST_CASE ("MAX_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MaxUint8Test(backends);
 }
 
 TEST_CASE ("MIN_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MinFP32Test(backends);
-}
-
-TEST_CASE ("MIN_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MinFP32Test(backends);
 }
 
 TEST_CASE ("MIN_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MinBroadcastTest(backends);
-}
-
-TEST_CASE ("MIN_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MinBroadcastTest(backends);
 }
 
 TEST_CASE ("MIN_UINT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MinUint8Test(backends);
-}
-
-TEST_CASE ("MIN_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MinUint8Test(backends);
 }
 
 TEST_CASE ("MUL_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MulFP32Test(backends);
-}
-
-TEST_CASE ("MUL_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MulFP32Test(backends);
 }
 
 TEST_CASE ("MUL_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MulBroadcastTest(backends);
-}
-
-TEST_CASE ("MUL_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MulBroadcastTest(backends);
 }
 
 TEST_CASE ("MUL_Activation_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MulActivationTest(backends);
-}
-
-TEST_CASE ("MUL_Actiation_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MulActivationTest(backends);
 }
 
 TEST_CASE ("MUL_UINT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    MulUint8Test(backends);
-}
-
-TEST_CASE ("MUL_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     MulUint8Test(backends);
 }
 
 TEST_CASE ("SUB_FP32_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    SubFP32Test(backends);
-}
-
-TEST_CASE ("SUB_FP32_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     SubFP32Test(backends);
 }
 
 TEST_CASE ("SUB_Broadcast_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    SubBroadcastTest(backends);
-}
-
-TEST_CASE ("SUB_Broadcast_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     SubBroadcastTest(backends);
 }
 
 TEST_CASE ("SUB_UINT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     SubUint8Test(backends);
 }
 
+} //TEST_SUITE("ElementwiseBinary_GpuAccTests")
+
+
+
+TEST_SUITE("ElementwiseBinary_CpuAccTests")
+{
+
+TEST_CASE ("ADD_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    AddFP32Test(backends);
+}
+
+TEST_CASE ("ADD_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    AddBroadcastTest(backends);
+}
+
+TEST_CASE ("ADD_Actiation_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    AddActivationTest(backends);
+}
+
+TEST_CASE ("ADD_UINT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    AddUint8Test(backends);
+}
+
+TEST_CASE ("DIV_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    DivFP32Test(backends);
+}
+
+TEST_CASE ("DIV_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    DivBroadcastTest(backends);
+}
+
+TEST_CASE ("MAX_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MaxFP32Test(backends);
+}
+
+TEST_CASE ("MAX_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MaxBroadcastTest(backends);
+}
+
+TEST_CASE ("MAX_UINT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MaxUint8Test(backends);
+}
+
+TEST_CASE ("MIN_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MinFP32Test(backends);
+}
+
+TEST_CASE ("MIN_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MinBroadcastTest(backends);
+}
+
+TEST_CASE ("MIN_UINT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MinUint8Test(backends);
+}
+
+TEST_CASE ("MUL_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MulFP32Test(backends);
+}
+
+TEST_CASE ("MUL_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MulBroadcastTest(backends);
+}
+
+TEST_CASE ("MUL_Actiation_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MulActivationTest(backends);
+}
+
+TEST_CASE ("MUL_UINT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    MulUint8Test(backends);
+}
+
+TEST_CASE ("SUB_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    SubFP32Test(backends);
+}
+
+TEST_CASE ("SUB_Broadcast_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    SubBroadcastTest(backends);
+}
+
 TEST_CASE ("SUB_UINT8_CpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
     SubUint8Test(backends);
 }
 
-} // End of TEST_SUITE("ElementwiseBinaryTest")
+} // TEST_SUITE("ElementwiseBinary_CpuAccTests")
+
+
+TEST_SUITE("ElementwiseBinary_CpuRefTests")
+{
+
+TEST_CASE ("ADD_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    AddFP32Test(backends);
+}
+
+TEST_CASE ("ADD_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    AddBroadcastTest(backends);
+}
+
+TEST_CASE ("ADD_Actiation_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    AddActivationTest(backends);
+}
+
+TEST_CASE ("ADD_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    AddUint8Test(backends);
+}
+
+TEST_CASE ("DIV_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    DivFP32Test(backends);
+}
+
+TEST_CASE ("DIV_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    DivBroadcastTest(backends);
+}
+
+TEST_CASE ("DIV_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    DivUint8Test(backends);
+}
+
+TEST_CASE ("MAX_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MaxFP32Test(backends);
+}
+
+TEST_CASE ("MAX_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MaxBroadcastTest(backends);
+}
+
+TEST_CASE ("MAX_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MaxUint8Test(backends);
+}
+
+TEST_CASE ("MIN_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MinFP32Test(backends);
+}
+
+TEST_CASE ("MIN_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MinBroadcastTest(backends);
+}
+
+TEST_CASE ("MIN_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MinUint8Test(backends);
+}
+
+TEST_CASE ("MUL_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MulFP32Test(backends);
+}
+
+TEST_CASE ("MUL_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MulBroadcastTest(backends);
+}
+
+TEST_CASE ("MUL_Actiation_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MulActivationTest(backends);
+}
+
+TEST_CASE ("MUL_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    MulUint8Test(backends);
+}
+
+TEST_CASE ("SUB_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    SubFP32Test(backends);
+}
+
+TEST_CASE ("SUB_Broadcast_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    SubBroadcastTest(backends);
+}
+
+TEST_CASE ("SUB_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    SubUint8Test(backends);
+}
+
+} // TEST_SUITE("ElementwiseBinary_CpuRefTests")
 
 } // namespace armnnDelegate
\ No newline at end of file
diff --git a/delegate/src/test/ElementwiseUnaryTest.cpp b/delegate/src/test/ElementwiseUnaryTest.cpp
index c504707..3200423 100644
--- a/delegate/src/test/ElementwiseUnaryTest.cpp
+++ b/delegate/src/test/ElementwiseUnaryTest.cpp
@@ -19,14 +19,13 @@
 namespace armnnDelegate
 {
 
-TEST_SUITE("ElementwiseUnaryTest")
+TEST_SUITE("ElementwiseUnary_GpuAccTests")
 {
 
 TEST_CASE ("Abs_Float32_GpuAcc_Test")
 {
     // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                           armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     // Set input data
     std::vector<float> inputValues
     {
@@ -42,55 +41,10 @@
     ElementwiseUnaryFP32Test(tflite::BuiltinOperator_ABS, backends, inputValues, expectedOutputValues);
 }
 
-TEST_CASE ("Abs_Float32_CpuAcc_Test")
-{
-    // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    // Set input data
-    std::vector<float> inputValues
-    {
-        -0.1f, -0.2f, -0.3f,
-        0.1f,  0.2f,  0.3f
-    };
-    // Calculate output data
-    std::vector<float> expectedOutputValues(inputValues.size());
-    for (unsigned int i = 0; i < inputValues.size(); ++i)
-    {
-        expectedOutputValues[i] = std::abs(inputValues[i]);
-    }
-
-    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_ABS, backends, inputValues, expectedOutputValues);
-}
-
 TEST_CASE ("Exp_Float32_GpuAcc_Test")
 {
     // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    // Set input data
-    std::vector<float> inputValues
-    {
-        5.0f, 4.0f,
-        3.0f, 2.0f,
-        1.0f, 1.1f
-    };
-    // Set output data
-    std::vector<float> expectedOutputValues
-    {
-        148.413159102577f, 54.598150033144f,
-        20.085536923188f,  7.389056098931f,
-        2.718281828459f,  3.004166023946f
-    };
-
-    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_EXP, backends, inputValues, expectedOutputValues);
-}
-
-TEST_CASE ("Exp_Float32_CpuAcc_Test")
-{
-    // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     // Set input data
     std::vector<float> inputValues
     {
@@ -112,29 +66,7 @@
 TEST_CASE ("Neg_Float32_GpuAcc_Test")
 {
     // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    // Set input data
-    std::vector<float> inputValues
-    {
-        1.f, 0.f, 3.f,
-        25.f, 64.f, 100.f
-    };
-    // Set output data
-    std::vector<float> expectedOutputValues
-    {
-        -1.f, 0.f, -3.f,
-        -25.f, -64.f, -100.f
-    };
-
-    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_NEG, backends, inputValues, expectedOutputValues);
-}
-
-TEST_CASE ("Neg_Float32_CpuAcc_Test")
-{
-    // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     // Set input data
     std::vector<float> inputValues
     {
@@ -154,8 +86,7 @@
 TEST_CASE ("Rsqrt_Float32_GpuAcc_Test")
 {
     // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     // Set input data
     std::vector<float> inputValues
     {
@@ -172,11 +103,79 @@
     ElementwiseUnaryFP32Test(tflite::BuiltinOperator_RSQRT, backends, inputValues, expectedOutputValues);
 }
 
+} // TEST_SUITE("ElementwiseUnary_GpuAccTests")
+
+
+
+TEST_SUITE("ElementwiseUnary_CpuAccTests")
+{
+
+TEST_CASE ("Abs_Float32_CpuAcc_Test")
+{
+    // Create the ArmNN Delegate
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    // Set input data
+    std::vector<float> inputValues
+    {
+        -0.1f, -0.2f, -0.3f,
+        0.1f,  0.2f,  0.3f
+    };
+    // Calculate output data
+    std::vector<float> expectedOutputValues(inputValues.size());
+    for (unsigned int i = 0; i < inputValues.size(); ++i)
+    {
+        expectedOutputValues[i] = std::abs(inputValues[i]);
+    }
+
+    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_ABS, backends, inputValues, expectedOutputValues);
+}
+
+TEST_CASE ("Exp_Float32_CpuAcc_Test")
+{
+    // Create the ArmNN Delegate
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    // Set input data
+    std::vector<float> inputValues
+    {
+        5.0f, 4.0f,
+        3.0f, 2.0f,
+        1.0f, 1.1f
+    };
+    // Set output data
+    std::vector<float> expectedOutputValues
+    {
+        148.413159102577f, 54.598150033144f,
+        20.085536923188f,  7.389056098931f,
+        2.718281828459f,  3.004166023946f
+    };
+
+    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_EXP, backends, inputValues, expectedOutputValues);
+}
+
+TEST_CASE ("Neg_Float32_CpuAcc_Test")
+{
+    // Create the ArmNN Delegate
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    // Set input data
+    std::vector<float> inputValues
+    {
+        1.f, 0.f, 3.f,
+        25.f, 64.f, 100.f
+    };
+    // Set output data
+    std::vector<float> expectedOutputValues
+    {
+        -1.f, 0.f, -3.f,
+        -25.f, -64.f, -100.f
+    };
+
+    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_NEG, backends, inputValues, expectedOutputValues);
+}
+
 TEST_CASE ("Rsqrt_Float32_CpuAcc_Test")
 {
     // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
     // Set input data
     std::vector<float> inputValues
     {
@@ -193,31 +192,96 @@
     ElementwiseUnaryFP32Test(tflite::BuiltinOperator_RSQRT, backends, inputValues, expectedOutputValues);
 }
 
-TEST_CASE ("Sqrt_Float32_GpuAcc_Test")
+} // TEST_SUITE("ElementwiseUnary_CpuAccTests")
+
+TEST_SUITE("ElementwiseUnary_CpuRefTests")
+{
+
+TEST_CASE ("Abs_Float32_CpuRef_Test")
 {
     // Create the ArmNN Delegate
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
     // Set input data
     std::vector<float> inputValues
     {
-        9.0f, 4.25f, 81.9f,
-        0.1f,  0.9f,  169.0f
+        -0.1f, -0.2f, -0.3f,
+        0.1f,  0.2f,  0.3f
     };
     // Calculate output data
     std::vector<float> expectedOutputValues(inputValues.size());
     for (unsigned int i = 0; i < inputValues.size(); ++i)
     {
-        expectedOutputValues[i] = std::sqrt(inputValues[i]);
+        expectedOutputValues[i] = std::abs(inputValues[i]);
     }
 
-    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_SQRT, backends, inputValues, expectedOutputValues);
+    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_ABS, backends, inputValues, expectedOutputValues);
 }
 
-TEST_CASE ("Sqrt_Float32_CpuAcc_Test")
+TEST_CASE ("Exp_Float32_CpuRef_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    // Create the ArmNN Delegate
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    // Set input data
+    std::vector<float> inputValues
+    {
+        5.0f, 4.0f,
+        3.0f, 2.0f,
+        1.0f, 1.1f
+    };
+    // Set output data
+    std::vector<float> expectedOutputValues
+    {
+        148.413159102577f, 54.598150033144f,
+        20.085536923188f,  7.389056098931f,
+        2.718281828459f,  3.004166023946f
+    };
+
+    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_EXP, backends, inputValues, expectedOutputValues);
+}
+
+TEST_CASE ("Neg_Float32_CpuRef_Test")
+{
+    // Create the ArmNN Delegate
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    // Set input data
+    std::vector<float> inputValues
+    {
+        1.f, 0.f, 3.f,
+        25.f, 64.f, 100.f
+    };
+    // Set output data
+    std::vector<float> expectedOutputValues
+    {
+        -1.f, 0.f, -3.f,
+        -25.f, -64.f, -100.f
+    };
+
+    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_NEG, backends, inputValues, expectedOutputValues);
+}
+
+TEST_CASE ("Rsqrt_Float32_CpuRef_Test")
+{
+    // Create the ArmNN Delegate
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    // Set input data
+    std::vector<float> inputValues
+    {
+        1.f, 4.f, 16.f,
+        25.f, 64.f, 100.f
+    };
+    // Set output data
+    std::vector<float> expectedOutputValues
+    {
+        1.f, 0.5f, 0.25f,
+        0.2f, 0.125f, 0.1f
+    };
+
+    ElementwiseUnaryFP32Test(tflite::BuiltinOperator_RSQRT, backends, inputValues, expectedOutputValues);
+}
+
+TEST_CASE ("Sqrt_Float32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
     // Set input data
     std::vector<float> inputValues
     {
@@ -234,6 +298,6 @@
     ElementwiseUnaryFP32Test(tflite::BuiltinOperator_SQRT, backends, inputValues, expectedOutputValues);
 }
 
-}
+} // TEST_SUITE("ElementwiseUnary_CpuRefTests")
 
 } // namespace armnnDelegate
\ No newline at end of file
diff --git a/delegate/src/test/ElementwiseUnaryTestHelper.hpp b/delegate/src/test/ElementwiseUnaryTestHelper.hpp
index b4a55cb..348c8ab 100644
--- a/delegate/src/test/ElementwiseUnaryTestHelper.hpp
+++ b/delegate/src/test/ElementwiseUnaryTestHelper.hpp
@@ -5,6 +5,8 @@
 
 #pragma once
 
+#include "TestUtils.hpp"
+
 #include <armnn_delegate.hpp>
 
 #include <flatbuffers/flatbuffers.h>
@@ -79,7 +81,7 @@
                               std::vector<float>& expectedOutputValues)
 {
     using namespace tflite;
-    const std::vector<int32_t> inputShape  { { 3, 1, 2} };
+    std::vector<int32_t> inputShape  { { 3, 1, 2} };
     std::vector<char> modelBuffer = CreateElementwiseUnaryTfLiteModel(unaryOperatorCode,
                                                                       ::tflite::TensorType_FLOAT32,
                                                                       inputShape);
@@ -126,15 +128,7 @@
     CHECK(armnnDelegateInterpreter->Invoke() == kTfLiteOk);
 
     // Compare output data
-    auto tfLiteDelegateOutputId = tfLiteInterpreter->outputs()[0];
-    auto tfLiteDelageOutputData = tfLiteInterpreter->typed_tensor<float>(tfLiteDelegateOutputId);
-    auto armnnDelegateOutputId = armnnDelegateInterpreter->outputs()[0];
-    auto armnnDelegateOutputData = armnnDelegateInterpreter->typed_tensor<float>(armnnDelegateOutputId);
-    for (size_t i = 0; i < inputValues.size(); i++)
-    {
-        CHECK(expectedOutputValues[i] == armnnDelegateOutputData[i]);
-        CHECK(tfLiteDelageOutputData[i] == armnnDelegateOutputData[i]);
-    }
+    armnnDelegate::CompareOutputData(tfLiteInterpreter, armnnDelegateInterpreter, inputShape, expectedOutputValues);
 }
 
 } // anonymous namespace
diff --git a/delegate/src/test/FullyConnectedTest.cpp b/delegate/src/test/FullyConnectedTest.cpp
index 1d33381..018f7f5 100644
--- a/delegate/src/test/FullyConnectedTest.cpp
+++ b/delegate/src/test/FullyConnectedTest.cpp
@@ -8,9 +8,6 @@
 namespace
 {
 
-TEST_SUITE("FullyConnectedTest")
-{
-
 void FullyConnectedFp32Test(std::vector<armnn::BackendId>& backends)
 {
     std::vector<int32_t> inputTensorShape   { 1, 4, 1, 1 };
@@ -61,68 +58,100 @@
                               weightsData);
 }
 
-void FullyConnectedUint8Test(std::vector<armnn::BackendId>& backends)
+void FullyConnectedInt8Test(std::vector<armnn::BackendId>& backends)
 {
     std::vector<int32_t> inputTensorShape   { 1, 4, 2, 1 };
     std::vector<int32_t> weightsTensorShape { 1, 4 };
     std::vector<int32_t> biasTensorShape    { 1 };
     std::vector<int32_t> outputTensorShape  { 2, 1 };
 
-    std::vector<uint8_t> inputValues = { 1, 2, 3, 4, 10, 20, 30, 40 };
-    std::vector<uint8_t> weightsData = { 2, 3, 4, 5 };
+    std::vector<int8_t> inputValues = { 1, 2, 3, 4, 5, 10, 15, 20 };
+    std::vector<int8_t> weightsData = { 2, 3, 4, 5 };
 
-    std::vector<uint8_t> expectedOutputValues = { (40 + 10) / 2, (400 + 10) / 2 };
+    std::vector<int8_t> expectedOutputValues = { 25, 105 };  // (40 + 10) / 2, (200 + 10) / 2
 
     // bias is set std::vector<int32_t> biasData = { 10 } in the model
     // input and weights quantization scale 1.0f and offset 0 in the model
     // output quantization scale 2.0f and offset 0 in the model
-    FullyConnectedTest<uint8_t>(backends,
-                              ::tflite::TensorType_UINT8,
-                              tflite::ActivationFunctionType_NONE,
-                              inputTensorShape,
-                              weightsTensorShape,
-                              biasTensorShape,
-                              outputTensorShape,
-                              inputValues,
-                              expectedOutputValues,
-                              weightsData);
+    FullyConnectedTest<int8_t>(backends,
+                                ::tflite::TensorType_INT8,
+                                tflite::ActivationFunctionType_NONE,
+                                inputTensorShape,
+                                weightsTensorShape,
+                                biasTensorShape,
+                                outputTensorShape,
+                                inputValues,
+                                expectedOutputValues,
+                                weightsData);
 }
 
-TEST_CASE ("FULLY_CONNECTED_FP32_GpuAcc_Test")
+TEST_SUITE("FullyConnected_GpuAccTests")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+
+TEST_CASE ("FullyConnected_FP32_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     FullyConnectedFp32Test(backends);
 }
 
-TEST_CASE ("FULLY_CONNECTED_FP32_CpuAcc_Test")
+TEST_CASE ("FullyConnected_Int8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    FullyConnectedFp32Test(backends);
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    FullyConnectedInt8Test(backends);
 }
 
-TEST_CASE ("FULLY_CONNECTED_UINT8_GpuAcc_Test")
+TEST_CASE ("FullyConnected_Activation_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    FullyConnectedUint8Test(backends);
-}
-
-TEST_CASE ("FULLY_CONNECTED_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    FullyConnectedUint8Test(backends);
-}
-
-TEST_CASE ("FULLY_CONNECTED_Activation_GpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     FullyConnectedActicationTest(backends);
 }
 
-} // End of TEST_SUITE("FullyConnectedTest")
+} // End of TEST_SUITE("FullyConnected_GpuAccTests")
+
+TEST_SUITE("FullyConnected_CpuAccTests")
+{
+
+TEST_CASE ("FullyConnected_FP32_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    FullyConnectedFp32Test(backends);
+}
+
+TEST_CASE ("FullyConnected_Int8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    FullyConnectedInt8Test(backends);
+}
+
+TEST_CASE ("FullyConnected_Activation_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    FullyConnectedActicationTest(backends);
+}
+
+} // End of TEST_SUITE("FullyConnected_CpuAccTests")
+
+TEST_SUITE("FullyConnected_CpuRefTests")
+{
+
+TEST_CASE ("FullyConnected_FP32_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    FullyConnectedFp32Test(backends);
+}
+
+TEST_CASE ("FullyConnected_Int8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    FullyConnectedInt8Test(backends);
+}
+
+TEST_CASE ("FullyConnected_Activation_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    FullyConnectedActicationTest(backends);
+}
+
+} // End of TEST_SUITE("FullyConnected_CpuRefTests")
 
 } // anonymous namespace
\ No newline at end of file
diff --git a/delegate/src/test/FullyConnectedTestHelper.hpp b/delegate/src/test/FullyConnectedTestHelper.hpp
index 4eed958..4b30424 100644
--- a/delegate/src/test/FullyConnectedTestHelper.hpp
+++ b/delegate/src/test/FullyConnectedTestHelper.hpp
@@ -41,7 +41,7 @@
                                                     sizeof(T) * weightsData.size()));
 
     auto biasTensorType = ::tflite::TensorType_FLOAT32;
-    if (tensorType == ::tflite::TensorType_UINT8)
+    if (tensorType == ::tflite::TensorType_INT8)
     {
         biasTensorType = ::tflite::TensorType_INT32;
         std::vector<int32_t> biasData = { 10 };
diff --git a/delegate/src/test/QuantizationTest.cpp b/delegate/src/test/QuantizationTest.cpp
index 5466d47..fbc2903 100644
--- a/delegate/src/test/QuantizationTest.cpp
+++ b/delegate/src/test/QuantizationTest.cpp
@@ -279,148 +279,174 @@
                                       expectedOutputValues);
 }
 
-TEST_SUITE("QuantizationTests")
+TEST_SUITE("CpuRef_QuantizationTests")
+{
+
+TEST_CASE ("DEQUANTIZE_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    DequantizeUint8Test(backends);
+}
+
+
+TEST_CASE ("DEQUANTIZE_INT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    DequantizeInt8Test(backends);
+}
+
+
+TEST_CASE ("DEQUANTIZE_INT16_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    DequantizeInt16Test(backends);
+}
+
+
+TEST_CASE ("QUANTIZE_FLOAT32_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    QuantizeFloat32Uint8Test(backends);
+}
+
+
+TEST_CASE ("QUANTIZE_FLOAT32_INT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    QuantizeFloat32Int8Test(backends);
+}
+
+
+TEST_CASE ("QUANTIZE_FLOAT32_INT16_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    QuantizeFloat32Int16Test(backends);
+}
+
+
+TEST_CASE ("QUANTIZE_INT16_INT16_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    QuantizeInt16Int16Test(backends);
+}
+
+
+TEST_CASE ("QUANTIZE_INT16_INT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    QuantizeInt16Int8Test(backends);
+}
+
+
+
+TEST_CASE ("QUANTIZE_INT8_UINT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    QuantizeInt8Uint8Test(backends);
+}
+
+
+TEST_CASE ("QUANTIZE_UINT8_INT8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    QuantizeUint8Int8Test(backends);
+}
+
+} // TEST_SUITE("CpuRef_QuantizationTests")
+
+TEST_SUITE("CpuAcc_QuantizationTests")
+{
+
+// Dequantize Operator Tests
+TEST_CASE ("DEQUANTIZE_UINT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    DequantizeUint8Test(backends);
+}
+
+TEST_CASE ("DEQUANTIZE_INT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    DequantizeInt8Test(backends);
+}
+
+TEST_CASE ("DEQUANTIZE_INT16_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    DequantizeInt16Test(backends);
+}
+
+// Quantize Operator Tests
+TEST_CASE ("QUANTIZE_FLOAT32_UINT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    QuantizeFloat32Uint8Test(backends);
+}
+
+TEST_CASE ("QUANTIZE_FLOAT32_INT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    QuantizeFloat32Int8Test(backends);
+}
+
+TEST_CASE ("QUANTIZE_INT8_UINT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    QuantizeInt8Uint8Test(backends);
+}
+
+TEST_CASE ("QUANTIZE_UINT8_INT8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    QuantizeUint8Int8Test(backends);
+}
+
+} // TEST_SUITE("CpuAcc_QuantizationTests")
+
+TEST_SUITE("GpuAcc_QuantizationTests")
 {
 
 // Dequantize Operator Tests
 TEST_CASE ("DEQUANTIZE_UINT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    DequantizeUint8Test(backends);
-}
-
-TEST_CASE ("DEQUANTIZE_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     DequantizeUint8Test(backends);
 }
 
 TEST_CASE ("DEQUANTIZE_INT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    DequantizeInt8Test(backends);
-}
-
-TEST_CASE ("DEQUANTIZE_INT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     DequantizeInt8Test(backends);
 }
 
 TEST_CASE ("DEQUANTIZE_INT16_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    DequantizeInt16Test(backends);
-}
-
-TEST_CASE ("DEQUANTIZE_INT16_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     DequantizeInt16Test(backends);
 }
 
 // Quantize Operator Tests
 TEST_CASE ("QUANTIZE_FLOAT32_UINT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeFloat32Uint8Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_FLOAT32_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     QuantizeFloat32Uint8Test(backends);
 }
 
 TEST_CASE ("QUANTIZE_FLOAT32_INT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     QuantizeFloat32Int8Test(backends);
 }
 
-TEST_CASE ("QUANTIZE_FLOAT32_INT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeFloat32Int8Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_FLOAT32_INT16_GpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeFloat32Int16Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_FLOAT32_INT16_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeFloat32Int16Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_INT16_INT16_GpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeInt16Int16Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_INT16_INT16_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeInt16Int16Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_INT16_INT8_GpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeInt16Int8Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_INT16_INT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeInt16Int8Test(backends);
-}
-
 TEST_CASE ("QUANTIZE_INT8_UINT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeInt8Uint8Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_INT8_UINT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     QuantizeInt8Uint8Test(backends);
 }
 
 TEST_CASE ("QUANTIZE_UINT8_INT8_GpuAcc_Test")
 {
-    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc,
-                                               armnn::Compute::CpuRef };
-    QuantizeUint8Int8Test(backends);
-}
-
-TEST_CASE ("QUANTIZE_UINT8_INT8_CpuAcc_Test")
-{
-    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc,
-                                               armnn::Compute::CpuRef };
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
     QuantizeUint8Int8Test(backends);
 }
 
diff --git a/delegate/src/test/RedefineTestHelper.hpp b/delegate/src/test/RedefineTestHelper.hpp
new file mode 100644
index 0000000..42fc4c8
--- /dev/null
+++ b/delegate/src/test/RedefineTestHelper.hpp
@@ -0,0 +1,213 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#pragma once
+
+#include "TestUtils.hpp"
+
+#include <armnn_delegate.hpp>
+
+#include <flatbuffers/flatbuffers.h>
+#include <tensorflow/lite/interpreter.h>
+#include <tensorflow/lite/kernels/register.h>
+#include <tensorflow/lite/model.h>
+#include <tensorflow/lite/schema/schema_generated.h>
+#include <tensorflow/lite/version.h>
+
+#include <doctest/doctest.h>
+
+namespace
+{
+
+std::vector<char> CreateRedefineTfLiteModel(
+    tflite::BuiltinOperator redefineOperatorCode,
+    tflite::TensorType tensorType,
+    const std::vector<int32_t>& inputTensorShape,
+    const std::vector<int32_t>& outputTensorShape,
+    const std::vector<int32_t>& targetShape,
+    bool useOption = true,
+    float quantScale = 1.0f,
+    int quantOffset  = 0)
+{
+    using namespace tflite;
+    flatbuffers::FlatBufferBuilder flatBufferBuilder;
+    std::vector<flatbuffers::Offset<tflite::Buffer>> buffers;
+    buffers.push_back(CreateBuffer(flatBufferBuilder, flatBufferBuilder.CreateVector({})));
+    buffers.push_back(CreateBuffer(flatBufferBuilder, flatBufferBuilder.CreateVector({})));
+
+    auto quantizationParameters =
+        CreateQuantizationParameters(flatBufferBuilder,
+                                     0,
+                                     0,
+                                     flatBufferBuilder.CreateVector<float>({ quantScale }),
+                                     flatBufferBuilder.CreateVector<int64_t>({ quantOffset }));
+
+    auto inputTensor = CreateTensor(flatBufferBuilder,
+                                    flatBufferBuilder.CreateVector<int32_t>(inputTensorShape.data(),
+                                                                            inputTensorShape.size()),
+                                    tensorType,
+                                    0,
+                                    flatBufferBuilder.CreateString("input"),
+                                    quantizationParameters);
+
+    auto outputTensor = CreateTensor(flatBufferBuilder,
+                                     flatBufferBuilder.CreateVector<int32_t>(outputTensorShape.data(),
+                                                                             outputTensorShape.size()),
+                                     tensorType,
+                                     1,
+                                     flatBufferBuilder.CreateString("output"),
+                                     quantizationParameters);
+
+    std::vector<flatbuffers::Offset<Tensor>> tensors;
+    std::vector<int32_t> operatorInputs;
+    std::vector<int> subgraphInputs;
+    flatbuffers::Offset<void> operatorBuiltinOptions;
+
+    if (useOption)
+    {
+        tensors = { inputTensor, outputTensor};
+        operatorInputs = {{0}};
+        subgraphInputs = {{0}};
+        operatorBuiltinOptions = CreateReshapeOptions(
+            flatBufferBuilder,
+            flatBufferBuilder.CreateVector(targetShape.data(), targetShape.size())).Union();
+    }
+    else
+    {
+        buffers.push_back(
+            CreateBuffer(flatBufferBuilder,
+                         flatBufferBuilder.CreateVector(reinterpret_cast<const uint8_t*>(targetShape.data()),
+                                                        sizeof(int32_t) * targetShape.size())));
+        int32_t size = static_cast<int32_t>(targetShape.size());
+        auto shapeTensor = CreateTensor(flatBufferBuilder,
+                                        flatBufferBuilder.CreateVector<int32_t>( { size } ),
+                                        tflite::TensorType_INT32,
+                                        2,
+                                        flatBufferBuilder.CreateString("shape"));
+        tensors = { inputTensor, outputTensor, shapeTensor };
+        operatorInputs = {{ 0, 2 }};
+        subgraphInputs = {{ 0, 2 }};
+        operatorBuiltinOptions = CreateReshapeOptions(flatBufferBuilder).Union();
+    }
+
+    // create operator
+    tflite::BuiltinOptions operatorBuiltinOptionsType = BuiltinOptions_ReshapeOptions;
+
+    const std::vector<int32_t> operatorOutputs{{1}};
+    flatbuffers::Offset <Operator> redefineOperator =
+        CreateOperator(flatBufferBuilder,
+                       0,
+                       flatBufferBuilder.CreateVector<int32_t>(operatorInputs.data(), operatorInputs.size()),
+                       flatBufferBuilder.CreateVector<int32_t>(operatorOutputs.data(), operatorOutputs.size()),
+                       operatorBuiltinOptionsType,
+                       operatorBuiltinOptions);
+
+    const std::vector<int> subgraphOutputs{{1}};
+    flatbuffers::Offset <SubGraph> subgraph =
+        CreateSubGraph(flatBufferBuilder,
+                       flatBufferBuilder.CreateVector(tensors.data(), tensors.size()),
+                       flatBufferBuilder.CreateVector<int32_t>(subgraphInputs.data(), subgraphInputs.size()),
+                       flatBufferBuilder.CreateVector<int32_t>(subgraphOutputs.data(), subgraphOutputs.size()),
+                       flatBufferBuilder.CreateVector(&redefineOperator, 1));
+
+    flatbuffers::Offset <flatbuffers::String> modelDescription =
+        flatBufferBuilder.CreateString("ArmnnDelegate: Reshape Operator Model");
+    flatbuffers::Offset <OperatorCode> operatorCode = CreateOperatorCode(flatBufferBuilder,
+                                                                         redefineOperatorCode);
+
+    flatbuffers::Offset <Model> flatbufferModel =
+        CreateModel(flatBufferBuilder,
+                    TFLITE_SCHEMA_VERSION,
+                    flatBufferBuilder.CreateVector(&operatorCode, 1),
+                    flatBufferBuilder.CreateVector(&subgraph, 1),
+                    modelDescription,
+                    flatBufferBuilder.CreateVector(buffers.data(), buffers.size()));
+
+    flatBufferBuilder.Finish(flatbufferModel);
+
+    return std::vector<char>(flatBufferBuilder.GetBufferPointer(),
+                             flatBufferBuilder.GetBufferPointer() + flatBufferBuilder.GetSize());
+}
+
+template <typename T>
+void RedefineTest(tflite::BuiltinOperator redefineOperatorCode,
+                  tflite::TensorType tensorType,
+                  const std::vector<armnn::BackendId>& backends,
+                  const std::vector<int32_t>& inputShape,
+                  const std::vector<int32_t>& outputShape,
+                  std::vector<T>& inputValues,
+                  std::vector<T>& expectedOutputValues,
+                  std::vector<int32_t>& targetShape,
+                  bool useOption = true,
+                  float quantScale = 1.0f,
+                  int quantOffset  = 0)
+{
+    using namespace tflite;
+    std::vector<char> modelBuffer = CreateRedefineTfLiteModel(redefineOperatorCode,
+                                                              tensorType,
+                                                              inputShape,
+                                                              outputShape,
+                                                              targetShape,
+                                                              useOption,
+                                                              quantScale,
+                                                              quantOffset);
+
+    const Model* tfLiteModel = GetModel(modelBuffer.data());
+    CHECK(tfLiteModel != nullptr);
+    // Create TfLite Interpreters
+    std::unique_ptr<Interpreter> armnnDelegateInterpreter;
+    CHECK(InterpreterBuilder(tfLiteModel, ::tflite::ops::builtin::BuiltinOpResolver())
+                  (&armnnDelegateInterpreter) == kTfLiteOk);
+    CHECK(armnnDelegateInterpreter != nullptr);
+    CHECK(armnnDelegateInterpreter->AllocateTensors() == kTfLiteOk);
+
+    std::unique_ptr<Interpreter> tfLiteInterpreter;
+    CHECK(InterpreterBuilder(tfLiteModel, ::tflite::ops::builtin::BuiltinOpResolver())
+                  (&tfLiteInterpreter) == kTfLiteOk);
+    CHECK(tfLiteInterpreter != nullptr);
+    CHECK(tfLiteInterpreter->AllocateTensors() == kTfLiteOk);
+
+    // Create the ArmNN Delegate
+    armnnDelegate::DelegateOptions delegateOptions(backends);
+    std::unique_ptr<TfLiteDelegate, decltype(&armnnDelegate::TfLiteArmnnDelegateDelete)>
+        theArmnnDelegate(armnnDelegate::TfLiteArmnnDelegateCreate(delegateOptions),
+                         armnnDelegate::TfLiteArmnnDelegateDelete);
+    CHECK(theArmnnDelegate != nullptr);
+    // Modify armnnDelegateInterpreter to use armnnDelegate
+    CHECK(armnnDelegateInterpreter->ModifyGraphWithDelegate(theArmnnDelegate.get()) == kTfLiteOk);
+
+    // Set input data
+    armnnDelegate::FillInput<T>(tfLiteInterpreter, 0, inputValues);
+    armnnDelegate::FillInput<T>(armnnDelegateInterpreter, 0, inputValues);
+
+    // Run EnqueueWorkload
+    CHECK(tfLiteInterpreter->Invoke() == kTfLiteOk);
+    CHECK(armnnDelegateInterpreter->Invoke() == kTfLiteOk);
+
+    auto tfLiteDelegateOutputId = tfLiteInterpreter->outputs()[0];
+    auto tfLiteDelegateOutputData = tfLiteInterpreter->typed_tensor<T>(tfLiteDelegateOutputId);
+    auto tfLiteDelegateOutputTensor = tfLiteInterpreter->tensor(tfLiteDelegateOutputId);
+    auto armnnDelegateOutputId = armnnDelegateInterpreter->outputs()[0];
+    auto armnnDelegateOutputData = armnnDelegateInterpreter->typed_tensor<T>(armnnDelegateOutputId);
+    auto armnnDelegateOutputTensor = armnnDelegateInterpreter->tensor(armnnDelegateOutputId);
+
+    CHECK(outputShape.size() == tfLiteDelegateOutputTensor->dims->size);
+    CHECK(outputShape.size() == armnnDelegateOutputTensor->dims->size);
+
+    for (size_t i = 0; i < static_cast<size_t>(tfLiteDelegateOutputTensor->dims->size); i++)
+    {
+        CHECK(outputShape[i] == armnnDelegateOutputTensor->dims->data[i]);
+        CHECK(tfLiteDelegateOutputTensor->dims->data[i] == armnnDelegateOutputTensor->dims->data[i]);
+    }
+
+    for (size_t i = 0; i < expectedOutputValues.size(); i++)
+    {
+        CHECK(expectedOutputValues[i] == armnnDelegateOutputData[i]);
+        CHECK(tfLiteDelegateOutputData[i] == expectedOutputValues[i]);
+        CHECK(tfLiteDelegateOutputData[i] == armnnDelegateOutputData[i]);
+    }
+}
+
+} // anonymous namespace
\ No newline at end of file
diff --git a/delegate/src/test/ReshapeTest.cpp b/delegate/src/test/ReshapeTest.cpp
new file mode 100644
index 0000000..715fed6
--- /dev/null
+++ b/delegate/src/test/ReshapeTest.cpp
@@ -0,0 +1,449 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#include "RedefineTestHelper.hpp"
+
+#include <armnn_delegate.hpp>
+
+#include <flatbuffers/flatbuffers.h>
+#include <tensorflow/lite/schema/schema_generated.h>
+
+#include <doctest/doctest.h>
+
+namespace armnnDelegate
+{
+
+void ReshapeSimpleTest(std::vector<armnn::BackendId>& backends, bool useOption = true)
+{
+    // Set input data
+    std::vector<int32_t> inputShape { 1, 3, 4, 1 };
+    std::vector<int32_t> outputShape { 1, 3, 2, 2 };
+    std::vector<int32_t> targetShape { 1, 3, 2, 2 };
+
+    std::vector<float> inputValues = { -5.0f, 8.0f, -10.0f, 7.0f,
+                                       8.0f, 12.0f, -15.0f, 2.0f,
+                                       3.0f, -4.0f, -1.0f, -11.0f };
+
+    std::vector<float> expectedOutputValues = { -5.0f, 8.0f, -10.0f, 7.0f,
+                                                8.0f, 12.0f, -15.0f, 2.0f,
+                                                3.0f, -4.0f, -1.0f, -11.0f };
+
+    RedefineTest<float>(tflite::BuiltinOperator_RESHAPE,
+                        ::tflite::TensorType_FLOAT32,
+                        backends,
+                        inputShape,
+                        outputShape,
+                        inputValues,
+                        expectedOutputValues,
+                        targetShape,
+                        useOption);
+}
+
+void ReshapeReduceDimTest(std::vector<armnn::BackendId>& backends, bool useOption = true)
+{
+    // Set input data
+    std::vector<int32_t> inputShape { 1, 3, 4, 1 };
+    std::vector<int32_t> outputShape { 1, 4, 3 };
+    std::vector<int32_t> targetShape { 1, 4, 3 };
+
+    std::vector<float> inputValues = { -5.0f, 8.0f, -10.0f, 7.0f,
+                                       8.0f, 12.0f, -15.0f, 2.0f,
+                                       3.0f, -4.0f, -1.0f, -11.0f };
+
+    std::vector<float> expectedOutputValues = { -5.0f, 8.0f, -10.0f, 7.0f,
+                                                8.0f, 12.0f, -15.0f, 2.0f,
+                                                3.0f, -4.0f, -1.0f, -11.0f };
+
+    RedefineTest<float>(tflite::BuiltinOperator_RESHAPE,
+                        ::tflite::TensorType_FLOAT32,
+                        backends,
+                        inputShape,
+                        outputShape,
+                        inputValues,
+                        expectedOutputValues,
+                        targetShape,
+                        useOption);
+}
+
+void ReshapeFlattenTest(std::vector<armnn::BackendId>& backends, bool useOption = true)
+{
+    // Set input data
+    std::vector<int32_t> inputShape { 1, 3, 4, 1 };
+    std::vector<int32_t> outputShape { 6, 2 };
+    std::vector<int32_t> targetShape { -1, 2 };
+
+    std::vector<float> inputValues = { -5.0f, 8.0f, -10.0f, 7.0f,
+                                       8.0f, 12.0f, -15.0f, 2.0f,
+                                       3.0f, -4.0f, -1.0f, -11.0f };
+
+    std::vector<float> expectedOutputValues = { -5.0f, 8.0f, -10.0f, 7.0f,
+                                                8.0f, 12.0f, -15.0f, 2.0f,
+                                                3.0f, -4.0f, -1.0f, -11.0f };
+
+    RedefineTest<float>(tflite::BuiltinOperator_RESHAPE,
+                        ::tflite::TensorType_FLOAT32,
+                        backends,
+                        inputShape,
+                        outputShape,
+                        inputValues,
+                        expectedOutputValues,
+                        targetShape,
+                        useOption);
+}
+
+void ReshapeFlattenAllTest(std::vector<armnn::BackendId>& backends, bool useOption = true)
+{
+    // Set input data
+    std::vector<int32_t> inputShape { 1, 3, 4, 1 };
+    std::vector<int32_t> outputShape { 12 };
+    std::vector<int32_t> targetShape { -1 };
+
+    std::vector<float> inputValues = { -5.0f, 8.0f, -10.0f, 7.0f,
+                                       8.0f, 12.0f, -15.0f, 2.0f,
+                                       3.0f, -4.0f, -1.0f, -11.0f };
+
+    std::vector<float> expectedOutputValues = { -5.0f, 8.0f, -10.0f, 7.0f,
+                                                8.0f, 12.0f, -15.0f, 2.0f,
+                                                3.0f, -4.0f, -1.0f, -11.0f };
+
+    RedefineTest<float>(tflite::BuiltinOperator_RESHAPE,
+                        ::tflite::TensorType_FLOAT32,
+                        backends,
+                        inputShape,
+                        outputShape,
+                        inputValues,
+                        expectedOutputValues,
+                        targetShape,
+                        useOption);
+}
+
+void ReshapeInt8Test(std::vector<armnn::BackendId>& backends, bool useOption = true)
+{
+    // Set input data
+    std::vector<int32_t> inputShape { 1, 3, 4, 1 };
+    std::vector<int32_t> outputShape { 6, 2 };
+    std::vector<int32_t> targetShape { -1, 2 };
+
+    std::vector<int8_t> inputValues = { -5, 8, -10, 7,
+                                        8, 12, -15, 2,
+                                        3, -4, -1, -11 };
+
+    std::vector<int8_t> expectedOutputValues = { -5, 8, -10, 7,
+                                                 8, 12, -15, 2,
+                                                 3, -4, -1, -11 };
+
+    RedefineTest<int8_t>(tflite::BuiltinOperator_RESHAPE,
+                         ::tflite::TensorType_INT8,
+                         backends,
+                         inputShape,
+                         outputShape,
+                         inputValues,
+                         expectedOutputValues,
+                         targetShape,
+                         useOption,
+                         2.5f,
+                         1);
+}
+
+void ReshapeUint8Test(std::vector<armnn::BackendId>& backends, bool useOption = true)
+{
+    // Set input data
+    std::vector<int32_t> inputShape { 1, 3, 4, 1 };
+    std::vector<int32_t> outputShape { 6, 2 };
+    std::vector<int32_t> targetShape { -1, 2 };
+
+    std::vector<uint8_t> inputValues = { 5, 8, 10, 7,
+                                         8, 12, 15, 2,
+                                         3, 4, 1, 11 };
+
+    std::vector<uint8_t> expectedOutputValues = { 5, 8, 10, 7,
+                                                  8, 12, 15, 2,
+                                                  3, 4, 1, 11 };
+
+    RedefineTest<uint8_t>(tflite::BuiltinOperator_RESHAPE,
+                          ::tflite::TensorType_UINT8,
+                          backends,
+                          inputShape,
+                          outputShape,
+                          inputValues,
+                          expectedOutputValues,
+                          targetShape,
+                          useOption,
+                          2.5f,
+                          1);
+}
+
+void ReshapeInt16Test(std::vector<armnn::BackendId>& backends, bool useOption = true)
+{
+    // Set input data
+    std::vector<int32_t> inputShape { 1, 3, 4, 1 };
+    std::vector<int32_t> outputShape { 6, 2 };
+    std::vector<int32_t> targetShape { -1, 2 };
+
+    std::vector<int16_t> inputValues = { -5, 8, -10, 7,
+                                         8, 12, -15, 2,
+                                         3, -4, -1, -11 };
+
+    std::vector<int16_t> expectedOutputValues = { -5, 8, -10, 7,
+                                                  8, 12, -15, 2,
+                                                  3, -4, -1, -11 };
+
+    RedefineTest<int16_t>(tflite::BuiltinOperator_RESHAPE,
+                          ::tflite::TensorType_INT16,
+                          backends,
+                          inputShape,
+                          outputShape,
+                          inputValues,
+                          expectedOutputValues,
+                          targetShape,
+                          useOption,
+                          2.5f,
+                          0);
+}
+
+TEST_SUITE("Reshape_GpuAccTests")
+{
+
+TEST_CASE ("Reshape_Simple_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeSimpleTest(backends);
+}
+
+TEST_CASE ("Reshape_ReduceDimension_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeReduceDimTest(backends);
+}
+
+TEST_CASE ("Reshape_Flatten_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeFlattenTest(backends);
+}
+
+TEST_CASE ("Reshape_FlattenAll_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeFlattenAllTest(backends);
+}
+
+TEST_CASE ("Reshape_Int8_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeInt8Test(backends);
+}
+
+TEST_CASE ("Reshape_Uint8_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeUint8Test(backends);
+}
+
+TEST_CASE ("Reshape_Simple_ShapeTensor_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeSimpleTest(backends, false);
+}
+
+TEST_CASE ("Reshape_ReduceDimension_ShapeTensor_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeReduceDimTest(backends, false);
+}
+
+TEST_CASE ("Reshape_Flatten_ShapeTensor_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeFlattenTest(backends, false);
+}
+
+TEST_CASE ("Reshape_FlattenAll_ShapeTensor_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeFlattenAllTest(backends, false);
+}
+
+TEST_CASE ("Reshape_Int8_ShapeTensor_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeInt8Test(backends, false);
+}
+
+TEST_CASE ("Reshape_Uint8_ShapeTensor_GpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::GpuAcc };
+    ReshapeUint8Test(backends, false);
+}
+
+} // TEST_SUITE("Reshape_GpuAccTests")
+
+TEST_SUITE("Reshape_CpuAccTests")
+{
+
+TEST_CASE ("Reshape_Simple_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeSimpleTest(backends);
+}
+
+TEST_CASE ("Reshape_ReduceDimension_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeReduceDimTest(backends);
+}
+
+TEST_CASE ("Reshape_Flatten_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeFlattenTest(backends);
+}
+
+TEST_CASE ("Reshape_FlattenAll_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeFlattenAllTest(backends);
+}
+
+TEST_CASE ("Reshape_Int8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeInt8Test(backends);
+}
+
+TEST_CASE ("Reshape_Uint8_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeUint8Test(backends);
+}
+
+TEST_CASE ("Reshape_Simple_ShapeTensor_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeSimpleTest(backends, false);
+}
+
+TEST_CASE ("Reshape_ReduceDimension_ShapeTensor_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeReduceDimTest(backends, false);
+}
+
+TEST_CASE ("Reshape_Flatten_ShapeTensor_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeFlattenTest(backends, false);
+}
+
+TEST_CASE ("Reshape_FlattenAll_ShapeTensor_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeFlattenAllTest(backends, false);
+}
+
+TEST_CASE ("Reshape_Int8_ShapeTensor_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeInt8Test(backends, false);
+}
+
+TEST_CASE ("Reshape_Uint8_ShapeTensor_CpuAcc_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
+    ReshapeUint8Test(backends, false);
+}
+
+} // TEST_SUITE("Reshape_CpuAccTests")
+
+TEST_SUITE("Reshape_CpuRefTests")
+{
+
+TEST_CASE ("Reshape_Simple_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeSimpleTest(backends);
+}
+
+TEST_CASE ("Reshape_ReduceDimension_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeReduceDimTest(backends);
+}
+
+TEST_CASE ("Reshape_Flatten_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeFlattenTest(backends);
+}
+
+TEST_CASE ("Reshape_FlattenAll_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeFlattenAllTest(backends);
+}
+
+TEST_CASE ("Reshape_Int8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeInt8Test(backends);
+}
+
+TEST_CASE ("Reshape_Uint8_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeUint8Test(backends);
+}
+
+TEST_CASE ("Reshape_Int16_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeInt16Test(backends);
+}
+
+TEST_CASE ("Reshape_Simple_ShapeTensor_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeSimpleTest(backends, false);
+}
+
+TEST_CASE ("Reshape_ReduceDimension_ShapeTensor_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeReduceDimTest(backends, false);
+}
+
+TEST_CASE ("Reshape_Flatten_ShapeTensor_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeFlattenTest(backends, false);
+}
+
+TEST_CASE ("Reshape_FlattenAll_ShapeTensor_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeFlattenAllTest(backends, false);
+}
+
+TEST_CASE ("Reshape_Int8_ShapeTensor_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeInt8Test(backends, false);
+}
+
+TEST_CASE ("Reshape_Uint8_ShapeTensor_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeUint8Test(backends, false);
+}
+
+TEST_CASE ("Reshape_Int16_ShapeTensor_CpuRef_Test")
+{
+    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuRef };
+    ReshapeInt16Test(backends, false);
+}
+
+} // TEST_SUITE("Reshape_CpuRefTests")
+
+} // namespace armnnDelegate
\ No newline at end of file
diff --git a/delegate/src/test/TestUtils.cpp b/delegate/src/test/TestUtils.cpp
index cf3e1fe..31c05a6 100644
--- a/delegate/src/test/TestUtils.cpp
+++ b/delegate/src/test/TestUtils.cpp
@@ -8,6 +8,26 @@
 namespace armnnDelegate
 {
 
+
+
+void CompareData(bool tensor1[], bool tensor2[], size_t tensorSize)
+{
+    auto compareBool = [](auto a, auto b) {return (((a == 0) && (b == 0)) || ((a != 0) && (b != 0)));};
+    for (size_t i = 0; i < tensorSize; i++)
+    {
+        CHECK(compareBool(tensor1[i], tensor2[i]));
+    }
+}
+
+void CompareData(std::vector<bool>& tensor1, bool tensor2[], size_t tensorSize)
+{
+    auto compareBool = [](auto a, auto b) {return (((a == 0) && (b == 0)) || ((a != 0) && (b != 0)));};
+    for (size_t i = 0; i < tensorSize; i++)
+    {
+        CHECK(compareBool(tensor1[i], tensor2[i]));
+    }
+}
+
 void CompareData(float tensor1[], float tensor2[], size_t tensorSize)
 {
     for (size_t i = 0; i < tensorSize; i++)
diff --git a/delegate/src/test/TestUtils.hpp b/delegate/src/test/TestUtils.hpp
index d805f70..57ae3ce 100644
--- a/delegate/src/test/TestUtils.hpp
+++ b/delegate/src/test/TestUtils.hpp
@@ -25,16 +25,28 @@
     }
 }
 
-// Can be used to compare data with a tolerance depending on their data type
+/// Can be used to compare bool data coming from a tflite interpreter.
+/// std::vector<bool> stores its elements as bits rather than as individually addressable bools, so its
+/// contents cannot be accessed through a pointer to bool. A dedicated overload is therefore required to
+/// compare a vector of bool against a bool array.
+void CompareData(std::vector<bool>& tensor1, bool tensor2[], size_t tensorSize);
+void CompareData(bool tensor1[], bool tensor2[], size_t tensorSize);
+
+/// Can be used to compare float data coming from a tflite interpreter with a tolerance of limit_of_float*100
 void CompareData(float tensor1[], float tensor2[], size_t tensorSize);
+
+/// Can be used to compare int8_t data coming from a tflite interpreter with a tolerance of 1
 void CompareData(int8_t tensor1[], int8_t tensor2[], size_t tensorSize);
+
+/// Can be used to compare uint8_t data coming from a tflite interpreter with a tolerance of 1
 void CompareData(uint8_t tensor1[], uint8_t tensor2[], size_t tensorSize);
+
+/// Can be used to compare int16_t data coming from a tflite interpreter with a tolerance of 1
 void CompareData(int16_t tensor1[], int16_t tensor2[], size_t tensorSize);
 
 
-// Can be used to compare the output tensor shape and values
-// from armnnDelegateInterpreter and tfLiteInterpreter.
-// Example usage can be found in ControlTestHelper.hpp
+/// Can be used to compare the output tensor shape and values
+/// from armnnDelegateInterpreter and tfLiteInterpreter.
+/// Example usage can be found in ControlTestHelper.hpp
 template <typename T>
 void CompareOutputData(std::unique_ptr<tflite::Interpreter>& tfLiteInterpreter,
                        std::unique_ptr<tflite::Interpreter>& armnnDelegateInterpreter,
diff --git a/docs/Doxyfile b/docs/Doxyfile
index 7d20d0a..0c0b3e0 100644
--- a/docs/Doxyfile
+++ b/docs/Doxyfile
@@ -61,7 +61,7 @@
 # could be handy for archiving the generated documentation or if some version
 # control system is used.
 
-PROJECT_NUMBER         = 20.08
+PROJECT_NUMBER         = 20.11
 
 # Using the PROJECT_BRIEF tag one can provide an optional one line description
 # for a project that appears at the top of each page and should give viewer a
diff --git a/python/pyarmnn/examples/object_detection/README.md b/python/pyarmnn/examples/object_detection/README.md
index 5d40163..ea00a36 100644
--- a/python/pyarmnn/examples/object_detection/README.md
+++ b/python/pyarmnn/examples/object_detection/README.md
@@ -3,7 +3,7 @@
 ## Introduction
 This sample application guides the user and shows how to perform object detection using PyArmNN API. We assume the user has already built PyArmNN by following the instructions of the README in the main PyArmNN directory.
 
-We provide example scripts for performing object detection from video file and video stream with `run_video_file.py` and `run_video_stream.py`. 
+We provide example scripts for performing object detection from video file and video stream with `run_video_file.py` and `run_video_stream.py`.
 
 The application takes a model and video file or camera feed as input, runs inference on each frame, and draws bounding boxes around detected objects, with the corresponding labels and confidence scores overlaid.
 
@@ -49,17 +49,24 @@
 # Performing Object Detection
 
 ## Object Detection from Video File
-The `run_video_file.py` example takes a video file as input, runs inference on each frame, and produces frames with bounding boxes drawn around detected objects. The processed frames are written to video file. 
+The `run_video_file.py` example takes a video file as input, runs inference on each frame, and produces frames with bounding boxes drawn around detected objects. The processed frames are written to video file.
 
 The user can specify these arguments at command line:
 
 * `--video_file_path` - <b>Required:</b> Path to the video file to run object detection on
+
 * `--model_file_path` - <b>Required:</b> Path to <b>.tflite, .pb</b> or <b>.onnx</b> object detection model
+
 * `--model_name` - <b>Required:</b> The name of the model being used. Assembles the workflow for the input model. The examples support the model names:
-    * `ssd_mobilenet_v1`
-    * `yolo_v3_tiny`
-* `--label_path` - Path to labels file for the specified model file. Labels are provided for above model names
+
+  * `ssd_mobilenet_v1`
+
+  * `yolo_v3_tiny`
+
+* `--label_path` - <b>Required:</b> Path to the labels file for the specified model
+
 * `--output_video_file_path` - Path to the output video file with detections added in
+
 * `--preferred_backends` - You can specify one or more backend in order of preference. Accepted backends include `CpuAcc, GpuAcc, CpuRef`. Arm NN will decide which layers of the network are supported by the backend, falling back to the next if a layer is unsupported. Defaults to `['CpuAcc', 'CpuRef']`
 
 
@@ -74,11 +81,17 @@
 The user can specify these arguments at command line:
 
 * `--video_source` - Device index to access video stream. Defaults to primary device camera at index 0
+
 * `--model_file_path` - <b>Required:</b> Path to <b>.tflite, .pb</b> or <b>.onnx</b> object detection model
+
 * `--model_name` - <b>Required:</b> The name of the model being used. Assembles the workflow for the input model. The examples support the model names:
-    * `ssd_mobilenet_v1`
-    * `yolo_v3_tiny`
-* `--label_path` - Path to labels file for the specified model file. Labels are provided for above model names
+
+  * `ssd_mobilenet_v1`
+
+  * `yolo_v3_tiny`
+
+* `--label_path` - <b>Required:</b> Path to the labels file for the specified model
+
 * `--preferred_backends` - You can specify one or more backend in order of preference. Accepted backends include `CpuAcc, GpuAcc, CpuRef`. Arm NN will decide which layers of the network are supported by the backend, falling back to the next if a layer is unsupported. Defaults to `['CpuAcc', 'CpuRef']`
 
 
@@ -87,32 +100,29 @@
 $ python run_video_stream.py --model_file_path <model_file_path> --model_name <model_name>
 ```
 
+This application has been verified to work against the MobileNet SSD model, which can be downloaded along with its label set from:
+
+* https://storage.googleapis.com/download.tensorflow.org/models/tflite/coco_ssd_mobilenet_v1_1.0_quant_2018_06_29.zip
+
 ## Implementing Your Own Network
 The examples provide support for `ssd_mobilenet_v1` and `yolo_v3_tiny` models. However, the user is able to add their own network to the object detection scripts by following the steps:
 
 1. Create a new file for your network, for example `network.py`, to contain functions to process the output of the model
 2. In that file, the user will need to write a function that decodes the output vectors obtained from running inference on their network and return the bounding box positions of detected objects plus their class index and confidence. Additionally, include a function that returns a resize factor that will scale the obtained bounding boxes to their correct positions in the original frame
 3. Import the functions into the main file and, such as with the provided networks, add a conditional statement to the `get_model_processing()` function with the new model name and functions
-4. The labels associated with the model can then either be included inside the conditional statement or passed in with `--label_path` argument when executing the main script
+4. The labels associated with the model can then be passed in with the `--label_path` argument (see the sketch below)
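+
+For illustration, a minimal sketch of steps 1-3 follows. The module name `my_network.py`, the model name
+`my_model` and its assumed `(boxes, classes, scores, num_detections)` output layout are hypothetical and are
+not part of the provided examples:
+
+```python
+# my_network.py -- hypothetical post-processing helpers for a model named 'my_model'
+import cv2
+
+def my_model_processing(output):
+    """Decode raw output tensors into (box, class_index, confidence) tuples."""
+    boxes, classes, scores, num_detections = output
+    return [(boxes[0][i], int(classes[0][i]), scores[0][i])
+            for i in range(int(num_detections[0]))]
+
+def my_model_resize_factor(video, network_input_size=300):
+    """Return the factor that scales network-space boxes back to the original frame."""
+    frame_height = video.get(cv2.CAP_PROP_FRAME_HEIGHT)
+    frame_width = video.get(cv2.CAP_PROP_FRAME_WIDTH)
+    return max(frame_height, frame_width) / network_input_size
+```
+
+The new functions would then be imported into the main script and wired into `get_model_processing()` with an
+additional `elif model_name == 'my_model':` branch that returns `my_model_processing` and
+`my_model_resize_factor(video)`.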
 
 ---
 
 # Application Overview
+
 This section provides a walkthrough of the application, explaining in detail the steps:
+
 1. Initialisation
-    1.1. Reading from Video Source
-    1.2. Preparing Labels and Model Specific Functions
 2. Creating a Network
-    2.1. Creating Parser and Importing Graph
-    2.2. Optimizing Graph for Compute Device
-    2.3. Creating Input and Output Binding Information
 3. Preparing the Workload Tensors
-    3.1. Preprocessing the Captured Frame
-    3.2. Making Input and Output Tensors
 4. Executing Inference
 5. Postprocessing
-    5.1. Decoding and Processing Inference Output
-    5.2. Drawing Bounding Boxes
 
 
 ### Initialisation
@@ -133,16 +143,16 @@
 ##### Creating Parser and Importing Graph
 The first step with PyArmNN is to import a graph from file by using the appropriate parser.
 
-The Arm NN SDK provides parsers for reading graphs from a variety of model formats. In our application we specifically focus on `.tflite, .pb, .onnx` models. 
+The Arm NN SDK provides parsers for reading graphs from a variety of model formats. In our application we specifically focus on `.tflite, .pb, .onnx` models.
 
 Based on the extension of the provided model file, the corresponding parser is created and the network file loaded with `CreateNetworkFromBinaryFile()` function. The parser will handle the creation of the underlying Arm NN graph.
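+
+As a minimal, illustrative sketch (not taken from the example scripts), importing a `.tflite` model with
+PyArmNN can look like the following; the model file name is a placeholder:
+
+```python
+import pyarmnn as ann
+
+# Create the TfLite parser and import the graph from file
+parser = ann.ITfLiteParser()
+network = parser.CreateNetworkFromBinaryFile('./ssd_mobilenet_v1.tflite')
+
+# Read the binding information of the first input tensor of subgraph 0
+graph_id = 0
+input_names = parser.GetSubgraphInputTensorNames(graph_id)
+input_binding_info = parser.GetNetworkInputBindingInfo(graph_id, input_names[0])
+```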
 
 ##### Optimizing Graph for Compute Device
 Arm NN supports optimized execution on multiple CPU and GPU devices. Prior to executing a graph, we must select the appropriate device context. We do this by creating a runtime context with default options with `IRuntime()`.
 
-We can optimize the imported graph by specifying a list of backends in order of preference and implement backend-specific optimizations. The backends are identified by a string unique to the backend, for example `CpuAcc, GpuAcc, CpuRef`. 
+We can optimize the imported graph by specifying a list of backends in order of preference and implement backend-specific optimizations. The backends are identified by a string unique to the backend, for example `CpuAcc, GpuAcc, CpuRef`.
 
-Internally and transparently, Arm NN splits the graph into subgraph based on backends, it calls a optimize subgraphs function on each of them and, if possible, substitutes the corresponding subgraph in the original graph with its optimized version. 
+Internally and transparently, Arm NN splits the graph into subgraphs based on the specified backends, calls an optimization function on each of them and, if possible, substitutes the corresponding subgraph in the original graph with its optimized version.
 
 Using the `Optimize()` function we optimize the graph for inference and load the optimized network onto the compute device with `LoadNetwork()`. This function creates the backend-specific workloads for the layers and a backend specific workload factory which is called to create the workloads.
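+
+A minimal sketch of these steps with PyArmNN, reusing the `network` object from the parsing sketch above and
+mirroring the default backend preference of the examples:
+
+```python
+import pyarmnn as ann
+
+# Create a runtime context with default options
+options = ann.CreationOptions()
+runtime = ann.IRuntime(options)
+
+# Backends in order of preference
+preferred_backends = [ann.BackendId('CpuAcc'), ann.BackendId('CpuRef')]
+
+# Optimize the parsed network and load it onto the compute device
+opt_network, messages = ann.Optimize(network, preferred_backends,
+                                     runtime.GetDeviceSpec(), ann.OptimizerOptions())
+net_id, _ = runtime.LoadNetwork(opt_network)
+```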
 
@@ -157,7 +167,7 @@
 ### Preparing the Workload Tensors
 
 ##### Preprocessing the Captured Frame
-Each frame captured from source is read as an `ndarray` in BGR format and therefore has to be preprocessed before being passed into the network. 
+Each frame captured from source is read as an `ndarray` in BGR format and therefore has to be preprocessed before being passed into the network.
 
 This preprocessing step consists of swapping channels (BGR to RGB in this example), resizing the frame to the required resolution, expanding dimensions of the array and doing data type conversion to match the model input layer. This information about the input tensor can be readily obtained from reading the `input_binding_info`. For example, SSD MobileNet V1 takes for input a tensor with shape `[1, 300, 300, 3]` and data type `uint8`.
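+
+A short, illustrative sketch of such a preprocessing step for the SSD MobileNet V1 input described above (the
+helper name is chosen here for illustration only):
+
+```python
+import cv2
+import numpy as np
+
+def preprocess_frame(frame, width=300, height=300, dtype=np.uint8):
+    """Convert a captured BGR frame into the layout and data type the network expects."""
+    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # swap channels: BGR -> RGB
+    resized = cv2.resize(rgb, (width, height))    # match the required input resolution
+    batched = np.expand_dims(resized, axis=0)     # add the batch dimension -> [1, H, W, 3]
+    return batched.astype(dtype)                  # match the model's input data type
+```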
 
@@ -172,7 +182,7 @@
 ### Postprocessing
 
 ##### Decoding and Processing Inference Output
-The output from inference must be decoded to obtain information about detected objects in the frame. In the examples there are implementations for two networks but you may also implement your own network decoding solution here. Please refer to <i>Implementing Your Own Network</i> section of this document to learn how to do this. 
+The output from inference must be decoded to obtain information about detected objects in the frame. The examples include implementations for two networks, but you may also implement your own decoding solution here. Please refer to the <i>Implementing Your Own Network</i> section of this document to learn how to do this.
 
 For SSD MobileNet V1 models, we decode the results to obtain the bounding box positions, classification index, confidence and number of detections in the input frame.
 
diff --git a/python/pyarmnn/examples/object_detection/run_video_file.py b/python/pyarmnn/examples/object_detection/run_video_file.py
index fc3e214..e31b779 100644
--- a/python/pyarmnn/examples/object_detection/run_video_file.py
+++ b/python/pyarmnn/examples/object_detection/run_video_file.py
@@ -36,11 +36,9 @@
         Model labels, decoding and processing functions.
     """
     if model_name == 'ssd_mobilenet_v1':
-        labels = os.path.join(script_dir, 'ssd_labels.txt')
-        return labels, ssd_processing, ssd_resize_factor(video)
+        return ssd_processing, ssd_resize_factor(video)
     elif model_name == 'yolo_v3_tiny':
-        labels = os.path.join(script_dir, 'yolo_labels.txt')
-        return labels, yolo_processing, yolo_resize_factor(video, input_binding_info)
+        return yolo_processing, yolo_resize_factor(video, input_binding_info)
     else:
         raise ValueError(f'{model_name} is not a valid model name')
 
@@ -49,8 +47,8 @@
     video, video_writer, frame_count = init_video_file_capture(args.video_file_path, args.output_video_file_path)
 
     executor = ArmnnNetworkExecutor(args.model_file_path, args.preferred_backends)
-    labels, process_output, resize_factor = get_model_processing(args.model_name, video, executor.input_binding_info)
-    labels = dict_labels(labels if args.label_path is None else args.label_path, include_rgb=True)
+    process_output, resize_factor = get_model_processing(args.model_name, video, executor.input_binding_info)
+    labels = dict_labels(args.label_path, include_rgb=True)
 
     for _ in tqdm(frame_count, desc='Processing frames'):
         frame_present, frame = video.read()
@@ -73,7 +71,7 @@
                         help='Path to the Object Detection model to use')
     parser.add_argument('--model_name', required=True, type=str,
                         help='The name of the model being used. Accepted options: ssd_mobilenet_v1, yolo_v3_tiny')
-    parser.add_argument('--label_path', type=str,
+    parser.add_argument('--label_path', required=True, type=str,
                         help='Path to the labelset for the provided model file')
     parser.add_argument('--output_video_file_path', type=str,
                         help='Path to the output video file with detections added in')
diff --git a/python/pyarmnn/examples/object_detection/run_video_stream.py b/python/pyarmnn/examples/object_detection/run_video_stream.py
index 9a303e8..8635a40 100644
--- a/python/pyarmnn/examples/object_detection/run_video_stream.py
+++ b/python/pyarmnn/examples/object_detection/run_video_stream.py
@@ -36,11 +36,9 @@
         Model labels, decoding and processing functions.
     """
     if model_name == 'ssd_mobilenet_v1':
-        labels = os.path.join(script_dir, 'ssd_labels.txt')
-        return labels, ssd_processing, ssd_resize_factor(video)
+        return ssd_processing, ssd_resize_factor(video)
     elif model_name == 'yolo_v3_tiny':
-        labels = os.path.join(script_dir, 'yolo_labels.txt')
-        return labels, yolo_processing, yolo_resize_factor(video, input_binding_info)
+        return yolo_processing, yolo_resize_factor(video, input_binding_info)
     else:
         raise ValueError(f'{model_name} is not a valid model name')
 
@@ -49,8 +47,8 @@
     video = init_video_stream_capture(args.video_source)
     executor = ArmnnNetworkExecutor(args.model_file_path, args.preferred_backends)
 
-    labels, process_output, resize_factor = get_model_processing(args.model_name, video, executor.input_binding_info)
-    labels = dict_labels(labels if args.label_path is None else args.label_path, include_rgb=True)
+    process_output, resize_factor = get_model_processing(args.model_name, video, executor.input_binding_info)
+    labels = dict_labels(args.label_path, include_rgb=True)
 
     while True:
         frame_present, frame = video.read()
@@ -77,7 +75,7 @@
                         help='Path to the Object Detection model to use')
     parser.add_argument('--model_name', required=True, type=str,
                         help='The name of the model being used. Accepted options: ssd_mobilenet_v1, yolo_v3_tiny')
-    parser.add_argument('--label_path', type=str,
+    parser.add_argument('--label_path', required=True, type=str,
                         help='Path to the labelset for the provided model file')
     parser.add_argument('--preferred_backends', type=str, nargs='+', default=['CpuAcc', 'CpuRef'],
                         help='Takes the preferred backends in preference order, separated by whitespace, '
diff --git a/python/pyarmnn/examples/object_detection/ssd_labels.txt b/python/pyarmnn/examples/object_detection/ssd_labels.txt
deleted file mode 100644
index 5378c6c..0000000
--- a/python/pyarmnn/examples/object_detection/ssd_labels.txt
+++ /dev/null
@@ -1,91 +0,0 @@
-person
-bicycle
-car
-motorcycle
-airplane
-bus
-train
-truck
-boat
-traffic light
-fire hydrant
-street sign
-stop sign
-parking meter
-bench
-bird
-cat
-dog
-horse
-sheep
-cow
-elephant
-bear
-zebra
-giraffe
-hat
-backpack
-umbrella
-shoe
-eye glasses
-handbag
-tie
-suitcase
-frisbee
-skis
-snowboard
-sports ball
-kite
-baseball bat
-baseball glove
-skateboard
-surfboard
-tennis racket
-bottle
-plate
-wine glass
-cup
-fork
-knife
-spoon
-bowl
-banana
-apple
-sandwich
-orange
-broccoli
-carrot
-hot dog
-pizza
-donut
-cake
-chair
-couch
-potted plant
-bed
-mirror
-dining table
-window
-desk
-toilet
-door
-tv
-laptop
-mouse
-remote
-keyboard
-cell phone
-microwave
-oven
-toaster
-sink
-refrigerator
-blender
-book
-clock
-vase
-scissors
-teddy bear
-hair drier
-toothbrush
-hair brush
\ No newline at end of file
diff --git a/python/pyarmnn/examples/object_detection/yolo_labels.txt b/python/pyarmnn/examples/object_detection/yolo_labels.txt
deleted file mode 100644
index c5b80f7..0000000
--- a/python/pyarmnn/examples/object_detection/yolo_labels.txt
+++ /dev/null
@@ -1,80 +0,0 @@
-person
-bicycle
-car
-motorcycle
-airplane
-bus
-train
-truck
-boat
-traffic light
-fire hydrant
-stop sign
-parking meter
-bench
-bird
-cat
-dog
-horse
-sheep
-cow
-elephant
-bear
-zebra
-giraffe
-backpack
-umbrella
-handbag
-tie
-suitcase
-frisbee
-skis
-snowboard
-sports ball
-kite
-baseball bat
-baseball glove
-skateboard
-surfboard
-tennis racket
-bottle
-wine glass
-cup
-fork
-knife
-spoon
-bowl
-banana
-apple
-sandwich
-orange
-broccoli
-carrot
-hot dog
-pizza
-donut
-cake
-chair
-couch
-potted plant
-bed
-dining table
-toilet
-tv
-laptop
-mouse
-remote
-keyboard
-cell phone
-microwave
-oven
-toaster
-sink
-refrigerator
-book
-clock
-vase
-scissors
-teddy bear
-hair drier
-toothbrush
\ No newline at end of file
diff --git a/scripts/get_compute_library.sh b/scripts/get_compute_library.sh
index d9c1293..291f08a 100755
--- a/scripts/get_compute_library.sh
+++ b/scripts/get_compute_library.sh
@@ -7,10 +7,10 @@
 CMD=$( basename $0 )
 
 # For pinning to a ref use this:
-#DEFAULT_CLFRAMEWORKREVISION="branches/arm_compute_20_08" # Release 20.08
+DEFAULT_CLFRAMEWORKREVISION="branches/arm_compute_20_11" # Release 20.11
 #
 # For pinning to a revision use this:
-DEFAULT_CLFRAMEWORKREVISION="04a0706dddc6ca24cb80e3e0789c6b0f54c48b28" #COMPMID-3979 Sanitise Padding Removal epic
+#DEFAULT_CLFRAMEWORKREVISION="75eea338eb232ebdafa2fb84d22e711b5f964785" #COMPMID-3961: Add Logical OR/AND/NOT operator on CL
 
 usage() {
     echo "Usage: $CMD (Use the default clframework SHA)"
diff --git a/src/armnn/LoadedNetwork.cpp b/src/armnn/LoadedNetwork.cpp
index 7ef7f5f..f851910 100644
--- a/src/armnn/LoadedNetwork.cpp
+++ b/src/armnn/LoadedNetwork.cpp
@@ -255,6 +255,11 @@
         }
     }
 
+    for (auto&& workloadFactory : m_WorkloadFactories)
+    {
+        workloadFactory.second.first->AfterWorkloadsCreated();
+    }
+
     if (timelineUtils)
     {
         // Commit to send the post-optimisation network structure
diff --git a/src/armnn/layers/ElementwiseUnaryLayer.cpp b/src/armnn/layers/ElementwiseUnaryLayer.cpp
index 74fa16e..8c94106 100644
--- a/src/armnn/layers/ElementwiseUnaryLayer.cpp
+++ b/src/armnn/layers/ElementwiseUnaryLayer.cpp
@@ -24,6 +24,11 @@
 {
     ElementwiseUnaryQueueDescriptor descriptor;
 
+    if (descriptor.m_Parameters.m_Operation == UnaryOperation::LogicalNot)
+    {
+        return factory.CreateLogicalUnary(descriptor, PrepInfoAndDesc(descriptor));
+    }
+
     return factory.CreateElementwiseUnary(descriptor, PrepInfoAndDesc(descriptor));
 }
 
diff --git a/src/armnnDeserializer/DeserializerSupport.md b/src/armnnDeserializer/DeserializerSupport.md
index 4e2ead4..2ff3a24 100644
--- a/src/armnnDeserializer/DeserializerSupport.md
+++ b/src/armnnDeserializer/DeserializerSupport.md
@@ -29,6 +29,7 @@
 * Input
 * InstanceNormalization
 * L2Normalization
+* Logical
 * LogSoftmax
 * Lstm
 * Maximum
diff --git a/src/armnnSerializer/SerializerSupport.md b/src/armnnSerializer/SerializerSupport.md
index 4383353..67dc5d1 100644
--- a/src/armnnSerializer/SerializerSupport.md
+++ b/src/armnnSerializer/SerializerSupport.md
@@ -28,6 +28,7 @@
 * Input
 * InstanceNormalization
 * L2Normalization
+* Logical
 * LogSoftmax
 * Lstm
 * Maximum
diff --git a/src/armnnTfLiteParser/TensorFlowLiteSupport.md b/src/armnnTfLiteParser/TensorFlowLiteSupport.md
index 9718b22..faad3d0 100644
--- a/src/armnnTfLiteParser/TensorFlowLiteSupport.md
+++ b/src/armnnTfLiteParser/TensorFlowLiteSupport.md
@@ -104,6 +104,8 @@
 
 * FSRCNN
 
+* EfficientNet-lite
+
 * RDN converted from [TensorFlow model](https://github.com/hengchuan/RDN-TensorFlow)
 
 * Quantized RDN (CpuRef)
diff --git a/src/backends/backendsCommon/WorkloadData.cpp b/src/backends/backendsCommon/WorkloadData.cpp
index 676559c..530dc48 100644
--- a/src/backends/backendsCommon/WorkloadData.cpp
+++ b/src/backends/backendsCommon/WorkloadData.cpp
@@ -1675,7 +1675,8 @@
         DataType::QAsymmS8,
         DataType::QAsymmU8,
         DataType::QSymmS16,
-        DataType::Signed32
+        DataType::Signed32,
+        DataType::Boolean
     };
 
     ValidateDataTypes(inputTensorInfo, supportedTypes, descriptorName);
diff --git a/src/backends/backendsCommon/WorkloadFactory.hpp b/src/backends/backendsCommon/WorkloadFactory.hpp
index df08b9a..2e813e9 100644
--- a/src/backends/backendsCommon/WorkloadFactory.hpp
+++ b/src/backends/backendsCommon/WorkloadFactory.hpp
@@ -23,6 +23,8 @@
 public:
     virtual ~IWorkloadFactory() { }
 
+    virtual void AfterWorkloadsCreated() {};
+
     virtual const BackendId& GetBackendId() const = 0;
 
     static bool IsLayerSupported(const BackendId& backendId,
diff --git a/src/backends/backendsCommon/WorkloadFactoryBase.hpp b/src/backends/backendsCommon/WorkloadFactoryBase.hpp
index bfdb5e9..2952023 100644
--- a/src/backends/backendsCommon/WorkloadFactoryBase.hpp
+++ b/src/backends/backendsCommon/WorkloadFactoryBase.hpp
@@ -119,6 +119,10 @@
             RsqrtQueueDescriptor rsqrtDescriptor;
             return CreateRsqrt(rsqrtDescriptor, info);
         }
+        else if (descriptor.m_Parameters.m_Operation == UnaryOperation::LogicalNot)
+        {
+            return CreateLogicalUnary(descriptor, info);
+        }
         return nullptr;
     }
 
diff --git a/src/backends/backendsCommon/test/BackendProfilingTests.cpp b/src/backends/backendsCommon/test/BackendProfilingTests.cpp
index 377ca22..4b0d26e 100644
--- a/src/backends/backendsCommon/test/BackendProfilingTests.cpp
+++ b/src/backends/backendsCommon/test/BackendProfilingTests.cpp
@@ -123,15 +123,22 @@
     // Create a runtime
     armnn::Runtime runtime(options);
 
+    unsigned int shiftedId = 0;
+
+#if defined(ETHOSN_SUPPORT_ENABLED)
+    // Shift the id as ETHOSN is enabled.
+    shiftedId = 4;
+#endif
+
     // Check if the MockBackends 3 dummy counters {0, 1, 2-5 (four cores)} are registered
     armnn::BackendId mockId = armnn::MockBackendId();
     const armnn::profiling::ICounterMappings& counterMap = GetProfilingService(&runtime).GetCounterMappings();
-    BOOST_CHECK(counterMap.GetGlobalId(0, mockId) == 5);
-    BOOST_CHECK(counterMap.GetGlobalId(1, mockId) == 6);
-    BOOST_CHECK(counterMap.GetGlobalId(2, mockId) == 7);
-    BOOST_CHECK(counterMap.GetGlobalId(3, mockId) == 8);
-    BOOST_CHECK(counterMap.GetGlobalId(4, mockId) == 9);
-    BOOST_CHECK(counterMap.GetGlobalId(5, mockId) == 10);
+    BOOST_CHECK(counterMap.GetGlobalId(0, mockId) == 5 + shiftedId);
+    BOOST_CHECK(counterMap.GetGlobalId(1, mockId) == 6 + shiftedId);
+    BOOST_CHECK(counterMap.GetGlobalId(2, mockId) == 7 + shiftedId);
+    BOOST_CHECK(counterMap.GetGlobalId(3, mockId) == 8 + shiftedId);
+    BOOST_CHECK(counterMap.GetGlobalId(4, mockId) == 9 + shiftedId);
+    BOOST_CHECK(counterMap.GetGlobalId(5, mockId) == 10 + shiftedId);
     options.m_ProfilingOptions.m_EnableProfiling = false;
     GetProfilingService(&runtime).ResetExternalProfilingOptions(options.m_ProfilingOptions, true);
 }
diff --git a/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp b/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp
index 7c7ad5f..1492a80 100644
--- a/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp
+++ b/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp
@@ -907,6 +907,70 @@
 }
 
 template<typename FactoryType, armnn::DataType InputDataType , armnn::DataType OutputDataType>
+bool IsLogicalBinaryLayerSupportedTests(std::string& reasonIfUnsupported)
+{
+    armnn::Graph graph;
+    armnn::LogicalBinaryDescriptor desc(armnn::LogicalBinaryOperation::LogicalOr);
+
+    armnn::Layer* const input0 = graph.AddLayer<armnn::InputLayer>(0, "input0");
+    armnn::Layer* const input1 = graph.AddLayer<armnn::InputLayer>(1, "input1");
+
+    armnn::Layer* const layer = graph.AddLayer<armnn::LogicalBinaryLayer>(desc, "logicalOrLayer");
+
+    armnn::Layer* const output = graph.AddLayer<armnn::OutputLayer>(0, "output1");
+
+    armnn::TensorInfo inputTensorInfo0({1, 1, 1, 4}, InputDataType);
+    armnn::TensorInfo inputTensorInfo1({1, 1, 1, 4}, InputDataType);
+
+    armnn::TensorInfo outputTensorInfo({1, 1, 1, 4}, OutputDataType);
+
+    input0->GetOutputSlot(0).Connect(layer->GetInputSlot(0));
+    input1->GetOutputSlot(0).Connect(layer->GetInputSlot(1));
+
+    input0->GetOutputHandler(0).SetTensorInfo(inputTensorInfo0);
+    input1->GetOutputHandler(0).SetTensorInfo(inputTensorInfo1);
+
+    layer->GetOutputSlot(0).Connect(output->GetInputSlot(0));
+    layer->GetOutputHandler(0).SetTensorInfo(outputTensorInfo);
+
+    bool result = FactoryType::IsLayerSupported(*layer, InputDataType, reasonIfUnsupported);
+
+    return result;
+}
+
+template<typename FactoryType, armnn::DataType InputDataType , armnn::DataType OutputDataType>
+bool IsLogicalBinaryLayerBroadcastSupportedTests(std::string& reasonIfUnsupported)
+{
+    armnn::Graph graph;
+    armnn::LogicalBinaryDescriptor desc(armnn::LogicalBinaryOperation::LogicalAnd);
+
+    armnn::Layer* const input0 = graph.AddLayer<armnn::InputLayer>(0, "input0");
+    armnn::Layer* const input1 = graph.AddLayer<armnn::InputLayer>(1, "input1");
+
+    armnn::Layer* const layer = graph.AddLayer<armnn::LogicalBinaryLayer>(desc, "logicalAndLayer");
+
+    armnn::Layer* const output = graph.AddLayer<armnn::OutputLayer>(0, "output2");
+
+    armnn::TensorInfo inputTensorInfo0({1, 1, 1, 4}, InputDataType);
+    armnn::TensorInfo inputTensorInfo1({1, 1, 1, 1}, InputDataType);
+
+    armnn::TensorInfo outputTensorInfo({1, 1, 1, 4}, OutputDataType);
+
+    input0->GetOutputSlot(0).Connect(layer->GetInputSlot(0));
+    input1->GetOutputSlot(0).Connect(layer->GetInputSlot(1));
+
+    input0->GetOutputHandler(0).SetTensorInfo(inputTensorInfo0);
+    input1->GetOutputHandler(0).SetTensorInfo(inputTensorInfo1);
+
+    layer->GetOutputSlot(0).Connect(output->GetInputSlot(0));
+    layer->GetOutputHandler(0).SetTensorInfo(outputTensorInfo);
+
+    bool result = FactoryType::IsLayerSupported(*layer, InputDataType, reasonIfUnsupported);
+
+    return result;
+}
+
+template<typename FactoryType, armnn::DataType InputDataType , armnn::DataType OutputDataType>
 bool IsMeanLayerSupportedTests(std::string& reasonIfUnsupported)
 {
     armnn::Graph graph;
diff --git a/src/backends/backendsCommon/test/layerTests/LogicalTestImpl.cpp b/src/backends/backendsCommon/test/layerTests/LogicalTestImpl.cpp
index 2225de3..4f04673 100644
--- a/src/backends/backendsCommon/test/layerTests/LogicalTestImpl.cpp
+++ b/src/backends/backendsCommon/test/layerTests/LogicalTestImpl.cpp
@@ -50,7 +50,7 @@
     AddInputToWorkload(qDesc, info, inputTensorInfo, inputHandle.get());
     AddOutputToWorkload(qDesc, info, outputTensorInfo, outputHandle.get());
 
-    auto workload = workloadFactory.CreateLogicalUnary(qDesc, info);
+    auto workload = workloadFactory.CreateElementwiseUnary(qDesc, info);
 
     inputHandle->Allocate();
     outputHandle->Allocate();
diff --git a/src/backends/backendsCommon/test/layerTests/ReshapeTestImpl.cpp b/src/backends/backendsCommon/test/layerTests/ReshapeTestImpl.cpp
index d233e89..fbedb94 100644
--- a/src/backends/backendsCommon/test/layerTests/ReshapeTestImpl.cpp
+++ b/src/backends/backendsCommon/test/layerTests/ReshapeTestImpl.cpp
@@ -170,6 +170,30 @@
         workloadFactory, memoryManager, tensorHandleFactory, inputTensorInfo, outputTensorInfo, input, outputExpected);
 }
 
+LayerTestResult<uint8_t, 2> ReshapeBooleanTest(
+    armnn::IWorkloadFactory& workloadFactory,
+    const armnn::IBackendInternal::IMemoryManagerSharedPtr& memoryManager,
+    const armnn::ITensorHandleFactory& tensorHandleFactory)
+{
+    armnn::TensorInfo inputTensorInfo;
+    armnn::TensorInfo outputTensorInfo;
+
+    unsigned int inputShape[] = { 1, 4 };
+    unsigned int outputShape[] = { 2, 2 };
+
+    inputTensorInfo = armnn::TensorInfo(2, inputShape, armnn::DataType::Boolean);
+    inputTensorInfo.SetQuantizationScale(1.0f);
+    outputTensorInfo = armnn::TensorInfo(2, outputShape, armnn::DataType::Boolean);
+    outputTensorInfo.SetQuantizationScale(1.0f);
+
+    const std::vector<uint8_t> input = { true, false, false, true };
+
+    const std::vector<uint8_t> outputExpected = { true, false, false, true };
+
+    return SimpleReshapeTestImpl<uint8_t, 2>(
+        workloadFactory, memoryManager, tensorHandleFactory, inputTensorInfo, outputTensorInfo, input, outputExpected);
+}
+
 //
 // Explicit template specializations
 //
diff --git a/src/backends/backendsCommon/test/layerTests/ReshapeTestImpl.hpp b/src/backends/backendsCommon/test/layerTests/ReshapeTestImpl.hpp
index 661702b..a29a965 100644
--- a/src/backends/backendsCommon/test/layerTests/ReshapeTestImpl.hpp
+++ b/src/backends/backendsCommon/test/layerTests/ReshapeTestImpl.hpp
@@ -23,3 +23,8 @@
     armnn::IWorkloadFactory& workloadFactory,
     const armnn::IBackendInternal::IMemoryManagerSharedPtr& memoryManager,
     const armnn::ITensorHandleFactory& tensorHandleFactory);
+
+LayerTestResult<uint8_t, 2> ReshapeBooleanTest(
+    armnn::IWorkloadFactory& workloadFactory,
+    const armnn::IBackendInternal::IMemoryManagerSharedPtr& memoryManager,
+    const armnn::ITensorHandleFactory& tensorHandleFactory);
diff --git a/src/backends/cl/ClBackendModelContext.cpp b/src/backends/cl/ClBackendModelContext.cpp
index 0ef26b6..b685bc2 100644
--- a/src/backends/cl/ClBackendModelContext.cpp
+++ b/src/backends/cl/ClBackendModelContext.cpp
@@ -17,13 +17,22 @@
     return defaultValue;
 }
 
+std::string ParseFile(const armnn::BackendOptions::Var& value, std::string defaultValue)
+{
+    if (value.IsString())
+    {
+        return value.AsString();
+    }
+    return defaultValue;
+}
+
 } // namespace anonymous
 
 namespace armnn
 {
 
 ClBackendModelContext::ClBackendModelContext(const ModelOptions& modelOptions)
-    : m_IsFastMathEnabled(false)
+    : m_CachedNetworkFilePath(""), m_IsFastMathEnabled(false), m_SaveCachedNetwork(false)
 {
    if (!modelOptions.empty())
    {
@@ -33,13 +42,31 @@
            {
                m_IsFastMathEnabled |= ParseBool(value, false);
            }
+           if (name == "SaveCachedNetwork")
+           {
+               m_SaveCachedNetwork |= ParseBool(value, false);
+           }
+           if (name == "CachedNetworkFilePath")
+           {
+               m_CachedNetworkFilePath = ParseFile(value, "");
+           }
        });
    }
 }
 
+std::string ClBackendModelContext::GetCachedNetworkFilePath() const
+{
+    return m_CachedNetworkFilePath;
+}
+
 bool ClBackendModelContext::IsFastMathEnabled() const
 {
     return m_IsFastMathEnabled;
 }
 
+bool ClBackendModelContext::SaveCachedNetwork() const
+{
+    return m_SaveCachedNetwork;
+}
+
 } // namespace armnn
\ No newline at end of file
diff --git a/src/backends/cl/ClBackendModelContext.hpp b/src/backends/cl/ClBackendModelContext.hpp
index 577649a..c84cdbb 100644
--- a/src/backends/cl/ClBackendModelContext.hpp
+++ b/src/backends/cl/ClBackendModelContext.hpp
@@ -6,6 +6,8 @@
 
 #include <armnn/backends/IBackendContext.hpp>
 
+#include <string>
+
 namespace armnn
 {
 
@@ -19,10 +21,17 @@
 public:
     ClBackendModelContext(const ModelOptions& modelOptions);
 
+    std::string GetCachedNetworkFilePath() const;
+
     bool IsFastMathEnabled() const;
 
+    bool SaveCachedNetwork() const;
+
 private:
+    std::string m_CachedNetworkFilePath;
     bool m_IsFastMathEnabled;
+    bool m_SaveCachedNetwork;
+
 };
 
 } // namespace armnn
\ No newline at end of file
diff --git a/src/backends/cl/ClLayerSupport.cpp b/src/backends/cl/ClLayerSupport.cpp
index cce5c9b..65454d4 100644
--- a/src/backends/cl/ClLayerSupport.cpp
+++ b/src/backends/cl/ClLayerSupport.cpp
@@ -42,6 +42,9 @@
 #include "workloads/ClInstanceNormalizationWorkload.hpp"
 #include "workloads/ClL2NormalizationFloatWorkload.hpp"
 #include "workloads/ClLogSoftmaxWorkload.hpp"
+#include "workloads/ClLogicalAndWorkload.hpp"
+#include "workloads/ClLogicalNotWorkload.hpp"
+#include "workloads/ClLogicalOrWorkload.hpp"
 #include "workloads/ClLstmFloatWorkload.hpp"
 #include "workloads/ClMaximumWorkload.hpp"
 #include "workloads/ClMeanWorkload.hpp"
@@ -460,6 +463,11 @@
                                            reasonIfUnsupported,
                                            input,
                                            output);
+        case UnaryOperation::LogicalNot:
+            FORWARD_WORKLOAD_VALIDATE_FUNC(ClLogicalNotWorkloadValidate,
+                                           reasonIfUnsupported,
+                                           input,
+                                           output);
         default:
             return false;
     }
@@ -557,6 +565,34 @@
                                    descriptor);
 }
 
+bool ClLayerSupport::IsLogicalBinarySupported(const TensorInfo& input0,
+                                              const TensorInfo& input1,
+                                              const TensorInfo& output,
+                                              const LogicalBinaryDescriptor& descriptor,
+                                              Optional<std::string&> reasonIfUnsupported) const
+{
+    IgnoreUnused(output);
+
+    switch(descriptor.m_Operation)
+    {
+        case LogicalBinaryOperation::LogicalAnd:
+            FORWARD_WORKLOAD_VALIDATE_FUNC(ClLogicalAndWorkloadValidate,
+                                           reasonIfUnsupported,
+                                           input0,
+                                           input1,
+                                           output);
+        case LogicalBinaryOperation::LogicalOr:
+            FORWARD_WORKLOAD_VALIDATE_FUNC(ClLogicalOrWorkloadValidate,
+                                           reasonIfUnsupported,
+                                           input0,
+                                           input1,
+                                           output);
+        default:
+            return false;
+    }
+}
+
+
 bool ClLayerSupport::IsLogSoftmaxSupported(const TensorInfo& input,
                                                 const TensorInfo& output,
                                                 const LogSoftmaxDescriptor& descriptor,
diff --git a/src/backends/cl/ClLayerSupport.hpp b/src/backends/cl/ClLayerSupport.hpp
index d7e2553..f2df94c 100644
--- a/src/backends/cl/ClLayerSupport.hpp
+++ b/src/backends/cl/ClLayerSupport.hpp
@@ -155,6 +155,12 @@
                                     const L2NormalizationDescriptor& descriptor,
                                     Optional<std::string&> reasonIfUnsupported = EmptyOptional()) const override;
 
+    bool IsLogicalBinarySupported(const TensorInfo& input0,
+                                  const TensorInfo& input1,
+                                  const TensorInfo& output,
+                                  const LogicalBinaryDescriptor& descriptor,
+                                  Optional<std::string&> reasonIfUnsupported) const override;
+
     bool IsLogSoftmaxSupported(const TensorInfo& input,
                                const TensorInfo& output,
                                const LogSoftmaxDescriptor& descriptor,
diff --git a/src/backends/cl/ClWorkloadFactory.cpp b/src/backends/cl/ClWorkloadFactory.cpp
index cb4aa92..41b779f 100644
--- a/src/backends/cl/ClWorkloadFactory.cpp
+++ b/src/backends/cl/ClWorkloadFactory.cpp
@@ -27,6 +27,8 @@
 #include <arm_compute/runtime/CL/CLBufferAllocator.h>
 #include <arm_compute/runtime/CL/CLScheduler.h>
 
+#include <Filesystem.hpp>
+
 namespace armnn
 {
 
@@ -55,6 +57,23 @@
     return s_Id;
 }
 
+void ClWorkloadFactory::AfterWorkloadsCreated()
+{
+    if(m_ModelContextPtr)
+    {
+        auto modelOptions = dynamic_cast<ClBackendModelContext*>(m_ModelContextPtr.get());
+        if (modelOptions->SaveCachedNetwork())
+        {
+            // Save map to a filepath provided in ModelOptions
+            auto filePath = modelOptions->GetCachedNetworkFilePath();
+            if (filePath != "" && fs::exists(filePath) && fs::is_regular_file(filePath))
+            {
+                ///  Saving will be implemented within IVGCVSW-5483 story.
+            }
+        }
+    }
+}
+
 template <typename FloatWorkload, typename Uint8Workload, typename QueueDescriptorType, typename... Args>
 std::unique_ptr<IWorkload> ClWorkloadFactory::MakeWorkload(const QueueDescriptorType& descriptor,
                                                            const WorkloadInfo& info,
@@ -85,15 +104,40 @@
     }
 }
 
+void ClWorkloadFactory::InitializeCLCompileContext()
+{
+    // Initialize our m_CLCompileContext using default device and context
+    cl::Device device = cl::Device::getDefault();
+    cl::Context context = cl::Context(device);
+
+    m_CLCompileContext = arm_compute::CLCompileContext(context, device);
+
+    if (m_ModelContextPtr)
+    {
+        // Load saved programs if the user has set a filepath
+        auto modelOptions = dynamic_cast<ClBackendModelContext*>(m_ModelContextPtr.get());
+        auto filePath = modelOptions->GetCachedNetworkFilePath();
+        if (filePath != ""
+            && fs::exists(filePath)
+            && fs::is_regular_file(filePath)
+            && !(modelOptions->SaveCachedNetwork()))
+        {
+            ///  Loading will be implemented within IVGCVSW-5483 story.
+        }
+    }
+}
+
 ClWorkloadFactory::ClWorkloadFactory(const std::shared_ptr<ClMemoryManager>& memoryManager)
     : m_MemoryManager(memoryManager), m_ModelContextPtr(IBackendInternal::IBackendSpecificModelContextPtr{})
 {
+    InitializeCLCompileContext();
 }
 
 ClWorkloadFactory::ClWorkloadFactory(const std::shared_ptr<ClMemoryManager>& memoryManager,
                                      const IBackendInternal::IBackendSpecificModelContextPtr& modelContextPtr)
     : m_MemoryManager(memoryManager), m_ModelContextPtr(modelContextPtr)
 {
+    InitializeCLCompileContext();
 }
 
 std::unique_ptr<ITensorHandle> ClWorkloadFactory::CreateTensorHandle(const TensorInfo& tensorInfo,
@@ -281,25 +325,27 @@
     switch(descriptor.m_Parameters.m_Operation)
     {
         case UnaryOperation::Abs:
-             {
-                 AbsQueueDescriptor absQueueDescriptor;
-                 absQueueDescriptor.m_Inputs  = descriptor.m_Inputs;
-                 absQueueDescriptor.m_Outputs = descriptor.m_Outputs;
+        {
+            AbsQueueDescriptor absQueueDescriptor;
+            absQueueDescriptor.m_Inputs  = descriptor.m_Inputs;
+            absQueueDescriptor.m_Outputs = descriptor.m_Outputs;
 
-                 return  std::make_unique<ClAbsWorkload>(absQueueDescriptor, info);
-             }
+            return  std::make_unique<ClAbsWorkload>(absQueueDescriptor, info);
+        }
         case UnaryOperation::Exp:
             return std::make_unique<ClExpWorkload>(descriptor, info);
         case UnaryOperation::Neg:
             return std::make_unique<ClNegWorkload>(descriptor, info);
         case UnaryOperation::Rsqrt:
-             {
-                 RsqrtQueueDescriptor rsqrtQueueDescriptor;
-                 rsqrtQueueDescriptor.m_Inputs  = descriptor.m_Inputs;
-                 rsqrtQueueDescriptor.m_Outputs = descriptor.m_Outputs;
+        {
+            RsqrtQueueDescriptor rsqrtQueueDescriptor;
+            rsqrtQueueDescriptor.m_Inputs  = descriptor.m_Inputs;
+            rsqrtQueueDescriptor.m_Outputs = descriptor.m_Outputs;
 
-                 return std::make_unique<ClRsqrtWorkload>(rsqrtQueueDescriptor, info);
-             }
+            return std::make_unique<ClRsqrtWorkload>(rsqrtQueueDescriptor, info);
+        }
+        case UnaryOperation::LogicalNot:
+            return std::make_unique<ClLogicalNotWorkload>(descriptor, info);
         default:
             return nullptr;
     }
@@ -370,6 +416,20 @@
     return MakeWorkload<ClL2NormalizationFloatWorkload, NullWorkload>(descriptor, info);
 }
 
+std::unique_ptr<IWorkload> ClWorkloadFactory::CreateLogicalBinary(const LogicalBinaryQueueDescriptor& descriptor,
+                                                                  const WorkloadInfo& info) const
+{
+    switch(descriptor.m_Parameters.m_Operation)
+    {
+        case LogicalBinaryOperation::LogicalAnd:
+            return std::make_unique<ClLogicalAndWorkload>(descriptor, info);
+        case LogicalBinaryOperation::LogicalOr:
+            return std::make_unique<ClLogicalOrWorkload>(descriptor, info);
+        default:
+            return nullptr;
+    }
+}
+
 std::unique_ptr<IWorkload> ClWorkloadFactory::CreateLogSoftmax(const LogSoftmaxQueueDescriptor& descriptor,
                                                                const WorkloadInfo& info) const
 {
diff --git a/src/backends/cl/ClWorkloadFactory.hpp b/src/backends/cl/ClWorkloadFactory.hpp
index fad5dd0..c8812cf 100644
--- a/src/backends/cl/ClWorkloadFactory.hpp
+++ b/src/backends/cl/ClWorkloadFactory.hpp
@@ -12,6 +12,8 @@
 #include <backendsCommon/WorkloadFactoryBase.hpp>
 #include <aclCommon/BaseMemoryManager.hpp>
 
+#include <arm_compute/core/CL/CLCompileContext.h>
+
 namespace armnn
 {
 
@@ -24,6 +26,8 @@
     ClWorkloadFactory(const std::shared_ptr<ClMemoryManager>& memoryManager,
                       const IBackendInternal::IBackendSpecificModelContextPtr& modelContextPtr);
 
+    void AfterWorkloadsCreated() override;
+
     const BackendId& GetBackendId() const override;
 
     static bool IsLayerSupported(const Layer& layer,
@@ -138,6 +142,9 @@
     std::unique_ptr<IWorkload> CreateL2Normalization(const L2NormalizationQueueDescriptor& descriptor,
                                                      const WorkloadInfo& info) const override;
 
+    std::unique_ptr<IWorkload> CreateLogicalBinary(const LogicalBinaryQueueDescriptor& descriptor,
+                                                   const WorkloadInfo& info) const override;
+
     std::unique_ptr<IWorkload> CreateLogSoftmax(const LogSoftmaxQueueDescriptor& descriptor,
                                                 const WorkloadInfo& info) const override;
 
@@ -251,8 +258,11 @@
                                                    const WorkloadInfo& info,
                                                    Args&&... args);
 
+    void InitializeCLCompileContext();
+
     mutable std::shared_ptr<ClMemoryManager> m_MemoryManager;
     const IBackendInternal::IBackendSpecificModelContextPtr m_ModelContextPtr;
+    arm_compute::CLCompileContext m_CLCompileContext;
 };
 
 } // namespace armnn
diff --git a/src/backends/cl/backend.mk b/src/backends/cl/backend.mk
index 9cbe21e..52295cc 100644
--- a/src/backends/cl/backend.mk
+++ b/src/backends/cl/backend.mk
@@ -46,6 +46,9 @@
         workloads/ClGatherWorkload.cpp \
         workloads/ClInstanceNormalizationWorkload.cpp \
         workloads/ClL2NormalizationFloatWorkload.cpp \
+        workloads/ClLogicalAndWorkload.cpp \
+        workloads/ClLogicalNotWorkload.cpp \
+        workloads/ClLogicalOrWorkload.cpp \
         workloads/ClLogSoftmaxWorkload.cpp \
         workloads/ClLstmFloatWorkload.cpp \
         workloads/ClMaximumWorkload.cpp \
diff --git a/src/backends/cl/test/ClLayerSupportTests.cpp b/src/backends/cl/test/ClLayerSupportTests.cpp
index 81d0cc2..2b8b0d4 100644
--- a/src/backends/cl/test/ClLayerSupportTests.cpp
+++ b/src/backends/cl/test/ClLayerSupportTests.cpp
@@ -121,6 +121,26 @@
     BOOST_CHECK_EQUAL(reasonIfUnsupported, "Output should be Float16");
 }
 
+BOOST_FIXTURE_TEST_CASE(IsLogicalBinarySupportedCl, ClContextControlFixture)
+{
+    std::string reasonIfUnsupported;
+
+    bool result = IsLogicalBinaryLayerSupportedTests<armnn::ClWorkloadFactory,
+      armnn::DataType::Boolean, armnn::DataType::Boolean>(reasonIfUnsupported);
+
+    BOOST_CHECK(result);
+}
+
+BOOST_FIXTURE_TEST_CASE(IsLogicalBinaryBroadcastSupportedCl, ClContextControlFixture)
+{
+    std::string reasonIfUnsupported;
+
+    bool result = IsLogicalBinaryLayerBroadcastSupportedTests<armnn::ClWorkloadFactory,
+      armnn::DataType::Boolean, armnn::DataType::Boolean>(reasonIfUnsupported);
+
+    BOOST_CHECK(result);
+}
+
 BOOST_FIXTURE_TEST_CASE(IsMeanSupportedCl, ClContextControlFixture)
 {
     std::string reasonIfUnsupported;
diff --git a/src/backends/cl/test/ClLayerTests.cpp b/src/backends/cl/test/ClLayerTests.cpp
index 0d87b74..7d40a69 100644
--- a/src/backends/cl/test/ClLayerTests.cpp
+++ b/src/backends/cl/test/ClLayerTests.cpp
@@ -485,6 +485,7 @@
 ARMNN_AUTO_TEST_CASE_WITH_THF(SimpleReshapeInt8, SimpleReshapeTest<DataType::QAsymmS8>)
 ARMNN_AUTO_TEST_CASE_WITH_THF(SimpleReshapeUint8, SimpleReshapeTest<DataType::QAsymmU8>)
 ARMNN_AUTO_TEST_CASE_WITH_THF(Reshape5d, Reshape5dTest<DataType::Float32>)
+ARMNN_AUTO_TEST_CASE_WITH_THF(ReshapeBoolean, ReshapeBooleanTest)
 
 // Pad
 ARMNN_AUTO_TEST_CASE_WITH_THF(PadFloat322d, PadFloat322dTest)
@@ -1225,6 +1226,22 @@
 ARMNN_AUTO_TEST_CASE_WITH_THF(Exp2dFloat16, Exp2dTest<DataType::Float16>)
 ARMNN_AUTO_TEST_CASE_WITH_THF(Exp3dFloat16, Exp3dTest<DataType::Float16>)
 
+// Logical
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalNot, LogicalNotTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalNotInt, LogicalNotIntTest)
+
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAnd, LogicalAndTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAndInt, LogicalAndIntTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAndBroadcast1, LogicalAndBroadcast1Test)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAndBroadcast2, LogicalAndBroadcast2Test)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAndBroadcast3, LogicalAndBroadcast3Test)
+
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOr, LogicalOrTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOrInt, LogicalOrIntTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOrBroadcast1, LogicalOrBroadcast1Test)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOrBroadcast2, LogicalOrBroadcast2Test)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOrBroadcast3, LogicalOrBroadcast3Test)
+
 #if defined(ARMNNREF_ENABLED)
 
 // The ARMNN_COMPARE_REF_AUTO_TEST_CASE and the ARMNN_COMPARE_REF_FIXTURE_TEST_CASE test units are not available
diff --git a/src/backends/cl/workloads/CMakeLists.txt b/src/backends/cl/workloads/CMakeLists.txt
index 24c09ad..6118d9b 100644
--- a/src/backends/cl/workloads/CMakeLists.txt
+++ b/src/backends/cl/workloads/CMakeLists.txt
@@ -50,6 +50,12 @@
     ClInstanceNormalizationWorkload.hpp
     ClL2NormalizationFloatWorkload.cpp
     ClL2NormalizationFloatWorkload.hpp
+    ClLogicalAndWorkload.cpp
+    ClLogicalAndWorkload.hpp
+    ClLogicalNotWorkload.cpp
+    ClLogicalNotWorkload.hpp
+    ClLogicalOrWorkload.cpp
+    ClLogicalOrWorkload.hpp
     ClLogSoftmaxWorkload.cpp
     ClLogSoftmaxWorkload.hpp
     ClLstmFloatWorkload.cpp
diff --git a/src/backends/cl/workloads/ClLogicalAndWorkload.cpp b/src/backends/cl/workloads/ClLogicalAndWorkload.cpp
new file mode 100644
index 0000000..9418d73
--- /dev/null
+++ b/src/backends/cl/workloads/ClLogicalAndWorkload.cpp
@@ -0,0 +1,53 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#include "ClLogicalAndWorkload.hpp"
+
+#include "ClWorkloadUtils.hpp"
+
+#include <armnn/utility/PolymorphicDowncast.hpp>
+
+#include <aclCommon/ArmComputeTensorUtils.hpp>
+
+#include <cl/ClTensorHandle.hpp>
+
+namespace armnn
+{
+using namespace armcomputetensorutils;
+
+arm_compute::Status ClLogicalAndWorkloadValidate(const TensorInfo& input0,
+                                                 const TensorInfo& input1,
+                                                 const TensorInfo& output)
+{
+    const arm_compute::TensorInfo aclInputInfo0 = BuildArmComputeTensorInfo(input0);
+    const arm_compute::TensorInfo aclInputInfo1 = BuildArmComputeTensorInfo(input1);
+    const arm_compute::TensorInfo aclOutputInfo = BuildArmComputeTensorInfo(output);
+
+    const arm_compute::Status aclStatus = arm_compute::CLLogicalAnd::validate(&aclInputInfo0,
+                                                                              &aclInputInfo1,
+                                                                              &aclOutputInfo);
+    return aclStatus;
+}
+
+ClLogicalAndWorkload::ClLogicalAndWorkload(const LogicalBinaryQueueDescriptor& descriptor,
+                                           const WorkloadInfo& info)
+    : BaseWorkload<LogicalBinaryQueueDescriptor>(descriptor, info)
+{
+    m_Data.ValidateInputsOutputs("ClLogicalAndWorkload", 2, 1);
+
+    arm_compute::ICLTensor& input0 = PolymorphicDowncast<ClTensorHandle*>(m_Data.m_Inputs[0])->GetTensor();
+    arm_compute::ICLTensor& input1 = PolymorphicDowncast<ClTensorHandle*>(m_Data.m_Inputs[1])->GetTensor();
+    arm_compute::ICLTensor& output = PolymorphicDowncast<ClTensorHandle*>(m_Data.m_Outputs[0])->GetTensor();
+
+    m_LogicalAndLayer.configure(&input0, &input1, &output);
+}
+
+void ClLogicalAndWorkload::Execute() const
+{
+    ARMNN_SCOPED_PROFILING_EVENT_CL("ClLogicalAndWorkload_Execute");
+    m_LogicalAndLayer.run();
+}
+
+} // namespace armnn
diff --git a/src/backends/cl/workloads/ClLogicalAndWorkload.hpp b/src/backends/cl/workloads/ClLogicalAndWorkload.hpp
new file mode 100644
index 0000000..3bf6afe
--- /dev/null
+++ b/src/backends/cl/workloads/ClLogicalAndWorkload.hpp
@@ -0,0 +1,30 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#pragma once
+
+#include <backendsCommon/Workload.hpp>
+
+#include <arm_compute/core/Error.h>
+#include <arm_compute/runtime/CL/functions/CLLogicalAnd.h>
+
+namespace armnn
+{
+
+arm_compute::Status ClLogicalAndWorkloadValidate(const TensorInfo& input0,
+                                                 const TensorInfo& input1,
+                                                 const TensorInfo& output);
+
+class ClLogicalAndWorkload : public BaseWorkload<LogicalBinaryQueueDescriptor>
+{
+public:
+    ClLogicalAndWorkload(const LogicalBinaryQueueDescriptor& descriptor, const WorkloadInfo& info);
+    virtual void Execute() const override;
+
+private:
+    mutable arm_compute::CLLogicalAnd m_LogicalAndLayer;
+};
+
+} //namespace armnn
diff --git a/src/backends/cl/workloads/ClLogicalNotWorkload.cpp b/src/backends/cl/workloads/ClLogicalNotWorkload.cpp
new file mode 100644
index 0000000..eb90caf
--- /dev/null
+++ b/src/backends/cl/workloads/ClLogicalNotWorkload.cpp
@@ -0,0 +1,49 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#include "ClLogicalNotWorkload.hpp"
+
+#include "ClWorkloadUtils.hpp"
+
+#include <armnn/utility/PolymorphicDowncast.hpp>
+
+#include <aclCommon/ArmComputeTensorUtils.hpp>
+
+#include <cl/ClTensorHandle.hpp>
+
+namespace armnn
+{
+using namespace armcomputetensorutils;
+
+arm_compute::Status ClLogicalNotWorkloadValidate(const TensorInfo& input,
+                                                 const TensorInfo& output)
+{
+    const arm_compute::TensorInfo aclInputInfo  = BuildArmComputeTensorInfo(input);
+    const arm_compute::TensorInfo aclOutputInfo = BuildArmComputeTensorInfo(output);
+
+    const arm_compute::Status aclStatus = arm_compute::CLLogicalNot::validate(&aclInputInfo,
+                                                                              &aclOutputInfo);
+    return aclStatus;
+}
+
+ClLogicalNotWorkload::ClLogicalNotWorkload(const ElementwiseUnaryQueueDescriptor& descriptor,
+                                           const WorkloadInfo& info)
+    : BaseWorkload<ElementwiseUnaryQueueDescriptor>(descriptor, info)
+{
+    m_Data.ValidateInputsOutputs("ClLogicalNotWorkload", 1, 1);
+
+    arm_compute::ICLTensor& input  = PolymorphicDowncast<ClTensorHandle*>(m_Data.m_Inputs[0])->GetTensor();
+    arm_compute::ICLTensor& output = PolymorphicDowncast<ClTensorHandle*>(m_Data.m_Outputs[0])->GetTensor();
+
+    m_LogicalNotLayer.configure(&input, &output);
+}
+
+void ClLogicalNotWorkload::Execute() const
+{
+    ARMNN_SCOPED_PROFILING_EVENT_CL("ClLogicalNotWorkload_Execute");
+    m_LogicalNotLayer.run();
+}
+
+} // namespace armnn
diff --git a/src/backends/cl/workloads/ClLogicalNotWorkload.hpp b/src/backends/cl/workloads/ClLogicalNotWorkload.hpp
new file mode 100644
index 0000000..f1225c7
--- /dev/null
+++ b/src/backends/cl/workloads/ClLogicalNotWorkload.hpp
@@ -0,0 +1,28 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#pragma once
+
+#include <backendsCommon/Workload.hpp>
+
+#include <arm_compute/core/Error.h>
+#include <arm_compute/runtime/CL/functions/CLLogicalNot.h>
+
+namespace armnn
+{
+
+arm_compute::Status ClLogicalNotWorkloadValidate(const TensorInfo& input, const TensorInfo& output);
+
+class ClLogicalNotWorkload : public BaseWorkload<ElementwiseUnaryQueueDescriptor>
+{
+public:
+    ClLogicalNotWorkload(const ElementwiseUnaryQueueDescriptor& descriptor, const WorkloadInfo& info);
+    virtual void Execute() const override;
+
+private:
+    mutable arm_compute::CLLogicalNot m_LogicalNotLayer;
+};
+
+} //namespace armnn
diff --git a/src/backends/cl/workloads/ClLogicalOrWorkload.cpp b/src/backends/cl/workloads/ClLogicalOrWorkload.cpp
new file mode 100644
index 0000000..e9895bf
--- /dev/null
+++ b/src/backends/cl/workloads/ClLogicalOrWorkload.cpp
@@ -0,0 +1,53 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#include "ClLogicalOrWorkload.hpp"
+
+#include "ClWorkloadUtils.hpp"
+
+#include <armnn/utility/PolymorphicDowncast.hpp>
+
+#include <aclCommon/ArmComputeTensorUtils.hpp>
+
+#include <cl/ClTensorHandle.hpp>
+
+namespace armnn
+{
+using namespace armcomputetensorutils;
+
+arm_compute::Status ClLogicalOrWorkloadValidate(const TensorInfo& input0,
+                                                const TensorInfo& input1,
+                                                const TensorInfo& output)
+{
+    const arm_compute::TensorInfo aclInputInfo0 = BuildArmComputeTensorInfo(input0);
+    const arm_compute::TensorInfo aclInputInfo1 = BuildArmComputeTensorInfo(input1);
+    const arm_compute::TensorInfo aclOutputInfo = BuildArmComputeTensorInfo(output);
+
+    const arm_compute::Status aclStatus = arm_compute::CLLogicalOr::validate(&aclInputInfo0,
+                                                                             &aclInputInfo1,
+                                                                             &aclOutputInfo);
+    return aclStatus;
+}
+
+ClLogicalOrWorkload::ClLogicalOrWorkload(const LogicalBinaryQueueDescriptor& descriptor,
+                                         const WorkloadInfo& info)
+    : BaseWorkload<LogicalBinaryQueueDescriptor>(descriptor, info)
+{
+    m_Data.ValidateInputsOutputs("ClLogicalOrWorkload", 2, 1);
+
+    arm_compute::ICLTensor& input0 = PolymorphicDowncast<ClTensorHandle*>(m_Data.m_Inputs[0])->GetTensor();
+    arm_compute::ICLTensor& input1 = PolymorphicDowncast<ClTensorHandle*>(m_Data.m_Inputs[1])->GetTensor();
+    arm_compute::ICLTensor& output = PolymorphicDowncast<ClTensorHandle*>(m_Data.m_Outputs[0])->GetTensor();
+
+    m_LogicalOrLayer.configure(&input0, &input1, &output);
+}
+
+void ClLogicalOrWorkload::Execute() const
+{
+    ARMNN_SCOPED_PROFILING_EVENT_CL("ClLogicalOrWorkload_Execute");
+    m_LogicalOrLayer.run();
+}
+
+} // namespace armnn
diff --git a/src/backends/cl/workloads/ClLogicalOrWorkload.hpp b/src/backends/cl/workloads/ClLogicalOrWorkload.hpp
new file mode 100644
index 0000000..8faabde
--- /dev/null
+++ b/src/backends/cl/workloads/ClLogicalOrWorkload.hpp
@@ -0,0 +1,30 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#pragma once
+
+#include <backendsCommon/Workload.hpp>
+
+#include <arm_compute/core/Error.h>
+#include <arm_compute/runtime/CL/functions/CLLogicalOr.h>
+
+namespace armnn
+{
+
+arm_compute::Status ClLogicalOrWorkloadValidate(const TensorInfo& input0,
+                                                const TensorInfo& input1,
+                                                const TensorInfo& output);
+
+class ClLogicalOrWorkload : public BaseWorkload<LogicalBinaryQueueDescriptor>
+{
+public:
+    ClLogicalOrWorkload(const LogicalBinaryQueueDescriptor& descriptor, const WorkloadInfo& info);
+    virtual void Execute() const override;
+
+private:
+    mutable arm_compute::CLLogicalOr m_LogicalOrLayer;
+};
+
+} //namespace armnn
diff --git a/src/backends/cl/workloads/ClWorkloads.hpp b/src/backends/cl/workloads/ClWorkloads.hpp
index b48e5a6..efcccb3 100644
--- a/src/backends/cl/workloads/ClWorkloads.hpp
+++ b/src/backends/cl/workloads/ClWorkloads.hpp
@@ -24,6 +24,9 @@
 #include "ClGatherWorkload.hpp"
 #include "ClInstanceNormalizationWorkload.hpp"
 #include "ClL2NormalizationFloatWorkload.hpp"
+#include "ClLogicalAndWorkload.hpp"
+#include "ClLogicalNotWorkload.hpp"
+#include "ClLogicalOrWorkload.hpp"
 #include "ClLogSoftmaxWorkload.hpp"
 #include "ClLstmFloatWorkload.hpp"
 #include "ClConcatWorkload.hpp"
diff --git a/src/backends/neon/NeonLayerSupport.cpp b/src/backends/neon/NeonLayerSupport.cpp
index f55d1c8..2d22576 100644
--- a/src/backends/neon/NeonLayerSupport.cpp
+++ b/src/backends/neon/NeonLayerSupport.cpp
@@ -37,6 +37,9 @@
 #include "workloads/NeonInstanceNormalizationWorkload.hpp"
 #include "workloads/NeonL2NormalizationFloatWorkload.hpp"
 #include "workloads/NeonLogSoftmaxWorkload.hpp"
+#include "workloads/NeonLogicalAndWorkload.hpp"
+#include "workloads/NeonLogicalNotWorkload.hpp"
+#include "workloads/NeonLogicalOrWorkload.hpp"
 #include "workloads/NeonLstmFloatWorkload.hpp"
 #include "workloads/NeonMaximumWorkload.hpp"
 #include "workloads/NeonMeanWorkload.hpp"
@@ -434,6 +437,11 @@
                                            reasonIfUnsupported,
                                            input,
                                            output);
+        case UnaryOperation::LogicalNot:
+            FORWARD_WORKLOAD_VALIDATE_FUNC(NeonLogicalNotWorkloadValidate,
+                                           reasonIfUnsupported,
+                                           input,
+                                           output);
         default:
             return false;
     }
@@ -532,6 +540,31 @@
     FORWARD_WORKLOAD_VALIDATE_FUNC(NeonL2NormalizationWorkloadValidate, reasonIfUnsupported, input, output, descriptor);
 }
 
+bool NeonLayerSupport::IsLogicalBinarySupported(const TensorInfo& input0,
+                                                const TensorInfo& input1,
+                                                const TensorInfo& output,
+                                                const LogicalBinaryDescriptor& descriptor,
+                                                Optional<std::string&> reasonIfUnsupported) const
+{
+    switch(descriptor.m_Operation)
+    {
+        case LogicalBinaryOperation::LogicalAnd:
+            FORWARD_WORKLOAD_VALIDATE_FUNC(NeonLogicalAndWorkloadValidate,
+                                           reasonIfUnsupported,
+                                           input0,
+                                           input1,
+                                           output);
+        case LogicalBinaryOperation::LogicalOr:
+            FORWARD_WORKLOAD_VALIDATE_FUNC(NeonLogicalOrWorkloadValidate,
+                                           reasonIfUnsupported,
+                                           input0,
+                                           input1,
+                                           output);
+        default:
+            return false;
+    }
+}
+
 bool NeonLayerSupport::IsLogSoftmaxSupported(const TensorInfo& input,
                                              const TensorInfo& output,
                                              const LogSoftmaxDescriptor& descriptor,
diff --git a/src/backends/neon/NeonLayerSupport.hpp b/src/backends/neon/NeonLayerSupport.hpp
index d477dcd..dc13cc2 100644
--- a/src/backends/neon/NeonLayerSupport.hpp
+++ b/src/backends/neon/NeonLayerSupport.hpp
@@ -160,6 +160,12 @@
                                     const L2NormalizationDescriptor& descriptor,
                                     Optional<std::string&> reasonIfUnsupported = EmptyOptional()) const override;
 
+    bool IsLogicalBinarySupported(const TensorInfo& input0,
+                                  const TensorInfo& input1,
+                                  const TensorInfo& output,
+                                  const LogicalBinaryDescriptor& descriptor,
+                                  Optional<std::string&> reasonIfUnsupported) const override;
+
     bool IsLogSoftmaxSupported(const TensorInfo& input,
                                const TensorInfo& output,
                                const LogSoftmaxDescriptor& descriptor,
diff --git a/src/backends/neon/NeonWorkloadFactory.cpp b/src/backends/neon/NeonWorkloadFactory.cpp
index 709dd93..7218052 100644
--- a/src/backends/neon/NeonWorkloadFactory.cpp
+++ b/src/backends/neon/NeonWorkloadFactory.cpp
@@ -260,25 +260,27 @@
     switch(descriptor.m_Parameters.m_Operation)
     {
         case UnaryOperation::Abs:
-            {
-                AbsQueueDescriptor absQueueDescriptor;
-                absQueueDescriptor.m_Inputs  = descriptor.m_Inputs;
-                absQueueDescriptor.m_Outputs = descriptor.m_Outputs;
+        {
+            AbsQueueDescriptor absQueueDescriptor;
+            absQueueDescriptor.m_Inputs  = descriptor.m_Inputs;
+            absQueueDescriptor.m_Outputs = descriptor.m_Outputs;
 
-                return std::make_unique<NeonAbsWorkload>(absQueueDescriptor, info);
-            }
+            return std::make_unique<NeonAbsWorkload>(absQueueDescriptor, info);
+        }
         case UnaryOperation::Rsqrt:
-            {
-                RsqrtQueueDescriptor rsqrtQueueDescriptor;
-                rsqrtQueueDescriptor.m_Inputs  = descriptor.m_Inputs;
-                rsqrtQueueDescriptor.m_Outputs = descriptor.m_Outputs;
+        {
+            RsqrtQueueDescriptor rsqrtQueueDescriptor;
+            rsqrtQueueDescriptor.m_Inputs  = descriptor.m_Inputs;
+            rsqrtQueueDescriptor.m_Outputs = descriptor.m_Outputs;
 
-                return std::make_unique<NeonRsqrtWorkload>(rsqrtQueueDescriptor, info);
-            }
+            return std::make_unique<NeonRsqrtWorkload>(rsqrtQueueDescriptor, info);
+        }
         case UnaryOperation::Neg:
             return std::make_unique<NeonNegWorkload>(descriptor, info);
         case UnaryOperation::Exp:
             return std::make_unique<NeonExpWorkload>(descriptor, info);
+        case UnaryOperation::LogicalNot:
+            return std::make_unique<NeonLogicalNotWorkload>(descriptor, info);
         default:
             return nullptr;
     }
@@ -356,6 +358,20 @@
     return std::make_unique<NeonLogSoftmaxWorkload>(descriptor, info, m_MemoryManager->GetIntraLayerManager());
 }
 
+std::unique_ptr<IWorkload> NeonWorkloadFactory::CreateLogicalBinary(const LogicalBinaryQueueDescriptor& descriptor,
+                                                                    const WorkloadInfo& info) const
+{
+    switch(descriptor.m_Parameters.m_Operation)
+    {
+        case LogicalBinaryOperation::LogicalAnd:
+            return std::make_unique<NeonLogicalAndWorkload>(descriptor, info);
+        case LogicalBinaryOperation::LogicalOr:
+            return std::make_unique<NeonLogicalOrWorkload>(descriptor, info);
+        default:
+            return nullptr;
+    }
+}
+
 std::unique_ptr<IWorkload> NeonWorkloadFactory::CreateLstm(const LstmQueueDescriptor& descriptor,
                                                            const WorkloadInfo& info) const
 {
diff --git a/src/backends/neon/NeonWorkloadFactory.hpp b/src/backends/neon/NeonWorkloadFactory.hpp
index 6a514e2..444574e 100644
--- a/src/backends/neon/NeonWorkloadFactory.hpp
+++ b/src/backends/neon/NeonWorkloadFactory.hpp
@@ -143,6 +143,9 @@
     std::unique_ptr<IWorkload> CreateL2Normalization(const L2NormalizationQueueDescriptor& descriptor,
                                                      const WorkloadInfo& info) const override;
 
+    std::unique_ptr<IWorkload> CreateLogicalBinary(const LogicalBinaryQueueDescriptor& descriptor,
+                                                   const WorkloadInfo& info) const override;
+
     std::unique_ptr<IWorkload> CreateLogSoftmax(const LogSoftmaxQueueDescriptor& descriptor,
                                                 const WorkloadInfo& info) const override;
 
diff --git a/src/backends/neon/backend.mk b/src/backends/neon/backend.mk
index 9bd08a1..54560cb 100644
--- a/src/backends/neon/backend.mk
+++ b/src/backends/neon/backend.mk
@@ -47,6 +47,9 @@
         workloads/NeonGatherWorkload.cpp \
         workloads/NeonInstanceNormalizationWorkload.cpp \
         workloads/NeonL2NormalizationFloatWorkload.cpp \
+        workloads/NeonLogicalAndWorkload.cpp \
+        workloads/NeonLogicalNotWorkload.cpp \
+        workloads/NeonLogicalOrWorkload.cpp \
         workloads/NeonLogSoftmaxWorkload.cpp \
         workloads/NeonLstmFloatWorkload.cpp \
         workloads/NeonMaximumWorkload.cpp \
diff --git a/src/backends/neon/test/NeonLayerSupportTests.cpp b/src/backends/neon/test/NeonLayerSupportTests.cpp
index 3b086ad..a14122f 100644
--- a/src/backends/neon/test/NeonLayerSupportTests.cpp
+++ b/src/backends/neon/test/NeonLayerSupportTests.cpp
@@ -75,6 +75,26 @@
     BOOST_CHECK(result);
 }
 
+BOOST_AUTO_TEST_CASE(IsLogicalBinarySupportedNeon)
+{
+    std::string reasonIfUnsupported;
+
+    bool result = IsLogicalBinaryLayerSupportedTests<armnn::NeonWorkloadFactory,
+      armnn::DataType::Boolean, armnn::DataType::Boolean>(reasonIfUnsupported);
+
+    BOOST_CHECK(result);
+}
+
+BOOST_AUTO_TEST_CASE(IsLogicalBinaryBroadcastSupportedNeon)
+{
+    std::string reasonIfUnsupported;
+
+    bool result = IsLogicalBinaryLayerBroadcastSupportedTests<armnn::NeonWorkloadFactory,
+      armnn::DataType::Boolean, armnn::DataType::Boolean>(reasonIfUnsupported);
+
+    BOOST_CHECK(result);
+}
+
 BOOST_AUTO_TEST_CASE(IsMeanSupportedNeon)
 {
     std::string reasonIfUnsupported;
diff --git a/src/backends/neon/test/NeonLayerTests.cpp b/src/backends/neon/test/NeonLayerTests.cpp
index 20afbcb..8e7742a 100644
--- a/src/backends/neon/test/NeonLayerTests.cpp
+++ b/src/backends/neon/test/NeonLayerTests.cpp
@@ -747,6 +747,7 @@
 ARMNN_AUTO_TEST_CASE_WITH_THF(SimpleReshapeInt8, SimpleReshapeTest<armnn::DataType::QAsymmS8>)
 ARMNN_AUTO_TEST_CASE_WITH_THF(SimpleReshapeUint8, SimpleReshapeTest<armnn::DataType::QAsymmU8>)
 ARMNN_AUTO_TEST_CASE_WITH_THF(Reshape5d, Reshape5dTest<armnn::DataType::Float32>)
+ARMNN_AUTO_TEST_CASE_WITH_THF(ReshapeBoolean, ReshapeBooleanTest)
 
 // Pad
 ARMNN_AUTO_TEST_CASE_WITH_THF(PadFloat322d, PadFloat322dTest)
@@ -1322,6 +1323,22 @@
 ARMNN_AUTO_TEST_CASE_WITH_THF(SimpleFillF16, SimpleFillTest<DataType::Float16>)
 ARMNN_AUTO_TEST_CASE_WITH_THF(SimpleFillS32, SimpleFillTest<DataType::Signed32>)
 
+// Logical
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalNot, LogicalNotTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalNotInt, LogicalNotIntTest)
+
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAnd, LogicalAndTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAndInt, LogicalAndIntTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAndBroadcast1, LogicalAndBroadcast1Test)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAndBroadcast2, LogicalAndBroadcast2Test)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalAndBroadcast3, LogicalAndBroadcast3Test)
+
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOr, LogicalOrTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOrInt, LogicalOrIntTest)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOrBroadcast1, LogicalOrBroadcast1Test)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOrBroadcast2, LogicalOrBroadcast2Test)
+ARMNN_AUTO_TEST_CASE_WITH_THF(LogicalOrBroadcast3, LogicalOrBroadcast3Test)
+
 #if defined(ARMNNREF_ENABLED)
 
 // The ARMNN_COMPARE_REF_AUTO_TEST_CASE and the ARMNN_COMPARE_REF_FIXTURE_TEST_CASE test units are not available
diff --git a/src/backends/neon/workloads/CMakeLists.txt b/src/backends/neon/workloads/CMakeLists.txt
index ca9497e..b03db99 100644
--- a/src/backends/neon/workloads/CMakeLists.txt
+++ b/src/backends/neon/workloads/CMakeLists.txt
@@ -54,10 +54,16 @@
     NeonInstanceNormalizationWorkload.hpp
     NeonL2NormalizationFloatWorkload.cpp
     NeonL2NormalizationFloatWorkload.hpp
-    NeonLstmFloatWorkload.cpp
-    NeonLstmFloatWorkload.hpp
+    NeonLogicalAndWorkload.cpp
+    NeonLogicalAndWorkload.hpp
+    NeonLogicalNotWorkload.cpp
+    NeonLogicalNotWorkload.hpp
+    NeonLogicalOrWorkload.cpp
+    NeonLogicalOrWorkload.hpp
     NeonLogSoftmaxWorkload.cpp
     NeonLogSoftmaxWorkload.hpp
+    NeonLstmFloatWorkload.cpp
+    NeonLstmFloatWorkload.hpp
     NeonMaximumWorkload.cpp
     NeonMaximumWorkload.hpp
     NeonMeanWorkload.cpp
diff --git a/src/backends/neon/workloads/NeonFullyConnectedWorkload.cpp b/src/backends/neon/workloads/NeonFullyConnectedWorkload.cpp
index 31489a0..39fb4c9 100644
--- a/src/backends/neon/workloads/NeonFullyConnectedWorkload.cpp
+++ b/src/backends/neon/workloads/NeonFullyConnectedWorkload.cpp
@@ -27,6 +27,16 @@
                                                        const FullyConnectedDescriptor& descriptor,
                                                        const ActivationDescriptor* activationDescriptor)
 {
+    if (activationDescriptor)
+    {
+        std::vector<ActivationFunction> activations = {ActivationFunction::ReLu, ActivationFunction::BoundedReLu};
+        if (std::find(activations.begin(), activations.end(), activationDescriptor->m_Function) == activations.end())
+        {
+            return arm_compute::Status{
+                arm_compute::ErrorCode::RUNTIME_ERROR, "NeonFullyConnectedWorkload: Unsupported Activation Function"};
+        }
+    }
+
     const arm_compute::TensorInfo aclInput = BuildArmComputeTensorInfo(input);
     const arm_compute::TensorInfo aclOutput = BuildArmComputeTensorInfo(output);
     const arm_compute::TensorInfo aclWeights = BuildArmComputeTensorInfo(weights);
diff --git a/src/backends/neon/workloads/NeonLogicalAndWorkload.cpp b/src/backends/neon/workloads/NeonLogicalAndWorkload.cpp
new file mode 100644
index 0000000..d85e05c
--- /dev/null
+++ b/src/backends/neon/workloads/NeonLogicalAndWorkload.cpp
@@ -0,0 +1,51 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#include "NeonLogicalAndWorkload.hpp"
+
+#include "NeonWorkloadUtils.hpp"
+
+#include <aclCommon/ArmComputeTensorHandle.hpp>
+#include <aclCommon/ArmComputeTensorUtils.hpp>
+
+#include <armnn/utility/PolymorphicDowncast.hpp>
+
+namespace armnn
+{
+
+arm_compute::Status NeonLogicalAndWorkloadValidate(const TensorInfo& input0,
+                                                   const TensorInfo& input1,
+                                                   const TensorInfo& output)
+{
+    const arm_compute::TensorInfo aclInputInfo0 = BuildArmComputeTensorInfo(input0);
+    const arm_compute::TensorInfo aclInputInfo1 = BuildArmComputeTensorInfo(input1);
+    const arm_compute::TensorInfo aclOutputInfo = BuildArmComputeTensorInfo(output);
+
+    const arm_compute::Status aclStatus = arm_compute::NELogicalAnd::validate(&aclInputInfo0,
+                                                                              &aclInputInfo1,
+                                                                              &aclOutputInfo);
+    return aclStatus;
+}
+
+NeonLogicalAndWorkload::NeonLogicalAndWorkload(const LogicalBinaryQueueDescriptor& descriptor,
+                                               const WorkloadInfo& info)
+    : BaseWorkload<LogicalBinaryQueueDescriptor>(descriptor, info)
+{
+    m_Data.ValidateInputsOutputs("NeonLogicalAndWorkload", 2, 1);
+
+    arm_compute::ITensor& input0 = PolymorphicDowncast<IAclTensorHandle*>(m_Data.m_Inputs[0])->GetTensor();
+    arm_compute::ITensor& input1 = PolymorphicDowncast<IAclTensorHandle*>(m_Data.m_Inputs[1])->GetTensor();
+    arm_compute::ITensor& output = PolymorphicDowncast<IAclTensorHandle*>(m_Data.m_Outputs[0])->GetTensor();
+
+    m_LogicalAndLayer.configure(&input0, &input1, &output);
+}
+
+void NeonLogicalAndWorkload::Execute() const
+{
+    ARMNN_SCOPED_PROFILING_EVENT_NEON("NeonLogicalAndWorkload_Execute");
+    m_LogicalAndLayer.run();
+}
+
+} // namespace armnn
diff --git a/src/backends/neon/workloads/NeonLogicalAndWorkload.hpp b/src/backends/neon/workloads/NeonLogicalAndWorkload.hpp
new file mode 100644
index 0000000..1daadab
--- /dev/null
+++ b/src/backends/neon/workloads/NeonLogicalAndWorkload.hpp
@@ -0,0 +1,30 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#pragma once
+
+#include <backendsCommon/Workload.hpp>
+
+#include <arm_compute/core/Error.h>
+#include <arm_compute/runtime/NEON/functions/NELogical.h>
+
+namespace armnn
+{
+
+arm_compute::Status NeonLogicalAndWorkloadValidate(const TensorInfo& input0,
+                                                   const TensorInfo& input1,
+                                                   const TensorInfo& output);
+
+class NeonLogicalAndWorkload : public BaseWorkload<LogicalBinaryQueueDescriptor>
+{
+public:
+    NeonLogicalAndWorkload(const LogicalBinaryQueueDescriptor& descriptor, const WorkloadInfo& info);
+    virtual void Execute() const override;
+
+private:
+    mutable arm_compute::NELogicalAnd m_LogicalAndLayer;
+};
+
+} //namespace armnn
diff --git a/src/backends/neon/workloads/NeonLogicalNotWorkload.cpp b/src/backends/neon/workloads/NeonLogicalNotWorkload.cpp
new file mode 100644
index 0000000..cff5eaf
--- /dev/null
+++ b/src/backends/neon/workloads/NeonLogicalNotWorkload.cpp
@@ -0,0 +1,48 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#include "NeonLogicalNotWorkload.hpp"
+
+#include "NeonWorkloadUtils.hpp"
+
+#include <aclCommon/ArmComputeTensorHandle.hpp>
+#include <aclCommon/ArmComputeTensorUtils.hpp>
+
+#include <armnn/utility/PolymorphicDowncast.hpp>
+
+
+namespace armnn
+{
+
+arm_compute::Status NeonLogicalNotWorkloadValidate(const TensorInfo& input,
+                                                   const TensorInfo& output)
+{
+    const arm_compute::TensorInfo aclInputInfo  = BuildArmComputeTensorInfo(input);
+    const arm_compute::TensorInfo aclOutputInfo = BuildArmComputeTensorInfo(output);
+
+    const arm_compute::Status aclStatus = arm_compute::NELogicalNot::validate(&aclInputInfo,
+                                                                              &aclOutputInfo);
+    return aclStatus;
+}
+
+NeonLogicalNotWorkload::NeonLogicalNotWorkload(const ElementwiseUnaryQueueDescriptor& descriptor,
+                                               const WorkloadInfo& info)
+    : BaseWorkload<ElementwiseUnaryQueueDescriptor>(descriptor, info)
+{
+    m_Data.ValidateInputsOutputs("NeonLogicalNotWorkload", 1, 1);
+
+    arm_compute::ITensor& input  = PolymorphicDowncast<IAclTensorHandle*>(m_Data.m_Inputs[0])->GetTensor();
+    arm_compute::ITensor& output = PolymorphicDowncast<IAclTensorHandle*>(m_Data.m_Outputs[0])->GetTensor();
+
+    m_LogicalNotLayer.configure(&input, &output);
+}
+
+void NeonLogicalNotWorkload::Execute() const
+{
+    ARMNN_SCOPED_PROFILING_EVENT_NEON("NeonLogicalNotWorkload_Execute");
+    m_LogicalNotLayer.run();
+}
+
+} // namespace armnn
diff --git a/src/backends/neon/workloads/NeonLogicalNotWorkload.hpp b/src/backends/neon/workloads/NeonLogicalNotWorkload.hpp
new file mode 100644
index 0000000..31420f7
--- /dev/null
+++ b/src/backends/neon/workloads/NeonLogicalNotWorkload.hpp
@@ -0,0 +1,28 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#pragma once
+
+#include <backendsCommon/Workload.hpp>
+
+#include <arm_compute/core/Error.h>
+#include <arm_compute/runtime/NEON/functions/NELogical.h>
+
+namespace armnn
+{
+
+arm_compute::Status NeonLogicalNotWorkloadValidate(const TensorInfo& input, const TensorInfo& output);
+
+class NeonLogicalNotWorkload : public BaseWorkload<ElementwiseUnaryQueueDescriptor>
+{
+public:
+    NeonLogicalNotWorkload(const ElementwiseUnaryQueueDescriptor& descriptor, const WorkloadInfo& info);
+    virtual void Execute() const override;
+
+private:
+    mutable arm_compute::NELogicalNot m_LogicalNotLayer;
+};
+
+} //namespace armnn
diff --git a/src/backends/neon/workloads/NeonLogicalOrWorkload.cpp b/src/backends/neon/workloads/NeonLogicalOrWorkload.cpp
new file mode 100644
index 0000000..c3f21e1
--- /dev/null
+++ b/src/backends/neon/workloads/NeonLogicalOrWorkload.cpp
@@ -0,0 +1,51 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#include "NeonLogicalOrWorkload.hpp"
+
+#include "NeonWorkloadUtils.hpp"
+
+#include <aclCommon/ArmComputeTensorHandle.hpp>
+#include <aclCommon/ArmComputeTensorUtils.hpp>
+
+#include <armnn/utility/PolymorphicDowncast.hpp>
+
+namespace armnn
+{
+
+arm_compute::Status NeonLogicalOrWorkloadValidate(const TensorInfo& input0,
+                                                  const TensorInfo& input1,
+                                                  const TensorInfo& output)
+{
+    const arm_compute::TensorInfo aclInputInfo0 = BuildArmComputeTensorInfo(input0);
+    const arm_compute::TensorInfo aclInputInfo1 = BuildArmComputeTensorInfo(input1);
+    const arm_compute::TensorInfo aclOutputInfo = BuildArmComputeTensorInfo(output);
+
+    const arm_compute::Status aclStatus = arm_compute::NELogicalOr::validate(&aclInputInfo0,
+                                                                             &aclInputInfo1,
+                                                                             &aclOutputInfo);
+    return aclStatus;
+}
+
+NeonLogicalOrWorkload::NeonLogicalOrWorkload(const LogicalBinaryQueueDescriptor& descriptor,
+                                               const WorkloadInfo& info)
+    : BaseWorkload<LogicalBinaryQueueDescriptor>(descriptor, info)
+{
+    m_Data.ValidateInputsOutputs("NeonLogicalOrWorkload", 2, 1);
+
+    arm_compute::ITensor& input0 = PolymorphicDowncast<IAclTensorHandle*>(m_Data.m_Inputs[0])->GetTensor();
+    arm_compute::ITensor& input1 = PolymorphicDowncast<IAclTensorHandle*>(m_Data.m_Inputs[1])->GetTensor();
+    arm_compute::ITensor& output = PolymorphicDowncast<IAclTensorHandle*>(m_Data.m_Outputs[0])->GetTensor();
+
+    m_LogicalOrLayer.configure(&input0, &input1, &output);
+}
+
+void NeonLogicalOrWorkload::Execute() const
+{
+    ARMNN_SCOPED_PROFILING_EVENT_NEON("NeonLogicalOrWorkload_Execute");
+    m_LogicalOrLayer.run();
+}
+
+} // namespace armnn
diff --git a/src/backends/neon/workloads/NeonLogicalOrWorkload.hpp b/src/backends/neon/workloads/NeonLogicalOrWorkload.hpp
new file mode 100644
index 0000000..3b4ddb2
--- /dev/null
+++ b/src/backends/neon/workloads/NeonLogicalOrWorkload.hpp
@@ -0,0 +1,30 @@
+//
+// Copyright © 2020 Arm Ltd and Contributors. All rights reserved.
+// SPDX-License-Identifier: MIT
+//
+
+#pragma once
+
+#include <backendsCommon/Workload.hpp>
+
+#include <arm_compute/core/Error.h>
+#include <arm_compute/runtime/NEON/functions/NELogical.h>
+
+namespace armnn
+{
+
+arm_compute::Status NeonLogicalOrWorkloadValidate(const TensorInfo& input0,
+                                                  const TensorInfo& input1,
+                                                  const TensorInfo& output);
+
+class NeonLogicalOrWorkload : public BaseWorkload<LogicalBinaryQueueDescriptor>
+{
+public:
+    NeonLogicalOrWorkload(const LogicalBinaryQueueDescriptor& descriptor, const WorkloadInfo& info);
+    virtual void Execute() const override;
+
+private:
+    mutable arm_compute::NELogicalOr m_LogicalOrLayer;
+};
+
+} //namespace armnn
diff --git a/src/backends/neon/workloads/NeonWorkloads.hpp b/src/backends/neon/workloads/NeonWorkloads.hpp
index 590b6f7..1a17b9a 100644
--- a/src/backends/neon/workloads/NeonWorkloads.hpp
+++ b/src/backends/neon/workloads/NeonWorkloads.hpp
@@ -30,6 +30,9 @@
 #include "NeonGatherWorkload.hpp"
 #include "NeonInstanceNormalizationWorkload.hpp"
 #include "NeonL2NormalizationFloatWorkload.hpp"
+#include "NeonLogicalAndWorkload.hpp"
+#include "NeonLogicalNotWorkload.hpp"
+#include "NeonLogicalOrWorkload.hpp"
 #include "NeonLogSoftmaxWorkload.hpp"
 #include "NeonLstmFloatWorkload.hpp"
 #include "NeonMaximumWorkload.hpp"
diff --git a/src/backends/reference/RefLayerSupport.cpp b/src/backends/reference/RefLayerSupport.cpp
index b3feae6..bdaaafb 100644
--- a/src/backends/reference/RefLayerSupport.cpp
+++ b/src/backends/reference/RefLayerSupport.cpp
@@ -807,13 +807,29 @@
         DataType::Signed32
     };
 
+    std::array<DataType, 1> logicalSupportedTypes =
+    {
+        DataType::Boolean
+    };
+
     bool supported = true;
 
-    supported &= CheckSupportRule(TypeAnyOf(input, supportedTypes), reasonIfUnsupported,
-                                  "Reference elementwise unary: input type not supported");
+    if (descriptor.m_Operation == UnaryOperation::LogicalNot)
+    {
+        supported &= CheckSupportRule(TypeAnyOf(input, logicalSupportedTypes), reasonIfUnsupported,
+                                      "Reference elementwise unary: input type not supported");
 
-    supported &= CheckSupportRule(TypeAnyOf(output, supportedTypes), reasonIfUnsupported,
-                                  "Reference elementwise unary: output type not supported");
+        supported &= CheckSupportRule(TypeAnyOf(output, logicalSupportedTypes), reasonIfUnsupported,
+                                      "Reference elementwise unary: output type not supported");
+    }
+    else
+    {
+        supported &= CheckSupportRule(TypeAnyOf(input, supportedTypes), reasonIfUnsupported,
+                                      "Reference elementwise unary: input type not supported");
+
+        supported &= CheckSupportRule(TypeAnyOf(output, supportedTypes), reasonIfUnsupported,
+                                      "Reference elementwise unary: output type not supported");
+    }
 
     supported &= CheckSupportRule(TypesAreEqual(input, output), reasonIfUnsupported,
                                   "Reference elementwise unary: input and output types not matching");
@@ -1131,28 +1147,6 @@
     return supported;
 }
 
-bool RefLayerSupport::IsLogicalUnarySupported(const TensorInfo& input,
-                                              const TensorInfo& output,
-                                              const ElementwiseUnaryDescriptor& descriptor,
-                                              Optional<std::string&> reasonIfUnsupported) const
-{
-    IgnoreUnused(descriptor);
-
-    std::array<DataType, 1> supportedTypes =
-    {
-        DataType::Boolean
-    };
-
-    bool supported = true;
-    supported &= CheckSupportRule(TypeAnyOf(input, supportedTypes), reasonIfUnsupported,
-                                  "Reference LogicalUnary: input type not supported");
-
-    supported &= CheckSupportRule(TypesAreEqual(input, output), reasonIfUnsupported,
-                                  "Reference LogicalUnary: input and output types do not match");
-
-    return supported;
-}
-
 bool RefLayerSupport::IsLogSoftmaxSupported(const TensorInfo& input,
                                             const TensorInfo& output,
                                             const LogSoftmaxDescriptor& descriptor,
@@ -1720,7 +1714,7 @@
     IgnoreUnused(output);
     IgnoreUnused(descriptor);
     // Define supported output types.
-    std::array<DataType,7> supportedOutputTypes =
+    std::array<DataType,8> supportedOutputTypes =
     {
         DataType::BFloat16,
         DataType::Float32,
@@ -1728,7 +1722,8 @@
         DataType::Signed32,
         DataType::QAsymmS8,
         DataType::QAsymmU8,
-        DataType::QSymmS16
+        DataType::QSymmS16,
+        DataType::Boolean
     };
 
     return CheckSupportRule(TypeAnyOf(input, supportedOutputTypes), reasonIfUnsupported,
diff --git a/src/backends/reference/RefLayerSupport.hpp b/src/backends/reference/RefLayerSupport.hpp
index 318eb40..6b64408 100644
--- a/src/backends/reference/RefLayerSupport.hpp
+++ b/src/backends/reference/RefLayerSupport.hpp
@@ -188,11 +188,6 @@
                                   const LogicalBinaryDescriptor& descriptor,
                                   Optional<std::string&> reasonIfUnsupported) const override;
 
-    bool IsLogicalUnarySupported(const TensorInfo& input,
-                                 const TensorInfo& output,
-                                 const ElementwiseUnaryDescriptor& descriptor,
-                                 Optional<std::string&> reasonIfUnsupported) const override;
-
     bool IsLogSoftmaxSupported(const TensorInfo& input,
                                const TensorInfo& output,
                                const LogSoftmaxDescriptor& descriptor,
diff --git a/src/backends/reference/RefWorkloadFactory.cpp b/src/backends/reference/RefWorkloadFactory.cpp
index 9080028..468aeb3 100644
--- a/src/backends/reference/RefWorkloadFactory.cpp
+++ b/src/backends/reference/RefWorkloadFactory.cpp
@@ -307,6 +307,10 @@
 std::unique_ptr<IWorkload> RefWorkloadFactory::CreateElementwiseUnary(const ElementwiseUnaryQueueDescriptor& descriptor,
                                                                       const WorkloadInfo& info) const
 {
+    if (descriptor.m_Parameters.m_Operation == UnaryOperation::LogicalNot)
+    {
+        return std::make_unique<RefLogicalUnaryWorkload>(descriptor, info);
+    }
     return std::make_unique<RefElementwiseUnaryWorkload>(descriptor, info);
 }
 
@@ -407,13 +411,6 @@
     return std::make_unique<RefLogicalBinaryWorkload>(descriptor, info);
 }
 
-std::unique_ptr<IWorkload> RefWorkloadFactory::CreateLogicalUnary(const ElementwiseUnaryQueueDescriptor& descriptor,
-                                                                  const WorkloadInfo& info) const
-{
-    return std::make_unique<RefLogicalUnaryWorkload>(descriptor, info);
-}
-
-
 std::unique_ptr<IWorkload> RefWorkloadFactory::CreateLogSoftmax(const LogSoftmaxQueueDescriptor& descriptor,
                                                                 const WorkloadInfo& info) const
 {
diff --git a/src/backends/reference/RefWorkloadFactory.hpp b/src/backends/reference/RefWorkloadFactory.hpp
index 8c3d719..41cefd3 100644
--- a/src/backends/reference/RefWorkloadFactory.hpp
+++ b/src/backends/reference/RefWorkloadFactory.hpp
@@ -165,9 +165,6 @@
     std::unique_ptr<IWorkload> CreateLogicalBinary(const LogicalBinaryQueueDescriptor& descriptor,
                                                    const WorkloadInfo& info) const override;
 
-    std::unique_ptr<IWorkload> CreateLogicalUnary(const ElementwiseUnaryQueueDescriptor& descriptor,
-                                                  const WorkloadInfo& info) const override;
-
     std::unique_ptr<IWorkload> CreateLogSoftmax(const LogSoftmaxQueueDescriptor& descriptor,
                                                 const WorkloadInfo& info) const override;
 
diff --git a/src/backends/reference/test/RefLayerTests.cpp b/src/backends/reference/test/RefLayerTests.cpp
index 60400c5..be95ad7 100644
--- a/src/backends/reference/test/RefLayerTests.cpp
+++ b/src/backends/reference/test/RefLayerTests.cpp
@@ -1386,6 +1386,7 @@
 ARMNN_AUTO_TEST_CASE_WITH_THF(SimpleReshapeQuantisedAsymm8, SimpleReshapeTest<DataType::QAsymmU8>)
 ARMNN_AUTO_TEST_CASE_WITH_THF(SimpleReshapeQuantisedSymm16, SimpleReshapeTest<DataType::QSymmS16>)
 ARMNN_AUTO_TEST_CASE_WITH_THF(Reshape5d, Reshape5dTest<DataType::Float32>)
+ARMNN_AUTO_TEST_CASE_WITH_THF(ReshapeBoolean, ReshapeBooleanTest)
 
 // Rsqrt
 ARMNN_AUTO_TEST_CASE_WITH_THF(Rsqrt2d, Rsqrt2dTest<DataType::Float32>)
diff --git a/tests/ExecuteNetwork/ExecuteNetwork.cpp b/tests/ExecuteNetwork/ExecuteNetwork.cpp
index ba7ce29..be341b6 100644
--- a/tests/ExecuteNetwork/ExecuteNetwork.cpp
+++ b/tests/ExecuteNetwork/ExecuteNetwork.cpp
@@ -88,57 +88,50 @@
         if (params.m_InputTypes[inputIndex].compare("float") == 0)
         {
             auto inputData = tfLiteInterpreter->typed_tensor<float>(input);
-            TContainer tensorData;
-            PopulateTensorWithData(tensorData,
-                                   params.m_InputTensorShapes[inputIndex]->GetNumElements(),
-                                   params.m_InputTypes[inputIndex],
-                                   armnn::EmptyOptional(),
-                                   dataFile);
+            std::vector<float> tensorData;
+            PopulateTensorWithDataGeneric<float>(tensorData,
+                                                  params.m_InputTensorShapes[inputIndex]->GetNumElements(),
+                                                  dataFile,
+                                                  [](const std::string& s)
+                                                  { return std::stof(s); });
 
-            mapbox::util::apply_visitor([&](auto&& value)
-            {
-                for (unsigned int i = 0; i < inputSize; ++i)
-                {
-                    inputData[i] = value.data()[i];
-                }
-            },
-            tensorData);
+            std::copy(tensorData.begin(), tensorData.end(), inputData);
+        }
+        else if (params.m_InputTypes[inputIndex].compare("int8") == 0)
+        {
+            auto inputData = tfLiteInterpreter->typed_tensor<int8_t>(input);
+            std::vector<int8_t> tensorData;
+            PopulateTensorWithDataGeneric<int8_t>(tensorData,
+                                                  params.m_InputTensorShapes[inputIndex]->GetNumElements(),
+                                                  dataFile,
+                                                  [](const std::string& s)
+                                                  { return armnn::numeric_cast<int8_t>(std::stoi(s)); });
+
+            std::copy(tensorData.begin(), tensorData.end(), inputData);
         }
         else if (params.m_InputTypes[inputIndex].compare("int") == 0)
         {
             auto inputData = tfLiteInterpreter->typed_tensor<int32_t>(input);
-            TContainer tensorData;
-            PopulateTensorWithData(tensorData,
-                                   params.m_InputTensorShapes[inputIndex]->GetNumElements(),
-                                   params.m_InputTypes[inputIndex],
-                                   armnn::EmptyOptional(),
-                                   dataFile);
-            mapbox::util::apply_visitor([&](auto&& value)
-            {
-                for (unsigned int i = 0; i < inputSize; ++i)
-                {
-                    inputData[i] = value.data()[i];
-                }
-            },
-            tensorData);
+            std::vector<int32_t> tensorData;
+            PopulateTensorWithDataGeneric<int32_t>(tensorData,
+                                                   params.m_InputTensorShapes[inputIndex]->GetNumElements(),
+                                                   dataFile,
+                                                   [](const std::string& s)
+                                                   { return std::stoi(s); });
+
+            std::copy(tensorData.begin(), tensorData.end(), inputData);
         }
         else if (params.m_InputTypes[inputIndex].compare("qasymm8") == 0)
         {
             auto inputData = tfLiteInterpreter->typed_tensor<uint8_t>(input);
-            TContainer tensorData;
-            PopulateTensorWithData(tensorData,
-                                   params.m_InputTensorShapes[inputIndex]->GetNumElements(),
-                                   params.m_InputTypes[inputIndex],
-                                   armnn::EmptyOptional(),
-                                   dataFile);
-            mapbox::util::apply_visitor([&](auto&& value)
-            {
-                for (unsigned int i = 0; i < inputSize; ++i)
-                {
-                    inputData[i] = value.data()[i];
-                }
-            },
-            tensorData);
+            std::vector<uint8_t> tensorData;
+            PopulateTensorWithDataGeneric<uint8_t>(tensorData,
+                                                   params.m_InputTensorShapes[inputIndex]->GetNumElements(),
+                                                   dataFile,
+                                                   [](const std::string& s)
+                                                   { return armnn::numeric_cast<uint8_t>(std::stoi(s)); });
+
+            std::copy(tensorData.begin(), tensorData.end(), inputData);
         }
         else
         {
@@ -203,6 +196,25 @@
                     }
                 }
             }
+            else if (params.m_OutputTypes[outputIndex].compare("int8") == 0)
+            {
+                auto tfLiteDelegateOutputData = tfLiteInterpreter->typed_tensor<int8_t>(tfLiteDelegateOutputId);
+                if(tfLiteDelegateOutputData == NULL)
+                {
+                    ARMNN_LOG(fatal) << "Output tensor is null, output type: "
+                                        "\"" << params.m_OutputTypes[outputIndex] << "\" may be incorrect.";
+                    return EXIT_FAILURE;
+                }
+
+                for (int i = 0; i < outputSize; ++i)
+                {
+                    std::cout << signed(tfLiteDelegateOutputData[i]) << ", ";
+                    if (i % 60 == 0)
+                    {
+                        std::cout << std::endl;
+                    }
+                }
+            }
             else if (params.m_OutputTypes[outputIndex].compare("qasymm8") == 0)
             {
                 auto tfLiteDelageOutputData = tfLiteInterpreter->typed_tensor<uint8_t>(tfLiteDelegateOutputId);
diff --git a/tests/NetworkExecutionUtils/NetworkExecutionUtils.cpp b/tests/NetworkExecutionUtils/NetworkExecutionUtils.cpp
index 3e7c87d..2afd941 100644
--- a/tests/NetworkExecutionUtils/NetworkExecutionUtils.cpp
+++ b/tests/NetworkExecutionUtils/NetworkExecutionUtils.cpp
@@ -25,36 +25,6 @@
 #include "armnnOnnxParser/IOnnxParser.hpp"
 #endif
 
-
-template<typename T, typename TParseElementFunc>
-std::vector<T> ParseArrayImpl(std::istream& stream, TParseElementFunc parseElementFunc, const char* chars = "\t ,:")
-{
-    std::vector<T> result;
-    // Processes line-by-line.
-    std::string line;
-    while (std::getline(stream, line))
-    {
-        std::vector<std::string> tokens = armnn::stringUtils::StringTokenizer(line, chars);
-        for (const std::string& token : tokens)
-        {
-            if (!token.empty()) // See https://stackoverflow.com/questions/10437406/
-            {
-                try
-                {
-                    result.push_back(parseElementFunc(token));
-                }
-                catch (const std::exception&)
-                {
-                    ARMNN_LOG(error) << "'" << token << "' is not a valid number. It has been ignored.";
-                }
-            }
-        }
-    }
-
-    return result;
-}
-
-
 template<armnn::DataType NonQuantizedType>
 auto ParseDataArray(std::istream& stream);
 
diff --git a/tests/NetworkExecutionUtils/NetworkExecutionUtils.hpp b/tests/NetworkExecutionUtils/NetworkExecutionUtils.hpp
index 9d9e616..742f968 100644
--- a/tests/NetworkExecutionUtils/NetworkExecutionUtils.hpp
+++ b/tests/NetworkExecutionUtils/NetworkExecutionUtils.hpp
@@ -7,10 +7,13 @@
 
 #include <armnn/IRuntime.hpp>
 #include <armnn/Types.hpp>
+#include <armnn/Logging.hpp>
+#include <armnn/utility/StringUtils.hpp>
 
 #include <mapbox/variant.hpp>
 
 #include <iostream>
+#include <fstream>
 
 
 std::vector<unsigned int> ParseArray(std::istream& stream);
@@ -68,4 +71,51 @@
  * @param expectFile bool - If true, checks for a regular file.
  * @return bool - True if all given strings are valid paths., false otherwise.
  * */
-bool ValidatePaths(const std::vector<std::string>& fileVec, const bool expectFile);
\ No newline at end of file
+bool ValidatePaths(const std::vector<std::string>& fileVec, const bool expectFile);
+
+template<typename T, typename TParseElementFunc>
+std::vector<T> ParseArrayImpl(std::istream& stream, TParseElementFunc parseElementFunc, const char* chars = "\t ,:")
+{
+    std::vector<T> result;
+    // Processes line-by-line.
+    std::string line;
+    while (std::getline(stream, line))
+    {
+        std::vector<std::string> tokens = armnn::stringUtils::StringTokenizer(line, chars);
+        for (const std::string& token : tokens)
+        {
+            if (!token.empty()) // See https://stackoverflow.com/questions/10437406/
+            {
+                try
+                {
+                    result.push_back(parseElementFunc(token));
+                }
+                catch (const std::exception&)
+                {
+                    ARMNN_LOG(error) << "'" << token << "' is not a valid number. It has been ignored.";
+                }
+            }
+        }
+    }
+
+    return result;
+}
+
+template <typename T, typename TParseElementFunc>
+void PopulateTensorWithDataGeneric(std::vector<T>& tensorData,
+                                   unsigned int numElements,
+                                   const armnn::Optional<std::string>& dataFile,
+                                   TParseElementFunc parseFunction)
+{
+    const bool readFromFile = dataFile.has_value() && !dataFile.value().empty();
+
+    std::ifstream inputTensorFile;
+    if (readFromFile)
+    {
+        inputTensorFile = std::ifstream(dataFile.value());
+    }
+
+    tensorData = readFromFile ?
+                 ParseArrayImpl<T>(inputTensorFile, parseFunction) :
+                 std::vector<T>(numElements, static_cast<T>(0));
+}