Merge internal changes manually

- Error-handling fixes
- Low-level support for comparing against the GPU delegate
- Run inference at least 1000 times

Test: ./build_and_run_benchmark.sh scoring

Change-Id: Ic393ff9e067bc0084afa9db4478f9afca32e76f1
14 files changed