Allow BATCH_MATMUL with different input and output scales.

The CPU reference implementation already supports this, so the only
changes needed are to relax the validation and update the test.
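
For illustration only, here is a minimal sketch of why differing scales
need no special handling in the computation itself: the integer
accumulator is rescaled once by lhs_scale * rhs_scale / out_scale before
the output zero point is added. This is not the NNAPI reference code.
The function name, the nested-list layout, the int8 range, and the
float-based rescaling are all assumptions made for the example; a real
kernel may use fixed-point multipliers and a different rounding mode.

```python
def quantized_batch_matmul(lhs, rhs, lhs_scale, lhs_zp,
                           rhs_scale, rhs_zp, out_scale, out_zp):
    """Hypothetical int8 batch matmul with independent quantization params.

    lhs is [batch][m][k] and rhs is [batch][k][n], as nested lists of ints.
    Each real value is modeled as scale * (q - zero_point).
    """
    # One combined factor maps the integer accumulator to the output scale.
    multiplier = lhs_scale * rhs_scale / out_scale
    out = []
    for a, b in zip(lhs, rhs):
        rows = []
        for i in range(len(a)):
            row = []
            for j in range(len(b[0])):
                # Accumulate in plain integers after removing zero points.
                acc = sum((a[i][k] - lhs_zp) * (b[k][j] - rhs_zp)
                          for k in range(len(b)))
                # Python's round() rounds ties to even (an assumption here).
                q = round(acc * multiplier) + out_zp
                # Saturate to the int8 range.
                row.append(max(-128, min(127, q)))
            rows.append(row)
        out.append(rows)
    return out


lhs = [[[2, 4]]]    # batch=1, m=1, k=2
rhs = [[[1], [3]]]  # batch=1, k=2, n=1
# Output scale differs from both input scales.
result = quantized_batch_matmul(lhs, rhs, 0.5, 0, 0.25, 0, 0.125, 0)
```

In this sketch only the multiplier depends on the output scale, so
choosing an output scale different from the input scales changes the
rescaling factor and nothing else.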

Also update the sample shim driver prebuilts.

Cherry-picked from I4513c2c73d6d920378e32ee8491bb642796a386d

Bug: 206089870
Test: NNT_static
Test: VtsHalNeuralnetworksTargetTest
Change-Id: I4513c2c73d6d920378e32ee8491bb642796a386d
7 files changed