If a general reduction kernel lacks a combiner function, synthesize one.

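The CPU reference driver can run a reduction kernel on multiple threads
only if that kernel has a combiner function, because the partial
accumulators produced by the individual threads must be merged into a
single result.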
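
As a rough illustration of why the combiner matters, here is a minimal
sketch in plain C (not RenderScript, and not the code in this change).
All names (accum_sum, combine_sum, reduce_slice, reduce_in_two_chunks)
are hypothetical. It assumes the simple case where the accumulator type
equals the input type, so a combiner can be synthesized by feeding the
other accumulator through the accumulator function as if it were an
input. Whether the change handles other cases is not stated here.

```c
/* Hypothetical accumulator: folds one input element into *accum. */
static void accum_sum(int *accum, int in) {
    *accum += in;
}

/*
 * Hypothetical synthesized combiner. Assumption: the input type and the
 * accumulator type are both int, so the other partial accumulator can be
 * treated as just another input to the accumulator function.
 */
static void combine_sum(int *accum, const int *other) {
    accum_sum(accum, *other);
}

/* Reduce data[begin, end) into a fresh accumulator, as one thread would. */
static int reduce_slice(const int *data, int begin, int end) {
    int accum = 0; /* identity value for a sum */
    for (int i = begin; i < end; ++i) {
        accum_sum(&accum, data[i]);
    }
    return accum;
}

/*
 * Stand-in for a multithreaded run: each half is reduced independently
 * (sequentially here, for simplicity), then the two partial accumulators
 * are merged with the combiner.
 */
int reduce_in_two_chunks(const int *data, int n) {
    int mid = n / 2;
    int a = reduce_slice(data, 0, mid);
    int b = reduce_slice(data, mid, n);
    combine_sum(&a, &b);
    return a;
}
```

Without combine_sum there would be no way to merge a and b, which is
the situation that forces a single-threaded run.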

Bug: 27299475

Change-Id: If7f3a9ba8ec5e15ed5f3ef96968d28d650b01c20
(cherry picked from commit 57fd9f882f3359be4201c42b02aebf785d311df2)
4 files changed