add SkVM_Overhead bench, simple improvements

This new bench lets us measure the overhead of program building,
optimization, and JITting.  Surprisingly, at head the optimization in
Builder::done() takes longer than the JIT.

The new bench clocks in around 40µs on my laptop at head,
  then 32µs after switching val_to_reg to be an std::vector,
  then 27µs after switching deaths to be an std::vector too,
  then 22µs after switching fIndex to be an SkTHashMap,
  then 20µs after calling program.reserve(fProgram.size()),
  then 19µs after switching JIT data maps to SkTHashMap too.
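
For illustration only, here is a minimal sketch of the kind of change
behind the first few steps: replacing a node-based map keyed by dense
IDs with a flat std::vector, and reserving capacity when the final size
is known.  The names Val, Reg, kNoReg, and the three helper functions
are hypothetical stand-ins, not the actual SkVM code.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins: a value ID is assumed to be a dense index into
// the program, and a register a small integer.  Not taken from SkVM.
using Val = int;
using Reg = int;
constexpr Reg kNoReg = -1;

// Before: a node-based map pays for hashing and an allocation per entry.
std::unordered_map<Val, Reg> assign_with_map(int n) {
    std::unordered_map<Val, Reg> val_to_reg;
    for (Val v = 0; v < n; v++) {
        val_to_reg[v] = v % 16;  // arbitrary placeholder assignment
    }
    return val_to_reg;
}

// After: dense keys can index a flat vector that is sized once up front.
std::vector<Reg> assign_with_vector(int n) {
    std::vector<Reg> val_to_reg(static_cast<std::size_t>(n), kNoReg);
    for (Val v = 0; v < n; v++) {
        val_to_reg[static_cast<std::size_t>(v)] = v % 16;
    }
    return val_to_reg;
}

// reserve() avoids repeated reallocation when the final size is known,
// in the same spirit as program.reserve(fProgram.size()).
std::vector<int> copy_with_reserve(const std::vector<int>& src) {
    std::vector<int> dst;
    dst.reserve(src.size());
    for (int x : src) {
        dst.push_back(x);
    }
    return dst;
}
```

Both lookups return the same assignments; the vector version just
trades hashing and per-node allocation for a single contiguous buffer,
which only works when the keys are small, dense integers.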

I tried swapping some std::vector for SkTDArray, which gave no benefit
and was actually slightly slower.  So I think this is roughly all the
low-hanging fruit, with time now split roughly equally between
Builder::done(), JITting in Program::eval(), and the original Builder
calls themselves.

Also disable perf dumps on Mac.  No real value there until I can dump a
dylib, and it's just one more thing I have to remember to disable before
running this sort of benchmark.

Change-Id: I1c6e58ed00ac94ad622c7d740712634f60787102
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/222984
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
3 files changed