thread-local caches?

Here's another idea: these SkVMBlitter program caches are probably best
made thread-local.  If a bunch of threads are doing the same work with
the same program, they no longer fight over that program's one slot in
a shared cache, and if a bunch of threads are doing _different_ work,
each gets the best cache behavior when its LRU isn't churned by other
threads' programs.  Either way, seems like a win?

(I've kept the try-acquire/release pattern just to keep the focus of
this change clear.  We can fold it through more if we like this approach.)
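
For illustration only, here is a rough sketch of the shape described
above: one small LRU per thread, still used through a try-acquire/release
pair.  This is not the code in this change.  Program, Key, ProgramCache,
threadLocalCache() and the capacity of 8 are all made-up names and values
for the sketch, not Skia APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled blitter program.
struct Program {
    std::string name;
};

// Hypothetical cache key, e.g. a hash of the blitter's parameters.
using Key = uint64_t;

// A tiny LRU, most recently released entry at the front.
// No mutex: each thread only ever touches its own instance.
class ProgramCache {
public:
    explicit ProgramCache(size_t capacity) : fCapacity(capacity) {}

    // Take the program for `key` out of the cache, or return nullptr on a miss.
    std::unique_ptr<Program> tryAcquire(Key key) {
        for (auto it = fEntries.begin(); it != fEntries.end(); ++it) {
            if (it->first == key) {
                std::unique_ptr<Program> program = std::move(it->second);
                fEntries.erase(it);
                return program;
            }
        }
        return nullptr;
    }

    // Hand a program back, evicting the least recently used entry if full.
    void release(Key key, std::unique_ptr<Program> program) {
        // Drop any stale entry for the same key so keys stay unique.
        fEntries.remove_if([key](const std::pair<Key, std::unique_ptr<Program>>& e) {
            return e.first == key;
        });
        fEntries.emplace_front(key, std::move(program));
        if (fEntries.size() > fCapacity) {
            fEntries.pop_back();
        }
    }

    size_t size() const { return fEntries.size(); }

private:
    size_t fCapacity;
    std::list<std::pair<Key, std::unique_ptr<Program>>> fEntries;
};

// One cache per thread, created lazily on first use in that thread.
ProgramCache& threadLocalCache() {
    thread_local ProgramCache cache(8);  // capacity chosen arbitrarily here
    return cache;
}
```

A caller would tryAcquire() from threadLocalCache(), build the program on
a miss, use it, and release() it when done.  Since no other thread can
see this cache, an acquire only misses when this thread has not cached
the program yet or has evicted it.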

Change-Id: Ib1ee270069c48446845ce27225652896661c5dfe
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/233060
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
1 file changed