patch from mcculls:
Rolling back one of the optimizations to avoid leaving garbage behind in a ThreadLocal. The leftover reference can cause the thread to strongly hold the injector, which can prevent the application from being unloaded.

Regrettably, this costs about 12% in throughput. Even so, we still create almost a million objects per second on my laptop, so we're good.
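
For illustration only, here is a minimal sketch of the general pattern of not leaving a reference behind in a ThreadLocal. It is not the code from this patch. The class ScopedContext and its methods callWith and hasLeftover are hypothetical names, and the "context" value merely stands in for anything that reaches the injector.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of the patch: holds a per-thread value only
// for the duration of a call and removes it afterwards.
final class ScopedContext {
    // Per-thread slot. Anything left here stays strongly reachable from the
    // thread for as long as the thread lives (e.g. a pooled worker thread).
    private static final ThreadLocal<Object[]> SLOT = new ThreadLocal<>();

    private ScopedContext() {}

    // Runs body with context installed for the current thread, then clears
    // the slot so the thread no longer references context.
    static <T> T callWith(Object context, Supplier<T> body) {
        SLOT.set(new Object[] { context });
        try {
            return body.get();
        } finally {
            // Drop the entry instead of caching it for reuse. This costs an
            // allocation per call but leaves no garbage behind.
            SLOT.remove();
        }
    }

    // True if the current thread still holds a value in the slot.
    static boolean hasLeftover() {
        return SLOT.get() != null;
    }
}
```

Reusing the per-thread array across calls would save the allocation, which is presumably the kind of saving the rolled-back optimization was after, but it would also leave the reference in place between calls.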

