Sync-patch with libwebp ver 0.4.1

Sync-patch with libwebp ver 0.4.1-rc1 (change#I5346984d2).
  - NEON assembly optimizations:
    - ~25% faster lossy decode / encode (-m 4)
    - ~10% faster lossless decode
    - ~5-10% faster lossless encode (-m 3/4)
  - AArch64 (arm64) & MIPS support/optimizations.
Tracking bug: b/16624377

Ran the following CTS tests on N7 (Razor/flo) & N8 (Volantis/flounder); all passed.

cts-tradefed run cts -d -c android.graphics.cts.BitmapTest
cts-tradefed run cts -d -c android.graphics.cts.BitmapFactoryTest
cts-tradefed run cts -d -c android.graphics.cts.BitmapRegionDecoderTest
cts-tradefed run cts -d -c android.graphics.cts.Bitmap_CompressFormatTest
cts-tradefed run cts -d -c android.graphics.cts.Bitmap_ConfigTest
cts-tradefed run cts -d -c android.graphics.cts.BitmapFactory_OptionsTest
cts-tradefed run cts -d -c android.graphics.cts.BitmapShaderTest

Change-Id: Idf2756b8881d10001c0663bca454aac86ab30a39
93 files changed