Snap for 4632767 from f5c8ca7bb600d1e5488f65c6d5447b19b18d899c to pi-release

Change-Id: I9b0f10add399fcdff158913499780f5ef46874b5
diff --git a/Android.bp b/Android.bp
index 8b1438d..1c5d924 100644
--- a/Android.bp
+++ b/Android.bp
@@ -6,6 +6,7 @@
         "-O3",
         "-fstrict-aliasing",
 
+        "-Wno-sign-compare",
         "-Wno-unused-parameter",
         "-Werror",
     ],
diff --git a/BUILDING.md b/BUILDING.md
index 4a147ce..2725f30 100644
--- a/BUILDING.md
+++ b/BUILDING.md
@@ -1,5 +1,5 @@
-Building on Un*x Platforms (including Cygwin and OS X)
-=======================================================
+Un*x Platforms (including Mac and Cygwin)
+=========================================
 
 
 Build Requirements
@@ -10,21 +10,21 @@
 - libtool 1.4 or later
   * If using Xcode 4.3 or later on OS X, autoconf and automake are no longer
     provided.  The easiest way to obtain them is from
-    [MacPorts](http://www.MacPorts.org).
+    [MacPorts](http://www.MacPorts.org) or [Homebrew](http://brew.sh/).
 
-- NASM or YASM (if building x86 or x86-64 SIMD extensions)
+- [NASM](http://www.nasm.us) or [YASM](http://yasm.tortall.net)
+  (if building x86 or x86-64 SIMD extensions)
   * If using NASM, 0.98, or 2.01 or later is required for an x86 build (0.99
     and 2.00 do not work properly with libjpeg-turbo's x86 SIMD code.)
   * If using NASM, 2.00 or later is required for an x86-64 build.
   * If using NASM, 2.07 or later (except 2.11.08) is required for an x86-64
     Mac build (2.11.08 does not work properly with libjpeg-turbo's x86-64 SIMD
     code when building macho64 objects.)  NASM or YASM can be obtained from
-    [MacPorts](http://www.macports.org/).
+    [MacPorts](http://www.macports.org/) or [Homebrew](http://brew.sh/).
 
   The binary RPMs released by the NASM project do not work on older Linux
-  systems, such as Red Hat Enterprise Linux 4.  On such systems, you can
-   easily build and install NASM from a source RPM by downloading one of the
-  SRPMs from
+  systems, such as Red Hat Enterprise Linux 5.  On such systems, you can easily
+  build and install NASM from a source RPM by downloading one of the SRPMs from
 
   <http://www.nasm.us/pub/nasm/releasebuilds>
 
@@ -36,13 +36,13 @@
 
   NOTE: the NASM build will fail if texinfo is not installed.
 
-- GCC v4.1 (or later) or clang recommended for best performance
+- GCC v4.1 (or later) or Clang recommended for best performance
 
 - If building the TurboJPEG Java wrapper, JDK or OpenJDK 1.5 or later is
-  required.  Some systems, such as Solaris 10 and later and Red Hat Enterprise
-  Linux 5 and later, have this pre-installed.  On OS X 10.5 and 10.6, it will
-  be necessary to install the Java Developer Package, which can be downloaded
-  from <http://developer.apple.com/downloads> (Apple ID required.)  For other
+  required.  Most modern Linux distributions, as well as Solaris 10 and later,
+  include JDK or OpenJDK.  On OS X 10.5 and 10.6, it will be necessary to
+  install the Java Developer Package, which can be downloaded from
+  <http://developer.apple.com/downloads> (Apple ID required.)  For other
   systems, you can obtain the Oracle Java Development Kit from
   <http://www.java.com>.
 
@@ -50,23 +50,22 @@
 Out-of-Tree Builds
 ------------------
 
-Binary objects, libraries, and executables are generated in the same directory
-from which `configure` was executed (the "binary directory"), and this
-directory need not necessarily be the same as the libjpeg-turbo source
-directory.  You can create multiple independent binary directories, in which
-different versions of libjpeg-turbo can be built from the same source tree
-using different compilers or settings.  In the sections below,
-*{build_directory}* refers to the binary directory, whereas
-*{source_directory}* refers to the libjpeg-turbo source directory.  For in-tree
-builds, these directories are the same.
+Binary objects, libraries, and executables are generated in the directory from
+which `configure` is executed (the "binary directory"), and this directory need
+not necessarily be the same as the libjpeg-turbo source directory.  You can
+create multiple independent binary directories, in which different versions of
+libjpeg-turbo can be built from the same source tree using different compilers
+or settings.  In the sections below, *{build_directory}* refers to the binary
+directory, whereas *{source_directory}* refers to the libjpeg-turbo source
+directory.  For in-tree builds, these directories are the same.
 
 
-Building libjpeg-turbo
-----------------------
+Build Procedure
+---------------
 
-The following procedure will build libjpeg-turbo on Linux, FreeBSD, Cygwin, and
-Solaris/x86 systems (on Solaris, this generates a 32-bit library.  See below
-for 64-bit build instructions.)
+The following procedure will build libjpeg-turbo on Unix and Unix-like systems.
+(On Solaris, this generates a 32-bit build.  See "Build Recipes" below for
+64-bit build instructions.)
 
     cd {source_directory}
     autoreconf -fiv
@@ -77,40 +76,40 @@
 NOTE: Running autoreconf in the source directory is not necessary if building
 libjpeg-turbo from one of the official release tarballs.
 
-This will generate the following files under .libs/:
+This will generate the following files under **.libs/**:
 
-**libjpeg.a**  
+**libjpeg.a**<br>
 Static link library for the libjpeg API
 
-**libjpeg.so.{version}** (Linux, Unix)  
-**libjpeg.{version}.dylib** (OS X)  
-**cygjpeg-{version}.dll** (Cygwin)  
+**libjpeg.so.{version}** (Linux, Unix)<br>
+**libjpeg.{version}.dylib** (Mac)<br>
+**cygjpeg-{version}.dll** (Cygwin)<br>
 Shared library for the libjpeg API
 
-By default, *{version}* is 62.1.0, 7.1.0, or 8.0.2, depending on whether
+By default, *{version}* is 62.2.0, 7.2.0, or 8.1.2, depending on whether
 libjpeg v6b (default), v7, or v8 emulation is enabled.  If using Cygwin,
 *{version}* is 62, 7, or 8.
 
-**libjpeg.so** (Linux, Unix)  
-**libjpeg.dylib** (OS X)  
+**libjpeg.so** (Linux, Unix)<br>
+**libjpeg.dylib** (Mac)<br>
 Development symlink for the libjpeg API
 
-**libjpeg.dll.a** (Cygwin)  
+**libjpeg.dll.a** (Cygwin)<br>
 Import library for the libjpeg API
 
-**libturbojpeg.a**  
+**libturbojpeg.a**<br>
 Static link library for the TurboJPEG API
 
-**libturbojpeg.so.0.1.0** (Linux, Unix)  
-**libturbojpeg.0.1.0.dylib** (OS X)  
-**cygturbojpeg-0.dll** (Cygwin)  
+**libturbojpeg.so.0.1.0** (Linux, Unix)<br>
+**libturbojpeg.0.1.0.dylib** (Mac)<br>
+**cygturbojpeg-0.dll** (Cygwin)<br>
 Shared library for the TurboJPEG API
 
-**libturbojpeg.so** (Linux, Unix)  
-**libturbojpeg.dylib** (OS X)  
+**libturbojpeg.so** (Linux, Unix)<br>
+**libturbojpeg.dylib** (Mac)<br>
 Development symlink for the TurboJPEG API
 
-**libturbojpeg.dll.a** (Cygwin)  
+**libturbojpeg.dll.a** (Cygwin)<br>
 Import library for the TurboJPEG API
 
 
@@ -120,7 +119,7 @@
 libjpeg-turbo that is API/ABI-compatible with libjpeg v7.  Add `--with-jpeg8`
 to the `configure` command to build a version of libjpeg-turbo that is
 API/ABI-compatible with libjpeg v8.  See [README.md](README.md) for more
-information on libjpeg v7 and v8 emulation.
+information about libjpeg v7 and v8 emulation.
 
 
 ### In-Memory Source/Destination Managers
@@ -146,46 +145,19 @@
 ### TurboJPEG Java Wrapper
 
 Add `--with-java` to the `configure` command line to incorporate an optional
-Java Native Interface wrapper into the TurboJPEG shared library and build the
-Java front-end classes to support it.  This allows the TurboJPEG shared library
-to be used directly from Java applications.  See [java/README](java/README) for
-more details.
+Java Native Interface (JNI) wrapper into the TurboJPEG shared library and build
+the Java front-end classes to support it.  This allows the TurboJPEG shared
+library to be used directly from Java applications.  See
+[java/README](java/README) for more details.
 
 You can set the `JAVAC`, `JAR`, and `JAVA` configure variables to specify
 alternate commands for javac, jar, and java (respectively.)  You can also
 set the `JAVACFLAGS` configure variable to specify arguments that should be
-passed to the Java compiler when building the front-end classes, and
+passed to the Java compiler when building the TurboJPEG classes, and
 `JNI_CFLAGS` to specify arguments that should be passed to the C compiler when
 building the JNI wrapper.  Run `configure --help` for more details.
 
 
-Installing libjpeg-turbo
-------------------------
-
-If you intend to install these libraries and the associated header files, then
-replace 'make' in the instructions above with
-
-    make install prefix={base dir} libdir={library directory}
-
-For example,
-
-    make install prefix=/usr/local libdir=/usr/local/lib64
-
-will install the header files in /usr/local/include and the library files in
-/usr/local/lib64.  If `prefix` and `libdir` are not specified, then the default
-is to install the header files in /opt/libjpeg-turbo/include and the library
-files in /opt/libjpeg-turbo/lib32 (32-bit) or /opt/libjpeg-turbo/lib64
-(64-bit.)
-
-NOTE: You can specify a prefix of /usr and a libdir of, for instance,
-/usr/lib64 to overwrite the system's version of libjpeg.  If you do this,
-however, then be sure to BACK UP YOUR SYSTEM'S INSTALLATION OF LIBJPEG before
-overwriting it.  It is recommended that you instead install libjpeg-turbo into
-a non-system directory and manipulate the `LD_LIBRARY_PATH` or create symlinks
-to force applications to use libjpeg-turbo instead of libjpeg.  See
-[README.md](README.md) for more information.
-
-
 Build Recipes
 -------------
 
@@ -205,8 +177,9 @@
 
     --host x86_64-apple-darwin NASM=/opt/local/bin/nasm
 
-to the `configure` command line.  NASM 2.07 or later from MacPorts must be
-installed.
+to the `configure` command line.  NASM 2.07 or later from MacPorts or Homebrew
+must be installed.  If using Homebrew, then replace `/opt/local` with
+`/usr/local`.
 
 
 ### 32-bit Build on 64-bit OS X
@@ -226,8 +199,9 @@
       CFLAGS='-mmacosx-version-min=10.5 -O3' \
       LDFLAGS='-mmacosx-version-min=10.5'
 
-to the `configure` command line.  NASM 2.07 or later from MacPorts must be
-installed.
+to the `configure` command line.  NASM 2.07 or later from MacPorts or Homebrew
+must be installed.  If using Homebrew, then replace `/opt/local` with
+`/usr/local`.
 
 
 ### 32-bit Backward-Compatible Build on OS X
@@ -254,8 +228,7 @@
 
 Add
 
-    --host i386-unknown-freebsd CC='gcc -B /usr/lib32' CFLAGS='-O3 -m32' \
-      LDFLAGS='-B/usr/lib32'
+    --host i386-unknown-freebsd CFLAGS='-O3 -m32' LDFLAGS=-m32
 
 to the `configure` command line.  NASM 2.07 or later from FreeBSD ports must be
 installed.
@@ -282,153 +255,144 @@
 Use CMake (see recipes below)
 
 
-ARM Support
------------
+Building libjpeg-turbo for iOS
+------------------------------
 
-This release of libjpeg-turbo can use ARM NEON SIMD instructions to accelerate
-JPEG compression/decompression by approximately 2-4x on ARMv7 and later
-platforms.  If libjpeg-turbo is configured on an ARM Linux platform, then the
-build system will automatically include the NEON SIMD routines, if they are
-supported.  Build instructions for other ARM-based platforms follow.
+iOS platforms, such as the iPhone and iPad, use ARM processors, and all
+currently supported models include NEON instructions.  Thus, they can take
+advantage of libjpeg-turbo's SIMD extensions to significantly accelerate JPEG
+compression/decompression.  This section describes how to build libjpeg-turbo
+for these platforms.
 
 
-### Building libjpeg-turbo for iOS
+### Additional build requirements
 
-iOS platforms, such as the iPhone and iPad, use ARM processors, some of which
-support NEON instructions.  Additional steps are required in order to build
-libjpeg-turbo for these platforms.
+- For configurations that require
+  [gas-preprocessor.pl](https://raw.githubusercontent.com/libjpeg-turbo/gas-preprocessor/master/gas-preprocessor.pl),
+  it should be installed in your `PATH`.
 
 
-#### Additional build requirements
+### ARMv7 (32-bit)
 
-- [gas-preprocessor.pl]
-  (https://raw.githubusercontent.com/libjpeg-turbo/gas-preprocessor/master/gas-preprocessor.pl)
-  should be installed in your `PATH`.
+**gas-preprocessor.pl required**
 
+The following scripts demonstrate how to build libjpeg-turbo to run on the
+iPhone 3GS-4S/iPad 1st-3rd Generation and newer:
 
-#### ARM 32-bit Build (Xcode 4.6.x and earlier, LLVM-GCC)
+#### Xcode 4.2 and earlier (LLVM-GCC)
 
-Set the following shell variables for simplicity:
+    IOS_PLATFORMDIR=/Developer/Platforms/iPhoneOS.platform
+    IOS_SYSROOT=($IOS_PLATFORMDIR/Developer/SDKs/iPhoneOS*.sdk)
 
-  *Xcode 4.2 and earlier*
+    export host_alias=arm-apple-darwin10
+    export CC=${IOS_PLATFORMDIR}/Developer/usr/bin/arm-apple-darwin10-llvm-gcc-4.2
+    export CFLAGS="-mfloat-abi=softfp -isysroot ${IOS_SYSROOT[0]} -O3 -march=armv7 -mcpu=cortex-a8 -mtune=cortex-a8 -mfpu=neon -miphoneos-version-min=3.0"
 
-    IOS_PLATFORMDIR=/Developer/Platforms/iPhoneOS.platform`
+    cd {build_directory}
+    sh {source_directory}/configure [additional configure flags]
+    make
 
-  *Xcode 4.3 and later*
+#### Xcode 4.3-4.6 (LLVM-GCC)
+
+Same as above, but replace the first line with:
 
     IOS_PLATFORMDIR=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform
 
-  *All Xcode versions*
-
-    IOS_SYSROOT=$IOS_PLATFORMDIR/Developer/SDKs/iPhoneOS*.sdk
-    IOS_GCC=$IOS_PLATFORMDIR/Developer/usr/bin/arm-apple-darwin10-llvm-gcc-4.2
-
-  *ARMv7 (code will run on iPhone 3GS-4S/iPad 1st-3rd Generation and newer)*
-
-    IOS_CFLAGS="-march=armv7 -mcpu=cortex-a8 -mtune=cortex-a8 -mfpu=neon"
-
-  *ARMv7s (code will run on iPhone 5/iPad 4th Generation and newer)*  
-  [NOTE: Requires Xcode 4.5 or later]
-
-    IOS_CFLAGS="-march=armv7s -mcpu=swift -mtune=swift -mfpu=neon"
-
-Follow the procedure under "Building libjpeg-turbo" above, adding
-
-    --host arm-apple-darwin10 \
-      CC="$IOS_GCC" LD="$IOS_GCC" \
-      CFLAGS="-mfloat-abi=softfp -isysroot $IOS_SYSROOT -O3 $IOS_CFLAGS" \
-      LDFLAGS="-mfloat-abi=softfp -isysroot $IOS_SYSROOT $IOS_CFLAGS"
-
-to the `configure` command line.
-
-
-#### ARM 32-bit Build (Xcode 5.0.x and later, Clang)
-
-Set the following shell variables for simplicity:
+#### Xcode 5 and later (Clang)
 
     IOS_PLATFORMDIR=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform
-    IOS_SYSROOT=$IOS_PLATFORMDIR/Developer/SDKs/iPhoneOS*.sdk
-    IOS_GCC=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
+    IOS_SYSROOT=($IOS_PLATFORMDIR/Developer/SDKs/iPhoneOS*.sdk)
 
-  *ARMv7 (code will run on iPhone 3GS-4S/iPad 1st-3rd Generation and newer)*
+    export host_alias=arm-apple-darwin10
+    export CC=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
+    export CFLAGS="-mfloat-abi=softfp -isysroot ${IOS_SYSROOT[0]} -O3 -arch armv7 -miphoneos-version-min=3.0"
+    export CCASFLAGS="$CFLAGS -no-integrated-as"
 
-    IOS_CFLAGS="-arch armv7"
-
-  *ARMv7s (code will run on iPhone 5/iPad 4th Generation and newer)*
-
-    IOS_CFLAGS="-arch armv7s"
-
-Follow the procedure under "Building libjpeg-turbo" above, adding
-
-    --host arm-apple-darwin10 \
-      CC="$IOS_GCC" LD="$IOS_GCC" \
-      CFLAGS="-mfloat-abi=softfp -isysroot $IOS_SYSROOT -O3 $IOS_CFLAGS" \
-      LDFLAGS="-mfloat-abi=softfp -isysroot $IOS_SYSROOT $IOS_CFLAGS" \
-      CCASFLAGS="-no-integrated-as $IOS_CFLAGS"
-
-to the `configure` command line.
+    cd {build_directory}
+    sh {source_directory}/configure [additional configure flags]
+    make
 
 
-#### ARMv8 64-bit Build (Xcode 5.0.x and later, Clang)
+### ARMv7s (32-bit)
 
-Code will run on iPhone 5S/iPad Mini 2/iPad Air and newer.
+**gas-preprocessor.pl required**
 
-Set the following shell variables for simplicity:
+The following scripts demonstrate how to build libjpeg-turbo to run on the
+iPhone 5/iPad 4th Generation and newer:
+
+#### Xcode 4.5-4.6 (LLVM-GCC)
 
     IOS_PLATFORMDIR=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform
-    IOS_SYSROOT=$IOS_PLATFORMDIR/Developer/SDKs/iPhoneOS*.sdk
-    IOS_GCC=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
-    IOS_CFLAGS="-arch arm64"
+    IOS_SYSROOT=($IOS_PLATFORMDIR/Developer/SDKs/iPhoneOS*.sdk)
 
-Follow the procedure under "Building libjpeg-turbo" above, adding
+    export host_alias=arm-apple-darwin10
+    export CC=${IOS_PLATFORMDIR}/Developer/usr/bin/arm-apple-darwin10-llvm-gcc-4.2
+    export CFLAGS="-mfloat-abi=softfp -isysroot ${IOS_SYSROOT[0]} -O3 -march=armv7s -mcpu=swift -mtune=swift -mfpu=neon -miphoneos-version-min=6.0"
 
-    --host aarch64-apple-darwin \
-      CC="$IOS_GCC" LD="$IOS_GCC" \
-      CFLAGS="-isysroot $IOS_SYSROOT -O3 $IOS_CFLAGS" \
-      LDFLAGS="-isysroot $IOS_SYSROOT $IOS_CFLAGS"
+    cd {build_directory}
+    sh {source_directory}/configure [additional configure flags]
+    make
 
-to the `configure` command line.
+#### Xcode 5 and later (Clang)
+
+Same as the ARMv7 build procedure for Xcode 5 and later, except replace the
+compiler flags as follows:
+
+    export CFLAGS="-mfloat-abi=softfp -isysroot ${IOS_SYSROOT[0]} -O3 -arch armv7s -miphoneos-version-min=6.0"
 
 
-NOTE:  You can also add `-miphoneos-version-min={version}` to `$IOS_CFLAGS`
-above in order to support older versions of iOS than the default version
-supported by the SDK.
+### ARMv8 (64-bit)
+
+**gas-preprocessor.pl required if using Xcode < 6**
+
+The following script demonstrates how to build libjpeg-turbo to run on the
+iPhone 5S/iPad Mini 2/iPad Air and newer.
+
+    IOS_PLATFORMDIR=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform
+    IOS_SYSROOT=($IOS_PLATFORMDIR/Developer/SDKs/iPhoneOS*.sdk)
+
+    export host_alias=aarch64-apple-darwin
+    export CC=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
+    export CFLAGS="-isysroot ${IOS_SYSROOT[0]} -O3 -arch arm64 -miphoneos-version-min=7.0 -funwind-tables"
+
+    cd {build_directory}
+    sh {source_directory}/configure [additional configure flags]
+    make
 
 Once built, lipo can be used to combine the ARMv7, v7s, and/or v8 variants into
 a universal library.
 
 
-### Building libjpeg-turbo for Android
+Building libjpeg-turbo for Android
+----------------------------------
 
 Building libjpeg-turbo for Android platforms requires the
-{Android NDK}(https://developer.android.com/tools/sdk/ndk)
-and autotools.  The following is a general recipe script that can be modified for your specific needs.
+[Android NDK](https://developer.android.com/tools/sdk/ndk) and autotools.
+
+
+### ARMv7 (32-bit)
+
+The following is a general recipe script that can be modified for your specific
+needs.
 
     # Set these variables to suit your needs
-    NDK_PATH={full path to the "ndk" directory-- for example, /opt/android/ndk}
+    NDK_PATH={full path to the "ndk" directory-- for example, /opt/android/sdk/ndk-bundle}
     BUILD_PLATFORM={the platform name for the NDK package you installed--
       for example, "windows-x86" or "linux-x86_64" or "darwin-x86_64"}
     TOOLCHAIN_VERSION={"4.8", "4.9", "clang3.5", etc.  This corresponds to a
       toolchain directory under ${NDK_PATH}/toolchains/.}
     ANDROID_VERSION={The minimum version of Android to support-- for example,
-      "16", "19", etc.  "21" or later is required for a 64-bit build.}
+      "16", "19", etc.}
 
-    # 32-bit ARMv7 build
+    # It should not be necessary to modify the rest
     HOST=arm-linux-androideabi
     SYSROOT=${NDK_PATH}/platforms/android-${ANDROID_VERSION}/arch-arm
     ANDROID_CFLAGS="-march=armv7-a -mfloat-abi=softfp -fprefetch-loop-arrays \
       --sysroot=${SYSROOT}"
 
-    # 64-bit ARMv8 build
-    HOST=aarch64-linux-android
-    SYSROOT=${NDK_PATH}/platforms/android-${ANDROID_VERSION}/arch-arm64
-    ANDROID_CFLAGS="--sysroot=${SYSROOT}"
-
     TOOLCHAIN=${NDK_PATH}/toolchains/${HOST}-${TOOLCHAIN_VERSION}/prebuilt/${BUILD_PLATFORM}
-    ANDROID_INCLUDES="-I${SYSROOT}/usr/include -I${TOOLCHAIN}/include"
     export CPP=${TOOLCHAIN}/bin/${HOST}-cpp
     export AR=${TOOLCHAIN}/bin/${HOST}-ar
-    export AS=${TOOLCHAIN}/bin/${HOST}-as
     export NM=${TOOLCHAIN}/bin/${HOST}-nm
     export CC=${TOOLCHAIN}/bin/${HOST}-gcc
     export LD=${TOOLCHAIN}/bin/${HOST}-ld
@@ -437,17 +401,144 @@
     export STRIP=${TOOLCHAIN}/bin/${HOST}-strip
     cd {build_directory}
     sh {source_directory}/configure --host=${HOST} \
-      CFLAGS="${ANDROID_INCLUDES} ${ANDROID_CFLAGS} -O3 -fPIE" \
-      CPPFLAGS="${ANDROID_INCLUDES} ${ANDROID_CFLAGS}" \
+      CFLAGS="${ANDROID_CFLAGS} -O3 -fPIE" \
+      CPPFLAGS="${ANDROID_CFLAGS}" \
       LDFLAGS="${ANDROID_CFLAGS} -pie" --with-simd ${1+"$@"}
     make
 
+
+### ARMv8 (64-bit)
+
+The following is a general recipe script that can be modified for your specific
+needs.
+
+    # Set these variables to suit your needs
+    NDK_PATH={full path to the "ndk" directory-- for example, /opt/android/sdk/ndk-bundle}
+    BUILD_PLATFORM={the platform name for the NDK package you installed--
+      for example, "windows-x86" or "linux-x86_64" or "darwin-x86_64"}
+    TOOLCHAIN_VERSION={"4.8", "4.9", "clang3.5", etc.  This corresponds to a
+      toolchain directory under ${NDK_PATH}/toolchains/.}
+    ANDROID_VERSION={The minimum version of Android to support.  "21" or later
+      is required for a 64-bit build.}
+
+    # It should not be necessary to modify the rest
+    HOST=aarch64-linux-android
+    SYSROOT=${NDK_PATH}/platforms/android-${ANDROID_VERSION}/arch-arm64
+    ANDROID_CFLAGS="--sysroot=${SYSROOT}"
+
+    TOOLCHAIN=${NDK_PATH}/toolchains/${HOST}-${TOOLCHAIN_VERSION}/prebuilt/${BUILD_PLATFORM}
+    export CPP=${TOOLCHAIN}/bin/${HOST}-cpp
+    export AR=${TOOLCHAIN}/bin/${HOST}-ar
+    export NM=${TOOLCHAIN}/bin/${HOST}-nm
+    export CC=${TOOLCHAIN}/bin/${HOST}-gcc
+    export LD=${TOOLCHAIN}/bin/${HOST}-ld
+    export RANLIB=${TOOLCHAIN}/bin/${HOST}-ranlib
+    export OBJDUMP=${TOOLCHAIN}/bin/${HOST}-objdump
+    export STRIP=${TOOLCHAIN}/bin/${HOST}-strip
+    cd {build_directory}
+    sh {source_directory}/configure --host=${HOST} \
+      CFLAGS="${ANDROID_CFLAGS} -O3 -fPIE" \
+      CPPFLAGS="${ANDROID_CFLAGS}" \
+      LDFLAGS="${ANDROID_CFLAGS} -pie" --with-simd ${1+"$@"}
+    make
+
+
+### x86 (32-bit)
+
+The following is a general recipe script that can be modified for your specific
+needs.
+
+    # Set these variables to suit your needs
+    NDK_PATH={full path to the "ndk" directory-- for example, /opt/android/sdk/ndk-bundle}
+    BUILD_PLATFORM={the platform name for the NDK package you installed--
+      for example, "windows-x86" or "linux-x86_64" or "darwin-x86_64"}
+    TOOLCHAIN_VERSION={"4.8", "4.9", "clang3.5", etc.  This corresponds to a
+      toolchain directory under ${NDK_PATH}/toolchains/.}
+    ANDROID_VERSION={The minimum version of Android to support-- for example,
+      "16", "19", etc.}
+
+    # It should not be necessary to modify the rest
+    HOST=i686-linux-android
+    SYSROOT=${NDK_PATH}/platforms/android-${ANDROID_VERSION}/arch-x86
+    ANDROID_CFLAGS="--sysroot=${SYSROOT}"
+
+    TOOLCHAIN=${NDK_PATH}/toolchains/x86-${TOOLCHAIN_VERSION}/prebuilt/${BUILD_PLATFORM}
+    export CPP=${TOOLCHAIN}/bin/${HOST}-cpp
+    export AR=${TOOLCHAIN}/bin/${HOST}-ar
+    export NM=${TOOLCHAIN}/bin/${HOST}-nm
+    export CC=${TOOLCHAIN}/bin/${HOST}-gcc
+    export LD=${TOOLCHAIN}/bin/${HOST}-ld
+    export RANLIB=${TOOLCHAIN}/bin/${HOST}-ranlib
+    export OBJDUMP=${TOOLCHAIN}/bin/${HOST}-objdump
+    export STRIP=${TOOLCHAIN}/bin/${HOST}-strip
+    cd {build_directory}
+    sh {source_directory}/configure --host=${HOST} \
+      CFLAGS="${ANDROID_CFLAGS} -O3 -fPIE" \
+      CPPFLAGS="${ANDROID_CFLAGS}" \
+      LDFLAGS="${ANDROID_CFLAGS} -pie" --with-simd ${1+"$@"}
+    make
+
+
+### x86-64 (64-bit)
+
+The following is a general recipe script that can be modified for your specific
+needs.
+
+    # Set these variables to suit your needs
+    NDK_PATH={full path to the "ndk" directory-- for example, /opt/android/sdk/ndk-bundle}
+    BUILD_PLATFORM={the platform name for the NDK package you installed--
+      for example, "windows-x86" or "linux-x86_64" or "darwin-x86_64"}
+    TOOLCHAIN_VERSION={"4.8", "4.9", "clang3.5", etc.  This corresponds to a
+      toolchain directory under ${NDK_PATH}/toolchains/.}
+    ANDROID_VERSION={The minimum version of Android to support.  "21" or later
+      is required for a 64-bit build.}
+
+    # It should not be necessary to modify the rest
+    HOST=x86_64-linux-android
+    SYSROOT=${NDK_PATH}/platforms/android-${ANDROID_VERSION}/arch-x86_64
+    ANDROID_CFLAGS="--sysroot=${SYSROOT}"
+
+    TOOLCHAIN=${NDK_PATH}/toolchains/x86_64-${TOOLCHAIN_VERSION}/prebuilt/${BUILD_PLATFORM}
+    export CPP=${TOOLCHAIN}/bin/${HOST}-cpp
+    export AR=${TOOLCHAIN}/bin/${HOST}-ar
+    export NM=${TOOLCHAIN}/bin/${HOST}-nm
+    export CC=${TOOLCHAIN}/bin/${HOST}-gcc
+    export LD=${TOOLCHAIN}/bin/${HOST}-ld
+    export RANLIB=${TOOLCHAIN}/bin/${HOST}-ranlib
+    export OBJDUMP=${TOOLCHAIN}/bin/${HOST}-objdump
+    export STRIP=${TOOLCHAIN}/bin/${HOST}-strip
+    cd {build_directory}
+    sh {source_directory}/configure --host=${HOST} \
+      CFLAGS="${ANDROID_CFLAGS} -O3 -fPIE" \
+      CPPFLAGS="${ANDROID_CFLAGS}" \
+      LDFLAGS="${ANDROID_CFLAGS} -pie" --with-simd ${1+"$@"}
+    make
+
+
 If building for Android 4.0.x (API level < 16) or earlier, remove `-fPIE` from
 `CFLAGS` and `-pie` from `LDFLAGS`.
 
 
-Building on Windows (Visual C++ or MinGW)
-=========================================
+Installing libjpeg-turbo
+------------------------
+
+To install libjpeg-turbo after it is built, replace `make` in the build
+instructions with `make install`.
+
+The `--prefix` argument to configure (or the `prefix` configure variable) can
+be used to specify an installation directory of your choosing.  If you don't
+specify an installation directory, then the default is to install libjpeg-turbo
+under **/opt/libjpeg-turbo** and to place the libraries in
+**/opt/libjpeg-turbo/lib32** (32-bit) or **/opt/libjpeg-turbo/lib64** (64-bit.)
+
+The `bindir`, `datadir`, `docdir`, `includedir`, `libdir`, and `mandir`
+configure variables allow a finer degree of control over where specific files in
+the libjpeg-turbo distribution should be installed.  These variables can either
+be specified at configure time or passed as arguments to `make install`.
+
+
+Windows (Visual C++ or MinGW)
+=============================
 
 
 Build Requirements
@@ -458,7 +549,7 @@
 - [NASM](http://www.nasm.us) or [YASM](http://yasm.tortall.net)
   * If using NASM, 0.98 or later is required for an x86 build.
   * If using NASM, 2.05 or later is required for an x86-64 build.
-  * nasm.exe/yasm.exe should be in your `PATH`.
+  * **nasm.exe**/**yasm.exe** should be in your `PATH`.
 
 - Microsoft Visual C++ 2005 or later
 
@@ -484,11 +575,10 @@
 
 - MinGW
 
-  [MinGW-builds](http://sourceforge.net/projects/mingwbuilds/) or
-  [tdm-gcc](http://tdm-gcc.tdragon.net/) recommended if building on a Windows
-  machine.  Both distributions install a Start Menu link that can be used to
-  launch a command prompt with the appropriate compiler paths automatically
-  set.
+  [MSYS2](http://msys2.github.io/) or [tdm-gcc](http://tdm-gcc.tdragon.net/)
+  recommended if building on a Windows machine.  Both distributions install a
+  Start Menu link that can be used to launch a command prompt with the
+  appropriate compiler paths automatically set.
 
 - If building the TurboJPEG Java wrapper, JDK 1.5 or later is required.  This
   can be downloaded from <http://www.java.com>.
@@ -497,47 +587,51 @@
 Out-of-Tree Builds
 ------------------
 
-Binary objects, libraries, and executables are generated in the same directory
-from which `cmake` was executed (the "binary directory"), and this directory
-need not necessarily be the same as the libjpeg-turbo source directory.  You
-can create multiple independent binary directories, in which different versions
-of libjpeg-turbo can be built from the same source tree using different
-compilers or settings.  In the sections below, *{build_directory}* refers to
-the binary directory, whereas *{source_directory}* refers to the libjpeg-turbo
-source directory.  For in-tree builds, these directories are the same.
+Binary objects, libraries, and executables are generated in the directory from
+which CMake is executed (the "binary directory"), and this directory need not
+necessarily be the same as the libjpeg-turbo source directory.  You can create
+multiple independent binary directories, in which different versions of
+libjpeg-turbo can be built from the same source tree using different compilers
+or settings.  In the sections below, *{build_directory}* refers to the binary
+directory, whereas *{source_directory}* refers to the libjpeg-turbo source
+directory.  For in-tree builds, these directories are the same.
 
 
-Building libjpeg-turbo
-----------------------
+Build Procedure
+---------------
+
+NOTE: The build procedures below assume that CMake is invoked from the command
+line, but all of these procedures can be adapted to the CMake GUI as
+well.
 
 
 ### Visual C++ (Command Line)
 
     cd {build_directory}
-    cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release {source_directory}
+    cmake -G"NMake Makefiles" -DCMAKE_BUILD_TYPE=Release [additional CMake flags] {source_directory}
     nmake
 
 This will build either a 32-bit or a 64-bit version of libjpeg-turbo, depending
-on which version of cl.exe is in the `PATH`.
+on which version of **cl.exe** is in the `PATH`.
 
 The following files will be generated under *{build_directory}*:
 
-**jpeg-static.lib**  
+**jpeg-static.lib**<br>
 Static link library for the libjpeg API
 
-**sharedlib/jpeg{version}.dll**  
+**sharedlib/jpeg{version}.dll**<br>
 DLL for the libjpeg API
 
-**sharedlib/jpeg.lib**  
+**sharedlib/jpeg.lib**<br>
 Import library for the libjpeg API
 
-**turbojpeg-static.lib**  
+**turbojpeg-static.lib**<br>
 Static link library for the TurboJPEG API
 
-**turbojpeg.dll**  
+**turbojpeg.dll**<br>
 DLL for the TurboJPEG API
 
-**turbojpeg.lib**  
+**turbojpeg.lib**<br>
 Import library for the TurboJPEG API
 
 *{version}* is 62, 7, or 8, depending on whether libjpeg v6b (default), v7, or
@@ -551,35 +645,34 @@
 instance:
 
     cd {build_directory}
-    cmake -G "Visual Studio 10" {source_directory}
+    cmake -G"Visual Studio 10" [additional CMake flags] {source_directory}
 
-NOTE:  Add "Win64" to the generator name (for example, "Visual Studio 10
-Win64") to build a 64-bit version of libjpeg-turbo.  Recent versions of CMake
-no longer document that.  A separate build directory must be used for 32-bit
-and 64-bit builds.
+NOTE: Add "Win64" to the generator name (for example, "Visual Studio 10
+Win64") to build a 64-bit version of libjpeg-turbo.  A separate build directory
+must be used for 32-bit and 64-bit builds.
 
-You can then open ALL_BUILD.vcproj in Visual Studio and build one of the
+You can then open **ALL_BUILD.vcproj** in Visual Studio and build one of the
 configurations in that project ("Debug", "Release", etc.) to generate a full
 build of libjpeg-turbo.
 
 This will generate the following files under *{build_directory}*:
 
-**{configuration}/jpeg-static.lib**  
+**{configuration}/jpeg-static.lib**<br>
 Static link library for the libjpeg API
 
-**sharedlib/{configuration}/jpeg{version}.dll**  
+**sharedlib/{configuration}/jpeg{version}.dll**<br>
 DLL for the libjpeg API
 
-**sharedlib/{configuration}/jpeg.lib**  
+**sharedlib/{configuration}/jpeg.lib**<br>
 Import library for the libjpeg API
 
-**{configuration}/turbojpeg-static.lib**  
+**{configuration}/turbojpeg-static.lib**<br>
 Static link library for the TurboJPEG API
 
-**{configuration}/turbojpeg.dll**  
+**{configuration}/turbojpeg.dll**<br>
 DLL for the TurboJPEG API
 
-**{configuration}/turbojpeg.lib**  
+**{configuration}/turbojpeg.lib**<br>
 Import library for the TurboJPEG API
 
 *{configuration}* is Debug, Release, RelWithDebInfo, or MinSizeRel, depending
@@ -589,31 +682,32 @@
 
 ### MinGW
 
-NOTE: This assumes that you are building on a Windows machine.  If you are
-cross-compiling on a Linux/Unix machine, then see "Build Recipes" below.
+NOTE: This assumes that you are building on a Windows machine using the MSYS
+environment.  If you are cross-compiling on a Un*x platform (including Mac and
+Cygwin), then see "Build Recipes" below.
 
     cd {build_directory}
-    cmake -G "MinGW Makefiles" {source_directory}
-    mingw32-make
+    cmake -G"MSYS Makefiles" [additional CMake flags] {source_directory}
+    make
 
 This will generate the following files under *{build_directory}*:
 
-**libjpeg.a**  
+**libjpeg.a**<br>
 Static link library for the libjpeg API
 
-**sharedlib/libjpeg-{version}.dll**  
+**sharedlib/libjpeg-{version}.dll**<br>
 DLL for the libjpeg API
 
-**sharedlib/libjpeg.dll.a**  
+**sharedlib/libjpeg.dll.a**<br>
 Import library for the libjpeg API
 
-**libturbojpeg.a**  
+**libturbojpeg.a**<br>
 Static link library for the TurboJPEG API
 
-**libturbojpeg.dll**  
+**libturbojpeg.dll**<br>
 DLL for the TurboJPEG API
 
-**libturbojpeg.dll.a**  
+**libturbojpeg.dll.a**<br>
 Import library for the TurboJPEG API
 
 *{version}* is 62, 7, or 8, depending on whether libjpeg v6b (default), v7, or
@@ -622,24 +716,24 @@
 
 ### Debug Build
 
-Add `-DCMAKE_BUILD_TYPE=Debug` to the `cmake` command line.  Or, if building
+Add `-DCMAKE_BUILD_TYPE=Debug` to the CMake command line.  Or, if building
 with NMake, remove `-DCMAKE_BUILD_TYPE=Release` (Debug builds are the default
 with NMake.)
 
 
 ### libjpeg v7 or v8 API/ABI Emulation
 
-Add `-DWITH_JPEG7=1` to the `cmake` command line to build a version of
+Add `-DWITH_JPEG7=1` to the CMake command line to build a version of
 libjpeg-turbo that is API/ABI-compatible with libjpeg v7.  Add `-DWITH_JPEG8=1`
-to the `cmake` command line to build a version of libjpeg-turbo that is
+to the CMake command line to build a version of libjpeg-turbo that is
 API/ABI-compatible with libjpeg v8.  See [README.md](README.md) for more
-information on libjpeg v7 and v8 emulation.
+information about libjpeg v7 and v8 emulation.
 
 
 ### In-Memory Source/Destination Managers
 
 When using libjpeg v6b or v7 API/ABI emulation, add `-DWITH_MEM_SRCDST=0` to
-the `cmake` command line to build a version of libjpeg-turbo that lacks the
+the CMake command line to build a version of libjpeg-turbo that lacks the
 `jpeg_mem_src()` and `jpeg_mem_dest()` functions.  These functions were not
 part of the original libjpeg v6b and v7 APIs, so removing them ensures strict
 conformance with those APIs.  See [README.md](README.md) for more information.
@@ -652,98 +746,105 @@
 based on the implementation in libjpeg v8, but it works when emulating libjpeg
 v7 or v6b as well.  The default is to enable both arithmetic encoding and
 decoding, but those who have philosophical objections to arithmetic coding can
-add `-DWITH_ARITH_ENC=0` or `-DWITH_ARITH_DEC=0` to the `cmake` command line to
+add `-DWITH_ARITH_ENC=0` or `-DWITH_ARITH_DEC=0` to the CMake command line to
 disable encoding or decoding (respectively.)
 
 
 ### TurboJPEG Java Wrapper
 
-Add `-DWITH_JAVA=1` to the `cmake` command line to incorporate an optional Java
-Native Interface wrapper into the TurboJPEG shared library and build the Java
-front-end classes to support it.  This allows the TurboJPEG shared library to
-be used directly from Java applications.  See [java/README](java/README) for
+Add `-DWITH_JAVA=1` to the CMake command line to incorporate an optional Java
+Native Interface (JNI) wrapper into the TurboJPEG shared library and build the
+Java front-end classes to support it.  This allows the TurboJPEG shared library
+to be used directly from Java applications.  See [java/README](java/README) for
 more details.
 
-You can set the `Java_JAVAC_EXECUTABLE`, `Java_JAVA_EXECUTABLE`, and
-`Java_JAR_EXECUTABLE` CMake variables to specify alternate commands or
-locations for javac, jar, and java (respectively.)  You can also set the
-`JAVACFLAGS` CMake variable to specify arguments that should be passed to the
-Java compiler when building the front-end classes.
-
-
-Installing libjpeg-turbo
-------------------------
-
-You can use the build system to install libjpeg-turbo into a directory of your
-choosing (as opposed to creating an installer.)  To do this, add:
-
-    -DCMAKE_INSTALL_PREFIX={install_directory}
-
-to the cmake command line.
-
-For example,
-
-    cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release \
-      -DCMAKE_INSTALL_PREFIX=c:\libjpeg-turbo {source_directory}
-    nmake install
-
-will install the header files in c:\libjpeg-turbo\include, the library files
-in c:\libjpeg-turbo\lib, the DLL's in c:\libjpeg-turbo\bin, and the
-documentation in c:\libjpeg-turbo\doc.
+If Java is not in your `PATH`, or if you wish to use an alternate JDK to
+build/test libjpeg-turbo, then (prior to running CMake) set the `JAVA_HOME`
+environment variable to the location of the JDK that you wish to use.  The
+`Java_JAVAC_EXECUTABLE`, `Java_JAVA_EXECUTABLE`, and `Java_JAR_EXECUTABLE`
+CMake variables can also be used to specify alternate commands or locations for
+javac, java, and jar (respectively.)  You can also set the `JAVACFLAGS` CMake
+variable to specify arguments that should be passed to the Java compiler when
+building the TurboJPEG classes.
 
 
 Build Recipes
 -------------
 
 
-### 64-bit MinGW Build on Cygwin
+### 32-bit MinGW Build on Un*x (including Mac and Cygwin)
+
+Create a file called **toolchain.cmake** under *{build_directory}*, with the
+following contents:
+
+    set(CMAKE_SYSTEM_NAME Windows)
+    set(CMAKE_SYSTEM_PROCESSOR X86)
+    set(CMAKE_C_COMPILER {mingw_binary_path}/i686-w64-mingw32-gcc)
+    set(CMAKE_RC_COMPILER {mingw_binary_path}/i686-w64-mingw32-windres)
+
+*{mingw\_binary\_path}* is the directory under which the MinGW binaries are
+located (usually **/usr/bin**.)  Next, execute the following commands:
 
     cd {build_directory}
-    CC=/usr/bin/x86_64-w64-mingw32-gcc \
-      cmake -G "Unix Makefiles" -DCMAKE_SYSTEM_NAME=Windows \
-      -DCMAKE_RC_COMPILER=/usr/bin/x86_64-w64-mingw32-windres.exe \
-      {source_directory}
-    make
-
-This produces a 64-bit build of libjpeg-turbo that does not depend on
-cygwin1.dll or other Cygwin DLL's.  The mingw64-x86\_64-gcc-core and
-mingw64-x86\_64-gcc-g++ packages (and their dependencies) must be installed.
-
-
-### 32-bit MinGW Build on Cygwin
-
-     cd {build_directory}
-     CC=/usr/bin/i686-w64-mingw32-gcc \
-       cmake -G "Unix Makefiles" -DCMAKE_SYSTEM_NAME=Windows \
-       -DCMAKE_RC_COMPILER=/usr/bin/i686-w64-mingw32-windres.exe \
-       {source_directory}
-     make
-
-This produces a 32-bit build of libjpeg-turbo that does not depend on
-cygwin1.dll or other Cygwin DLL's.  The mingw64-i686-gcc-core and
-mingw64-i686-gcc-g++ packages (and their dependencies) must be installed.
-
-
-### MinGW Build on Linux
-
-    cd {build_directory}
-    CC={mingw_binary_path}/i686-pc-mingw32-gcc \
-      cmake -G "Unix Makefiles" -DCMAKE_SYSTEM_NAME=Windows \
-      -DCMAKE_RC_COMPILER={mingw_binary_path}/i686-pc-mingw32-windres \
-      -DCMAKE_AR={mingw_binary_path}/i686-pc-mingw32-ar \
-      -DCMAKE_RANLIB={mingw_binary_path}/i686-pc-mingw32-ranlib \
-      {source_directory}
+    cmake -G"Unix Makefiles" -DCMAKE_TOOLCHAIN_FILE=toolchain.cmake \
+      [additional CMake flags] {source_directory}
     make
 
 
-Creating Release Packages
-=========================
+### 64-bit MinGW Build on Un*x (including Mac and Cygwin)
 
-The following commands can be used to create various types of release packages:
+Create a file called **toolchain.cmake** under *{build_directory}*, with the
+following contents:
+
+    set(CMAKE_SYSTEM_NAME Windows)
+    set(CMAKE_SYSTEM_PROCESSOR AMD64)
+    set(CMAKE_C_COMPILER {mingw_binary_path}/x86_64-w64-mingw32-gcc)
+    set(CMAKE_RC_COMPILER {mingw_binary_path}/x86_64-w64-mingw32-windres)
+
+*{mingw\_binary\_path}* is the directory under which the MinGW binaries are
+located (usually **/usr/bin**.)  Next, execute the following commands:
+
+    cd {build_directory}
+    cmake -G"Unix Makefiles" -DCMAKE_TOOLCHAIN_FILE=toolchain.cmake \
+      [additional CMake flags] {source_directory}
+    make
 
 
-Unix/Linux
-----------
+Installing libjpeg-turbo
+------------------------
+
+You can use the build system to install libjpeg-turbo (as opposed to creating
+an installer package.)  To do this, run `make install` or `nmake install`
+(or build the "install" target in the Visual Studio IDE.)  Running
+`make uninstall` or `nmake uninstall` (or building the "uninstall" target in
+the Visual Studio IDE) will uninstall libjpeg-turbo.
+
+The `CMAKE_INSTALL_PREFIX` CMake variable can be modified in order to install
+libjpeg-turbo into a directory of your choosing.  If you don't specify
+`CMAKE_INSTALL_PREFIX`, then the default is:
+
+**c:\libjpeg-turbo**<br>
+Visual Studio 32-bit build
+
+**c:\libjpeg-turbo64**<br>
+Visual Studio 64-bit build
+
+**c:\libjpeg-turbo-gcc**<br>
+MinGW 32-bit build
+
+**c:\libjpeg-turbo-gcc64**<br>
+MinGW 64-bit build
+
+
+Creating Distribution Packages
+==============================
+
+The following commands can be used to create various types of distribution
+packages:
+
+
+Linux
+-----
 
     make rpm
 
@@ -758,43 +859,47 @@
 
 Create Debian-style binary package.  Requires dpkg.
 
+
+Mac
+---
+
     make dmg
 
-Create Macintosh package/disk image.  This requires pkgbuild and
-productbuild, which are installed by default on OS X 10.7 and later and which
-can be obtained by installing Xcode 3.2.6 (with the "Unix Development"
-option) on OS X 10.6.  Packages built in this manner can be installed on OS X
-10.5 and later, but they must be built on OS X 10.6 or later.
+Create Mac package/disk image.  This requires pkgbuild and productbuild, which
+are installed by default on OS X 10.7 and later and which can be obtained by
+installing Xcode 3.2.6 (with the "Unix Development" option) on OS X 10.6.
+Packages built in this manner can be installed on OS X 10.5 and later, but they
+must be built on OS X 10.6 or later.
 
     make udmg [BUILDDIR32={32-bit build directory}]
 
-On 64-bit OS X systems, this creates a Macintosh package and disk image that
-contains universal i386/x86-64 binaries.  You should first configure a 32-bit
-out-of-tree build of libjpeg-turbo, then configure a 64-bit out-of-tree
-build, then run `make udmg` from the 64-bit build directory.  The build
-system will look for the 32-bit build under *{source_directory}*/osxx86 by
-default, but you can override this by setting the `BUILDDIR32` variable on the
-make command line as shown above.
+On 64-bit OS X systems, this creates a Mac package/disk image that contains
+universal i386/x86-64 binaries.  You should first configure a 32-bit
+out-of-tree build of libjpeg-turbo, then configure a 64-bit out-of-tree build,
+then run `make udmg` from the 64-bit build directory.  The build system will
+look for the 32-bit build under *{source_directory}*/osxx86 by default, but you
+can override this by setting the `BUILDDIR32` variable on the make command line
+as shown above.
 
     make iosdmg [BUILDDIR32={32-bit build directory}] \
       [BUILDDIRARMV7={ARMv7 build directory}] \
       [BUILDDIRARMV7S={ARMv7s build directory}] \
       [BUILDDIRARMV8={ARMv8 build directory}]
 
-On OS X systems, this creates a Macintosh package and disk image in which the
-libjpeg-turbo static libraries contain ARM architectures necessary to build
-iOS applications.  If building on an x86-64 system, the binaries will also
-contain the i386 architecture, as with `make udmg` above.  You should first
-configure ARMv7, ARMv7s, and/or ARMv8 out-of-tree builds of libjpeg-turbo (see
-"Building libjpeg-turbo for iOS" above.)  If you are building an x86-64 version
-of libjpeg-turbo, you should configure a 32-bit out-of-tree build as well.
-Next, build libjpeg-turbo as you would normally, using an out-of-tree build.
-When it is built, run `make iosdmg` from the build directory.  The build system
-will look for the ARMv7 build under *{source_directory}*/iosarmv7 by default,
-the ARMv7s build under *{source_directory}*/iosarmv7s by default, the ARMv8
-build under *{source_directory}*/iosarmv8 by default, and (if applicable) the
-32-bit build under *{source_directory}*/osxx86 by default, but you can override
-this by setting the `BUILDDIR32`, `BUILDDIRARMV7`, `BUILDDIRARMV7S`, and/or
+This creates a Mac package/disk image in which the libjpeg-turbo libraries
+contain ARM architectures necessary to build iOS applications.  If building on
+an x86-64 system, the binaries will also contain the i386 architecture, as with
+`make udmg` above.  You should first configure ARMv7, ARMv7s, and/or ARMv8
+out-of-tree builds of libjpeg-turbo (see "Building libjpeg-turbo for iOS"
+above.)  If you are building an x86-64 version of libjpeg-turbo, you should
+configure a 32-bit out-of-tree build as well.  Next, build libjpeg-turbo as you
+would normally, using an out-of-tree build.  When it is built, run `make
+iosdmg` from the build directory.  The build system will look for the ARMv7
+build under *{source_directory}*/iosarmv7 by default, the ARMv7s build under
+*{source_directory}*/iosarmv7s by default, the ARMv8 build under
+*{source_directory}*/iosarmv8 by default, and (if applicable) the 32-bit build
+under *{source_directory}*/osxx86 by default, but you can override this by
+setting the `BUILDDIR32`, `BUILDDIRARMV7`, `BUILDDIRARMV7S`, and/or
 `BUILDDIRARMV8` variables on the `make` command line as shown above.
 
 NOTE: If including an ARMv8 build in the package, then you may need to use
@@ -819,38 +924,40 @@
     cd {build_directory}
     make installer
 
-If using the Visual Studio IDE, build the "installer" project.
+If using the Visual Studio IDE, build the "installer" target.
 
-The installer package (libjpeg-turbo[-gcc][64].exe) will be located under
-*{build_directory}*.  If building using the Visual Studio IDE, then the
-installer package will be located in a subdirectory with the same name as the
-configuration you built (such as *{build_directory}*\Debug\ or
+The installer package (libjpeg-turbo-*{version}*[-gcc|-vc][64].exe) will be
+located under *{build_directory}*.  If building using the Visual Studio IDE,
+then the installer package will be located in a subdirectory with the same name
+as the configuration you built (such as *{build_directory}*\Debug\ or
 *{build_directory}*\Release\).
 
-Building a Windows installer requires the Nullsoft Install System
-(http://nsis.sourceforge.net/.)  makensis.exe should be in your `PATH`.
+Building a Windows installer requires the
+[Nullsoft Install System](http://nsis.sourceforge.net/).  makensis.exe should
+be in your `PATH`.
 
 
 Regression testing
 ==================
 
-The most common way to test libjpeg-turbo is by invoking `make test` on
-Unix/Linux platforms or `ctest` on Windows platforms, once the build has
-completed.  This runs a series of tests to ensure that mathematical
-compatibility has been maintained between libjpeg-turbo and libjpeg v6b.  This
-also invokes the TurboJPEG unit tests, which ensure that the colorspace
-extensions, YUV encoding, decompression scaling, and other features of the
-TurboJPEG C and Java APIs are working properly (and, by extension, that the
-equivalent features of the underlying libjpeg API are also working.)
+The most common way to test libjpeg-turbo is by invoking `make test` (Un*x) or
+`nmake test` (Windows command line) or by building the "RUN_TESTS" target
+(Visual Studio IDE), once the build has completed.  This runs a series of tests
+to ensure that mathematical compatibility has been maintained between
+libjpeg-turbo and libjpeg v6b.  This also invokes the TurboJPEG unit tests,
+which ensure that the colorspace extensions, YUV encoding, decompression
+scaling, and other features of the TurboJPEG C and Java APIs are working
+properly (and, by extension, that the equivalent features of the underlying
+libjpeg API are also working.)
 
-Invoking `make testclean` or `nmake testclean` (if using NMake) or building
-the 'testclean' target (if using the Visual Studio IDE) will clean up the
-output images generated by `make test`.
+Invoking `make testclean` (Un*x) or `nmake testclean` (Windows command line) or
+building the "testclean" target (Visual Studio IDE) will clean up the output
+images generated by the tests.
 
-On Unix/Linux platforms, more extensive tests of the TurboJPEG C and Java
-wrappers can be run by invoking `make tjtest`.  These extended TurboJPEG tests
+On Un*x platforms, more extensive tests of the TurboJPEG C and Java wrappers
+can be run by invoking `make tjtest`.  These extended TurboJPEG tests
 essentially iterate through all of the available features of the TurboJPEG APIs
-that are not covered by the TurboJPEG unit tests (this includes the lossless
+that are not covered by the TurboJPEG unit tests (including the lossless
 transform options) and compare the images generated by each feature to images
 generated using the equivalent feature in the libjpeg API.  The extended
 TurboJPEG tests are meant to test for regressions in the TurboJPEG wrappers,
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 0c46130..fb5e182 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -9,7 +9,7 @@
 endif()
 
 project(libjpeg-turbo C)
-set(VERSION 1.5.1)
+set(VERSION 1.5.3)
 string(REPLACE "." ";" VERSION_TRIPLET ${VERSION})
 list(GET VERSION_TRIPLET 0 VERSION_MAJOR)
 list(GET VERSION_TRIPLET 1 VERSION_MINOR)
@@ -367,8 +367,20 @@
   set(MD5_PPM_GRAY_ISLOW 7213c10af507ad467da5578ca5ee1fca)
   set(MD5_PPM_GRAY_ISLOW_RGB e96ee81c30a6ed422d466338bd3de65d)
   set(MD5_JPEG_420S_IFAST_OPT 7af8e60be4d9c227ec63ac9b6630855e)
-  set(MD5_JPEG_3x2_FLOAT_PROG a8c17daf77b457725ec929e215b603f8)
-  set(MD5_PPM_3x2_FLOAT 42876ab9e5c2f76a87d08db5fbd57956)
+  if(64BIT)
+    # Windows/x64 uses SSE for floating point
+    set(MD5_JPEG_3x2_FLOAT_PROG a8c17daf77b457725ec929e215b603f8)
+    set(MD5_PPM_3x2_FLOAT 42876ab9e5c2f76a87d08db5fbd57956)
+  else()
+    # Windows/x86 uses the 387 FPU for floating point
+    if(MSVC)
+      set(MD5_JPEG_3x2_FLOAT_PROG e27840755870fa849872e58aa0cd1400)
+      set(MD5_PPM_3x2_FLOAT 6c2880b83bb1aa41dfe330e7a9768690)
+    else()
+      set(MD5_JPEG_3x2_FLOAT_PROG bc6dbbefac2872f6b9d6c4a0ae60c3c0)
+      set(MD5_PPM_3x2_FLOAT f58119ee294198ac9b4a9f5645a34266)
+    endif()
+  endif()
   set(MD5_PPM_420M_ISLOW_2_1 4ca6be2a6f326ff9eaab63e70a8259c0)
   set(MD5_PPM_420M_ISLOW_15_8 12aa9f9534c1b3d7ba047322226365eb)
   set(MD5_PPM_420M_ISLOW_13_8 f7e22817c7b25e1393e4ec101e9d4e96)
@@ -410,8 +422,18 @@
     set(MD5_JPEG_3x2_FLOAT_PROG 343e3f8caf8af5986ebaf0bdc13b5c71)
     set(MD5_PPM_3x2_FLOAT 1a75f36e5904d6fc3a85a43da9ad89bb)
   else()
-    set(MD5_JPEG_3x2_FLOAT_PROG 9bca803d2042bd1eb03819e2bf92b3e5)
-    set(MD5_PPM_3x2_FLOAT f6bfab038438ed8f5522fbd33595dcdc)
+    if(64BIT)
+      set(MD5_JPEG_3x2_FLOAT_PROG 9bca803d2042bd1eb03819e2bf92b3e5)
+      set(MD5_PPM_3x2_FLOAT f6bfab038438ed8f5522fbd33595dcdc)
+    else()
+      if(MSVC)
+        set(MD5_JPEG_3x2_FLOAT_PROG 7999ce9cd0ee9b6c7043b7351ab7639d)
+        set(MD5_PPM_3x2_FLOAT 28cdc448a6b75e97892f0e0f8d4b21f3)
+      else()
+        set(MD5_JPEG_3x2_FLOAT_PROG 1657664a410e0822c924b54f6f65e6e9)
+        set(MD5_PPM_3x2_FLOAT cb0a1f027f3d2917c902b5640214e025)
+      endif()
+    endif()
   endif()
   set(MD5_JPEG_420_ISLOW_ARI e986fb0a637a8d833d96e8a6d6d84ea1)
   set(MD5_JPEG_444_ISLOW_PROGARI 0a8f1c8f66e113c3cf635df0a475a617)
@@ -844,7 +866,7 @@
 
 endforeach()
 
-add_custom_target(testclean COMMAND ${MD5CMP} -P
+add_custom_target(testclean COMMAND ${CMAKE_COMMAND} -P
   ${CMAKE_SOURCE_DIR}/cmakescripts/testclean.cmake)
 
 
@@ -933,3 +955,8 @@
 install(FILES ${CMAKE_BINARY_DIR}/jconfig.h ${CMAKE_SOURCE_DIR}/jerror.h
   ${CMAKE_SOURCE_DIR}/jmorecfg.h ${CMAKE_SOURCE_DIR}/jpeglib.h
   DESTINATION include)
+
+configure_file("${CMAKE_SOURCE_DIR}/cmakescripts/cmake_uninstall.cmake.in"
+  "cmake_uninstall.cmake" IMMEDIATE @ONLY)
+
+add_custom_target(uninstall COMMAND ${CMAKE_COMMAND} -P cmake_uninstall.cmake)
diff --git a/ChangeLog.md b/ChangeLog.md
index 7f417a4..f5fe44b 100644
--- a/ChangeLog.md
+++ b/ChangeLog.md
@@ -1,3 +1,118 @@
+1.5.3
+=====
+
+### Significant changes relative to 1.5.2:
+
+1. Fixed a NullPointerException in the TurboJPEG Java wrapper that occurred
+when using the YUVImage constructor that creates an instance backed by separate
+image planes and allocates memory for the image planes.
+
+2. Fixed an issue whereby the Java version of TJUnitTest would fail when
+testing BufferedImage encoding/decoding on big endian systems.
+
+3. Fixed a segfault in djpeg that would occur if an output format other than
+PPM/PGM was selected along with the `-crop` option.  The `-crop` option now
+works with the GIF and Targa formats as well (unfortunately, it cannot be made
+to work with the BMP and RLE formats due to the fact that those output engines
+write scanlines in bottom-up order.)  djpeg will now exit gracefully if an
+output format other than PPM/PGM, GIF, or Targa is selected along with the
+`-crop` option.
+
+4. Fixed an issue whereby `jpeg_skip_scanlines()` would segfault if color
+quantization was enabled.
+
+5. TJBench (both C and Java versions) will now display usage information if any
+command-line argument is unrecognized.  This prevents the program from silently
+ignoring typos.
+
+6. Fixed an access violation in tjbench.exe (Windows) that occurred when the
+program was used to decompress an existing JPEG image.
+
+7. Fixed an ArrayIndexOutOfBoundsException in the TJExample Java program that
+occurred when attempting to decompress a JPEG image that had been compressed
+with 4:1:1 chrominance subsampling.
+
+8. Fixed an issue whereby, when using `jpeg_skip_scanlines()` to skip to the
+end of a single-scan (non-progressive) image, subsequent calls to
+`jpeg_consume_input()` would return `JPEG_SUSPENDED` rather than
+`JPEG_REACHED_EOI`.
+
+9. `jpeg_crop_scanlines()` now works correctly when decompressing grayscale
+JPEG images that were compressed with a sampling factor other than 1 (for
+instance, with `cjpeg -grayscale -sample 2x2`).
+
+
+1.5.2
+=====
+
+### Significant changes relative to 1.5.1:
+
+1. Fixed a regression introduced by 1.5.1[7] that prevented libjpeg-turbo from
+building with Android NDK platforms prior to android-21 (5.0).
+
+2. Fixed a regression introduced by 1.5.1[1] that prevented the MIPS DSPR2 SIMD
+code in libjpeg-turbo from building.
+
+3. Fixed a regression introduced by 1.5 beta1[11] that prevented the Java
+version of TJBench from outputting any reference images (the `-nowrite` switch
+was accidentally enabled by default.)
+
+4. libjpeg-turbo should now build and run with full AltiVec SIMD acceleration
+on PowerPC-based AmigaOS 4 and OpenBSD systems.
+
+5. Fixed build and runtime errors on Windows that occurred when building
+libjpeg-turbo with libjpeg v7 API/ABI emulation and the in-memory
+source/destination managers.  Due to an oversight, the `jpeg_skip_scanlines()`
+and `jpeg_crop_scanlines()` functions were not being included in jpeg7.dll when
+libjpeg-turbo was built with `-DWITH_JPEG7=1` and `-DWITH_MEM_SRCDST=1`.
+
+6. Fixed "Bogus virtual array access" error that occurred when using the
+lossless crop feature in jpegtran or the TurboJPEG API, if libjpeg-turbo was
+built with libjpeg v7 API/ABI emulation.  This was apparently a long-standing
+bug that has existed since the introduction of libjpeg v7/v8 API/ABI emulation
+in libjpeg-turbo v1.1.
+
+7. The lossless transform features in jpegtran and the TurboJPEG API will now
+always attempt to adjust the EXIF image width and height tags if the image size
+changed as a result of the transform.  This behavior has always existed when
+using libjpeg v8 API/ABI emulation.  It was supposed to be available with
+libjpeg v7 API/ABI emulation as well but did not work properly due to a bug.
+Furthermore, there was never any good reason not to enable it with libjpeg v6b
+API/ABI emulation, since the behavior is entirely internal.  Note that
+`-copy all` must be passed to jpegtran in order to transfer the EXIF tags from
+the source image to the destination image.
+
+8. Fixed several memory leaks in the TurboJPEG API library that could occur
+if the library was built with certain compilers and optimization levels
+(known to occur with GCC 4.x and clang with `-O1` and higher but not with
+GCC 5.x or 6.x) and one of the underlying libjpeg API functions threw an error
+after a TurboJPEG API function allocated a local buffer.
+
+9. The libjpeg-turbo memory manager will now honor the `max_memory_to_use`
+structure member in jpeg\_memory\_mgr, which can be set to the maximum amount
+of memory (in bytes) that libjpeg-turbo should use during decompression or
+multi-pass (including progressive) compression.  This limit can also be set
+using the `JPEGMEM` environment variable or using the `-maxmemory` switch in
+cjpeg/djpeg/jpegtran (refer to the respective man pages for more details.)
+This has been a documented feature of libjpeg since v5, but the
+`malloc()`/`free()` implementation of the memory manager (jmemnobs.c) never
+implemented the feature.  Restricting libjpeg-turbo's memory usage is useful
+for two reasons:  it allows testers to more easily work around the 2 GB limit
+in libFuzzer, and it allows developers of security-sensitive applications to
+more easily defend against one of the progressive JPEG exploits (LJT-01-004)
+identified in
+[this report](http://www.libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf).
+
+10. TJBench will now run each benchmark for 1 second prior to starting the
+timer, in order to improve the consistency of the results.  Furthermore, the
+`-warmup` option is now used to specify the amount of warmup time rather than
+the number of warmup iterations.
+
+11. Fixed an error (`short jump is out of range`) that occurred when assembling
+the 32-bit x86 SIMD extensions with NASM versions prior to 2.04.  This was a
+regression introduced by 1.5 beta1[12].
+
+
 1.5.1
 =====
 
diff --git a/LICENSE.md b/LICENSE.md
index 4623e29..0572390 100644
--- a/LICENSE.md
+++ b/LICENSE.md
@@ -9,12 +9,11 @@
   This license applies to the libjpeg API library and associated programs
   (any code inherited from libjpeg, and any modifications to that code.)
 
-- The Modified (3-clause) BSD License, which is listed in
-  [turbojpeg.c](turbojpeg.c)
+- The Modified (3-clause) BSD License, which is listed below
 
   This license covers the TurboJPEG API library and associated programs.
 
-- The zlib License, which is listed in [simd/jsimdext.inc](simd/jsimdext.inc)
+- The zlib License, which is listed below
 
   This license is a subset of the other two, and it covers the libjpeg-turbo
   SIMD extensions.
@@ -86,3 +85,55 @@
     - IJG License
     - Modified BSD License
     - zlib License
+
+
+The Modified (3-clause) BSD License
+===================================
+
+Copyright (C)\<YEAR\> \<AUTHOR\>.  All Rights Reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+- Redistributions of source code must retain the above copyright notice,
+  this list of conditions and the following disclaimer.
+- Redistributions in binary form must reproduce the above copyright notice,
+  this list of conditions and the following disclaimer in the documentation
+  and/or other materials provided with the distribution.
+- Neither the name of the libjpeg-turbo Project nor the names of its
+  contributors may be used to endorse or promote products derived from this
+  software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS",
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+
+
+The zlib License
+================
+
+Copyright (C) \<YEAR\>, \<AUTHOR\>.
+
+This software is provided 'as-is', without any express or implied
+warranty.  In no event will the authors be held liable for any damages
+arising from the use of this software.
+
+Permission is granted to anyone to use this software for any purpose,
+including commercial applications, and to alter it and redistribute it
+freely, subject to the following restrictions:
+
+1. The origin of this software must not be misrepresented; you must not
+   claim that you wrote the original software. If you use this software
+   in a product, an acknowledgment in the product documentation would be
+   appreciated but is not required.
+2. Altered source versions must be plainly marked as such, and must not be
+   misrepresented as being the original software.
+3. This notice may not be removed or altered from any source distribution.
diff --git a/Makefile.am b/Makefile.am
index 32c239c..8043f09 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -173,7 +173,7 @@
 EXTRA_DIST = win release $(DOCS) testimages CMakeLists.txt \
 	sharedlib/CMakeLists.txt cmakescripts libjpeg.map.in doc doxygen.config \
 	doxygen-extra.css jccolext.c jdcolext.c jdcol565.c jdmrgext.c jdmrg565.c \
-	jstdhuff.c jdcoefct.h jdmainct.h jdmaster.h jdsample.h wrppm.h \
+	jstdhuff.c jdcoefct.h jdmainct.h jdmaster.h jdsample.h \
 	md5/CMakeLists.txt
 
 dist-hook:
@@ -202,6 +202,8 @@
 MD5_JPEG_3x2_FLOAT_PROG_32BIT = a8c17daf77b457725ec929e215b603f8
 MD5_PPM_3x2_FLOAT_32BIT = 42876ab9e5c2f76a87d08db5fbd57956
 MD5_PPM_3x2_FLOAT_64BIT = d6fbc71153b3d8ded484dbc17c7b9cf4
+MD5_JPEG_3x2_FLOAT_PROG_387 = bc6dbbefac2872f6b9d6c4a0ae60c3c0
+MD5_PPM_3x2_FLOAT_387 = bcc5723c61560463ac60f772e742d092
 MD5_JPEG_3x2_IFAST_PROG = 1396cc2b7185cfe943d408c9d305339e
 MD5_PPM_3x2_IFAST = 3975985ef6eeb0a2cdc58daa651ccc00
 MD5_PPM_420M_ISLOW_2_1 = 4ca6be2a6f326ff9eaab63e70a8259c0
@@ -249,6 +251,8 @@
 MD5_JPEG_3x2_FLOAT_PROG_32BIT = 9bca803d2042bd1eb03819e2bf92b3e5
 MD5_PPM_3x2_FLOAT_32BIT = f6bfab038438ed8f5522fbd33595dcdc
 MD5_PPM_3x2_FLOAT_64BIT = 0e917a34193ef976b679a6b069b1be26
+MD5_JPEG_3x2_FLOAT_PROG_387 = 1657664a410e0822c924b54f6f65e6e9
+MD5_PPM_3x2_FLOAT_387 = cb0a1f027f3d2917c902b5640214e025
 MD5_JPEG_3x2_IFAST_PROG = 1ee5d2c1a77f2da495f993c8c7cceca5
 MD5_PPM_3x2_IFAST = fd283664b3b49127984af0a7f118fccd
 MD5_JPEG_420_ISLOW_ARI = e986fb0a637a8d833d96e8a6d6d84ea1
@@ -337,7 +341,7 @@
 # Test compressing from/decompressing to an arbitrary subregion of a larger
 # image buffer
 	cp $(srcdir)/testimages/testorig.ppm testout_tile.ppm
-	./tjbench testout_tile.ppm 95 -rgb -quiet -tile -benchtime 0.01 >/dev/null 2>&1
+	./tjbench testout_tile.ppm 95 -rgb -quiet -tile -benchtime 0.01 -warmup 0 >/dev/null 2>&1
 	for i in 8 16 32 64 128; do \
 		md5/md5cmp $(MD5_PPM_GRAY_TILE) testout_tile_GRAY_Q95_$$i\x$$i.ppm; \
 	done
@@ -356,7 +360,7 @@
 	done
 	rm -f testout_tile_GRAY_* testout_tile_420_* testout_tile_422_* testout_tile_444_*
 
-	./tjbench testout_tile.ppm 95 -rgb -fastupsample -quiet -tile -benchtime 0.01 >/dev/null 2>&1
+	./tjbench testout_tile.ppm 95 -rgb -fastupsample -quiet -tile -benchtime 0.01 -warmup 0 >/dev/null 2>&1
 	md5/md5cmp $(MD5_PPM_420M_8x8_TILE) testout_tile_420_Q95_8x8.ppm
 	for i in 16 32 64 128; do \
 		md5/md5cmp $(MD5_PPM_420M_TILE) testout_tile_420_Q95_$$i\x$$i.ppm; \
@@ -481,6 +485,9 @@
 #                  x86-64 compilers)
 # FLOATTEST=64bit  validate against the exepected results from the C code
 #                  when running on a 64-bit FPU
+# FLOATTEST=387  validate against the expected results from the C code when
+#                the 387 FPU is being used for floating point math (which is
+#                generally the default with x86 compilers)
 
 # CC: RGB->YCC  SAMP: fullsize/int  FDCT: float  ENT: prog huff
 	./cjpeg -sample 3x2 -dct float -prog -outfile testout_3x2_float_prog.jpg $(srcdir)/testimages/testorig.ppm
@@ -488,6 +495,8 @@
 		md5/md5cmp $(MD5_JPEG_3x2_FLOAT_PROG_SSE) testout_3x2_float_prog.jpg; \
 	elif [ "${FLOATTEST}" = "32bit" -o "${FLOATTEST}" = "64bit" ]; then \
 		md5/md5cmp $(MD5_JPEG_3x2_FLOAT_PROG_32BIT) testout_3x2_float_prog.jpg; \
+	elif [ "${FLOATTEST}" = "387" ]; then \
+		md5/md5cmp $(MD5_JPEG_3x2_FLOAT_PROG_387) testout_3x2_float_prog.jpg; \
 	fi
 # CC: YCC->RGB  SAMP: fullsize/int  IDCT: float  ENT: prog huff
 	./djpeg -dct float -outfile testout_3x2_float.ppm testout_3x2_float_prog.jpg
@@ -497,6 +506,8 @@
 		md5/md5cmp $(MD5_PPM_3x2_FLOAT_32BIT) testout_3x2_float.ppm; \
 	elif [ "${FLOATTEST}" = "64bit" ]; then \
 		md5/md5cmp $(MD5_PPM_3x2_FLOAT_64BIT) testout_3x2_float.ppm; \
+	elif [ "${FLOATTEST}" = "387" ]; then \
+		md5/md5cmp $(MD5_PPM_3x2_FLOAT_387) testout_3x2_float.ppm; \
 	fi
 	rm -f testout_3x2_float.ppm testout_3x2_float_prog.jpg
 
@@ -686,6 +697,8 @@
 	rm -f *_411_*.ppm
 	rm -f *_411_*.jpg
 	rm -f *_411.yuv
+	rm -f tjbenchtest*.log
+	rm -f tjexampletest*.log
 
 
 tjtest:
diff --git a/README.md b/README.md
index ca8866e..74e6eac 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,7 @@
 libjpeg-turbo includes two APIs that can be used to compress and decompress
 JPEG images:
 
-- **TurboJPEG API**  
+- **TurboJPEG API**<br>
   This API provides an easy-to-use interface for compressing and decompressing
   JPEG images in memory.  It also provides some functionality that would not be
   straightforward to achieve using the underlying libjpeg API, such as
@@ -50,7 +50,7 @@
   transforms on an image.  The Java interface for libjpeg-turbo is written on
   top of the TurboJPEG API.
 
-- **libjpeg API**  
+- **libjpeg API**<br>
   This is the de facto industry-standard API for compressing and decompressing
   JPEG images.  It is more difficult to use than the TurboJPEG API but also
   more powerful.  The libjpeg API implementation in libjpeg-turbo is both
@@ -141,17 +141,17 @@
 
 #### Fully supported
 
-- **libjpeg: IDCT scaling extensions in decompressor**  
+- **libjpeg: IDCT scaling extensions in decompressor**<br>
   libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
   1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
   and 1/2 are SIMD-accelerated.)
 
 - **libjpeg: Arithmetic coding**
 
-- **libjpeg: In-memory source and destination managers**  
+- **libjpeg: In-memory source and destination managers**<br>
   See notes below.
 
-- **cjpeg: Separate quality settings for luminance and chrominance**  
+- **cjpeg: Separate quality settings for luminance and chrominance**<br>
   Note that the libpjeg v7+ API was extended to accommodate this feature only
   for convenience purposes.  It has always been possible to implement this
   feature with libjpeg v6b (see rdswitch.c for an example.)
@@ -180,14 +180,14 @@
 but it is the general belief of our project that these features have not
 demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.
 
-- **libjpeg: DCT scaling in compressor**  
+- **libjpeg: DCT scaling in compressor**<br>
   `cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
   There is no technical reason why DCT scaling could not be supported when
   emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
   below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
   8/9 would be available, which is of limited usefulness.
 
-- **libjpeg: SmartScale**  
+- **libjpeg: SmartScale**<br>
   `cinfo.block_size` is silently ignored.
   SmartScale is an extension to the JPEG format that allows for DCT block
   sizes other than 8x8.  Providing support for this new format would be
@@ -200,15 +200,15 @@
   interest in providing this feature would be as a means of supporting
   additional DCT scaling factors.
 
-- **libjpeg: Fancy downsampling in compressor**  
+- **libjpeg: Fancy downsampling in compressor**<br>
   `cinfo.do_fancy_downsampling` is silently ignored.
   This requires the DCT scaling feature, which is not supported.
 
-- **jpegtran: Scaling**  
+- **jpegtran: Scaling**<br>
   This requires both the DCT scaling and SmartScale features, which are not
   supported.
 
-- **Lossless RGB JPEG files**  
+- **Lossless RGB JPEG files**<br>
   This requires the SmartScale feature, which is not supported.
 
 ### What About libjpeg v9?
@@ -226,7 +226,7 @@
 existing, standard lossless formats.  Therefore, at this time it is our belief
 that there is not sufficient technical justification for software projects to
 upgrade from libjpeg v8 to libjpeg v9, and thus there is not sufficient
-echnical justification for us to emulate the libjpeg v9 ABI.
+technical justification for us to emulate the libjpeg v9 ABI.
 
 In-Memory Source/Destination Managers
 -------------------------------------
@@ -249,8 +249,8 @@
 libjpeg v8 API/ABI.
 
 On Un*x systems, including the in-memory source/destination managers changes
-the dynamic library version from 62.0.0 to 62.1.0 if using libjpeg v6b API/ABI
-emulation and from 7.0.0 to 7.1.0 if using libjpeg v7 API/ABI emulation.
+the dynamic library version from 62.1.0 to 62.2.0 if using libjpeg v6b API/ABI
+emulation and from 7.1.0 to 7.2.0 if using libjpeg v7 API/ABI emulation.
 
 Note that, on most Un*x systems, the dynamic linker will not look for a
 function in a library until that function is actually used.  Thus, if a program
diff --git a/acinclude.m4 b/acinclude.m4
index 2c90762..113169f 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -252,3 +252,36 @@
     $2
   fi
 ])
+
+# AC_CHECK_ALTIVEC
+# ----------------
+# Test whether AltiVec intrinsics are supported
+AC_DEFUN([AC_CHECK_ALTIVEC],[
+  ac_save_CFLAGS="$CFLAGS"
+  CFLAGS="$CFLAGS -maltivec"
+  AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
+    #include <altivec.h>
+    int main(void) {
+      __vector int vi = { 0, 0, 0, 0 };
+      int i[4];
+      vec_st(vi, 0, i);
+      return i[0];
+    }]])], ac_has_altivec=yes)
+  CFLAGS="$ac_save_CFLAGS"
+  if test "x$ac_has_altivec" = "xyes" ; then
+    $1
+  else
+    $2
+  fi
+])
+
+AC_DEFUN([AC_NO_SIMD],[
+  AC_MSG_RESULT([no ("$1")])
+  with_simd=no;
+  if test "x${require_simd}" = "xyes"; then
+    AC_MSG_ERROR([SIMD support not available for this CPU.])
+  else
+    AC_MSG_WARN([SIMD support not available for this CPU.  Performance will\
+ suffer.])
+  fi
+])
diff --git a/appveyor.yml b/appveyor.yml
new file mode 100644
index 0000000..4f2d6cc
--- /dev/null
+++ b/appveyor.yml
@@ -0,0 +1,57 @@
+install:
+  - cmd: >-
+      mkdir c:\installers
+
+      mkdir c:\temp
+
+      curl -fSL -o c:\installers\nasm-2.10.01-win32.zip http://www.nasm.us/pub/nasm/releasebuilds/2.10.01/win32/nasm-2.10.01-win32.zip
+
+      7z x c:\installers\nasm-2.10.01-win32.zip -oc:\ > c:\installers\nasm.install.log
+
+      set INCLUDE=c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include;c:\Program Files (x86)\Microsoft SDKs\Windows\v7.1A\include
+
+      set LIB=c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\lib\amd64;c:\Program Files (x86)\Microsoft SDKs\Windows\v7.1A\lib\x64
+
+      set PATH=c:\nasm-2.10.01;c:\Program Files (x86)\NSIS;c:\msys64\mingw32\bin;c:\msys64\usr\bin;c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\amd64;c:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE;c:\Program Files (x86)\Microsoft SDKs\Windows\v7.1A\bin\x64;c:\Program Files (x86)\Microsoft SDKs\Windows\v7.1A\bin;%PATH%
+
+      set MSYSTEM=MINGW32
+
+      bash -c "pacman --noconfirm -S autoconf automake libtool zip"
+
+      mklink /d "%ProgramData%\Oracle\Java32" "c:\Program Files (x86)\Java\jdk1.6.0"
+
+      git clone --depth=1 https://github.com/libjpeg-turbo/buildscripts.git c:/buildscripts
+
+build_script:
+  - cmd: >-
+      for /f %%i in ('"cygpath %CD%"') do set MINGWPATH=%%i
+
+      bash c:/buildscripts/buildljt -r file://%MINGWPATH% -b /c/ljt.nightly %APPVEYOR_REPO_BRANCH% -v
+
+      move c:\ljt.nightly\files\*.tar.gz .
+
+      move c:\ljt.nightly\files\*.exe .
+
+      move c:\ljt.nightly\files\*.zip .
+
+      move c:\ljt.nightly\log-windows.txt .
+
+artifacts:
+  - path: '*.tar.gz'
+    name: Source tarball
+
+  - path: '*-gcc*.exe'
+    name: SDK for MinGW
+
+  - path: '*-vc*.exe'
+    name: SDK for Visual C++
+
+  - path: '*.zip'
+    name: Windows JNI JARs
+
+  - path: 'log-windows.txt'
+    name: Build log
+
+test: off
+
+deploy: off
diff --git a/cdjpeg.h b/cdjpeg.h
index a65310e..bb49fbf 100644
--- a/cdjpeg.h
+++ b/cdjpeg.h
@@ -3,8 +3,8 @@
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1994-1997, Thomas G. Lane.
- * It was modified by The libjpeg-turbo Project to include only code relevant
- * to libjpeg-turbo.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -54,6 +54,14 @@
                           JDIMENSION rows_supplied);
   /* Finish up at the end of the image. */
   void (*finish_output) (j_decompress_ptr cinfo, djpeg_dest_ptr dinfo);
+  /* Re-calculate buffer dimensions based on output dimensions (for use with
+     partial image decompression.)  If this is NULL, then the output format
+     does not support partial image decompression (BMP and RLE, in particular,
+     cannot support partial decompression because they use an inversion buffer
+     to write the image in bottom-up order.) */
+  void (*calc_buffer_dimensions) (j_decompress_ptr cinfo,
+                                  djpeg_dest_ptr dinfo);
+
 
   /* Target file spec; filled in by djpeg.c after object is created. */
   FILE *output_file;
diff --git a/ci/keys.enc b/ci/keys.enc
new file mode 100644
index 0000000..4cd333f
--- /dev/null
+++ b/ci/keys.enc
Binary files differ
diff --git a/cjpeg.1 b/cjpeg.1
index d1dc304..283fc81 100644
--- a/cjpeg.1
+++ b/cjpeg.1
@@ -1,4 +1,4 @@
-.TH CJPEG 1 "17 February 2016"
+.TH CJPEG 1 "18 March 2017"
 .SH NAME
 cjpeg \- compress an image file to a JPEG file
 .SH SYNOPSIS
@@ -202,7 +202,7 @@
 in thousands of bytes, or millions of bytes if "M" is attached to the
 number.  For example,
 .B \-max 4m
-selects 4000000 bytes.  If more space is needed, temporary files will be used.
+selects 4000000 bytes.  If more space is needed, an error will occur.
 .TP
 .BI \-outfile " name"
 Send output image to the named file, not to standard output.
diff --git a/cjpeg.c b/cjpeg.c
index 713224f..9d282b8 100644
--- a/cjpeg.c
+++ b/cjpeg.c
@@ -5,7 +5,7 @@
  * Copyright (C) 1991-1998, Thomas G. Lane.
  * Modified 2003-2011 by Guido Vollbeding.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2010, 2013-2014, D. R. Commander.
+ * Copyright (C) 2010, 2013-2014, 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -197,11 +197,11 @@
   fprintf(stderr, "  -version       Print version information and exit\n");
   fprintf(stderr, "Switches for wizards:\n");
   fprintf(stderr, "  -baseline      Force baseline quantization tables\n");
-  fprintf(stderr, "  -qtables file  Use quantization tables given in file\n");
+  fprintf(stderr, "  -qtables FILE  Use quantization tables given in FILE\n");
   fprintf(stderr, "  -qslots N[,...]    Set component quantization tables\n");
   fprintf(stderr, "  -sample HxV[,...]  Set component sampling factors\n");
 #ifdef C_MULTISCAN_FILES_SUPPORTED
-  fprintf(stderr, "  -scans file    Create multi-scan JPEG per script file\n");
+  fprintf(stderr, "  -scans FILE    Create multi-scan JPEG per script FILE\n");
 #endif
   exit(EXIT_FAILURE);
 }
diff --git a/cmakescripts/cmake_uninstall.cmake.in b/cmakescripts/cmake_uninstall.cmake.in
new file mode 100644
index 0000000..b35d100
--- /dev/null
+++ b/cmakescripts/cmake_uninstall.cmake.in
@@ -0,0 +1,24 @@
+# This code is from the CMake FAQ
+
+if (NOT EXISTS "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt")
+  message(FATAL_ERROR "Cannot find install manifest: \"@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt\"")
+endif(NOT EXISTS "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt")
+
+file(READ "@CMAKE_CURRENT_BINARY_DIR@/install_manifest.txt" files)
+string(REGEX REPLACE "\n" ";" files "${files}")
+list(REVERSE files)
+foreach (file ${files})
+  message(STATUS "Uninstalling \"$ENV{DESTDIR}${file}\"")
+    if (EXISTS "$ENV{DESTDIR}${file}")
+      execute_process(
+        COMMAND "@CMAKE_COMMAND@" -E remove "$ENV{DESTDIR}${file}"
+        OUTPUT_VARIABLE rm_out
+        RESULT_VARIABLE rm_retval
+      )
+    if(NOT ${rm_retval} EQUAL 0)
+      message(FATAL_ERROR "Problem when removing \"$ENV{DESTDIR}${file}\"")
+    endif (NOT ${rm_retval} EQUAL 0)
+  else (EXISTS "$ENV{DESTDIR}${file}")
+    message(STATUS "File \"$ENV{DESTDIR}${file}\" does not exist.")
+  endif (EXISTS "$ENV{DESTDIR}${file}")
+endforeach(file)
diff --git a/cmakescripts/testclean.cmake b/cmakescripts/testclean.cmake
index e357787..38bb03b 100644
--- a/cmakescripts/testclean.cmake
+++ b/cmakescripts/testclean.cmake
@@ -24,7 +24,12 @@
   *_440_*.png
   *_440_*.ppm
   *_440_*.jpg
-  *_440.yuv)
+  *_440.yuv
+  *_411_*.bmp
+  *_411_*.png
+  *_411_*.ppm
+  *_411_*.jpg
+  *_411.yuv)
 
 if(NOT FILES STREQUAL "")
   message(STATUS "Removing test files")
diff --git a/configure.ac b/configure.ac
index d6f11e1..af80ee5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2,7 +2,7 @@
 # Process this file with autoconf to produce a configure script.
 
 AC_PREREQ([2.56])
-AC_INIT([libjpeg-turbo], [1.5.1])
+AC_INIT([libjpeg-turbo], [1.5.3])
 
 AM_INIT_AUTOMAKE([-Wall foreign dist-bzip2])
 AC_PREFIX_DEFAULT(/opt/libjpeg-turbo)
@@ -225,9 +225,9 @@
 AC_DEFINE_UNQUOTED(LIBJPEG_TURBO_VERSION, [$VERSION], [libjpeg-turbo version])
 
 m4_define(version_triplet,m4_split(AC_PACKAGE_VERSION,[[.]]))
-m4_define(version_major,m4_argn(1,version_triplet))
-m4_define(version_minor,m4_argn(2,version_triplet))
-m4_define(version_revision,m4_argn(3,version_triplet))
+m4_define(version_major,m4_car(m4_shiftn(1,[],version_triplet)))
+m4_define(version_minor,m4_car(m4_shiftn(2,[],version_triplet)))
+m4_define(version_revision,m4_car(m4_shiftn(3,[],version_triplet)))
 VERSION_MAJOR=version_major
 VERSION_MINOR=version_minor
 VERSION_REVISION=version_revision
@@ -361,6 +361,7 @@
 fi
 AC_SUBST(JAVAC)
 AC_ARG_VAR(JAVACFLAGS, [Java compiler flags])
+JAVACFLAGS="$JAVACFLAGS -J-Dfile.encoding=UTF8"
 AC_SUBST(JAVACFLAGS)
 AC_ARG_VAR(JAR, [Java archive command (default: jar)])
 if test "x$JAR" = "x"; then
@@ -517,17 +518,13 @@
       fi
       ;;
     powerpc*)
-      AC_MSG_RESULT([yes (powerpc)])
-      simd_arch=powerpc
+      AC_CHECK_ALTIVEC(
+        [AC_MSG_RESULT([yes (powerpc)])
+         simd_arch=powerpc],
+        [AC_NO_SIMD(PowerPC SPE)])
       ;;
     *)
-      AC_MSG_RESULT([no ("$host_cpu")])
-      with_simd=no;
-      if test "x${require_simd}" = "xyes"; then
-        AC_MSG_ERROR([SIMD support not available for this CPU.])
-      else
-        AC_MSG_WARN([SIMD support not available for this CPU.  Performance will suffer.])
-      fi
+      AC_NO_SIMD($host_cpu)
       ;;
   esac
 
@@ -565,6 +562,14 @@
     RPMARCH=i386
     DEBARCH=i386
     ;;
+  powerpc64le)
+    RPMARCH=`uname -m`
+    DEBARCH=ppc64el
+    ;;
+  powerpc)
+    RPMARCH=ppc
+    DEBARCH=ppc
+    ;;
   *)
     RPMARCH=`uname -m`
     DEBARCH=$RPMARCH
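
Note on the `version_major` / `version_minor` / `version_revision` change in the configure.ac hunks above: both the old `m4_argn(N, ...)` form and the new `m4_car(m4_shiftn(N, [], ...))` form appear to select the Nth field of the dotted package version (here `1.5.3`). The following C sketch illustrates the same split outside of m4. It is only an illustration; the `version_triplet` struct and `parse_version()` helper are hypothetical names and are not part of libjpeg-turbo.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the three fields of a dotted version string. */
typedef struct {
  int major;
  int minor;
  int revision;
} version_triplet;

/* Split "MAJOR.MINOR.REVISION" into its three integer fields.
   Returns 1 if all three fields were parsed, 0 otherwise. */
static int parse_version(const char *s, version_triplet *out)
{
  return sscanf(s, "%d.%d.%d", &out->major, &out->minor,
                &out->revision) == 3;
}
```

Calling `parse_version("1.5.3", &v)` fills `v` with 1, 5, and 3, which mirrors the values that configure assigns to `VERSION_MAJOR`, `VERSION_MINOR`, and `VERSION_REVISION`.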
diff --git a/djpeg.1 b/djpeg.1
index 7efde43..0a89927 100644
--- a/djpeg.1
+++ b/djpeg.1
@@ -1,4 +1,4 @@
-.TH DJPEG 1 "18 February 2016"
+.TH DJPEG 1 "13 November 2017"
 .SH NAME
 djpeg \- decompress a JPEG file to an image file
 .SH SYNOPSIS
@@ -185,7 +185,7 @@
 in thousands of bytes, or millions of bytes if "M" is attached to the
 number.  For example,
 .B \-max 4m
-selects 4000000 bytes.  If more space is needed, temporary files will be used.
+selects 4000000 bytes.  If more space is needed, an error will occur.
 .TP
 .BI \-outfile " name"
 Send output image to the named file, not to standard output.
@@ -204,7 +204,8 @@
 with width W and height H.  If necessary, X will be shifted left to the nearest
 iMCU boundary, and the width will be increased accordingly.  Note that if
 decompression scaling is being used, then X, Y, W, and H are relative to the
-scaled image dimensions.
+scaled image dimensions.  Currently this option only works with the
+PBMPLUS (PPM/PGM), GIF, and Targa output formats.
 .TP
 .B \-verbose
 Enable debug printout.  More
diff --git a/djpeg.c b/djpeg.c
index 54cd525..96db401 100644
--- a/djpeg.c
+++ b/djpeg.c
@@ -5,7 +5,7 @@
  * Copyright (C) 1991-1997, Thomas G. Lane.
  * Modified 2013 by Guido Vollbeding.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2010-2011, 2013-2016, D. R. Commander.
+ * Copyright (C) 2010-2011, 2013-2017, D. R. Commander.
  * Copyright (C) 2015, Google, Inc.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
@@ -31,7 +31,6 @@
 #include "cdjpeg.h"             /* Common decls for cjpeg/djpeg applications */
 #include "jversion.h"           /* for version message */
 #include "jconfigint.h"
-#include "wrppm.h"
 
 #include <ctype.h>              /* to declare isprint() */
 
@@ -173,6 +172,7 @@
 
   fprintf(stderr, "  -skip Y0,Y1    Decompress all rows except those between Y0 and Y1 (inclusive)\n");
   fprintf(stderr, "  -crop WxH+X+Y  Decompress only a rectangular subregion of the image\n");
+  fprintf(stderr, "                 [requires PBMPLUS (PPM/PGM), GIF, or Targa output format]\n");
   fprintf(stderr, "  -verbose  or  -debug   Emit debug output\n");
   fprintf(stderr, "  -version       Print version information and exit\n");
   exit(EXIT_FAILURE);
@@ -713,9 +713,10 @@
     }
 
     jpeg_crop_scanline(&cinfo, &crop_x, &crop_width);
-    ((ppm_dest_ptr) dest_mgr)->buffer_width = cinfo.output_width *
-                                              cinfo.out_color_components *
-                                              sizeof(JSAMPLE);
+    if (dest_mgr->calc_buffer_dimensions)
+      (*dest_mgr->calc_buffer_dimensions) (&cinfo, dest_mgr);
+    else
+      ERREXIT(&cinfo, JERR_UNSUPPORTED_FORMAT);
 
     /* Write output file header.  This is a hack to ensure that the destination
      * manager creates an output image of the proper size.
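
Note on the djpeg.c hunk above: the cast to `ppm_dest_ptr` is replaced with an optional `calc_buffer_dimensions` hook, and a NULL hook is treated as "this output format cannot be cropped". The pattern can be sketched in isolation as below. This is a simplified illustration under stated assumptions, not the djpeg code: `dest_mgr`, `ppm_calc()`, and `apply_crop()` are hypothetical stand-ins, and the real hook takes a `j_decompress_ptr` and a `djpeg_dest_ptr`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down destination manager.  Only the optional
   hook and the field it recomputes are modeled here. */
typedef struct dest_mgr {
  size_t buffer_width;
  /* NULL means the format does not support partial decompression. */
  void (*calc_buffer_dimensions)(struct dest_mgr *dest, unsigned width,
                                 unsigned components);
} dest_mgr;

/* Example hook for a top-down, PPM-like format with one byte per sample. */
static void ppm_calc(dest_mgr *dest, unsigned width, unsigned components)
{
  dest->buffer_width = (size_t)width * components;
}

/* Recompute buffer dimensions after cropping.  Returns 0 on success or -1
   if the destination has no hook (where djpeg would call ERREXIT). */
static int apply_crop(dest_mgr *dest, unsigned cropped_width,
                      unsigned components)
{
  if (dest->calc_buffer_dimensions == NULL)
    return -1;
  dest->calc_buffer_dimensions(dest, cropped_width, components);
  return 0;
}
```

Keeping the size computation behind a per-format hook means the caller no longer needs to know the concrete destination type, which is why the `wrppm.h` include can be dropped from djpeg.c in the same change.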
diff --git a/doc/html/group___turbo_j_p_e_g.html b/doc/html/group___turbo_j_p_e_g.html
index 4b8d306..89780d4 100644
--- a/doc/html/group___turbo_j_p_e_g.html
+++ b/doc/html/group___turbo_j_p_e_g.html
@@ -455,7 +455,7 @@
 </div><div class="memdoc">
 
 <p>Disable buffer (re)allocation. </p>
-<p>If passed to <a class="el" href="group___turbo_j_p_e_g.html#gaf38f2ed44bdc88e730e08b632fa6e88e" title="Compress an RGB, grayscale, or CMYK image into a JPEG image.">tjCompress2()</a> or <a class="el" href="group___turbo_j_p_e_g.html#gad02cd42b69f193a0623a9c801788df3a" title="Losslessly transform a JPEG image into another JPEG image.">tjTransform()</a>, this flag will cause those functions to generate an error if the JPEG image buffer is invalid or too small rather than attempting to allocate or reallocate that buffer. This reproduces the behavior of earlier versions of TurboJPEG. </p>
+<p>If passed to one of the JPEG compression or transform functions, this flag will cause those functions to generate an error if the JPEG image buffer is invalid or too small rather than attempting to allocate or reallocate that buffer. This reproduces the behavior of earlier versions of TurboJPEG. </p>
 
 </div>
 </div>
@@ -812,7 +812,7 @@
 </div><div class="memdoc">
 
 <p>Allocate an image buffer for use with TurboJPEG. </p>
-<p>You should always use this function to allocate the JPEG destination buffer(s) for <a class="el" href="group___turbo_j_p_e_g.html#gaf38f2ed44bdc88e730e08b632fa6e88e" title="Compress an RGB, grayscale, or CMYK image into a JPEG image.">tjCompress2()</a> and <a class="el" href="group___turbo_j_p_e_g.html#gad02cd42b69f193a0623a9c801788df3a" title="Losslessly transform a JPEG image into another JPEG image.">tjTransform()</a> unless you are disabling automatic buffer (re)allocation (by setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a>.)</p>
+<p>You should always use this function to allocate the JPEG destination buffer(s) for the compression and transform functions unless you are disabling automatic buffer (re)allocation (by setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a>.)</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">bytes</td><td>the number of bytes to allocate</td></tr>
@@ -1008,13 +1008,13 @@
     <tr><td class="paramname">jpegBuf</td><td>address of a pointer to an image buffer that will receive the JPEG image. TurboJPEG has the ability to reallocate the JPEG buffer to accommodate the size of the JPEG image. Thus, you can choose to:<ol type="1">
 <li>pre-allocate the JPEG buffer with an arbitrary size using <a class="el" href="group___turbo_j_p_e_g.html#ga5c9234bda6d993cdaffdd89bf81a00ff" title="Allocate an image buffer for use with TurboJPEG.">tjAlloc()</a> and let TurboJPEG grow the buffer as needed,</li>
 <li>set <code>*jpegBuf</code> to NULL to tell TurboJPEG to allocate the buffer for you, or</li>
-<li>pre-allocate the buffer to a "worst case" size determined by calling <a class="el" href="group___turbo_j_p_e_g.html#gaccc5bca7f12fcdcc302e6e1c6d4b311b" title="The maximum size of the buffer (in bytes) required to hold a JPEG image with the given parameters...">tjBufSize()</a>. This should ensure that the buffer never has to be re-allocated (setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> guarantees this.)</li>
+<li>pre-allocate the buffer to a "worst case" size determined by calling <a class="el" href="group___turbo_j_p_e_g.html#gaccc5bca7f12fcdcc302e6e1c6d4b311b" title="The maximum size of the buffer (in bytes) required to hold a JPEG image with the given parameters...">tjBufSize()</a>. This should ensure that the buffer never has to be re-allocated (setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> guarantees that it won't be.)</li>
 </ol>
 If you choose option 1, <code>*jpegSize</code> should be set to the size of your pre-allocated buffer. In any case, unless you have set <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a>, you should always check <code>*jpegBuf</code> upon return from this function, as it may have changed.</td></tr>
     <tr><td class="paramname">jpegSize</td><td>pointer to an unsigned long variable that holds the size of the JPEG image buffer. If <code>*jpegBuf</code> points to a pre-allocated buffer, then <code>*jpegSize</code> should be set to the size of the buffer. Upon return, <code>*jpegSize</code> will contain the size of the JPEG image (in bytes.) If <code>*jpegBuf</code> points to a JPEG image buffer that is being reused from a previous call to one of the JPEG compression functions, then <code>*jpegSize</code> is ignored.</td></tr>
     <tr><td class="paramname">jpegSubsamp</td><td>the level of chrominance subsampling to be used when generating the JPEG image (see <a class="el" href="group___turbo_j_p_e_g.html#ga1d047060ea80bb9820d540bb928e9074">Chrominance subsampling options</a>.)</td></tr>
     <tr><td class="paramname">jpegQual</td><td>the image quality of the generated JPEG image (1 = worst, 100 = best)</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1106,12 +1106,12 @@
     <tr><td class="paramname">jpegBuf</td><td>address of a pointer to an image buffer that will receive the JPEG image. TurboJPEG has the ability to reallocate the JPEG buffer to accommodate the size of the JPEG image. Thus, you can choose to:<ol type="1">
 <li>pre-allocate the JPEG buffer with an arbitrary size using <a class="el" href="group___turbo_j_p_e_g.html#ga5c9234bda6d993cdaffdd89bf81a00ff" title="Allocate an image buffer for use with TurboJPEG.">tjAlloc()</a> and let TurboJPEG grow the buffer as needed,</li>
 <li>set <code>*jpegBuf</code> to NULL to tell TurboJPEG to allocate the buffer for you, or</li>
-<li>pre-allocate the buffer to a "worst case" size determined by calling <a class="el" href="group___turbo_j_p_e_g.html#gaccc5bca7f12fcdcc302e6e1c6d4b311b" title="The maximum size of the buffer (in bytes) required to hold a JPEG image with the given parameters...">tjBufSize()</a>. This should ensure that the buffer never has to be re-allocated (setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> guarantees this.)</li>
+<li>pre-allocate the buffer to a "worst case" size determined by calling <a class="el" href="group___turbo_j_p_e_g.html#gaccc5bca7f12fcdcc302e6e1c6d4b311b" title="The maximum size of the buffer (in bytes) required to hold a JPEG image with the given parameters...">tjBufSize()</a>. This should ensure that the buffer never has to be re-allocated (setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> guarantees that it won't be.)</li>
 </ol>
 If you choose option 1, <code>*jpegSize</code> should be set to the size of your pre-allocated buffer. In any case, unless you have set <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a>, you should always check <code>*jpegBuf</code> upon return from this function, as it may have changed.</td></tr>
     <tr><td class="paramname">jpegSize</td><td>pointer to an unsigned long variable that holds the size of the JPEG image buffer. If <code>*jpegBuf</code> points to a pre-allocated buffer, then <code>*jpegSize</code> should be set to the size of the buffer. Upon return, <code>*jpegSize</code> will contain the size of the JPEG image (in bytes.) If <code>*jpegBuf</code> points to a JPEG image buffer that is being reused from a previous call to one of the JPEG compression functions, then <code>*jpegSize</code> is ignored.</td></tr>
     <tr><td class="paramname">jpegQual</td><td>the image quality of the generated JPEG image (1 = worst, 100 = best)</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1203,12 +1203,12 @@
     <tr><td class="paramname">jpegBuf</td><td>address of a pointer to an image buffer that will receive the JPEG image. TurboJPEG has the ability to reallocate the JPEG buffer to accommodate the size of the JPEG image. Thus, you can choose to:<ol type="1">
 <li>pre-allocate the JPEG buffer with an arbitrary size using <a class="el" href="group___turbo_j_p_e_g.html#ga5c9234bda6d993cdaffdd89bf81a00ff" title="Allocate an image buffer for use with TurboJPEG.">tjAlloc()</a> and let TurboJPEG grow the buffer as needed,</li>
 <li>set <code>*jpegBuf</code> to NULL to tell TurboJPEG to allocate the buffer for you, or</li>
-<li>pre-allocate the buffer to a "worst case" size determined by calling <a class="el" href="group___turbo_j_p_e_g.html#gaccc5bca7f12fcdcc302e6e1c6d4b311b" title="The maximum size of the buffer (in bytes) required to hold a JPEG image with the given parameters...">tjBufSize()</a>. This should ensure that the buffer never has to be re-allocated (setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> guarantees this.)</li>
+<li>pre-allocate the buffer to a "worst case" size determined by calling <a class="el" href="group___turbo_j_p_e_g.html#gaccc5bca7f12fcdcc302e6e1c6d4b311b" title="The maximum size of the buffer (in bytes) required to hold a JPEG image with the given parameters...">tjBufSize()</a>. This should ensure that the buffer never has to be re-allocated (setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> guarantees that it won't be.)</li>
 </ol>
 If you choose option 1, <code>*jpegSize</code> should be set to the size of your pre-allocated buffer. In any case, unless you have set <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a>, you should always check <code>*jpegBuf</code> upon return from this function, as it may have changed.</td></tr>
     <tr><td class="paramname">jpegSize</td><td>pointer to an unsigned long variable that holds the size of the JPEG image buffer. If <code>*jpegBuf</code> points to a pre-allocated buffer, then <code>*jpegSize</code> should be set to the size of the buffer. Upon return, <code>*jpegSize</code> will contain the size of the JPEG image (in bytes.) If <code>*jpegBuf</code> points to a JPEG image buffer that is being reused from a previous call to one of the JPEG compression functions, then <code>*jpegSize</code> is ignored.</td></tr>
     <tr><td class="paramname">jpegQual</td><td>the image quality of the generated JPEG image (1 = worst, 100 = best)</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1301,7 +1301,7 @@
     <tr><td class="paramname">pitch</td><td>bytes per line in the destination image. Normally, this should be <code>width * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat]</code> if the destination image is unpadded, or <code><a class="el" href="group___turbo_j_p_e_g.html#ga0aba955473315e405295d978f0c16511" title="Pad the given width to the nearest 32-bit boundary.">TJPAD</a>(width * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat])</code> if each line of the destination image should be padded to the nearest 32-bit boundary, as is the case for Windows bitmaps. You can also be clever and use the pitch parameter to skip lines, etc. Setting this parameter to 0 is the equivalent of setting it to <code>width * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat]</code>.</td></tr>
     <tr><td class="paramname">height</td><td>height (in pixels) of the source and destination images</td></tr>
     <tr><td class="paramname">pixelFormat</td><td>pixel format of the destination image (see <a class="el" href="group___turbo_j_p_e_g.html#gac916144e26c3817ac514e64ae5d12e2a">Pixel formats</a>.)</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1394,7 +1394,7 @@
     <tr><td class="paramname">pitch</td><td>bytes per line in the destination image. Normally, this should be <code>width * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat]</code> if the destination image is unpadded, or <code><a class="el" href="group___turbo_j_p_e_g.html#ga0aba955473315e405295d978f0c16511" title="Pad the given width to the nearest 32-bit boundary.">TJPAD</a>(width * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat])</code> if each line of the destination image should be padded to the nearest 32-bit boundary, as is the case for Windows bitmaps. You can also be clever and use the pitch parameter to skip lines, etc. Setting this parameter to 0 is the equivalent of setting it to <code>width * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat]</code>.</td></tr>
     <tr><td class="paramname">height</td><td>height (in pixels) of the source and destination images</td></tr>
     <tr><td class="paramname">pixelFormat</td><td>pixel format of the destination image (see <a class="el" href="group___turbo_j_p_e_g.html#gac916144e26c3817ac514e64ae5d12e2a">Pixel formats</a>.)</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1479,7 +1479,7 @@
     <tr><td class="paramname">pitch</td><td>bytes per line in the destination image. Normally, this is <code>scaledWidth * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat]</code> if the decompressed image is unpadded, else <code><a class="el" href="group___turbo_j_p_e_g.html#ga0aba955473315e405295d978f0c16511" title="Pad the given width to the nearest 32-bit boundary.">TJPAD</a>(scaledWidth * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat])</code> if each line of the decompressed image is padded to the nearest 32-bit boundary, as is the case for Windows bitmaps. (NOTE: <code>scaledWidth</code> can be determined by calling <a class="el" href="group___turbo_j_p_e_g.html#ga84878bb65404204743aa18cac02781df" title="Compute the scaled value of dimension using the given scaling factor.">TJSCALED()</a> with the JPEG image width and one of the scaling factors returned by <a class="el" href="group___turbo_j_p_e_g.html#ga6449044b9af402999ccf52f401333be8" title="Returns a list of fractional scaling factors that the JPEG decompressor in this implementation of Tur...">tjGetScalingFactors()</a>.) You can also be clever and use the pitch parameter to skip lines, etc. Setting this parameter to 0 is the equivalent of setting it to <code>scaledWidth * <a class="el" href="group___turbo_j_p_e_g.html#gad77cf8fe5b2bfd3cb3f53098146abb4c" title="Pixel size (in bytes) for a given pixel format.">tjPixelSize</a>[pixelFormat]</code>.</td></tr>
     <tr><td class="paramname">height</td><td>desired height (in pixels) of the destination image. If this is different than the height of the JPEG image being decompressed, then TurboJPEG will use scaling in the JPEG decompressor to generate the largest possible image that will fit within the desired height. If <code>height</code> is set to 0, then only the width will be considered when determining the scaled image size.</td></tr>
     <tr><td class="paramname">pixelFormat</td><td>pixel format of the destination image (see <a class="el" href="group___turbo_j_p_e_g.html#gac916144e26c3817ac514e64ae5d12e2a">Pixel formats</a>.)</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1629,7 +1629,7 @@
     <tr><td class="paramname">width</td><td>desired width (in pixels) of the YUV image. If this is different than the width of the JPEG image being decompressed, then TurboJPEG will use scaling in the JPEG decompressor to generate the largest possible image that will fit within the desired width. If <code>width</code> is set to 0, then only the height will be considered when determining the scaled image size. If the scaled width is not an even multiple of the MCU block width (see <a class="el" href="group___turbo_j_p_e_g.html#ga9e61e7cd47a15a173283ba94e781308c" title="MCU block width (in pixels) for a given level of chrominance subsampling.">tjMCUWidth</a>), then an intermediate buffer copy will be performed within TurboJPEG.</td></tr>
     <tr><td class="paramname">pad</td><td>the width of each line in each plane of the YUV image will be padded to the nearest multiple of this number of bytes (must be a power of 2.) To generate images suitable for X Video, <code>pad</code> should be set to 4.</td></tr>
     <tr><td class="paramname">height</td><td>desired height (in pixels) of the YUV image. If this is different than the height of the JPEG image being decompressed, then TurboJPEG will use scaling in the JPEG decompressor to generate the largest possible image that will fit within the desired height. If <code>height</code> is set to 0, then only the width will be considered when determining the scaled image size. If the scaled height is not an even multiple of the MCU block height (see <a class="el" href="group___turbo_j_p_e_g.html#gabd247bb9fecb393eca57366feb8327bf" title="MCU block height (in pixels) for a given level of chrominance subsampling.">tjMCUHeight</a>), then an intermediate buffer copy will be performed within TurboJPEG.</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1708,7 +1708,7 @@
     <tr><td class="paramname">width</td><td>desired width (in pixels) of the YUV image. If this is different than the width of the JPEG image being decompressed, then TurboJPEG will use scaling in the JPEG decompressor to generate the largest possible image that will fit within the desired width. If <code>width</code> is set to 0, then only the height will be considered when determining the scaled image size. If the scaled width is not an even multiple of the MCU block width (see <a class="el" href="group___turbo_j_p_e_g.html#ga9e61e7cd47a15a173283ba94e781308c" title="MCU block width (in pixels) for a given level of chrominance subsampling.">tjMCUWidth</a>), then an intermediate buffer copy will be performed within TurboJPEG.</td></tr>
     <tr><td class="paramname">strides</td><td>an array of integers, each specifying the number of bytes per line in the corresponding plane of the output image. Setting the stride for any plane to 0 is the same as setting it to the scaled plane width (see <a class="el" href="group___turbo_j_p_e_g.html#YUVnotes">YUV Image Format Notes</a>.) If <code>strides</code> is NULL, then the strides for all planes will be set to their respective scaled plane widths. You can adjust the strides in order to add an arbitrary amount of line padding to each plane or to decompress the JPEG image into a subregion of a larger YUV planar image.</td></tr>
     <tr><td class="paramname">height</td><td>desired height (in pixels) of the YUV image. If this is different than the height of the JPEG image being decompressed, then TurboJPEG will use scaling in the JPEG decompressor to generate the largest possible image that will fit within the desired height. If <code>height</code> is set to 0, then only the width will be considered when determining the scaled image size. If the scaled height is not an even multiple of the MCU block height (see <a class="el" href="group___turbo_j_p_e_g.html#gabd247bb9fecb393eca57366feb8327bf" title="MCU block height (in pixels) for a given level of chrominance subsampling.">tjMCUHeight</a>), then an intermediate buffer copy will be performed within TurboJPEG.</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1826,7 +1826,7 @@
     <tr><td class="paramname">dstBuf</td><td>pointer to an image buffer that will receive the YUV image. Use <a class="el" href="group___turbo_j_p_e_g.html#gaf451664a62c1f6c7cc5a6401f32908c9" title="The size of the buffer (in bytes) required to hold a YUV planar image with the given parameters...">tjBufSizeYUV2()</a> to determine the appropriate size for this buffer based on the image width, height, padding, and level of chrominance subsampling. The Y, U (Cb), and V (Cr) image planes will be stored sequentially in the buffer (refer to <a class="el" href="group___turbo_j_p_e_g.html#YUVnotes">YUV Image Format Notes</a>.)</td></tr>
     <tr><td class="paramname">pad</td><td>the width of each line in each plane of the YUV image will be padded to the nearest multiple of this number of bytes (must be a power of 2.) To generate images suitable for X Video, <code>pad</code> should be set to 4.</td></tr>
     <tr><td class="paramname">subsamp</td><td>the level of chrominance subsampling to be used when generating the YUV image (see <a class="el" href="group___turbo_j_p_e_g.html#ga1d047060ea80bb9820d540bb928e9074">Chrominance subsampling options</a>.) To generate images suitable for X Video, <code>subsamp</code> should be set to <a class="el" href="group___turbo_j_p_e_g.html#gga1d047060ea80bb9820d540bb928e9074a63085dbf683cfe39e513cdb6343e3737">TJSAMP_420</a>. This produces an image compatible with the I420 (AKA "YUV420P") format.</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1919,7 +1919,7 @@
     <tr><td class="paramname">dstPlanes</td><td>an array of pointers to Y, U (Cb), and V (Cr) image planes (or just a Y plane, if generating a grayscale image) that will receive the encoded image. These planes can be contiguous or non-contiguous in memory. Use <a class="el" href="group___turbo_j_p_e_g.html#ga6f98d977bfa9d167c97172e876ba61e2" title="The size of the buffer (in bytes) required to hold a YUV image plane with the given parameters...">tjPlaneSizeYUV()</a> to determine the appropriate size for each plane based on the image width, height, strides, and level of chrominance subsampling. Refer to <a class="el" href="group___turbo_j_p_e_g.html#YUVnotes">YUV Image Format Notes</a> for more details.</td></tr>
     <tr><td class="paramname">strides</td><td>an array of integers, each specifying the number of bytes per line in the corresponding plane of the output image. Setting the stride for any plane to 0 is the same as setting it to the plane width (see <a class="el" href="group___turbo_j_p_e_g.html#YUVnotes">YUV Image Format Notes</a>.) If <code>strides</code> is NULL, then the strides for all planes will be set to their respective plane widths. You can adjust the strides in order to add an arbitrary amount of line padding to each plane or to encode an RGB or grayscale image into a subregion of a larger YUV planar image.</td></tr>
     <tr><td class="paramname">subsamp</td><td>the level of chrominance subsampling to be used when generating the YUV image (see <a class="el" href="group___turbo_j_p_e_g.html#ga1d047060ea80bb9820d540bb928e9074">Chrominance subsampling options</a>.) To generate images suitable for X Video, <code>subsamp</code> should be set to <a class="el" href="group___turbo_j_p_e_g.html#gga1d047060ea80bb9820d540bb928e9074a63085dbf683cfe39e513cdb6343e3737">TJSAMP_420</a>. This produces an image compatible with the I420 (AKA "YUV420P") format.</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
@@ -1942,7 +1942,7 @@
 </div><div class="memdoc">
 
 <p>Free an image buffer previously allocated by TurboJPEG. </p>
-<p>You should always use this function to free JPEG destination buffer(s) that were automatically (re)allocated by <a class="el" href="group___turbo_j_p_e_g.html#gaf38f2ed44bdc88e730e08b632fa6e88e" title="Compress an RGB, grayscale, or CMYK image into a JPEG image.">tjCompress2()</a> or <a class="el" href="group___turbo_j_p_e_g.html#gad02cd42b69f193a0623a9c801788df3a" title="Losslessly transform a JPEG image into another JPEG image.">tjTransform()</a> or that were manually allocated using <a class="el" href="group___turbo_j_p_e_g.html#ga5c9234bda6d993cdaffdd89bf81a00ff" title="Allocate an image buffer for use with TurboJPEG.">tjAlloc()</a>.</p>
+<p>You should always use this function to free JPEG destination buffer(s) that were automatically (re)allocated by the compression and transform functions or that were manually allocated using <a class="el" href="group___turbo_j_p_e_g.html#ga5c9234bda6d993cdaffdd89bf81a00ff" title="Allocate an image buffer for use with TurboJPEG.">tjAlloc()</a>.</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">buffer</td><td>address of the buffer to free</td></tr>
@@ -2270,12 +2270,12 @@
     <tr><td class="paramname">dstBufs</td><td>pointer to an array of n image buffers. <code>dstBufs[i]</code> will receive a JPEG image that has been transformed using the parameters in <code>transforms[i]</code>. TurboJPEG has the ability to reallocate the JPEG buffer to accommodate the size of the JPEG image. Thus, you can choose to:<ol type="1">
 <li>pre-allocate the JPEG buffer with an arbitrary size using <a class="el" href="group___turbo_j_p_e_g.html#ga5c9234bda6d993cdaffdd89bf81a00ff" title="Allocate an image buffer for use with TurboJPEG.">tjAlloc()</a> and let TurboJPEG grow the buffer as needed,</li>
 <li>set <code>dstBufs[i]</code> to NULL to tell TurboJPEG to allocate the buffer for you, or</li>
-<li>pre-allocate the buffer to a "worst case" size determined by calling <a class="el" href="group___turbo_j_p_e_g.html#gaccc5bca7f12fcdcc302e6e1c6d4b311b" title="The maximum size of the buffer (in bytes) required to hold a JPEG image with the given parameters...">tjBufSize()</a> with the transformed or cropped width and height. This should ensure that the buffer never has to be re-allocated (setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> guarantees this.)</li>
+<li>pre-allocate the buffer to a "worst case" size determined by calling <a class="el" href="group___turbo_j_p_e_g.html#gaccc5bca7f12fcdcc302e6e1c6d4b311b" title="The maximum size of the buffer (in bytes) required to hold a JPEG image with the given parameters...">tjBufSize()</a> with the transformed or cropped width and height. Under normal circumstances, this should ensure that the buffer never has to be re-allocated (setting <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> guarantees that it won't be.) Note, however, that there are some rare cases (such as transforming images with a large amount of embedded EXIF or ICC profile data) in which the output image will be larger than the worst-case size, and <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a> cannot be used in those cases.</li>
 </ol>
 If you choose option 1, <code>dstSizes[i]</code> should be set to the size of your pre-allocated buffer. In any case, unless you have set <a class="el" href="group___turbo_j_p_e_g.html#ga8808d403c68b62aaa58a4c1e58e98963" title="Disable buffer (re)allocation.">TJFLAG_NOREALLOC</a>, you should always check <code>dstBufs[i]</code> upon return from this function, as it may have changed.</td></tr>
     <tr><td class="paramname">dstSizes</td><td>pointer to an array of n unsigned long variables that will receive the actual sizes (in bytes) of each transformed JPEG image. If <code>dstBufs[i]</code> points to a pre-allocated buffer, then <code>dstSizes[i]</code> should be set to the size of the buffer. Upon return, <code>dstSizes[i]</code> will contain the size of the JPEG image (in bytes.)</td></tr>
     <tr><td class="paramname">transforms</td><td>pointer to an array of n <a class="el" href="structtjtransform.html" title="Lossless transform.">tjtransform</a> structures, each of which specifies the transform parameters and/or cropping region for the corresponding transformed output image.</td></tr>
-    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#ga72ecf4ebe6eb702d3c6f5ca27455e1ec">flags</a></td></tr>
+    <tr><td class="paramname">flags</td><td>the bitwise OR of one or more of the <a class="el" href="group___turbo_j_p_e_g.html#gacb233cfd722d66d1ccbf48a7de81f0e0">flags</a></td></tr>
   </table>
   </dd>
 </dl>
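Note on the `pitch` parameter documented in the hunks above: the text describes the unpadded pitch as `width * tjPixelSize[pixelFormat]` and the padded pitch as that value rounded up to the nearest 32-bit (4-byte) boundary via `TJPAD`. The following is a minimal Java sketch of that arithmetic only. The class and method names (`PitchSketch`, `pad32`, `pitch`) are hypothetical and not part of TurboJPEG, and `pixelSize` stands in for `tjPixelSize[pixelFormat]`.

```java
// Hypothetical illustration of the pitch arithmetic; not TurboJPEG API.
class PitchSketch {

  /** Round a byte count up to the next multiple of 4 (a 32-bit boundary). */
  static int pad32(int bytes) {
    return (bytes + 3) & ~3;
  }

  /**
   * Bytes per line for an image of the given width.
   * pixelSize is assumed to play the role of tjPixelSize[pixelFormat].
   */
  static int pitch(int width, int pixelSize, boolean padded) {
    int unpadded = width * pixelSize;
    return padded ? pad32(unpadded) : unpadded;
  }

  public static void main(String[] args) {
    // A 5-pixel-wide image with 3 bytes per pixel.
    System.out.println("unpadded pitch = " + pitch(5, 3, false));
    System.out.println("padded pitch   = " + pitch(5, 3, true));
  }
}
```

Under these assumptions, padding changes the pitch only when `width * pixelSize` is not already a multiple of 4, which is why 4-byte pixel formats are unaffected.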
diff --git a/doxygen-extra.css b/doxygen-extra.css
index 5abbcc2..f1bd4c2 100644
--- a/doxygen-extra.css
+++ b/doxygen-extra.css
@@ -1,3 +1,3 @@
 code {
-	color: #4665A2; 
+	color: #4665A2;
 }
diff --git a/java/TJBench.java b/java/TJBench.java
index 19db789..ddc414c 100644
--- a/java/TJBench.java
+++ b/java/TJBench.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (C)2009-2014, 2016 D. R. Commander.  All Rights Reserved.
+ * Copyright (C)2009-2014, 2016-2017 D. R. Commander.  All Rights Reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -34,8 +34,8 @@
 
 class TJBench {
 
-  static int flags = 0, quiet = 0, pf = TJ.PF_BGR, yuvpad = 1, warmup = 1;
-  static boolean compOnly, decompOnly, doTile, doYUV, write;
+  static int flags = 0, quiet = 0, pf = TJ.PF_BGR, yuvpad = 1;
+  static boolean compOnly, decompOnly, doTile, doYUV, write = true;
 
   static final String[] pixFormatStr = {
     "RGB", "BGR", "RGBX", "BGRX", "XBGR", "XRGB", "GRAY"
@@ -55,7 +55,7 @@
 
   static TJScalingFactor sf;
   static int xformOp = TJTransform.OP_NONE, xformOpt = 0;
-  static double benchTime = 5.0;
+  static double benchTime = 5.0, warmup = 1.0;
 
 
   static final double getTime() {
@@ -162,7 +162,7 @@
     }
 
     /* Benchmark */
-    iter -= warmup;
+    iter = -1;
     elapsed = elapsedDecode = 0.0;
     while (true) {
       int tile = 0;
@@ -184,11 +184,14 @@
             tjd.decompress(dstBuf, x, y, width, pitch, height, pf, flags);
         }
       }
-      iter++;
-      if (iter >= 1) {
-        elapsed += getTime() - start;
+      elapsed += getTime() - start;
+      if (iter >= 0) {
+        iter++;
         if (elapsed >= benchTime)
           break;
+      } else if (elapsed >= warmup) {
+        iter = 0;
+        elapsed = elapsedDecode = 0.0;
       }
     }
     if(doYUV)
@@ -321,7 +324,7 @@
       }
 
       /* Benchmark */
-      iter = -warmup;
+      iter = -1;
       elapsed = elapsedEncode = 0.0;
       while (true) {
         int tile = 0;
@@ -346,11 +349,14 @@
             totalJpegSize += jpegSize[tile];
           }
         }
-        iter++;
-        if (iter >= 1) {
-          elapsed += getTime() - start;
+        elapsed += getTime() - start;
+        if (iter >= 0) {
+          iter++;
           if (elapsed >= benchTime)
             break;
+        } else if (elapsed >= warmup) {
+          iter = 0;
+          elapsed = elapsedEncode = 0.0;
         }
       }
       if (doYUV)
@@ -541,17 +547,20 @@
           }
         }
 
-        iter = -warmup;
+        iter = -1;
         elapsed = 0.;
         while (true) {
           start = getTime();
           tjt.transform(jpegBuf, t, flags);
           jpegSize = tjt.getTransformedSizes();
-          iter++;
-          if (iter >= 1) {
-            elapsed += getTime() - start;
+          elapsed += getTime() - start;
+          if (iter >= 0) {
+            iter++;
             if (elapsed >= benchTime)
               break;
+          } else if (elapsed >= warmup) {
+            iter = 0;
+            elapsed = 0.0;
           }
         }
         t = null;
@@ -582,8 +591,8 @@
           System.out.print("N/A     N/A     ");
         jpegBuf = new byte[1][TJ.bufSize(_tilew, _tileh, subsamp)];
         jpegSize = new int[1];
+        jpegBuf[0] = srcBuf;
         jpegSize[0] = srcSize;
-        System.arraycopy(srcBuf, 0, jpegBuf[0], 0, srcSize);
       }
 
       if (w == tilew)
@@ -659,8 +668,9 @@
     System.out.println("-grayscale = Perform lossless grayscale conversion prior to decompression");
     System.out.println("     test (can be combined with the other transforms above)");
     System.out.println("-benchtime <t> = Run each benchmark for at least <t> seconds (default = 5.0)");
-    System.out.println("-warmup <w> = Execute each benchmark <w> times to prime the cache before");
-    System.out.println("     taking performance measurements (default = 1)");
+    System.out.println("-warmup <t> = Run each benchmark for <t> seconds (default = 1.0) prior to");
+    System.out.println("     starting the timer, in order to prime the caches and thus improve the");
+    System.out.println("     consistency of the results.");
     System.out.println("-componly = Stop after running compression tests.  Do not test decompression.");
     System.out.println("-nowrite = Do not write reference or output images (improves consistency");
     System.out.println("     of performance measurements.)\n");
@@ -711,37 +721,37 @@
           if (argv[i].equalsIgnoreCase("-tile")) {
             doTile = true;  xformOpt |= TJTransform.OPT_CROP;
           }
-          if (argv[i].equalsIgnoreCase("-fastupsample")) {
+          else if (argv[i].equalsIgnoreCase("-fastupsample")) {
             System.out.println("Using fast upsampling code\n");
             flags |= TJ.FLAG_FASTUPSAMPLE;
           }
-          if (argv[i].equalsIgnoreCase("-fastdct")) {
+          else if (argv[i].equalsIgnoreCase("-fastdct")) {
             System.out.println("Using fastest DCT/IDCT algorithm\n");
             flags |= TJ.FLAG_FASTDCT;
           }
-          if (argv[i].equalsIgnoreCase("-accuratedct")) {
+          else if (argv[i].equalsIgnoreCase("-accuratedct")) {
             System.out.println("Using most accurate DCT/IDCT algorithm\n");
             flags |= TJ.FLAG_ACCURATEDCT;
           }
-          if (argv[i].equalsIgnoreCase("-rgb"))
+          else if (argv[i].equalsIgnoreCase("-rgb"))
             pf = TJ.PF_RGB;
-          if (argv[i].equalsIgnoreCase("-rgbx"))
+          else if (argv[i].equalsIgnoreCase("-rgbx"))
             pf = TJ.PF_RGBX;
-          if (argv[i].equalsIgnoreCase("-bgr"))
+          else if (argv[i].equalsIgnoreCase("-bgr"))
             pf = TJ.PF_BGR;
-          if (argv[i].equalsIgnoreCase("-bgrx"))
+          else if (argv[i].equalsIgnoreCase("-bgrx"))
             pf = TJ.PF_BGRX;
-          if (argv[i].equalsIgnoreCase("-xbgr"))
+          else if (argv[i].equalsIgnoreCase("-xbgr"))
             pf = TJ.PF_XBGR;
-          if (argv[i].equalsIgnoreCase("-xrgb"))
+          else if (argv[i].equalsIgnoreCase("-xrgb"))
             pf = TJ.PF_XRGB;
-          if (argv[i].equalsIgnoreCase("-bottomup"))
+          else if (argv[i].equalsIgnoreCase("-bottomup"))
             flags |= TJ.FLAG_BOTTOMUP;
-          if (argv[i].equalsIgnoreCase("-quiet"))
+          else if (argv[i].equalsIgnoreCase("-quiet"))
             quiet = 1;
-          if (argv[i].equalsIgnoreCase("-qq"))
+          else if (argv[i].equalsIgnoreCase("-qq"))
             quiet = 2;
-          if (argv[i].equalsIgnoreCase("-scale") && i < argv.length - 1) {
+          else if (argv[i].equalsIgnoreCase("-scale") && i < argv.length - 1) {
             int temp1 = 0, temp2 = 0;
             boolean match = false, scanned = true;
             Scanner scanner = new Scanner(argv[++i]).useDelimiter("/");
@@ -764,25 +774,25 @@
             } else
               usage();
           }
-          if (argv[i].equalsIgnoreCase("-hflip"))
+          else if (argv[i].equalsIgnoreCase("-hflip"))
             xformOp = TJTransform.OP_HFLIP;
-          if (argv[i].equalsIgnoreCase("-vflip"))
+          else if (argv[i].equalsIgnoreCase("-vflip"))
             xformOp = TJTransform.OP_VFLIP;
-          if (argv[i].equalsIgnoreCase("-transpose"))
+          else if (argv[i].equalsIgnoreCase("-transpose"))
             xformOp = TJTransform.OP_TRANSPOSE;
-          if (argv[i].equalsIgnoreCase("-transverse"))
+          else if (argv[i].equalsIgnoreCase("-transverse"))
             xformOp = TJTransform.OP_TRANSVERSE;
-          if (argv[i].equalsIgnoreCase("-rot90"))
+          else if (argv[i].equalsIgnoreCase("-rot90"))
             xformOp = TJTransform.OP_ROT90;
-          if (argv[i].equalsIgnoreCase("-rot180"))
+          else if (argv[i].equalsIgnoreCase("-rot180"))
             xformOp = TJTransform.OP_ROT180;
-          if (argv[i].equalsIgnoreCase("-rot270"))
+          else if (argv[i].equalsIgnoreCase("-rot270"))
             xformOp = TJTransform.OP_ROT270;
-          if (argv[i].equalsIgnoreCase("-grayscale"))
+          else if (argv[i].equalsIgnoreCase("-grayscale"))
             xformOpt |= TJTransform.OPT_GRAY;
-          if (argv[i].equalsIgnoreCase("-nooutput"))
+          else if (argv[i].equalsIgnoreCase("-nooutput"))
             xformOpt |= TJTransform.OPT_NOOUTPUT;
-          if (argv[i].equalsIgnoreCase("-benchtime") && i < argv.length - 1) {
+          else if (argv[i].equalsIgnoreCase("-benchtime") && i < argv.length - 1) {
             double temp = -1;
             try {
               temp = Double.parseDouble(argv[++i]);
@@ -792,11 +802,11 @@
             else
               usage();
           }
-          if (argv[i].equalsIgnoreCase("-yuv")) {
+          else if (argv[i].equalsIgnoreCase("-yuv")) {
             System.out.println("Testing YUV planar encoding/decoding\n");
             doYUV = true;
           }
-          if (argv[i].equalsIgnoreCase("-yuvpad") && i < argv.length - 1) {
+          else if (argv[i].equalsIgnoreCase("-yuvpad") && i < argv.length - 1) {
             int temp = 0;
             try {
              temp = Integer.parseInt(argv[++i]);
@@ -804,7 +814,7 @@
             if (temp >= 1)
               yuvpad = temp;
           }
-          if (argv[i].equalsIgnoreCase("-subsamp") && i < argv.length - 1) {
+          else if (argv[i].equalsIgnoreCase("-subsamp") && i < argv.length - 1) {
             i++;
             if (argv[i].toUpperCase().startsWith("G"))
               subsamp = TJ.SAMP_GRAY;
@@ -819,22 +829,22 @@
             else if (argv[i].equals("411"))
               subsamp = TJ.SAMP_411;
           }
-          if (argv[i].equalsIgnoreCase("-componly"))
+          else if (argv[i].equalsIgnoreCase("-componly"))
             compOnly = true;
-          if (argv[i].equalsIgnoreCase("-nowrite"))
+          else if (argv[i].equalsIgnoreCase("-nowrite"))
             write = false;
-          if (argv[i].equalsIgnoreCase("-warmup") && i < argv.length - 1) {
-            int temp = -1;
+          else if (argv[i].equalsIgnoreCase("-warmup") && i < argv.length - 1) {
+            double temp = -1;
             try {
-             temp = Integer.parseInt(argv[++i]);
+             temp = Double.parseDouble(argv[++i]);
             } catch (NumberFormatException e) {}
-            if (temp >= 0) {
+            if (temp >= 0.0) {
               warmup = temp;
-              System.out.format("Warmup runs = %d\n\n", warmup);
-            }
+              System.out.format("Warmup time = %.1f seconds\n\n", warmup);
+            } else
+              usage();
           }
-          if (argv[i].equalsIgnoreCase("-?"))
-            usage();
+          else usage();
         }
       }
 
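Note on the TJBench.java changes above: the warmup is now a duration in seconds rather than an iteration count. `iter` starts at -1 to mark the warmup phase, elapsed time accumulates on every pass, and once the warmup duration has been reached both the counter and the timer are reset before measurement begins. The sketch below isolates that loop shape so it can be read on its own. It is an illustration under stated assumptions, not code from TJBench: `WarmupSketch`, `Result`, `run`, and the injected `clock` are hypothetical names introduced here.

```java
import java.util.function.DoubleSupplier;

// Hypothetical stand-alone version of the time-based warmup loop.
class WarmupSketch {

  /** Iterations counted and seconds measured after the warmup phase. */
  static final class Result {
    final int iterations;
    final double elapsed;

    Result(int iterations, double elapsed) {
      this.iterations = iterations;
      this.elapsed = elapsed;
    }
  }

  /**
   * Runs work repeatedly: first for at least warmup seconds (discarded),
   * then for at least benchTime seconds (measured).
   * clock returns the current time in seconds.
   */
  static Result run(Runnable work, DoubleSupplier clock,
                    double warmup, double benchTime) {
    int iter = -1;  // -1 means "still warming up"
    double elapsed = 0.0;
    while (true) {
      double start = clock.getAsDouble();
      work.run();
      elapsed += clock.getAsDouble() - start;
      if (iter >= 0) {
        iter++;
        if (elapsed >= benchTime)
          break;
      } else if (elapsed >= warmup) {
        // Warmup finished: discard what was accumulated and start measuring.
        iter = 0;
        elapsed = 0.0;
      }
    }
    return new Result(iter, elapsed);
  }

  public static void main(String[] args) {
    Result r = run(() -> Math.sqrt(12345.678),
                   () -> System.nanoTime() / 1e9, 0.05, 0.1);
    System.out.println(r.iterations + " iterations in " + r.elapsed + " s");
  }
}
```

One consequence of this structure is that a warmup of 0.0 still discards the first pass, because the reset branch is taken after the first call to `work`. Passing the clock as a parameter is a choice made here only so that the loop can be exercised with a deterministic fake clock.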
diff --git a/java/TJExample.java b/java/TJExample.java
index da09807..835a5b9 100644
--- a/java/TJExample.java
+++ b/java/TJExample.java
@@ -1,5 +1,6 @@
 /*
- * Copyright (C)2011-2012, 2014-2015 D. R. Commander.  All Rights Reserved.
+ * Copyright (C)2011-2012, 2014-2015, 2017 D. R. Commander.
+ *                                         All Rights Reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -94,7 +95,7 @@
   }
 
   private static final String[] sampName = {
-    "4:4:4", "4:2:2", "4:2:0", "Grayscale", "4:4:0"
+    "4:4:4", "4:2:2", "4:2:0", "Grayscale", "4:4:0", "4:1:1"
   };
 
   public static void main(String[] argv) {
@@ -117,114 +118,114 @@
       int outSubsamp = -1, outQual = 95;
       boolean display = false;
 
-      if (argv.length > 1) {
-        for (int i = 1; i < argv.length; i++) {
-          if (argv[i].length() < 2)
-            continue;
-          if (argv[i].length() > 2 &&
-              argv[i].substring(0, 3).equalsIgnoreCase("-sc")) {
-            int match = 0;
-            if (i < argv.length - 1) {
-              String[] scaleArg = argv[++i].split("/");
-              if (scaleArg.length == 2) {
-                TJScalingFactor tempsf =
-                  new TJScalingFactor(Integer.parseInt(scaleArg[0]),
-                                      Integer.parseInt(scaleArg[1]));
-                for (int j = 0; j < sf.length; j++) {
-                  if (tempsf.equals(sf[j])) {
-                    scaleFactor = sf[j];
-                    match = 1;
-                    break;
-                  }
+      if (argv[1].substring(0, 2).equalsIgnoreCase("-d"))
+        display = true;
+
+      for (int i = 2; i < argv.length; i++) {
+        if (argv[i].length() < 2)
+          continue;
+        else if (argv[i].length() > 2 &&
+            argv[i].substring(0, 3).equalsIgnoreCase("-sc")) {
+          int match = 0;
+          if (i < argv.length - 1) {
+            String[] scaleArg = argv[++i].split("/");
+            if (scaleArg.length == 2) {
+              TJScalingFactor tempsf =
+                new TJScalingFactor(Integer.parseInt(scaleArg[0]),
+                                    Integer.parseInt(scaleArg[1]));
+              for (int j = 0; j < sf.length; j++) {
+                if (tempsf.equals(sf[j])) {
+                  scaleFactor = sf[j];
+                  match = 1;
+                  break;
                 }
               }
             }
-            if (match != 1) usage();
           }
-          if (argv[i].equalsIgnoreCase("-h") || argv[i].equalsIgnoreCase("-?"))
-            usage();
-          if (argv[i].length() > 2 &&
-              argv[i].substring(0, 3).equalsIgnoreCase("-sa")) {
-            if (i < argv.length - 1) {
-              i++;
-              if (argv[i].substring(0, 1).equalsIgnoreCase("g"))
-                outSubsamp = TJ.SAMP_GRAY;
-              else if (argv[i].equals("444"))
-                outSubsamp = TJ.SAMP_444;
-              else if (argv[i].equals("422"))
-                outSubsamp = TJ.SAMP_422;
-              else if (argv[i].equals("420"))
-                outSubsamp = TJ.SAMP_420;
-              else
-                usage();
-            } else
-              usage();
-          }
-          if (argv[i].substring(0, 2).equalsIgnoreCase("-q")) {
-            if (i < argv.length - 1) {
-              int qual = Integer.parseInt(argv[++i]);
-              if (qual >= 1 && qual <= 100)
-                outQual = qual;
-              else
-                usage();
-            } else
-              usage();
-          }
-          if (argv[i].substring(0, 2).equalsIgnoreCase("-g"))
-            xform.options |= TJTransform.OPT_GRAY;
-          if (argv[i].equalsIgnoreCase("-hflip"))
-            xform.op = TJTransform.OP_HFLIP;
-          if (argv[i].equalsIgnoreCase("-vflip"))
-            xform.op = TJTransform.OP_VFLIP;
-          if (argv[i].equalsIgnoreCase("-transpose"))
-            xform.op = TJTransform.OP_TRANSPOSE;
-          if (argv[i].equalsIgnoreCase("-transverse"))
-            xform.op = TJTransform.OP_TRANSVERSE;
-          if (argv[i].equalsIgnoreCase("-rot90"))
-            xform.op = TJTransform.OP_ROT90;
-          if (argv[i].equalsIgnoreCase("-rot180"))
-            xform.op = TJTransform.OP_ROT180;
-          if (argv[i].equalsIgnoreCase("-rot270"))
-            xform.op = TJTransform.OP_ROT270;
-          if (argv[i].equalsIgnoreCase("-custom"))
-            xform.cf = new TJExample();
-          else if (argv[i].length() > 2 &&
-                   argv[i].substring(0, 2).equalsIgnoreCase("-c")) {
-            if (i >= argv.length - 1)
-              usage();
-            String[] cropArg = argv[++i].split(",");
-            if (cropArg.length != 3)
-              usage();
-            String[] dimArg = cropArg[2].split("[xX]");
-            if (dimArg.length != 2)
-              usage();
-            int tempx = Integer.parseInt(cropArg[0]);
-            int tempy = Integer.parseInt(cropArg[1]);
-            int tempw = Integer.parseInt(dimArg[0]);
-            int temph = Integer.parseInt(dimArg[1]);
-            if (tempx < 0 || tempy < 0 || tempw < 0 || temph < 0)
-              usage();
-            xform.x = tempx;
-            xform.y = tempy;
-            xform.width = tempw;
-            xform.height = temph;
-            xform.options |= TJTransform.OPT_CROP;
-          }
-          if (argv[i].substring(0, 2).equalsIgnoreCase("-d"))
-            display = true;
-          if (argv[i].equalsIgnoreCase("-fastupsample")) {
-            System.out.println("Using fast upsampling code");
-            flags |= TJ.FLAG_FASTUPSAMPLE;
-          }
-          if (argv[i].equalsIgnoreCase("-fastdct")) {
-            System.out.println("Using fastest DCT/IDCT algorithm");
-            flags |= TJ.FLAG_FASTDCT;
-          }
-          if (argv[i].equalsIgnoreCase("-accuratedct")) {
-            System.out.println("Using most accurate DCT/IDCT algorithm");
-            flags |= TJ.FLAG_ACCURATEDCT;
-          }
+          if (match != 1) usage();
         }
+        else if (argv[i].length() > 2 &&
+            argv[i].substring(0, 3).equalsIgnoreCase("-sa")) {
+          if (i < argv.length - 1) {
+            i++;
+            if (argv[i].substring(0, 1).equalsIgnoreCase("g"))
+              outSubsamp = TJ.SAMP_GRAY;
+            else if (argv[i].equals("444"))
+              outSubsamp = TJ.SAMP_444;
+            else if (argv[i].equals("422"))
+              outSubsamp = TJ.SAMP_422;
+            else if (argv[i].equals("420"))
+              outSubsamp = TJ.SAMP_420;
+            else
+              usage();
+          } else
+            usage();
+        }
+        else if (argv[i].substring(0, 2).equalsIgnoreCase("-q")) {
+          if (i < argv.length - 1) {
+            int qual = Integer.parseInt(argv[++i]);
+            if (qual >= 1 && qual <= 100)
+              outQual = qual;
+            else
+              usage();
+          } else
+            usage();
+        }
+        else if (argv[i].substring(0, 2).equalsIgnoreCase("-g"))
+          xform.options |= TJTransform.OPT_GRAY;
+        else if (argv[i].equalsIgnoreCase("-hflip"))
+          xform.op = TJTransform.OP_HFLIP;
+        else if (argv[i].equalsIgnoreCase("-vflip"))
+          xform.op = TJTransform.OP_VFLIP;
+        else if (argv[i].equalsIgnoreCase("-transpose"))
+          xform.op = TJTransform.OP_TRANSPOSE;
+        else if (argv[i].equalsIgnoreCase("-transverse"))
+          xform.op = TJTransform.OP_TRANSVERSE;
+        else if (argv[i].equalsIgnoreCase("-rot90"))
+          xform.op = TJTransform.OP_ROT90;
+        else if (argv[i].equalsIgnoreCase("-rot180"))
+          xform.op = TJTransform.OP_ROT180;
+        else if (argv[i].equalsIgnoreCase("-rot270"))
+          xform.op = TJTransform.OP_ROT270;
+        else if (argv[i].equalsIgnoreCase("-custom"))
+          xform.cf = new TJExample();
+        else if (argv[i].length() > 2 &&
+                 argv[i].substring(0, 2).equalsIgnoreCase("-c")) {
+          if (i >= argv.length - 1)
+            usage();
+          String[] cropArg = argv[++i].split(",");
+          if (cropArg.length != 3)
+            usage();
+          String[] dimArg = cropArg[2].split("[xX]");
+          if (dimArg.length != 2)
+            usage();
+          int tempx = Integer.parseInt(cropArg[0]);
+          int tempy = Integer.parseInt(cropArg[1]);
+          int tempw = Integer.parseInt(dimArg[0]);
+          int temph = Integer.parseInt(dimArg[1]);
+          if (tempx < 0 || tempy < 0 || tempw < 0 || temph < 0)
+            usage();
+          xform.x = tempx;
+          xform.y = tempy;
+          xform.width = tempw;
+          xform.height = temph;
+          xform.options |= TJTransform.OPT_CROP;
+        }
+        else if (argv[i].substring(0, 2).equalsIgnoreCase("-d"))
+          display = true;
+        else if (argv[i].equalsIgnoreCase("-fastupsample")) {
+          System.out.println("Using fast upsampling code");
+          flags |= TJ.FLAG_FASTUPSAMPLE;
+        }
+        else if (argv[i].equalsIgnoreCase("-fastdct")) {
+          System.out.println("Using fastest DCT/IDCT algorithm");
+          flags |= TJ.FLAG_FASTDCT;
+        }
+        else if (argv[i].equalsIgnoreCase("-accuratedct")) {
+          System.out.println("Using most accurate DCT/IDCT algorithm");
+          flags |= TJ.FLAG_ACCURATEDCT;
+        }
+        else usage();
       }
       String[] inFileTokens = argv[0].split("\\.");
       if (inFileTokens.length > 1)
diff --git a/java/TJUnitTest.java b/java/TJUnitTest.java
index 444e798..47ff7bb 100644
--- a/java/TJUnitTest.java
+++ b/java/TJUnitTest.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (C)2011-2016 D. R. Commander.  All Rights Reserved.
+ * Copyright (C)2011-2017 D. R. Commander.  All Rights Reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -44,10 +44,10 @@
 
   private static void usage() {
     System.out.println("\nUSAGE: java " + classname + " [options]\n");
-    System.out.println("Options:\n");
-    System.out.println("-yuv = test YUV encoding/decoding support\n");
-    System.out.println("-noyuvpad = do not pad each line of each Y, U, and V plane to the nearest\n");
-    System.out.println("            4-byte boundary\n");
+    System.out.println("Options:");
+    System.out.println("-yuv = test YUV encoding/decoding support");
+    System.out.println("-noyuvpad = do not pad each line of each Y, U, and V plane to the nearest");
+    System.out.println("            4-byte boundary");
     System.out.println("-bi = test BufferedImage support\n");
     System.exit(1);
   }
@@ -109,21 +109,12 @@
       case BufferedImage.TYPE_BYTE_GRAY:
         return TJ.PF_GRAY;
       case BufferedImage.TYPE_INT_BGR:
-        if (byteOrder == ByteOrder.BIG_ENDIAN)
-          return TJ.PF_XBGR;
-        else
-          return TJ.PF_RGBX;
+        return TJ.PF_RGBX;
       case BufferedImage.TYPE_INT_RGB:
-        if (byteOrder == ByteOrder.BIG_ENDIAN)
-          return TJ.PF_XRGB;
-        else
-          return TJ.PF_BGRX;
+        return TJ.PF_BGRX;
       case BufferedImage.TYPE_INT_ARGB:
       case BufferedImage.TYPE_INT_ARGB_PRE:
-        if (byteOrder == ByteOrder.BIG_ENDIAN)
-          return TJ.PF_ARGB;
-        else
-          return TJ.PF_BGRA;
+        return TJ.PF_BGRA;
     }
     return 0;
   }
@@ -911,15 +902,13 @@
       for (int i = 0; i < argv.length; i++) {
         if (argv[i].equalsIgnoreCase("-yuv"))
           doYUV = true;
-        if (argv[i].equalsIgnoreCase("-noyuvpad"))
+        else if (argv[i].equalsIgnoreCase("-noyuvpad"))
           pad = 1;
-        if (argv[i].substring(0, 1).equalsIgnoreCase("-h") ||
-            argv[i].equalsIgnoreCase("-?"))
-          usage();
-        if (argv[i].equalsIgnoreCase("-bi")) {
+        else if (argv[i].equalsIgnoreCase("-bi")) {
           bi = true;
           testName = "javabitest";
-        }
+        } else
+          usage();
       }
       if (doYUV)
         _4byteFormats[4] = -1;
diff --git a/java/org/libjpegturbo/turbojpeg/YUVImage.java b/java/org/libjpegturbo/turbojpeg/YUVImage.java
index 1a05e62..d123e37 100644
--- a/java/org/libjpegturbo/turbojpeg/YUVImage.java
+++ b/java/org/libjpegturbo/turbojpeg/YUVImage.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (C)2014 D. R. Commander.  All Rights Reserved.
+ * Copyright (C)2014, 2017 D. R. Commander.  All Rights Reserved.
  * Copyright (C)2015 Viktor Szathmáry.  All Rights Reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -220,10 +220,13 @@
       throw new IllegalArgumentException("Invalid argument in YUVImage::setBuf()");
 
     int nc = (subsamp == TJ.SAMP_GRAY ? 1 : 3);
-    if (planes.length != nc || (offsets != null && offsets.length != nc) ||
+    if ((planes != null && planes.length != nc) ||
+        (offsets != null && offsets.length != nc) ||
         (strides != null && strides.length != nc))
       throw new IllegalArgumentException("YUVImage::setBuf(): planes, offsets, or strides array is the wrong size");
 
+    if (planes == null)
+      planes = new byte[nc][];
     if (offsets == null)
       offsets = new int[nc];
     if (strides == null)
diff --git a/jcdctmgr.c b/jcdctmgr.c
index aef8517..6e3b19b 100644
--- a/jcdctmgr.c
+++ b/jcdctmgr.c
@@ -216,7 +216,7 @@
 #endif
   dtbl[DCTSIZE2 * 3] = (DCTELEM) r - sizeof(DCTELEM)*8; /* shift */
 
-  if(r <= 16) return 0;
+  if (r <= 16) return 0;
   else return 1;
 }
 
diff --git a/jconfig.h.in b/jconfig.h.in
new file mode 100644
index 0000000..02c12cc
--- /dev/null
+++ b/jconfig.h.in
@@ -0,0 +1,73 @@
+/* Version ID for the JPEG library.
+ * Might be useful for tests like "#if JPEG_LIB_VERSION >= 60".
+ */
+#define JPEG_LIB_VERSION  62	/* Version 6b */
+
+/* libjpeg-turbo version */
+#define LIBJPEG_TURBO_VERSION 0
+
+/* libjpeg-turbo version in integer form */
+#define LIBJPEG_TURBO_VERSION_NUMBER 0
+
+/* Support arithmetic encoding */
+#undef C_ARITH_CODING_SUPPORTED
+
+/* Support arithmetic decoding */
+#undef D_ARITH_CODING_SUPPORTED
+
+/*
+ * Define BITS_IN_JSAMPLE as either
+ *   8   for 8-bit sample values (the usual setting)
+ *   12  for 12-bit sample values
+ * Only 8 and 12 are legal data precisions for lossy JPEG according to the
+ * JPEG standard, and the IJG code does not support anything else!
+ * We do not support run-time selection of data precision, sorry.
+ */
+
+#define BITS_IN_JSAMPLE  8      /* use 8 or 12 */
+
+/* Define to 1 if you have the <locale.h> header file. */
+#undef HAVE_LOCALE_H
+
+/* Define to 1 if you have the <stddef.h> header file. */
+#undef HAVE_STDDEF_H
+
+/* Define to 1 if you have the <stdlib.h> header file. */
+#undef HAVE_STDLIB_H
+
+/* Define to 1 if the system has the type `unsigned char'. */
+#undef HAVE_UNSIGNED_CHAR
+
+/* Define to 1 if the system has the type `unsigned short'. */
+#undef HAVE_UNSIGNED_SHORT
+
+/* Compiler does not support pointers to undefined structures. */
+#undef INCOMPLETE_TYPES_BROKEN
+
+/* Support in-memory source/destination managers */
+#undef MEM_SRCDST_SUPPORTED
+
+/* Define if you have BSD-like bzero and bcopy in <strings.h> rather than
+   memset/memcpy in <string.h>. */
+#undef NEED_BSD_STRINGS
+
+/* Define if you need to include <sys/types.h> to get size_t. */
+#undef NEED_SYS_TYPES_H
+
+/* Define if your (broken) compiler shifts signed values as if they were
+   unsigned. */
+#undef RIGHT_SHIFT_IS_UNSIGNED
+
+/* Use accelerated SIMD routines. */
+#undef WITH_SIMD
+
+/* Define to 1 if type `char' is unsigned and you are not using gcc.  */
+#ifndef __CHAR_UNSIGNED__
+# undef __CHAR_UNSIGNED__
+#endif
+
+/* Define to empty if `const' does not conform to ANSI C. */
+#undef const
+
+/* Define to `unsigned int' if <sys/types.h> does not define. */
+#undef size_t
diff --git a/jconfigint.h.in b/jconfigint.h.in
new file mode 100644
index 0000000..963e760
--- /dev/null
+++ b/jconfigint.h.in
@@ -0,0 +1,17 @@
+/* libjpeg-turbo build number */
+#undef BUILD
+
+/* Compiler's inline keyword */
+#undef inline
+
+/* How to obtain function inlining. */
+#undef INLINE
+
+/* Define to the full name of this package. */
+#undef PACKAGE_NAME
+
+/* Version number of package */
+#undef VERSION
+
+/* The size of `size_t', as computed by sizeof. */
+#undef SIZEOF_SIZE_T
diff --git a/jcstest.c b/jcstest.c
index 358ed25..11883b5 100644
--- a/jcstest.c
+++ b/jcstest.c
@@ -77,7 +77,7 @@
   jerr.pub.error_exit = my_error_exit;
   jerr.pub.output_message = my_output_message;
 
-  if(setjmp(jerr.jb)) {
+  if (setjmp(jerr.jb)) {
     /* this will execute if libjpeg has an error */
     jcs_valid = 0;
     goto done;
@@ -104,7 +104,7 @@
   printf("  Not present at compile time\n");
   #endif
 
-  if(setjmp(jerr.jb)) {
+  if (setjmp(jerr.jb)) {
     /* this will execute if libjpeg has an error */
     jcs_alpha_valid = 0;
     goto done2;
diff --git a/jdapistd.c b/jdapistd.c
index 37afc84..105121d 100644
--- a/jdapistd.c
+++ b/jdapistd.c
@@ -4,7 +4,7 @@
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1994-1996, Thomas G. Lane.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2010, 2015-2016, D. R. Commander.
+ * Copyright (C) 2010, 2015-2017, D. R. Commander.
  * Copyright (C) 2015, Google, Inc.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
@@ -190,7 +190,10 @@
    * single-pass decompression case, allowing us to use the same MCU column
    * width for all of the components.
    */
-  align = cinfo->_min_DCT_scaled_size * cinfo->max_h_samp_factor;
+  if (cinfo->comps_in_scan == 1 && cinfo->num_components == 1)
+    align = cinfo->_min_DCT_scaled_size;
+  else
+    align = cinfo->_min_DCT_scaled_size * cinfo->max_h_samp_factor;
 
   /* Adjust xoffset to the nearest iMCU boundary <= the requested value */
   input_xoffset = *xoffset;
@@ -215,6 +218,9 @@
 
   for (ci = 0, compptr = cinfo->comp_info; ci < cinfo->num_components;
        ci++, compptr++) {
+    int hsf = (cinfo->comps_in_scan == 1 && cinfo->num_components == 1) ?
+              1 : compptr->h_samp_factor;
+
     /* Set downsampled_width to the new output width. */
     orig_downsampled_width = compptr->downsampled_width;
     compptr->downsampled_width =
@@ -228,11 +234,10 @@
      * values will be used in multi-scan decompressions.
      */
     cinfo->master->first_MCU_col[ci] =
-      (JDIMENSION) (long) (*xoffset * compptr->h_samp_factor) /
-                   (long) align;
+      (JDIMENSION) (long) (*xoffset * hsf) / (long) align;
     cinfo->master->last_MCU_col[ci] =
       (JDIMENSION) jdiv_round_up((long) ((*xoffset + cinfo->output_width) *
-                                         compptr->h_samp_factor),
+                                         hsf),
                                  (long) align) - 1;
   }
 
@@ -293,6 +298,14 @@
 }
 
 
+/* Dummy quantize function used by jpeg_skip_scanlines() */
+LOCAL(void)
+noop_quantize (j_decompress_ptr cinfo, JSAMPARRAY input_buf,
+               JSAMPARRAY output_buf, int num_rows)
+{
+}
+
+
 /*
  * In some cases, it is best to call jpeg_read_scanlines() and discard the
  * output, rather than skipping the scanlines, because this allows us to
@@ -308,14 +321,22 @@
   void (*color_convert) (j_decompress_ptr cinfo, JSAMPIMAGE input_buf,
                          JDIMENSION input_row, JSAMPARRAY output_buf,
                          int num_rows);
+  void (*color_quantize) (j_decompress_ptr cinfo, JSAMPARRAY input_buf,
+                          JSAMPARRAY output_buf, int num_rows) = NULL;
 
   color_convert = cinfo->cconvert->color_convert;
   cinfo->cconvert->color_convert = noop_convert;
+  if (cinfo->cquantize && cinfo->cquantize->color_quantize) {
+    color_quantize = cinfo->cquantize->color_quantize;
+    cinfo->cquantize->color_quantize = noop_quantize;
+  }
 
   for (n = 0; n < num_lines; n++)
     jpeg_read_scanlines(cinfo, NULL, 1);
 
   cinfo->cconvert->color_convert = color_convert;
+  if (color_quantize)
+    cinfo->cquantize->color_quantize = color_quantize;
 }
 
 
@@ -370,6 +391,8 @@
   /* Do not skip past the bottom of the image. */
   if (cinfo->output_scanline + num_lines >= cinfo->output_height) {
     cinfo->output_scanline = cinfo->output_height;
+    (*cinfo->inputctl->finish_input_pass) (cinfo);
+    cinfo->inputctl->eoi_reached = TRUE;
     return cinfo->output_height - cinfo->output_scanline;
   }
 
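Note on the first jdapistd.c hunk above: the alignment it computes can be sketched in isolation. This is only an illustration under stated assumptions, not the library's code. `imcu_align` and `align_xoffset_down` are hypothetical helpers, and their parameters merely mirror the `cinfo` fields named in the diff.

```c
#include <assert.h>

/* Hypothetical helper: column alignment (in pixels) for a crop request.
 * A single-component image scanned alone uses a sampling factor of 1,
 * as in the new branch added by the hunk. */
static unsigned int imcu_align(unsigned int min_dct_scaled_size,
                               unsigned int max_h_samp_factor,
                               int num_components, int comps_in_scan)
{
  if (comps_in_scan == 1 && num_components == 1)
    return min_dct_scaled_size;
  return min_dct_scaled_size * max_h_samp_factor;
}

/* Hypothetical helper: round xoffset down to the nearest multiple of
 * align, i.e. the nearest boundary <= the requested value. */
static unsigned int align_xoffset_down(unsigned int xoffset,
                                       unsigned int align)
{
  assert(align > 0);
  return (xoffset / align) * align;
}
```

With a DCT scaled size of 8 and a maximum horizontal sampling factor of 2, a three-component image aligns to 16 columns while a single-component image aligns to 8, so a requested offset of 13 rounds down to 0 in the first case and to 8 in the second.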
diff --git a/jdarith.c b/jdarith.c
index df3540e..ce0f920 100644
--- a/jdarith.c
+++ b/jdarith.c
@@ -21,6 +21,9 @@
 #include "jpeglib.h"
 
 
+#define NEG_1 ((unsigned int)-1)
+
+
 /* Expanded entropy decoder object for arithmetic decoding. */
 
 typedef struct {
@@ -450,7 +453,7 @@
   tbl = cinfo->cur_comp_info[0]->ac_tbl_no;
 
   p1 = 1 << cinfo->Al;          /* 1 in the bit position being coded */
-  m1 = (-1) << cinfo->Al;       /* -1 in the bit position being coded */
+  m1 = (NEG_1) << cinfo->Al;    /* -1 in the bit position being coded */
 
   /* Establish EOBx (previous stage end-of-block) index */
   for (kex = cinfo->Se; kex > 0; kex--)
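Note on the jdarith.c change above: left-shifting a negative signed integer is undefined behavior in C, whereas shifting an unsigned value is well defined, which is presumably why the patch introduces `NEG_1`. The sketch below shows the idea in isolation. `minus_one_at_bit` is a hypothetical name, not part of the library.

```c
/* Hypothetical helper: build a mask with every bit set from position al
 * upward, i.e. the two's-complement bit pattern of -(1 << al).
 * Casting -1 to unsigned int first keeps the shift well defined.
 * al is assumed to be non-negative and smaller than the bit width of
 * unsigned int. */
static unsigned int minus_one_at_bit(int al)
{
  return ((unsigned int)-1) << al;
}
```

Converting the result back to a signed type is implementation-defined, though common two's-complement targets yield the expected negative value.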
diff --git a/jdatadst-tj.c b/jdatadst-tj.c
index c6144ec..a2219df 100644
--- a/jdatadst-tj.c
+++ b/jdatadst-tj.c
@@ -130,7 +130,7 @@
 {
   my_mem_dest_ptr dest = (my_mem_dest_ptr) cinfo->dest;
 
-  if(dest->alloc) *dest->outbuffer = dest->buffer;
+  if (dest->alloc) *dest->outbuffer = dest->buffer;
   *dest->outsize = (unsigned long)(dest->bufsize - dest->pub.free_in_buffer);
 }
 
diff --git a/jdcolor.c b/jdcolor.c
index ab8fa24..05cbf4d 100644
--- a/jdcolor.c
+++ b/jdcolor.c
@@ -616,7 +616,7 @@
 static INLINE boolean is_big_endian(void)
 {
   int test_value = 1;
-  if(*(char *)&test_value != 1)
+  if (*(char *)&test_value != 1)
     return TRUE;
   return FALSE;
 }
diff --git a/jdmerge.c b/jdmerge.c
index 6276dd0..ca6f16c 100644
--- a/jdmerge.c
+++ b/jdmerge.c
@@ -503,7 +503,7 @@
 static INLINE boolean is_big_endian(void)
 {
   int test_value = 1;
-  if(*(char *)&test_value != 1)
+  if (*(char *)&test_value != 1)
     return TRUE;
   return FALSE;
 }
diff --git a/jmemmgr.c b/jmemmgr.c
index b17c5c5..8dfb633 100644
--- a/jmemmgr.c
+++ b/jmemmgr.c
@@ -32,7 +32,9 @@
 #include "jinclude.h"
 #include "jpeglib.h"
 #include "jmemsys.h"            /* import the system-dependent declarations */
+#ifndef _WIN32
 #include <stdint.h>
+#endif
 #include <limits.h>
 
 #ifndef NO_GETENV
diff --git a/jmemnobs.c b/jmemnobs.c
index 5797198..ac12afa 100644
--- a/jmemnobs.c
+++ b/jmemnobs.c
@@ -3,8 +3,8 @@
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1992-1996, Thomas G. Lane.
- * It was modified by The libjpeg-turbo Project to include only code and
- * information relevant to libjpeg-turbo.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -15,7 +15,6 @@
  * This is very portable in the sense that it'll compile on almost anything,
  * but you'd better have lots of main memory (or virtual memory) if you want
  * to process big images.
- * Note that the max_memory_to_use option is ignored by this implementation.
  */
 
 #define JPEG_INTERNALS
@@ -66,14 +65,21 @@
 
 /*
  * This routine computes the total memory space available for allocation.
- * Here we always say, "we got all you want bud!"
  */
 
 GLOBAL(size_t)
 jpeg_mem_available (j_common_ptr cinfo, size_t min_bytes_needed,
                     size_t max_bytes_needed, size_t already_allocated)
 {
-  return max_bytes_needed;
+  if (cinfo->mem->max_memory_to_use) {
+    if (cinfo->mem->max_memory_to_use > already_allocated)
+      return cinfo->mem->max_memory_to_use - already_allocated;
+    else
+      return 0;
+  } else {
+    /* Here we always say, "we got all you want bud!" */
+    return max_bytes_needed;
+  }
 }
 
 
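Note on the jmemnobs.c change above: the new budget check can be exercised on its own with a small stand-in. This is a sketch under assumptions: `mem_available` is a hypothetical function, and all quantities are taken as `size_t` for simplicity, which may differ from the field types in the library.

```c
#include <stddef.h>

/* Hypothetical stand-in for the budget check in the diff.
 * max_memory_to_use == 0 means "no limit", matching the fallback branch. */
static size_t mem_available(size_t max_memory_to_use,
                            size_t max_bytes_needed,
                            size_t already_allocated)
{
  if (max_memory_to_use == 0)
    return max_bytes_needed;     /* unlimited: grant the full request */
  if (max_memory_to_use > already_allocated)
    return max_memory_to_use - already_allocated;  /* remaining budget */
  return 0;                      /* budget already exhausted */
}
```

For a limit of 4000 bytes with 1000 already allocated, the function reports 3000 bytes available regardless of how much was requested. Once allocations reach the limit it reports 0.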
diff --git a/jpegtran.1 b/jpegtran.1
index 7f3c853..631455b 100644
--- a/jpegtran.1
+++ b/jpegtran.1
@@ -1,4 +1,4 @@
-.TH JPEGTRAN 1 "18 February 2016"
+.TH JPEGTRAN 1 "18 March 2017"
 .SH NAME
 jpegtran \- lossless transformation of JPEG files
 .SH SYNOPSIS
@@ -222,7 +222,7 @@
 in thousands of bytes, or millions of bytes if "M" is attached to the
 number.  For example,
 .B \-max 4m
-selects 4000000 bytes.  If more space is needed, temporary files will be used.
+selects 4000000 bytes.  If more space is needed, an error will occur.
 .TP
 .BI \-outfile " name"
 Send output image to the named file, not to standard output.
diff --git a/jpegtran.c b/jpegtran.c
index c44f21e..6f8fd5b 100644
--- a/jpegtran.c
+++ b/jpegtran.c
@@ -4,7 +4,7 @@
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1995-2010, Thomas G. Lane, Guido Vollbeding.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2010, 2014, D. R. Commander.
+ * Copyright (C) 2010, 2014, 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -90,7 +90,7 @@
   fprintf(stderr, "  -version       Print version information and exit\n");
   fprintf(stderr, "Switches for wizards:\n");
 #ifdef C_MULTISCAN_FILES_SUPPORTED
-  fprintf(stderr, "  -scans file    Create multi-scan JPEG per script file\n");
+  fprintf(stderr, "  -scans FILE    Create multi-scan JPEG per script FILE\n");
 #endif
   exit(EXIT_FAILURE);
 }
diff --git a/jversion.h b/jversion.h
index 6ce663d..7e44eaa 100644
--- a/jversion.h
+++ b/jversion.h
@@ -4,7 +4,7 @@
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-2012, Thomas G. Lane, Guido Vollbeding.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2010, 2012-2016, D. R. Commander.
+ * Copyright (C) 2010, 2012-2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -35,7 +35,7 @@
  *   their code
  */
 
-#define JCOPYRIGHT      "Copyright (C) 2009-2016 D. R. Commander\n" \
+#define JCOPYRIGHT      "Copyright (C) 2009-2017 D. R. Commander\n" \
                         "Copyright (C) 2011-2016 Siarhei Siamashka\n" \
                         "Copyright (C) 2015-2016 Matthieu Darbois\n" \
                         "Copyright (C) 2015 Google, Inc.\n" \
@@ -46,4 +46,4 @@
                         "Copyright (C) 1999-2006 MIYASAKA Masaru\n" \
                         "Copyright (C) 1991-2016 Thomas G. Lane, Guido Vollbeding" \
 
-#define JCOPYRIGHT_SHORT "Copyright (C) 1991-2016 The libjpeg-turbo Project and many others"
+#define JCOPYRIGHT_SHORT "Copyright (C) 1991-2017 The libjpeg-turbo Project and many others"
diff --git a/libjpeg.txt b/libjpeg.txt
index 71d37c6..5181afc 100644
--- a/libjpeg.txt
+++ b/libjpeg.txt
@@ -3,7 +3,7 @@
 This file was part of the Independent JPEG Group's software:
 Copyright (C) 1994-2013, Thomas G. Lane, Guido Vollbeding.
 libjpeg-turbo Modifications:
-Copyright (C) 2010, 2014-2016, D. R. Commander.
+Copyright (C) 2010, 2014-2017, D. R. Commander.
 Copyright (C) 2015, Google, Inc.
 For conditions of distribution and use, see the accompanying README.ijg file.
 
@@ -34,6 +34,7 @@
         Data formats
         Compression details
         Decompression details
+        Partial image decompression
         Mechanics of usage: include files, linking, etc
 Advanced features:
         Compression parameter selection
@@ -2941,13 +2942,6 @@
 buffers.  Such buffers are treated as "virtual arrays": only the current strip
 need be in memory, and the rest can be swapped out to a temporary file.
 
-If you use the simplest memory manager back end (jmemnobs.c), then no
-temporary files are used; virtual arrays are simply malloc()'d.  Images bigger
-than memory can be processed only if your system supports virtual memory.
-The other memory manager back ends support temporary files of various flavors
-and thus work in machines without virtual memory.  They may also be useful on
-Unix machines if you need to process images that exceed available swap space.
-
 When using temporary files, the library will make the in-memory buffers for
 its virtual arrays just big enough to stay within a "maximum memory" setting.
 Your application can set this limit by setting cinfo->mem->max_memory_to_use
@@ -2960,6 +2954,11 @@
 it's too small to be worth worrying about; so a reasonable safety margin
 should be left when setting max_memory_to_use.
 
+NOTE: Unless you develop your own memory manager back end, then temporary files
+will never be used.  The back end provided in libjpeg-turbo (jmemnobs.c) simply
+malloc()s and free()s virtual arrays, and an error occurs if the required
+memory exceeds the limit specified in cinfo->mem->max_memory_to_use.
+
 
 Memory usage
 ------------
diff --git a/md5/md5.c b/md5/md5.c
index 087f4b0..4b5ba5e 100644
--- a/md5/md5.c
+++ b/md5/md5.c
@@ -31,6 +31,15 @@
 
 #include "./md5.h"
 
+#ifdef __amigaos4__
+#include <machine/endian.h>
+#define le32toh(x) (((x & 0xff) << 24) | \
+                    ((x & 0xff00) << 8) | \
+                    ((x & 0xff0000) >> 8) | \
+                    ((x & 0xff000000) >> 24))
+#define htole32(x) le32toh(x)
+#endif
+
 static void MD5Transform(unsigned int [4], const unsigned char [64]);
 
 #if (BYTE_ORDER == LITTLE_ENDIAN)
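Note on the md5.c change above: the added `le32toh` macro swaps the four bytes of a 32-bit value. The same operation can be written as a function, which evaluates its argument once and is not affected by operator precedence inside the argument expression. This is a sketch for illustration only. `swap32` is a hypothetical name and is not used by the patch.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t x)
{
  return ((x & 0x000000ffu) << 24) |
         ((x & 0x0000ff00u) << 8)  |
         ((x & 0x00ff0000u) >> 8)  |
         ((x & 0xff000000u) >> 24);
}
```

Applying the swap twice returns the original value, which is why the patch can define `htole32` in terms of `le32toh`.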
diff --git a/release/libjpeg-turbo.spec.in b/release/libjpeg-turbo.spec.in
index 4b792d7..e4e4b9c 100644
--- a/release/libjpeg-turbo.spec.in
+++ b/release/libjpeg-turbo.spec.in
@@ -13,12 +13,22 @@
 # Path under which headers should be installed
 %define _includedir %{__includedir}
 
-# _libdir is set to %{_prefix}/%{_lib} by default
-%ifarch x86_64
-%define _lib lib64
+%if "%{?__isa_bits:1}" == "1"
+%define _bits %{__isa_bits}
+%else
+# RPM < 4.6
+%if "%{_lib}" == "lib64"
+%define _bits 64
+%else
+%define _bits 32
+%endif
+%endif
+
+%if "%{_bits}" == "64"
+%define _libdir %{_exec_prefix}/lib64
 %else
 %if "%{_prefix}" == "/opt/libjpeg-turbo"
-%define _lib lib32
+%define _libdir %{_exec_prefix}/lib32
 %endif
 %endif
 
@@ -36,7 +46,7 @@
 License: BSD-style
 BuildRoot: %{_blddir}/%{name}-buildroot-%{version}-%{release}
 Prereq: /sbin/ldconfig
-%ifarch x86_64
+%if "%{_bits}" == "64"
 Provides: %{name} = %{version}-%{release}, @PACKAGE_NAME@ = %{version}-%{release}, libturbojpeg.so()(64bit)
 %else
 Provides: %{name} = %{version}-%{release}, @PACKAGE_NAME@ = %{version}-%{release}, libturbojpeg.so
@@ -73,7 +83,8 @@
 #-->	mandir=%{_mandir} JPEG_LIB_VERSION=@JPEG_LIB_VERSION@ \
 #-->	SO_MAJOR_VERSION=@SO_MAJOR_VERSION@ SO_MINOR_VERSION=@SO_MINOR_VERSION@ \
 #-->	--with-pic @RPM_CONFIG_ARGS@
-#-->make DESTDIR=$RPM_BUILD_ROOT
+#-->export NUMCPUS=`grep -c '^processor' /proc/cpuinfo`
+#-->make -j$NUMCPUS --load-average=$NUMCPUS DESTDIR=$RPM_BUILD_ROOT
 
 %install
 
@@ -86,7 +97,7 @@
 
 LJT_LIBDIR=%{__libdir}
 if [ ! "$LJT_LIBDIR" = "%{_libdir}" ]; then
-	echo ERROR: libjpeg-turbo must be configured with libdir=%{_prefix}/%{_lib} when generating an in-tree RPM for this architecture.
+	echo ERROR: libjpeg-turbo must be configured with libdir=%{_libdir} when generating an in-tree RPM for this architecture.
 	exit 1
 fi
 
diff --git a/simd/CMakeLists.txt b/simd/CMakeLists.txt
index 37938ec..6e898d8 100755
--- a/simd/CMakeLists.txt
+++ b/simd/CMakeLists.txt
@@ -1,6 +1,7 @@
 if(NOT DEFINED NASM)
-  set(NASM nasm CACHE FILEPATH "Path to NASM/YASM executable")
+  find_program(NASM NAMES nasm yasm DOC "Path to NASM/YASM executable")
 endif()
+message(STATUS "NASM = ${NASM}")
 
 if(SIMD_X86_64)
   set(NAFLAGS -fwin64 -DWIN64 -D__x86_64__)
diff --git a/simd/jchuff-sse2.asm b/simd/jchuff-sse2.asm
index 36d1f2d..b81db75 100644
--- a/simd/jchuff-sse2.asm
+++ b/simd/jchuff-sse2.asm
@@ -1,7 +1,7 @@
 ;
 ; jchuff-sse2.asm - Huffman entropy encoding (SSE2)
 ;
-; Copyright (C) 2009-2011, 2014-2016, D. R. Commander.
+; Copyright (C) 2009-2011, 2014-2017, D. R. Commander.
 ; Copyright (C) 2015, Matthieu Darbois.
 ;
 ; Based on the x86 SIMD extension for IJG JPEG library
@@ -288,13 +288,13 @@
 
 .BLOOP:
         bsf ecx, edx  ; r = __builtin_ctzl(index);
-        jz .ELOOP
+        jz near .ELOOP
         lea esi, [esi+ecx*2]  ; k += r;
         shr edx, cl  ; index >>= r;
         mov DWORD [esp+temp3], edx
 .BRLOOP:
         cmp ecx, 16  ; while (r > 15) {
-        jl .ERLOOP
+        jl near .ERLOOP
         sub ecx, 16 ; r -= 16;
         mov DWORD [esp+temp], ecx
         mov   eax, INT [ebp + 240 * 4]  ; code_0xf0 = actbl->ehufco[0xf0];
@@ -348,7 +348,7 @@
         sub eax, esi
         shr eax, 1
         bsf ecx, edx  ; r = __builtin_ctzl(index);
-        jz .ELOOP2
+        jz near .ELOOP2
         shr edx, cl  ; index >>= r;
         add ecx, eax
         lea esi, [esi+ecx*2]  ; k += r;
@@ -356,13 +356,13 @@
         jmp .BRLOOP2
 .BLOOP2:
         bsf ecx, edx  ; r = __builtin_ctzl(index);
-        jz .ELOOP2
+        jz near .ELOOP2
         lea esi, [esi+ecx*2]  ; k += r;
         shr edx, cl  ; index >>= r;
         mov DWORD [esp+temp3], edx
 .BRLOOP2:
         cmp ecx, 16  ; while (r > 15) {
-        jl .ERLOOP2
+        jl near .ERLOOP2
         sub ecx, 16  ; r -= 16;
         mov DWORD [esp+temp], ecx
         mov   eax, INT [ebp + 240 * 4]  ; code_0xf0 = actbl->ehufco[0xf0];
diff --git a/simd/jsimd_arm.c b/simd/jsimd_arm.c
index 61cd073..0b955cd 100644
--- a/simd/jsimd_arm.c
+++ b/simd/jsimd_arm.c
@@ -2,6 +2,7 @@
  * jsimd_arm.c
  *
  * Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
+ * Copyright (C) 2011, Nokia Corporation and/or its subsidiary(-ies).
  * Copyright (C) 2009-2011, 2013-2014, 2016, D. R. Commander.
  * Copyright (C) 2015-2016, Matthieu Darbois.
  *
diff --git a/simd/jsimd_arm64.c b/simd/jsimd_arm64.c
index 09449bb..f6e9736 100644
--- a/simd/jsimd_arm64.c
+++ b/simd/jsimd_arm64.c
@@ -2,6 +2,7 @@
  * jsimd_arm64.c
  *
  * Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
+ * Copyright (C) 2011, Nokia Corporation and/or its subsidiary(-ies).
  * Copyright (C) 2009-2011, 2013-2014, 2016, D. R. Commander.
  * Copyright (C) 2015-2016, Matthieu Darbois.
  *
diff --git a/simd/jsimd_mips.c b/simd/jsimd_mips.c
index 63b8115..02e90cd 100644
--- a/simd/jsimd_mips.c
+++ b/simd/jsimd_mips.c
@@ -63,6 +63,8 @@
 LOCAL(void)
 init_simd (void)
 {
+  char *env = NULL;
+
   if (simd_support != ~0U)
     return;
 
diff --git a/simd/jsimd_mips_dspr2.S b/simd/jsimd_mips_dspr2.S
index c8c286c..c26dd5c 100644
--- a/simd/jsimd_mips_dspr2.S
+++ b/simd/jsimd_mips_dspr2.S
@@ -4484,4 +4484,3 @@
 END(jsimd_convsamp_float_mips_dspr2)
 
 /*****************************************************************************/
-
diff --git a/simd/jsimd_mips_dspr2_asm.h b/simd/jsimd_mips_dspr2_asm.h
index 64f9880..499e34b 100644
--- a/simd/jsimd_mips_dspr2_asm.h
+++ b/simd/jsimd_mips_dspr2_asm.h
@@ -281,5 +281,3 @@
     addiu           sp, sp, \stack_offset
     .endif
 .endm
-
-
diff --git a/simd/jsimd_powerpc.c b/simd/jsimd_powerpc.c
index 42dc1e0..47dd746 100644
--- a/simd/jsimd_powerpc.c
+++ b/simd/jsimd_powerpc.c
@@ -14,6 +14,11 @@
  * PowerPC architecture.
  */
 
+#ifdef __amigaos4__
+/* This must be defined first as it re-defines GLOBAL otherwise */
+#include <proto/exec.h>
+#endif
+
 #define JPEG_INTERNALS
 #include "../jinclude.h"
 #include "../jpeglib.h"
@@ -26,6 +31,12 @@
 #include <string.h>
 #include <ctype.h>
 
+#if defined(__OpenBSD__)
+#include <sys/param.h>
+#include <sys/sysctl.h>
+#include <machine/cpu.h>
+#endif
+
 static unsigned int simd_support = ~0;
 
 #if defined(__linux__) || defined(ANDROID) || defined(__ANDROID__)
@@ -101,6 +112,12 @@
   char *env = NULL;
 #if !defined(__ALTIVEC__) && (defined(__linux__) || defined(ANDROID) || defined(__ANDROID__))
   int bufsize = 1024; /* an initial guess for the line buffer size limit */
+#elif defined(__amigaos4__)
+  uint32 altivec = 0;
+#elif defined(__OpenBSD__)
+  int mib[2] = { CTL_MACHDEP, CPU_ALTIVEC };
+  int altivec;
+  size_t len = sizeof(altivec);
 #endif
 
   if (simd_support != ~0U)
@@ -116,6 +133,13 @@
     if (bufsize > SOMEWHAT_SANE_PROC_CPUINFO_SIZE_LIMIT)
       break;
   }
+#elif defined(__amigaos4__)
+  IExec->GetCPUInfoTags(GCIT_VectorUnit, &altivec, TAG_DONE);
+  if(altivec == VECTORTYPE_ALTIVEC)
+    simd_support |= JSIMD_ALTIVEC;
+#elif defined(__OpenBSD__)
+  if (sysctl(mib, 2, &altivec, &len, NULL, 0) == 0 && altivec != 0)
+    simd_support |= JSIMD_ALTIVEC;
 #endif
 
   /* Force different settings through environment variables */
diff --git a/simd/jsimdcfg.inc.h b/simd/jsimdcfg.inc.h
new file mode 100644
index 0000000..d2b499f
--- /dev/null
+++ b/simd/jsimdcfg.inc.h
@@ -0,0 +1,130 @@
+// This file generates the include file for the assembly
+// implementations by abusing the C preprocessor.
+//
+// Note: Some things are manually defined as they need to
+// be mapped to NASM types.
+
+;
+; Automatically generated include file from jsimdcfg.inc.h
+;
+
+#define JPEG_INTERNALS
+
+#include "../jpeglib.h"
+#include "../jconfig.h"
+#include "../jmorecfg.h"
+#include "jsimd.h"
+
+;
+; -- jpeglib.h
+;
+
+%define _cpp_protection_DCTSIZE DCTSIZE
+%define _cpp_protection_DCTSIZE2 DCTSIZE2
+
+;
+; -- jmorecfg.h
+;
+
+%define _cpp_protection_RGB_RED RGB_RED
+%define _cpp_protection_RGB_GREEN RGB_GREEN
+%define _cpp_protection_RGB_BLUE RGB_BLUE
+%define _cpp_protection_RGB_PIXELSIZE RGB_PIXELSIZE
+
+%define _cpp_protection_EXT_RGB_RED EXT_RGB_RED
+%define _cpp_protection_EXT_RGB_GREEN EXT_RGB_GREEN
+%define _cpp_protection_EXT_RGB_BLUE EXT_RGB_BLUE
+%define _cpp_protection_EXT_RGB_PIXELSIZE EXT_RGB_PIXELSIZE
+
+%define _cpp_protection_EXT_RGBX_RED EXT_RGBX_RED
+%define _cpp_protection_EXT_RGBX_GREEN EXT_RGBX_GREEN
+%define _cpp_protection_EXT_RGBX_BLUE EXT_RGBX_BLUE
+%define _cpp_protection_EXT_RGBX_PIXELSIZE EXT_RGBX_PIXELSIZE
+
+%define _cpp_protection_EXT_BGR_RED EXT_BGR_RED
+%define _cpp_protection_EXT_BGR_GREEN EXT_BGR_GREEN
+%define _cpp_protection_EXT_BGR_BLUE EXT_BGR_BLUE
+%define _cpp_protection_EXT_BGR_PIXELSIZE EXT_BGR_PIXELSIZE
+
+%define _cpp_protection_EXT_BGRX_RED EXT_BGRX_RED
+%define _cpp_protection_EXT_BGRX_GREEN EXT_BGRX_GREEN
+%define _cpp_protection_EXT_BGRX_BLUE EXT_BGRX_BLUE
+%define _cpp_protection_EXT_BGRX_PIXELSIZE EXT_BGRX_PIXELSIZE
+
+%define _cpp_protection_EXT_XBGR_RED EXT_XBGR_RED
+%define _cpp_protection_EXT_XBGR_GREEN EXT_XBGR_GREEN
+%define _cpp_protection_EXT_XBGR_BLUE EXT_XBGR_BLUE
+%define _cpp_protection_EXT_XBGR_PIXELSIZE EXT_XBGR_PIXELSIZE
+
+%define _cpp_protection_EXT_XRGB_RED EXT_XRGB_RED
+%define _cpp_protection_EXT_XRGB_GREEN EXT_XRGB_GREEN
+%define _cpp_protection_EXT_XRGB_BLUE EXT_XRGB_BLUE
+%define _cpp_protection_EXT_XRGB_PIXELSIZE EXT_XRGB_PIXELSIZE
+
+%define RGBX_FILLER_0XFF        1
+
+; Representation of a single sample (pixel element value).
+; On this SIMD implementation, this must be 'unsigned char'.
+;
+
+%define JSAMPLE                 byte          ; unsigned char
+%define SIZEOF_JSAMPLE          SIZEOF_BYTE   ; sizeof(JSAMPLE)
+
+%define _cpp_protection_CENTERJSAMPLE CENTERJSAMPLE
+
+; Representation of a DCT frequency coefficient.
+; On this SIMD implementation, this must be 'short'.
+;
+%define JCOEF                   word          ; short
+%define SIZEOF_JCOEF            SIZEOF_WORD   ; sizeof(JCOEF)
+
+; Datatype used for image dimensions.
+; On this SIMD implementation, this must be 'unsigned int'.
+;
+%define JDIMENSION              dword         ; unsigned int
+%define SIZEOF_JDIMENSION       SIZEOF_DWORD  ; sizeof(JDIMENSION)
+
+%define JSAMPROW                POINTER       ; JSAMPLE *     (jpeglib.h)
+%define JSAMPARRAY              POINTER       ; JSAMPROW *    (jpeglib.h)
+%define JSAMPIMAGE              POINTER       ; JSAMPARRAY *  (jpeglib.h)
+%define JCOEFPTR                POINTER       ; JCOEF *       (jpeglib.h)
+%define SIZEOF_JSAMPROW         SIZEOF_POINTER  ; sizeof(JSAMPROW)
+%define SIZEOF_JSAMPARRAY       SIZEOF_POINTER  ; sizeof(JSAMPARRAY)
+%define SIZEOF_JSAMPIMAGE       SIZEOF_POINTER  ; sizeof(JSAMPIMAGE)
+%define SIZEOF_JCOEFPTR         SIZEOF_POINTER  ; sizeof(JCOEFPTR)
+
+;
+; -- jdct.h
+;
+
+; A forward DCT routine is given a pointer to a work area of type DCTELEM[];
+; the DCT is to be performed in-place in that buffer.
+; To maximize parallelism, Type DCTELEM is changed to short (originally, int).
+;
+%define DCTELEM                 word          ; short
+%define SIZEOF_DCTELEM          SIZEOF_WORD   ; sizeof(DCTELEM)
+
+%define FAST_FLOAT              FP32            ; float
+%define SIZEOF_FAST_FLOAT       SIZEOF_FP32     ; sizeof(FAST_FLOAT)
+
+; To maximize parallelism, Type MULTIPLIER is changed to short.
+;
+%define ISLOW_MULT_TYPE         word          ; must be short
+%define SIZEOF_ISLOW_MULT_TYPE  SIZEOF_WORD   ; sizeof(ISLOW_MULT_TYPE)
+
+%define IFAST_MULT_TYPE         word          ; must be short
+%define SIZEOF_IFAST_MULT_TYPE  SIZEOF_WORD   ; sizeof(IFAST_MULT_TYPE)
+%define IFAST_SCALE_BITS        2             ; fractional bits in scale factors
+
+%define FLOAT_MULT_TYPE         FP32          ; must be float
+%define SIZEOF_FLOAT_MULT_TYPE  SIZEOF_FP32   ; sizeof(FLOAT_MULT_TYPE)
+
+;
+; -- jsimd.h
+;
+
+%define _cpp_protection_JSIMD_NONE JSIMD_NONE
+%define _cpp_protection_JSIMD_MMX JSIMD_MMX
+%define _cpp_protection_JSIMD_3DNOW JSIMD_3DNOW
+%define _cpp_protection_JSIMD_SSE JSIMD_SSE
+%define _cpp_protection_JSIMD_SSE2 JSIMD_SSE2
diff --git a/simd/jsimdext.inc b/simd/jsimdext.inc
index 6631034..f28db60 100644
--- a/simd/jsimdext.inc
+++ b/simd/jsimdext.inc
@@ -178,13 +178,7 @@
 ;  External Symbol Name
 ;
 %ifndef EXTN
-# Android Modification:
-# The unmodified code from upstream appends an underscore to the front of
-# "name" here.  It is unclear why.  Before removing the underscore, the
-# code failed to link because the function names in the SIMD code did not
-# match the callers (because of the extra underscore).  This fix only
-# applies to x86 SIMD code.  x86_64 is handled properly by the code above.
-%define EXTN(name)   name          ; foo() -> _foo
+%define EXTN(name)   _ %+ name          ; foo() -> _foo
 %endif
 
 ; --------------------------------------------------------------------------
diff --git a/structure.txt b/structure.txt
index 296d125..f69c9d8 100644
--- a/structure.txt
+++ b/structure.txt
@@ -832,21 +832,19 @@
 write_backing_store,
 close_backing_store
 
-On some systems there will be more than one type of backing-store object
-(specifically, in MS-DOS a backing store file might be an area of extended
-memory as well as a disk file).  jpeg_open_backing_store is responsible for
-choosing how to implement a given object.  The read/write/close routines
-are method pointers in the structure that describes a given object; this
-lets them be different for different object types.
+On some systems there will be more than one type of backing-store object.
+jpeg_open_backing_store is responsible for choosing how to implement a given
+object.  The read/write/close routines are method pointers in the structure
+that describes a given object; this lets them be different for different object
+types.
 
 It may be necessary to ensure that backing store objects are explicitly
-released upon abnormal program termination.  For example, MS-DOS won't free
-extended memory by itself.  To support this, we will expect the main program
-or surrounding application to arrange to call self_destruct (typically via
-jpeg_destroy) upon abnormal termination.  This may require a SIGINT signal
-handler or equivalent.  We don't want to have the back end module install its
-own signal handler, because that would pre-empt the surrounding application's
-ability to control signal handling.
+released upon abnormal program termination.  To support this, we will expect
+the main program or surrounding application to arrange to call self_destruct
+(typically via jpeg_destroy) upon abnormal termination.  This may require a
+SIGINT signal handler or equivalent.  We don't want to have the back end module
+install its own signal handler, because that would pre-empt the surrounding
+application's ability to control signal handling.
 
 The IJG distribution includes several memory manager back end implementations.
 Usually the same back end should be suitable for all applications on a given
diff --git a/tjbench.c b/tjbench.c
index 9db1968..76b61cd 100644
--- a/tjbench.c
+++ b/tjbench.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C)2009-2016 D. R. Commander.  All Rights Reserved.
+ * Copyright (C)2009-2017 D. R. Commander.  All Rights Reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -40,13 +40,13 @@
 
 #define _throw(op, err) {  \
 	printf("ERROR in line %d while %s:\n%s\n", __LINE__, op, err);  \
-  retval=-1;  goto bailout;}
+	retval=-1;  goto bailout;}
 #define _throwunix(m) _throw(m, strerror(errno))
 #define _throwtj(m) _throw(m, tjGetErrorStr())
 #define _throwbmp(m) _throw(m, bmpgeterr())
 
 int flags=TJFLAG_NOREALLOC, componly=0, decomponly=0, doyuv=0, quiet=0,
-	dotile=0, pf=TJPF_BGR, yuvpad=1, warmup=1, dowrite=1;
+	dotile=0, pf=TJPF_BGR, yuvpad=1, dowrite=1;
 char *ext="ppm";
 const char *pixFormatStr[TJ_NUMPF]=
 {
@@ -64,7 +64,7 @@
 tjscalingfactor *scalingfactors=NULL, sf={1, 1};  int nsf=0;
 int xformop=TJXOP_NONE, xformopt=0;
 int (*customFilter)(short *, tjregion, tjregion, int, int, tjtransform *);
-double benchtime=5.0;
+double benchtime=5.0, warmup=1.0;
 
 
 char *formatName(int subsamp, int cs, char *buf)
@@ -146,7 +146,7 @@
 	}
 
 	/* Benchmark */
-	iter=-warmup;
+	iter=-1;
 	elapsed=elapsedDecode=0.;
 	while(1)
 	{
@@ -176,12 +176,17 @@
 						_throwtj("executing tjDecompress2()");
 			}
 		}
-		iter++;
-		if(iter>=1)
+		elapsed+=gettime()-start;
+		if(iter>=0)
 		{
-			elapsed+=gettime()-start;
+			iter++;
 			if(elapsed>=benchtime) break;
 		}
+		else if(elapsed>=warmup)
+		{
+			iter=0;
+			elapsed=elapsedDecode=0.;
+		}
 	}
 	if(doyuv) elapsed-=elapsedDecode;
 
@@ -207,7 +212,7 @@
 			(double)(w*h)/1000000.*(double)iter/elapsed);
 		if(doyuv)
 		{
-			printf("YUV Decode    --> Frame rate:         %f fps\n",
+			printf("YUV Decode    --> Frame rate:         %f fps\n",
 				(double)iter/elapsedDecode);
 			printf("                  Throughput:         %f Megapixels/sec\n",
 				(double)(w*h)/1000000.*(double)iter/elapsedDecode);
@@ -340,7 +345,7 @@
 		}
 
 		/* Benchmark */
-		iter=-warmup;
+		iter=-1;
 		elapsed=elapsedEncode=0.;
 		while(1)
 		{
@@ -374,12 +379,17 @@
 					totaljpegsize+=jpegsize[tile];
 				}
 			}
-			iter++;
-			if(iter>=1)
+			elapsed+=gettime()-start;
+			if(iter>=0)
 			{
-				elapsed+=gettime()-start;
+				iter++;
 				if(elapsed>=benchtime) break;
 			}
+			else if(elapsed>=warmup)
+			{
+				iter=0;
+				elapsed=elapsedEncode=0.;
+			}
 		}
 		if(doyuv) elapsed-=elapsedEncode;
 
@@ -492,7 +502,7 @@
 	char *temp=NULL, tempstr[80], tempstr2[80];
 	int row, col, i, iter, tilew, tileh, ntilesw=1, ntilesh=1, retval=0;
 	double start, elapsed;
-	int ps=tjPixelSize[pf], tile;
+	int ps=tjPixelSize[pf], tile, decompsrc=0;
 
 	if((file=fopen(filename, "rb"))==NULL)
 		_throwunix("opening file");
@@ -623,7 +633,7 @@
 				}
 			}
 
-			iter=-warmup;
+			iter=-1;
 			elapsed=0.;
 			while(1)
 			{
@@ -631,12 +641,17 @@
 				if(tjTransform(handle, srcbuf, srcsize, _ntilesw*_ntilesh, jpegbuf,
 					jpegsize, t, flags)==-1)
 					_throwtj("executing tjTransform()");
-				iter++;
-				if(iter>=1)
+				elapsed+=gettime()-start;
+				if(iter>=0)
 				{
-					elapsed+=gettime()-start;
+					iter++;
 					if(elapsed>=benchtime) break;
 				}
+				else if(elapsed>=warmup)
+				{
+					iter=0;
+					elapsed=0.;
+				}
 			}
 
 			free(t);  t=NULL;
@@ -667,16 +682,17 @@
 		else
 		{
 			if(quiet==1) printf("N/A     N/A     ");
-			jpegsize[0]=srcsize;
-			memcpy(jpegbuf[0], srcbuf, srcsize);
+			tjFree(jpegbuf[0]);
+			jpegbuf[0]=NULL;
+			decompsrc=1;
 		}
 
 		if(w==tilew) _tilew=_w;
 		if(h==tileh) _tileh=_h;
 		if(!(xformopt&TJXOPT_NOOUTPUT))
 		{
-			if(decomp(NULL, jpegbuf, jpegsize, NULL, _w, _h, _subsamp, 0,
-				filename, _tilew, _tileh)==-1)
+			if(decomp(NULL, decompsrc? &srcbuf:jpegbuf, decompsrc? &srcsize:jpegsize,
+					NULL, _w, _h, _subsamp, 0, filename, _tilew, _tileh)==-1)
 				goto bailout;
 		}
 		else if(quiet==1) printf("N/A\n");
@@ -763,8 +779,9 @@
 	printf("-grayscale = Perform lossless grayscale conversion prior to decompression\n");
 	printf("     test (can be combined with the other transforms above)\n");
 	printf("-benchtime <t> = Run each benchmark for at least <t> seconds (default = 5.0)\n");
-	printf("-warmup <w> = Execute each benchmark <w> times to prime the cache before\n");
-	printf("     taking performance measurements (default = 1)\n");
+	printf("-warmup <t> = Run each benchmark for <t> seconds (default = 1.0) prior to\n");
+	printf("     starting the timer, in order to prime the caches and thus improve the\n");
+	printf("     consistency of the results.\n");
 	printf("-componly = Stop after running compression tests.  Do not test decompression.\n");
 	printf("-nowrite = Do not write reference or output images (improves consistency of\n");
 	printf("     performance measurements.)\n\n");
@@ -817,32 +834,32 @@
 			{
 				dotile=1;  xformopt|=TJXOPT_CROP;
 			}
-			if(!strcasecmp(argv[i], "-fastupsample"))
+			else if(!strcasecmp(argv[i], "-fastupsample"))
 			{
 				printf("Using fast upsampling code\n\n");
 				flags|=TJFLAG_FASTUPSAMPLE;
 			}
-			if(!strcasecmp(argv[i], "-fastdct"))
+			else if(!strcasecmp(argv[i], "-fastdct"))
 			{
 				printf("Using fastest DCT/IDCT algorithm\n\n");
 				flags|=TJFLAG_FASTDCT;
 			}
-			if(!strcasecmp(argv[i], "-accuratedct"))
+			else if(!strcasecmp(argv[i], "-accuratedct"))
 			{
 				printf("Using most accurate DCT/IDCT algorithm\n\n");
 				flags|=TJFLAG_ACCURATEDCT;
 			}
-			if(!strcasecmp(argv[i], "-rgb")) pf=TJPF_RGB;
-			if(!strcasecmp(argv[i], "-rgbx")) pf=TJPF_RGBX;
-			if(!strcasecmp(argv[i], "-bgr")) pf=TJPF_BGR;
-			if(!strcasecmp(argv[i], "-bgrx")) pf=TJPF_BGRX;
-			if(!strcasecmp(argv[i], "-xbgr")) pf=TJPF_XBGR;
-			if(!strcasecmp(argv[i], "-xrgb")) pf=TJPF_XRGB;
-			if(!strcasecmp(argv[i], "-cmyk")) pf=TJPF_CMYK;
-			if(!strcasecmp(argv[i], "-bottomup")) flags|=TJFLAG_BOTTOMUP;
-			if(!strcasecmp(argv[i], "-quiet")) quiet=1;
-			if(!strcasecmp(argv[i], "-qq")) quiet=2;
-			if(!strcasecmp(argv[i], "-scale") && i<argc-1)
+			else if(!strcasecmp(argv[i], "-rgb")) pf=TJPF_RGB;
+			else if(!strcasecmp(argv[i], "-rgbx")) pf=TJPF_RGBX;
+			else if(!strcasecmp(argv[i], "-bgr")) pf=TJPF_BGR;
+			else if(!strcasecmp(argv[i], "-bgrx")) pf=TJPF_BGRX;
+			else if(!strcasecmp(argv[i], "-xbgr")) pf=TJPF_XBGR;
+			else if(!strcasecmp(argv[i], "-xrgb")) pf=TJPF_XRGB;
+			else if(!strcasecmp(argv[i], "-cmyk")) pf=TJPF_CMYK;
+			else if(!strcasecmp(argv[i], "-bottomup")) flags|=TJFLAG_BOTTOMUP;
+			else if(!strcasecmp(argv[i], "-quiet")) quiet=1;
+			else if(!strcasecmp(argv[i], "-qq")) quiet=2;
+			else if(!strcasecmp(argv[i], "-scale") && i<argc-1)
 			{
 				int temp1=0, temp2=0, match=0;
 				if(sscanf(argv[++i], "%d/%d", &temp1, &temp2)==2)
@@ -860,46 +877,42 @@
 				}
 				else usage(argv[0]);
 			}
-			if(!strcasecmp(argv[i], "-hflip")) xformop=TJXOP_HFLIP;
-			if(!strcasecmp(argv[i], "-vflip")) xformop=TJXOP_VFLIP;
-			if(!strcasecmp(argv[i], "-transpose")) xformop=TJXOP_TRANSPOSE;
-			if(!strcasecmp(argv[i], "-transverse")) xformop=TJXOP_TRANSVERSE;
-			if(!strcasecmp(argv[i], "-rot90")) xformop=TJXOP_ROT90;
-			if(!strcasecmp(argv[i], "-rot180")) xformop=TJXOP_ROT180;
-			if(!strcasecmp(argv[i], "-rot270")) xformop=TJXOP_ROT270;
-			if(!strcasecmp(argv[i], "-grayscale")) xformopt|=TJXOPT_GRAY;
-			if(!strcasecmp(argv[i], "-custom")) customFilter=dummyDCTFilter;
-			if(!strcasecmp(argv[i], "-nooutput")) xformopt|=TJXOPT_NOOUTPUT;
-			if(!strcasecmp(argv[i], "-benchtime") && i<argc-1)
+			else if(!strcasecmp(argv[i], "-hflip")) xformop=TJXOP_HFLIP;
+			else if(!strcasecmp(argv[i], "-vflip")) xformop=TJXOP_VFLIP;
+			else if(!strcasecmp(argv[i], "-transpose")) xformop=TJXOP_TRANSPOSE;
+			else if(!strcasecmp(argv[i], "-transverse")) xformop=TJXOP_TRANSVERSE;
+			else if(!strcasecmp(argv[i], "-rot90")) xformop=TJXOP_ROT90;
+			else if(!strcasecmp(argv[i], "-rot180")) xformop=TJXOP_ROT180;
+			else if(!strcasecmp(argv[i], "-rot270")) xformop=TJXOP_ROT270;
+			else if(!strcasecmp(argv[i], "-grayscale")) xformopt|=TJXOPT_GRAY;
+			else if(!strcasecmp(argv[i], "-custom")) customFilter=dummyDCTFilter;
+			else if(!strcasecmp(argv[i], "-nooutput")) xformopt|=TJXOPT_NOOUTPUT;
+			else if(!strcasecmp(argv[i], "-benchtime") && i<argc-1)
 			{
 				double temp=atof(argv[++i]);
 				if(temp>0.0) benchtime=temp;
 				else usage(argv[0]);
 			}
-			if(!strcasecmp(argv[i], "-warmup") && i<argc-1)
+			else if(!strcasecmp(argv[i], "-warmup") && i<argc-1)
 			{
-				int temp=atoi(argv[++i]);
-				if(temp>=0)
-				{
-					warmup=temp;
-					printf("Warmup runs = %d\n\n", warmup);
-				}
+				double temp=atof(argv[++i]);
+				if(temp>=0.0) warmup=temp;
 				else usage(argv[0]);
+				printf("Warmup time = %.1f seconds\n\n", warmup);
 			}
-			if(!strcmp(argv[i], "-?")) usage(argv[0]);
-			if(!strcasecmp(argv[i], "-alloc")) flags&=(~TJFLAG_NOREALLOC);
-			if(!strcasecmp(argv[i], "-bmp")) ext="bmp";
-			if(!strcasecmp(argv[i], "-yuv"))
+			else if(!strcasecmp(argv[i], "-alloc")) flags&=(~TJFLAG_NOREALLOC);
+			else if(!strcasecmp(argv[i], "-bmp")) ext="bmp";
+			else if(!strcasecmp(argv[i], "-yuv"))
 			{
 				printf("Testing YUV planar encoding/decoding\n\n");
 				doyuv=1;
 			}
-			if(!strcasecmp(argv[i], "-yuvpad") && i<argc-1)
+			else if(!strcasecmp(argv[i], "-yuvpad") && i<argc-1)
 			{
 				int temp=atoi(argv[++i]);
 				if(temp>=1) yuvpad=temp;
 			}
-			if(!strcasecmp(argv[i], "-subsamp") && i<argc-1)
+			else if(!strcasecmp(argv[i], "-subsamp") && i<argc-1)
 			{
 				i++;
 				if(toupper(argv[i][0])=='G') subsamp=TJSAMP_GRAY;
@@ -916,8 +929,9 @@
 					}
 				}
 			}
-			if(!strcasecmp(argv[i], "-componly")) componly=1;
-			if(!strcasecmp(argv[i], "-nowrite")) dowrite=0;
+			else if(!strcasecmp(argv[i], "-componly")) componly=1;
+			else if(!strcasecmp(argv[i], "-nowrite")) dowrite=0;
+			else usage(argv[0]);
 		}
 	}
 
diff --git a/tjbenchtest.in b/tjbenchtest.in
index ef11b24..22e15db 100755
--- a/tjbenchtest.in
+++ b/tjbenchtest.in
@@ -36,10 +36,9 @@
 fi
 mkdir -p $OUTDIR
 
-exec >$EXEDIR/tjbenchtest.log
-
-if [ $# -gt 0 ]; then
-	if [ "$1" = "-yuv" ]; then
+while [ $# -gt 0 ]; do
+	case "$1" in
+	-yuv)
 		NSARG=-nosmooth
 		YUVARG=-yuv
 
@@ -60,12 +59,16 @@
 # phenomenon is not yet fully understood but is also believed to be some sort
 # of round-off error.)
 		IMAGES="vgl_6548_0026a.${EXT}"
-	fi
-	if [ "$1" = "-alloc" ]; then
+		;;
+	-alloc)
 		ALLOCARG=-alloc
 		ALLOC=1
-	fi
-fi
+		;;
+	esac
+	shift
+done
+
+exec >$EXEDIR/tjbenchtest$YUVARG$ALLOCARG.log
 
 # Standard tests
 for image in $IMAGES; do
diff --git a/tjbenchtest.java.in b/tjbenchtest.java.in
index acdabd0..0fd2896 100755
--- a/tjbenchtest.java.in
+++ b/tjbenchtest.java.in
@@ -33,8 +33,6 @@
 fi
 mkdir -p $OUTDIR
 
-exec >$EXEDIR/tjbenchtest-java.log
-
 if [ $# -gt 0 ]; then
 	if [ "$1" = "-yuv" ]; then
 		NSARG=-nosmooth
@@ -60,6 +58,8 @@
 	fi
 fi
 
+exec >$EXEDIR/tjbenchtest-java$YUVARG.log
+
 # Standard tests
 for image in $IMAGES; do
 
diff --git a/tjexampletest.in b/tjexampletest.in
index 40b342e..4cb9e9d 100755
--- a/tjexampletest.in
+++ b/tjexampletest.in
@@ -21,7 +21,7 @@
 
 IMAGES="vgl_5674_0098.bmp vgl_6434_0018a.bmp vgl_6548_0026a.bmp nightshot_iso_100.bmp"
 IMGDIR=@srcdir@/testimages
-OUTDIR=__tjexampletest_output
+OUTDIR=`mktemp -d /tmp/__tjexampletest_output.XXXXXX`
 EXEDIR=.
 JAVA="@JAVA@ -cp java/turbojpeg.jar -Djava.library.path=.libs"
 
@@ -36,23 +36,23 @@
 
 	cp $IMGDIR/$image $OUTDIR
 	basename=`basename $image .bmp`
-	$EXEDIR/cjpeg -quality 95 -dct fast -grayscale $IMGDIR/${basename}.bmp >$OUTDIR/${basename}_GRAY_fast_cjpeg.jpg
-	$EXEDIR/cjpeg -quality 95 -dct fast -sample 2x2 $IMGDIR/${basename}.bmp >$OUTDIR/${basename}_420_fast_cjpeg.jpg
-	$EXEDIR/cjpeg -quality 95 -dct fast -sample 2x1 $IMGDIR/${basename}.bmp >$OUTDIR/${basename}_422_fast_cjpeg.jpg
-	$EXEDIR/cjpeg -quality 95 -dct fast -sample 1x1 $IMGDIR/${basename}.bmp >$OUTDIR/${basename}_444_fast_cjpeg.jpg
-	$EXEDIR/cjpeg -quality 95 -dct int -grayscale $IMGDIR/${basename}.bmp >$OUTDIR/${basename}_GRAY_accurate_cjpeg.jpg
-	$EXEDIR/cjpeg -quality 95 -dct int -sample 2x2 $IMGDIR/${basename}.bmp >$OUTDIR/${basename}_420_accurate_cjpeg.jpg
-	$EXEDIR/cjpeg -quality 95 -dct int -sample 2x1 $IMGDIR/${basename}.bmp >$OUTDIR/${basename}_422_accurate_cjpeg.jpg
-	$EXEDIR/cjpeg -quality 95 -dct int -sample 1x1 $IMGDIR/${basename}.bmp >$OUTDIR/${basename}_444_accurate_cjpeg.jpg
+	runme $EXEDIR/cjpeg -quality 95 -dct fast -grayscale -outfile $OUTDIR/${basename}_GRAY_fast_cjpeg.jpg $IMGDIR/${basename}.bmp
+	runme $EXEDIR/cjpeg -quality 95 -dct fast -sample 2x2 -outfile $OUTDIR/${basename}_420_fast_cjpeg.jpg $IMGDIR/${basename}.bmp
+	runme $EXEDIR/cjpeg -quality 95 -dct fast -sample 2x1 -outfile $OUTDIR/${basename}_422_fast_cjpeg.jpg $IMGDIR/${basename}.bmp
+	runme $EXEDIR/cjpeg -quality 95 -dct fast -sample 1x1 -outfile $OUTDIR/${basename}_444_fast_cjpeg.jpg $IMGDIR/${basename}.bmp
+	runme $EXEDIR/cjpeg -quality 95 -dct int -grayscale -outfile $OUTDIR/${basename}_GRAY_accurate_cjpeg.jpg $IMGDIR/${basename}.bmp
+	runme $EXEDIR/cjpeg -quality 95 -dct int -sample 2x2 -outfile $OUTDIR/${basename}_420_accurate_cjpeg.jpg $IMGDIR/${basename}.bmp
+	runme $EXEDIR/cjpeg -quality 95 -dct int -sample 2x1 -outfile $OUTDIR/${basename}_422_accurate_cjpeg.jpg $IMGDIR/${basename}.bmp
+	runme $EXEDIR/cjpeg -quality 95 -dct int -sample 1x1 -outfile $OUTDIR/${basename}_444_accurate_cjpeg.jpg $IMGDIR/${basename}.bmp
 	for samp in GRAY 420 422 444; do
-		$EXEDIR/djpeg -rgb -bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg >$OUTDIR/${basename}_${samp}_default_djpeg.bmp
-		$EXEDIR/djpeg -dct fast -rgb -bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg >$OUTDIR/${basename}_${samp}_fast_djpeg.bmp
-		$EXEDIR/djpeg -dct int -rgb -bmp $OUTDIR/${basename}_${samp}_accurate_cjpeg.jpg >$OUTDIR/${basename}_${samp}_accurate_djpeg.bmp
+		runme $EXEDIR/djpeg -rgb -bmp -outfile $OUTDIR/${basename}_${samp}_default_djpeg.bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg
+		runme $EXEDIR/djpeg -dct fast -rgb -bmp -outfile $OUTDIR/${basename}_${samp}_fast_djpeg.bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg
+		runme $EXEDIR/djpeg -dct int -rgb -bmp -outfile $OUTDIR/${basename}_${samp}_accurate_djpeg.bmp $OUTDIR/${basename}_${samp}_accurate_cjpeg.jpg
 	done
 	for samp in 420 422; do
-		$EXEDIR/djpeg -nosmooth -bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg >$OUTDIR/${basename}_${samp}_default_nosmooth_djpeg.bmp
-		$EXEDIR/djpeg -dct fast -nosmooth -bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg >$OUTDIR/${basename}_${samp}_fast_nosmooth_djpeg.bmp
-		$EXEDIR/djpeg -dct int -nosmooth -bmp $OUTDIR/${basename}_${samp}_accurate_cjpeg.jpg >$OUTDIR/${basename}_${samp}_accurate_nosmooth_djpeg.bmp
+		runme $EXEDIR/djpeg -nosmooth -bmp -outfile $OUTDIR/${basename}_${samp}_default_nosmooth_djpeg.bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg
+		runme $EXEDIR/djpeg -dct fast -nosmooth -bmp -outfile $OUTDIR/${basename}_${samp}_fast_nosmooth_djpeg.bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg
+		runme $EXEDIR/djpeg -dct int -nosmooth -bmp -outfile $OUTDIR/${basename}_${samp}_accurate_nosmooth_djpeg.bmp $OUTDIR/${basename}_${samp}_accurate_cjpeg.jpg
 	done
 
 	# Compression
@@ -87,7 +87,7 @@
 	for scale in 2_1 15_8 7_4 13_8 3_2 11_8 5_4 9_8 7_8 3_4 5_8 1_2 3_8 1_4 1_8; do
 		scalearg=`echo $scale | sed s@_@/@g`
 		for samp in GRAY 420 422 444; do
-			$EXEDIR/djpeg -rgb -bmp -scale ${scalearg} $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg >$OUTDIR/${basename}_${samp}_${scale}_djpeg.bmp
+			runme $EXEDIR/djpeg -rgb -bmp -scale ${scalearg} -outfile $OUTDIR/${basename}_${samp}_${scale}_djpeg.bmp $OUTDIR/${basename}_${samp}_fast_cjpeg.jpg
 			runme $JAVA TJExample $OUTDIR/${basename}_${samp}_fast.jpg $OUTDIR/${basename}_${samp}_${scale}.bmp -scale ${scalearg}
 			runme cmp -i 54:54 $OUTDIR/${basename}_${samp}_${scale}.bmp $OUTDIR/${basename}_${samp}_${scale}_djpeg.bmp
 			rm $OUTDIR/${basename}_${samp}_${scale}.bmp
@@ -96,25 +96,25 @@
 
 	# Transforms
 	for samp in GRAY 420 422 444; do
-		$EXEDIR/jpegtran -crop 70x60+16+16 -flip horizontal -trim $OUTDIR/${basename}_${samp}_fast.jpg >$OUTDIR/${basename}_${samp}_hflip_jpegtran.jpg
-		$EXEDIR/jpegtran -crop 70x60+16+16 -flip vertical -trim $OUTDIR/${basename}_${samp}_fast.jpg >$OUTDIR/${basename}_${samp}_vflip_jpegtran.jpg
-		$EXEDIR/jpegtran -crop 70x60+16+16 -transpose -trim $OUTDIR/${basename}_${samp}_fast.jpg >$OUTDIR/${basename}_${samp}_transpose_jpegtran.jpg
-		$EXEDIR/jpegtran -crop 70x60+16+16 -transverse -trim $OUTDIR/${basename}_${samp}_fast.jpg >$OUTDIR/${basename}_${samp}_transverse_jpegtran.jpg
-		$EXEDIR/jpegtran -crop 70x60+16+16 -rotate 90 -trim $OUTDIR/${basename}_${samp}_fast.jpg >$OUTDIR/${basename}_${samp}_rot90_jpegtran.jpg
-		$EXEDIR/jpegtran -crop 70x60+16+16 -rotate 180 -trim $OUTDIR/${basename}_${samp}_fast.jpg >$OUTDIR/${basename}_${samp}_rot180_jpegtran.jpg
-		$EXEDIR/jpegtran -crop 70x60+16+16 -rotate 270 -trim $OUTDIR/${basename}_${samp}_fast.jpg >$OUTDIR/${basename}_${samp}_rot270_jpegtran.jpg
+		runme $EXEDIR/jpegtran -crop 70x60+16+16 -flip horizontal -trim -outfile $OUTDIR/${basename}_${samp}_hflip_jpegtran.jpg $OUTDIR/${basename}_${samp}_fast.jpg
+		runme $EXEDIR/jpegtran -crop 70x60+16+16 -flip vertical -trim -outfile $OUTDIR/${basename}_${samp}_vflip_jpegtran.jpg $OUTDIR/${basename}_${samp}_fast.jpg
+		runme $EXEDIR/jpegtran -crop 70x60+16+16 -transpose -trim -outfile $OUTDIR/${basename}_${samp}_transpose_jpegtran.jpg $OUTDIR/${basename}_${samp}_fast.jpg
+		runme $EXEDIR/jpegtran -crop 70x60+16+16 -transverse -trim -outfile $OUTDIR/${basename}_${samp}_transverse_jpegtran.jpg $OUTDIR/${basename}_${samp}_fast.jpg
+		runme $EXEDIR/jpegtran -crop 70x60+16+16 -rotate 90 -trim -outfile $OUTDIR/${basename}_${samp}_rot90_jpegtran.jpg $OUTDIR/${basename}_${samp}_fast.jpg
+		runme $EXEDIR/jpegtran -crop 70x60+16+16 -rotate 180 -trim -outfile $OUTDIR/${basename}_${samp}_rot180_jpegtran.jpg $OUTDIR/${basename}_${samp}_fast.jpg
+		runme $EXEDIR/jpegtran -crop 70x60+16+16 -rotate 270 -trim -outfile $OUTDIR/${basename}_${samp}_rot270_jpegtran.jpg $OUTDIR/${basename}_${samp}_fast.jpg
 	done
 	for xform in hflip vflip transpose transverse rot90 rot180 rot270; do
 		for samp in GRAY 420 422 444; do
 			runme $JAVA TJExample $OUTDIR/${basename}_${samp}_fast.jpg $OUTDIR/${basename}_${samp}_${xform}.jpg -$xform -crop 16,16,70x60
 			runme cmp $OUTDIR/${basename}_${samp}_${xform}.jpg $OUTDIR/${basename}_${samp}_${xform}_jpegtran.jpg
-			$EXEDIR/djpeg -rgb -bmp $OUTDIR/${basename}_${samp}_${xform}_jpegtran.jpg >$OUTDIR/${basename}_${samp}_${xform}_jpegtran.bmp
+			runme $EXEDIR/djpeg -rgb -bmp -outfile $OUTDIR/${basename}_${samp}_${xform}_jpegtran.bmp $OUTDIR/${basename}_${samp}_${xform}_jpegtran.jpg
 			runme $JAVA TJExample $OUTDIR/${basename}_${samp}_fast.jpg $OUTDIR/${basename}_${samp}_${xform}.bmp -$xform -crop 16,16,70x60
 			runme cmp -i 54:54 $OUTDIR/${basename}_${samp}_${xform}.bmp $OUTDIR/${basename}_${samp}_${xform}_jpegtran.bmp
 			rm $OUTDIR/${basename}_${samp}_${xform}.bmp
 		done
 		for samp in 420 422; do
-			$EXEDIR/djpeg -nosmooth -rgb -bmp $OUTDIR/${basename}_${samp}_${xform}_jpegtran.jpg >$OUTDIR/${basename}_${samp}_${xform}_jpegtran.bmp
+			runme $EXEDIR/djpeg -nosmooth -rgb -bmp -outfile $OUTDIR/${basename}_${samp}_${xform}_jpegtran.bmp $OUTDIR/${basename}_${samp}_${xform}_jpegtran.jpg
 			runme $JAVA TJExample $OUTDIR/${basename}_${samp}_fast.jpg $OUTDIR/${basename}_${samp}_${xform}.bmp -$xform -crop 16,16,70x60 -fastupsample
 			runme cmp -i 54:54 $OUTDIR/${basename}_${samp}_${xform}.bmp $OUTDIR/${basename}_${samp}_${xform}_jpegtran.bmp
 			rm $OUTDIR/${basename}_${samp}_${xform}.bmp
@@ -137,7 +137,7 @@
 		for samp in GRAY 444 422 420; do
 			for scale in 2_1 15_8 7_4 13_8 3_2 11_8 5_4 9_8 7_8 3_4 5_8 1_2 3_8 1_4 1_8; do
 				scalearg=`echo $scale | sed s@_@/@g`
-				$EXEDIR/djpeg -rgb -bmp -scale ${scalearg} $OUTDIR/${basename}_${samp}_${xform}_jpegtran.jpg >$OUTDIR/${basename}_${samp}_${xform}_${scale}_jpegtran.bmp
+				runme $EXEDIR/djpeg -rgb -bmp -scale ${scalearg} -outfile $OUTDIR/${basename}_${samp}_${xform}_${scale}_jpegtran.bmp $OUTDIR/${basename}_${samp}_${xform}_jpegtran.jpg
 				runme $JAVA TJExample $OUTDIR/${basename}_${samp}_fast.jpg $OUTDIR/${basename}_${samp}_${xform}_${scale}.bmp -$xform -scale ${scalearg} -crop 16,16,70x60
 				runme cmp -i 54:54 $OUTDIR/${basename}_${samp}_${xform}_${scale}.bmp $OUTDIR/${basename}_${samp}_${xform}_${scale}_jpegtran.bmp
 				rm $OUTDIR/${basename}_${samp}_${xform}_${scale}.bmp
diff --git a/tjunittest.c b/tjunittest.c
index 6a4022f..f793796 100644
--- a/tjunittest.c
+++ b/tjunittest.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C)2009-2014 D. R. Commander.  All Rights Reserved.
+ * Copyright (C)2009-2014, 2017 D. R. Commander.  All Rights Reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -44,12 +44,12 @@
 
 void usage(char *progName)
 {
-	printf("\nUSAGE: %s [options]\n", progName);
+	printf("\nUSAGE: %s [options]\n\n", progName);
 	printf("Options:\n");
 	printf("-yuv = test YUV encoding/decoding support\n");
 	printf("-noyuvpad = do not pad each line of each Y, U, and V plane to the nearest\n");
 	printf("            4-byte boundary\n");
-	printf("-alloc = test automatic buffer allocation\n");
+	printf("-alloc = test automatic buffer allocation\n\n");
 	exit(1);
 }
 
@@ -697,10 +697,9 @@
 		for(i=1; i<argc; i++)
 		{
 			if(!strcasecmp(argv[i], "-yuv")) doyuv=1;
-			if(!strcasecmp(argv[i], "-noyuvpad")) pad=1;
-			if(!strcasecmp(argv[i], "-alloc")) alloc=1;
-			if(!strncasecmp(argv[i], "-h", 2) || !strcasecmp(argv[i], "-?"))
-				usage(argv[0]);
+			else if(!strcasecmp(argv[i], "-noyuvpad")) pad=1;
+			else if(!strcasecmp(argv[i], "-alloc")) alloc=1;
+			else usage(argv[0]);
 		}
 	}
 	if(alloc) printf("Testing automatic buffer allocation\n");
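
The tjunittest.c hunk above turns the independent if tests into an else-if chain whose final else calls usage(), so an unrecognized argument is now rejected rather than silently ignored. A minimal sketch of that pattern follows. The names classify_option(), parse_options() and the OPT_* constants are hypothetical and do not appear in tjunittest.c, and the sketch returns -1 instead of exiting so that it can be exercised from other code.

```c
#include <stdio.h>
#include <strings.h>            /* strcasecmp() (POSIX) */

/* Hypothetical flag bits, one per recognized option */
enum { OPT_YUV = 1, OPT_NOYUVPAD = 2, OPT_ALLOC = 4 };

/* Map one argument to its flag bit, or -1 if it is not recognized.
 * The else-if chain guarantees that exactly one branch runs. */
int classify_option(const char *arg)
{
  if (!strcasecmp(arg, "-yuv")) return OPT_YUV;
  else if (!strcasecmp(arg, "-noyuvpad")) return OPT_NOYUVPAD;
  else if (!strcasecmp(arg, "-alloc")) return OPT_ALLOC;
  else return -1;
}

/* Returns the OR of all option bits, or -1 on the first unknown argument
 * (where the real program would print its usage text and exit). */
int parse_options(int argc, char *argv[])
{
  int i, flags = 0;

  for (i = 1; i < argc; i++) {
    int opt = classify_option(argv[i]);
    if (opt < 0) {
      fprintf(stderr, "unrecognized option: %s\n", argv[i]);
      return -1;
    }
    flags |= opt;
  }
  return flags;
}
```

With this structure, a misspelled switch such as -aloc is reported immediately instead of leaving the corresponding test disabled without notice.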
diff --git a/transupp.c b/transupp.c
index d1c56c6..b51ef39 100644
--- a/transupp.c
+++ b/transupp.c
@@ -4,7 +4,7 @@
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1997-2011, Thomas G. Lane, Guido Vollbeding.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2010, D. R. Commander.
+ * Copyright (C) 2010, 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -1177,7 +1177,6 @@
  * We try to adjust the Tags ExifImageWidth and ExifImageHeight if possible.
  */
 
-#if JPEG_LIB_VERSION >= 70
 LOCAL(void)
 adjust_exif_parameters (JOCTET *data, unsigned int length,
                         JDIMENSION new_width, JDIMENSION new_height)
@@ -1327,7 +1326,6 @@
     offset += 12;
   } while (--number_of_tags);
 }
-#endif
 
 
 /* Adjust output image parameters as needed.
@@ -1384,7 +1382,7 @@
   /* Correct the destination's image dimensions as necessary
    * for rotate/flip, resize, and crop operations.
    */
-#if JPEG_LIB_VERSION >= 70
+#if JPEG_LIB_VERSION >= 80
   dstinfo->jpeg_width = info->output_width;
   dstinfo->jpeg_height = info->output_height;
 #endif
@@ -1395,14 +1393,14 @@
   case JXFORM_TRANSVERSE:
   case JXFORM_ROT_90:
   case JXFORM_ROT_270:
-#if JPEG_LIB_VERSION < 70
+#if JPEG_LIB_VERSION < 80
     dstinfo->image_width = info->output_height;
     dstinfo->image_height = info->output_width;
 #endif
     transpose_critical_parameters(dstinfo);
     break;
   default:
-#if JPEG_LIB_VERSION < 70
+#if JPEG_LIB_VERSION < 80
     dstinfo->image_width = info->output_width;
     dstinfo->image_height = info->output_height;
 #endif
@@ -1421,14 +1419,21 @@
       GETJOCTET(srcinfo->marker_list->data[5]) == 0) {
     /* Suppress output of JFIF marker */
     dstinfo->write_JFIF_header = FALSE;
-#if JPEG_LIB_VERSION >= 70
     /* Adjust Exif image parameters */
+#if JPEG_LIB_VERSION >= 80
     if (dstinfo->jpeg_width != srcinfo->image_width ||
         dstinfo->jpeg_height != srcinfo->image_height)
       /* Align data segment to start of TIFF structure for parsing */
       adjust_exif_parameters(srcinfo->marker_list->data + 6,
         srcinfo->marker_list->data_length - 6,
         dstinfo->jpeg_width, dstinfo->jpeg_height);
+#else
+    if (dstinfo->image_width != srcinfo->image_width ||
+        dstinfo->image_height != srcinfo->image_height)
+      /* Align data segment to start of TIFF structure for parsing */
+      adjust_exif_parameters(srcinfo->marker_list->data + 6,
+        srcinfo->marker_list->data_length - 6,
+        dstinfo->image_width, dstinfo->image_height);
 #endif
   }
 
diff --git a/turbojpeg.c b/turbojpeg.c
index 6533b41..662c68f 100644
--- a/turbojpeg.c
+++ b/turbojpeg.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C)2009-2016 D. R. Commander.  All Rights Reserved.
+ * Copyright (C)2009-2017 D. R. Commander.  All Rights Reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -222,7 +222,7 @@
 #ifndef NO_GETENV
 	if((env=getenv("TJ_OPTIMIZE"))!=NULL && strlen(env)>0 && !strcmp(env, "1"))
 		cinfo->optimize_coding=TRUE;
-	if((env=getenv("TJ_ARITHMETIC"))!=NULL && strlen(env)>0	&& !strcmp(env, "1"))
+	if((env=getenv("TJ_ARITHMETIC"))!=NULL && strlen(env)>0 && !strcmp(env, "1"))
 		cinfo->arith_code=TRUE;
 	if((env=getenv("TJ_RESTART"))!=NULL && strlen(env)>0)
 	{
@@ -772,13 +772,6 @@
 		|| jpegSubsamp<0 || jpegSubsamp>=NUMSUBOPT || jpegQual<0 || jpegQual>100)
 		_throw("tjCompress2(): Invalid argument");
 
-	if(setjmp(this->jerr.setjmp_buffer))
-	{
-		/* If we get here, the JPEG code has signaled an error. */
-		retval=-1;
-		goto bailout;
-	}
-
 	if(pitch==0) pitch=width*tjPixelSize[pixelFormat];
 
 	#ifndef JCS_EXTENSIONS
@@ -791,6 +784,15 @@
 	}
 	#endif
 
+	if((row_pointer=(JSAMPROW *)malloc(sizeof(JSAMPROW)*height))==NULL)
+		_throw("tjCompress2(): Memory allocation failure");
+
+	if(setjmp(this->jerr.setjmp_buffer))
+	{
+		/* If we get here, the JPEG code has signaled an error. */
+		retval=-1;  goto bailout;
+	}
+
 	cinfo->image_width=width;
 	cinfo->image_height=height;
 
@@ -807,8 +809,6 @@
 		return -1;
 
 	jpeg_start_compress(cinfo, TRUE);
-	if((row_pointer=(JSAMPROW *)malloc(sizeof(JSAMPROW)*height))==NULL)
-		_throw("tjCompress2(): Memory allocation failure");
 	for(i=0; i<height; i++)
 	{
 		if(flags&TJFLAG_BOTTOMUP)
@@ -888,13 +888,6 @@
 	if(subsamp!=TJSAMP_GRAY && (!dstPlanes[1] || !dstPlanes[2]))
 		_throw("tjEncodeYUVPlanes(): Invalid argument");
 
-	if(setjmp(this->jerr.setjmp_buffer))
-	{
-		/* If we get here, the JPEG code has signaled an error. */
-		retval=-1;
-		goto bailout;
-	}
-
 	if(pixelFormat==TJPF_CMYK)
 		_throw("tjEncodeYUVPlanes(): Cannot generate YUV images from CMYK pixels");
 
@@ -910,6 +903,12 @@
 	}
 	#endif
 
+	if(setjmp(this->jerr.setjmp_buffer))
+	{
+		/* If we get here, the JPEG code has signaled an error. */
+		retval=-1;  goto bailout;
+	}
+
 	cinfo->image_width=width;
 	cinfo->image_height=height;
 
@@ -986,6 +985,12 @@
 		}
 	}
 
+	if(setjmp(this->jerr.setjmp_buffer))
+	{
+		/* If we get here, the JPEG code has signaled an error. */
+		retval=-1;  goto bailout;
+	}
+
 	for(row=0; row<ph0; row+=cinfo->max_v_samp_factor)
 	{
 		(*cinfo->cconvert->color_convert)(cinfo, &row_pointer[row], tmpbuf, 0,
@@ -1100,8 +1105,7 @@
 	if(setjmp(this->jerr.setjmp_buffer))
 	{
 		/* If we get here, the JPEG code has signaled an error. */
-		retval=-1;
-		goto bailout;
+		retval=-1;  goto bailout;
 	}
 
 	cinfo->image_width=width;
@@ -1160,6 +1164,12 @@
 		}
 	}
 
+	if(setjmp(this->jerr.setjmp_buffer))
+	{
+		/* If we get here, the JPEG code has signaled an error. */
+		retval=-1;  goto bailout;
+	}
+
 	for(row=0; row<(int)cinfo->image_height;
 		row+=cinfo->max_v_samp_factor*DCTSIZE)
 	{
@@ -1389,8 +1399,7 @@
 	if(setjmp(this->jerr.setjmp_buffer))
 	{
 		/* If we get here, the JPEG code has signaled an error. */
-		retval=-1;
-		goto bailout;
+		retval=-1;  goto bailout;
 	}
 
 	jpeg_mem_src_tj(dinfo, jpegBuf, jpegSize);
@@ -1438,6 +1447,11 @@
 	if((row_pointer=(JSAMPROW *)malloc(sizeof(JSAMPROW)
 		*dinfo->output_height))==NULL)
 		_throw("tjDecompress2(): Memory allocation failure");
+	if(setjmp(this->jerr.setjmp_buffer))
+	{
+		/* If we get here, the JPEG code has signaled an error. */
+		retval=-1;  goto bailout;
+	}
 	for(i=0; i<(int)dinfo->output_height; i++)
 	{
 		if(flags&TJFLAG_BOTTOMUP)
@@ -1568,8 +1582,7 @@
 	if(setjmp(this->jerr.setjmp_buffer))
 	{
 		/* If we get here, the JPEG code has signaled an error. */
-		retval=-1;
-		goto bailout;
+		retval=-1;  goto bailout;
 	}
 
 	if(pixelFormat==TJPF_CMYK)
@@ -1660,6 +1673,12 @@
 		}
 	}
 
+	if(setjmp(this->jerr.setjmp_buffer))
+	{
+		/* If we get here, the JPEG code has signaled an error. */
+		retval=-1;  goto bailout;
+	}
+
 	for(row=0; row<ph0; row+=dinfo->max_v_samp_factor)
 	{
 		JDIMENSION inrow=0, outrow=0;
@@ -1761,8 +1780,7 @@
 	if(setjmp(this->jerr.setjmp_buffer))
 	{
 		/* If we get here, the JPEG code has signaled an error. */
-		retval=-1;
-		goto bailout;
+		retval=-1;  goto bailout;
 	}
 
 	if(!this->headerRead)
@@ -1840,6 +1858,12 @@
 		}
 	}
 
+	if(setjmp(this->jerr.setjmp_buffer))
+	{
+		/* If we get here, the JPEG code has signaled an error. */
+		retval=-1;  goto bailout;
+	}
+
 	if(flags&TJFLAG_FASTUPSAMPLE) dinfo->do_fancy_upsampling=FALSE;
 	if(flags&TJFLAG_FASTDCT) dinfo->dct_method=JDCT_FASTEST;
 	dinfo->raw_data_out=TRUE;
@@ -2017,20 +2041,19 @@
 	else if(flags&TJFLAG_FORCESSE) putenv("JSIMD_FORCESSE=1");
 	else if(flags&TJFLAG_FORCESSE2) putenv("JSIMD_FORCESSE2=1");
 
-	if(setjmp(this->jerr.setjmp_buffer))
-	{
-		/* If we get here, the JPEG code has signaled an error. */
-		retval=-1;
-		goto bailout;
-	}
-
-	jpeg_mem_src_tj(dinfo, jpegBuf, jpegSize);
-
 	if((xinfo=(jpeg_transform_info *)malloc(sizeof(jpeg_transform_info)*n))
 		==NULL)
 		_throw("tjTransform(): Memory allocation failure");
 	MEMZERO(xinfo, sizeof(jpeg_transform_info)*n);
 
+	if(setjmp(this->jerr.setjmp_buffer))
+	{
+		/* If we get here, the JPEG code has signaled an error. */
+		retval=-1;  goto bailout;
+	}
+
+	jpeg_mem_src_tj(dinfo, jpegBuf, jpegSize);
+
 	for(i=0; i<n; i++)
 	{
 		xinfo[i].transform=xformtypes[t[i].op];
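
The turbojpeg.c hunks above move each setjmp() call so that it follows the argument checks and malloc() calls, and add further setjmp() calls just before the row loops. The patch does not state the motive. One plausible reading is that in C a non-volatile local variable modified between setjmp() and longjmp() has an indeterminate value after the jump, so a pointer assigned after setjmp() may not be safe to free() in the bailout path. A minimal sketch of the allocate-then-setjmp ordering, under that assumption, is shown here. fake_library_call() and process_rows() are hypothetical names and are not part of libjpeg-turbo.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical stand-in for a library routine that reports an error by
 * longjmp()ing back to the caller's setjmp() point. */
static void fake_library_call(jmp_buf env, int row, int fail_at)
{
  assert(row >= 0);
  if (row == fail_at)
    longjmp(env, 1);
}

/* Returns 0 on success, or -1 if allocation fails or the "library" signals
 * an error at row fail_at (pass -1 for no failure). */
int process_rows(int height, int fail_at)
{
  jmp_buf env;
  int retval = 0;
  unsigned char **rows = NULL;
  int i;

  if (height <= 0)
    return -1;

  /* Allocate BEFORE setjmp().  rows is never modified after this point, so
   * its value is still well defined when longjmp() returns control here. */
  rows = malloc(sizeof(unsigned char *) * (size_t)height);
  if (rows == NULL)
    return -1;

  if (setjmp(env)) {
    /* If we get here, the library code has signaled an error. */
    retval = -1;
    goto bailout;
  }

  for (i = 0; i < height; i++)
    fake_library_call(env, i, fail_at);

bailout:
  free(rows);
  return retval;
}
```

Calling process_rows(4, 2) takes the longjmp() path and still frees the row array exactly once, which is the property the reordering appears intended to preserve.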
diff --git a/turbojpeg.h b/turbojpeg.h
index 583029f..307dc6f 100644
--- a/turbojpeg.h
+++ b/turbojpeg.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (C)2009-2015 D. R. Commander.  All Rights Reserved.
+ * Copyright (C)2009-2015, 2017 D. R. Commander.  All Rights Reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -275,7 +275,6 @@
  * then the blue component will be <tt>pixel[tjBlueOffset[TJ_BGRX]]</tt>.
  */
 static const int tjBlueOffset[TJ_NUMPF] = {2, 0, 2, 0, 1, 3, 0, 2, 0, 1, 3, -1};
-
 /**
  * Pixel size (in bytes) for a given pixel format.
  */
@@ -348,7 +347,7 @@
  * The uncompressed source/destination image is stored in bottom-up (Windows,
  * OpenGL) order, not top-down (X11) order.
  */
-#define TJFLAG_BOTTOMUP        2
+#define TJFLAG_BOTTOMUP      2
 /**
  * When decompressing an image that was compressed using chrominance
  * subsampling, use the fastest chrominance upsampling algorithm available in
@@ -358,11 +357,11 @@
  */
 #define TJFLAG_FASTUPSAMPLE  256
 /**
- * Disable buffer (re)allocation.  If passed to #tjCompress2() or
- * #tjTransform(), this flag will cause those functions to generate an error if
- * the JPEG image buffer is invalid or too small rather than attempting to
- * allocate or reallocate that buffer.  This reproduces the behavior of earlier
- * versions of TurboJPEG.
+ * Disable buffer (re)allocation.  If passed to one of the JPEG compression or
+ * transform functions, this flag will cause those functions to generate an
+ * error if the JPEG image buffer is invalid or too small rather than
+ * attempting to allocate or reallocate that buffer.  This reproduces the
+ * behavior of earlier versions of TurboJPEG.
  */
 #define TJFLAG_NOREALLOC     1024
 /**
@@ -645,7 +644,7 @@
  * for you, or
  * -# pre-allocate the buffer to a "worst case" size determined by calling
  * #tjBufSize().  This should ensure that the buffer never has to be
- * re-allocated (setting #TJFLAG_NOREALLOC guarantees this.)
+ * re-allocated (setting #TJFLAG_NOREALLOC guarantees that it won't be.)
  * .
  * If you choose option 1, <tt>*jpegSize</tt> should be set to the size of your
  * pre-allocated buffer.  In any case, unless you have set #TJFLAG_NOREALLOC,
@@ -667,7 +666,7 @@
  * @param jpegQual the image quality of the generated JPEG image (1 = worst,
  * 100 = best)
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -713,7 +712,7 @@
  * for you, or
  * -# pre-allocate the buffer to a "worst case" size determined by calling
  * #tjBufSize().  This should ensure that the buffer never has to be
- * re-allocated (setting #TJFLAG_NOREALLOC guarantees this.)
+ * re-allocated (setting #TJFLAG_NOREALLOC guarantees that it won't be.)
  * .
  * If you choose option 1, <tt>*jpegSize</tt> should be set to the size of your
  * pre-allocated buffer.  In any case, unless you have set #TJFLAG_NOREALLOC,
@@ -731,7 +730,7 @@
  * @param jpegQual the image quality of the generated JPEG image (1 = worst,
  * 100 = best)
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -783,7 +782,7 @@
  * for you, or
  * -# pre-allocate the buffer to a "worst case" size determined by calling
  * #tjBufSize().  This should ensure that the buffer never has to be
- * re-allocated (setting #TJFLAG_NOREALLOC guarantees this.)
+ * re-allocated (setting #TJFLAG_NOREALLOC guarantees that it won't be.)
  * .
  * If you choose option 1, <tt>*jpegSize</tt> should be set to the size of your
  * pre-allocated buffer.  In any case, unless you have set #TJFLAG_NOREALLOC,
@@ -801,7 +800,7 @@
  * @param jpegQual the image quality of the generated JPEG image (1 = worst,
  * 100 = best)
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -961,7 +960,7 @@
  * Video, <tt>subsamp</tt> should be set to @ref TJSAMP_420.  This produces an
  * image compatible with the I420 (AKA "YUV420P") format.
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -1019,7 +1018,7 @@
  * Video, <tt>subsamp</tt> should be set to @ref TJSAMP_420.  This produces an
  * image compatible with the I420 (AKA "YUV420P") format.
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -1126,7 +1125,7 @@
  * @param pixelFormat pixel format of the destination image (see @ref
  * TJPF "Pixel formats".)
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -1176,7 +1175,7 @@
  * block height (see #tjMCUHeight), then an intermediate buffer copy will be
  * performed within TurboJPEG.
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -1232,7 +1231,7 @@
  * block height (see #tjMCUHeight), then an intermediate buffer copy will be
  * performed within TurboJPEG.
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -1284,7 +1283,7 @@
  * @param pixelFormat pixel format of the destination image (see @ref TJPF
  * "Pixel formats".)
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -1341,7 +1340,7 @@
  * @param pixelFormat pixel format of the destination image (see @ref TJPF
  * "Pixel formats".)
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -1392,9 +1391,13 @@
  * -# set <tt>dstBufs[i]</tt> to NULL to tell TurboJPEG to allocate the buffer
  * for you, or
  * -# pre-allocate the buffer to a "worst case" size determined by calling
- * #tjBufSize() with the transformed or cropped width and height.  This should
- * ensure that the buffer never has to be re-allocated (setting
- * #TJFLAG_NOREALLOC guarantees this.)
+ * #tjBufSize() with the transformed or cropped width and height.  Under normal
+ * circumstances, this should ensure that the buffer never has to be
+ * re-allocated (setting #TJFLAG_NOREALLOC guarantees that it won't be.)  Note,
+ * however, that there are some rare cases (such as transforming images with a
+ * large amount of embedded EXIF or ICC profile data) in which the output image
+ * will be larger than the worst-case size, and #TJFLAG_NOREALLOC cannot be
+ * used in those cases.
  * .
  * If you choose option 1, <tt>dstSizes[i]</tt> should be set to the size of
  * your pre-allocated buffer.  In any case, unless you have set
@@ -1411,7 +1414,7 @@
  * which specifies the transform parameters and/or cropping region for the
  * corresponding transformed output image.
  *
- * @param flags the bitwise OR of one or more of the @ref TJFLAG_BOTTOMUP
+ * @param flags the bitwise OR of one or more of the @ref TJFLAG_ACCURATEDCT
  * "flags"
  *
  * @return 0 if successful, or -1 if an error occurred (see #tjGetErrorStr().)
@@ -1435,8 +1438,8 @@
 
 /**
  * Allocate an image buffer for use with TurboJPEG.  You should always use
- * this function to allocate the JPEG destination buffer(s) for #tjCompress2()
- * and #tjTransform() unless you are disabling automatic buffer
+ * this function to allocate the JPEG destination buffer(s) for the compression
+ * and transform functions unless you are disabling automatic buffer
  * (re)allocation (by setting #TJFLAG_NOREALLOC.)
  *
  * @param bytes the number of bytes to allocate
@@ -1452,8 +1455,8 @@
 /**
  * Free an image buffer previously allocated by TurboJPEG.  You should always
  * use this function to free JPEG destination buffer(s) that were automatically
- * (re)allocated by #tjCompress2() or #tjTransform() or that were manually
- * allocated using #tjAlloc().
+ * (re)allocated by the compression and transform functions or that were
+ * manually allocated using #tjAlloc().
  *
  * @param buffer address of the buffer to free
  *
diff --git a/usage.txt b/usage.txt
index 5abda4e..ed97aa9 100644
--- a/usage.txt
+++ b/usage.txt
@@ -212,7 +212,7 @@
                         large images.  Value is in thousands of bytes, or
                         millions of bytes if "M" is attached to the number.
                         For example, -max 4m selects 4000000 bytes.  If more
-                        space is needed, temporary files will be used.
+                        space is needed, an error will occur.
 
         -verbose        Enable debug printout.  More -v's give more printout.
         or  -debug      Also, version information is printed at startup.
@@ -377,7 +377,7 @@
                         large images.  Value is in thousands of bytes, or
                         millions of bytes if "M" is attached to the number.
                         For example, -max 4m selects 4000000 bytes.  If more
-                        space is needed, temporary files will be used.
+                        space is needed, an error will occur.
 
         -verbose        Enable debug printout.  More -v's give more printout.
         or  -debug      Also, version information is printed at startup.
@@ -423,11 +423,6 @@
 much lower quality than the default behavior.  "-dither none" may give
 acceptable results in two-pass mode, but is seldom tolerable in one-pass mode.
 
-Two-pass color quantization requires a good deal of memory; on MS-DOS machines
-it may run out of memory even with -maxmemory 0.  In that case you can still
-decompress, with some loss of image quality, by specifying -onepass for
-one-pass quantization.
-
 To avoid the Unisys LZW patent (now expired), djpeg produces uncompressed GIF
 files.  These are larger than they should be, but are readable by standard GIF
 decoders.
@@ -435,24 +430,9 @@
 
 HINTS FOR BOTH PROGRAMS
 
-If more space is needed than will fit in the available main memory (as
-determined by -maxmemory), temporary files will be used.  (MS-DOS versions
-will try to get extended or expanded memory first.)  The temporary files are
-often rather large: in typical cases they occupy three bytes per pixel, for
-example 3*800*600 = 1.44Mb for an 800x600 image.  If you don't have enough
-free disk space, leave out -progressive and -optimize (for cjpeg) or specify
--onepass (for djpeg).
-
-On MS-DOS, the temporary files are created in the directory named by the TMP
-or TEMP environment variable, or in the current directory if neither of those
-exist.  Amiga implementations put the temp files in the directory named by
-JPEGTMP:, so be sure to assign JPEGTMP: to a disk partition with adequate free
-space.
-
-The default memory usage limit (-maxmemory) is set when the software is
-compiled.  If you get an "insufficient memory" error, try specifying a smaller
--maxmemory value, even -maxmemory 0 to use the absolute minimum space.  You
-may want to recompile with a smaller default value if this happens often.
+If the memory needed by cjpeg or djpeg exceeds the limit specified by
+-maxmemory, an error will occur.  You can leave out -progressive and -optimize
+(for cjpeg) or specify -onepass (for djpeg) to reduce memory usage.
 
 On machines that have "environment" variables, you can define the environment
 variable JPEGMEM to set the default memory limit.  The value is specified as
@@ -460,11 +440,6 @@
 specified when the program was compiled, and itself is overridden by an
 explicit -maxmemory switch.
 
-On MS-DOS machines, -maxmemory is the amount of main (conventional) memory to
-use.  (Extended or expanded memory is also used if available.)  Most
-DOS-specific versions of this software do their own memory space estimation
-and do not need you to specify -maxmemory.
-
 
 JPEGTRAN
 
diff --git a/win/jpeg7-memsrcdst.def b/win/jpeg7-memsrcdst.def
index 8c9f517..37a4777 100644
--- a/win/jpeg7-memsrcdst.def
+++ b/win/jpeg7-memsrcdst.def
@@ -104,3 +104,5 @@
 	jzero_far @ 103 ; 
 	jpeg_mem_dest @ 104 ; 
 	jpeg_mem_src @ 105 ; 
+	jpeg_skip_scanlines @ 106 ; 
+	jpeg_crop_scanline @ 107 ; 
diff --git a/wrbmp.c b/wrbmp.c
index 50e469c..728bbad 100644
--- a/wrbmp.c
+++ b/wrbmp.c
@@ -5,7 +5,7 @@
  * Copyright (C) 1994-1996, Thomas G. Lane.
  * libjpeg-turbo Modifications:
  * Copyright (C) 2013, Linaro Limited.
- * Copyright (C) 2014-2015, D. R. Commander.
+ * Copyright (C) 2014-2015, 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -70,7 +70,7 @@
 static INLINE boolean is_big_endian(void)
 {
   int test_value = 1;
-  if(*(char *)&test_value != 1)
+  if (*(char *)&test_value != 1)
     return TRUE;
   return FALSE;
 }
@@ -104,7 +104,7 @@
   inptr = dest->pub.buffer[0];
   outptr = image_ptr[0];
 
-  if(cinfo->out_color_space == JCS_RGB565) {
+  if (cinfo->out_color_space == JCS_RGB565) {
     boolean big_endian = is_big_endian();
     unsigned short *inptr2 = (unsigned short *)inptr;
     for (col = cinfo->output_width; col > 0; col--) {
@@ -437,6 +437,7 @@
                                   sizeof(bmp_dest_struct));
   dest->pub.start_output = start_output_bmp;
   dest->pub.finish_output = finish_output_bmp;
+  dest->pub.calc_buffer_dimensions = NULL;
   dest->is_os2 = is_os2;
 
   if (cinfo->out_color_space == JCS_GRAYSCALE) {
@@ -446,7 +447,7 @@
       dest->pub.put_pixel_rows = put_gray_rows;
     else
       dest->pub.put_pixel_rows = put_pixel_rows;
-  } else if(cinfo->out_color_space == JCS_RGB565 ) {
+  } else if (cinfo->out_color_space == JCS_RGB565) {
       dest->pub.put_pixel_rows = put_pixel_rows;
   } else {
     ERREXIT(cinfo, JERR_BMP_COLORSPACE);
diff --git a/wrgif.c b/wrgif.c
index cc06f1d..8d2050f 100644
--- a/wrgif.c
+++ b/wrgif.c
@@ -4,7 +4,7 @@
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1997, Thomas G. Lane.
  * libjpeg-turbo Modifications:
- * Copyright (C) 2015, D. R. Commander.
+ * Copyright (C) 2015, 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -356,6 +356,16 @@
 
 
 /*
+ * Re-calculate buffer dimensions based on output dimensions.
+ */
+
+METHODDEF(void)
+calc_buffer_dimensions_gif (j_decompress_ptr cinfo, djpeg_dest_ptr dinfo)
+{
+}
+
+
+/*
  * The module selection routine for GIF format output.
  */
 
@@ -372,6 +382,7 @@
   dest->pub.start_output = start_output_gif;
   dest->pub.put_pixel_rows = put_pixel_rows;
   dest->pub.finish_output = finish_output_gif;
+  dest->pub.calc_buffer_dimensions = calc_buffer_dimensions_gif;
 
   if (cinfo->out_color_space != JCS_GRAYSCALE &&
       cinfo->out_color_space != JCS_RGB)
diff --git a/wrjpgcom.c b/wrjpgcom.c
index c970757..531c152 100644
--- a/wrjpgcom.c
+++ b/wrjpgcom.c
@@ -247,7 +247,7 @@
   if (length < 2)
     ERREXIT("Erroneous JPEG marker length");
   length -= 2;
-  /* Skip over the remaining bytes */
+  /* Copy the remaining bytes */
   while (length > 0) {
     write_1_byte(read_1_byte());
     length--;
diff --git a/wrppm.c b/wrppm.c
index 40fbf1f..91cb10b 100644
--- a/wrppm.c
+++ b/wrppm.c
@@ -4,8 +4,8 @@
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1996, Thomas G. Lane.
  * Modified 2009 by Guido Vollbeding.
- * It was modified by The libjpeg-turbo Project to include only code and
- * information relevant to libjpeg-turbo.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -20,7 +20,6 @@
  */
 
 #include "cdjpeg.h"             /* Common decls for cjpeg/djpeg applications */
-#include "wrppm.h"
 
 #ifdef PPM_SUPPORTED
 
@@ -63,6 +62,21 @@
  */
 
 
+/* Private version of data destination object */
+
+typedef struct {
+  struct djpeg_dest_struct pub; /* public fields */
+
+  /* Usually these two pointers point to the same place: */
+  char *iobuffer;               /* fwrite's I/O buffer */
+  JSAMPROW pixrow;              /* decompressor output buffer */
+  size_t buffer_width;          /* width of I/O buffer */
+  JDIMENSION samples_per_row;   /* JSAMPLEs per output row */
+} ppm_dest_struct;
+
+typedef ppm_dest_struct *ppm_dest_ptr;
+
+
 /*
  * Write some pixel data.
  * In this module rows_supplied will always be 1.
@@ -197,6 +211,20 @@
 
 
 /*
+ * Re-calculate buffer dimensions based on output dimensions.
+ */
+
+METHODDEF(void)
+calc_buffer_dimensions_ppm (j_decompress_ptr cinfo, djpeg_dest_ptr dinfo)
+{
+  ppm_dest_ptr dest = (ppm_dest_ptr) dinfo;
+
+  dest->samples_per_row = cinfo->output_width * cinfo->out_color_components;
+  dest->buffer_width = dest->samples_per_row * (BYTESPERSAMPLE * sizeof(char));
+}
+
+
+/*
  * The module selection routine for PPM format output.
  */
 
@@ -211,13 +239,13 @@
                                   sizeof(ppm_dest_struct));
   dest->pub.start_output = start_output_ppm;
   dest->pub.finish_output = finish_output_ppm;
+  dest->pub.calc_buffer_dimensions = calc_buffer_dimensions_ppm;
 
   /* Calculate output image dimensions so we can allocate space */
   jpeg_calc_output_dimensions(cinfo);
 
   /* Create physical I/O buffer */
-  dest->samples_per_row = cinfo->output_width * cinfo->out_color_components;
-  dest->buffer_width = dest->samples_per_row * (BYTESPERSAMPLE * sizeof(char));
+  dest->pub.calc_buffer_dimensions (cinfo, (djpeg_dest_ptr) dest);
   dest->iobuffer = (char *) (*cinfo->mem->alloc_small)
     ((j_common_ptr) cinfo, JPOOL_IMAGE, dest->buffer_width);
 
diff --git a/wrrle.c b/wrrle.c
index cc95b41..880fadf 100644
--- a/wrrle.c
+++ b/wrrle.c
@@ -3,8 +3,8 @@
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1996, Thomas G. Lane.
- * It was modified by The libjpeg-turbo Project to include only code and
- * information relevant to libjpeg-turbo.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -286,6 +286,7 @@
                                   sizeof(rle_dest_struct));
   dest->pub.start_output = start_output_rle;
   dest->pub.finish_output = finish_output_rle;
+  dest->pub.calc_buffer_dimensions = NULL;
 
   /* Calculate output image dimensions so we can allocate space */
   jpeg_calc_output_dimensions(cinfo);
diff --git a/wrtarga.c b/wrtarga.c
index c02b332..4db9313 100644
--- a/wrtarga.c
+++ b/wrtarga.c
@@ -3,8 +3,8 @@
  *
  * This file was part of the Independent JPEG Group's software:
  * Copyright (C) 1991-1996, Thomas G. Lane.
- * It was modified by The libjpeg-turbo Project to include only code and
- * information relevant to libjpeg-turbo.
+ * libjpeg-turbo Modifications:
+ * Copyright (C) 2017, D. R. Commander.
  * For conditions of distribution and use, see the accompanying README.ijg
  * file.
  *
@@ -212,6 +212,19 @@
 
 
 /*
+ * Re-calculate buffer dimensions based on output dimensions.
+ */
+
+METHODDEF(void)
+calc_buffer_dimensions_tga (j_decompress_ptr cinfo, djpeg_dest_ptr dinfo)
+{
+  tga_dest_ptr dest = (tga_dest_ptr) dinfo;
+
+  dest->buffer_width = cinfo->output_width * cinfo->output_components;
+}
+
+
+/*
  * The module selection routine for Targa format output.
  */
 
@@ -226,12 +239,13 @@
                                   sizeof(tga_dest_struct));
   dest->pub.start_output = start_output_tga;
   dest->pub.finish_output = finish_output_tga;
+  dest->pub.calc_buffer_dimensions = calc_buffer_dimensions_tga;
 
   /* Calculate output image dimensions so we can allocate space */
   jpeg_calc_output_dimensions(cinfo);
 
   /* Create I/O buffer. */
-  dest->buffer_width = cinfo->output_width * cinfo->output_components;
+  dest->pub.calc_buffer_dimensions (cinfo, (djpeg_dest_ptr) dest);
   dest->iobuffer = (char *)
     (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
                                 (size_t) (dest->buffer_width * sizeof(char)));
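
The writer modules above each set the new calc_buffer_dimensions method: wrppm.c and wrtarga.c point it at a routine that recomputes buffer_width, wrgif.c uses an empty routine, and wrbmp.c and wrrle.c set it to NULL. The caller is not shown in this part of the diff, so the following is only a sketch of how such an optional method pointer could be used after the output width changes. The types image_info and dest_writer and the functions recalc_if_supported() and demo_buffer_width() are hypothetical simplifications, and treating NULL as "recalculation not supported" is an assumption.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the decompressor state */
typedef struct {
  unsigned int output_width;
  int output_components;
} image_info;

/* Hypothetical, simplified stand-in for a destination writer object */
typedef struct dest_writer dest_writer;
struct dest_writer {
  /* Optional method; NULL is assumed to mean "cannot recalculate" */
  void (*calc_buffer_dimensions)(const image_info *info, dest_writer *dest);
  size_t buffer_width;
};

/* Same shape as calc_buffer_dimensions_tga in the diff: width * components */
static void calc_buffer_dimensions_simple(const image_info *info,
                                          dest_writer *dest)
{
  dest->buffer_width =
    (size_t)info->output_width * (size_t)info->output_components;
}

/* Returns 1 if the buffer width was recalculated, 0 if the writer does not
 * provide the method. */
int recalc_if_supported(const image_info *info, dest_writer *dest)
{
  if (dest->calc_buffer_dimensions == NULL)
    return 0;
  (*dest->calc_buffer_dimensions)(info, dest);
  return 1;
}

/* Convenience wrapper: returns the recalculated width, or 0 if unsupported */
size_t demo_buffer_width(unsigned int width, int components, int supported)
{
  image_info info = { width, components };
  dest_writer dest = { NULL, 0 };

  if (supported)
    dest.calc_buffer_dimensions = calc_buffer_dimensions_simple;
  if (!recalc_if_supported(&info, &dest))
    return 0;
  return dest.buffer_width;
}
```

Checking the pointer before the call lets a caller fail cleanly for formats that leave the method unset, rather than dereferencing NULL.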