This document defines the behavior of a new file format that encodes a logarithmic range gain map image in a JPEG image file. Legacy readers that don't support the new format read and display the conventional low dynamic range image from the image file. Readers that support the format combine the primary image with the gain map and render a high dynamic range image on compatible displays.
The remainder of this document describes the processes needed to make use of this format. At a high level, the life cycle of an image conforming to this format is:

1. Encoding
2. Decoding
Figure 1. Example file layout and relevant metadata.
The goal of this file format is to encode additional information in SDR image files that can be used in combination with the display technique to produce an optimal HDR rendition, all in a single file.
For this to be practical, the file format must:
Additionally, the display technique must:
And finally, the technique must be able to do all of the preceding actions without ever:
The following are normative references for this specification:
SDR display
HDR display
Primary image
Secondary image
Range compression
SDR white point
HDR white point
Boost
Max content boost (max_content_boost in equations)
Min content boost (min_content_boost in equations)
Max display boost (max_display_boost in equations)
Display boost
Target HDR rendition
Adapted HDR rendition
Gain map (recovery(x, y) in equations)
clamp(x, a, b)
exp2(x)
floor(x)
log2(x)
pow(b, x)
XMP
Multi-Picture Format
GContainer
This section describes how to encode a conforming JPEG file. Refer to T.81 (09/92) Digital compression and coding of continuous-tone still images, in the Dependencies section, for more information about the JPEG format.
Camera imaging pipelines commonly perform a range compression operation to compress higher dynamic range luminance data to the lower range of conventional SDR displays. The gain map provides a mechanism to store data sufficient to recover the original, higher dynamic range luminance data.
The following calculations in this section assume floating point arithmetic.
Note: To illustrate the following formulas, this document assumes that you are using a single-channel gain map. If this isn't the case, then formulas relating to these properties can be extrapolated to each color channel. You skip converting from SDR(x, y) and HDR(x, y) to Ysdr(x, y) and Yhdr(x, y) respectively, and instead define pixel_gain(x, y) as a vector function relative to SDR(x, y) and HDR(x, y). You also utilize per-channel metadata values, as required. Use a single-channel gain map whenever possible.
The following functions describe the SDR image:
- SDR'(x, y) is the three-channel, non-linear (typically gamma-encoded) primary image.
- SDR(x, y) is the linear version of the three-channel primary image, obtained by transforming to a linear version of the primary image color space; for example, from a color space with an sRGB transfer function to a linear color space that preserves the sRGB color primaries.
- The Ysdr(x, y) function is defined on the range of 0.0 to 1.0 and is the standard dynamic range primary image linear luminance:

Ysdr(x, y) = primary_color_profile_to_luminance(SDR(x, y))
Similar definitions exist for the HDR image:

- HDR'(x, y) is the three-channel, non-linear (that is, PQ- or HLG-encoded) HDR image.
- HDR(x, y) is the three-channel linear HDR image.
- Yhdr(x, y) is the luminance at a given point of the HDR image:

Yhdr(x, y) = primary_color_profile_to_luminance(HDR(x, y))

Yhdr(x, y) is defined in the range 0.0 to max content boost.
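To make the luminance functions concrete, the following Python sketch computes Ysdr(x, y) and Yhdr(x, y) from the linear images. The use of numpy and of BT.709/sRGB luminance coefficients are assumptions for illustration only; a real implementation must use coefficients that match the primary image's actual color profile (for example, Display-P3).

```python
import numpy as np

# Assumed luminance coefficients for linear BT.709 / sRGB primaries.
# Substitute the coefficients of the primary image's actual color profile.
LUMA_COEFFICIENTS = np.array([0.2126, 0.7152, 0.0722])

def primary_color_profile_to_luminance(rgb_linear: np.ndarray) -> np.ndarray:
    """Maps a linear three-channel image of shape (H, W, 3) to per-pixel luminance (H, W)."""
    return rgb_linear @ LUMA_COEFFICIENTS

def compute_luminances(sdr_linear: np.ndarray, hdr_linear: np.ndarray):
    """Returns Ysdr(x, y) in [0.0, 1.0] and Yhdr(x, y) in [0.0, max_content_boost]."""
    y_sdr = primary_color_profile_to_luminance(sdr_linear)
    y_hdr = primary_color_profile_to_luminance(hdr_linear)
    return y_sdr, y_hdr
```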
The SDR and HDR images must be the same resolution. The color profile of the SDR image defines the color space of the HDR image.
For example, if the SDR primary image has a Display-P3 color profile, then the HDR image is defined relative to the primary colors of that profile. This means the HDR image also has Display-P3 primaries.
The gain map is computed from two linear images containing the wanted HDR image luminance, Yhdr(x, y), and the standard range luminance image, Ysdr(x, y).
The pixel_gain(x, y) function is defined as the ratio between the Yhdr(x, y) function and the Ysdr(x, y) function:
pixel_gain(x, y) = (Yhdr(x, y) + offset_hdr) / (Ysdr(x, y) + offset_sdr)
The pixel_gain(x, y) function behavior where Ysdr(x, y) and offset_sdr are both zero is implementation-defined.

For example, implementations can handle the case where Ysdr(x, y) and offset_sdr are both zero by defining pixel_gain(x, y) as 1.0. Alternatively, implementations can avoid this scenario entirely by utilizing a non-zero offset_sdr.
The values of offset_sdr and offset_hdr are chosen by the implementation.
Tip: Use a value of 0.015625 (1/64) for offset_sdr and offset_hdr. The purpose of these values is to balance the ability to raise near blacks with the ability to precisely encode smaller gain values. Increasing these offset values increases the ability to recover near blacks, while maintaining a reasonable value for map_max_log2. Increasing the values too high can reduce precision of the map.
The gain map is a scalar function that encodes pixel_gain(x, y) in a logarithmic space, relative to max content boost and min content boost:

map_min_log2 = log2(min_content_boost)
map_max_log2 = log2(max_content_boost)

log_recovery(x, y) = (log2(pixel_gain(x, y)) - map_min_log2) / (map_max_log2 - map_min_log2)
clamped_recovery(x, y) = clamp(log_recovery(x, y), 0.0, 1.0)
recovery(x, y) = pow(clamped_recovery(x, y), map_gamma)
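A possible Python implementation of these encoder equations is sketched below. The use of numpy and the function signature are assumptions, not part of the format, and the pixel_gain(x, y) = 0 case is forced to a recovery(x, y) of 0.0, consistent with the tip that follows.

```python
import numpy as np

def compute_recovery(y_sdr, y_hdr, offset_sdr, offset_hdr,
                     min_content_boost, max_content_boost, map_gamma):
    """Computes recovery(x, y) in [0.0, 1.0] from Ysdr(x, y) and Yhdr(x, y) arrays."""
    pixel_gain = (y_hdr + offset_hdr) / (y_sdr + offset_sdr)

    map_min_log2 = np.log2(min_content_boost)
    map_max_log2 = np.log2(max_content_boost)

    # log2(0) is undefined; where pixel_gain is zero, map to the minimum so that
    # recovery(x, y) becomes 0.0 (the largest attenuation the gain map can store).
    log_gain = np.where(pixel_gain > 0.0,
                        np.log2(np.maximum(pixel_gain, 1e-38)),
                        map_min_log2)

    log_recovery = (log_gain - map_min_log2) / (map_max_log2 - map_min_log2)
    clamped_recovery = np.clip(log_recovery, 0.0, 1.0)
    return np.power(clamped_recovery, map_gamma)
```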
The recovery(x, y) function behavior where pixel_gain(x, y) is zero is implementation-defined, because log2(0) is undefined.

Tip: Define recovery(x, y) as 0.0 where pixel_gain(x, y) is zero. This is because recovery(x, y) of 0.0 represents the largest possible attenuation that can be stored in the gain map.
map_gamma is a floating point number that must be greater than 0.0 and is chosen by the implementation.
Tip: Use a map_gamma of 1.0. You can use a different value if your gain map has a very uneven distribution of log_recovery(x, y) values. For example, this might apply if a gain map has a lot of detail just above SDR range (represented as small log_recovery(x, y) values), and a very large map_max_log2 for the top end of the HDR rendition's desired brightness (represented by large log_recovery(x, y) values). In this case, you can use a map_gamma higher than 1.0 so that recovery(x, y) can precisely encode the detail in both the low end and high end of log_recovery(x, y).
The values of max content boost and min content boost are implementation-defined, and can be arbitrarily decided by the content creator. Max content boost must be greater than or equal to 1.0. Min content boost must be in the range (0.0, 1.0].
Note: For example, say the image comes from a camera, and the camera's image-processing pipeline generates the SDR image by performing range compression on a higher dynamic range input. Then, max content boost and min content boost might be a function of the amount of range compression performed and what regions of luminance it affected.
Such pipelines might choose a min content boost of 1.0 if they don’t want to encode any attenuation of the HDR image relative to the SDR image. Alternatively, if generating the gain map based on the SDR and HDR renditions themselves, max content boost and min content boost can be determined based on the maximum and minimum gain needed to convert between the two renditions.
Content creators or editors can also adjust max content boost to control the maximum allowable difference between SDR and HDR luminance when the image is displayed.
Values in recovery(x, y) are limited to the range [0.0, 1.0].
Note: The brightest areas of the image typically have recovery(x, y) values close to 1.0, while darker areas of the image have values around 0.0.
The gain map is stored in a secondary image JPEG, and therefore must be encoded using 8-bit, unsigned integer values, thus in the range [0, 255]. Each value represents a recovery(x, y) value and is stored in one pixel of the secondary image.
For 8-bit unsigned integer storage, the encoded value is defined as the following:
encoded_recovery(x, y) = floor(recovery(x, y) * 255.0 + 0.5)
Calculation of the encode function is done in floating point and converted at the end to the 8-bit unsigned integer result by rounding as indicated.
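A minimal Python sketch of this quantization step, matching the floor-plus-0.5 rounding above (numpy is an assumption):

```python
import numpy as np

def encode_recovery_8bit(recovery: np.ndarray) -> np.ndarray:
    """Quantizes recovery(x, y) in [0.0, 1.0] to 8-bit unsigned integers in [0, 255]."""
    return np.floor(recovery * 255.0 + 0.5).astype(np.uint8)
```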
This encoding results in an 8-bit unsigned integer representation of recovery(x, y) values, from 0.0 to 1.0. The encoded gain map must be stored in a secondary image item as a JPEG. The implementation chooses the amount of compression to use during JPEG encoding.
Tip: Start with an implementation-defined JPEG quality parameter of 85 to 90 out of 100, or similar, for most JPEG encoding implementations.
After the gain map is stored in a secondary image, it is appended to a primary image with MPF and GContainer XMP metadata. The primary image GContainer directory must contain an item for the gain map image.
The resolution of the stored gain map is implementation-defined and can be different from the resolution of the primary image. If the gain map is scaled to a different resolution than the primary image for storage, the sampling method must be bilinear or better, and is implementation-defined.
Tip: Have the gain map downsampled to ¹⁄₁₆ of the size (¼ on each dimension) of the primary image before storage, such as a 480x270 gain map for a 1920x1080 primary image. The gain map needs a similar aspect ratio to the primary image.
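For example, one way to produce the suggested ¼-per-dimension gain map before JPEG encoding is a bilinear resize. The use of Pillow here is an assumption; any bilinear-or-better resampler is acceptable.

```python
from PIL import Image

def downsample_gain_map(encoded_recovery: Image.Image) -> Image.Image:
    """Bilinearly downsamples the 8-bit gain map to 1/4 of its size on each dimension."""
    width, height = encoded_recovery.size
    return encoded_recovery.resize((max(1, width // 4), max(1, height // 4)),
                                   resample=Image.BILINEAR)
```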
The orientation of the gain map must match that of the primary image. If present, any orientation metadata in the stored gain map image, as in EXIF, isn't used.
If present, the gain map's color profile isn't used.
The color profile of the image must be indicated via an ICC Profile for the primary image.
Tip: Use a Display-P3 color profile.
The primary image contains XMP metadata to define at least two images with extra semantic information for the HDR gain map format.
The following subsections contain details specific to this format. Additional information regarding general conformance to GContainer is specified in the GContainer details section.
Attribute values described in the following tables are stored as XMP simple values of the specified XMP basic value types.
The Item:Semantic property defines the application-specific meaning of each media item in the container directory.
Gain map metadata encodes information about how to interpret and apply the gain map to produce the HDR representation of the primary image.
The XMP namespace URI for the gain map metadata XMP extension is http://ns.adobe.com/hdr-gain-map/1.0/. The default namespace prefix is hdrgm.
This metadata is stored in the gain map image's XMP packet, and the following properties must appear in the gain map image XMP's rdf:Description:
Tip: Whenever possible, use only a single Real value for each preceding property that permits one value or an array of values, with one for each color channel. Set HDRCapacityMax equal to GainMapMax, and HDRCapacityMin to the greater of GainMapMin or 0.0. The rest of this document assumes that you are using a single Real value for all properties which may have an array of Real values. Formulas about these properties can be extrapolated to each color channel when this isn't the case in practice.
The following example of a valid gain map XMP packet contains metadata taken from the example file illustrated in the Introduction section.
The gain map image must be stored as an additional image as defined in CIPA DC-x 007-2009 Multi-Picture Format, as referenced in the Dependencies section.
This section describes how to decode the gain map from a conforming JPEG file.
A JPEG file conforming to this format may be identified by the presence of hdrgm:Version="1.0" in the primary image's XMP packet, where hdrgm is the namespace URI http://ns.adobe.com/hdr-gain-map/1.0/.
For details on parsing and decoding the image, see the following GContainer details section. A “GainMap” semantic item within the XMP rdf:Directory is used to signal the location of a gain map image. Alternatively, the MPF Index IFD and a scan of the images' XMP can be used to determine the location of a gain map.
Metadata is considered invalid if a required field is not present, or if any field is present with an invalid value. A value may be invalid because it is not parseable to the specified type or because it is outside of its expected range.
If invalid metadata is encountered, the gain map should be ignored and the SDR image should be displayed.
Files encoded in the HDR gain map format might be rendered on either conventional SDR displays, or on HDR displays capable of higher-luminance output.
The following calculations in this section assume floating-point arithmetic.
encoded_recovery(x, y) is the single-channel, 8-bit, unsigned integer value from the gain map image.

If the gain map is a different resolution than the primary image, then encoded_recovery(x, y) is instead determined by a filtered sampling of the gain map image for x and y over the range of the primary image width and height, respectively. The filtering method must be bilinear or better and is implementation-defined.
map_gamma is determined by the hdrgm:Gamma metadata field.
log_recovery(x, y) is the normalized floating point pixel gain in a logarithmic space:

recovery(x, y) = encoded_recovery(x, y) / 255.0
log_recovery(x, y) = pow(recovery(x, y), 1.0 / map_gamma)
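A minimal Python sketch of this step (the function name and use of numpy are illustrative assumptions):

```python
import numpy as np

def decode_log_recovery(encoded_recovery: np.ndarray, map_gamma: float) -> np.ndarray:
    """Maps 8-bit gain map samples back to log_recovery(x, y) in [0.0, 1.0]."""
    recovery = encoded_recovery.astype(np.float64) / 255.0
    return np.power(recovery, 1.0 / map_gamma)
```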
Max display boost is a scalar floating point value defined as the ratio of the current HDR white point to the current SDR white point. This value is provided by the display system and can change over time.
hdr_capacity_max is determined by the hdrgm:HDRCapacityMax metadata field. hdr_capacity_min is determined by the hdrgm:HDRCapacityMin metadata field.
weight_factor is determined as follows when hdrgm:BaseRenditionIsHDR is “False”:

unclamped_weight_factor = (log2(max_display_boost) - hdr_capacity_min) / (hdr_capacity_max - hdr_capacity_min)
weight_factor = clamp(unclamped_weight_factor, 0.0, 1.0)
When hdrgm:BaseRenditionIsHDR is “True”, the second equation is instead:
weight_factor = 1.0 - clamp(unclamped_weight_factor, 0.0, 1.0)
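The following Python sketch combines both cases; the function and parameter names are illustrative, with hdrgm:BaseRenditionIsHDR passed as a boolean:

```python
import math

def compute_weight_factor(max_display_boost: float,
                          hdr_capacity_min: float,
                          hdr_capacity_max: float,
                          base_rendition_is_hdr: bool = False) -> float:
    """Computes weight_factor from the current display boost and the HDR capacity metadata."""
    unclamped = (math.log2(max_display_boost) - hdr_capacity_min) / (
        hdr_capacity_max - hdr_capacity_min)
    clamped = min(max(unclamped, 0.0), 1.0)
    # When hdrgm:BaseRenditionIsHDR is "True", the weight is inverted.
    return 1.0 - clamped if base_rendition_is_hdr else clamped
```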
gain_map_max is determined by the hdrgm:GainMapMax metadata field. gain_map_min is determined by the hdrgm:GainMapMin metadata field. offset_sdr is determined by the hdrgm:OffsetSDR metadata field. offset_hdr is determined by the hdrgm:OffsetHDR metadata field.
The linear adapted HDR rendition can be computed as follows:
log_boost(x, y) = gain_map_min * (1.0 - log_recovery(x, y)) + gain_map_max * log_recovery(x, y)
HDR(x, y) = (SDR(x, y) + offset_sdr) * exp2(log_boost(x, y) * weight_factor) - offset_hdr
If needed, the implementation might apply a transform to HDR(x, y) to put the data in the space expected by the display. Any such transformations must be colorimetrically correct.
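Putting the preceding decode steps together, the following Python sketch recovers the linear adapted HDR rendition. The names, the use of numpy, and the single-channel gain map assumption are illustrative only, and the gain map is assumed to already be sampled at the primary image resolution.

```python
import numpy as np

def apply_gain_map(sdr_linear: np.ndarray, log_recovery: np.ndarray,
                   weight_factor: float, gain_map_min: float, gain_map_max: float,
                   offset_sdr: float, offset_hdr: float) -> np.ndarray:
    """Computes the linear adapted HDR rendition HDR(x, y).

    sdr_linear: linear primary image SDR(x, y), shape (H, W, 3).
    log_recovery: log_recovery(x, y) sampled at primary resolution, shape (H, W).
    """
    log_boost = gain_map_min * (1.0 - log_recovery) + gain_map_max * log_recovery
    # Apply the gain via exponentiation and broadcast it over the three color channels.
    boost = np.exp2(log_boost * weight_factor)[..., np.newaxis]
    return (sdr_linear + offset_sdr) * boost - offset_hdr
```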
Note: The gain map must be applied as shown earlier (using exponentiation) to properly produce the needed relative tonal relationships (between areas of different brightness in the image) at any wanted display boost. If the gain map is applied any other way, such as via simple linear interpolation, then the relative tonal relationships within the image are correct only at max content boost, and are compromised at other boost factors.
The described method for recovering the adapted HDR rendition means that display results transition seamlessly as max display boost changes. This method always produces a display result that matches the content creator's intention as closely as possible, and that meets all of the image quality requirements enumerated in the Motivation section.
The amount of attenuation applied in cases where min content boost is less than 1.0 also scales based on the relationship between max display boost and max content boost. For example, if max content boost is 4.0, min content boost is 0.5, and max display boost is 2.0, then the maximum attenuation of the displayed image is 0.7071, rather than 0.5 if max display boost is 4.0 or higher.
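The 0.7071 figure can be checked with the preceding equations. The short calculation below assumes the metadata tip from the encoding section (HDRCapacityMax equal to GainMapMax, and HDRCapacityMin clamped at 0.0):

```python
import math

gain_map_min = math.log2(0.5)  # min content boost = 0.5
gain_map_max = math.log2(4.0)  # max content boost = 4.0
hdr_capacity_min, hdr_capacity_max = 0.0, gain_map_max

# max display boost = 2.0
unclamped = (math.log2(2.0) - hdr_capacity_min) / (hdr_capacity_max - hdr_capacity_min)
weight_factor = min(max(unclamped, 0.0), 1.0)  # 0.5

# Maximum attenuation occurs where log_boost(x, y) equals gain_map_min.
max_attenuation = 2.0 ** (gain_map_min * weight_factor)
print(round(max_attenuation, 4))  # 0.7071
```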
This section specifies additional requirements such that this format conforms with GContainer XML metadata. The metadata is serialized following ISO 16684-1:2011(E) XMP Specification Part 1 and embedded inside the primary image file as described in Adobe XMP Specification Part 3, Storage in Files. The primary image file contains the following items, formatted as RDF/XML.
The XMP packet shall include the gain map metadata XMP extension via the namespace URI http://ns.adobe.com/hdr-gain-map/1.0/. The default namespace prefix is hdrgm.
The XMP packet shall define hdrgm:Version="1.0".
The XMP namespace for the GContainer XMP extension is http://ns.google.com/photos/1.0/container/. The default namespace prefix is Container.
The primary image contains a Container:Directory element in XMP metadata defining the order and properties of the subsequent media files in the file container. Each file in the container has a corresponding media item in the Container:Directory. The media item describes the location in the file container and the basic properties of each concatenated file.
The container element is encoded into the XMP metadata of the primary image and defines a directory of media items in the container. Media items must be located in the container file in the same order as the media item elements in the directory and must be tightly packed.
The directory can contain only one “Primary” image item and it must be the first item in the directory.
Item elements describe how each media item is used by the application.
The XMP namespace URI for the GContainer Item XMP extension is http://ns.google.com/photos/1.0/container/item/. The default namespace prefix is Item.
The first media item must be the primary image. It must specify Item:Semantic = "Primary" and an Item:Mime listed in Item MIME type values.
The length of the primary image item is determined by parsing the primary image based on its MIME type starting at the beginning of the file container.
Media items can contain an Item:Padding attribute specifying additional padding between the end of the media item and the beginning of the next media item. When present on the last media item in the Container:Directory, Item:Padding indicates padding between the end of the item and the end of the file.
Each media item must contain Item:Mime type and Item:Semantic attributes. The secondary image media items must contain Item:Length attributes.
Sequential media items can share resource data within the file container. The first media item determines the location of the resource in the file container, and subsequent shared media items have Item:Length set to 0. In the case that the resource data is itself a container, Item:URI might be used to determine the location of the media item data within the resource.
The location of media item resources in the container is determined by summing the length of the primary image encoding, the Item:Length values of the preceding secondary media item resources, and all preceding Item:Padding values. Item:Padding is considered to be 0 on media item resources that don't specify its value.
The Item:Mime attribute defines the MIME type of each media item's data.
The following example of a valid GContainer XMP packet has metadata taken from the example file illustrated in the Introduction section.