 This is a README for the font compression reference code. There are several
 compression related modules in this repository.

 brotli/ contains reference code for the Brotli byte-level compression
 algorithm. Note that it is licensed under an Apache 2 license.

 src/ contains prototype Java code for compressing fonts.

 cpp/ contains prototype C++ code for decompressing fonts.

 docs/ contains documents describing the proposed compression format.

 = How to run the compression test tool =

 This section describes how to run the compression reference code. At the time
 of writing, the code is intended to produce a bytestream that can be
 reconstructed into a working font, but the reference decompression code is not
 finished, and the exact format of that bytestream is subject to change.

 == Building the tool ==

 On a standard Unix-style environment, it should be as simple as running “ant”.

 The tool depends on sfntly for much of the font work. The lib/ directory
 contains a snapshot jar. If you want to use the latest sfntly sources, cd to
 the java subdirectory of the sfntly sources, run “ant”, then copy these files
 to $(thisproject)/lib:

 dist/lib/sfntly.jar
 dist/tools/conversion/eot/eotconverter.jar
 dist/tools/conversion/woff/woffconverter.jar

 There’s also a dependency on guava (see references below).

 The dependencies are subject to their own licenses.

 == Setting up the test ==

 A run of the tool evaluates a “base” configuration plus one or more test
 configurations, for each font. It measures the file size of the test as a ratio
 over the base file size, then graphs the value of that ratio sorted across all
 files given on the command line.
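
 As a rough illustration of the measurement described above (not the tool's
 actual code), the ratio and its sort order could be computed as in this Python
 sketch. The function name, the dictionaries, and the sample sizes are made up
 for the example.

```python
def sorted_size_ratios(base_sizes, test_sizes):
    """Return test/base size ratios, sorted ascending, one per font.

    base_sizes and test_sizes are hypothetical dicts mapping a font file
    name to its size in bytes under the base and test configurations.
    """
    ratios = [test_sizes[name] / base_sizes[name] for name in base_sizes]
    return sorted(ratios)


# Made-up sizes for three fonts, for illustration only.
base = {"a.ttf": 1000, "b.ttf": 2000, "c.ttf": 4000}
test = {"a.ttf": 500, "b.ttf": 1500, "c.ttf": 1000}
print(sorted_size_ratios(base, test))  # [0.25, 0.5, 0.75]
```

 A ratio below 1.0 means the test configuration produced a smaller file than
 the base configuration for that font.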

 The test parameters are set by command line options (an improvement from the
 last snapshot). The base is set by the -b command line option, and the
 additional tests are specified by repeated -x command line options (see below).

 Each test is specified by a string description. It is a colon-separated list of
 stages. The final stage is entropy compression and can be one of “gzip”,
 “lzma”, “bzip2”, “woff”, “eot” (with actual wire-format MTX compression), or
 “uncomp” (for raw, uncompressed TTF’s). Also, the new wire-format draft
 WOFF2 spec is available as "woff2", and takes an entropy coding as an
 optional argument, as in "woff2/gzip" or "woff2/lzma".

 Other stages may optionally include subparameters (following a slash, and
 comma-separated). The stages are:

 glyf: performs glyf-table preprocessing based on MTX. There are subparameters:

 1. cbbox (composite bounding box): when specified, the bounding box for
    composite glyphs is included, otherwise stripped.
 2. sbbox (simple bounding box): when specified, the bounding box for simple
    glyphs is included.
 3. code: the bytecode is separated out into a separate stream.
 4. triplet: triplet coding (as in MTX) is used.
 5. push: push sequences are separated; if unset, pushes are kept inline in
    the bytecode.
 6. reslice: components of the glyf table are separated into individual
    streams, taking the MTX idea of separating the bytecodes further.

 hmtx: strips lsb’s from the hmtx table. Based on the idea that lsb’s can be
 reconstructed from bbox.
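
 The idea behind the hmtx stage can be sketched in Python as follows. This is
 an illustration only, and it assumes that each glyph's left side bearing (lsb)
 equals the xMin of its bounding box; the function names and the plain lists of
 integers are hypothetical and do not reflect the tool's data structures.

```python
def can_strip_lsbs(lsbs, x_mins):
    """True if every lsb equals the xMin of the same glyph's bounding box."""
    return len(lsbs) == len(x_mins) and all(
        lsb == x_min for lsb, x_min in zip(lsbs, x_mins)
    )


def reconstruct_lsbs(x_mins):
    """Rebuild the stripped lsb values from the bounding boxes."""
    return list(x_mins)


# Made-up values for three glyphs.
lsbs = [10, -5, 0]
x_mins = [10, -5, 0]
if can_strip_lsbs(lsbs, x_mins):
    assert reconstruct_lsbs(x_mins) == lsbs
```

 If any glyph fails the check, the lsb values cannot be recovered from the
 bounding boxes alone and would have to be kept.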

 hdmx: performs the delta coding on hdmx, essentially the same as MTX.

 cmap: compresses cmap table: wire format representation is inverse of cmap
 table plus exceptions (one glyph encoded by multiple character codes).
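
 One possible reading of "inverse of cmap table plus exceptions" is sketched
 below in Python. The actual wire format is not specified here, so the details
 are assumptions: the function names are hypothetical, the cmap is modelled as
 a dict from character code to glyph id, and the lowest character code for a
 glyph is arbitrarily treated as its primary code.

```python
def invert_cmap(cmap):
    """Invert a {char_code: glyph_id} mapping.

    Returns (primary, exceptions). primary maps each glyph id to the lowest
    character code that selects it. exceptions lists the remaining
    (char_code, glyph_id) pairs, for glyphs reachable from several codes.
    """
    primary = {}
    exceptions = []
    for code in sorted(cmap):
        glyph = cmap[code]
        if glyph in primary:
            exceptions.append((code, glyph))
        else:
            primary[glyph] = code
    return primary, exceptions


def restore_cmap(primary, exceptions):
    """Rebuild the original {char_code: glyph_id} mapping."""
    cmap = {code: glyph for glyph, code in primary.items()}
    cmap.update(exceptions)
    return cmap


# Made-up example: space and no-break space share glyph 1.
cmap = {0x20: 1, 0xA0: 1, 0x41: 2}
primary, exceptions = invert_cmap(cmap)
assert restore_cmap(primary, exceptions) == cmap
```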

 kern: compresses kern table (not robust, intended just for rough testing).

 strip: the subparameters are a list of tables to be stripped entirely
 (comma-separated).

 The string roughly corresponding to MTX is:

 glyf/cbbox,code,triplet,push,hop:hdmx:gzip

 Meaning: glyph encoding is used, with simple glyph bboxes stripped (but
 composite glyph bboxes included), triplet coding, push sequences, and hop
 codes. The hdmx table is compressed. And finally, gzip is used as the entropy
 coder.
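
 To make the syntax concrete, here is a minimal Python sketch that splits a
 description string into stages and subparameters, following the rules stated
 above (colon between stages, slash before subparameters, commas between
 them). The function name is hypothetical, and the tool's own parser may
 differ.

```python
def parse_description(desc):
    """Split a test description into (stage, [subparameters]) pairs."""
    stages = []
    for part in desc.split(":"):
        # Everything after the first slash is the subparameter list.
        name, _, params = part.partition("/")
        stages.append((name, params.split(",") if params else []))
    return stages


for stage, params in parse_description("glyf/cbbox,code,triplet,push,hop:hdmx:gzip"):
    print(stage, params)
# glyf ['cbbox', 'code', 'triplet', 'push', 'hop']
# hdmx []
# gzip []
```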

 This differs from MTX in a number of small ways: LZCOMP is not exactly the same
 as gzip. MTX uses three separate compression streams (the base font including
 triplet-coded glyph data, the bytecodes, and the push sequences), while this
 test uses a single stream. MTX also compresses the CVT table (an upper bound on
 the impact of this can be estimated by testing strip/cvt).

 Lastly, as a point of methodology, the code by default strips the “dsig” table,
 which would be invalidated by any non-bit-identical change to the font data. If
 it is desired to keep this table, add the “keepdsig” stage.

 The string representing the currently most aggressive optimization level is:

 glyf/triplet,code,push,reslice:hdmx:hmtx:cmap:kern:lzma

 In addition to the MTX one above, it strips the bboxes from composite glyphs,
 reslices the glyf table, compresses the hmtx, cmap, and kern tables, and uses
 lzma as the entropy coding.

 The string corresponding to the current WOFF Ultra Condensed draft spec
 document is:

 glyf/cbbox,triplet,code,reslice:woff2/lzma

 The current C++ codebase can roundtrip compressed files as long as no per-table
 entropy coding is specified, as below (this will be fixed soon).

 glyf/cbbox,triplet,code,reslice:woff2


 == Running the tool ==

 java -jar build/jar/compression.jar *.ttf > chart.html

 The tool takes a list of OpenType fonts on the command line and generates an
 HTML chart, which it writes to stdout. This chart uses the Google Chart API
 for plotting.

 Options:

 -b <desc>

 Sets the baseline experiment description.

 [ -x <desc> ]...

 Sets an experiment description. Can be used multiple times.

 -o

 Outputs the actual compressed file, substituting ".wof2" for ".ttf" in
 the input file name. Only useful when a single -x parameter is specified.
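
 As an illustration of how the options fit together, the Python sketch below
 assembles an argument list for the tool and derives the output name that -o
 is described as using. The helper names are hypothetical, and placing the
 options before the font files is an assumption; the text above does not state
 the required order.

```python
def build_command(base, tests, fonts, write_output=False):
    """Assemble an argument list for one run of the comparison tool."""
    cmd = ["java", "-jar", "build/jar/compression.jar", "-b", base]
    for desc in tests:
        cmd += ["-x", desc]  # one -x per test configuration
    if write_output:
        cmd.append("-o")
    return cmd + list(fonts)


def output_name(font_path):
    """Replace a trailing ".ttf" with ".wof2", as described for -o."""
    if font_path.endswith(".ttf"):
        return font_path[: -len(".ttf")] + ".wof2"
    return font_path


cmd = build_command("uncomp", ["glyf/cbbox,triplet,code,reslice:woff2"], ["a.ttf"])
print(" ".join(cmd))
print(output_name("a.ttf"))  # a.wof2
```

 The resulting list could be passed to subprocess.run with stdout redirected
 to a file such as chart.html.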

 = Decompressing the fonts =

 See the cpp/ directory (including cpp/README) for the C++ implementation of
 decompression. This code is based on OTS, and successfully roundtrips the
 basic compression as described in the draft spec.

 = References =

 sfntly: http://code.google.com/p/sfntly/
 Guava: http://code.google.com/p/guava-libraries/
 MTX: http://www.w3.org/Submission/MTX/

 Also please refer to documents (currently Google Docs):

 WOFF Ultra Condensed file format: proposals and discussion of wire format
 issues (PDF is in docs/ directory)

 WOFF Ultra Condensed: more discussion of results and compression techniques.
 This tool was used to prepare the data in that document.