Fix a bug in the client

In very rare cases, the last distance in a deflate block has a Huffman code of
X bits, the end of block symbol that follows it has a Huffman code of Y bits,
and the longest distance Huffman code has Z bits. If X + Y < Z, we incorrectly
try to cache Z bits when fewer than Z bits may remain in the stream. This
causes a crash in the client, but it can be caught in the paygen stage.

This patch adds a new parameter to the Puffer to catch these scenarios and adds
a new function, RemoveDeflatesWithBadDistanceCaches(), which detects and
removes these problematic deflate instances. update_engine can call this
function to filter them out.
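
The X + Y < Z condition can be illustrated with a minimal C++ sketch. The
helper names below are hypothetical and are not part of Puffin's API. The
sketch also assumes that the last distance carries no extra bits and that no
bits follow the end of block symbol, so exactly X + Y bits remain when the
last distance code is read.

```cpp
#include <cstddef>

// Hypothetical helper (not Puffin's API): bits left in the stream when the
// decoder is about to read the last distance code of a block. Assumes no
// distance extra bits and nothing after the end of block symbol.
constexpr std::size_t BitsLeftAtLastDistance(std::size_t distance_code_bits,
                                             std::size_t end_of_block_bits) {
  return distance_code_bits + end_of_block_bits;  // X + Y
}

// Hypothetical helper (not Puffin's API): caching |max_distance_bits| (Z)
// bits is only safe when at least that many bits remain in the stream.
constexpr bool CanCacheDistanceBits(std::size_t bits_left,
                                    std::size_t max_distance_bits) {
  return bits_left >= max_distance_bits;
}
```

For example, with X = 2, Y = 3 and Z = 7, only 5 bits remain, so a decoder
that always caches 7 bits would try to read past the end of the stream.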

Bug: crbug.com/915559
Test: unittests
Test: puffin_corpus
Change-Id: I450204dc3c0e3f56d263aff47c420eba65f8453b
11 files changed
tree: 9b31a005af654c8d652f4aebebf840cfb42879c8
  1. scripts/
  2. src/
  3. .clang-format
  4. Android.bp
  5. BUILD.gn
  6. libpuffdiff.pc
  7. libpuffpatch.pc
  8. LICENSE
  9. Makefile
  10. OWNERS
  11. PRESUBMIT.cfg
  12. PREUPLOAD.cfg
  13. README.md
  14. README.version
README.md

Puffin

Source code for Puffin: A utility for deterministic DEFLATE recompression.

TODO(ahassani): Describe the directory structure and how-tos.

Glossary

  • Alphabet: A value that occurs in the input stream. It can be a literal [0..255], an end of block symbol [256], a length [257..285], or a distance [0..29].

  • Huffman code: A variable-length code representing the Huffman encoding of an alphabet. Huffman codes can be derived uniquely from a Huffman code length array.

  • Huffman code array: An array in which the index identifies a Huffman code and the element at that index is the corresponding alphabet. Throughout the code, Huffman code arrays are identified by vectors with postfix hcodes_.

  • Huffman reverse code array: An array in which the index identifies an alphabet and the element at that index is the Huffman code of that alphabet. Throughout the code, Huffman reverse code arrays are identified by vectors with postfix rcodes_.

  • Huffman code length: The number of bits in a Huffman code.

  • Huffman code length array: An array of Huffman code lengths, indexed by alphabet. Throughout the code, Huffman code length arrays are identified by vectors with postfix lens_.
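
To make the relationship between these arrays concrete, here is a minimal sketch that derives a reverse code array and a code array from a code length array, using the canonical code assignment described in RFC 1951, section 3.2.2. The struct, function, and constant names are hypothetical and chosen only to echo the lens/rcodes/hcodes naming above. The table layout (codes left-aligned to the maximum length, most significant bit first) is an assumption for illustration and may differ from how Puffin stores its tables.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical marker for table slots that no Huffman code maps to.
constexpr std::uint16_t kNoAlphabet = 0xFFFF;

// Hypothetical container; not Puffin's actual data structure.
struct HuffmanTables {
  std::size_t max_bits = 0;
  std::vector<std::uint16_t> rcodes;  // index: alphabet, value: Huffman code
  std::vector<std::uint16_t> hcodes;  // index: max_bits-wide code, value: alphabet
};

// Builds both tables from |lens| (index: alphabet, value: code length,
// 0 meaning "unused"). Returns false for lengths above 15 or for an
// over-subscribed set of lengths.
bool BuildTables(const std::vector<std::uint8_t>& lens, HuffmanTables* out) {
  HuffmanTables t;
  for (std::uint8_t len : lens) {
    if (len > 15) return false;  // deflate limits code lengths to 15 bits
    t.max_bits = std::max<std::size_t>(t.max_bits, len);
  }

  // Step 1: count how many codes exist for each length.
  std::vector<std::uint32_t> bl_count(t.max_bits + 1, 0);
  for (std::uint8_t len : lens) {
    if (len != 0) ++bl_count[len];
  }

  // Step 2: compute the numerically smallest code for each length.
  std::vector<std::uint32_t> next_code(t.max_bits + 1, 0);
  std::uint32_t code = 0;
  for (std::size_t bits = 1; bits <= t.max_bits; ++bits) {
    code = (code + bl_count[bits - 1]) << 1;
    next_code[bits] = code;
  }

  // Step 3: assign consecutive codes to alphabets in increasing order.
  t.rcodes.assign(lens.size(), 0);
  t.hcodes.assign(std::size_t{1} << t.max_bits, kNoAlphabet);
  for (std::size_t alphabet = 0; alphabet < lens.size(); ++alphabet) {
    const std::size_t len = lens[alphabet];
    if (len == 0) continue;
    const std::uint32_t c = next_code[len]++;
    if ((c >> len) != 0) return false;  // over-subscribed code lengths
    t.rcodes[alphabet] = static_cast<std::uint16_t>(c);
    // Fill every max_bits-wide slot whose leading |len| bits equal the code,
    // so a lookup with max_bits cached bits resolves to this alphabet.
    const std::size_t pad = t.max_bits - len;
    const std::size_t first = static_cast<std::size_t>(c) << pad;
    for (std::size_t i = 0; i < (std::size_t{1} << pad); ++i) {
      t.hcodes[first + i] = static_cast<std::uint16_t>(alphabet);
    }
  }
  *out = t;
  return true;
}
```

With the code lengths from the RFC 1951 example, {3, 3, 3, 3, 3, 2, 4, 4}, alphabet 5 receives the 2-bit code 00 and alphabet 7 receives the 4-bit code 1111. Because this sketch indexes hcodes by the maximum number of bits, a lookup needs that many bits to be available, which is the kind of read that fails when the stream ends early.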