Add missing invalid distance check

In the puffin spec, we defined the distance on the wire to be zero-based
(the deflate RFC uses [1..32768]). But I missed this in the implementation and
stored non-zero-based distances. However, this is not a bug, and since we have
not shipped it yet, it is still possible to change.

Other changes include adding proper checks for detecting invalid copy lengths
or distances. More unittests for invalid copy length/distance values are also
added. The lack of these checks was found by running puffin_fuzzer on
ClusterFuzz.

Bug: crbug.com/817733
Bug: crbug.com/817686
Test: unittests
Test: test_corpus.py

Change-Id: I38bf630904d7996a3c4f15919960517d26520987
4 files changed
tree: 49ce7bafe0791566fb81b49291df98d9e7c1c329
  1. scripts/
  2. src/
  3. .clang-format
  4. Android.bp
  5. libpuffdiff.pc
  6. libpuffpatch.pc
  7. LICENSE
  8. Makefile
  9. OWNERS
  10. PRESUBMIT.cfg
  11. PREUPLOAD.cfg
  12. puffin.gyp
  13. README.md
README.md

Puffin

Source code for Puffin: A utility for deterministic DEFLATE recompression.

TODO(ahassani): Describe the directory structure and how-tos.

Glossary

  • Alphabet: A value that occurs in the input stream. It can be a literal [0..255], an end-of-block symbol [256], a length [257..285], or a distance [0..29].

  • Huffman code: A variable-length code representing the Huffman encoding of an alphabet. Huffman codes can be uniquely constructed from a Huffman code length array.

  • Huffman code array: An array in which the index identifies a Huffman code and the element at that index is the corresponding alphabet. Throughout the code, Huffman code arrays are identified by vectors with the postfix hcodes_.

  • Huffman reverse code array: An array in which the index identifies an alphabet and the element at that index is the Huffman code of that alphabet. Throughout the code, Huffman reverse code arrays are identified by vectors with the postfix rcodes_.

  • Huffman code length: The number of bits in a Huffman code.

  • Huffman code length array: An array of Huffman code lengths indexed by alphabet. Throughout the code, Huffman code length arrays are identified by vectors with the postfix lens_.
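
To illustrate how a Huffman code length array determines the codes uniquely, here is a minimal sketch of the canonical code assignment described in the DEFLATE RFC (RFC 1951, section 3.2.2). It is not taken from Puffin's sources: the function name `BuildReverseCodes`, its signature, and the choice to return an empty vector on invalid input are all assumptions made for this example. The result is laid out like a Huffman reverse code array (indexed by alphabet).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not part of Puffin's API.
// lens[i] is the Huffman code length (in bits) of alphabet i; 0 means the
// alphabet is unused. Returns a vector where element i is the canonical
// Huffman code of alphabet i (0 for unused alphabets). Returns an empty
// vector if a length exceeds max_bits or the lengths over-subscribe the
// available code space.
std::vector<uint16_t> BuildReverseCodes(const std::vector<uint8_t>& lens,
                                        size_t max_bits = 15) {
  // Step 1: count how many alphabets use each code length.
  std::vector<uint64_t> bl_count(max_bits + 1, 0);
  for (uint8_t len : lens) {
    if (len > max_bits) return {};
    bl_count[len]++;
  }
  bl_count[0] = 0;  // Unused alphabets do not consume code space.

  // Step 2: compute the smallest code value for each code length.
  std::vector<uint64_t> next_code(max_bits + 1, 0);
  uint64_t code = 0;
  for (size_t bits = 1; bits <= max_bits; ++bits) {
    code = (code + bl_count[bits - 1]) << 1;
    next_code[bits] = code;
  }
  // Reject length sets that need more codes than max_bits can provide.
  if (code + bl_count[max_bits] > (uint64_t{1} << max_bits)) return {};

  // Step 3: hand out consecutive codes to alphabets of the same length,
  // in increasing alphabet order.
  std::vector<uint16_t> rcodes(lens.size(), 0);
  for (size_t i = 0; i < lens.size(); ++i) {
    if (lens[i] != 0) {
      rcodes[i] = static_cast<uint16_t>(next_code[lens[i]]++);
    }
  }
  return rcodes;
}
```

With the code lengths from the RFC's own example, `BuildReverseCodes({3, 3, 3, 3, 3, 2, 4, 4})` yields the codes `{2, 3, 4, 5, 6, 0, 14, 15}`, that is 010, 011, 100, 101, 110, 00, 1110 and 1111 when written with their respective bit lengths. A Huffman code array (indexed by code) could be derived by inverting this mapping, but because codes of different lengths can share the same integer value, that lookup also has to take the code length into account.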