lib/xmlparse.c: Detect billion laughs attack with isolated external parser

When parsing DTD content with code like ..

  XML_Parser parser = XML_ParserCreate(NULL);
  XML_Parser ext_parser = XML_ExternalEntityParserCreate(parser, NULL, NULL);
  enum XML_Status status = XML_Parse(ext_parser, doc, (int)strlen(doc), XML_TRUE);

.. there are 0 bytes accounted as direct input and all input from `doc` accounted
as indirect input.  Now function accountingGetCurrentAmplification cannot calculate
the current amplification ratio as "(direct + indirect) / direct", and it did refuse
to divide by 0 as one would expect, but it returned 1.0 for this case to indicate
no amplification over direct input.  As a result, billion laughs attacks from
DTD-only input were not detected with this isolated way of using an external parser.

The new approach is to assume direct input of length not 0 but 22 -- derived from
ghost input "<!ENTITY a SYSTEM 'b'>", the shortest possible way to include an external
DTD --, and do the usual "(direct + indirect) / direct" math with "direct := 22".

GitHub issue #839 has more details on this issue and its origin in ClusterFuzz
finding 66812.
1 file changed