[CVE-2022-40303] Fix integer overflows with XML_PARSE_HUGE

Also impose size limits when XML_PARSE_HUGE is set. Limit size of names
to XML_MAX_TEXT_LENGTH (10 million bytes) and other content to
XML_MAX_HUGE_LENGTH (1 billion bytes).

Move some of the length checks to the end of the respective loop to make
them strict.
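
The effect of checking the length at the end of each loop iteration can be illustrated with a minimal sketch. This is not the code from parser.c: `demo_scan` and the `DEMO_*` constants are hypothetical stand-ins that only reuse the limit values quoted above, and the sketch assumes `int` is at least 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the limits quoted in the commit message. */
#define DEMO_MAX_TEXT_LENGTH 10000000   /* cap for names: 10 million bytes */
#define DEMO_MAX_HUGE_LENGTH 1000000000 /* cap for other content: 1 billion bytes */

/*
 * Count the bytes of s up to (not including) the stop character or the
 * terminating NUL. Returns the length, or -1 once it exceeds max_len.
 */
static int demo_scan(const char *s, char stop, int max_len) {
    int len = 0;

    assert(s != NULL && max_len >= 0 && max_len <= DEMO_MAX_HUGE_LENGTH);
    while (s[len] != '\0' && s[len] != stop) {
        len++;
        /*
         * The check runs at the end of every iteration, so len is never
         * more than max_len + 1 and the int counter cannot wrap around.
         */
        if (len > max_len)
            return -1;
    }
    return len;
}
```

Calling `demo_scan("abc;def", ';', DEMO_MAX_TEXT_LENGTH)` returns 3, while `demo_scan("abcd", ';', 3)` returns -1 as soon as the fourth byte is counted. A check that runs only every few iterations would let the counter overshoot the limit before the error is raised.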

xmlParseEntityValue didn't have a length limitation at all. But without
XML_PARSE_HUGE, this should eventually trigger an error in xmlGROW.
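
For a value that is accumulated into a growing buffer, such as an entity value, the cap can be enforced on every append. The following is a minimal sketch under stated assumptions: the `demo_buf` type and `demo_buf_append` helper are hypothetical, and libxml2's own buffer handling and xmlGROW logic differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer; not a libxml2 type. */
typedef struct {
    char *data; /* NUL-terminated contents, or NULL while empty */
    int len;    /* number of bytes stored, excluding the NUL */
    int cap;    /* allocated size in bytes */
} demo_buf;

/*
 * Append one byte, refusing to grow past max_len bytes of content.
 * Returns 0 on success, -1 if the limit is hit or allocation fails.
 */
static int demo_buf_append(demo_buf *b, char c, int max_len) {
    assert(b != NULL && max_len >= 0);

    /* Enforce the length limit before touching the buffer. */
    if (b->len >= max_len)
        return -1;

    /* Make room for the new byte plus the terminating NUL. */
    if (b->len + 1 >= b->cap) {
        int new_cap;
        char *tmp;

        /* Doubling would overflow int: fail instead of wrapping. */
        if (b->cap > INT_MAX / 2)
            return -1;
        new_cap = (b->cap == 0) ? 16 : b->cap * 2;
        tmp = (char *) realloc(b->data, (size_t) new_cap);
        if (tmp == NULL)
            return -1;
        b->data = tmp;
        b->cap = new_cap;
    }

    b->data[b->len++] = c;
    b->data[b->len] = '\0';
    return 0;
}
```

Starting from a zero-initialized `demo_buf`, appends succeed until `max_len` bytes are stored and fail afterwards. The caller remains responsible for releasing `data` with `free`.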

Thanks to Maddie Stone working with Google Project Zero for the report!

Bug: http://b/260709824
Test: TreeHugger
Merged-In: I3c735c190e95210d16001f6466aa6c73f45188b3
Change-Id: I67dd2cb590e2af64b46cc88277ce78a1f279427d
1 file changed
tree: b4fc7761dcb6f131185a51b590ea8224beb6bfa5
  1. bakefile/
  2. doc/
  3. example/
  4. fuzz/
  5. include/
  6. macos/
  7. optim/
  8. os400/
  9. python/
  10. result/
  11. test/
  12. vms/
  13. VxWorks/
  14. win32/
  15. xstc/
  16. .gitattributes
  17. .gitignore
  18. .gitlab-ci.yml
  19. acinclude.m4
  20. Android.bp
  21. autogen.sh
  22. buf.c
  23. buf.h
  24. build_glob.py
  25. c14n.c
  26. catalog.c
  27. check-relaxng-test-suite.py
  28. check-relaxng-test-suite2.py
  29. check-xinclude-test-suite.py
  30. check-xml-test-suite.py
  31. check-xsddata-test-suite.py
  32. chvalid.c
  33. chvalid.def
  34. CleanSpec.mk
  35. CMakeLists.txt
  36. config.h
  37. config.h.cmake.in
  38. configure.ac
  39. Copyright
  40. dbgen.pl
  41. dbgenattr.pl
  42. debugXML.c
  43. dict.c
  44. DOCBparser.c
  45. enc.h
  46. encoding.c
  47. entities.c
  48. error.c
  49. genChRanges.py
  50. gentest.py
  51. genUnicode.py
  52. global.data
  53. globals.c
  54. hash.c
  55. HTMLparser.c
  56. HTMLtree.c
  57. legacy.c
  58. libxml-2.0-uninstalled.pc.in
  59. libxml-2.0.pc.in
  60. libxml.3
  61. libxml.h
  62. libxml.m4
  63. libxml.spec.in
  64. libxml2-config.cmake.cmake.in
  65. libxml2-config.cmake.in
  66. libxml2.doap
  67. libxml2.syms
  68. list.c
  69. Makefile.am
  70. Makefile.tests
  73. nanoftp.c
  74. nanohttp.c
  75. NEWS
  76. OWNERS
  77. parser.c
  78. parserInternals.c
  79. pattern.c
  80. post_update.sh
  81. README.md
  82. README.tests
  83. README.zOS
  84. regressions.py
  85. regressions.xml
  86. relaxng.c
  87. rngparser.c
  88. runsuite.c
  89. runtest.c
  90. runxmlconf.c
  91. save.h
  92. SAX.c
  93. SAX2.c
  94. schematron.c
  95. testapi.c
  96. testAutomata.c
  97. testC14N.c
  98. testchar.c
  99. testdict.c
  100. testdso.c
  101. testHTML.c
  102. testlimits.c
  103. testModule.c
  104. testOOM.c
  105. testOOMlib.c
  106. testOOMlib.h
  107. testReader.c
  108. testrecurse.c
  109. testRegexp.c
  110. testRelax.c
  111. testSAX.c
  112. testSchemas.c
  113. testThreads.c
  114. testURI.c
  115. testXPath.c
  116. threads.c
  117. timsort.h
  118. TODO
  120. tree.c
  121. trio.c
  122. trio.h
  123. triodef.h
  124. trionan.c
  125. trionan.h
  126. triop.h
  127. triostr.c
  128. triostr.h
  129. uri.c
  130. valid.c
  131. xinclude.c
  132. xlink.c
  133. xml2-config.1
  134. xml2-config.in
  135. xml2Conf.sh.in
  136. xmlcatalog.c
  137. xmlIO.c
  138. xmllint.c
  139. xmlmemory.c
  140. xmlmodule.c
  141. xmlreader.c
  142. xmlregexp.c
  143. xmlsave.c
  144. xmlschemas.c
  145. xmlschemastypes.c
  146. xmlstring.c
  147. xmlunicode.c
  148. xmlwriter.c
  149. xpath.c
  150. xpointer.c
  151. xzlib.c
  152. xzlib.h


libxml2 is an XML toolkit implemented in C, originally developed for the GNOME Project.

Full documentation is available at https://gitlab.gnome.org/GNOME/libxml2/-/wikis.

Bugs should be reported at https://gitlab.gnome.org/GNOME/libxml2/-/issues.

A mailing list xml@gnome.org is available. You can subscribe at https://mail.gnome.org/mailman/listinfo/xml. The list archive is at https://mail.gnome.org/archives/xml/.


License

This code is released under the MIT License, see the Copyright file.

Build instructions

libxml2 can be built with GNU Autotools, CMake, or several other build systems in platform-specific subdirectories.

Autotools (for POSIX systems like Linux, BSD, macOS)

If you build from a Git tree, you have to install Autotools and start by generating the configuration files with:

./autogen.sh

If you build from a source tarball, extract the archive with:

tar xf libxml2-xxx.tar.gz
cd libxml2-xxx

To see a list of build options:

./configure --help

Also see the INSTALL file for additional instructions. Then you can configure and build the library:

./configure [possible options]
make

Note that by default, no optimization options are used. You have to enable them manually, for example with:

CFLAGS='-O2 -fno-semantic-interposition' ./configure

Now you can run the test suite with:

make check

Please report test failures to the mailing list or bug tracker.

Then you can install the library:

make install

At that point you may have to rerun ldconfig or a similar utility to update your list of installed shared libs.

CMake (mainly for Windows)

Another option for compiling libxml is using CMake:

cmake -E tar xf libxml2-xxx.tar.gz
cmake -S libxml2-xxx -B libxml2-xxx-build [possible options]
cmake --build libxml2-xxx-build
cmake --install libxml2-xxx-build

Common CMake options include:

-D BUILD_SHARED_LIBS=OFF            # build static libraries
-D CMAKE_BUILD_TYPE=Release         # specify build type
-D CMAKE_INSTALL_PREFIX=/usr/local  # specify the install path
-D LIBXML2_WITH_ICONV=OFF           # disable iconv
-D LIBXML2_WITH_LZMA=OFF            # disable liblzma
-D LIBXML2_WITH_PYTHON=OFF          # disable Python
-D LIBXML2_WITH_ZLIB=OFF            # disable libz

You can also open the libxml source directory with its CMakeLists.txt directly in various IDEs such as CLion, QtCreator, or Visual Studio.


Dependencies

Libxml does not require any other libraries. A platform with somewhat recent POSIX support should be sufficient (please report any violation of this rule you may find).

However, if found at configuration time, libxml will detect and use the following libraries:

  • libz, a highly portable and widely available compression library.
  • liblzma, another compression library.
  • libiconv, a character encoding conversion library. The iconv function is part of POSIX.1-2001, so libiconv isn't required on modern UNIX-like systems like Linux, BSD or macOS.
  • ICU, a Unicode library. Mainly useful as an alternative to iconv on Windows. Unnecessary on most other systems.


Contributing

The current version of the code can be found in GNOME's GitLab at https://gitlab.gnome.org/GNOME/libxml2. The best way to get involved is by creating issues and merge requests on GitLab. Alternatively, you can start discussions and send patches to the mailing list. If you want to work with patches, please format them with git-format-patch and use plain text attachments.

All code must conform to C89 and pass the GitLab CI tests. Add regression tests if possible.


Authors

  • Daniel Veillard
  • Bjorn Reese
  • William Brack
  • Igor Zlatkovic for the Windows port
  • Aleksey Sanin
  • Nick Wellnhofer