blob: 92b312892214a24273fc96ac26f75b42f14c98e6 [file] [log] [blame]
erofs-utils
===========
userspace tools for EROFS filesystem, currently including:
mkfs.erofs filesystem formatter
erofsfuse FUSE daemon alternative
dump.erofs filesystem analyzer
fsck.erofs filesystem compatibility & consistency checker as well
as extractor
Dependencies & build
--------------------
lz4 1.8.0+ for lz4 enabled [2], lz4 1.9.3+ highly recommended [4][5].
XZ Utils 5.3.2alpha [6] or later versions for MicroLZMA enabled.
libfuse 2.6+ for erofsfuse enabled as a plus.
How to build with lz4-1.9.0 or above
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To build, you can run the following commands in order:
::
$ ./autogen.sh
$ ./configure
$ make
mkfs.erofs binary will be generated under mkfs folder.
* For lz4 < 1.9.2, there are some stability issues about
LZ4_compress_destSize(). (lz4hc isn't impacted) [3].
** For lz4 = 1.9.2, there is a noticeable regression about
LZ4_decompress_safe_partial() [5], which impacts erofsfuse
functionality for legacy images (without 0PADDING).
How to build with lz4-1.8.0~1.8.3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For these old lz4 versions, lz4hc algorithm cannot be supported
without lz4-static installed due to LZ4_compress_HC_destSize()
unstable api usage, which means lz4 will only be available if
lz4-static isn't found.
On Fedora, lz4-static can be installed by using:
yum install lz4-static.x86_64
However, it's still not recommended using those versions directly
since there are serious bugs in these compressors, see [2] [3] [4]
as well.
How to build with liblzma
~~~~~~~~~~~~~~~~~~~~~~~~~
In order to enable LZMA support, build with the following commands:
$ ./configure --enable-lzma
$ make
Additionally, you could specify liblzma build paths with:
--with-liblzma-incdir and --with-liblzma-libdir
mkfs.erofs
----------
two main kinds of EROFS images can be generated: (un)compressed.
- For uncompressed images, there will be none of compression
files in these images. However, it can decide whether the tail
block of a file should be inlined or not properly [1].
- For compressed images, it'll try to use specific algorithms
first for each regular file and see if storage space can be
saved with compression. If not, fallback to an uncompressed
file.
How to generate EROFS images (lz4 for Linux 5.3+, lzma for Linux 5.16+)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Currently lz4(hc) and lzma are available for compression, e.g.
$ mkfs.erofs -zlz4hc foo.erofs.img foo/
Or leave all files uncompressed as an option:
$ mkfs.erofs foo.erofs.img foo/
In addition, you could specify a higher compression level to get a
(slightly) better compression ratio than the default level, e.g.
$ mkfs.erofs -zlz4hc,12 foo.erofs.img foo/
Note that all compressors are still single-threaded for now, thus it
could take more time on the multiprocessor platform. Multi-threaded
approach is already in our TODO list.
How to generate EROFS big pcluster images (Linux 5.13+)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to get much better compression ratios (thus better sequential
read performance for common storage devices), big pluster feature has
been introduced since linux-5.13, which is not forward-compatible with
old kernels.
In details, -C is used to specify the maximum size of each big pcluster
in bytes, e.g.
$ mkfs.erofs -zlz4hc -C65536 foo.erofs.img foo/
So in that case, pcluster size can be 64KiB at most.
Note that large pcluster size can cause bad random performance, so
please evaluate carefully in advance. Or make your own per-(sub)file
compression strategies according to file access patterns if needed.
How to generate legacy EROFS images (Linux 4.19+)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Decompression inplace and compacted indexes have been introduced in
Linux upstream v5.3, which are not forward-compatible with older
kernels.
In order to generate _legacy_ EROFS images for old kernels,
consider adding "-E legacy-compress" to the command line, e.g.
$ mkfs.erofs -E legacy-compress -zlz4hc foo.erofs.img foo/
For Linux kernel >= 5.3, legacy EROFS images are _NOT recommended_
due to runtime performance loss compared with non-legacy images.
Obsoleted erofs.mkfs
~~~~~~~~~~~~~~~~~~~~
There is an original erofs.mkfs version developed by Li Guifu,
which was replaced by the new erofs-utils implementation.
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git -b obsoleted_mkfs
PLEASE NOTE: This version is highly _NOT recommended_ now.
erofsfuse
---------
erofsfuse is introduced to support EROFS format for various platforms
(including older linux kernels) and new on-disk features iteration.
It can also be used as an unpacking tool for unprivileged users.
It supports fixed-sized output decompression *without* any in-place
I/O or in-place decompression optimization. Also like the other FUSE
implementations, it suffers from most common performance issues (e.g.
significant I/O overhead, double caching, etc.)
Therefore, NEVER use it if performance is the top concern.
Note that extended attributes and ACLs aren't implemented yet due to
the current Android use case vs limited time. If you are interested,
contribution is, as always, welcome.
How to build erofsfuse
~~~~~~~~~~~~~~~~~~~~~~
It's disabled by default as an experimental feature for now due to
the extra libfuse dependency, to enable and build it manually:
$ ./configure --enable-fuse
$ make
erofsfuse binary will be generated under fuse folder.
How to mount an EROFS image with erofsfuse
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As the other FUSE implementations, it's quite simple to mount with
erofsfuse, e.g.:
$ erofsfuse foo.erofs.img foo/
Alternatively, to make it run in foreground (with debugging level 3):
$ erofsfuse -f --dbglevel=3 foo.erofs.img foo/
To debug erofsfuse (also automatically run in foreground):
$ erofsfuse -d foo.erofs.img foo/
To unmount an erofsfuse mountpoint as a non-root user:
$ fusermount -u foo/
dump.erofs and fsck.erofs
-------------------------
dump.erofs and fsck.erofs are used to analyze, check, and extract
EROFS filesystems. Note that extended attributes and ACLs are still
unsupported when extracting images with fsck.erofs.
Container images
----------------
EROFS filesystem is well-suitably used for container images with
advanced features like chunk-based files, multi-devices (blobs)
and new fscache backend for lazy pulling and cache management, etc.
For example, CNCF Dragonfly Nydus image service [7] introduces an
(EROFS-compatible) RAFS v6 image format to overcome flaws of the
current OCIv1 tgz images so that:
- Images can be downloaded on demand in chunks aka lazy pulling with
new fscache backend (5.19+) or userspace block devices (5.16+);
- Finer chunk-based content-addressable data deduplication to minimize
storage, transmission and memory footprints;
- Merged filesystem tree to remove all metadata of intermediate layers
as an option;
- (e)stargz, zstd::chunked and other formats can be converted and run
on the fly;
- and more.
Apart from Dragonfly Nydus, a native user daemon is planned to be added
to erofs-utils to parse EROFS, (e)stargz and zstd::chunked images from
network too as a real part of EROFS filesystem project.
Contribution
------------
erofs-utils is a part of EROFS filesystem project, feel free to send
patches or feedback to:
linux-erofs mailing list <linux-erofs@lists.ozlabs.org>
Comments
--------
[1] According to the EROFS on-disk format, the tail block of files
could be inlined aggressively with its metadata in order to reduce
the I/O overhead and save the storage space (called tail-packing).
[2] There was a bug until lz4-1.8.3, which can crash erofs-utils
randomly. Fortunately bugfix by our colleague Qiuyang Sun was
merged in lz4-1.9.0.
For more details, please refer to
https://github.com/lz4/lz4/commit/660d21272e4c8a0f49db5fc1e6853f08713dff82
[3] There were many bugfixes merged into lz4-1.9.2 for
LZ4_compress_destSize(), and I once ran into some crashs due to
those issues. * Again lz4hc is not affected. *
[LZ4_compress_destSize] Allow 2 more bytes of match length
https://github.com/lz4/lz4/commit/690009e2c2f9e5dcb0d40e7c0c40610ce6006eda
[LZ4_compress_destSize] Fix rare data corruption bug
https://github.com/lz4/lz4/commit/6bc6f836a18d1f8fd05c8fc2b42f1d800bc25de1
[LZ4_compress_destSize] Fix overflow condition
https://github.com/lz4/lz4/commit/13a2d9e34ffc4170720ce417c73e396d0ac1471a
[LZ4_compress_destSize] Fix off-by-one error in fix
https://github.com/lz4/lz4/commit/7c32101c655d93b61fc212dcd512b87119dd7333
[LZ4_compress_destSize] Fix off-by-one error
https://github.com/lz4/lz4/commit/d7cad81093cd805110291f84d64d385557d0ffba
since upstream lz4 doesn't have stable branch for old versions, it's
preferred to use latest upstream lz4 library (although some regressions
could happen since new features are also introduced to latest upstream
version as well) or backport all stable bugfixes to old stable versions,
e.g. our unofficial lz4 fork: https://github.com/erofs/lz4
[4] LZ4HC didn't compress long zeroed buffer properly with
LZ4_compress_HC_destSize()
https://github.com/lz4/lz4/issues/784
which has been resolved in
https://github.com/lz4/lz4/commit/e7fe105ac6ed02019d34731d2ba3aceb11b51bb1
and already included in lz4-1.9.3, see:
https://github.com/lz4/lz4/releases/tag/v1.9.3
[5] LZ4_decompress_safe_partial is broken in 1.9.2
https://github.com/lz4/lz4/issues/783
which is also resolved in lz4-1.9.3.
[6] https://tukaani.org/xz/xz-5.3.2alpha.tar.xz
[7] https://nydus.dev
https://github.com/dragonflyoss/image-service