blob: ce8e67c3059fb349f4c00f1eba0e95fc86e07f68 [file] [log] [blame]
.TH MINIJAIL0 "1" "March 2016" "Chromium OS" "User Commands"
.SH NAME
minijail0 \- sandbox a process
.SH SYNOPSIS
.B minijail0
[\fIOPTION\fR]... <\fIPROGRAM\fR> [\fIargs\fR]...
.SH DESCRIPTION
.PP
Runs PROGRAM inside a sandbox.
.TP
\fB-a <table>\fR
Run using the alternate syscall table named \fItable\fR. Only available on kernels
and architectures that support the \fBPR_ALT_SYSCALL\fR option of \fBprctl\fR(2).
.TP
\fB-b <src>[,<dest>[,<writeable>]]
Bind-mount \fIsrc\fR into the chroot directory at \fIdest\fR, optionally writeable.
The \fIsrc\fR path must be an absolute path.
If \fIdest\fR is not specified, it will default to \fIsrc\fR.
If the destination does not exist, it will be created as a file or directory
based on the \fIsrc\fR type (including missing parent directories).
.TP
\fB-c <caps>\fR
Restrict capabilities to \fIcaps\fR. When used in conjunction with \fB-u\fR and
\fB-g\fR, this allows a program to have access to only certain parts of root's
default privileges while running as another user and group ID altogether. Note
that these capabilities are not inherited by subprocesses of the process given
capabilities unless those subprocesses have POSIX file capabilities. See
\fBcapabilities\fR(7).
.TP
\fB-C <dir>\fR
Change root (using \fBchroot\fR(2)) to \fIdir\fR.
.TP
\fB-d\fR, \fB--mount-dev\fR
Create a new /dev mount with a minimal set of nodes. Implies \fB-v\fR.
Additional nodes can be bound with the \fB-b\fR or \fB-k\fR options.
The initial set of nodes are: full null tty urandom zero.
Symlinks are also created for: fd ptmx stderr stdin stdout.
.TP
\fB-e[file]\fR
Enter a new network namespace, or if \fIfile\fR is specified, enter an existing
network namespace specified by \fIfile\fR which is typically of the form
/proc/<pid>/ns/net.
.TP
\fB-f <file>\fR
Write the pid of the jailed process to \fIfile\fR.
.TP
\fB-g <group>\fR
Change groups to \fIgroup\fR, which may be either a group name or a numeric
group ID.
.TP
\fB-G\fR
Inherit all the supplementary groups of the user specified with \fB-u\fR. It
is an error to use this option without having specified a \fBuser name\fR to
\fB-u\fR.
.TP
\fB-h\fR
Print a help message.
.TP
\fB-H\fR
Print a help message detailing supported system call names for seccomp_filter.
(Other direct numbers may be specified if minijail0 is not in sync with the
host kernel or something like 32/64-bit compatibility issues exist.)
.TP
\fB-I\fR
Run \fIprogram\fR as init (pid 1) inside a new pid namespace (implies \fB-p\fR).
.TP
\fB-k <src>,<dest>,<type>[,<flags>[,<data>]]\fR
Mount \fIsrc\fR, a \fItype\fR filesystem, at \fIdest\fR. If a chroot or pivot
root is active, \fIdest\fR will automatically be placed below that path.
The \fIflags\fR field is optional and is a hex constant. These represent the
\fIMS_XXX\fR settings (see \fBmount\fR(2) for details). Their values can be
looked up in the sys/mount.h header file. \fI0xe\fR is a common value here
(a writable mount with nodev/nosuid/noexec bits set), and it is strongly
recommended that all mounts have these three bits set whenever possible.
The \fIdata\fR field is optional and is a comma delimited string (see
\fBmount\fR(2) for details). It is passed directly to the kernel, so all
fields here are filesystem specific.
If the mount is not a pseudo filesystem (e.g. proc or sysfs), \fIsrc\fR path
must be an absolute path (e.g. \fI/dev/sda1\fR and not \fIsda1\fR).
If the destination does not exist, it will be created as a directory (including
missing parent directories).
.TP
\fB-K[mode]\fR
Don't mark all existing mounts as MS_PRIVATE.
This option is \fBdangerous\fR as it negates most of the functionality of \fB-v\fR.
You very likely don't need this.
You may specify a mount propagation mode in which case, that will be used
instead of the default MS_PRIVATE. See the \fBmount\fR(2) man page and the
kernel docs \fIDocumentation/filesystems/sharedsubtree.txt\fR for more
technical details, but a brief guide:
.IP
\[bu] \fBslave\fR Changes in the parent mount namespace will propagate in, but
changes in this mount namespace will not propagate back out. This is usually
what people want to use.
.IP
\[bu] \fBprivate\fR No changes in either mount namespace will propagate.
This is the default behavior if you don't specify \fB-K\fR.
.IP
\[bu] \fBshared\fR Changes in the parent and this mount namespace will freely
propagate back and forth. This is not recommended.
.IP
\[bu] \fBunbindable\fR Mark all mounts as unbindable.
.TP
\fB-l\fR
Run inside a new IPC namespace. This option makes the program's System V IPC
namespace independent.
.TP
\fB-L\fR
Report blocked syscalls to syslog when using seccomp filter. This option will
force certain syscalls to be allowed in order to achieve this, depending on the
system.
.TP
\fB-m[<uid> <loweruid> <count>[,<uid> <loweruid> <count>]]\fR
Set the uid mapping of a user namespace (implies \fB-pU\fR). Same arguments as
\fBnewuidmap\fR(1). Multiple mappings should be separated by ','. With no mapping,
map the current uid to root inside the user namespace.
.TP
\fB-M[<uid> <loweruid> <count>[,<uid> <loweruid> <count>]]\fR
Set the gid mapping of a user namespace (implies \fB-pU\fR). Same arguments as
\fBnewgidmap\fR(1). Multiple mappings should be separated by ','. With no mapping,
map the current gid to root inside the user namespace.
.TP
\fB-n\fR
Set the process's \fIno_new_privs\fR bit. See \fBprctl\fR(2) and the kernel
source file \fIDocumentation/prctl/no_new_privs.txt\fR for more info.
.TP
\fB-N\fR
Run inside a new cgroup namespace. This option runs the program with a cgroup
view showing the program's cgroup as the root. This is only available on v4.6+
of the Linux kernel.
.TP
\fB-p\fR
Run inside a new PID namespace. This option will make it impossible for the
program to see or affect processes that are not its descendants. This implies
\fB-v\fR and \fB-r\fR, since otherwise the process can see outside its namespace
by inspecting /proc.
.TP
\fB-P <dir>\fR
Set \fIdir\fR as the root fs using \fBpivot_root\fR. Implies \fB-v\fR, not
compatible with \fB-C\fR.
.TP
\fB-r\fR
Remount /proc readonly. This implies \fB-v\fR. Remounting /proc readonly means
that even if the process has write access to a system config knob in /proc
(e.g., in /sys/kernel), it cannot change the value.
.TP
\fB-R <rlim_type>,<rlim_cur>,<rlim_max>\fR
Set an rlimit value, see \fBgetrlimit\fR(2) for allowed values. The string
\fBunlimited\fR can be used for \fBrlim_cur\fR and \fBrlim_max\fR, which will
translate to \fBRLIM_INFINITY\fR.
.TP
\fB-s\fR
Enable \fBseccomp\fR(2) in mode 1, which restricts the child process to a very
small set of system calls.
You most likely do not want to use this with the seccomp filter mode (\fB-S\fR)
as they are completely different (even though they have similar names).
.TP
\fB-S <arch-specific seccomp_filter policy file>\fR
Enable \fBseccomp\fR(2) in mode 13 which restricts the child process to a set of
system calls defined in the policy file. Note that system calls often change
names based on the architecture or mode. (uname -m is your friend.)
.TP
\fB-t[size]\fR
Mounts a tmpfs filesystem on /tmp. /tmp must exist already (e.g. in the chroot).
The filesystem has a default size of "64M", overridden with an optional
argument. It has standard /tmp permissions (1777), and is mounted
nodev/noexec/nosuid. Implies \fB-v\fR.
.TP
\fB-T <type>\fR
Assume binary's ELF linkage type is \fItype\fR, which must be either 'static'
or 'dynamic'. Either setting will prevent minijail0 from manually parsing the
ELF header to determine the type. Type 'static' can be used to avoid preload
hooking, and will force minijail0 to instead set everything up before the
program is executed. Type 'dynamic' will force minijail0 to preload
\fIlibminijailpreload.so\fR to setup hooks, but will fail on actually
statically-linked binaries.
.TP
\fB-u <user>\fR
Change users to \fIuser\fR, which may be either a user name or a numeric user
ID.
.TP
\fB-U\fR
Enter a new user namespace (implies \fB-p\fR).
.TP
\fB-v\fR
Run inside a new VFS namespace. This option makes the program's mountpoints
independent of the rest of the system's.
.TP
\fB-V <file>\fR
Enter the VFS namespace specified by \fIfile\fR.
.TP
\fB-w\fR
Create and join a new anonymous session keyring. See \fBkeyrings\fR(7) for more
details.
.TP
\fB-y\fR
Keep the current user's supplementary groups.
.TP
\fB-Y\fR
Synchronize seccomp filters across thread group.
.TP
\fB--uts[=hostname]\fR
Create a new UTS/hostname namespace, and optionally set the hostname in the new
namespace to \fIhostname\fR.
.TP
\fB--logging=<system>\fR
Use \fIsystem\fR as the logging system. \fIsystem\fR must be one of
\fBsyslog\fR (the default) or \fBstderr\fR.
.TP
\fB--profile <profile>\fR
Choose from one of the available sandboxing profiles, which are simple way to
get a standardized environment. See the
.BR "SANDBOXING PROFILES"
section below for the full list of supported values for \fIprofile\fR.
.SH SANDBOXING PROFILES
The following sandboxing profiles are supported:
.TP
\fBminimalistic-mountns\fR
Set up a minimalistic mount namespace. Equivalent to \fB-v -P /var/empty
-b /,/ -b /proc,/proc -t -r --mount-dev\fR.
.SH IMPLEMENTATION
This program is broken up into two parts: \fBminijail0\fR (the frontend) and a helper
library called \fBlibminijailpreload\fR. Some jailings can only be achieved from
the process to which they will actually apply - specifically capability use
(since capabilities are not inherited to an exec'd process unless the exec'd
process has POSIX file capabilities), seccomp (since we can't exec() once we're
seccomp'd), and ptrace-disable (which is always cleared on exec()).
To this end, \fBlibminijailpreload\fR is forcibly loaded into all
dynamically-linked target programs if any of these restrictions are in effect;
we pass the specific restrictions in an environment variable which the preloaded
library looks for. The forcibly-loaded library then applies the restrictions
to the newly-loaded program.
.SH AUTHOR
The Chromium OS Authors <chromiumos-dev@chromium.org>
.SH COPYRIGHT
Copyright \(co 2011 The Chromium OS Authors
License BSD-like.
.SH "SEE ALSO"
\fBlibminijail.h\fR \fBminijail0\fR(5)