# System call interception
LXD supports intercepting some specific system calls from unprivileged
containers and if they're considered to be safe, will executed with
elevated privileges on the host.

Doing so comes with a performance impact for the syscall in question and
will cause some work for LXD to evaluate the request and if allowed,
process it with elevated privileges.

## Available system calls
### mknod / mknodat
The `mknod` and `mknodat` system calls can be used to create a variety of special files.

Most commonly inside containers, they may be called to create block or character devices.
Creating such devices isn't allowed in unprivileged containers as this
is a very easy way to escalate privileges by allowing direct write
access to resources like disks or memory.

But there are files which are safe to create. For those, intercepting
this syscall may unblock some specific workloads and allow them to run
inside an unprivileged containers.

The devices which are currently allowed are:

 - overlayfs whiteout (char 0:0)
 - /dev/console (char 5:1)
 - /dev/full (char 1:7)
 - /dev/null (char 1:3)
 - /dev/random (char 1:8)
 - /dev/tty (char 5:0)
 - /dev/urandom (char 1:9)
 - /dev/zero (char 1:5)

All file types other than character devices are currently sent to the
kernel as usual, so enabling this feature doesn't change their behavior
at all.

This can be enabled by setting `security.syscalls.intercept.mknod` to `true`.

### setxattr
The `setxattr` system call is used to set extended attributes on files.

The attributes which are handled by this currently are:

 - trusted.overlay.opaque (overlayfs directory whiteout)

Note that because the mediation must happen on a number of character
strings, there is no easy way at present to only intercept the few
attributes we care about. As we only allow the attributes above, this
may result in breakage for other attributes that would have been
previously allowed by the kernel.

This can be enabled by setting `security.syscalls.intercept.setxattr` to `true`.
