Blame man2/seccomp.2

Packit 7cfc04
.\" Copyright (C) 2014 Kees Cook <keescook@chromium.org>
.\" and Copyright (C) 2012 Will Drewry <wad@chromium.org>
.\" and Copyright (C) 2008, 2014,2017 Michael Kerrisk <mtk.manpages@gmail.com>
.\" and Copyright (C) 2017 Tyler Hicks <tyhicks@canonical.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH SECCOMP 2 2018-02-02 "Linux" "Linux Programmer's Manual"
.SH NAME
seccomp \- operate on Secure Computing state of the process
.SH SYNOPSIS
.nf
.B #include <linux/seccomp.h>
.B #include <linux/filter.h>
.B #include <linux/audit.h>
.B #include <linux/signal.h>
.B #include <sys/ptrace.h>
.\" Kees Cook noted: Anything that uses SECCOMP_RET_TRACE returns will
.\"                  need <sys/ptrace.h>
.PP
.BI "int seccomp(unsigned int " operation ", unsigned int " flags \
", void *" args );
.fi
.SH DESCRIPTION
The
.BR seccomp ()
system call operates on the Secure Computing (seccomp) state of the
calling process.
.PP
Currently, Linux supports the following
.IR operation
values:
.TP
.BR SECCOMP_SET_MODE_STRICT
The only system calls that the calling thread is permitted to make are
.BR read (2),
.BR write (2),
.BR _exit (2)
(but not
.BR exit_group (2)),
and
.BR sigreturn (2).
Other system calls result in the delivery of a
.BR SIGKILL
signal.
Strict secure computing mode is useful for number-crunching
applications that may need to execute untrusted byte code, perhaps
obtained by reading from a pipe or socket.
.IP
Note that although the calling thread can no longer call
.BR sigprocmask (2),
it can use
.BR sigreturn (2)
to block all signals apart from
.BR SIGKILL
and
.BR SIGSTOP .
This means that
.BR alarm (2)
(for example) is not sufficient for restricting the process's execution time.
Instead, to reliably terminate the process,
.BR SIGKILL
must be used.
This can be done by using
.BR timer_create (2)
with
.BR SIGEV_SIGNAL
and
.IR sigev_signo
set to
.BR SIGKILL ,
or by using
.BR setrlimit (2)
to set the hard limit for
.BR RLIMIT_CPU .
.IP
This operation is available only if the kernel is configured with
.BR CONFIG_SECCOMP
enabled.
.IP
The value of
.IR flags
must be 0, and
.IR args
must be NULL.
.IP
This operation is functionally identical to the call:
.IP
    prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);
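.IP
As a rough illustration of the points above, the following sketch runs a
workload in a child process that sets a hard
.BR RLIMIT_CPU
limit and then enters strict mode.
The helper names
.IR run_strict ()
and
.IR demo_work (),
and the exit status 77, are made up for this example;
the raw
.BR syscall (2)
interface is used on the assumption that the C library provides no
.BR seccomp ()
wrapper.
.IP
.in +4n
.EX
```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Example workload: in strict mode only read(2) and write(2)
   (plus exiting) remain usable. */
void
demo_work(void)
{
    static const char msg[] = "hello from strict mode";
    ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);

    (void) n;
}

/* Hypothetical helper: run work() in a child confined by strict
   mode and a hard CPU-time limit.  Returns the child's exit status,
   77 if the confinement could not be set up, or -1 on error or if
   the child was killed by a signal. */
int
run_strict(void (*work)(void), rlim_t cpu_seconds)
{
    struct rlimit rl = { .rlim_cur = cpu_seconds, .rlim_max = cpu_seconds };
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (setrlimit(RLIMIT_CPU, &rl) == -1 ||
            syscall(SYS_seccomp, SECCOMP_SET_MODE_STRICT, 0, NULL) == -1)
            _exit(77);          /* Not yet confined: _exit() is safe */
        work();
        /* The C library's _exit() normally calls exit_group(2), which
           strict mode forbids, so issue the raw exit call instead. */
        syscall(SYS_exit, 0);
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```
.EE
.in
.IP
A call such as
.I "run_strict(demo_work, 5)"
would be expected to return 0 where strict mode is available,
and 77 where it is not (for example, inside some containers).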
.TP
.BR SECCOMP_SET_MODE_FILTER
The system calls allowed are defined by a pointer to a Berkeley Packet
Filter (BPF) passed via
.IR args .
This argument is a pointer to a
.IR "struct\ sock_fprog" ;
it can be designed to filter arbitrary system calls and system call
arguments.
If the filter is invalid,
.BR seccomp ()
fails, returning
.BR EINVAL
in
.IR errno .
.IP
If
.BR fork (2)
or
.BR clone (2)
is allowed by the filter, any child processes will be constrained to
the same system call filters as the parent.
If
.BR execve (2)
is allowed,
the existing filters will be preserved across a call to
.BR execve (2).
.IP
In order to use the
.BR SECCOMP_SET_MODE_FILTER
operation, either the caller must have the
.BR CAP_SYS_ADMIN
capability in its user namespace, or the thread must already have the
.I no_new_privs
bit set.
If that bit was not already set by an ancestor of this thread,
the thread must make the following call:
.IP
    prctl(PR_SET_NO_NEW_PRIVS, 1);
.IP
Otherwise, the
.BR SECCOMP_SET_MODE_FILTER
operation fails and returns
.BR EACCES
in
.IR errno .
This requirement ensures that an unprivileged process cannot apply
a malicious filter and then invoke a set-user-ID or
other privileged program using
.BR execve (2),
thus potentially compromising that program.
(Such a malicious filter might, for example, cause an attempt to use
.BR setuid (2)
to set the caller's user IDs to nonzero values to instead
return 0 without actually making the system call.
Thus, the program might be tricked into retaining superuser privileges
in circumstances where it is possible to influence it to do
dangerous things because it did not actually drop privileges.)
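.IP
The sequence described above (set
.IR no_new_privs ,
then attach a filter) might be sketched as follows.
The helper
.IR install_filter ()
and the one-instruction
.I allow_all
program are illustrative names only; a real filter would inspect the
system call data before returning a verdict.
.IP
.in +4n
.EX
```c
#define _GNU_SOURCE
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* A trivial filter that allows every system call. */
struct sock_filter allow_all[] = {
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Hypothetical helper: set no_new_privs, then attach the program to
   the calling thread.  Returns 0 on success, or -1 with errno set
   (EACCES would indicate the privilege check described above). */
int
install_filter(struct sock_filter *insns, unsigned short count)
{
    struct sock_fprog prog = { .len = count, .filter = insns };

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1)
        return -1;
    return (int) syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER, 0, &prog);
}
```
.EE
.in
.IP
Note that a filter, once attached, cannot be removed, so even this
harmless program stays in place for the lifetime of the thread.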
.IP
If
.BR prctl (2)
or
.BR seccomp ()
is allowed by the attached filter, further filters may be added.
This will increase evaluation time, but allows for further reduction of
the attack surface during execution of a thread.
.IP
The
.BR SECCOMP_SET_MODE_FILTER
operation is available only if the kernel is configured with
.BR CONFIG_SECCOMP_FILTER
enabled.
.IP
When
.IR flags
is 0, this operation is functionally identical to the call:
.IP
    prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, args);
.IP
The recognized
.IR flags
are:
.RS
.TP
.BR SECCOMP_FILTER_FLAG_TSYNC
When adding a new filter, synchronize all other threads of the calling
process to the same seccomp filter tree.
A "filter tree" is the ordered list of filters attached to a thread.
(Attaching identical filters in separate
.BR seccomp ()
calls results in different filters from this perspective.)
.IP
If any thread cannot synchronize to the same filter tree,
the call will not attach the new seccomp filter,
and will fail, returning the first thread ID found that cannot synchronize.
Synchronization will fail if another thread in the same process is in
.BR SECCOMP_MODE_STRICT
or if it has attached new seccomp filters to itself,
diverging from the calling thread's filter tree.
.TP
.BR SECCOMP_FILTER_FLAG_LOG " (since Linux 4.14)"
.\" commit e66a39977985b1e69e17c4042cb290768eca9b02
All filter return actions except
.BR SECCOMP_RET_ALLOW
should be logged.
An administrator may override this filter flag by preventing specific
actions from being logged via the
.IR /proc/sys/kernel/seccomp/actions_logged
file.
.RE
.TP
.BR SECCOMP_GET_ACTION_AVAIL " (since Linux 4.14)"
.\" commit d612b1fd8010d0d67b5287fe146b8b55bcbb8655
Test to see if an action is supported by the kernel.
This operation is helpful to confirm that the kernel knows
of a more recently added filter return action
since the kernel treats all unknown actions as
.BR SECCOMP_RET_KILL_PROCESS .
.IP
The value of
.IR flags
must be 0, and
.IR args
must be a pointer to an unsigned 32-bit filter return action.
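.IP
A probe along these lines might look like the sketch below.
The helper name
.IR action_avail ()
and its three-way return convention are assumptions made for the example;
the fallback definition is only there for older kernel headers.
.IP
.in +4n
.EX
```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/seccomp.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SECCOMP_GET_ACTION_AVAIL
#define SECCOMP_GET_ACTION_AVAIL 2      /* For pre-4.14 kernel headers */
#endif

/* Hypothetical helper.  Returns 1 if the kernel supports the action,
   0 if it reports EOPNOTSUPP, and -1 if the query itself failed
   (for example, because the operation is unknown to the kernel). */
int
action_avail(uint32_t action)
{
    if (syscall(SYS_seccomp, SECCOMP_GET_ACTION_AVAIL, 0, &action) == 0)
        return 1;
    return (errno == EOPNOTSUPP) ? 0 : -1;
}
```
.EE
.in
.IP
A program could call, say,
.I "action_avail(SECCOMP_RET_LOG)"
before building a filter that relies on that action.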
.SS Filters
When adding filters via
.BR SECCOMP_SET_MODE_FILTER ,
.IR args
points to a filter program:
.PP
.in +4n
.EX
struct sock_fprog {
    unsigned short      len;    /* Number of BPF instructions */
    struct sock_filter *filter; /* Pointer to array of
                                   BPF instructions */
};
.EE
.in
.PP
Each program must contain one or more BPF instructions:
.PP
.in +4n
.EX
struct sock_filter {            /* Filter block */
    __u16 code;                 /* Actual filter code */
    __u8  jt;                   /* Jump true */
    __u8  jf;                   /* Jump false */
    __u32 k;                    /* Generic multiuse field */
};
.EE
.in
.PP
When executing the instructions, the BPF program operates on the
system call information made available (i.e., use the
.BR BPF_ABS
addressing mode) as a (read-only)
.\" Quoting Kees Cook:
.\"     If BPF even allows changing the data, it's not copied back to
.\"     the syscall when it runs. Anything wanting to do things like
.\"     that would need to use ptrace to catch the call and directly
.\"     modify the registers before continuing with the call.
buffer of the following form:
.PP
.in +4n
.EX
struct seccomp_data {
    int   nr;                   /* System call number */
    __u32 arch;                 /* AUDIT_ARCH_* value
                                   (see <linux/audit.h>) */
    __u64 instruction_pointer;  /* CPU instruction pointer */
    __u64 args[6];              /* Up to 6 system call arguments */
};
.EE
.in
.PP
Because numbering of system calls varies between architectures and
some architectures (e.g., x86-64) allow user-space code to use
the calling conventions of multiple architectures, it is usually
necessary to verify the value of the
.IR arch
field.
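.PP
A minimal sketch of such an architecture check is shown below.
It assumes an x86-64 target (substitute the appropriate
.B AUDIT_ARCH_*
value elsewhere), and the array name
.I arch_check
is illustrative.
.PP
.in +4n
.EX
```c
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>

/* Calls made through any other architecture's convention kill the
   thread; everything else is allowed. */
struct sock_filter arch_check[] = {
    /* [0] Load seccomp_data.arch into the accumulator */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
    /* [1] If it equals AUDIT_ARCH_X86_64, skip one instruction */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 1, 0),
    /* [2] Unexpected architecture */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
    /* [3] Expected architecture; per-call checks would go here */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};
```
.EE
.in
.PP
The jump offsets
.RI ( jt
and
.IR jf )
are counted from the instruction that follows the jump.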
.PP
It is strongly recommended to use a whitelisting approach whenever
possible because such an approach is more robust and simple.
A blacklist will have to be updated whenever a potentially
dangerous system call is added (or a dangerous flag or option if those
are blacklisted), and it is often possible to alter the
representation of a value without altering its meaning, leading to
a blacklist bypass.
See also
.IR Caveats
below.
.PP
The
.IR arch
field is not unique for all calling conventions.
The x86-64 ABI and the x32 ABI both use
.BR AUDIT_ARCH_X86_64
as
.IR arch ,
and they run on the same processors.
Instead, the mask
.BR __X32_SYSCALL_BIT
is used on the system call number to tell the two ABIs apart.
.\" As noted by Dave Drysdale in a note at the end of
.\" https://lwn.net/Articles/604515/
.\"     One additional detail to point out for the x32 ABI case:
.\"     the syscall number gets a high bit set (__X32_SYSCALL_BIT),
.\"     to mark it as an x32 call.
.\"
.\"     If x32 support is included in the kernel, then __SYSCALL_MASK
.\"     will have a value that is not all-ones, and this will trigger
.\"     an extra instruction in system_call to mask off the extra bit,
.\"     so that the syscall table indexing still works.
.PP
This means that in order to create a seccomp-based
blacklist for system calls performed through the x86-64 ABI,
it is necessary to not only check that
.IR arch
equals
.BR AUDIT_ARCH_X86_64 ,
but also to explicitly reject all system calls that contain
.BR __X32_SYSCALL_BIT
in
.IR nr .
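.PP
The two checks might be combined as in the following sketch.
The names
.I x86_64_only
and
.IR toy_run ()
are invented for this example.
.IR toy_run ()
is a toy user-space evaluator covering only the opcodes used here,
included so that the jump offsets can be checked by hand;
it is not how the kernel runs filters.
.PP
.in +4n
.EX
```c
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef __X32_SYSCALL_BIT
#define __X32_SYSCALL_BIT 0x40000000    /* Value from the x86 headers */
#endif

struct sock_filter x86_64_only[] = {
    /* [0] A = arch */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
    /* [1] arch is not AUDIT_ARCH_X86_64: jump to [5] */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 0, 3),
    /* [2] A = nr */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* [3] nr has __X32_SYSCALL_BIT set: jump to [5] */
    BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, __X32_SYSCALL_BIT, 1, 0),
    /* [4] Native x86-64 call; blacklist checks on nr would go here */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    /* [5] Wrong architecture or x32 call */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
};

/* Toy evaluator: returns the filter's verdict for the given data. */
uint32_t
toy_run(const struct sock_filter *f, size_t len,
        const struct seccomp_data *sd)
{
    uint32_t a = 0;

    for (size_t pc = 0; pc < len; pc++) {
        const struct sock_filter *i = &f[pc];

        switch (i->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            memcpy(&a, (const char *) sd + i->k, sizeof(a));
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case BPF_JMP | BPF_JSET | BPF_K:
            pc += (a & i->k) ? i->jt : i->jf;
            break;
        case BPF_RET | BPF_K:
            return i->k;
        default:                /* Opcode outside this toy's subset */
            return SECCOMP_RET_KILL;
        }
    }
    return SECCOMP_RET_KILL;    /* Fell off the end of the program */
}
```
.EE
.in
.PP
Feeding the evaluator a
.I seccomp_data
whose
.I nr
has
.B __X32_SYSCALL_BIT
set should yield
.BR SECCOMP_RET_KILL ,
while the same call number without the bit yields
.BR SECCOMP_RET_ALLOW .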
.PP
The
.I instruction_pointer
field provides the address of the machine-language instruction that
performed the system call.
This might be useful in conjunction with the use of
.I /proc/[pid]/maps
to perform checks based on which region (mapping) of the program
made the system call.
(Probably, it is wise to lock down the
.BR mmap (2)
and
.BR mprotect (2)
system calls to prevent the program from subverting such checks.)
.PP
When checking values from
.IR args
against a blacklist, keep in mind that arguments are often
silently truncated before being processed, but after the seccomp check.
For example, this happens if the i386 ABI is used on an
x86-64 kernel: although the kernel will normally not look beyond
the 32 lowest bits of the arguments, the values of the full
64-bit registers will be present in the seccomp data.
A less surprising example is that if the x86-64 ABI is used to perform
a system call that takes an argument of type
.IR int ,
the more-significant half of the argument register is ignored by
the system call, but visible in the seccomp data.
.PP
A seccomp filter returns a 32-bit value consisting of two parts:
the most significant 16 bits
(corresponding to the mask defined by the constant
.BR SECCOMP_RET_ACTION_FULL )
contain one of the "action" values listed below;
the least significant 16 bits (defined by the constant
.BR SECCOMP_RET_DATA )
are "data" to be associated with this return value.
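.PP
The split between action and data can be expressed with a few small
helpers, sketched below.
The names
.IR ret_make (),
.IR ret_action (),
and
.IR ret_data ()
are hypothetical; the fallback definition is for older kernel headers.
.PP
.in +4n
.EX
```c
#include <errno.h>              /* EPERM, for the usage shown below */
#include <linux/seccomp.h>
#include <stdint.h>

#ifndef SECCOMP_RET_ACTION_FULL
#define SECCOMP_RET_ACTION_FULL 0xffff0000U   /* For pre-4.14 headers */
#endif

/* Combine an action with its 16 bits of data. */
uint32_t
ret_make(uint32_t action, uint16_t data)
{
    return (action & SECCOMP_RET_ACTION_FULL) | (data & SECCOMP_RET_DATA);
}

/* Extract the action (most significant 16 bits). */
uint32_t
ret_action(uint32_t ret)
{
    return ret & SECCOMP_RET_ACTION_FULL;
}

/* Extract the data (least significant 16 bits). */
uint16_t
ret_data(uint32_t ret)
{
    return ret & SECCOMP_RET_DATA;
}
```
.EE
.in
.PP
For instance,
.I "ret_make(SECCOMP_RET_ERRNO, EPERM)"
gives the value a filter would return to make a system call fail with
.BR EPERM .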
.PP
If multiple filters exist, they are \fIall\fP executed,
in reverse order of their addition to the filter tree\(emthat is,
the most recently installed filter is executed first.
(Note that all filters will be called
even if one of the earlier filters returns
.BR SECCOMP_RET_KILL .
This is done to simplify the kernel code and to provide a
tiny speed-up in the execution of sets of filters by
avoiding a check for this uncommon case.)
.\" From an Aug 2015 conversation with Kees Cook where I asked why *all*
.\" filters are applied even if one of the early filters returns
.\" SECCOMP_RET_KILL:
.\"
.\"     It's just because it would be an optimization that would only speed up
.\"     the RET_KILL case, but it's the uncommon one and the one that doesn't
.\"     benefit meaningfully from such a change (you need to kill the process
.\"     really quickly?). We would speed up killing a program at the (albeit
.\"     tiny) expense to all other filtered programs. Best to keep the filter
.\"     execution logic clear, simple, and as fast as possible for all
.\"     filters.
The return value for the evaluation of a given system call is the first-seen
action value of highest precedence (along with its accompanying data)
returned by execution of all of the filters.
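.PP
The selection rule can be modelled in user space as shown below.
This is only a sketch:
.IR combine_results ()
and
.IR action_rank ()
are made-up names, and the signed comparison relies on the numeric values
that
.I <linux/seccomp.h>
assigns to the actions (smaller signed value, higher precedence).
.PP
.in +4n
.EX
```c
#include <linux/seccomp.h>
#include <stddef.h>
#include <stdint.h>

#ifndef SECCOMP_RET_ACTION_FULL
#define SECCOMP_RET_ACTION_FULL 0xffff0000U
#endif
#ifndef SECCOMP_RET_KILL_PROCESS
#define SECCOMP_RET_KILL_PROCESS 0x80000000U
#endif

/* Action bits viewed as a signed number: SECCOMP_RET_KILL_PROCESS
   becomes negative and so ranks ahead of every other action. */
static int32_t
action_rank(uint32_t ret)
{
    return (int32_t) (ret & SECCOMP_RET_ACTION_FULL);
}

/* results[] holds each filter's return value in execution order
   (most recently installed filter first). */
uint32_t
combine_results(const uint32_t *results, size_t n)
{
    uint32_t final = SECCOMP_RET_ALLOW;

    for (size_t i = 0; i < n; i++)
        if (action_rank(results[i]) < action_rank(final))
            final = results[i];     /* Strict '<' keeps the first seen */
    return final;
}
```
.EE
.in
.PP
With results of "allow", "errno 1", and "errno 2" (in that order),
the model selects "errno 1": the action has higher precedence than
"allow", and the first value seen with that action wins.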
.PP
In decreasing order of precedence,
the action values that may be returned by a seccomp filter are:
.TP
.BR SECCOMP_RET_KILL_PROCESS " (since Linux 4.14)"
.\" commit 4d3b0b05aae9ee9ce0970dc4cc0fb3fad5e85945
.\" commit 0466bdb99e8744bc9befa8d62a317f0fd7fd7421
This value results in immediate termination of the process,
with a core dump.
The system call is not executed.
By contrast with
.BR SECCOMP_RET_KILL_THREAD
below, all threads in the thread group are terminated.
(For a discussion of thread groups, see the description of the
.BR CLONE_THREAD
flag in
.BR clone (2).)
.IP
The process terminates
.I "as though"
killed by a
.B SIGSYS
signal.
Even if a signal handler has been registered for
.BR SIGSYS ,
the handler will be ignored in this case and the process always terminates.
To a parent process that is waiting on this process (using
.BR waitpid (2)
or similar), the returned
.I wstatus
will indicate that its child was terminated as though by a
.BR SIGSYS
signal.
.TP
.BR SECCOMP_RET_KILL_THREAD " (or " SECCOMP_RET_KILL )
This value results in immediate termination of the thread
that made the system call.
The system call is not executed.
Other threads in the same thread group will continue to execute.
.IP
The thread terminates
.I "as though"
killed by a
.B SIGSYS
signal.
See
.BR SECCOMP_RET_KILL_PROCESS
above.
.IP
.\" See these commits:
.\" seccomp: dump core when using SECCOMP_RET_KILL
.\"    (b25e67161c295c98acda92123b2dd1e7d8642901)
.\" seccomp: Only dump core when single-threaded
.\"    (d7276e321ff8a53106a59c85ca46d03e34288893)
Before Linux 4.11,
any process terminated in this way would not trigger a core dump
(even though
.B SIGSYS
is documented in
.BR signal (7)
as having a default action of termination with a core dump).
Since Linux 4.11,
a single-threaded process will dump core if terminated in this way.
.IP
With the addition of
.BR SECCOMP_RET_KILL_PROCESS
in Linux 4.14,
.BR SECCOMP_RET_KILL_THREAD
was added as a synonym for
.BR SECCOMP_RET_KILL ,
in order to more clearly distinguish the two actions.
.TP
.BR SECCOMP_RET_TRAP
This value results in the kernel sending a thread-directed
.BR SIGSYS
signal to the triggering thread.
(The system call is not executed.)
Various fields will be set in the
.I siginfo_t
structure (see
.BR sigaction (2))
associated with the signal:
.RS
.IP * 3
.I si_signo
will contain
.BR SIGSYS .
.IP *
.IR si_call_addr
will show the address of the system call instruction.
.IP *
.IR si_syscall
and
.IR si_arch
will indicate which system call was attempted.
.IP *
.I si_code
will contain
.BR SYS_SECCOMP .
.IP *
.I si_errno
will contain the
.BR SECCOMP_RET_DATA
portion of the filter return value.
.RE
.IP
The program counter will be as though the system call happened
(i.e., the program counter will not point to the system call instruction).
The return value register will contain an architecture\-dependent value;
if resuming execution, set it to something appropriate for the system call.
(The architecture dependency is because replacing it with
.BR ENOSYS
could overwrite some useful information.)
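.IP
A handler that records these fields might be sketched as follows.
The structure
.I trap_info
and the function names are invented for the example, and the sketch
deliberately leaves the return value register alone, since adjusting it
is architecture-dependent.
.IP
.in +4n
.EX
```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>

#ifndef SYS_SECCOMP
#define SYS_SECCOMP 1           /* si_code for seccomp-generated SIGSYS */
#endif

/* Filled in by the handler; the layout is illustrative. */
struct trap_info {
    volatile sig_atomic_t seen; /* Set once a seccomp trap arrives */
    int          nr;            /* From si_syscall */
    unsigned int arch;          /* From si_arch */
    void        *call_addr;     /* From si_call_addr */
    int          data;          /* From si_errno (SECCOMP_RET_DATA) */
};
struct trap_info last_trap;

static void
sigsys_handler(int sig, siginfo_t *si, void *ucontext)
{
    (void) sig;
    (void) ucontext;
    if (si->si_code != SYS_SECCOMP)
        return;                 /* SIGSYS from some other source */
    last_trap.nr = si->si_syscall;
    last_trap.arch = si->si_arch;
    last_trap.call_addr = si->si_call_addr;
    last_trap.data = si->si_errno;
    last_trap.seen = 1;
}

/* Returns 0 on success, or -1 with errno set on failure. */
int
install_sigsys_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigsys_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSYS, &sa, NULL);
}
```
.EE
.in
.IP
The handler must be installed before the filter is attached, and
.BR sigaction (2)
and
.BR sigreturn (2)
handling must not themselves be blocked by the filter.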
.TP
.BR SECCOMP_RET_ERRNO
This value results in the
.B SECCOMP_RET_DATA
portion of the filter's return value being passed to user space as the
.IR errno
value without executing the system call.
.TP
.BR SECCOMP_RET_TRACE
When returned, this value will cause the kernel to attempt to notify a
.BR ptrace (2)-based
tracer prior to executing the system call.
If there is no tracer present,
the system call is not executed and returns a failure status with
.I errno
set to
.BR ENOSYS .
.IP
A tracer will be notified if it requests
.BR PTRACE_O_TRACESECCOMP
using
.IR ptrace(PTRACE_SETOPTIONS) .
The tracer will be notified of a
.BR PTRACE_EVENT_SECCOMP
and the
.BR SECCOMP_RET_DATA
portion of the filter's return value will be available to the tracer via
.BR PTRACE_GETEVENTMSG .
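.IP
On the tracer side, these steps might be wrapped as in the sketch below.
The three helper names are hypothetical; the test for a seccomp stop
follows the status encoding documented in
.BR ptrace (2).
.IP
.in +4n
.EX
```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Ask to be notified of SECCOMP_RET_TRACE results.  pid must refer
   to a tracee that is currently in a ptrace-stop.  Returns 0 or -1. */
int
request_seccomp_events(pid_t pid)
{
    long r = ptrace(PTRACE_SETOPTIONS, pid, NULL,
                    (void *) (long) PTRACE_O_TRACESECCOMP);

    return (r == -1) ? -1 : 0;
}

/* Returns 1 if status (as filled in by waitpid(2)) reports a
   PTRACE_EVENT_SECCOMP stop, otherwise 0. */
int
is_seccomp_stop(int status)
{
    return WIFSTOPPED(status) &&
           (status >> 8) == (SIGTRAP | (PTRACE_EVENT_SECCOMP << 8));
}

/* Fetch the SECCOMP_RET_DATA value for the current stop into *data.
   Returns 0 or -1. */
int
seccomp_event_data(pid_t pid, unsigned long *data)
{
    return (ptrace(PTRACE_GETEVENTMSG, pid, NULL, data) == -1) ? -1 : 0;
}
```
.EE
.in
.IP
A tracer loop would typically call
.IR is_seccomp_stop ()
on each status returned by
.BR waitpid (2)
and, when it returns 1, fetch the data before deciding how to proceed.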
.IP
The tracer can skip the system call by changing the system call number
to \-1.
Alternatively, the tracer can change the system call
requested by changing the system call to a valid system call number.
If the tracer asks to skip the system call, then the system call will
appear to return the value that the tracer puts in the return value register.
.IP
.\" This was changed in ce6526e8afa4.
.\" A related hole, using PTRACE_SYSCALL instead of SECCOMP_RET_TRACE, was
.\" changed in arch-specific commits, e.g. 93e35efb8de4 for X86 and
.\" 0f3912fd934c for ARM.
Before kernel 4.8, the seccomp check will not be run again after the tracer is
notified.
(This means that, on older kernels, seccomp-based sandboxes
.B "must not"
allow use of
.BR ptrace (2)\(emeven
of other
sandboxed processes\(emwithout extreme care;
ptracers can use this mechanism to escape from the seccomp sandbox.)
.TP
.BR SECCOMP_RET_LOG " (since Linux 4.14)"
.\" commit 59f5cf44a38284eb9e76270c786fb6cc62ef8ac4
This value results in the system call being executed after
the filter return action is logged.
An administrator may override the logging of this action via
the
.IR /proc/sys/kernel/seccomp/actions_logged
file.
.TP
.BR SECCOMP_RET_ALLOW
This value results in the system call being executed.
.PP
If an action value other than one of the above is specified,
then the filter action is treated as either
.BR SECCOMP_RET_KILL_PROCESS
(since Linux 4.14)
.\" commit 4d3b0b05aae9ee9ce0970dc4cc0fb3fad5e85945
or
.BR SECCOMP_RET_KILL_THREAD
(in Linux 4.13 and earlier).
.\"
.SS /proc interfaces
The files in the directory
.IR /proc/sys/kernel/seccomp
provide additional seccomp information and configuration:
.TP
.IR actions_avail " (since Linux 4.14)"
.\" commit 8e5f1ad116df6b0de65eac458d5e7c318d1c05af
A read-only ordered list of seccomp filter return actions in string form.
The ordering, from left-to-right, is in decreasing order of precedence.
The list represents the set of seccomp filter return actions
supported by the kernel.
.TP
.IR actions_logged " (since Linux 4.14)"
.\" commit 0ddec0fc8900201c0897b87b762b7c420436662f
A read-write ordered list of seccomp filter return actions that
are allowed to be logged.
Writes to the file do not need to be in ordered form but reads from
the file will be ordered in the same way as the
.IR actions_avail
file.
.IP
It is important to note that the value of
.IR actions_logged
does not prevent certain filter return actions from being logged when
the audit subsystem is configured to audit a task.
If the action is not found in the
.IR actions_logged
file, the final decision on whether to audit the action for that task is
ultimately left up to the audit subsystem to decide for all filter return
actions other than
.BR SECCOMP_RET_ALLOW .
.IP
The "allow" string is not accepted in the
.IR actions_logged
file as it is not possible to log
.BR SECCOMP_RET_ALLOW
actions.
Attempting to write "allow" to the file will fail with the error
.BR EINVAL .
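.PP
Reading either file from a program is straightforward; one possible
helper is sketched below.
The name
.IR read_seccomp_sysctl ()
and its return convention are assumptions made for this example.
.PP
.in +4n
.EX
```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: read /proc/sys/kernel/seccomp/<name> into buf
   as a null-terminated string.  Returns the number of bytes read, or
   -1 on error (for example, on kernels older than Linux 4.14, where
   the directory does not exist). */
long
read_seccomp_sysctl(const char *name, char *buf, size_t size)
{
    char path[128];
    ssize_t n;
    int fd;

    if (size == 0)
        return -1;
    snprintf(path, sizeof(path), "/proc/sys/kernel/seccomp/%s", name);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    n = read(fd, buf, size - 1);
    close(fd);
    if (n == -1)
        return -1;
    buf[n] = 0;                 /* Terminate the string */
    return n;
}
```
.EE
.in
.PP
Passing "actions_avail" as
.I name
would return the space-separated list of action names described above.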
.\"
.SS Audit logging of seccomp actions
.\" commit 59f5cf44a38284eb9e76270c786fb6cc62ef8ac4
Since Linux 4.14, the kernel provides the facility to log the
actions returned by seccomp filters in the audit log.
The kernel makes the decision to log an action based on
the action type, whether or not the action is present in the
.I actions_logged
file, and whether kernel auditing is enabled
(e.g., via the kernel boot option
.IR audit=1 ).
.\" or auditing could be enabled via the netlink API (AUDIT_SET)
The rules are as follows:
.IP * 3
If the action is
.BR SECCOMP_RET_ALLOW ,
the action is not logged.
.IP *
Otherwise, if the action is either
.BR SECCOMP_RET_KILL_PROCESS
or
.BR SECCOMP_RET_KILL_THREAD ,
and that action appears in the
.IR actions_logged
file, the action is logged.
.IP *
Otherwise, if the filter has requested logging (the
.BR SECCOMP_FILTER_FLAG_LOG
flag)
and the action appears in the
.IR actions_logged
file, the action is logged.
.IP *
Otherwise, if kernel auditing is enabled and the process is being audited
.RB ( autrace (8)),
the action is logged.
.IP *
Otherwise, the action is not logged.
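.PP
The rules above can be restated as a small decision function.
This is a user-space sketch only; the function name
.IR seccomp_would_log ()
and its parameters are illustrative, not a kernel interface.
.PP
.in +4n
.EX
```c
#include <linux/seccomp.h>
#include <stdbool.h>
#include <stdint.h>

#ifndef SECCOMP_RET_ACTION_FULL
#define SECCOMP_RET_ACTION_FULL 0xffff0000U
#endif
#ifndef SECCOMP_RET_KILL_PROCESS
#define SECCOMP_RET_KILL_PROCESS 0x80000000U
#endif
#ifndef SECCOMP_RET_KILL_THREAD
#define SECCOMP_RET_KILL_THREAD SECCOMP_RET_KILL
#endif

/* in_actions_logged: the action is listed in the actions_logged file.
   flag_log: the filter was installed with SECCOMP_FILTER_FLAG_LOG.
   audited: kernel auditing is enabled and the process is audited. */
bool
seccomp_would_log(uint32_t ret, bool in_actions_logged, bool flag_log,
                  bool audited)
{
    uint32_t action = ret & SECCOMP_RET_ACTION_FULL;

    if (action == SECCOMP_RET_ALLOW)
        return false;
    if ((action == SECCOMP_RET_KILL_PROCESS ||
         action == SECCOMP_RET_KILL_THREAD) && in_actions_logged)
        return true;
    if (flag_log && in_actions_logged)
        return true;
    return audited;             /* Last two rules in the list */
}
```
.EE
.in
.PP
The order of the tests matters: each one applies only if the
preceding ones did not decide the outcome.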
.SH RETURN VALUE
On success,
.BR seccomp ()
returns 0.
On error, if
.BR SECCOMP_FILTER_FLAG_TSYNC
was used,
the return value is the ID of the thread
that caused the synchronization failure.
(This ID is a kernel thread ID of the type returned by
.BR clone (2)
and
.BR gettid (2).)
On other errors, \-1 is returned, and
.IR errno
is set to indicate the cause of the error.
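.PP
Because a failed
.BR SECCOMP_FILTER_FLAG_TSYNC
call can return a positive value, callers need a three-way test rather
than the usual comparison with \-1.
The following sketch shows one way to do this; the enumeration and both
function names are invented for the example.
.PP
.in +4n
.EX
```c
#define _GNU_SOURCE
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

enum tsync_result { TSYNC_OK, TSYNC_BLOCKED_BY_THREAD, TSYNC_ERROR };

/* Classify the raw return value of a TSYNC call.  When the result is
   TSYNC_BLOCKED_BY_THREAD, *tid receives the ID of the thread that
   could not be synchronized. */
enum tsync_result
classify_tsync(long ret, pid_t *tid)
{
    if (ret == 0)
        return TSYNC_OK;
    if (ret > 0) {
        *tid = (pid_t) ret;
        return TSYNC_BLOCKED_BY_THREAD;
    }
    return TSYNC_ERROR;         /* ret is -1; consult errno */
}

/* Hypothetical wrapper showing where the raw value comes from. */
long
seccomp_tsync(struct sock_fprog *prog)
{
    return syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER,
                   SECCOMP_FILTER_FLAG_TSYNC, prog);
}
```
.EE
.in
.PP
A caller would pass the result of
.IR seccomp_tsync ()
straight to
.IR classify_tsync ().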
.SH ERRORS
.BR seccomp ()
can fail for the following reasons:
.TP
.BR EACCES
The caller did not have the
.BR CAP_SYS_ADMIN
capability in its user namespace, or had not set
.IR no_new_privs
before using
.BR SECCOMP_SET_MODE_FILTER .
.TP
.BR EFAULT
.IR args
was not a valid address.
.TP
.BR EINVAL
.IR operation
is unknown or is not supported by this kernel version or configuration.
.TP
.B EINVAL
The specified
.IR flags
are invalid for the given
.IR operation .
.TP
.BR EINVAL
The filter program pointed to by
.I args
included a
.BR BPF_ABS
load, but the specified offset was not aligned to a 32-bit boundary or exceeded
.IR "sizeof(struct\ seccomp_data)" .
.TP
.BR EINVAL
.\" See kernel/seccomp.c::seccomp_may_assign_mode() in 3.18 sources
A secure computing mode has already been set, and
.I operation
differs from the existing setting.
.TP
.BR EINVAL
.I operation
specified
.BR SECCOMP_SET_MODE_FILTER ,
but the filter program pointed to by
.I args
was not valid or the length of the filter program was zero or exceeded
.B BPF_MAXINSNS
(4096) instructions.
.TP
.BR ENOMEM
Out of memory.
.TP
.BR ENOMEM
.\" ENOMEM in kernel/seccomp.c::seccomp_attach_filter() in 3.18 sources
The total length of all filter programs attached
to the calling thread would exceed
.B MAX_INSNS_PER_PATH
(32768) instructions.
Note that for the purposes of calculating this limit,
each already existing filter program incurs an
overhead penalty of 4 instructions.
Packit 7cfc04
.TP
Packit 7cfc04
.BR EOPNOTSUPP
Packit 7cfc04
.I operation
Packit 7cfc04
specified
Packit 7cfc04
.BR SECCOMP_GET_ACTION_AVAIL ,
Packit 7cfc04
but the kernel does not support the filter return action specified by
Packit 7cfc04
.IR args .
Packit 7cfc04
.TP
Packit 7cfc04
.BR ESRCH
Packit 7cfc04
Another thread caused a failure during thread sync, but its ID could not
Packit 7cfc04
be determined.
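.PP
The arithmetic behind the
.B MAX_INSNS_PER_PATH
limit described under
.B ENOMEM
above can be sketched as follows.
This is an illustration only:
.I filter_would_fit
is a hypothetical helper, and the two constants are defined locally
from the figures quoted in the text,
since the kernel does not export them to user space.

```c
#include <stddef.h>

#define MAX_INSNS_PER_PATH 32768  /* limit quoted in the text above */
#define PER_FILTER_PENALTY 4      /* overhead per already attached filter */

/* Return 1 if a new filter of 'new_len' instructions would fit, given
   the lengths of the 'n' filters that are already attached; return 0
   if attaching it would exceed the limit. */
static int
filter_would_fit(const unsigned int *existing, size_t n,
                 unsigned int new_len)
{
    unsigned long total = new_len;

    for (size_t i = 0; i < n; i++)
        total += existing[i] + PER_FILTER_PENALTY;

    return total <= MAX_INSNS_PER_PATH;
}
```

.PP
For example, seven existing filters of 4096 instructions each
count as 7 * (4096 + 4) = 28700 instructions,
which leaves room for one more filter of at most 4068 instructions.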
.SH VERSIONS
The
.BR seccomp ()
system call first appeared in Linux 3.17.
.\" FIXME . Add glibc version
.SH CONFORMING TO
The
.BR seccomp ()
system call is a nonstandard Linux extension.
.SH NOTES
Rather than hand-coding seccomp filters as shown in the example below,
you may prefer to employ the
.I libseccomp
library, which provides a front-end for generating seccomp filters.
.PP
The
.IR Seccomp
field of the
.IR /proc/[pid]/status
file provides a method of viewing the seccomp mode of a process; see
.BR proc (5).
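.PP
Reading that field programmatically might look like the sketch below.
The helper names
.I parse_seccomp_line
and
.I own_seccomp_mode
are hypothetical.
The code assumes the field is formatted as the word "Seccomp:"
followed by white space and a decimal number, as described in
.BR proc (5)
(0 for disabled, 1 for strict mode, 2 for filter mode).

```c
#include <stdio.h>

/* Extract the mode from one line of /proc/[pid]/status.
   Returns -1 if the line is not the Seccomp field. */
static int
parse_seccomp_line(const char *line)
{
    int mode;

    if (sscanf(line, "Seccomp: %d", &mode) == 1)
        return mode;
    return -1;
}

/* Return the seccomp mode of the calling process,
   or -1 if it could not be determined. */
static int
own_seccomp_mode(void)
{
    char line[256];
    int mode = -1;
    FILE *fp = fopen("/proc/self/status", "r");

    if (fp == NULL)
        return -1;

    /* Stop at the first line that parses as the Seccomp field */
    while (mode == -1 && fgets(line, sizeof(line), fp) != NULL)
        mode = parse_seccomp_line(line);

    fclose(fp);
    return mode;
}
```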
.PP
.BR seccomp ()
provides a superset of the functionality provided by the
.BR prctl (2)
.BR PR_SET_SECCOMP
operation (which does not support
.IR flags ).
.PP
Since Linux 4.4, the
.BR ptrace (2)
.B PTRACE_SECCOMP_GET_FILTER
operation can be used to dump a process's seccomp filters.
.\"
.SS Caveats
There are various subtleties to consider when applying seccomp filters
to a program, including the following:
.IP * 3
Some traditional system calls have user-space implementations in the
.BR vdso (7)
on many architectures.
Notable examples include
.BR clock_gettime (2),
.BR gettimeofday (2),
and
.BR time (2).
On such architectures,
seccomp filtering for these system calls will have no effect.
(However, there are cases where the
.BR vdso (7)
implementations may fall back to invoking the true system call,
in which case seccomp filters would see the system call.)
.IP *
Seccomp filtering is based on system call numbers.
However, applications typically do not directly invoke system calls,
but instead call wrapper functions in the C library which
in turn invoke the system calls.
Consequently, one must be aware of the following:
.RS
.IP \(bu 3
The glibc wrappers for some traditional system calls may actually
employ system calls with different names in the kernel.
For example, the
.BR exit (2)
wrapper function actually employs the
.BR exit_group (2)
system call, and the
.BR fork (2)
wrapper function actually calls
.BR clone (2).
.IP \(bu
The behavior of wrapper functions may vary across architectures,
according to the range of system calls provided on those architectures.
In other words, the same wrapper function may invoke
different system calls on different architectures.
.IP \(bu
Finally, the behavior of wrapper functions can change across glibc versions.
For example, in older versions, the glibc wrapper function for
.BR open (2)
invoked the system call of the same name,
but starting in glibc 2.26, the implementation switched to calling
.BR openat (2)
on all architectures.
.RE
.PP
The consequence of the above points is that it may be necessary
to filter for a system call other than might be expected.
Various manual pages in Section 2 provide helpful details
about the differences between wrapper functions and
the underlying system calls in subsections entitled
.IR "C library/kernel differences" .
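.PP
As an illustration of the points above, a filter meant to make
.BR open (3)
fail has to match both
.BR open (2)
and
.BR openat (2).
The sketch below uses a hypothetical helper,
.IR build_open_filter ,
that fills in such a program and makes matching calls fail with
.BR EPERM .
It relies on
.I SYS_open
being undefined on architectures that provide no
.BR open (2)
system call.
It deliberately omits the architecture check performed by the example
program later in this page; a real filter should begin with one.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>

/* Copy a filter that denies open-like system calls into 'out'
   (capacity 'cap' instructions). Returns the number of instructions
   written, or 0 if 'cap' is too small. */
static size_t
build_open_filter(struct sock_filter *out, size_t cap)
{
    const struct sock_filter prog[] = {
        /* Load the system call number into the accumulator */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr)),
#ifdef SYS_open
        /* open(2) exists here: on a match, skip the next test
           and land on the deny instruction */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_open, 1, 0),
#endif
        /* openat(2): fall through to deny on a match,
           otherwise skip to allow */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_openat, 0, 1),

        /* Deny: fail the call with EPERM */
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),

        /* Allow every other system call */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    size_t n = sizeof(prog) / sizeof(prog[0]);

    if (cap < n)
        return 0;
    memcpy(out, prog, sizeof(prog));
    return n;
}
```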
.PP
Furthermore, note that the application of seccomp filters
even risks causing bugs in an application,
when the filters cause unexpected failures for legitimate operations
that the application might need to perform.
Such bugs may not easily be discovered when testing the seccomp
filters if the bugs occur in rarely used application code paths.
.RS 3
.\"
.SS Seccomp-specific BPF details
Note the following BPF details specific to seccomp filters:
.IP * 3
The
.B BPF_H
and
.B BPF_B
size modifiers are not supported: all operations must load and store
(4-byte) words
.RB ( BPF_W ).
.IP *
To access the contents of the
.I seccomp_data
buffer, use the
.B BPF_ABS
addressing mode modifier.
.IP *
The
.B BPF_LEN
addressing mode modifier yields an immediate mode operand
whose value is the size of the
.IR seccomp_data
buffer.
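.PP
One consequence of the word-only restriction is that each 64-bit entry of the
.I args
array in
.I seccomp_data
has to be inspected as two separate 32-bit loads.
The sketch below computes the
.B BPF_ABS
offsets of the two halves.
The helper names
.I arg_lo_offset
and
.I arg_hi_offset
are hypothetical, and the code assumes a compiler that predefines
.I __BYTE_ORDER__
(as GCC and Clang do).

```c
#include <stddef.h>
#include <linux/seccomp.h>

/* Offset of the least significant 32 bits of args[n] */
static unsigned int
arg_lo_offset(unsigned int n)
{
    unsigned int base = offsetof(struct seccomp_data, args)
                        + n * sizeof(__u64);

#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return base + sizeof(__u32);  /* low word is stored second */
#else
    return base;                  /* low word is stored first */
#endif
}

/* Offset of the most significant 32 bits of args[n] */
static unsigned int
arg_hi_offset(unsigned int n)
{
    unsigned int base = offsetof(struct seccomp_data, args)
                        + n * sizeof(__u64);

#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return base;
#else
    return base + sizeof(__u32);
#endif
}
```

.PP
A filter would then load the lower half of the first argument with
.IR "BPF_STMT(BPF_LD | BPF_W | BPF_ABS, arg_lo_offset(0))" .
Both offsets are multiples of 4 and lie inside the structure,
so they satisfy the alignment and range checks described under ERRORS.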
.SH EXAMPLE
The program below accepts four or more arguments.
The first three arguments are a system call number,
a numeric architecture identifier, and an error number.
The program uses these values to construct a BPF filter
that is used at run time to perform the following checks:
.IP [1] 4
If the program is not running on the specified architecture,
the BPF filter kills the thread that made the system call
.RB ( SECCOMP_RET_KILL ).
.IP [2]
If the program attempts to execute the system call with the specified number,
the BPF filter causes the system call to fail, with
.I errno
being set to the specified error number.
.PP
The remaining command-line arguments specify
the pathname and additional arguments of a program
that the example program should attempt to execute using
.BR execv (3)
(a library function that employs the
.BR execve (2)
system call).
Some example runs of the program are shown below.
.PP
First, we display the architecture that we are running on (x86-64)
and then construct a shell function that looks up system call
numbers on this architecture:
.PP
.in +4n
.EX
$ \fBuname -m\fP
x86_64
$ \fBsyscall_nr() {
    cat /usr/src/linux/arch/x86/syscalls/syscall_64.tbl | \\
    awk '$2 != "x32" && $3 == "'$1'" { print $1 }'
}\fP
.EE
.in
.PP
When the BPF filter rejects a system call (case [2] above),
it causes the system call to fail with the error number
specified on the command line.
In the experiments shown here, we'll use error number 99:
.PP
.in +4n
.EX
$ \fBerrno 99\fP
EADDRNOTAVAIL 99 Cannot assign requested address
.EE
.in
.PP
In the following example, we attempt to run the command
.BR whoami (1),
but the BPF filter rejects the
.BR execve (2)
system call, so that the command is not even executed:
.PP
.in +4n
.EX
$ \fBsyscall_nr execve\fP
59
$ \fB./a.out\fP
Usage: ./a.out <syscall_nr> <arch> <errno> <prog> [<args>]
Hint for <arch>: AUDIT_ARCH_I386: 0x40000003
                 AUDIT_ARCH_X86_64: 0xC000003E
$ \fB./a.out 59 0xC000003E 99 /bin/whoami\fP
execv: Cannot assign requested address
.EE
.in
.PP
In the next example, the BPF filter rejects the
.BR write (2)
system call, so that, although it is successfully started, the
.BR whoami (1)
command is not able to write output:
.PP
.in +4n
.EX
$ \fBsyscall_nr write\fP
1
$ \fB./a.out 1 0xC000003E 99 /bin/whoami\fP
.EE
.in
.PP
In the final example,
the BPF filter rejects a system call that is not used by the
.BR whoami (1)
command, so it is able to successfully execute and produce output:
.PP
.in +4n
.EX
$ \fBsyscall_nr preadv\fP
295
$ \fB./a.out 295 0xC000003E 99 /bin/whoami\fP
cecilia
.EE
.in
.SS Program source
.EX
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

#define X32_SYSCALL_BIT 0x40000000

static int
install_filter(int syscall_nr, unsigned int t_arch, int f_errno)
{
    unsigned int upper_nr_limit = 0xffffffff;

    /* Assume that AUDIT_ARCH_X86_64 means the normal x86-64 ABI */
    if (t_arch == AUDIT_ARCH_X86_64)
        upper_nr_limit = X32_SYSCALL_BIT - 1;

    struct sock_filter filter[] = {
        /* [0] Load architecture from 'seccomp_data' buffer into
               accumulator */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 (offsetof(struct seccomp_data, arch))),

        /* [1] Jump forward 5 instructions if architecture does not
               match 't_arch' */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, t_arch, 0, 5),

        /* [2] Load system call number from 'seccomp_data' buffer into
               accumulator */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 (offsetof(struct seccomp_data, nr))),

        /* [3] Check ABI - only needed for x86-64 in blacklist use
               cases.  Use BPF_JGT instead of checking against the bit
               mask to avoid having to reload the syscall number. */
        BPF_JUMP(BPF_JMP | BPF_JGT | BPF_K, upper_nr_limit, 3, 0),

        /* [4] Jump forward 1 instruction if system call number
               does not match 'syscall_nr' */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, syscall_nr, 0, 1),

        /* [5] Matching architecture and system call: don't execute
               the system call, and return 'f_errno' in 'errno' */
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (f_errno & SECCOMP_RET_DATA)),

        /* [6] Destination of system call number mismatch: allow other
               system calls */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),

        /* [7] Destination of architecture mismatch: kill task */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
    };

    struct sock_fprog prog = {
        .len = (unsigned short) (sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* glibc provides no wrapper for seccomp(); call it via syscall(2) */
    if (syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER, 0, &prog)) {
        perror("seccomp");
        return 1;
    }

    return 0;
}

int
main(int argc, char **argv)
{
    if (argc < 5) {
        fprintf(stderr, "Usage: "
                "%s <syscall_nr> <arch> <errno> <prog> [<args>]\\n"
                "Hint for <arch>: AUDIT_ARCH_I386: 0x%X\\n"
                "                 AUDIT_ARCH_X86_64: 0x%X\\n"
                "\\n", argv[0], AUDIT_ARCH_I386, AUDIT_ARCH_X86_64);
        exit(EXIT_FAILURE);
    }

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("prctl");
        exit(EXIT_FAILURE);
    }

    /* The architecture identifier does not fit in a signed 32-bit
       value, so parse it as unsigned */
    if (install_filter(strtol(argv[1], NULL, 0),
                       strtoul(argv[2], NULL, 0),
                       strtol(argv[3], NULL, 0)))
        exit(EXIT_FAILURE);

    execv(argv[4], &argv[4]);
    perror("execv");
    exit(EXIT_FAILURE);
}
.EE
.SH SEE ALSO
.BR strace (1),
.BR bpf (2),
.BR prctl (2),
.BR ptrace (2),
.BR sigaction (2),
.BR proc (5),
.BR signal (7),
.BR socket (7)
.PP
Various pages from the
.I libseccomp
library, including:
.BR scmp_sys_resolver (1),
.BR seccomp_init (3),
.BR seccomp_load (3),
.BR seccomp_rule_add (3),
and
.BR seccomp_export_bpf (3).
.PP
The kernel source files
.IR Documentation/networking/filter.txt
and
.IR Documentation/userspace\-api/seccomp_filter.rst
.\" commit c061f33f35be0ccc80f4b8e0aea5dfd2ed7e01a3
(or
.IR Documentation/prctl/seccomp_filter.txt
before Linux 4.13).
.PP
McCanne, S. and Jacobson, V. (1992)
.IR "The BSD Packet Filter: A New Architecture for User-level Packet Capture" ,
Proceedings of the USENIX Winter 1993 Conference
.UR http://www.tcpdump.org/papers/bpf\-usenix93.pdf
.UE
.SH COLOPHON
This page is part of release 4.15 of the Linux
.I man-pages
project.
A description of the project,
information about reporting bugs,
and the latest version of this page,
can be found at
\%https://www.kernel.org/doc/man\-pages/.