Seccomp

Seccomp (secure computing mode) is a Linux kernel security feature that restricts the system calls a process is permitted to make, effectively sandboxing it to a defined subset of the kernel's interface.
SECURE ENCLAVE EXECUTION

What is Seccomp?

Seccomp enforces the principle of least privilege on a per-process basis by filtering system calls. Once a process enters secure computing mode, its interaction with the kernel is limited to the syscalls its policy permits, typically a predefined allowlist (e.g., read, write, exit_group). This reduces the kernel attack surface: even if the process is compromised, the attacker cannot invoke syscalls the policy blocks, such as execve or socket. Installed filters are inherited across fork and execve and cannot be removed, which makes seccomp a foundational sandboxing technique for container runtimes and high-security applications.

Outside of strict mode, seccomp operates via classic BPF (Berkeley Packet Filter) programs that evaluate each syscall attempt. Developers install filters with the seccomp() syscall, with prctl(PR_SET_SECCOMP), or through higher-level libraries like libseccomp. Seccomp-bpf (the most common mode) filters on the syscall number and on argument values, although it cannot dereference pointer arguments such as pathnames. Seccomp is not itself a Linux Security Module (LSM); it complements LSMs like SELinux and AppArmor, which enforce file and network access controls. For Secure Enclave Execution, seccomp is a critical software-based isolation layer, often used in conjunction with namespaces and cgroups to create robust container security boundaries.

SECCOMP OPERATIONAL MODES

Key Features and Modes

Seccomp operates in two primary modes: strict mode, which has a fixed allowlist, and filter mode, which applies a configurable policy. These modes define the granularity of control over a process's system call interface.

01

Seccomp Strict Mode (SECCOMP_MODE_STRICT)

The original and most restrictive mode, introduced in Linux 2.6.12. In this mode, the process is only permitted to make four essential system calls: read, write, _exit, and sigreturn. Any attempt to call another system call results in the termination of the process via a SIGKILL signal.

  • Purpose: Provides a simple, absolute sandbox for untrusted, compute-bound code that only needs to read from and write to file descriptors that are already open.
  • Limitation: Its inflexibility makes it unsuitable for most real-world applications, which led to the development of the filter mode.
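
The behaviour described above can be observed with a minimal sketch. It assumes a Linux host and uses ctypes to call prctl() directly; the function name strict_mode_demo is hypothetical, and the numeric constants are copied from the kernel headers. The child is forked so that only it enters strict mode.

```python
import ctypes
import os
import signal
import sys
from ctypes import c_ulong

PR_SET_SECCOMP = 22       # from <linux/prctl.h>
SECCOMP_MODE_STRICT = 1   # from <linux/seccomp.h>


def strict_mode_demo():
    """Return the signal that ended a strict-mode child, or None if unsupported.

    Hypothetical helper for illustration only.
    """
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)

    pid = os.fork()
    if pid == 0:
        # Child: enter strict mode. From here on only read, write,
        # _exit and sigreturn are permitted.
        zero = c_ulong(0)
        if libc.prctl(PR_SET_SECCOMP, c_ulong(SECCOMP_MODE_STRICT),
                      zero, zero, zero) != 0:
            os._exit(2)  # strict mode not available in this environment
        os.getppid()     # not on the allowlist: the kernel sends SIGKILL
        os._exit(0)      # not expected to be reached

    # Parent: report how the child ended.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None


if __name__ == "__main__":
    print(strict_mode_demo())
```

Note that Python's os._exit() uses exit_group rather than _exit, so a strict-mode child is killed even on a clean exit path. That is one practical face of the inflexibility noted above.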
02

Seccomp Filter Mode (SECCOMP_MODE_FILTER)

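The modern, programmable mode, introduced in Linux 3.5 via prctl() with PR_SET_SECCOMP and SECCOMP_MODE_FILTER. Since Linux 3.17 it can also be entered through the dedicated seccomp() syscall with the SECCOMP_SET_MODE_FILTER operation. It allows the definition of custom filter programs using Berkeley Packet Filter (BPF) rules to allow, log, deny, or kill on specific system calls. An unprivileged process must set the no_new_privs attribute with prctl(PR_SET_NO_NEW_PRIVS) before it can install a filter.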

  • BPF Programs: Filters are small programs that inspect each system call number and arguments, returning an action (e.g., ALLOW, ERRNO, KILL, TRAP, LOG).
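  • Flexibility: Enables fine-grained policies, such as allowing open() only with certain flags. Filters see raw argument values and cannot dereference pointers, so they cannot match on file paths; path-based rules belong to an LSM or a userspace supervisor.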
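
As a rough illustration of filter mode, the sketch below installs a small filter in a forked child through prctl() and checks that one syscall now fails with EPERM. It is written under several assumptions: a Linux host on x86_64 or aarch64, syscall numbers and AUDIT_ARCH values copied from the kernel headers for those two architectures, and hypothetical helper names (SockFprog, ins, deny_getppid_demo). Real policies are usually allowlists built with libseccomp rather than hand-packed denylists like this one.

```python
import ctypes
import errno
import os
import platform
import struct
import sys
from ctypes import c_long, c_ulong

PR_SET_SECCOMP = 22        # <linux/prctl.h>
PR_SET_NO_NEW_PRIVS = 38   # <linux/prctl.h>
SECCOMP_MODE_FILTER = 2    # <linux/seccomp.h>
SECCOMP_RET_KILL_PROCESS = 0x80000000
SECCOMP_RET_ERRNO = 0x00050000
SECCOMP_RET_ALLOW = 0x7FFF0000

# machine name -> (AUDIT_ARCH_* value, getppid syscall number)
ARCH = {"x86_64": (0xC000003E, 110), "aarch64": (0xC00000B7, 173)}


class SockFprog(ctypes.Structure):
    """Mirrors struct sock_fprog from <linux/filter.h>."""
    _fields_ = [("len", ctypes.c_ushort), ("filter", ctypes.c_void_p)]


def ins(code, jt, jf, k):
    """Pack one struct sock_filter: u16 code, u8 jt, u8 jf, u32 k."""
    return struct.pack("=HBBI", code, jt, jf, k)


def deny_getppid_demo():
    """Return "blocked", "not blocked", or None if the demo cannot run here."""
    machine = platform.machine()
    if not sys.platform.startswith("linux") or machine not in ARCH:
        return None
    audit_arch, sys_getppid = ARCH[machine]
    code = b"".join([
        ins(0x20, 0, 0, 4),             # load seccomp_data.arch
        ins(0x15, 1, 0, audit_arch),    # expected arch: skip the kill
        ins(0x06, 0, 0, SECCOMP_RET_KILL_PROCESS),
        ins(0x20, 0, 0, 0),             # load seccomp_data.nr
        ins(0x15, 0, 1, sys_getppid),   # getppid: fall through to ERRNO
        ins(0x06, 0, 0, SECCOMP_RET_ERRNO | errno.EPERM),
        ins(0x06, 0, 0, SECCOMP_RET_ALLOW),
    ])
    buf = ctypes.create_string_buffer(code, len(code))
    prog = SockFprog(len(code) // 8, ctypes.cast(buf, ctypes.c_void_p))
    libc = ctypes.CDLL(None, use_errno=True)

    pid = os.fork()
    if pid == 0:
        zero = c_ulong(0)
        # no_new_privs is required for unprivileged filter installation.
        if libc.prctl(PR_SET_NO_NEW_PRIVS, c_ulong(1), zero, zero, zero) != 0:
            os._exit(2)
        if libc.prctl(PR_SET_SECCOMP, c_ulong(SECCOMP_MODE_FILTER),
                      ctypes.byref(prog), zero, zero) != 0:
            os._exit(2)
        ctypes.set_errno(0)
        rc = libc.syscall(c_long(sys_getppid))
        blocked = rc == -1 and ctypes.get_errno() == errno.EPERM
        os._exit(0 if blocked else 1)

    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 2:
        return None
    if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
        return "blocked"
    return "not blocked"


if __name__ == "__main__":
    print(deny_getppid_demo())
```

The architecture check comes first because syscall numbers differ between ABIs; a filter that skips it can be bypassed by issuing the call through another ABI.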
03

Filter Actions and Return Values

A BPF filter returns a 32-bit value that dictates the kernel's action for the intercepted system call. The high 16 bits specify the action, and the low 16 bits are action-specific data.

  • SECCOMP_RET_ALLOW: The system call executes normally.
  • SECCOMP_RET_ERRNO: Blocks the call; the low 16 bits are returned as the errno value to the process.
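  • SECCOMP_RET_KILL_PROCESS: Terminates the entire process immediately, as though by an uncatchable SIGSYS (available since Linux 4.14). SECCOMP_RET_KILL_THREAD, also spelled SECCOMP_RET_KILL, terminates only the calling thread.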
  • SECCOMP_RET_TRAP: Sends a SIGSYS signal to the process, allowing it to catch and handle the violation.
  • SECCOMP_RET_LOG: Allows the call but logs it for auditing. Used when monitoring is more important than blocking.
  • SECCOMP_RET_TRACE: Delegates the decision to a ptrace tracer. If no tracer is attached, the call is not executed and fails with ENOSYS.
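
The split between action bits and data bits can be made concrete with a short sketch. The constants are the values defined in <linux/seccomp.h>; the decode function and the ACTIONS table are hypothetical names used only for illustration.

```python
import errno

SECCOMP_RET_ACTION_FULL = 0xFFFF0000  # mask for the action (high 16 bits)
SECCOMP_RET_DATA = 0x0000FFFF         # mask for action-specific data

# Action values from <linux/seccomp.h>.
ACTIONS = {
    0x80000000: "KILL_PROCESS",
    0x00000000: "KILL_THREAD",
    0x00030000: "TRAP",
    0x00050000: "ERRNO",
    0x7FC00000: "USER_NOTIF",
    0x7FF00000: "TRACE",
    0x7FFC0000: "LOG",
    0x7FFF0000: "ALLOW",
}


def decode(ret):
    """Split a 32-bit filter return value into (action name, data)."""
    action = ret & SECCOMP_RET_ACTION_FULL
    data = ret & SECCOMP_RET_DATA
    return ACTIONS.get(action, "UNKNOWN"), data


if __name__ == "__main__":
    # A filter returning SECCOMP_RET_ERRNO with EPERM in the data bits.
    print(decode(0x00050000 | errno.EPERM))
```

A filter that wants a blocked call to fail with a particular error simply ORs the errno value into the low 16 bits of SECCOMP_RET_ERRNO.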
04

Seccomp-BPF: The Filtering Mechanism

Seccomp Filter mode leverages a restricted, in-kernel virtual machine originally designed for network packet filtering: the Berkeley Packet Filter (BPF). A seccomp-BPF program is loaded into the kernel, where it safely evaluates each system call.

  • Safety: BPF programs are guaranteed to terminate and cannot perform loops or access arbitrary kernel memory.
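  • Inspection Capabilities: The program can examine the system call number, architecture identifier, instruction pointer, and up to six 64-bit arguments. This allows for argument-based filtering (e.g., checking the prot argument of mmap). Pointer arguments are visible only as addresses; the filter cannot read the memory they point to.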
  • Program Structure: Typically written using helper macros from <linux/seccomp.h> and <linux/filter.h> or via libraries like libseccomp.
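
To make the evaluation model concrete, the sketch below expresses the mmap example as classic BPF instructions and runs them through a tiny pure-Python interpreter. Nothing is loaded into the kernel: run_filter is a hypothetical simulator that handles only the four opcodes used here, and the syscall number and architecture constant assume x86_64. The struct seccomp_data layout (nr, arch, instruction_pointer, args[6]) follows <linux/seccomp.h>.

```python
import errno
import struct

# Classic BPF opcodes (values from <linux/bpf_common.h>).
LD_W_ABS = 0x20  # BPF_LD | BPF_W | BPF_ABS: load a 32-bit word at offset k
JEQ_K = 0x15     # BPF_JMP | BPF_JEQ | BPF_K: branch on accumulator == k
JSET_K = 0x45    # BPF_JMP | BPF_JSET | BPF_K: branch on accumulator & k
RET_K = 0x06     # BPF_RET | BPF_K: return the constant k

SECCOMP_RET_KILL_PROCESS = 0x80000000
SECCOMP_RET_ERRNO = 0x00050000
SECCOMP_RET_ALLOW = 0x7FFF0000

AUDIT_ARCH_X86_64 = 0xC000003E
SYS_MMAP = 9     # x86_64 syscall number (assumption: x86_64 only)
PROT_EXEC = 0x4

# Byte offsets inside struct seccomp_data.
OFF_NR, OFF_ARCH = 0, 4
OFF_ARG2_LO = 16 + 8 * 2  # low 32 bits of args[2] on little-endian

# Each tuple is (code, jt, jf, k); jumps are relative to the next instruction.
FILTER = [
    (LD_W_ABS, 0, 0, OFF_ARCH),
    (JEQ_K, 1, 0, AUDIT_ARCH_X86_64),   # expected arch: skip the kill
    (RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS),
    (LD_W_ABS, 0, 0, OFF_NR),
    (JEQ_K, 0, 3, SYS_MMAP),            # not mmap: jump to ALLOW
    (LD_W_ABS, 0, 0, OFF_ARG2_LO),      # load prot
    (JSET_K, 0, 1, PROT_EXEC),          # PROT_EXEC clear: jump to ALLOW
    (RET_K, 0, 0, SECCOMP_RET_ERRNO | errno.EPERM),
    (RET_K, 0, 0, SECCOMP_RET_ALLOW),
]


def assemble(filter_):
    """Pack instructions as an array of struct sock_filter (8 bytes each)."""
    return b"".join(struct.pack("=HBBI", *i) for i in filter_)


def seccomp_data(nr, arch=AUDIT_ARCH_X86_64, args=(0,) * 6):
    """Build the 64-byte little-endian struct the filter inspects."""
    return struct.pack("<iIQ6Q", nr, arch, 0, *args)


def run_filter(filter_, data):
    """Simulate the filter; supports only the opcodes used above."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = filter_[pc]
        pc += 1
        if code == LD_W_ABS:
            (acc,) = struct.unpack_from("<I", data, k)
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == JSET_K:
            pc += jt if acc & k else jf
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


if __name__ == "__main__":
    call = seccomp_data(SYS_MMAP, args=(0, 4096, 0x1 | PROT_EXEC, 0, 0, 0))
    print(hex(run_filter(FILTER, call)))
```

Because every jump is forward-only, evaluation always reaches a return instruction, which is the termination guarantee mentioned above. The bytes from assemble() are in the layout the kernel expects in a struct sock_fprog.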
06

Seccomp Notify and Supervisor Model

Introduced in Linux 5.0, the SECCOMP_RET_USER_NOTIF action enables a supervisor process to handle system call violations on behalf of a sandboxed process (the target).

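  • Mechanism: When a filtered syscall is encountered, the kernel suspends the target and notifies a userspace supervisor via a file descriptor. The supervisor can inspect the call and its arguments, then either supply a return value or error on the target's behalf or, since Linux 5.5, instruct the kernel to continue the call unmodified. It cannot rewrite the call's arguments.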
  • Use Case: Enables complex security policies that require runtime context unavailable to the static BPF filter, such as dynamic pathname resolution or user-based access decisions.
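  • Tooling: Used by container managers such as LXD to emulate selected privileged calls (for example mknod and mount) on behalf of unprivileged containers. Landlock, by contrast, is a separate Linux Security Module and does not rely on seccomp notifications.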
SECURE ENCLAVE EXECUTION

How Seccomp Works

Seccomp operates by filtering system calls, the fundamental requests a process makes to the Linux kernel for services like file access or network communication. A process can enter a restrictive mode where its permitted syscalls are defined by a Berkeley Packet Filter (BPF) program loaded into the kernel. The filter evaluates each syscall attempt: allowed calls proceed, while blocked ones fail with an error or terminate the offending thread or process, depending on the action the filter returns. This mechanism enforces the principle of least privilege at the most granular OS level.

There are two primary modes: seccomp-bpf (mode 2), which allows fine-grained, programmable filtering, and the stricter seccomp mode 1, which only permits read(), write(), _exit(), and sigreturn(). In modern containerized and secure enclave execution environments, seccomp is a critical layer for sandboxing untrusted code, such as AI agents executing tool calls, by preventing access to dangerous syscalls like execve or ptrace. Seccomp is not itself a Linux Security Module (LSM), but it is often combined with LSMs like AppArmor for defense-in-depth.

SECCOMP

Frequently Asked Questions

Seccomp (secure computing mode) is a fundamental Linux kernel security feature for sandboxing applications. These questions address its core mechanisms, practical use, and role in securing AI systems.

Seccomp (secure computing mode) is a Linux kernel security feature that restricts the system calls a process is permitted to make, effectively sandboxing it to a defined subset of the kernel's interface. It works by applying a filter policy, written in Berkeley Packet Filter (BPF) bytecode, to the process. When the process attempts to make a system call (e.g., open, write, execve), the kernel evaluates the filter rules. The filter can allow the call, kill the thread or process, return an error, hand the decision to a ptrace tracer via SECCOMP_RET_TRACE, or notify a userspace supervisor via SECCOMP_RET_USER_NOTIF. The more flexible mode, seccomp-bpf, allows fine-grained filtering, while the most restrictive mode, seccomp-strict, only permits read, write, _exit, and sigreturn. This mechanism enforces the principle of least privilege at the most fundamental OS level.

Prasad Kumkar

About the author

Prasad Kumkar

CEO & MD, Inference Systems

Prasad Kumkar is the CEO & MD of Inference Systems and writes about AI systems architecture, LLM infrastructure, model serving, evaluation, and production deployment. Over 5+ years, he has worked across computer vision models, L5 autonomous vehicle systems, and LLM research, with a focus on taking complex AI ideas into real-world engineering systems.

His work and writing cover AI systems, large language models, AI agents, multimodal systems, autonomous systems, inference optimization, RAG, evaluation, and production AI engineering.