Message-ID: <20240305-attentat-robust-b0da8137b7df@brauner>
Date: Tue, 5 Mar 2024 09:59:47 +0100
From: Christian Brauner <brauner@...nel.org>
To: Kees Cook <keescook@...omium.org>
Cc: Adrian Ratiu <adrian.ratiu@...labora.com>, 
	linux-fsdevel@...r.kernel.org, kernel@...labora.com, linux-security-module@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org, Guenter Roeck <groeck@...omium.org>, 
	Doug Anderson <dianders@...omium.org>, Jann Horn <jannh@...gle.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, Randy Dunlap <rdunlap@...radead.org>, 
	Mike Frysinger <vapier@...omium.org>
Subject: Re: [PATCH v2] proc: allow restricting /proc/pid/mem writes

> > Uhm, this will break the seccomp notifier, no? So you can't turn on
> > SECURITY_PROC_MEM_RESTRICT_WRITE when you want to use the seccomp
> > notifier to do system call interception and rewrite memory locations of
> > the calling task, no? Which is very much relied upon in various
> > container managers and possibly other security tools.
> > 
> > Which means that you can't turn this on in any of the regular distros.
> 
> FWIW, it's a run-time toggle, but yes, let's make sure this works
> correctly.
> 
> > So you need to either account for the calling task being a seccomp
> > supervisor for the task whose memory it is trying to access or you need
> > to provide a migration path by adding an api that lets callers perform
> > these writes through the seccomp notifier.
> 
> How do seccomp supervisors that use USER_NOTIF do those kinds of
> memory writes currently? I thought they were actually using ptrace?
> Everything I'm familiar with is just using SECCOMP_IOCTL_NOTIF_ADDFD,
> and not doing fancy memory pokes.

For example, incus has a seccomp supervisor in which each container
gets its own goroutine that is responsible for handling system call
interception.

When a container is started, the container runtime connects to an AF_UNIX
socket to register with the seccomp supervisor. It stays connected until
the container stops. Every time a system call is performed that is
registered in the seccomp notifier filter, the container runtime sends an
AF_UNIX message to the seccomp supervisor. This includes the following fds:

- the pidfd of the task that performed the system call (we should
  actually replace this with SO_PEERPIDFD now that we have that)
- an fd for the task's memory, opened on /proc/<pid>/mem
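
To make the fd handoff concrete, here is a rough sketch of passing two
descriptors in one AF_UNIX message via SCM_RIGHTS. This is not the incus
code (which is Go). The function names and the "notify" payload are made up
for illustration, and the demo uses temporary files as stand-ins for the
real pidfd and /proc/<pid>/mem fd so that it runs anywhere. It assumes
Python 3.9+ for socket.send_fds() and socket.recv_fds().

```python
import os
import socket
import tempfile


def send_task_fds(sock, pidfd, mem_fd):
    """Runtime side: send both descriptors as SCM_RIGHTS ancillary data."""
    # The payload is a hypothetical marker; a real protocol would carry
    # the seccomp notification data here.
    socket.send_fds(sock, [b"notify"], [pidfd, mem_fd])


def recv_task_fds(sock):
    """Supervisor side: receive the payload and up to two descriptors."""
    msg, fds, _flags, _addr = socket.recv_fds(sock, 64, 2)
    return msg, fds


def demo():
    # A socketpair stands in for the runtime <-> supervisor connection.
    runtime_end, supervisor_end = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM
    )
    # Stand-ins (assumption): temporary files instead of a real pidfd and
    # a real /proc/<pid>/mem descriptor.
    with tempfile.TemporaryFile() as fake_pidfd, \
            tempfile.TemporaryFile() as fake_mem:
        fake_mem.write(b"task memory")
        fake_mem.flush()
        send_task_fds(runtime_end, fake_pidfd.fileno(), fake_mem.fileno())
        msg, (pidfd, mem_fd) = recv_task_fds(supervisor_end)
        try:
            # The received fd is a new descriptor for the same open file.
            data = os.pread(mem_fd, 11, 0)
        finally:
            os.close(pidfd)
            os.close(mem_fd)
    runtime_end.close()
    supervisor_end.close()
    return msg, data
```

In a real setup the sender would presumably obtain the first descriptor
with something like os.pidfd_open(pid) and the second with
os.open("/proc/%d/mem" % pid, os.O_RDWR).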

The seccomp supervisor will then perform the system call interception
including the required memory reads and writes.
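
As a rough illustration of what "memory reads and writes" means here: once
the supervisor holds an fd on /proc/<pid>/mem, the file offset is simply the
virtual address in the target task, so pread()/pwrite() are enough. The
helper names below are hypothetical and this is only a sketch, not what
incus actually does. A real supervisor should also check that the
notification is still valid (SECCOMP_IOCTL_NOTIF_ID_VALID) after opening or
receiving the fd and before trusting what it read.

```python
import os


def read_task_memory(mem_fd, addr, length):
    """Read `length` bytes at virtual address `addr` via the mem fd."""
    return os.pread(mem_fd, length, addr)


def write_task_memory(mem_fd, addr, data):
    """Write `data` at virtual address `addr` via the mem fd."""
    written = os.pwrite(mem_fd, data, addr)
    if written != len(data):
        raise OSError("short write to task memory")
    return written


def rewrite_c_string(mem_fd, addr, old, new):
    """Replace a NUL-terminated string at `addr` if it currently equals `old`.

    Returns True if the string was rewritten, False if it did not match.
    """
    if len(new) > len(old):
        # Never write past the buffer the task originally provided.
        raise ValueError("replacement must not be longer than the original")
    current = read_task_memory(mem_fd, addr, len(old) + 1)
    if current != old + b"\0":
        return False
    write_task_memory(mem_fd, addr, new + b"\0")
    return True
```

With a real mem fd, `addr` would come from the intercepted system call's
arguments in the seccomp notification, e.g. the pointer to a path string.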

There's no ptrace involved. That was the whole point of the seccomp
notifier. :)
