Message-ID: <CACT4Y+Z3ismwdeqa7iMo0JVD-u-nvSmN2eS5qJ5tUqXT9NjWcw@mail.gmail.com>
Date: Tue, 18 Feb 2025 18:34:34 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: gourry@...rry.net
Cc: krisman@...labora.com, tglx@...utronix.de, luto@...nel.org, 
	peterz@...radead.org, keescook@...omium.org, gregory.price@...verge.com, 
	Marco Elver <elver@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] syscall_user_dispatch: Allow allowed range wrap-around

On Tue, 18 Feb 2025 at 17:58, Gregory Price <gourry@...rry.net> wrote:
>
> On Tue, Feb 18, 2025 at 05:04:34PM +0100, Dmitry Vyukov wrote:
> > There are two possible scenarios for syscall filtering:
> >  - having a trusted/allowed range of PCs, and intercepting everything else
> >  - or the opposite: a single untrusted/intercepted range and allowing
> >    everything else
> > The current implementation only allows the former use case due to
> > allowed range wrap-around check. Allow the latter use case as well
> > by removing the wrap-around check.
> > The latter use case is relevant for any kind of sandboxing scenario,
> > or monitoring behavior of a single library. If a program wants to
> > intercept syscalls for PC range [START, END) then it needs to call:
> > prctl(..., END, -(END-START), ...);
>
> I don't necessarily disagree with the idea, but this sounds like using
> the wrong tool for the job.  The purpose of SUD was for emulating
> foreign OS system calls of entire programs - not a single library.
>
> The point being that it's very difficult to sandbox an individual
> library when you can't ensure it won't allocate resources outside the
> monitored bounds (this would be very difficult to guarantee, at least).
>
> If the intent is to load and re-use a single foreign-OS library, this
> change seems to be the question of "why not allow multiple ranges?",
> and you'd be on your way to reimplementing seccomp or BPF.

The problem with seccomp BPF is that the filter is inherited across
fork/exec, so it can't be combined with SIGSYS handling and a
fine-grained custom user-space policy. USER_DISPATCH is much more
flexible in this regard.

Re allocating resources outside of monitored bounds: this is exactly
what syscall filtering is for, right :)
If we install a filter on a library/sandbox, we can control and
prevent it from allocating any more executable pages outside of the
range.

The motivation is sandboxing of libraries loaded within a known fixed
address range, while non-sandboxed code can live on both sides of the
sandboxed range (say, non-PIE binary at lower addresses, and libc at
higher addresses).
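
To make the wrap-around arithmetic behind the quoted
prctl(..., END, -(END-START), ...) call concrete, here is a minimal
sketch. The struct and function names are hypothetical, and the
comparison in sud_pc_allowed() is only an assumed model of an
unsigned "pc - offset < len" range check, not a copy of the kernel
code. It shows that an allowed range starting at END with length
-(END-START) modulo 2^64 covers everything except [START, END).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the offset/len arguments passed to prctl(). */
struct sud_range {
	uint64_t offset;	/* start of the allowed PC range */
	uint64_t len;		/* length of the allowed range; may wrap past 2^64 */
};

/*
 * Compute the allowed range that leaves only [start, end) intercepted:
 * the range begins at END and is -(END-START) bytes long (mod 2^64),
 * so it wraps around the top of the address space and stops at START.
 */
static struct sud_range sud_intercept_range(uint64_t start, uint64_t end)
{
	assert(start < end);
	struct sud_range r = {
		.offset = end,
		.len = (uint64_t)0 - (end - start),
	};
	return r;
}

/*
 * Assumed model of the range check: a PC is allowed (not intercepted)
 * when the unsigned difference pc - offset is below len.
 */
static bool sud_pc_allowed(struct sud_range r, uint64_t pc)
{
	return pc - r.offset < r.len;
}
```

With START = 0x10000000 and END = 0x20000000 this gives
offset = 0x20000000 and len = 0xFFFFFFFFF0000000: PCs inside
[START, END) fail the check and would be intercepted, while PCs on
either side of the range pass it.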
