Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+ZJhCD3Y-nNKb42K0561tOceYKRNm6Yi8r9-KwoWfvkbQ@mail.gmail.com>
Date: Wed, 19 Feb 2025 09:54:28 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Gregory Price <gourry@...rry.net>
Cc: krisman@...labora.com, tglx@...utronix.de, luto@...nel.org, 
	peterz@...radead.org, keescook@...omium.org, gregory.price@...verge.com, 
	Marco Elver <elver@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] syscall_user_dispatch: Allow allowed range wrap-around

On Tue, 18 Feb 2025 at 19:00, Gregory Price <gourry@...rry.net> wrote:
>
> On Tue, Feb 18, 2025 at 06:34:34PM +0100, Dmitry Vyukov wrote:
> > On Tue, 18 Feb 2025 at 17:58, Gregory Price <gourry@...rry.net> wrote:
> > > If the intent is to load and re-use a single foreign-OS library, this
> > > change seems to be the question of "why not allow multiple ranges?",
> > > and you'd be on your way to reimplementing seccomp or BPF.
> >
> > The problem with seccomp BPF is that the filter is inherited across
> > fork/exec which can't be used with SIGSYS and fine-grained custom
> > user-space policy. USER_DISPATCH is much more flexible in this regard.
>
> It's also fundamentally not a security-sufficient interposition system.
>
> > Re allocating resources outside of monitored bounds: this is exactly
> > what syscall filtering is for, right :)
>
> No.  SUD's purpose is to catch foreign-OS syscall execution.

My understanding is that targeting concrete end problems is not the
kernel's approach or design philosophy. Instead, the kernel aims to
provide flexible _primitives_ that can be used to solve various end
problems. You don't sell pencils for drawing trees; you just sell good
pencils.

E.g. if two end problems A and B require 98% of the same primitives,
the kernel wouldn't implement two completely independent subsystems
that duplicate 98% of the code. Instead it would provide flexible
primitives that can be used to solve both A and B (and the as-yet
unknown C and D in the future).


> You *can* do hacky stuff like interposing on libc, but you can do
> hacky things with bpf too.
>
> > If we install a filter on a library/sandbox, we can control and
> > prevent it from allocating any more executable pages outside of the
> > range.
> >
> > The motivation is sandboxing of libraries loaded within a known fixed
> > address range, while non-sandboxed code can live on both sides of the
> > sandboxed range (say, non-pie binary at lower addresses, and libc at
> > higher addresses).
>
> My question is why you aren't doing the opposite.  Exempt the known good
> ranges and hook everything else.  This actually makes it easier to
> ensure the software you're hooking doesn't escape interposition.

The restricted code is a single contiguous region. Allowed code lives
on both sides: the non-PIE binary at the lowest addresses and dynamic
libs at the highest addresses. There is not enough space before or
after to map a large enough contiguous region for the restricted code.
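
To make the layout concrete, here is a minimal sketch. It assumes the
allowed-range check has the form "ip - offset < len" on unsigned
values, so a range whose end wraps past the top of the address space
describes "everything except one region in the middle". The struct,
the function names, and the addresses are hypothetical illustrations,
not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an allowed range: [offset, offset + len) modulo 2^64. */
struct allowed_range {
	uint64_t offset;
	uint64_t len;
};

/*
 * Unsigned subtraction wraps modulo 2^64, so this single comparison also
 * covers a range that runs past the top of the address space and continues
 * from address 0.
 */
static bool ip_is_allowed(struct allowed_range r, uint64_t ip)
{
	return ip - r.offset < r.len;
}

/*
 * Build the allowed range that is the complement of the restricted region
 * [start, end): it begins at `end`, wraps through 0, and stops just
 * before `start`.
 */
static struct allowed_range allow_all_except(uint64_t start, uint64_t end)
{
	assert(start < end);
	/* start - end wraps to 2^64 - (end - start), the complement's length. */
	struct allowed_range r = { .offset = end, .len = start - end };
	return r;
}
```

With a restricted region of [0x10000000, 0x20000000), a non-PIE binary
at 0x400000 and dynamic libs near 0x7f0000000000 both fall inside the
single wrapped range, while every address inside the restricted region
falls outside it.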

> You can use the SIGSYS register data (instruction pointer) to determine
> whether to act on the syscall or pass it through.

Too expensive: the in-kernel range check is a few additional
instructions, versus microseconds for the kernel transition,
sigcontext setup, and sigreturn.

> Like I said, I don't necessarily disagree with the change, just a bit
> concerned about the direction this takes SUD.  It's not a sufficient
> interface to isolate the behavior of a single library, and this change
> naturally begs the question "If we do this, why not implement an entire
> multi-range filtering system? Why stop at one range?"

That's a harder question, and I don't think we need to answer it right
now. We can consider this proposal in isolation: this is where we stop
for now. It keeps all existing uses intact and allows more cases with
a trivial code change (it actually deletes code).
