Message-ID: <CACT4Y+YCSk8P-5gh06pSRzLDT7m+L_4wj_yuj4hcwupjk_b=Ug@mail.gmail.com>
Date: Wed, 21 May 2025 17:05:33 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: krisman@...labora.com, luto@...nel.org, peterz@...radead.org,
keescook@...omium.org, gregory.price@...verge.com,
Marco Elver <elver@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/3] syscall_user_dispatch: Allow allowed range wrap-around

On Sat, 8 Mar 2025 at 12:19, Thomas Gleixner <tglx@...utronix.de> wrote:
>
> On Mon, Feb 24 2025 at 09:45, Dmitry Vyukov wrote:
> > There are two possible scenarios for syscall filtering:
> > - having a trusted/allowed range of PCs, and intercepting everything else
> > - or the opposite: a single untrusted/intercepted range and allowing
> > everything else
> > The current implementation only allows the former use case due to
> > allowed range wrap-around check. Allow the latter use case as well
> > by removing the wrap-around check.
> > The latter use case is relevant for any kind of sandboxing scenario,
> > or monitoring behavior of a single library. If a program wants to
> > intercept syscalls for PC range [START, END) then it needs to call:
> > prctl(..., END, -(END-START), ...);
> > which sets a wrap-around range that excludes everything
> > besides [START, END).
>
> That's not really intuitive and the implementation changes the prctl()
> behaviour in a non backwards compatible way.
>
> Can we please keep the current behaviour and have a new mode. Something
> like:
>
> # define PR_SYS_DISPATCH_OFF 0
> # define PR_SYS_DISPATCH_ON 1
> # define PR_SYS_DISPATCH_EXCLUSIVE_ON PR_SYS_DISPATCH_ON
> # define PR_SYS_DISPATCH_INCLUSIVE_ON 2
>
> That keeps the current mode backwards compatible and avoids the oddity of
>
> prctl(..., END, -(END-START), ...);
>
> i.e. this is clearly and obviously distinguishable for user space:
>
> prctl(..., PR_SYS_DISPATCH_EXCLUSIVE_ON, END, END - START, ...);
> prctl(..., PR_SYS_DISPATCH_INCLUSIVE_ON, END, END - START, ...);
>
> Which makes a lot of sense because these two modes are distinctly
> different, no?
>
> PR_SYS_DISPATCH_INCLUSIVE_ON will fail on older kernels and both modes
> have a sanity check. PR_SYS_DISPATCH_INCLUSIVE_ON should at least check
> for a zero length dispatcher region.
>
> Aside of the better user interface this avoids the in_compat_syscall()
> hack. Because then set_syscall_user_dispatch() does the range inversion
> and that works completely independent of compat.
>
> > kernel/entry/syscall_user_dispatch.c | 9 +++------
> > kernel/sys.c | 6 ++++++
> > 2 files changed, 9 insertions(+), 6 deletions(-)
>
> This clearly lacks an update of
>
> Documentation/admin-guide/syscall-user-dispatch.rst

I like this!
I've just sent v3 with this interface.
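
For readers following the thread, here is a minimal user-space sketch of the range inversion discussed above. It is not the v3 code: struct allowed_range, pc_is_allowed(), invert_range() and inversion_demo() are hypothetical names made up for illustration. It only assumes that the allowed-range test is an unsigned "pc - offset < len" comparison, so that intercepting [START, END) is the same as allowing the wrap-around range that starts at END and has length -(END-START).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for an allowed PC range; not a kernel structure. */
struct allowed_range {
	uint64_t offset;
	uint64_t len;
};

/*
 * Unsigned range test: a PC below offset wraps to a huge value and
 * fails the comparison, so one compare covers both bounds.
 */
static bool pc_is_allowed(struct allowed_range r, uint64_t pc)
{
	return pc - r.offset < r.len;
}

/*
 * Convert "intercept [start, start + len)" into the complementary
 * allowed range, which begins at the end of the intercepted region
 * and wraps around the address space.
 */
static bool invert_range(uint64_t start, uint64_t len,
			 struct allowed_range *out)
{
	if (len == 0)
		return false;		/* reject a zero-length region */
	out->offset = start + len;	/* END */
	out->len = 0 - len;		/* -(END - START), modulo 2^64 */
	return true;
}

/* Intercept [0x1000, 0x1100) and check PCs around both boundaries. */
static void inversion_demo(void)
{
	struct allowed_range r;

	assert(invert_range(0x1000, 0x100, &r));
	assert(!pc_is_allowed(r, 0x1000));	/* first intercepted PC */
	assert(!pc_is_allowed(r, 0x10ff));	/* last intercepted PC */
	assert(pc_is_allowed(r, 0x0fff));	/* just below the region */
	assert(pc_is_allowed(r, 0x1100));	/* just above the region */
	assert(pc_is_allowed(r, UINT64_MAX));	/* top of address space */
}
```

Doing the inversion in one place means the caller passes plain START and length in both modes and never has to compute a negated length itself.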