Message-ID: <261A8604-DA8D-468A-83BB-F530D5639A43@oracle.com>
Date: Wed, 19 Nov 2025 00:20:34 +0000
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Jonathan Corbet <corbet@....net>,
        Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
        K Prateek Nayak <kprateek.nayak@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Arnd Bergmann <arnd@...db.de>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: Re: [patch V3 07/12] rseq: Implement syscall entry work for time slice extensions



> On Oct 29, 2025, at 6:22 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
> 
> The kernel sets SYSCALL_WORK_RSEQ_SLICE when it grants a time slice
> extension. This allows to handle the rseq_slice_yield() syscall, which is
> used by user space to relinquish the CPU after finishing the critical
> section for which it requested an extension.
> 
> In case the kernel state is still GRANTED, the kernel resets both kernel
> and user space state with a set of sanity checks. If the kernel state is
> already cleared, then this raced against the timer or some other interrupt
> and just clears the work bit.
> 
> Doing it in syscall entry work allows to catch misbehaving user space,
> which issues a syscall from the critical section. Wrong syscall and
> inconsistent user space result in a SIGSEGV.
> 
> 

[…]

> +/*
> + * Invoked from syscall entry if a time slice extension was granted and the
> + * kernel did not clear it before user space left the critical section.
> + */
> +void rseq_syscall_enter_work(long syscall)
> +{

[…]

> 
> +	curr->rseq.slice.state.granted = false;
> +	/*
> +	 * Clear the grant in user space and check whether this was the
> +	 * correct syscall to yield. If the user access fails or the task
> +	 * used an arbitrary syscall, terminate it.
> +	 */
> +	if (put_user(0U, &curr->rseq.usrptr->slice_ctrl.all) || syscall != __NR_rseq_slice_yield)
> +		force_sig(SIGSEGV);
> +}

I have been working with our database team on adopting the slice extension API.
They are hitting the case where a system call is made within the slice extension window, and the
process dies with SIGSEGV.

Apparently, due to layering, it will be hard to guarantee that no system call is made inside the
slice extension window. For the DB use case, it is fine to terminate the slice extension when a
system call is made, but killing the process will not work.
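
To make the difference concrete, here is a minimal user-space model of the two behaviours, not kernel code. All names (model_slice, enter_work_strict(), enter_work_lenient(), MODEL_NR_SLICE_YIELD) are made up for illustration, and the put_user() failure path from the patch is not modelled. The "lenient" variant is only a sketch of the behaviour requested above, not a proposed patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for __NR_rseq_slice_yield (assumption, not the real number). */
#define MODEL_NR_SLICE_YIELD 1000L

enum model_action {
	MODEL_CONTINUE,	/* the syscall proceeds normally */
	MODEL_KILL,	/* models force_sig(SIGSEGV) */
};

/* Hypothetical model of the per-task slice extension state. */
struct model_slice {
	bool granted;		/* kernel-side grant */
	unsigned int user_ctrl;	/* models the user space slice_ctrl word */
};

/* Behaviour of the quoted patch: a granted slice plus any other syscall is fatal. */
static enum model_action enter_work_strict(struct model_slice *s, long syscall)
{
	/* Grant already cleared (raced with timer/interrupt): nothing to do. */
	if (!s->granted)
		return MODEL_CONTINUE;

	/* Revoke the grant on both the kernel and the user side. */
	s->granted = false;
	s->user_ctrl = 0;

	return syscall == MODEL_NR_SLICE_YIELD ? MODEL_CONTINUE : MODEL_KILL;
}

/* Behaviour asked for above: revoke the extension, but never kill the task. */
static enum model_action enter_work_lenient(struct model_slice *s, long syscall)
{
	(void)syscall;	/* any syscall simply ends the extension */

	if (s->granted) {
		s->granted = false;
		s->user_ctrl = 0;
	}
	return MODEL_CONTINUE;
}
```

In both variants the extension ends at syscall entry; they differ only in whether an unexpected syscall terminates the task.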

Thanks,
-Prakash
