Message-ID: <f0f681a0-22b4-45f4-85a1-18f140286cbe@efficios.com>
Date: Wed, 13 Nov 2024 14:36:58 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Peter Zijlstra <peterz@...radead.org>,
 Prakash Sangappa <prakash.sangappa@...cle.com>
Cc: linux-kernel@...r.kernel.org, rostedt@...dmis.org, tglx@...utronix.de,
 daniel.m.jordan@...cle.com
Subject: Re: [RFC PATCH 0/4] Scheduler time slice extension

On 2024-11-13 13:50, Peter Zijlstra wrote:
> On Wed, Nov 13, 2024 at 12:01:22AM +0000, Prakash Sangappa wrote:
> 
>> This patch set implements the above mentioned 50us extension time as posted
>> by Peter. But instead of using restartable sequences as API to set the flag
>> to request the extension, this patch proposes a new API with use of a per
>> thread shared structure implementation described below. This shared structure
> is accessible in both user space and kernel. The user thread will set the
>> flag in this shared structure to request execution time extension.
> 
> But why -- we already have rseq, glibc uses it by default. Why add yet
> another thing?

Indeed, what I'm not seeing in this RFC patch series cover letter is an
explanation that justifies adding yet another per-thread memory area
shared between kernel and userspace when we have extensible rseq
already.

Peter, was there anything fundamentally wrong with your approach based
on rseq? https://lore.kernel.org/lkml/20231030132949.GA38123@noisy.programming.kicks-ass.net

The main thing I wonder is whether loading the rseq delay-resched flag
on return to userspace is too late in your patch. Also, I'm not sure it is
realistic to require that no system calls be made within the extended
time slice. Consider this scenario:

A) userspace grabs lock
    - set rseq delay resched flag
B) syscall
    - reschedule
     [...]
    - return to userspace, load rseq delay-resched flag from userspace (too late)
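The scenario above can be sketched as a toy userspace model. This is not kernel code: the `toy_*` names, the struct layout, and the two check points are assumptions made up for illustration, and the model ignores the bound on the extension. It only shows that a flag read on return to userspace cannot influence a reschedule that already happened inside the syscall.

```c
#include <stdbool.h>

/* Hypothetical model of one thread as seen by a toy scheduler. */
struct toy_thread {
    bool delay_resched;        /* stands in for the rseq delay-resched flag */
    bool holds_lock;           /* userspace lock taken in step A */
    bool preempted_with_lock;  /* outcome the hint is meant to avoid */
};

/* Where the toy scheduler looks at the flag (assumed names). */
enum toy_check_point {
    TOY_CHECK_ON_RETURN_TO_USER,
    TOY_CHECK_IN_SCHEDULER,
};

/* Step A: userspace grabs the lock and sets the flag. */
void toy_grab_lock(struct toy_thread *t)
{
    t->holds_lock = true;
    t->delay_resched = true;
}

/* Step B: a syscall during which a reschedule is requested. */
void toy_syscall_with_resched(struct toy_thread *t, enum toy_check_point where)
{
    /* Only an early check can see the flag before the context switch. */
    bool honor_hint = (where == TOY_CHECK_IN_SCHEDULER) && t->delay_resched;

    if (!honor_hint) {
        /* The thread is switched out here, possibly while holding the lock. */
        t->preempted_with_lock = t->holds_lock;
    }
    /*
     * Return to userspace: with TOY_CHECK_ON_RETURN_TO_USER the flag would
     * be loaded here, after the switch above has already taken place.
     */
}

/* Runs steps A and B; returns true if the lock holder was preempted. */
bool toy_run_scenario(enum toy_check_point where)
{
    struct toy_thread t = { 0 };

    toy_grab_lock(&t);
    toy_syscall_with_resched(&t, where);
    return t.preempted_with_lock;
}
```

In this model `toy_run_scenario(TOY_CHECK_ON_RETURN_TO_USER)` reports the lock holder as preempted, while `toy_run_scenario(TOY_CHECK_IN_SCHEDULER)` does not.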

I would have thought loading the delay-resched flag should be attempted much
earlier in the scheduler code. Perhaps we could do this from a
pagefault-disabled critical section, and accept that this hint may be a no-op
if the rseq page happens to be swapped out (which is really unlikely). This is
similar to the "on_cpu" sched state rseq extension RFC I posted a while back,
which needed to be accessed from the scheduler:

   https://lore.kernel.org/lkml/20230517152654.7193-1-mathieu.desnoyers@efficios.com/
   https://lore.kernel.org/lkml/20230529191416.53955-1-mathieu.desnoyers@efficios.com/

And we'd leave the delay-resched load in place on return to userspace, so
in the unlikely scenario where it is swapped out, at least it gets paged
back in at that point.
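A rough userspace model of that two-step idea follows. All `toy_*` names are hypothetical and the "page" is just a struct with a residency bit. In the kernel, the early load would presumably sit between `pagefault_disable()` and `pagefault_enable()`; nothing below is taken from an actual patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the user page that holds the rseq area. */
struct toy_user_page {
    bool resident;               /* false models "swapped out" */
    unsigned int delay_resched;  /* flag value written by userspace */
};

/*
 * Models a load done with page faults disabled: it fails instead of
 * faulting the page in. Returns 0 on success, -1 if the page is not resident.
 */
int toy_load_nofault(const struct toy_user_page *p, unsigned int *val)
{
    if (!p->resident)
        return -1;
    *val = p->delay_resched;
    return 0;
}

/* Early, scheduler-side check: a failed load turns the hint into a no-op. */
bool toy_sched_should_delay(const struct toy_user_page *p)
{
    unsigned int val;

    if (toy_load_nofault(p, &val) != 0)
        return false;
    return val != 0;
}

/*
 * Load on return to userspace: faulting is allowed here, so the page is
 * brought back in and later early checks can succeed.
 */
unsigned int toy_load_on_return_to_user(struct toy_user_page *p)
{
    p->resident = true;  /* models the page-in */
    return p->delay_resched;
}
```

With a swapped-out page the early check simply ignores the hint. The return-to-userspace load then makes the page resident again, so the next early check sees the flag.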

Feel free to let me know if I'm missing an important point and/or saying
nonsense here.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

