linux-kernel - Re: [RFC][PATCH 1/2] sched: Extended scheduler time slice

Message-ID: <1A67C4F1-F07E-477C-9781-071546AE3A8B@oracle.com>
Date: Wed, 5 Feb 2025 21:08:47 +0000
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: Steven Rostedt <rostedt@...dmis.org>
CC: Joel Fernandes <joel@...lfernandes.org>,
        Peter Zijlstra <peterz@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-trace-kernel@...r.kernel.org" <linux-trace-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ankur Arora <ankur.a.arora@...cle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "x86@...nel.org" <x86@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "luto@...nel.org" <luto@...nel.org>,
        "bp@...en8.de" <bp@...en8.de>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "hpa@...or.com" <hpa@...or.com>,
        "juri.lelli@...hat.com" <juri.lelli@...hat.com>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
        "willy@...radead.org" <willy@...radead.org>,
        "mgorman@...e.de" <mgorman@...e.de>,
        "jon.grimm@....com" <jon.grimm@....com>,
        "bharata@....com" <bharata@....com>,
        "raghavendra.kt@....com" <raghavendra.kt@....com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Konrad Wilk <konrad.wilk@...cle.com>,
        "jgross@...e.com" <jgross@...e.com>,
        "Andrew.Cooper3@...rix.com" <Andrew.Cooper3@...rix.com>,
        Vineeth Pillai <vineethrp@...gle.com>,
        Suleiman Souhlal <suleiman@...gle.com>,
        Ingo Molnar <mingo@...nel.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Clark Williams <clark.williams@...il.com>,
        "bigeasy@...utronix.de" <bigeasy@...utronix.de>,
        "daniel.wagner@...e.com" <daniel.wagner@...e.com>,
        Joseph Salisbury <joseph.salisbury@...cle.com>,
        "broonie@...il.com" <broonie@...il.com>
Subject: Re: [RFC][PATCH 1/2] sched: Extended scheduler time slice



> On Feb 5, 2025, at 5:16 AM, Steven Rostedt <rostedt@...dmis.org> wrote:
> 
> On Wed, 5 Feb 2025 00:09:51 -0500
> Joel Fernandes <joel@...lfernandes.org> wrote:
> 
>> On Tue, Feb 4, 2025 at 10:03 PM Steven Rostedt <rostedt@...dmis.org> wrote:
>>> 
>>> On Tue, 4 Feb 2025 19:56:09 -0500
>>> Joel Fernandes <joel@...lfernandes.org> wrote:
>>> 
>>>>> Here is the RFC I had sent that Peter is referring  
>>>> 
>>>> FWIW, I second the idea of a new syscall for this than (ab)using rseq
>>>> and also independence from preemption method. I agree that something
>>>> generic is better than relying on preemption method.  
>>> 
>>> So you are for adding another user/kernel memory mapped section?  
>> 
>> I don't personally mind that.
> 
> I'm glad you don't personally mind it. Are you going to help maintain
> another memory mapped section?
> 

The proposed new syscall/API provides a per-thread shared mapped area
(a shared structure) allocated from pinned memory pages, so the kernel
can access it without a copyin/copyout.

The idea is that this helps in kernel code paths where we cannot take a
page fault.
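
To make the idea concrete, here is a rough userspace-only sketch. It is
not the RFC's API: the structure layout, the field name and the helper
names below are all hypothetical, and mmap() plus mlock() only stand in
for the pinned pages the kernel would allocate. It just shows a thread
setting and clearing a "please delay resched" flag in a page that stays
resident.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical layout; the structure in the actual RFC may differ. */
struct task_shared_area {
	/* Set by the thread around a short critical section. */
	_Atomic uint32_t delay_resched_request;
};

/*
 * Map one anonymous page and try to pin it with mlock(), as a stand-in
 * for a kernel-allocated pinned page. *pinned is set to 1 if mlock()
 * succeeded and 0 otherwise (for example, RLIMIT_MEMLOCK too low).
 * Returns NULL if the mapping itself failed.
 */
static struct task_shared_area *shared_area_create(int *pinned)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;
	*pinned = (mlock(p, len) == 0);
	return p; /* anonymous mappings start zero-filled */
}

/* Ask (hypothetically) not to be preempted for a short while. */
static void shared_area_request_delay(struct task_shared_area *a)
{
	assert(a != NULL);
	atomic_store_explicit(&a->delay_resched_request, 1,
			      memory_order_release);
}

/* Withdraw the request once the critical section is done. */
static void shared_area_clear_delay(struct task_shared_area *a)
{
	assert(a != NULL);
	atomic_store_explicit(&a->delay_resched_request, 0,
			      memory_order_release);
}

/* Unmap the page; munmap() also drops the mlock() on it. */
static void shared_area_destroy(struct task_shared_area *a)
{
	if (a != NULL)
		munmap(a, (size_t)sysconf(_SC_PAGESIZE));
}
```

A thread would call shared_area_request_delay() before taking a
short-held lock and shared_area_clear_delay() after dropping it. In the
real proposal the kernel side would read the flag directly from the
pinned page, which is the point of avoiding copyin/copyout.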


>> 
>>> And you are also OK with allowing any task to make an RT task wait longer?
>>> 
>>> Putting my RT hat back on, I would definitely disable that on any system
>>> that requires RT.  
>> 
>> Just so I understand, you are basically saying that you want this
>> feature only for FAIR tasks, and allowing RT tasks to extend time
>> slice might actually hurt the latency of (other) RT tasks on the
>> system right? This assumes PREEMPT_RT because the latency is 50us
>> right?
> 
> RT tasks don't have a time slice. They are affected by events. An external
> interrupt coming in, or a timer going off that states something is
> happening. Perhaps we could use this for SCHED_RR or maybe even
> SCHED_DEADLINE, as those do have time slices.
> 
> But if it does get used, it should only be used when the task being
> scheduled is the same SCHED_RR priority, or if SCHED_DEADLINE will not fail
> its guarantees.
> 
>> 
>> But in a poorly designed system, if you have RT tasks at higher
>> priority that preempt things lower in RT, that would already cause
>> latency anyway. Similarly, I would also consider any PREEMPT_RT system
> 
> And that would be a poorly designed system, and not the problem of the
> kernel.
> 
>> that (mis)uses this API in an RT task as also a poorly designed
>> system. I think PREEMPT_RT systems generally require careful design
>> anyway.  So the fact that a system is poorly designed and thus causes
>> latency is not the kernel's problem IMO.
> 
> Correct. And why I don't think this should be used for RT. It's SCHED_OTHER
> that doesn't have any control of the sched tick, where this hint can help.
> 
>> 
>> In any case, if you want this to only work on FAIR tasks and not RT
>> tasks, why is that only possible to do with rseq() + LAZY preemption
>> and not Prakash's new API + all preemption modes?
>> 
>> Also you can just ignore RT tasks (not that I'm saying that's a good
>> idea but..) in taskshrd_delay_resched() in that patch if you ever
>> wanted to do that.
>> 
>> I just feel the RT latency thing is a non-issue AFAICS.
> 
> Have you worked on any RT projects before?
> 
> -- Steve