Message-ID: <71164940-E45A-4572-9F8D-4CE7189514E4@oracle.com>
Date: Fri, 15 Nov 2024 17:49:29 +0000
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
CC: Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"rostedt@...dmis.org" <rostedt@...dmis.org>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	Daniel Jordan <daniel.m.jordan@...cle.com>
Subject: Re: [RFC PATCH 0/4] Scheduler time slice extension
> On Nov 15, 2024, at 6:41 AM, Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
>
> On 2024-11-14 05:14, Peter Zijlstra wrote:
>> On Wed, Nov 13, 2024 at 02:36:58PM -0500, Mathieu Desnoyers wrote:
>>> On 2024-11-13 13:50, Peter Zijlstra wrote:
>>>> On Wed, Nov 13, 2024 at 12:01:22AM +0000, Prakash Sangappa wrote:
>>>>
>>>>> This patch set implements the above mentioned 50us extension time as posted
>>>>> by Peter. But instead of using restartable sequences as API to set the flag
>>>>> to request the extension, this patch proposes a new API with use of a per
>>>>> thread shared structure implementation described below. This shared structure
>>>>> is accessible in both user space and kernel. The user thread will set the
>>>>> flag in this shared structure to request execution time extension.
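
(For illustration, a minimal sketch of what such a per-thread shared
structure and its use could look like; the names below are placeholders,
not the proposed API:)

	/* One instance per thread, mapped into both user space and kernel. */
	struct task_shared {
		/*
		 * Set by user space before entering a critical section to
		 * request a time slice extension; cleared by the kernel
		 * once the extension has been granted or has expired.
		 */
		unsigned int sched_delay;
	};

	/* User-space side, around a critical section: */
	ts->sched_delay = 1;	/* request extension if preemption is due */
	/* ... critical section ... */
	ts->sched_delay = 0;
	sched_yield();		/* well-behaved: give cpu back if extra time was granted */
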
>>>>
>>>> But why -- we already have rseq, glibc uses it by default. Why add yet
>>>> another thing?
>>>
>>> Indeed, what I'm not seeing in this RFC patch series cover letter is an
>>> explanation that justifies adding yet another per-thread memory area
>>> shared between kernel and userspace when we have extensible rseq
>>> already.
>>>
>>> Peter, was there anything fundamentally wrong with your approach based
>>> on rseq ? https://lore.kernel.org/lkml/20231030132949.GA38123@noisy.programming.kicks-ass.net
>> Not that I can remember, but it's a long time ago :-)
>>> The main thing I wonder is whether loading the rseq delay resched flag
>>> on return to userspace is too late in your patch.
>> Too late how? It only loads it at the point we would've called
>> schedule() -- no point in looking at it otherwise, right?
>
> [...]
>
> For the specific return-to-userspace path, I think where you've placed
> the delay-resched flag check is fine.
>
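(For reference, an illustrative fragment of roughly where that load sits
on the return-to-user path; the helper names here are made up, not from
the actual patch:)

	/* In the return-to-user work loop, only on the resched branch: */
	if (ti_work & _TIF_NEED_RESCHED) {
		if (!delay_resched_requested(current))
			schedule();
		else
			delay_resched_arm_timer();	/* ~50us extension */
	}
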
> I'm concerned about other code paths that invoke schedule() besides
> return-to-userspace. For instance:
>
> raw_irqentry_exit_cond_resched():
>
> if (!preempt_count()) {
> [...]
> if (need_resched())
> preempt_schedule_irq();
> }
>
> AFAIU, this could be triggered by an interrupt handler exit when nested
> over a page fault handler, exception handler, or system call.
>
> We may decide that we cannot care less about those scenarios, and just
> ignore the delay-resched flag, but it's relevant to take those into
> consideration and clearly document the rationale behind our decision.
I don't think delay-resched can address every scenario in which preemption
can occur while a task is in a critical section. We could aim to cover the
frequent paths where a task gets preempted. Initially the intent was to
prevent preemption mainly at the end of the time slice, when the thread is
in a user-space critical section and has requested that rescheduling be
delayed.

Another path to consider is a wakeup occurring on a different cpu, which
could enqueue a thread and attempt to preempt this thread while it is
running in its critical section. Should resched_curr() check whether the
running thread has been granted extra time, i.e. whether
'taskshrd_sched_delay' is set, and if so avoid setting TIF_NEED_RESCHED
and sending the IPI, similar to lazy preemption? If 'taskshrd_sched_delay'
is set, we know the task will be preempted by the timer soon, or it will
call sched_yield() (if it is a well-behaved application).
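
Roughly something like this in resched_curr() (untested sketch;
taskshrd_sched_delay_granted() is a hypothetical helper, and this assumes
the lazy-preemption TIF_NEED_RESCHED_LAZY flag is available):

	void resched_curr(struct rq *rq)
	{
		struct task_struct *curr = rq->curr;
		int cpu = cpu_of(rq);

		lockdep_assert_rq_held(rq);

		if (test_tsk_need_resched(curr))
			return;

		/*
		 * The running task was granted a sched delay: defer with
		 * the lazy flag instead of TIF_NEED_RESCHED + IPI. The
		 * extension timer (or sched_yield()) will reschedule it
		 * shortly.
		 */
		if (taskshrd_sched_delay_granted(curr)) {
			set_tsk_thread_flag(curr, TIF_NEED_RESCHED_LAZY);
			return;
		}

		set_tsk_need_resched(curr);
		if (cpu == smp_processor_id())
			set_preempt_need_resched();
		else
			smp_send_reschedule(cpu);
	}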
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
>