Message-ID: <57FC4F60-01FE-4201-95FC-694841BF90F8@oracle.com>
Date: Thu, 7 Aug 2025 16:45:51 +0000
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"rostedt@...dmis.org"
<rostedt@...dmis.org>,
"mathieu.desnoyers@...icios.com"
<mathieu.desnoyers@...icios.com>,
"bigeasy@...utronix.de"
<bigeasy@...utronix.de>,
"kprateek.nayak@....com" <kprateek.nayak@....com>,
"vineethr@...ux.ibm.com" <vineethr@...ux.ibm.com>
Subject: Re: [PATCH V7 01/11] sched: Scheduler time slice extension
> On Aug 7, 2025, at 7:07 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
>
> On Wed, Aug 06 2025 at 22:34, Thomas Gleixner wrote:
>> On Thu, Jul 24 2025 at 16:16, Prakash Sangappa wrote:
>>> @@ -396,6 +399,9 @@ static __always_inline void syscall_exit_to_user_mode_work(struct pt_regs *regs)
>>>
>>> CT_WARN_ON(ct_state() != CT_STATE_KERNEL);
>>>
>>> + /* Reschedule if scheduler time delay was granted */
>>
>> This is not rescheduling. It sets NEED_RESCHED, which is a completely
>> different thing.
>>
>>> + rseq_delay_set_need_resched();
>>
>> I fundamentally hate this hack as it goes out to user space with
>> NEED_RESCHED set and absolutely zero debug mechanism which validates
>> it. Currently going out with NEED_RESCHED set is a plain bug, rightfully
>> so.
>>
>> But now this muck comes along and sets the flag, which is semantically
>> just wrong and ill defined.
>>
>> The point is that NEED_RESCHED has been cleared by requesting and
>> granting the extension, which means the task can go out to userspace,
>> until it either relinquishes the CPU or hrtick() whacks it over the
>> head.
>
> Sorry. I misread this. It's placed before it enters the exit work loop
> and not afterwards. I got lost in this maze. :(
Yes.
>
>> The obvious way to solve both issues is to clear NEED_RESCHED when
>> the delay is granted and then do in syscall_enter_from_user_mode_work()
>>
>> rseq_delay_sys_enter()
>> {
>>         if (unlikely(current->rseq_delay_resched == GRANTED)) {
>>                 set_tsk_need_resched(current);
>>                 schedule();
>>         }
>> }
>>
>> No?
>>
>> It's debatable whether the schedule() there is necessary. Removing it
>> would allow the task to either complete the syscall and reschedule on
>> exit to user space or go to sleep in the syscall. But that's a trivial
>> detail.
>
> But, the most important thing is that doing it at entry allows to debug
> this stuff for correctness.
>
> I can kinda see that a sched_yield() shortcut might be the right thing
> to do for relinquishing the CPU, but if that's the user space contract,
> then any other syscall needs to be caught and not silently papered over
> at return from syscall.
Sure. The check in syscall_exit_to_user_mode_work() for whether the delay
was GRANTED would catch any other system call.
>
> Let me think about this some more.
Sure.
We will need a recommended system call that the application can call
to relinquish the CPU after extra CPU time was granted. sched_yield(2) seems
appropriate. The shortcut in sched_yield() was there to avoid going through
do_sched_yield() when it is called during the extended time slice. If we move
the GRANTED check to syscall_enter_from_user_mode_work(), then the shortcut
in sched_yield() cannot be implemented.
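To illustrate the conflict, here is a rough userspace model of the two
placements. It is only a sketch, not kernel code: every name in it
(task_model, model_sys_enter(), model_sched_yield(), and so on) is
hypothetical, and it assumes that an entry-side check consumes the GRANTED
state before the syscall body runs, which the snippet quoted above does not
spell out.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum delay_state { DELAY_NONE, DELAY_GRANTED };
enum check_site  { CHECK_AT_EXIT, CHECK_AT_ENTRY };

struct task_model {
	enum delay_state delay;	/* stands in for current->rseq_delay_resched */
	bool need_resched;	/* stands in for the NEED_RESCHED flag */
	int full_yield_calls;	/* how often the full do_sched_yield() path ran */
	int schedule_calls;	/* how often schedule() ran */
};

static void model_schedule(struct task_model *t)
{
	t->schedule_calls++;
	t->need_resched = false;
}

/* Entry-side variant: set NEED_RESCHED and reschedule right away. */
static void model_sys_enter(struct task_model *t, enum check_site site)
{
	if (site == CHECK_AT_ENTRY && t->delay == DELAY_GRANTED) {
		t->delay = DELAY_NONE;	/* assumption: the grant is consumed here */
		t->need_resched = true;
		model_schedule(t);
	}
}

/* sched_yield() body with the shortcut for a granted extension. */
static void model_sched_yield(struct task_model *t)
{
	if (t->delay == DELAY_GRANTED) {
		/* Shortcut: skip the full yield path, just give up the CPU. */
		t->delay = DELAY_NONE;
		model_schedule(t);
		return;
	}
	t->full_yield_calls++;	/* full do_sched_yield() path */
	model_schedule(t);
}

/* Exit-side variant: any other syscall in the extension is caught here. */
static void model_sys_exit(struct task_model *t, enum check_site site)
{
	if (site == CHECK_AT_EXIT && t->delay == DELAY_GRANTED) {
		t->delay = DELAY_NONE;
		t->need_resched = true;
	}
	if (t->need_resched)
		model_schedule(t);
}

/* A task that was granted an extension and then calls sched_yield(). */
struct task_model run_yield_after_grant(enum check_site site)
{
	struct task_model t = { .delay = DELAY_GRANTED };

	model_sys_enter(&t, site);
	model_sched_yield(&t);
	model_sys_exit(&t, site);
	return t;
}
```

In this model, run_yield_after_grant(CHECK_AT_EXIT) reaches the shortcut and
schedules once, while run_yield_after_grant(CHECK_AT_ENTRY) has already
rescheduled and dropped the grant before sched_yield() runs, so the shortcut
is never taken and the full yield path runs as well.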
Thanks,
-Prakash
>
>
>