Message-ID: <20250418193410.2010058-1-prakash.sangappa@oracle.com>
Date: Fri, 18 Apr 2025 19:34:07 +0000
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, rostedt@...dmis.org, mathieu.desnoyers@...icios.com,
tglx@...utronix.de, bigeasy@...utronix.de
Subject: [PATCH V2 0/3] Scheduler time slice extension
A user thread can get preempted in the middle of executing a critical
section in user space while holding locks, which can have an undesirable
effect on performance. A way for the thread to request additional execution
time on the cpu, so that it can complete the critical section, would be
useful in such a scenario. The request can be made by setting a bit in
mapped memory that the kernel can also access, so that the kernel can check
the request and grant extra execution time on the cpu.
There have been a couple of proposals[1][2] for such a feature, which attempt
to address the above scenario by granting one extra tick of execution time.
The patch thread [1] posted by Steven Rostedt has ample discussion about the
need for this feature.
However, the concern has been that this can lead to abuse. One extra tick can
be a long time (about a millisecond or more). In response, Peter Zijlstra
posted a prototype solution[3], which grants only a 50us execution time
extension. This is achieved with the help of a timer started on that cpu at
the time the extra execution time is granted. When the timer fires, the
thread is preempted if it is still running.
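The timer logic described above can be modelled in user space as follows.
This is only an illustrative sketch, not the kernel code from the patches:
the struct, the function names and the use of plain nanosecond timestamps
are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 50us extension, expressed in nanoseconds (value taken from the text). */
#define HYP_EXTENSION_NS (UINT64_C(50) * 1000)

/* Hypothetical record of one granted extension. */
struct hyp_grant {
    uint64_t deadline_ns; /* when the per-cpu timer would fire */
    bool active;          /* an extension is currently outstanding */
};

/* Model of "start a timer at the time of granting extra execution time". */
static inline struct hyp_grant hyp_grant_extension(uint64_t now_ns)
{
    struct hyp_grant g = { now_ns + HYP_EXTENSION_NS, true };
    return g;
}

/* Model of the timer firing: preempt only if the thread is still running. */
static inline bool hyp_timer_should_preempt(const struct hyp_grant *g,
                                            uint64_t now_ns,
                                            bool still_running)
{
    assert(g);
    return g->active && still_running && now_ns >= g->deadline_ns;
}
```

In this model a thread that leaves the cpu before the deadline is never
preempted by the timer, which matches the "if it is still running" condition.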
This patchset implements the above solution as suggested, using the
restartable sequences (rseq) structure for the API. Refer to [3][4] for
further discussion.
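A rough sketch of how an application might use such an interface is shown
below. It is not the API defined by these patches: the bit name, the struct
standing in for the shared rseq area, and the assumption that the kernel
clears the request bit when it grants an extension are all hypothetical and
used only for illustration. The kernel side is simulated by a helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical bit; the real flag lives in include/uapi/linux/rseq.h. */
#define HYP_DELAY_RESCHED (1u << 0)

/* Stand-in for the flags word in memory shared with the kernel. */
struct hyp_shared {
    _Atomic uint32_t flags;
};

/* Before the critical section: ask for extra time if preemption is due. */
static inline void hyp_request_extension(struct hyp_shared *s)
{
    assert(s);
    atomic_fetch_or_explicit(&s->flags, HYP_DELAY_RESCHED,
                             memory_order_relaxed);
}

/* Simulated kernel side (assumption): granting clears the request bit. */
static inline void hyp_simulate_kernel_grant(struct hyp_shared *s)
{
    assert(s);
    atomic_fetch_and_explicit(&s->flags, ~HYP_DELAY_RESCHED,
                              memory_order_relaxed);
}

/*
 * After the critical section: withdraw the request. Returns true if the
 * bit was already cleared, i.e. an extension was granted and the thread
 * should give up the cpu promptly.
 */
static inline bool hyp_end_critical_section(struct hyp_shared *s)
{
    assert(s);
    uint32_t old = atomic_fetch_and_explicit(&s->flags, ~HYP_DELAY_RESCHED,
                                             memory_order_relaxed);
    return !(old & HYP_DELAY_RESCHED);
}

/* Any cheap system call would do for yielding; getppid(2) is one option. */
static inline void hyp_yield_cpu(void)
{
    (void)getppid();
}
```

A caller would wrap its critical section between hyp_request_extension() and
hyp_end_critical_section(), and call hyp_yield_cpu() when the latter returns
true.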
v1:
https://lore.kernel.org/all/20250215005414.224409-1-prakash.sangappa@oracle.com/
v2:
- Based on discussions in [3], expecting the user application to call
  sched_yield() to yield the cpu at the end of the critical section may not
  be advisable, as pointed out by Linus.
  So added a check in the return path from a system call to reschedule if a
  time slice extension was granted to the thread. The check could as well be
  in the syscall entry path from user mode.
  This would allow the application thread to call any system call to yield
  the cpu. Which system call should be suggested? getppid(2) works.
  Do we still need the change in sched_yield() to reschedule when the thread
  has current->rseq_sched_delay set?
- Added a patch to introduce a sysctl tunable parameter, called
  'sched_preempt_delay_us', to specify the duration of the time slice
  extension in microseconds (us). It can take a value in the range 0 to 100.
  The default is 50us. Setting this tunable to 0 disables the scheduler time
  slice extension feature.
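The semantics of the tunable can be summarised with a small model. This is
a sketch of the rules stated above (range 0 to 100, default 50, 0 disables),
not the sysctl handler from the patch; the macro and function names are
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the description of 'sched_preempt_delay_us'. */
#define HYP_DELAY_US_DEFAULT 50
#define HYP_DELAY_US_MAX     100

/* A write to the tunable is acceptable only within 0..100. */
static inline bool hyp_delay_us_valid(long us)
{
    return us >= 0 && us <= HYP_DELAY_US_MAX;
}

/* A value of 0 turns the time slice extension feature off. */
static inline bool hyp_extension_enabled(long us)
{
    assert(hyp_delay_us_valid(us));
    return us != 0;
}

/* Timer duration in nanoseconds for a valid setting. */
static inline uint64_t hyp_delay_ns(long us)
{
    assert(hyp_delay_us_valid(us));
    return (uint64_t)us * 1000;
}
```

With the default setting, hyp_delay_ns(HYP_DELAY_US_DEFAULT) gives the 50us
extension expressed in nanoseconds.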
[1] https://lore.kernel.org/lkml/20231025054219.1acaa3dd@gandalf.local.home/
[2] https://lore.kernel.org/lkml/1395767870-28053-1-git-send-email-khalid.aziz@oracle.com/
[3] https://lore.kernel.org/all/20250131225837.972218232@goodmis.org/
[4] https://lore.kernel.org/all/20241113000126.967713-1-prakash.sangappa@oracle.com/
[5] https://lore.kernel.org/lkml/20231030132949.GA38123@noisy.programming.kicks-ass.net/
[6] https://lore.kernel.org/all/1631147036-13597-1-git-send-email-prakash.sangappa@oracle.com/
Prakash Sangappa (3):
  Sched: Scheduler time slice extension
  Sched: Tunable to specify duration of time slice extension
  Sched: Add scheduler stat for cpu time slice extension
 include/linux/entry-common.h | 11 +++++--
 include/linux/sched.h        | 23 ++++++++++++++
 include/uapi/linux/rseq.h    |  5 +++
 kernel/entry/common.c        | 21 ++++++++++---
 kernel/rseq.c                | 59 ++++++++++++++++++++++++++++++++++++
 kernel/sched/core.c          | 37 ++++++++++++++++++++++
 kernel/sched/debug.c         |  1 +
 kernel/sched/syscalls.c      |  7 +++++
 8 files changed, 156 insertions(+), 8 deletions(-)
--
2.43.5