Message-ID: <20250613051734.4023260-1-joelagnelf@nvidia.com>
Date: Fri, 13 Jun 2025 01:17:20 -0400
From: Joel Fernandes <joelagnelf@...dia.com>
To: linux-kernel@...r.kernel.org
Cc: Joel Fernandes <joelagnelf@...dia.com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Tejun Heo <tj@...nel.org>,
David Vernet <void@...ifault.com>,
Andrea Righi <arighi@...dia.com>,
Changwoo Min <changwoo@...lia.com>,
bpf@...r.kernel.org
Subject: [PATCH v3 00/10] Add a deadline server for sched_ext tasks
sched_ext tasks can currently be starved by RT hogs, especially since RT
throttling was replaced by deadline servers, which boost only CFS tasks. Several
users in the community have reported RT tasks stalling sched_ext tasks.
Add a sched_ext deadline server as well, so that sched_ext tasks are also
boosted and do not suffer starvation.
A kselftest is also provided to verify that the starvation issues are now fixed.
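For illustration only, here is a minimal sketch of the kind of check a
starvation test can make. It is not the rt_stall selftest from this series. It
compares the CPU share a sched_ext task actually received, while competing with
an RT hog on the same CPU, against the bandwidth reserved by a deadline server
(runtime / period). The function names, the tolerance factor, and the 50 ms per
1 s server parameters are assumptions made for the example.

```python
def server_bandwidth(runtime_ns: int, period_ns: int) -> float:
    """Fraction of a CPU reserved by a deadline server (runtime / period)."""
    if period_ns <= 0 or not 0 <= runtime_ns <= period_ns:
        raise ValueError("need 0 <= runtime <= period and period > 0")
    return runtime_ns / period_ns


def is_starved(task_cpu_ns: int, wall_ns: int,
               runtime_ns: int, period_ns: int,
               tolerance: float = 0.5) -> bool:
    """Return True if the task got clearly less CPU than the server reserves.

    task_cpu_ns: CPU time the task consumed during the measurement window
    wall_ns:     length of the measurement window
    tolerance:   hypothetical slack factor to absorb measurement noise
    """
    if wall_ns <= 0:
        raise ValueError("measurement window must be positive")
    share = task_cpu_ns / wall_ns
    return share < tolerance * server_bandwidth(runtime_ns, period_ns)


# Assumed server parameters for the example: 50 ms of runtime every 1 s.
RUNTIME_NS = 50_000_000
PERIOD_NS = 1_000_000_000

# A task that got no CPU in a 5 s window next to an RT hog is starved.
starved_without_server = is_starved(0, 5_000_000_000, RUNTIME_NS, PERIOD_NS)

# A task that got 240 ms in the same window (4.8%) is close to the reservation.
starved_with_server = is_starved(240_000_000, 5_000_000_000,
                                 RUNTIME_NS, PERIOD_NS)
```

With these assumed parameters the reservation is 5% of one CPU, so a task that
receives roughly that share while an RT hog runs is not counted as starved.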
Btw, there is still something funky going on with CPU hotplug and the
relinquish patch. Sometimes sched_ext's hotplug self-test locks up
(./runner -t hotplug). Reverting that patch fixes it, so I suspect
something is off in dl_server_remove_params() when it is called on
offline CPUs.
v2->v3:
- Removed code duplication in debugfs. Made ext interface separate.
- Fixed issue where rq_lock_irqsave was not used in the relinquish patch.
- Fixed running bw accounting issue in dl_server_remove_params.
Link to v1: https://lore.kernel.org/all/20250315022158.2354454-1-joelagnelf@nvidia.com/
Link to v2: https://lore.kernel.org/all/20250602180110.816225-1-joelagnelf@nvidia.com/
Andrea Righi (1):
  selftests/sched_ext: Add test for sched_ext dl_server

Joel Fernandes (9):
  sched/debug: Fix updating of ppos on server write ops
  sched/debug: Stop and start server based on if it was active
  sched/deadline: Clear the defer params
  sched: Add support to pick functions to take rf
  sched: Add a server arg to dl_server_update_idle_time()
  sched/ext: Add a DL server for sched_ext tasks
  sched/debug: Add support to change sched_ext server params
  sched/deadline: Add support to remove DL server bandwidth
  sched/ext: Relinquish DL server reservations when not needed

 include/linux/sched.h                        |   2 +-
 kernel/sched/core.c                          |  19 +-
 kernel/sched/deadline.c                      |  78 +++++--
 kernel/sched/debug.c                         | 171 +++++++++++---
 kernel/sched/ext.c                           | 108 ++++++++-
 kernel/sched/fair.c                          |  15 +-
 kernel/sched/idle.c                          |   4 +-
 kernel/sched/rt.c                            |   2 +-
 kernel/sched/sched.h                         |  13 +-
 kernel/sched/stop_task.c                     |   2 +-
 tools/testing/selftests/sched_ext/Makefile   |   1 +
 .../selftests/sched_ext/rt_stall.bpf.c       |  23 ++
 tools/testing/selftests/sched_ext/rt_stall.c | 213 ++++++++++++++++++
 13 files changed, 579 insertions(+), 72 deletions(-)
 create mode 100644 tools/testing/selftests/sched_ext/rt_stall.bpf.c
 create mode 100644 tools/testing/selftests/sched_ext/rt_stall.c
--
2.34.1