Message-ID: <20251105122808.GK988547@noisy.programming.kicks-ass.net>
Date: Wed, 5 Nov 2025 13:28:08 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Fernand Sieber <sieberf@...zon.com>
Cc: kernel test robot <oliver.sang@...el.com>, oe-lkp@...ts.linux.dev,
lkp@...el.com, linux-kernel@...r.kernel.org, x86@...nel.org,
aubrey.li@...ux.intel.com, yu.c.chen@...el.com, jstultz@...gle.com
Subject: Re: [tip:sched/core] [sched/fair] 79104becf4:
BUG:kernel_NULL_pointer_dereference,address
On Tue, Nov 04, 2025 at 11:04:55PM +0200, Fernand Sieber wrote:
> Hi Peter,
>
> I spent some time today investigating this report. The crash happens when
> a proxy task yields.
>
> Since it probably doesn't make sense that a task blocking the best pick
> yields, a simple workaround is to ignore the yield in this case:

Well, yield() as a whole doesn't make much sense outside of the strict
RT setting, where it is well defined.
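
For reference, the well-defined case: POSIX specifies that a SCHED_FIFO
task calling sched_yield() moves to the tail of the runqueue for its
priority, so an equal-priority peer runs next. A minimal userspace
sketch of that contract (assumes root or CAP_SYS_NICE; the priority
value 1 is arbitrary):

	#define _GNU_SOURCE
	#include <sched.h>
	#include <stdio.h>

	int main(void)
	{
		struct sched_param sp = { .sched_priority = 1 };

		/* Become SCHED_FIFO; from here on sched_yield() has the
		 * well-defined tail-of-priority-queue meaning. */
		if (sched_setscheduler(0, SCHED_FIFO, &sp)) {
			perror("sched_setscheduler");
			return 1;
		}

		for (int i = 0; i < 3; i++) {
			printf("tick %d, handing the CPU to a peer\n", i);
			sched_yield();	/* go to tail of the prio 1 queue */
		}
		return 0;
	}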
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8993,6 +8993,11 @@ static void yield_task_fair(struct rq *rq)
>  	if (unlikely(rq->nr_running == 1))
>  		return;
>  
> +	/* Don't yield if we're running a proxy task */
> +	if (rq->donor && rq->donor != curr) {
> +		return;
> +	}
> +

Well, yield_task_fair() should probably have:

	struct task_struct *curr = rq->donor;

But yeah, if the task holding your resource is doing yield() you're
'sad'. Basically a sched-fair yield() means: I've no fucking clue what
I'm doing and let's hope we can make progress a little later.
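
(For context: on current EEVDF kernels the fair yield amounts to
roughly the below -- paraphrased from memory, details vary by tree. It
charges the runtime consumed so far and defers the entity by one slice;
nothing whatsoever says the task you are waiting on runs next.)

	static void yield_task_fair(struct rq *rq)
	{
		struct task_struct *curr = rq->curr;
		struct sched_entity *se = &curr->se;

		if (unlikely(rq->nr_running == 1))
			return;

		update_rq_clock(rq);
		update_curr(task_cfs_rq(curr));	/* charge runtime used so far */
		rq_clock_skip_update(rq);

		/* defer ourselves one slice; who runs next is anyone's guess */
		se->deadline += calc_delta_fair(se->slice, se);
	}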

And it gets worse in the context of PI/proxy, because in that case your
fair task can deadlock the system through sheer incompetence.

Anyway, consider the PI case: we bump a fair task to FIFO and then
yield() would do the FIFO yield -- with all the possible problems.

And we want the same for proxy: if the boosting context is FIFO, we
want a FIFO yield.
So I'm thinking, for all things to be consistent, we want something
like:

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 6b8a9286e2fc..13112c680f92 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2143,7 +2143,7 @@ static void yield_task_dl(struct rq *rq)
 	 * it and the bandwidth timer will wake it up and will give it
 	 * new scheduling parameters (thanks to dl_yielded=1).
 	 */
-	rq->curr->dl.dl_yielded = 1;
+	rq->donor->dl.dl_yielded = 1;
 
 	update_rq_clock(rq);
 	update_curr_dl(rq);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 273e2871b59e..f1d8eb350f59 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8980,7 +8980,7 @@ static void put_prev_task_fair(struct rq *rq, struct task_struct *prev, struct t
  */
 static void yield_task_fair(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq->donor;
 	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
 	struct sched_entity *se = &curr->se;
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 1fd97f2d7ec6..f1867fe8e5c5 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1490,7 +1490,7 @@ static void requeue_task_rt(struct rq *rq, struct task_struct *p, int head)
 
 static void yield_task_rt(struct rq *rq)
 {
-	requeue_task_rt(rq, rq->curr, 0);
+	requeue_task_rt(rq, rq->donor, 0);
 }
 
 static int find_lowest_rq(struct task_struct *task);
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index 8f0f603b530b..865ec5d6e824 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -1319,7 +1319,7 @@ static void do_sched_yield(void)
 	rq = this_rq_lock_irq(&rf);
 
 	schedstat_inc(rq->yld_count);
-	current->sched_class->yield_task(rq);
+	rq->donor->sched_class->yield_task(rq);
 
 	preempt_disable();
 	rq_unlock_irq(rq, &rf);
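
(FWIW -- paraphrasing kernel/sched/sched.h from memory, not part of the
patch: without CONFIG_SCHED_PROXY_EXEC the donor and curr fields alias
each other, so the s/curr/donor/ changes above should be no-ops on
!PROXY_EXEC builds.)

	struct rq {
		/* ... */
	#ifdef CONFIG_SCHED_PROXY_EXEC
		struct task_struct __rcu	*donor;	/* scheduling context */
		struct task_struct __rcu	*curr;	/* execution context */
	#else
		union {
			struct task_struct __rcu *donor; /* scheduling context */
			struct task_struct __rcu *curr;	 /* execution context */
		};
	#endif
		/* ... */
	};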