Message-ID: <Z8gq6bWNPDtnUYsW@jlelli-thinkpadt14gen4.remote.csb>
Date: Wed, 5 Mar 2025 10:43:53 +0000
From: Juri Lelli <juri.lelli@...hat.com>
To: Harshit Agarwal <harshit@...anix.com>
Cc: Steven Rostedt <rostedt@...dmis.org>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Jon Kohler <jon@...anix.com>,
Gauri Patwardhan <gauri.patwardhan@...anix.com>,
Rahul Chunduru <rahul.chunduru@...anix.com>,
Will Ton <william.ton@...anix.com>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH v3] sched/rt: Fix race in push_rt_task
On 04/03/25 18:37, Harshit Agarwal wrote:
> Thanks Juri for pointing this out.
> I can send the fix for deadline as well.
> Is it okay if I do it in a separate patch?
Yes, we would need a separate patch.
> From taking a quick look at the code, I can see that the same fix won’t
> apply as is in case of deadline since it has two different callers for
> find_lock_later_rq.
Right, indeed.
> One is push_dl_task for which we can call pick_next_pushable_dl_task
> and make sure it is at the head. This is where we have the bug.
OK.
> Another one is dl_task_offline_migration which gets the task from
> dl_task_timer which in turn gets it from sched_dl_entity.
> I haven’t gone through the deadline code thoroughly but I think this race
> shouldn’t exist for the offline task (2nd) case. If that is true then the fix
> could be to check in push_dl_task if the task returned by find_lock_later_rq
> is still at the head of the queue or not.
I believe that won't work. dl_task_offline_migration() gets called when
the replenishment timer for a task fires (to unthrottle it) and finds
that the old rq the task was running on has been offlined in the
meantime. The task is still throttled at this point, so it is enqueued
neither in the dl_rq nor in the pushable task list/tree, and the check
you are adding won't work for it, I am afraid. Maybe we can use
dl_se->dl_throttled to tell the two cases apart.
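
To make the idea concrete, here is a rough userspace model of the
revalidation, not kernel code. The model_* types and
model_task_still_valid() are hypothetical names made up for this
illustration; only the dl_throttled flag corresponds to something named
above. It assumes a throttled task simply skips the head-of-queue check.
What the offline-migration path should verify instead is left open.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel's definitions. */
struct model_dl_entity {
	bool dl_throttled;	/* models dl_se->dl_throttled */
};

struct model_task {
	struct model_dl_entity dl;
};

struct model_rq {
	/* Head of the pushable tasks list/tree, NULL when it is empty. */
	struct model_task *pushable_head;
};

/*
 * Hypothetical check to run once the rq lock has been dropped and
 * re-taken: is it still safe to go on migrating @task away from @rq?
 */
static bool model_task_still_valid(const struct model_rq *rq,
				   const struct model_task *task)
{
	/*
	 * Offline-migration case: the task is still throttled, so it is
	 * not in the pushable list and the head check does not apply.
	 */
	if (task->dl.dl_throttled)
		return true;

	/* Push case: the task must still be the first pushable task. */
	return rq->pushable_head == task;
}
```

With this shape the push path rejects a task that has left the head of
the pushable list, while a throttled task coming from the timer path is
not rejected just because it is not enqueued.
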
Thanks,
Juri