Message-ID: <D7C6B8EE-7AFC-4F1E-8A59-E9573658146C@nutanix.com>
Date: Mon, 17 Feb 2025 16:36:47 +0000
From: Harshit Agarwal <harshit@...anix.com>
To: Steven Rostedt <rostedt@...dmis.org>
CC: Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...hat.com>,
 Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>,
 Valentin Schneider <vschneid@...hat.com>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 Jon Kohler <jon@...anix.com>,
 Gauri Patwardhan <gauri.patwardhan@...anix.com>,
 Rahul Chunduru <rahul.chunduru@...anix.com>,
 Will Ton <william.ton@...anix.com>
Subject: Re: [PATCH v2] sched/rt: Fix race in push_rt_task

Hi Steve,

Thanks for the information. I realized my mistake after updating this
thread, and later sent the patch as a separate thread here:
https://lore.kernel.org/lkml/20250214170844.201692-1-harshit@nutanix.com/
It includes the link to v1 along with the changes made.
I also added a comment on is_migration_disabled, as you suggested here.
That separate patch already addresses your comments.

Sorry for the confusion.

Regards,
Harshit

> On Feb 17, 2025, at 7:50 AM, Steven Rostedt <rostedt@...dmis.org> wrote:
>
>
> FYI,
>
> You should always send a new patch version as a separate thread. That's
> because new versions can get lost in the thread, which makes it harder for
> maintainers to know what the next version of the patch is. I've picked the
> wrong patch version before because there was another version sent that I missed.
>
> On Thu, 13 Feb 2025 17:54:34 +0000
> Harshit Agarwal <harshit@...anix.com> wrote:
>
>> Solution
>> ========
>> The solution here is fairly simple. After obtaining the lock (at 4a),
>> the check is enhanced to make sure that the task is still at the head of
>> the pushable tasks list. If not, then it is anyway not suitable for
>> being pushed out. The fix also removes any conditions that are no longer
>> needed.
>>
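To illustrate the re-check described in the Solution section above, here is a minimal userspace sketch in C. It is not the kernel code and not the patch itself (the diff body is not shown in this message). The toy_task and toy_rq types and all toy_* helpers are hypothetical stand-ins; only the idea comes from the thread: after the lock is re-acquired, do the cheap migration-disabled check first, then confirm the task is still at the head of the pushable list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's task and runqueue. */
struct toy_task {
	int prio;                 /* lower value = higher priority */
	bool migration_disabled;  /* models what is_migration_disabled() reports */
	struct toy_task *next;    /* next entry in the pushable list */
};

struct toy_rq {
	struct toy_task *pushable_head; /* highest-priority pushable task */
};

/* Models pick_next_pushable_task(): peek at the head, NULL if empty. */
static struct toy_task *toy_pick_next_pushable(struct toy_rq *rq)
{
	return rq->pushable_head;
}

/* Models another CPU dequeuing the head while our lock was dropped. */
static void toy_dequeue_head(struct toy_rq *rq)
{
	assert(rq->pushable_head != NULL);
	rq->pushable_head = rq->pushable_head->next;
}

/*
 * Re-validation to run once the lock is held again: the cheap flag
 * check comes first, then the task must still be the list head.
 * If it is no longer the head, it is not suitable for pushing.
 */
static bool toy_still_pushable(struct toy_rq *rq, struct toy_task *task)
{
	if (task->migration_disabled)
		return false;
	return task == toy_pick_next_pushable(rq);
}
```

In this model, a task picked before the lock was dropped passes toy_still_pushable() only if nothing removed it from the head in the meantime, which is the property the single head comparison is meant to guarantee.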
>> Testing
>> =======
>> The fix was tested on a cluster of 3 nodes, where the panics due to this
>> are hit every couple of days. A fix similar to this was deployed on such
>> a cluster and was stable for more than 30 days.
>
> May also want to add:
>
> Since 'is_migration_disabled()' is a faster check than the others, it was
> moved to be the first check for consistency.
>
>>
>> Co-developed-by: Jon Kohler <jon@...anix.com>
>> Signed-off-by: Jon Kohler <jon@...anix.com>
>> Co-developed-by: Gauri Patwardhan <gauri.patwardhan@...anix.com>
>> Signed-off-by: Gauri Patwardhan <gauri.patwardhan@...anix.com>
>> Co-developed-by: Rahul Chunduru <rahul.chunduru@...anix.com>
>> Signed-off-by: Rahul Chunduru <rahul.chunduru@...anix.com>
>> Signed-off-by: Harshit Agarwal <harshit@...anix.com>
>> Tested-by: Will Ton <william.ton@...anix.com>
>> ---
>
> You can add here (after the above three dashes) how this version is
> different from the last version. The text below the dashes and before the
> patch is ignored by git, but is useful for reviewers. For instance:
>
> Changes since v1: https://lore.kernel.org/all/20250211054646.23987-1-harshit@nutanix.com/
>
> - Removed the redundant checks that task != pick_next_pushable_task() already has
>
>
> Notice I added a link to the previous version. This helps find the previous
> version without having to make this version a reply to it.
>
>> kernel/sched/rt.c | 54 +++++++++++++++++++++++------------------------
>> 1 file changed, 26 insertions(+), 28 deletions(-)
>>
>> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>> index 4b8e33c615b1..4762dd3f50c5 100644
>> --- a/kernel/sched/rt.c
>> +++ b/kernel/sched/rt.c
>> @@ -1885,6 +1885,27 @@ static int find_lowest_rq(struct task_struct *task)
>> return -1;
>> }
>>
>
> Otherwise,
>
> Reviewed-by: Steven Rostedt (Google) <rostedt@...dmis.org>
>
> Peter, could you pick this up?
>
> -- Steve