[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250217105007.45ba8cb4@gandalf.local.home>
Date: Mon, 17 Feb 2025 10:50:07 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Harshit Agarwal <harshit@...anix.com>, Peter Zijlstra
<peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann
<dietmar.eggemann@....com>, Ben Segall <bsegall@...gle.com>, Mel Gorman
<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
linux-kernel@...r.kernel.org, Jon Kohler <jon@...anix.com>, Gauri
Patwardhan <gauri.patwardhan@...anix.com>, Rahul Chunduru
<rahul.chunduru@...anix.com>, Will Ton <william.ton@...anix.com>
Subject: Re: [PATCH v2] sched/rt: Fix race in push_rt_task
FYI,
You should always send a new patch version as a separate thread. That's
because they can get lost in the thread and makes it harder for maintainers
to know what the next version of the patch is. I've picked the wrong patch
version before because there was another version sent that I missed.
On Thu, 13 Feb 2025 17:54:34 +0000
Harshit Agarwal <harshit@...anix.com> wrote:
> Solution
> ========
> The solution here is fairly simple. After obtaining the lock (at 4a),
> the check is enhanced to make sure that the task is still at the head of
> the pushable tasks list. If not, then it is anyway not suitable for
> being pushed out. The fix also removes any conditions that are no longer
> needed.
>
> Testing
> =======
> The fix is tested on a cluster of 3 nodes, where the panics due to this
> are hit every couple of days. A fix similar to this was deployed on such
> cluster and was stable for more than 30 days.
May also want to add:
Since 'is_migration_disabled()' a faster check than the others, it was moved
to be the first check for consistency.
>
> Co-developed-by: Jon Kohler <jon@...anix.com>
> Signed-off-by: Jon Kohler <jon@...anix.com>
> Co-developed-by: Gauri Patwardhan <gauri.patwardhan@...anix.com>
> Signed-off-by: Gauri Patwardhan <gauri.patwardhan@...anix.com>
> Co-developed-by: Rahul Chunduru <rahul.chunduru@...anix.com>
> Signed-off-by: Rahul Chunduru <rahul.chunduru@...anix.com>
> Signed-off-by: Harshit Agarwal <harshit@...anix.com>
> Tested-by: Will Ton <william.ton@...anix.com>
> ---
You can add here (after the above three dashes), how this version is
different from the last version. The text below the dashes and before the
patch is ignored by git, but is useful for reviewers. For instance:
Changes since v1: https://lore.kernel.org/all/20250211054646.23987-1-harshit@nutanix.com/
- Removed the redundant checks that task != pick_next_pushable_task() already has
Notice I added a link to the previous version. This helps find the previous
version without having to make this version a reply to it.
> kernel/sched/rt.c | 54 +++++++++++++++++++++++------------------------
> 1 file changed, 26 insertions(+), 28 deletions(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 4b8e33c615b1..4762dd3f50c5 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1885,6 +1885,27 @@ static int find_lowest_rq(struct task_struct *task)
> return -1;
> }
>
Otherwise,
Reviewed-by: Steven Rostedt (Google) <rostedt@...dmis.org>
Peter, could you pick this up?
-- Steve
Powered by blists - more mailing lists