lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <D7C6B8EE-7AFC-4F1E-8A59-E9573658146C@nutanix.com>
Date: Mon, 17 Feb 2025 16:36:47 +0000
From: Harshit Agarwal <harshit@...anix.com>
To: Steven Rostedt <rostedt@...dmis.org>
CC: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot
	<vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Valentin
 Schneider <vschneid@...hat.com>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        Jon Kohler <jon@...anix.com>,
        Gauri
 Patwardhan <gauri.patwardhan@...anix.com>,
        Rahul Chunduru
	<rahul.chunduru@...anix.com>,
        Will Ton <william.ton@...anix.com>
Subject: Re: [PATCH v2] sched/rt: Fix race in push_rt_task

Hi Steve,

Thanks for the information. I realized my mistake this after updating this 
thread and then later I sent the patch as a separate thread here 
https://lore.kernel.org/lkml/20250214170844.201692-1-harshit@nutanix.com/
and included the link to v1 along with the changes made.

I had also added a comment on is_migration_disabled like you suggested  here. 
This separate patch addresses your comments already.
Sorry for the confusion.

Regards,
Harshit

> On Feb 17, 2025, at 7:50 AM, Steven Rostedt <rostedt@...dmis.org> wrote:
> 
> 
> FYI,
> 
> You should always send a new patch version as a separate thread. That's
> because they can get lost in the thread and makes it harder for maintainers
> to know what the next version of the patch is. I've picked the wrong patch
> version before because there was another version sent that I missed.
> 
> On Thu, 13 Feb 2025 17:54:34 +0000
> Harshit Agarwal <harshit@...anix.com> wrote:
> 
>> Solution
>> ========
>> The solution here is fairly simple. After obtaining the lock (at 4a),
>> the check is enhanced to make sure that the task is still at the head of
>> the pushable tasks list. If not, then it is anyway not suitable for
>> being pushed out. The fix also removes any conditions that are no longer
>> needed.
>> 
>> Testing
>> =======
>> The fix is tested on a cluster of 3 nodes, where the panics due to this
>> are hit every couple of days. A fix similar to this was deployed on such
>> cluster and was stable for more than 30 days.
> 
> May also want to add:
> 
>  Since 'is_migration_disabled()' a faster check than the others, it was moved
>  to be the first check for consistency.
> 
>> 
>> Co-developed-by: Jon Kohler <jon@...anix.com>
>> Signed-off-by: Jon Kohler <jon@...anix.com>
>> Co-developed-by: Gauri Patwardhan <gauri.patwardhan@...anix.com>
>> Signed-off-by: Gauri Patwardhan <gauri.patwardhan@...anix.com>
>> Co-developed-by: Rahul Chunduru <rahul.chunduru@...anix.com>
>> Signed-off-by: Rahul Chunduru <rahul.chunduru@...anix.com>
>> Signed-off-by: Harshit Agarwal <harshit@...anix.com>
>> Tested-by: Will Ton <william.ton@...anix.com>
>> ---
> 
> You can add here (after the above three dashes), how this version is
> different from the last version. The text below the dashes and before the
> patch is ignored by git, but is useful for reviewers. For instance:
> 
> Changes since v1: https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_all_20250211054646.23987-2D1-2Dharshit-40nutanix.com_&d=DwICAg&c=s883GpUCOChKOHiocYtGcg&r=QTPVhNgH716-zU_kPmte39o3vGFVupnGmmfiVBpq9PU&m=YHnhqY1UaVeahgoVKVWuMaCw-TJQQg9Sdhif34WqstcHXYe_sUHfix5ImyJwyIHl&s=lhfXqBLkfeyyKLXqPABmYuQZIqCZzHx0-dQk2i3k49w&e= 
> 
> - Removed the redundant checks that task != pick_next_pushable_task() already has
> 
> 
> Notice I added a link to the previous version. This helps find the previous
> version without having to make this version a reply to it.
> 
>> kernel/sched/rt.c | 54 +++++++++++++++++++++++------------------------
>> 1 file changed, 26 insertions(+), 28 deletions(-)
>> 
>> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>> index 4b8e33c615b1..4762dd3f50c5 100644
>> --- a/kernel/sched/rt.c
>> +++ b/kernel/sched/rt.c
>> @@ -1885,6 +1885,27 @@ static int find_lowest_rq(struct task_struct *task)
>> return -1;
>> }
>> 
> 
> Otherwise,
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@...dmis.org>
> 
> Peter, could you pick this up?
> 
> -- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ