lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200311141135.kyonj52ogwns3rf3@e107158-lin.cambridge.arm.com>
Date:   Wed, 11 Mar 2020 14:11:36 +0000
From:   Qais Yousef <qais.yousef@....com>
To:     Pavan Kondeti <pkondeti@...eaurora.org>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 6/6] sched/rt: Fix pushing unfit tasks to a better CPU

On 03/11/20 16:23, Pavan Kondeti wrote:
> On Fri, Mar 06, 2020 at 05:51:13PM +0000, Qais Yousef wrote:
> > Hi Pavan
> > 
> > On 03/02/20 13:27, Qais Yousef wrote:
> > > If a task was running on unfit CPU we could ignore migrating if the
> > > priority level of the new fitting CPU is the *same* as the unfit one.
> > > 
> > > Add an extra check to select_task_rq_rt() to allow the push in case:
> > > 
> > > 	* p->prio == new_cpu.highest_priority
> > > 	* task_fits(p, new_cpu)
> > > 
> > > Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
> > > Signed-off-by: Qais Yousef <qais.yousef@....com>
> > > ---
> > 
> > Can you please confirm if you have any objection to this patch? Without it
> > I see large delays in the 2 tasks test like I outlined in [1]. It wasn't clear
> > from [2] whether you are in agreement now or not.
> > 
> > [1] https://lore.kernel.org/lkml/20200217135306.cjc2225wdlwqiicu@e107158-lin.cambridge.arm.com/
> > [2] https://lore.kernel.org/lkml/20200227033608.GN28029@codeaurora.org/
> > 
> 
> I am not very sure about this. Like we discussed, this patch is addressing a
> specific scenario i.e two equal prio tasks waking at the same time. We allow
> the packing so that task_woken_rt() spread the tasks. The enqueue operation
> is waste here.
> 
> At the same time, I can't think of a better alternative. Retrying
> find_lowest_rq() may still give the same result until the previous task
> is fully woken on the CPU.
> 
> btw, the commit description does not talk about the race at all. If there is
> no race, we won't even end up in this scenario i.e find_lowest_rq() may simply
> return -1.

Josh has a new API that can help fix both the thundering herd issue and this
one too.

https://lore.kernel.org/lkml/20200311010113.136465-1-joshdon@google.com/

I did try to have a stab at it but my implementation wasn't as good as Josh.

I think if we get this function in and make find_lowest_rq() use it then we
should be okay when multiple tasks wakeup simultaneously. It's not bullet
proof, but good enough, me thinks.

Thoughts?

Thanks

--
Qais Yousef

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ