Message-ID: <20181025000707.GR3109@worktop.c.hoisthospitality.com>
Date:   Thu, 25 Oct 2018 02:07:07 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Rik van Riel <riel@...riel.com>,
        Yi Wang <wang.yi59@....com.cn>, zhong.weidong@....com.cn,
        Yi Liu <liu.yi24@....com.cn>,
        Frederic Weisbecker <frederic@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2] sched/core: Don't mix isolcpus and housekeeping CPUs

On Wed, Oct 24, 2018 at 04:00:02PM +0530, Srikar Dronamraju wrote:
> * Peter Zijlstra <peterz@...radead.org> [2018-10-24 12:03:23]:
> 
> > It appears to me the for_each_online_node() iteration in
> > task_numa_migrate() needs an additional test to see if the selected node
> > has any CPUs in the relevant sched_domain _at_all_.
> > 
> 
> Yes, this should work.
> Yi Wang does this extra check a little differently.
> 
> http://lkml.kernel.org/r/1540177516-38613-1-git-send-email-wang.yi59@zte.com.cn

That's completely broken. Nothing in the numa balancing path uses that
variable and afaict preemption is actually enabled where that's used, so
using that per-cpu variable at all is broken.

> However the last time I had posted you didn't like that approach.
> http://lkml.kernel.org/r/20170406073659.y6ubqriyshax4v4m@hirez.programming.kicks-ass.net

You again checked against isolated_map there, which is not the immediate
problem.

Both of you are fixing symptoms, not the cause.

> Further, I would think the number of times we would be calling
> sched_setaffinity would be far less than task_numa_migrate().
> In the regular case, where we never have isolcpus, we add this extra check.

But it doesn't solve the problem.

You can create multiple partitions with cpusets but still have an
unbound task in the root cgroup. That would suffer the exact same
problems.

Thing is, load-balancing, of any kind, should respect sched_domains, and
currently numa balancing barely looks at them.

The proposed patch puts the minimal constraints on the numa balancer to
respect sched_domains; but doesn't yet correctly deal with hotplug.

isolcpus is just one case that goes wrong.
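
To make the constraint discussed above concrete, here is a rough userspace sketch of the kind of test meant: when iterating over candidate nodes, skip any node that has no CPUs at all inside the sched_domain span the task lives in. This is not the proposed patch and it does not use the kernel cpumask API; the bitmask type, the 8-CPU/2-node topology and the names node_cpus, node_in_domain() and pick_candidate_nodes() are all assumptions made up for illustration.

```c
#include <stdint.h>

#define NR_NODES 2

/* Hypothetical model: one bit per CPU. Not the kernel's struct cpumask. */
typedef uint64_t cpu_bits;

/* Assumed topology for illustration: 8 CPUs split over 2 NUMA nodes. */
static const cpu_bits node_cpus[NR_NODES] = {
	0x0F,	/* node 0: CPUs 0-3 */
	0xF0,	/* node 1: CPUs 4-7 */
};

/* Non-zero if the node has at least one CPU inside the domain span. */
static int node_in_domain(int nid, cpu_bits domain_span)
{
	return (node_cpus[nid] & domain_span) != 0;
}

/*
 * Collect the nodes a task on src_nid may be migrated to, given the span
 * of the sched_domain (partition) it belongs to. Nodes with no CPUs in
 * that span are never considered. Returns the number of candidates
 * written to out[], which must have room for NR_NODES entries.
 */
static int pick_candidate_nodes(int src_nid, cpu_bits domain_span, int *out)
{
	int n = 0;

	for (int nid = 0; nid < NR_NODES; nid++) {
		if (nid == src_nid)
			continue;
		if (!node_in_domain(nid, domain_span))
			continue;	/* node is entirely outside the domain */
		out[n++] = nid;
	}
	return n;
}
```

With a domain span of 0x0F (a partition covering only CPUs 0-3), a task on node 0 gets no candidate nodes, because node 1 lies wholly in another partition. With a span of 0xFF, node 1 becomes a candidate. Whether the partition came from isolcpus or from cpusets makes no difference to the check, and like the real thing this sketch ignores hotplug.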
