Date:   Thu, 9 Feb 2017 20:15:47 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Christoph Lameter <cl@...ux.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Vlastimil Babka <vbabka@...e.cz>,
        Dmitry Vyukov <dvyukov@...gle.com>, Tejun Heo <tj@...nel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        syzkaller <syzkaller@...glegroups.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: mm: deadlock between get_online_cpus/pcpu_alloc

On Thu 09-02-17 11:22:49, Christoph Lameter wrote:
> On Thu, 9 Feb 2017, Thomas Gleixner wrote:
> 
> > You are just not getting it, really.
> >
> > The problem is that this for_each_online_cpu() is racy against a concurrent
> > hot unplug and therefore can queue stuff for a no longer online cpu. That's
> > what the mm folks tried to avoid by preventing a CPU hotplug operation
> > before entering that loop.
> 
> With a stop machine action it is NOT racy because the machine goes into a
> special kernel state that guarantees that key operating system structures
> are not touched. See mm/page_alloc.c's use of that characteristic to build
> zonelists. Thus it cannot be executing for_each_online_cpu and related
> tasks (unless one does not disable preempt .... but that is a given if a
> spinlock has been taken)..

Christoph, you are completely ignoring the reality and the code. There
is no need for stop_machine, nor is it helping anything. As a matter
of fact, synchronization with cpu hotplug is needed if you want to
perform per-cpu specific operations. get_online_cpus is the most
straightforward and heavyweight way to do this synchronization,
but not the only one. As the patch [1] describes, we do not really need
get_online_cpus in drain_all_pages because we can do _better_. But
this is not in any way a generic thing applicable to other code paths.

If you disagree then you are free to post patches, but the hand waving
you are doing here is just wasting everybody's time. So please cut it here
unless you have specific proposals to improve the current situation.

Thanks!

[1] http://lkml.kernel.org/r/20170207201950.20482-1-mhocko@kernel.org
-- 
Michal Hocko
SUSE Labs
