Message-ID: <20170208122612.wasq72hbj4nkh7y3@techsingularity.net>
Date:   Wed, 8 Feb 2017 12:26:12 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Michal Hocko <mhocko@...nel.org>, Christoph Lameter <cl@...ux.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Dmitry Vyukov <dvyukov@...gle.com>, Tejun Heo <tj@...nel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        syzkaller <syzkaller@...glegroups.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: mm: deadlock between get_online_cpus/pcpu_alloc

On Wed, Feb 08, 2017 at 01:02:07PM +0100, Thomas Gleixner wrote:
> On Wed, 8 Feb 2017, Michal Hocko wrote:
> > On Tue 07-02-17 23:25:17, Thomas Gleixner wrote:
> > > On Tue, 7 Feb 2017, Christoph Lameter wrote:
> > > > On Tue, 7 Feb 2017, Michal Hocko wrote:
> > > > 
> > > > > I am always nervous when seeing hotplug locks being used in low level
> > > > > code. It has bitten us several times already and those deadlocks are
> > > > > quite hard to spot when reviewing the code and very rare to hit so they
> > > > > tend to live for a long time.
> > > > 
> > > > Yep. Hotplug events are pretty significant. Using stop_machine_XXXX() etc
> > > > would be advisable and that would avoid the taking of locks and get rid of all the
> > > > complexity, reduce the code size and make the overall system much more
> > > > reliable.
> > > 
> > > Huch? stop_machine() is horrible and heavy weight. Don't go there, there
> > > must be simpler solutions than that.
> > 
> > Absolutely agreed. We are in the page allocator path so using the
> > stop_machine* is just ridiculous. And, in fact, there is a much simpler
> > solution [1]
> > 
> > [1] http://lkml.kernel.org/r/20170207201950.20482-1-mhocko@kernel.org
> 
> Well, yes. It's simple, but from an RT point of view I really don't like
> it as we have to fix it up again.
> 
> On RT we solved the problem of the page allocator differently which allows
> us to do drain_all_pages() from the caller CPU as a side effect. That's
> interesting not only for RT, it's also interesting for NOHZ FULL scenarios
> because you don't inflict the work on the other CPUs.
> 
> https://git.kernel.org/cgit/linux/kernel/git/rt/linux-rt-devel.git/commit/?h=linux-4.9.y-rt-rebase&id=d577a017da694e29a06af057c517f2a7051eb305
> 

It may be worth noting that patches in Andrew's tree no longer disable
interrupts in the per-cpu allocator, and per-cpu draining will now be
done from workqueue context. The motivation was the overhead of the
page allocator, and figures are included with the patches. Allocations
from interrupt context will bypass the per-cpu allocator and take the
irq-safe zone->lock to allocate from the core. This will collide with
the RT patch. The primary patch of interest is
http://www.ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch
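
For readers who want the split between the two paths spelled out, here
is a minimal userspace sketch of the idea. It is not the kernel code:
struct model_zone, model_alloc(), take_from_core() and the in_irq
parameter are all hypothetical names invented for illustration, the
"lock" is just a flag, and the per-cpu refill batch is assumed to be a
single page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
enum alloc_path { PATH_NONE, PATH_PCP, PATH_ZONE_LOCK };

struct model_zone {
	size_t pcp_free;   /* pages cached on the per-cpu list */
	size_t buddy_free; /* pages held by the core free lists */
	bool   lock_held;  /* stands in for the irq-safe zone->lock */
};

/* Take npages from the core lists while holding the modelled lock. */
static bool take_from_core(struct model_zone *z, size_t npages)
{
	bool ok = false;

	assert(!z->lock_held);  /* the model never nests the lock */
	z->lock_held = true;    /* the kernel would use an irq-safe spinlock here */
	if (z->buddy_free >= npages) {
		z->buddy_free -= npages;
		ok = true;
	}
	z->lock_held = false;
	return ok;
}

/*
 * Only order-0 requests made outside interrupt context use the per-cpu
 * list. Everything else goes straight to the core under the zone lock.
 */
static enum alloc_path model_alloc(struct model_zone *z, unsigned int order,
				   bool in_irq)
{
	if (order == 0 && !in_irq) {
		if (z->pcp_free == 0) {
			/* Refill the per-cpu list (assumed batch of one page). */
			if (!take_from_core(z, 1))
				return PATH_NONE;
			z->pcp_free++;
		}
		z->pcp_free--;
		return PATH_PCP;
	}

	return take_from_core(z, (size_t)1 << order) ? PATH_ZONE_LOCK : PATH_NONE;
}
```

In this model an interrupt-context request never touches pcp_free, which
is the property that lets the per-cpu path stop disabling interrupts.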

Draining from workqueue context may be a problem for RT. One option
would be to restrict when the drain happens: drain for high-order
requests only after direct reclaim, and drain for order-0 requests only
if __alloc_pages_may_oom is about to be called.
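
As a rough illustration of that policy, the decision could be written
as a single predicate. This is only a sketch of the suggestion above,
not an existing kernel function: model_should_drain() and its two
boolean parameters are hypothetical, and the real slow path would have
to derive them from its own state.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate for the drain policy described above.
 *
 * order               - allocation order of the failing request
 * direct_reclaim_done - direct reclaim has already run for this request
 * oom_imminent        - the caller is about to invoke the OOM path
 */
static bool model_should_drain(unsigned int order, bool direct_reclaim_done,
			       bool oom_imminent)
{
	/* High-order requests: drain only once direct reclaim has run. */
	if (order > 0)
		return direct_reclaim_done;

	/* Order-0 requests: drain only as a last step before OOM. */
	return oom_imminent;
}
```

With this shape, an order-0 request that has merely gone through direct
reclaim does not trigger a drain, so the common case avoids the
workqueue entirely.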

-- 
Mel Gorman
SUSE Labs