Message-ID: <20171031154546.ouryhw4rtpbrch2f@dhcp22.suse.cz>
Date:   Tue, 31 Oct 2017 16:45:46 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Byungchul Park <byungchul.park@....com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        syzbot 
        <bot+e7353c7141ff7cbb718e4c888a14fa92de41ebaa@...kaller.appspotmail.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Johannes Weiner <hannes@...xchg.org>, Jan Kara <jack@...e.cz>,
        jglisse@...hat.com, LKML <linux-kernel@...r.kernel.org>,
        linux-mm@...ck.org, shli@...com, syzkaller-bugs@...glegroups.com,
        Thomas Gleixner <tglx@...utronix.de>,
        Vlastimil Babka <vbabka@...e.cz>, ying.huang@...el.com,
        kernel-team@....com, David Herrmann <dh.herrmann@...il.com>
Subject: Re: possible deadlock in lru_add_drain_all

[CC David Herrmann for shmem_wait_for_pins. The thread starts
 http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@google.com
 with the callchains explained http://lkml.kernel.org/r/20171030151009.ip4k7nwan7muouca@hirez.programming.kicks-ass.net
 for shmem_wait_for_pins involvement see below]

On Tue 31-10-17 16:25:32, Peter Zijlstra wrote:
> On Tue, Oct 31, 2017 at 02:13:33PM +0100, Michal Hocko wrote:
> 
> > > I can indeed confirm it's running old code; cpuhp_state is no more.
> > 
> > Does this mean the below chain is no longer possible with the current
> > linux-next (tip)?
> 
> I see I failed to answer this; no it will happen but now reads like:
> 
> 	s/cpuhp_state/&_up/
> 
> Where we used to have a single lock protecting the hotplug stuff, we now
> have 2, one for bringing stuff up and one for tearing it down.
> 
> This got rid of lock cycles that included cpu-up and cpu-down parts;
> those are false positives because we cannot do cpu-up and cpu-down
> concurrently.
> 
> But this report only includes a single (cpu-up) part and therefore is
> not affected by that change other than a lock name changing.
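[The sed expression quoted above relies on `&` standing for the whole
 matched text in the replacement, so it renames the lock class in place;
 for example:

```shell
# In sed, `&` in the replacement expands to the entire matched text,
# so s/cpuhp_state/&_up/ appends "_up" to every match:
echo "lock(cpuhp_state)" | sed 's/cpuhp_state/&_up/'
# prints: lock(cpuhp_state_up)
```
]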

Hmm, OK. I have quickly glanced through shmem_wait_for_pins and I fail
to see why it needs lru_add_drain_all at all. All we should care about
is the radix tree and the lru cache only cares about the proper
placement on the LRU list which is not checked here. I might be missing
something subtle though. David?

We've had some MM vs. hotplug issues. See e.g. a459eeb7b852 ("mm,
page_alloc: do not depend on cpu hotplug locks inside the allocator"),
so I suspect we might want/need to do something similar for
lru_add_drain_all. It feels like I've already worked on that but for
the life of me I cannot remember.

Anyway, this lock dependency is subtle as hell and I am worried that we
might have way too many of those. We have so many callers of
get_online_cpus that dependencies like this are just waiting to blow up.
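
[The kind of dependency cycle this thread is about can be pictured as a
 small directed graph; below is a minimal user-space sketch (not kernel
 code; the node names are illustrative stand-ins, not the exact lockdep
 classes from the report) of the cycle detection lockdep effectively
 performs each time it records a new lock ordering:

```python
def find_cycle(edges):
    """Detect a cycle in a directed lock-dependency graph, where an edge
    A -> B means "while holding/waiting on A, we acquire/wait on B"."""
    visiting, done = set(), set()

    def dfs(node):
        visiting.add(node)
        for nxt in edges.get(node, ()):
            if nxt in visiting:          # back edge: ordering inversion
                return True
            if nxt not in done and dfs(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(n) for n in list(edges) if n not in done)

# Hypothetical edges mirroring the shape of the report: cpu-up holds the
# hotplug lock while work is flushed, lru_add_drain_all() takes
# get_online_cpus(), and shmem_wait_for_pins() calls lru_add_drain_all().
deps = {
    "cpuhp_state_up": ["lru_add_drain_all"],
    "lru_add_drain_all": ["shmem_wait_for_pins"],
    "shmem_wait_for_pins": ["cpuhp_state_up"],
}

print("cycle:", find_cycle(deps))  # prints: cycle: True
```

 Dropping any one edge (e.g. not calling lru_add_drain_all from
 shmem_wait_for_pins, as questioned above) breaks the cycle.]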

-- 
Michal Hocko
SUSE Labs
