linux-kernel - Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180824150858.GB4244@redhat.com>
Date:   Fri, 24 Aug 2018 11:08:59 -0400
From:   Jerome Glisse <jglisse@...hat.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Christian König <christian.koenig@....com>,
        kvm@...r.kernel.org,
        Radim Krčmář <rkrcmar@...hat.com>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Sudeep Dutt <sudeep.dutt@...el.com>,
        dri-devel@...ts.freedesktop.org, linux-mm@...ck.org,
        Andrea Arcangeli <aarcange@...hat.com>,
        "David (ChunMing) Zhou" <David1.Zhou@....com>,
        Dimitri Sivanich <sivanich@....com>,
        Jason Gunthorpe <jgg@...pe.ca>, linux-rdma@...r.kernel.org,
        amd-gfx@...ts.freedesktop.org, David Airlie <airlied@...ux.ie>,
        Doug Ledford <dledford@...hat.com>,
        David Rientjes <rientjes@...gle.com>,
        xen-devel@...ts.xenproject.org, intel-gfx@...ts.freedesktop.org,
        Jani Nikula <jani.nikula@...ux.intel.com>,
        Leon Romanovsky <leonro@...lanox.com>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Juergen Gross <jgross@...e.com>,
        Mike Marciniszyn <mike.marciniszyn@...el.com>,
        Dennis Dalessandro <dennis.dalessandro@...el.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Ashutosh Dixit <ashutosh.dixit@...el.com>,
        Alex Deucher <alexander.deucher@....com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Felix Kuehling <felix.kuehling@....com>
Subject: Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

On Fri, Aug 24, 2018 at 02:33:41PM +0200, Michal Hocko wrote:
> On Fri 24-08-18 14:18:44, Christian König wrote:
> > Am 24.08.2018 um 14:03 schrieb Michal Hocko:
> > > On Fri 24-08-18 13:57:52, Christian König wrote:
> > > > Am 24.08.2018 um 13:52 schrieb Michal Hocko:
> > > > > On Fri 24-08-18 13:43:16, Christian König wrote:
> > > [...]
> > > > > > That won't work like this there might be multiple
> > > > > > invalidate_range_start()/invalidate_range_end() pairs open at the same time.
> > > > > > E.g. the lock might be taken recursively and that is illegal for a
> > > > > > rw_semaphore.
> > > > > I am not sure I follow. Are you saying that one invalidate_range might
> > > > > trigger another one from the same path?
> > > > No, but what can happen is:
> > > > 
> > > > invalidate_range_start(A,B);
> > > > invalidate_range_start(C,D);
> > > > ...
> > > > invalidate_range_end(C,D);
> > > > invalidate_range_end(A,B);
> > > > 
> > > > Grabbing the read lock twice would be illegal in this case.
> > > I am sorry but I still do not follow. What is the context the two are
> > > called from?
> > 
> > I don't have the slightest idea.
> > 
> > > Can you give me an example. I simply do not see it in the
> > > code, mostly because I am not familiar with it.
> > 
> > I'm neither.
> > 
> > We stumbled over that by pure observation and after discussing the problem
> > with Jerome came up with this solution.
> > 
> > No idea where exactly that case comes from, but I can confirm that it indeed
> > happens.
> 
> Thiking about it some more, I can imagine that a notifier callback which
> performs an allocation might trigger a memory reclaim and that in turn
> might trigger a notifier to be invoked and recurse. But notifier
> shouldn't really allocate memory. They are called from deep MM code
> paths and this would be extremely deadlock prone. Maybe Jerome can come
> up some more realistic scenario. If not then I would propose to simplify
> the locking here. We have lockdep to catch self deadlocks and it is
> always better to handle a specific issue rather than having a code
> without a clear indication how it can recurse.

Multiple concurrent mmu notifier, for overlapping range or not, is
common (each concurrent threads can trigger some). So you might have
multiple invalidate_range_start() in flight for same mm and thus might
complete in different order (invalidate_range_end()). IIRC this is
what this lock was trying to protect against.

I can't think of a reason for recursive mmu notifier call right now.
I will ponder see if i remember something about it.

Cheers,
Jérôme