lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 10 Mar 2021 20:28:07 -0400
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, David Rientjes <rientjes@...gle.com>,
        Ben Gardon <bgardon@...gle.com>,
        Michal Hocko <mhocko@...e.com>,
        Jérôme Glisse <jglisse@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Dimitri Sivanich <dimitri.sivanich@....com>
Subject: Re: [PATCH] mm/oom_kill: Ensure MMU notifier range_end() is paired
 with range_start()

On Wed, Mar 10, 2021 at 01:31:17PM -0800, Sean Christopherson wrote:
> Invoke the MMU notifier's .invalidate_range_end() callbacks even if one
> of the .invalidate_range_start() callbacks failed.  If there are multiple
> notifiers, the notifier that did not fail may have performed actions in
> its ...start() that it expects to unwind via ...end().  Per the
> mmu_notifier_ops documentation, ...start() and ...end() must be paired.

No this is not OK, if invalidate_start returns EBUSY invalidate_end
should *not* be called.

As you observed:
 
> The only in-kernel usage that is fatally broken is the SGI UV GRU driver,
> which effectively blocks and sleeps fault handlers during ...start(), and
> unblocks/wakes the handlers during ...end().  But, the only users that
> can fail ...start() are the i915 and Nouveau drivers, which are unlikely
> to collide with the SGI driver.

It used to be worse but I've since moved most of the other problematic
users to the itree notifier which doesn't have the problem.

> KVM is the only other user of ...end(), and while KVM also blocks fault
> handlers in ...start(), the fault handlers do not sleep and originate in

KVM will have its mmu_notifier_count become imbalanced:

static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
                                        const struct mmu_notifier_range *range)
{
        kvm->mmu_notifier_count++;

static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
                                        const struct mmu_notifier_range *range)
{
        kvm->mmu_notifier_count--;

Which I believe is fatal to kvm? These notifiers certainly do not only
happen at process exit.

So, both of the remaining _end users become corrupted with this patch!

I've tried to fix this before, the only thing that seems like it will
work is to sort the hlist and only call ends that have succeeded their
starts by comparing pointers with <.

This is because the hlist can have items removed concurrently under
SRCU so there is no easy way to compute the subset that succeeded in
calling start.

I had a prior effort to just ban more than 1 hlist notifier with end,
but it turns out kvm on ARM uses two all the time (IIRC)

> Found by inspection.  Verified by adding a second notifier in KVM
> that

AFAIK it is a non-problem in real life because kvm is not mixed with
notifier_start's that fail (and GRU is dead?). Everything else was
fixed by moving to itree.

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ