lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LSU.2.11.1807301410470.4805@eggly.anvils>
Date:   Mon, 30 Jul 2018 14:53:21 -0700 (PDT)
From:   Hugh Dickins <hughd@...gle.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
cc:     "Kirill A. Shutemov" <kirill@...temov.name>,
        Hugh Dickins <hughd@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Amit Pundir <amit.pundir@...aro.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        John Stultz <john.stultz@...aro.org>,
        linux-mm <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        youling257@...il.com
Subject: Re: Linux 4.18-rc7

On Mon, 30 Jul 2018, Linus Torvalds wrote:
> On Mon, Jul 30, 2018 at 6:01 AM Kirill A. Shutemov <kirill@...temov.name> wrote:
> >
> > I think I missed vma_set_anonymous() somewhere, but I fail to see where.
> 
> Honestly, by now we just need to revert that commit.
> 
> It's not even clear that it was a good idea to begin with. The rest of
> the commits were cleanups, this one was driven by a incorrect
> VM_BUG_ON() that triggered, and that checked "vma_is_anonymous(vma)"
> without any explanations of wht it should matter.
> 
> I think the biggest problem with vma_is_anonymous() may be its name,
> not what it does.
> 
> What the code historically *did* (and what vma_is_anonymous() checks)
> is not "is this anonymous", but rather "does this have any special
> operations associated with it".
> 
> The two are similar. But people have grown opinions about exactly what
> "anonymous" means. If we had named it just "no_vm_ops()", we wouldn't
> have random crazy checks for "vma_is_anonymous()" in places where it
> makes no sense.
> 
> So what I think we want a real explanation for is why people who use
> "vma_is_anonymous()" care. Instead of trying to change its very
> historical meaning, we should look at the users, and perhaps change
> its name.

You make a very good point on the naming. Especially confusing when
layered on top of "we call shmem 'file' here, but 'anon' there".

I have no problem with reverting -rc7's vma_is_anonymous() series.

> 
> In this case, for example, I think the *real* problem was described by
> commit 684283988f70 ("huge pagecache: mmap_sem is unlocked when
> truncation splits pmd"), and the problem is that an existing check
> that required that mmap_sem was held was changed to say "only for
> anonymous mappings".
> 
> But the fact is, you can truncate mappings that don't have any ops just *fine*.
> 
> So maybe that original BUG() was entirely bogus to begin with, and it
> shouldn't exist at all?
> 
> Or maybe the code should test "do I have a vm_file" instead of testing
> "do I have vm_ops"?
> 
> What's the problem with just doing split_huge_pmd() there when it's a
> pmd_trans_huge or pmd_devmap pmd? Why is that VM_BUG_ON_VMA() there in
> the first place? Why are allegedly "anonymous" mappings so special
> here for locking?
> 
> Adding a few more people to the cc, they were involved the last that
> time VM_BUG_ON_VMA() was modified.
> 
> New people: see commit bfd40eaff5ab ("mm: fix vma_is_anonymous()
> false-positives") for details. Right now I think it's getting
> reverted, but the oops explanation in the commit is about that
> 
>             kernel BUG at mm/memory.c:1422!
> 
> which was/is debatable and seems to make no sense (and definitely is
> still triggerable despite that commit 684283988f70 ("huge pagecache:
> mmap_sem is unlocked when truncation splits pmd") that limited it a
> bit - but I think it didn't limit it enough.

I'm all for deleting that VM_BUG_ON_VMA() in zap_pmd_range(), it was
just a compromise with those who wanted to keep something there;
I don't think we even need a WARN_ON_ONCE() now.

It's historical: back in the day when only (hugetlbfs which never
gets there, and) anon THP used huge pmds for userspace, it did half-
document an obscure assumption underlying __split_huge_pmd(), and
IIRC was added in response to a trinity bug catch in that area.

(It remains quite interesting how exit_mmap() does not come that way,
and most syscalls split the vma beforehand in vma_adjust(): it's mostly
about madvise(,,MADV_DONTNEED), perhaps others now, which zap ptes
without prior splitting.)

Once DAX and huge tmpfs came in, the BUG had to be scrapped or weakened
because truncation and hole-punching can come that way without mmap_sem.
And perhaps other drivers since are taking advantage of huge pmds now,
with other assumptions.

But I have not yet taken a look to see where the -rc7 changes actually
go wrong: I'll spend a little while looking, but don't expect to find it -
certainly don't wait on me, I'll only speak up if I find something.

Hugh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ