Date:   Tue, 22 Dec 2020 16:20:47 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Yu Zhao <yuzhao@...gle.com>
Cc:     Andrea Arcangeli <aarcange@...hat.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Xu <peterx@...hat.com>,
        Nadav Amit <nadav.amit@...il.com>,
        linux-mm <linux-mm@...ck.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Pavel Emelyanov <xemul@...nvz.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        stable <stable@...r.kernel.org>,
        Minchan Kim <minchan@...nel.org>,
        Will Deacon <will@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

On Tue, Dec 22, 2020 at 3:50 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> The rule is that the TLB flush has to be done before the page table
> lock is released.

I take that back. I guess it's ok as long as the mmap_sem is held for
writing. Then the TLB flush can be delayed until just before releasing
the mmap_sem. I think.

The stale TLB entries still mean that somebody else can write through
them in another thread, but as long as anybody who actually unmaps the
page (and frees it - think rmap etc) is being careful, mprotect()
itself can probably afford to be a bit laissez-faire.

So mprotect() itself should be ok, I think, because it takes the mmap_sem for writing.

Even with the mmap_sem held for writing, truncate and friends can see
the read-only page table entries (because they can look things up
using the file i_mmap thing instead), but then they rely on the page
table lock and they'll also be careful if they then change that PTE
and will force their own TLB flushes.

So I think a pending TLB flush outside the page table lock is fine -
but once again only if you hold the mmap_sem for writing. Not for
reading, because then the page tables need to be synchronized with the
TLB so that other readers don't see the not-yet-synchronized state.
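
To make the rule above concrete, here is a minimal user-space sketch in C. It is a toy model, not kernel code: every name in it (toy_mm, toy_wrprotect_deferred, toy_reader_can_observe_unsynced and so on) is hypothetical and only illustrates the argument, under the assumption that a single stale TLB entry on another CPU is enough to capture the problem.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names here are hypothetical, not kernel APIs. */
enum mmap_lock_mode { MMAP_LOCK_READ, MMAP_LOCK_WRITE };

struct toy_mm {
    bool pte_writable;  /* what the page table entry says */
    bool tlb_writable;  /* what a possibly stale TLB entry on another CPU says */
};

/* Write-protect the PTE, but leave the TLB flush pending. */
void toy_wrprotect_deferred(struct toy_mm *mm)
{
    assert(mm != 0);
    mm->pte_writable = false;
}

/* Bring the TLB back in sync with the page table. */
void toy_flush_tlb(struct toy_mm *mm)
{
    assert(mm != 0);
    mm->tlb_writable = mm->pte_writable;
}

/* True while the page table and the TLB disagree. */
bool toy_flush_pending(const struct toy_mm *mm)
{
    return mm->tlb_writable != mm->pte_writable;
}

/*
 * Can another mmap_sem holder (say, a page fault taking it for reading)
 * look at the page table while the flush is still pending?
 * A write lock excludes every other holder; a read lock does not.
 */
bool toy_reader_can_observe_unsynced(const struct toy_mm *mm,
                                     enum mmap_lock_mode held)
{
    return held == MMAP_LOCK_READ && toy_flush_pending(mm);
}
```

In this model, after toy_wrprotect_deferred() the unsynchronized state is visible to other readers only when the lock was taken with MMAP_LOCK_READ, and it goes away once toy_flush_tlb() has run. The model deliberately ignores the rmap/truncate paths, which rely on the page table lock and do their own flushes.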

It once again looks like it's just userfaultfd that would trigger this
due to the read-lock on the mmap_sem. And mprotect() itself is fine.

Am I missing something?

But apparently Nadav sees problems even with that lock changed to a
write lock. Nadav?

           Linus
