lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <X+O49HrcK1fBDk0Q@redhat.com>
Date:   Wed, 23 Dec 2020 16:39:00 -0500
From:   Andrea Arcangeli <aarcange@...hat.com>
To:     Yu Zhao <yuzhao@...gle.com>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Xu <peterx@...hat.com>,
        Nadav Amit <nadav.amit@...il.com>,
        linux-mm <linux-mm@...ck.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Pavel Emelyanov <xemul@...nvz.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        stable <stable@...r.kernel.org>,
        Minchan Kim <minchan@...nel.org>,
        Will Deacon <will@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect

On Tue, Dec 22, 2020 at 08:36:04PM -0700, Yu Zhao wrote:
> Thanks for the details.

I hope we can find a way put the page_mapcount back where there's a
page_count right now.

If you're so worried about having to maintain a all defined well
documented (or to be documented even better if you ACK it)
marker/catcher for userfaultfd_writeprotect, I can't see how you could
consider to maintain the page fault safe against any random code
leaving too permissive TLB entries out of sync of the more restrictive
pte permissions as it was happening with clear_refs_write, which
worked by luck until page_mapcount was changed to page_count.

page_count is far from optimal, but it is a feature it finally allowed
us to notice that various code (clear_refs_write included apparently
even after the fix) leaves stale too permissive TLB entries when it
shouldn't.

The question is only which way you prefer to fix clear_refs_write and
I don't think we can deviate from those 3 methods that already exist
today. So clear_refs_write will have to pick one of those and
currently it's not falling in the same category with mprotect even
after the fix.

I think if clear_refs_write starts to take the mmap_write_lock and
really start to operate like mprotect, only then we can consider to
make userfaultfd_writeprotect also operate like mprotect.

Even then I'd hope we can at least be allowed to make it operate like
KSM write_protect_page for len <= HPAGE_PMD_SIZE, something that
clear_refs_write cannot do since it works in O(N) and tends to scan
everything at once, so there would be no point to optimize not to
defer the flush, for a process with a tiny amount of virtual memory
mapped.

vm86 also should be fixed to fall in the same category with mprotect,
since performance there is irrelevant.

Thanks,
Andrea

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ