Message-ID: <X/icBLF59bREm97b@redhat.com>
Date:   Fri, 8 Jan 2021 12:53:08 -0500
From:   Andrea Arcangeli <aarcange@...hat.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Will Deacon <will@...nel.org>, Linux-MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Yu Zhao <yuzhao@...gle.com>, Andy Lutomirski <luto@...nel.org>,
        Peter Xu <peterx@...hat.com>,
        Pavel Emelyanov <xemul@...nvz.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Minchan Kim <minchan@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Hugh Dickins <hughd@...gle.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        Oleg Nesterov <oleg@...hat.com>, Jann Horn <jannh@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        John Hubbard <jhubbard@...dia.com>,
        Leon Romanovsky <leonro@...dia.com>,
        Jason Gunthorpe <jgg@...pe.ca>, Jan Kara <jack@...e.cz>,
        Kirill Tkhai <ktkhai@...tuozzo.com>,
        Nadav Amit <nadav.amit@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 2/2] mm: soft_dirty: userfaultfd: introduce
 wrprotect_tlb_flush_pending

On Fri, Jan 08, 2021 at 09:39:56AM -0800, Linus Torvalds wrote:
> page_count() is simply the right and efficient thing to do.
> 
> You talk about all these theoretical inefficiencies for cases like
> zygote and page pinning, which have never ever been seen except as a
> possible attack vector.

Do you intend to eventually fix the zygote vmsplice case or not?
Because in current upstream it is still not fixed when using the
enterprise default config.

> Stop talking about irrelevant things. Stop trying to "optimize" things
> that never happen and don't matter.
> 
> Instead, what matters is the *NORMAL* VM flow.
> 
> Things like COW.
> 
> Things like "oh, now that we check just the page count, we don't even
> need the page lock for the common case any more".
> 
> > For the long term, I can't see how using page_count in do_wp_page is a
> > tenable proposition,
> 
> I think you should re-calibrate your expectations, and accept that
> page_count() is the right thing to do, and live with it.
> 
> And instead of worrying about irrelevant special-case code, start

Irrelevant special case as in: long term GUP pin on the memory?

Or irrelevant special case as in: causing a secondary MMU to hit silent
data loss if a pte is ever wrprotected (arch code, vm86, whatever, all
under mmap_write_lock of course)?

> worrying about the code that gets triggered tens of thousands of times
> a second, on regular loads, without anybody doing anything odd or
> special at all, just running plain and normal shell scripts or any
> other normal traditional load.
> 
> Those irrelevant special cases should be simple and work, not badly
> optimized to the point where they are buggy. And they are MUCH LESS
> IMPORTANT than the normal VM code, so if somebody does something odd,
> and it's slow, then that is the problem for the _odd_ case, not for
> the normal codepaths.
> 
> This is why I refuse to add crazy new special cases to core code. Make
> the rules simple and straightforward, and make the code VM work well.

New special cases? Which new cases?

There's nothing new here besides the zygote case, which wasn't fully
fixed by 09854ba94c6aad7886996bfbee2530b3d8a7f4f4 and is actually the
only new case I can imagine where page_count isn't a regression.

All the old cases that you seem to refer to as irrelevant are in
production in v4.18; I don't see anything new here.

Even for the pure COW case with zero GUP involvement, a hugepage with
COWs happening in different processes would forever hit wp_page_copy,
since the count is always > 1 even though the mapcount can be 1 for all
subpages. A simple app doing fork/exec would forever copy all memory
in the parent even after the exec is finished.
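
To make the fork/exec scenario above concrete, here is a minimal
user-space toy model. It is only a sketch and not kernel code. It
assumes a PTE-mapped hugepage whose subpages all share one reference
count, with one reference taken per mapped subpage, plus a separate
mapcount per subpage. ToyHugePage, reuse_by_refcount, reuse_by_mapcount
and copies_after_exec are hypothetical names made up for this
illustration, and the real kernel accounting has more contributors to
both counters.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyHugePage:
    """Toy hugepage: one shared refcount, one mapcount per subpage (assumption)."""
    nr_subpages: int
    refcount: int = 0
    mapcount: List[int] = field(init=False)

    def __post_init__(self) -> None:
        self.mapcount = [0] * self.nr_subpages

    def map_all(self) -> None:
        # One process maps every subpage: each mapping takes one reference.
        for idx in range(self.nr_subpages):
            self.mapcount[idx] += 1
            self.refcount += 1

    def unmap_all(self) -> None:
        # One process drops every mapping (e.g. the child calling exec).
        for idx in range(self.nr_subpages):
            self.unmap_one(idx)

    def unmap_one(self, idx: int) -> None:
        self.mapcount[idx] -= 1
        self.refcount -= 1


def reuse_by_refcount(page: ToyHugePage, idx: int) -> bool:
    # Count-based policy: reuse only if nobody else holds any reference.
    return page.refcount == 1


def reuse_by_mapcount(page: ToyHugePage, idx: int) -> bool:
    # Mapcount-based policy: reuse if this subpage is mapped exactly once.
    return page.mapcount[idx] == 1


def copies_after_exec(nr_subpages: int,
                      policy: Callable[[ToyHugePage, int], bool]) -> int:
    """Count the copies the parent makes when writing every subpage after exec."""
    page = ToyHugePage(nr_subpages)
    page.map_all()    # parent maps the hugepage
    page.map_all()    # fork: child shares the same mappings
    page.unmap_all()  # exec: child drops all of its mappings
    copies = 0
    for idx in range(nr_subpages):
        if policy(page, idx):
            continue          # write fault reuses the subpage in place
        copies += 1           # write fault copies into a new page
        page.unmap_one(idx)   # old subpage is unmapped from the parent
    return copies


if __name__ == "__main__":
    print("refcount policy copies:", copies_after_exec(512, reuse_by_refcount))
    print("mapcount policy copies:", copies_after_exec(512, reuse_by_mapcount))
```

In this toy model the count-based check keeps copying because the other
subpages of the same hugepage hold the shared count above 1, while the
per-subpage mapcount check reuses every subpage once the child is gone.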

Thanks,
Andrea
