Message-ID: <20211221235402.GH1432915@nvidia.com>
Date:   Tue, 21 Dec 2021 19:54:02 -0400
From:   Jason Gunthorpe <jgg@...dia.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Nadav Amit <namit@...are.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Hugh Dickins <hughd@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        John Hubbard <jhubbard@...dia.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Yang Shi <shy828301@...il.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        Michal Hocko <mhocko@...nel.org>,
        Rik van Riel <riel@...riel.com>,
        Roman Gushchin <guro@...com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Peter Xu <peterx@...hat.com>,
        Donald Dutile <ddutile@...hat.com>,
        Christoph Hellwig <hch@....de>,
        Oleg Nesterov <oleg@...hat.com>, Jan Kara <jack@...e.cz>,
        Linux-MM <linux-mm@...ck.org>,
        "open list:KERNEL SELFTEST FRAMEWORK" 
        <linux-kselftest@...r.kernel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via
 FAULT_FLAG_UNSHARE (!hugetlb)

On Tue, Dec 21, 2021 at 04:19:33PM +0100, David Hildenbrand wrote:

> >> Note that I am trying to make also any kind of R/O pins on an anonymous
> >> page work as expected as well, to fix any kind of GUP after fork() and
> >> GUP before fork(). So taking a R/O pin on an !PageAnonExclusive() page
> >> similarly has to make sure that the page is exclusive -- even if it's
> >> mapped R/O (!).
> > 
> > Why? AFAIK we don't have bugs here. If the page is RO and has an
> > elevated refcount it cannot be 'PageAnonExclusive' and so any place
> > that wants to drop the WP just cannot. What is the issue?

> But what I think you actually mean is if we want to get R/O pins
> right.

What I meant was a page that is GUP'd RO, is not PageAnonExclusive and
has an elevated refcount. Those cannot be transformed to
PageAnonExclusive, or re-used during COW, but they also don't have
problems today. Such places are either like O_DIRECT read and tolerant
of a false COW, or they are broken like VFIO and should be using
FOLL_FORCE|FOLL_WRITE, which turns them into a WRITE and then we know
they get PageAnonExclusive.

So, the swap issue is fixed directly with PageAnonExclusive and no
change to READ GUP is required, at least in your #1 scenario, AFAICT..

> There are 2 models, leaving FOLL_FAULT_UNSHARE out of the picture for now:
> 
> 1) Whenever mapping an anonymous page R/W (after COW, during ordinary
> fault, on swapin), we mark the page exclusive. We must never lose the
> PageAnonExclusive bit, not during migration, not during swapout.

I prefer this one as well.

It allows us to keep Linus's simple logic that refcount == 1 means
always safe to re-use, no matter what.

And refcount != 1 goes on to consider the additional bit to decide
what to do. The simple bit really means 'we know this page has one PTE
so ignore the refcount for COW reuse decisions'.
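
As a rough illustration of the decision described above, here is a minimal
userspace sketch in C. It is a toy model only, not kernel code: struct
toy_page, its fields, and toy_can_reuse_on_cow() are hypothetical names
invented for this example, and the real COW fault path has to handle
locking, swap cache and other details that are ignored here.

```c
#include <stdbool.h>

/* Toy stand-in for struct page; hypothetical, not the kernel's definition. */
struct toy_page {
	int refcount;        /* all references to the page, including GUP pins */
	bool anon_exclusive; /* models the proposed PageAnonExclusive bit */
};

/*
 * Returns true if a write fault may reuse the page in place,
 * false if the fault has to copy it.
 */
static bool toy_can_reuse_on_cow(const struct toy_page *page)
{
	/* refcount == 1: nobody else can see the page, always safe to reuse. */
	if (page->refcount == 1)
		return true;

	/*
	 * refcount != 1: fall back to the extra bit. If it is set, the page
	 * is known to have a single PTE, so the refcount is ignored.
	 */
	return page->anon_exclusive;
}
```

With this model, a page with refcount 3 and the bit set is still reused,
while a page with refcount 2 and the bit clear (the R/O GUP case above) is
copied.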

> fork() will process the bit for each and every process, even if there
> was no GUP, and will copy if there are additional references.

Yes, just like it does today already for mapcount.
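
A toy sketch of that fork() handling under model 1, again in userspace C and
not kernel code. The names (struct toy_fork_page, toy_fork_one_page(), the
TOY_FORK_* values) are hypothetical. Two things here are assumptions of the
sketch rather than statements from the thread: "additional references" is
modelled as refcount > mapcount, and sharing a page with the child is
modelled as clearing the exclusive bit.

```c
#include <stdbool.h>

/* Toy stand-in for an anonymous page; hypothetical, not the kernel's. */
struct toy_fork_page {
	int refcount;        /* all references, including GUP pins */
	int mapcount;        /* number of page table mappings */
	bool anon_exclusive; /* models the proposed PageAnonExclusive bit */
};

enum toy_fork_action {
	TOY_FORK_SHARE, /* map the same page R/O into the child */
	TOY_FORK_COPY,  /* give the child its own copy */
};

/* Decide what fork() does with one anonymous page of the parent. */
static enum toy_fork_action toy_fork_one_page(struct toy_fork_page *page)
{
	/* Assumption: references beyond the mappings come from GUP or similar. */
	bool extra_refs = page->refcount > page->mapcount;

	/* Exclusive and referenced elsewhere: the parent keeps page and bit. */
	if (page->anon_exclusive && extra_refs)
		return TOY_FORK_COPY;

	/* Otherwise share it; with two mappers nobody owns it exclusively. */
	page->anon_exclusive = false;
	page->mapcount++;
	page->refcount++;
	return TOY_FORK_SHARE;
}
```

The check runs for every anonymous page regardless of whether GUP was ever
involved, which is the per-page cost being discussed.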

> 2) Whenever GUP wants to pin/ref a page, we try marking it exclusive. We
> can lose the PageAnonExclusive bit during migration and swapout, because
> that can only happen when there are no additional references.

I haven't thought of a way this is achievable.

At least not without destroying GUP fast..

Idea #2 is really a "this page is GUP'd" flag with some sneaky logic
to clear it.  That comes along with all the races too because as an
idea it is fundamentally about GUP which runs without locks.

Jason
