lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wiVN-+P1vOCSMyfGwYQD3hF7A18OJyXgpiMwGDfMaU+8w@mail.gmail.com>
Date:   Tue, 11 Aug 2020 16:10:57 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Peter Xu <peterx@...hat.com>
Cc:     Andrea Arcangeli <aarcange@...hat.com>,
        Linux-MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Marty Mcfadden <mcfadden8@...l.gov>,
        "Maya B . Gokhale" <gokhale2@...l.gov>,
        Jann Horn <jannh@...gle.com>, Christoph Hellwig <hch@....de>,
        Oleg Nesterov <oleg@...hat.com>,
        Kirill Shutemov <kirill@...temov.name>, Jan Kara <jack@...e.cz>
Subject: Re: [PATCH v3] mm/gup: Allow real explicit breaking of COW

On Tue, Aug 11, 2020 at 2:43 PM Peter Xu <peterx@...hat.com> wrote:
>
> I don't know good enough on the reuse refactoring patch (which at least looks
> functionally correct), but... IMHO we still need the enforced cow logic no
> matter we refactor the page reuse logic or not, am I right?
>
> Example:
>
>   - Process A & B shares private anonymous page P0
>
>   - Process A does READ of get_user_pages() on page P0
>
>   - Process A (e.g., another thread of process A, or as long as process A still
>     holds the page P0 somehow) writes to page P0 which triggers cow, so for
>     process A the page P0 is replaced by P1 with identical content
>
> Then process A still keeps the reference to page P0 that potentially belongs to
> process B or others?

The COW from process A will indeed keep a reference to page P0 (for
whatever nefarious kernel use it did the GUP for). And yes, that page
is still mapped into process B.

HOWEVER.

Since the GUP will be incrementing the reference count of said page,
the actual problem has gone away. Because the GUP copy won't be
modifying the page (it's a read lookup), and as long as process B only
reads from the page, we're happily sharing a read-only page.

And if process B now starts writing to it, the "page_count()" check at
fault time will trigger, and process B will do a COW copy.

So now we'll have three copies of the page: the original one is being
kept for the GUP, and both A and B did their COW copies in their page
tables.

And that's exactly what we wanted - they are all now containing
different data, after all.

The problem with the *current* code is that we don't actually look at
the page count at all, only the mapping count, so the GUP reference
count is basically invisible.

And the reason we don't look too closely at the page count is that
there's a lot of incidental things that can affect it, like the whole
KSM reference, the swap cache reference, other GUP users etc etc. So
we historically have tried to maximize the amount of sharing we can
do.

But that "maximize sharing" is really complicated.

That's the big change of that simplification patch - it's basically
saying that "whenever _anything_ else has a reference to that page,
we'll just copy and not even try to share".

             Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ