Date: Tue, 12 Jan 2021 09:07:31 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: John Hubbard <jhubbard@...dia.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux-MM <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Yu Zhao <yuzhao@...gle.com>,
	Andy Lutomirski <luto@...nel.org>,
	Peter Xu <peterx@...hat.com>,
	Pavel Emelyanov <xemul@...nvz.org>,
	Mike Kravetz <mike.kravetz@...cle.com>,
	Mike Rapoport <rppt@...ux.vnet.ibm.com>,
	Minchan Kim <minchan@...nel.org>,
	Will Deacon <will@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Hugh Dickins <hughd@...gle.com>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Matthew Wilcox <willy@...radead.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Jann Horn <jannh@...gle.com>,
	Kees Cook <keescook@...omium.org>,
	Leon Romanovsky <leonro@...dia.com>,
	Jason Gunthorpe <jgg@...pe.ca>,
	Jan Kara <jack@...e.cz>,
	Kirill Tkhai <ktkhai@...tuozzo.com>,
	Nadav Amit <nadav.amit@...il.com>,
	Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH 0/1] mm: restore full accuracy in COW page reuse

On Mon, Jan 11, 2021 at 2:18 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Mon, Jan 11, 2021 at 11:19 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> Actually, what I think might be a better model is to actually
> strengthen the rules even more, and get rid of GUP_PIN_COUNTING_BIAS
> entirely.
>
> What we could do is just make a few clear rules explicit (most of
> which we already basically hold to). Starting from that basic
>
>  (a) Anonymous pages are made writable (ie COW) only when they have a
> page_count() of 1

Seems reasonable to me.

>
> That very simple rule then automatically results in the corollary
>
>  (b) a writable page in a COW mapping always starts out reachable
> _only_ from the page tables

Seems reasonable.
I guess that if the COW is triggered by GUP, then it starts out
reachable only from the page tables but then becomes reachable through
GUP very soon thereafter.

>
> and now we could have a couple of really simple new rules:
>
>  (c) we never ever make a writable page in a COW mapping read-only
> _unless_ it has a page_count() of 1

I don't love this.  Having mprotect() fail in a multithreaded process
because another thread happens to be doing a short-lived IO seems like
it may result in annoying intermittent bugs.

As I understand it, the issue is that the way we determine that we
need to COW a COWable page is that we see that it's read-only.  It
would be nice if we could separately track "the VMA allows writes" and
"this PTE points to a page that is private to the owning VMA", but
maybe there's no bit available for the latter other than looking at RO
vs RW directly.

>
>  (d) we never create a swap cache page out of a writable COW mapping page
>
> Now, if you combine these rules, the whole need for the
> GUP_PIN_COUNTING_BIAS basically goes away.
>
> Why? Because we know that the _only_ thing that can elevate the
> refcount of a writable COW page is GUP - we'll just make sure nothing
> else touches it.

How common is !FOLL_WRITE GUP?  We could potentially say that a
short-term !FOLL_WRITE GUP is permitted on an RO COW page and that a
subsequent COW on the page will wait for the GUP to go away.  This
might be too big a can of worms for the benefit it would provide,
though.

--Andy
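
[Illustration, not part of the original message.]  The interaction
between rules (a), (c) and (d) and a short-lived GUP reference can be
sketched as a toy userspace model.  This is only a sketch under loud
assumptions: it is not kernel code, struct toy_page and every toy_*
helper are hypothetical names made up for this example, and a single
integer stands in for page_count().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model of rules (a), (c) and (d) from the quoted proposal.
 * NOT kernel code: struct toy_page and all toy_* helpers are hypothetical
 * names invented for this illustration.
 */
struct toy_page {
	int refcount;   /* stands in for page_count() */
	bool writable;  /* the PTE maps the page read-write */
};

/* A GUP user takes an extra reference on the page. */
static void toy_gup_get(struct toy_page *p)
{
	p->refcount++;
}

/* The GUP user drops its reference again. */
static void toy_gup_put(struct toy_page *p)
{
	assert(p->refcount > 1); /* the mapping itself keeps one reference */
	p->refcount--;
}

/* Rule (a): a page may be made writable in place only at refcount 1. */
static bool toy_cow_may_reuse(const struct toy_page *p)
{
	return p->refcount == 1;
}

/* Rule (c): a writable page may be made read-only only at refcount 1. */
static bool toy_may_write_protect(const struct toy_page *p)
{
	return !p->writable || p->refcount == 1;
}

/* Rule (d): a writable COW page never becomes a swap cache page. */
static bool toy_may_add_to_swap_cache(const struct toy_page *p)
{
	return !p->writable;
}
```

In this model, a writable page with one outstanding toy_gup_get()
makes toy_may_write_protect() return false until the matching
toy_gup_put().  That is the mprotect() concern raised above: the
outcome depends on whether another thread happens to hold a transient
reference at that moment.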