Message-ID: <39d17db6-0f8a-0e54-289b-85b9baf1e936@redhat.com>
Date: Wed, 16 Feb 2022 20:24:12 +0100
From: David Hildenbrand <david@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Oded Gabbay <oded.gabbay@...il.com>
Cc: Jason Gunthorpe <jgg@...pe.ca>, Jan Kara <jack@...e.cz>,
John Hubbard <jhubbard@...dia.com>,
Leon Romanovsky <leonro@...dia.com>,
Linux-MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"Maya B . Gokhale" <gokhale2@...l.gov>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Marty Mcfadden <mcfadden8@...l.gov>,
Kirill Shutemov <kirill@...temov.name>,
Oleg Nesterov <oleg@...hat.com>, Jann Horn <jannh@...gle.com>,
Kirill Tkhai <ktkhai@...tuozzo.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Christoph Hellwig <hch@....de>,
Andrew Morton <akpm@...ux-foundation.org>,
Daniel Vetter <daniel.vetter@...ll.ch>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Peter Xu <peterx@...hat.com>
Subject: Re: [PATCH 1/4] mm: Trial do_wp_page() simplification
On 16.02.22 20:04, Linus Torvalds wrote:
> [ Added David Hildenbrand to the participants. David, see
>
> https://bugzilla.kernel.org/show_bug.cgi?id=215616
>
> for details ]
>
Thanks for sharing.
> On Wed, Feb 16, 2022 at 8:59 AM Oded Gabbay <oded.gabbay@...il.com> wrote:
>>
>> All the details are in the bug, but the bottom line is that somehow,
>> this patch causes corruption when the numa balancing feature is
>> enabled AND we don't use process affinity AND we use GUP to pin pages
>> so our accelerator can DMA to/from system memory.
>
> Hmm. I thought all the remaining issues were related to THP - and
> David Hildenbrand had a series to fix those up.
What I shared recently [1] is part 1 of my COW fixes; it addresses the
COW security issues (missed COW). This fixes 1) of [2].
Part 2 is around fixing "wrong COW" for FOLL_PIN a.k.a. memory
corruption. That's essentially what PageAnonExclusive() will be all
about, making sure that we don't lose synchronicity between GUP and mm
due to a wrong COW. This will fix 3) of [2].
Part 3 is converting O_DIRECT to use FOLL_PIN instead of FOLL_GET to
similarly fix "wrong COW" for O_DIRECT. John is working on that. This
will fix 2) of [2].
>
> The fact that it also shows up with numa balancing is a bit
> unfortunate, because I think that means that that patch series may not
> have caught that case.
>
> That said - what does "we use GUP to pin pages" mean? Does it actually
> use the pinning logic, or just regular old GUP?
If it uses FOLL_PIN, it might be handled by part 2; if it uses O_DIRECT
magic, it might be covered by part 3. If neither, more work might be
needed to convert it to FOLL_PIN, because with the new COW logic we won't
be able to give FOLL_GET the same guarantees we'll have for FOLL_PIN
(which differs from our original plan to fix it all [3]).
[1] https://lkml.kernel.org/r/20220126095557.32392-1-david@redhat.com
[2] https://lore.kernel.org/all/3ae33b08-d9ef-f846-56fb-645e3b9b4c66@redhat.com/
[3] https://lore.kernel.org/all/20211217113049.23850-1-david@redhat.com/T/#u
--
Thanks,
David / dhildenb