Message-ID: <96e49bf1-cd00-318b-c5a5-41279e223f27@nvidia.com>
Date: Sat, 29 Oct 2022 13:42:23 -0700
From: John Hubbard <jhubbard@...dia.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Nadav Amit <nadav.amit@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
Jann Horn <jannh@...gle.com>, X86 ML <x86@...nel.org>,
Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
kernel list <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
Andrea Arcangeli <aarcange@...hat.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
jroedel@...e.de, ubizjak@...il.com,
Alistair Popple <apopple@...dia.com>
Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment
On 10/29/22 13:30, Linus Torvalds wrote:
>> I can think of three options:
>>
>> (a) filesystems just deal with it
>>
>> (b) we could move the "page_remove_rmap()" into the "flush-and-free" path too
>>
>> (c) we could actually add a spinlock (hashed on the page?) for this
>>
>> I think (a) is basically our current expectation.
>
> Side note: anybody doing gup + set_page_dirty() won't be fixed by b/c
> anyway, so I think (a) is basically the only thing.
>
> And that's true even if you do a page pinning gup, since the source of
> the gup may be actively unmapped after the gup.

I was just now writing a response that favored (c) over (b), precisely
because of that, yes. :)

>
> So a filesystem that thinks that only write, or a rmap-accessible mmap
> can turn the page dirty really seems to be fundamentally broken.
>
> And I think that has always been the case, it's just that filesystem
> writers may not have been happy with it, and may not have had
> test-cases for it.
>
> It's not surprising that the filesystem people then try to blame users.
>
> Linus

Yes, lots of unhappy debates about this over the years.

However, I remain intrigued by (c), because if we had a "dirty page lock"
that is looked up by page (much like looking up the ptl), it seems like
a building block that would potentially help solve the whole thing.

The above points about "file system needs to coordinate with mm about
what's allowed to be dirtied, including gup/dma cases" are still true
and not yet solved, yes. But having a solid point of synchronization
for this definitely looks interesting.
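
For illustration only, here is a minimal userspace sketch of what a
"dirty page lock" looked up by page might look like. Everything in it is
an assumption made for the example: struct demo_page, the table size,
the hash, and the two helper operations are hypothetical stand-ins, not
kernel code and not an existing mm API. It only shows the shape of
option (c): hash the page to a lock, then do every dirty-state
transition under that lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size: 2^6 = 64 hashed locks. */
#define DIRTY_LOCK_BITS  6
#define DIRTY_LOCK_COUNT (1u << DIRTY_LOCK_BITS)

/* Stand-in for struct page; holds only the state this sketch needs. */
struct demo_page {
	bool dirty;
};

/* Zero-initialized static atomics start out valid and unlocked (false). */
static atomic_bool dirty_locks[DIRTY_LOCK_COUNT];

/* Hash the page address down to a slot in the lock table. */
static size_t dirty_lock_index(const struct demo_page *page)
{
	uint64_t key = (uint64_t)(uintptr_t)page;

	/* Multiplicative hash; keep the top DIRTY_LOCK_BITS bits. */
	return (size_t)((key * UINT64_C(0x9E3779B97F4A7C15)) >>
			(64 - DIRTY_LOCK_BITS));
}

/* Look up the lock by page, much like looking up a ptl. */
static atomic_bool *dirty_lock_for(const struct demo_page *page)
{
	assert(page != NULL);
	return &dirty_locks[dirty_lock_index(page)];
}

/* Simple test-and-set spinlock: spin until we flip false -> true. */
static void dirty_lock(atomic_bool *lock)
{
	while (atomic_exchange_explicit(lock, true, memory_order_acquire))
		; /* busy-wait */
}

static void dirty_unlock(atomic_bool *lock)
{
	atomic_store_explicit(lock, false, memory_order_release);
}

/* Dirtying side (hypothetical): mark the page dirty under its lock. */
static void page_set_dirty_locked(struct demo_page *page)
{
	atomic_bool *lock = dirty_lock_for(page);

	dirty_lock(lock);
	page->dirty = true;
	dirty_unlock(lock);
}

/*
 * Cleaning side (hypothetical): observe and clear the dirty state under
 * the same lock, so it cannot interleave with a concurrent dirtier.
 */
static bool page_test_and_clear_dirty_locked(struct demo_page *page)
{
	atomic_bool *lock = dirty_lock_for(page);
	bool was_dirty;

	dirty_lock(lock);
	was_dirty = page->dirty;
	page->dirty = false;
	dirty_unlock(lock);
	return was_dirty;
}
```

Because the lock is hashed, unrelated pages can share a slot and contend
with each other; that is the usual trade-off against embedding a lock in
every page. The sketch also says nothing about which paths would have to
take the lock, which is the part that is still unsolved.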

Of course, without working through this more thoroughly, it's not fair
to impose this constraint on the current discussion, understood. :)

thanks,
--
John Hubbard
NVIDIA