Message-ID: <81cf0943-e258-494c-812a-0c00b11cf807@redhat.com>
Date: Fri, 20 Oct 2023 12:02:19 +0200
From: David Hildenbrand <david@...hat.com>
To: Peter Xu <peterx@...hat.com>
Cc: Lokesh Gidra <lokeshgidra@...gle.com>,
Suren Baghdasaryan <surenb@...gle.com>,
akpm@...ux-foundation.org, viro@...iv.linux.org.uk,
brauner@...nel.org, shuah@...nel.org, aarcange@...hat.com,
hughd@...gle.com, mhocko@...e.com, axelrasmussen@...gle.com,
rppt@...nel.org, willy@...radead.org, Liam.Howlett@...cle.com,
jannh@...gle.com, zhangpeng362@...wei.com, bgeffon@...gle.com,
kaleshsingh@...gle.com, ngeoffray@...gle.com, jdduke@...gle.com,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
kernel-team@...roid.com
Subject: Re: [PATCH v3 2/3] userfaultfd: UFFDIO_MOVE uABI
On 19.10.23 21:53, Peter Xu wrote:
> On Thu, Oct 19, 2023 at 05:41:01PM +0200, David Hildenbrand wrote:
>> That's not my main point. It can easily become a maintenance burden without
>> any real use cases yet that we are willing to support.
>
> That's why I requested a few times that we can discuss the complexity of
> cross-mm support already here, and I'm all ears if I missed something on
> the "maintenance burden" part..
>
> I started by listing what I think might be different, and we can easily
> speedup single-mm with things like "if (ctx->mm != mm)" checks with
> e.g. memcg, just like what this patch already did with pgtable depositions.
>
> We keep saying "maintenance burden" but we refuse to discuss what is that..
Let's recap:
(1) We have person A up-streaming code written by person B, whereby B is
neither involved in the discussions nor seems to be actively maintaining
that code.
Worse, the code that is getting up-streamed was originally based on a
different kernel version that has significant differences in some key
areas -- for example, page pinning, exclusive vs. shared.
I claim that nobody here fully understands the code at hand (just look
at the previous discussions), and reviewers have to sort out the mess
that was created by the very way this stuff is getting upstreamed here.
We're already struggling to get the single-mm case working correctly.
(2) Cross-mm was not even announced anywhere, nor was any use case for
it mentioned; I had to stumble over this while digging through the code.
Further, is it even *tested*? AFAIKS in patch #3 no. Why do we have to
make the life of reviewers harder by forcing them to review code that
currently *nobody* on this earth needs?
(3) You said "What else we can benefit from single mm? One less mmap
read lock, but probably that's all we can get;" and I presented two
non-obvious issues. I did not even look any further because I really
have better things to do than review complicated code without real use
cases at hand. As I said "(maybe that works as expected, I
don't know and I have no time to spare on reviewing features with no
real use cases)"; apparently I was right by just guessing that memcg
handling is missing.
The sub-feature in question (cross-mm) has no solid use cases; at this
point I am not even convinced the use case you raised requires
*userfaultfd*. For the purpose of moving a whole VMA worth of pages
between two processes, I don't see the immediate need to get userfaultfd
involved and move individual pages under page lock etc.
>
> I'll leave that to Suren and Lokesh to decide. For me the worst case is
> one more flag which might be confusing, which is not the end of the world..
> Suren, you may need to work more thoroughly to remove cross-mm implications
> if so, just like when renaming REMAP to MOVE.
I'm asking myself why you are pushing so hard to include complexity
"just because we can"; that doesn't make any sense to me, honestly.
Maybe you have some other real use cases that ultimately require
userfaultfd for cross-mm that you cannot share?
Will the world end when we have to use a separate flag so we can open
this Pandora's box when really required?
Again, moving anon pages within a process is a known thing; we do that
already via mremap. The only real difference here is that we have to
get the rmap right because we don't adjust VMAs. It's a shame we don't
try to combine both code paths; maybe that's not as easily possible as
it was for mprotect vs. uffd-wp.
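For illustration only, here is a minimal userspace sketch of the mremap
case mentioned above: one anonymous page is moved to a new address within
the same process. It is not code from this patch series, and the helper
name move_anon_page_demo() is hypothetical. Note that mremap() relocates
the mapping itself, which is the part that differs from the approach
discussed in this thread, where VMAs are not adjusted.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration (not part of the patch series).
 * Moves one anonymous page to a different address in the same process
 * via mremap(). Returns 0 if the data arrived at the destination,
 * -1 on any failure.
 */
int move_anon_page_demo(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0)
        return -1;
    size_t len = (size_t)psz;

    /* Source: a private anonymous mapping. */
    char *src = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (src == MAP_FAILED)
        return -1;

    /* Destination: reserve an address range; MREMAP_FIXED replaces it. */
    char *dst = mmap(NULL, len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst == MAP_FAILED) {
        munmap(src, len);
        return -1;
    }

    /* Touch the source so a page is actually populated. */
    strcpy(src, "hello");

    /* Move the mapping (and its page) from src to dst. */
    char *moved = mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
    if (moved == MAP_FAILED) {
        munmap(src, len);
        munmap(dst, len);
        return -1;
    }

    /* The old range is gone; the contents are now visible at dst. */
    int ok = (moved == dst) && strcmp(moved, "hello") == 0;
    munmap(moved, len);
    return ok ? 0 : -1;
}
```

Calling move_anon_page_demo() from a main() on Linux should return 0
when the move succeeds.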
Moving anon pages between processes is currently only done via COW, where
all things (page pinning, memcg, ...) have been figured out and are
simply working as expected. Making uffd special by coding up its own
thing does not sound compelling to me.
I am clearly against any unwarranted features+complexity. Again, I will
stop arguing further; the whole thing of "include it just because we
can" to avoid a flag (that we might never even see) doesn't make any
sense to me and likely never will.
The whole way this feature is getting upstreamed is just messed up IMHO
and the reasoning used in this thread to stick
as-close-as-possible to some code person B wrote some years ago (e.g.,
naming, sub-features) is far out of my comprehension.
--
Cheers,
David / dhildenb