Message-ID: <64d0da08-6ffd-4bce-bc66-5097913937b4@kernel.org>
Date: Sat, 21 May 2022 15:19:24 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Matthew Wilcox <willy@...radead.org>,
David Hildenbrand <david@...hat.com>
Cc: Chih-En Lin <shiyn.lin@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Christian Brauner <brauner@...nel.org>,
Vlastimil Babka <vbabka@...e.cz>,
William Kucharski <william.kucharski@...cle.com>,
John Hubbard <jhubbard@...dia.com>,
Yunsheng Lin <linyunsheng@...wei.com>,
Arnd Bergmann <arnd@...db.de>,
Suren Baghdasaryan <surenb@...gle.com>,
Colin Cross <ccross@...gle.com>,
Feng Tang <feng.tang@...el.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Mike Rapoport <rppt@...nel.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Anshuman Khandual <anshuman.khandual@....com>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
Daniel Axtens <dja@...ens.net>,
Jonathan Marek <jonathan@...ek.ca>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Peter Xu <peterx@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Fenghua Yu <fenghua.yu@...el.com>,
linux-kernel@...r.kernel.org, Kaiyang Zhao <zhao776@...due.edu>,
Huichun Feng <foxhoundsk.tw@...il.com>,
Jim Huang <jserv.tw@...il.com>
Subject: Re: [RFC PATCH 0/6] Introduce Copy-On-Write to Page Table

On 5/21/22 13:12, Matthew Wilcox wrote:
> On Sat, May 21, 2022 at 06:07:27PM +0200, David Hildenbrand wrote:
>> I'm missing the most important point: why do we care and why should we
>> care to make our COW/fork implementation even more complicated?
>>
>> Yes, we might save some page tables and we might reduce the fork() time,
>> however, which specific workload really benefits from this and why do we
>> really care about that workload? Without even hearing about an example
>> user in this cover letter (unless I missed it), I naturally wonder about
>> relevance in practice.
>
> As I get older (and crankier), I get less convinced that fork() is
> really the right solution for implementing system(). I feel that a
> better model is to create a process with zero threads, but have an fd
> to it. Then manipulate the child process through its fd (eg mmap
> ld.so, open new fds in that process's fdtable, etc). Closing the fd
> launches a new thread in the process (ensuring nobody has an fd to a
> running process, particularly one which is setuid).

Heh, I learned serious programming on Windows, and I thought fork() was
entertaining, cool, and a bad idea when I first learned about it. (I
admit I did think the fact that POSIX fork and exec had many fewer
arguments than CreateProcess was a good thing.) Don't even get me
started on setuid -- if I had my way, distros would set NO_NEW_PRIVS on
boot for the entire system.
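
For reference, the per-process form of that knob is a single prctl()
call. The sketch below is only an illustration: the helper names are
made up for this example, and it assumes a kernel new enough to have
PR_SET_NO_NEW_PRIVS (Linux 3.5 or later). Once set, the flag cannot be
cleared, is inherited across fork(), and is preserved across execve(),
so setting it in the first userspace process would cover everything
started after it.

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: set no_new_privs on the calling thread.
 * Returns 0 on success, -1 on error (errno set by prctl()).
 * The flag is one-way: there is no call that clears it again.
 */
static int set_no_new_privs(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/*
 * Hypothetical helper: query the flag.
 * Returns 1 if set, 0 if not set, -1 on error.
 */
static int get_no_new_privs(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

After set_no_new_privs() succeeds, an execve() of a setuid binary from
this process or any descendant no longer changes credentials.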

I can see a rather different use for this type of shared-pagetable
technology, though: monstrous MAP_SHARED mappings. For database and
some VM users, multiple processes will map the same file. If there was
a way to ensure appropriate alignment (or at least encourage it) and a
way to handle mappings that don't cover the whole file, then having
multiple mappings share the same page tables could be a decent
efficiency gain. This doesn't even need COW -- it's "just" pagetable
sharing.
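
To make the alignment point concrete, here is a rough userspace sketch
of the precondition only. It does not make any kernel share page
tables; it just shows how a process could place a MAP_SHARED file
mapping on a boundary that lines up with one page-table page. The
2 MiB figure assumes x86-64 with 4 KiB pages (512 entries per
last-level table), and map_shared_aligned() and PT_SPAN are names
invented for this example.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: x86-64 with 4 KiB pages, where one last-level page-table
 * page holds 512 entries and therefore spans 2 MiB of virtual memory. */
#define PT_SPAN ((size_t)2 << 20)

/*
 * Hypothetical helper: map the first len bytes of fd with MAP_SHARED at
 * a PT_SPAN-aligned virtual address. File offset 0 is also PT_SPAN
 * aligned, so file offset and address line up the same way in every
 * process that uses this helper. fd must be open for reading and
 * writing. Returns MAP_FAILED on error.
 */
static void *map_shared_aligned(int fd, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t maplen = (len + page - 1) & ~(page - 1);
    size_t reserve = maplen + PT_SPAN;

    /* Reserve enough address space to contain an aligned window. */
    uint8_t *base = mmap(NULL, reserve, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED) {
        perror("mmap (reserve)");
        return MAP_FAILED;
    }

    uintptr_t start = ((uintptr_t)base + PT_SPAN - 1) & ~(uintptr_t)(PT_SPAN - 1);
    size_t head = start - (uintptr_t)base;

    /* Replace the aligned part of the reservation with the file mapping. */
    void *p = mmap((void *)start, maplen, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap (file)");
        munmap(base, reserve);
        return MAP_FAILED;
    }

    /* Give back the unused parts of the reservation on both sides. */
    if (head)
        munmap(base, head);
    munmap((uint8_t *)start + maplen, reserve - head - maplen);
    return p;
}
```

Over-reserving and then trimming is used here because mmap() has no
portable way to request an alignment; a hint address would also work
but is not guaranteed to be honored.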

It's probably a pipe dream, but I like to imagine that the bookkeeping
that would enable this would also enable a much less ad-hoc concept of
who owns which pagetable page. Then things like x86's KPTI LDT mappings
would be less disgusting under the hood.

Android would probably like a similar feature for MAP_ANONYMOUS, or
something else that would enable Zygote to share paging structures
(ideally without fork(), although that's my dream, not necessarily
Android's). This is more complex, since COW is involved. It is also
possibly less valuable: the entire benefit, and then some, might be
achieved by using huge pages for Zygote and arranging for a COW fault on
one normal-size page of a hugepage COW mapping to copy only that one
page.
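
The userspace half of the huge-page idea can be sketched as follows.
This is only an illustration under assumptions: a 2 MiB huge page size
(x86-64), a kernel with transparent huge pages available, and the
made-up names alloc_thp_hinted() and HPAGE_SIZE. What the kernel does
on a later COW fault inside such a region is the part discussed above
and is not shown here.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>

/* Assumption: 2 MiB transparent huge pages, as on x86-64. */
#define HPAGE_SIZE ((size_t)2 << 20)

/*
 * Hypothetical helper: allocate at least len bytes, aligned to and
 * rounded up to HPAGE_SIZE, and ask the kernel to back the region with
 * transparent huge pages. MADV_HUGEPAGE is only a hint, so a failed
 * madvise() (for example on a kernel built without THP) is reported
 * through *hint_ok rather than treated as fatal.
 * Returns NULL if the allocation itself fails; release with free().
 */
static void *alloc_thp_hinted(size_t len, int *hint_ok)
{
    void *p = NULL;
    size_t rounded = (len + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);

    if (rounded == 0 || posix_memalign(&p, HPAGE_SIZE, rounded) != 0)
        return NULL;

    int rc = madvise(p, rounded, MADV_HUGEPAGE);
    if (hint_ok)
        *hint_ok = (rc == 0);
    return p;
}
```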

--Andy