Message-ID: <db2268f0-7885-471d-94a3-8ae4641ba2e5@redhat.com>
Date: Tue, 3 Jun 2025 20:37:29 +0200
From: David Hildenbrand <david@...hat.com>
To: Matthew Wilcox <willy@...radead.org>, Jann Horn <jannh@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
linux-mm@...ck.org, Peter Xu <peterx@...hat.com>,
linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH 1/2] mm/memory: ensure fork child sees coherent memory snapshot
On 03.06.25 20:29, Matthew Wilcox wrote:
> On Tue, Jun 03, 2025 at 08:21:02PM +0200, Jann Horn wrote:
>> When fork() encounters possibly-pinned pages, those pages are immediately
>> copied instead of just marking PTEs to make CoW happen later. If the parent
>> is multithreaded, this can cause the child to see memory contents that are
>> inconsistent in multiple ways:
>>
>> 1. We are copying the contents of a page with a memcpy() while userspace
>> may be writing to it. This can cause the resulting data in the child to
>> be inconsistent.
>> 2. After we've copied this page, future writes to other pages may
>> continue to be visible to the child while future writes to this page are
>> no longer visible to the child.
>>
>> This means the child could theoretically see incoherent states where
>> allocator freelists point to objects that are actually in use or stuff like
>> that. A mitigating factor is that, unless userspace already has a deadlock
>> bug, userspace can pretty much only observe such issues when fancy lockless
>> data structures are used (because if another thread was in the middle of
>> mutating data during fork() and the post-fork child tried to take the mutex
>> protecting that data, it might wait forever).
>
> Um, OK, but isn't that expected behaviour? POSIX says:
>
> : A process shall be created with a single thread. If a multi-threaded
> : process calls fork(), the new process shall contain a replica of the
> : calling thread and its entire address space, possibly including the
> : states of mutexes and other resources. Consequently, the application
> : shall ensure that the child process only executes async-signal-safe
> : operations until such time as one of the exec functions is successful.
>
> It's always been my understanding that you really, really shouldn't call
> fork() from a multithreaded process.

I have the same recollection, but rather because of concurrent O_DIRECT
and locking (pthread_atfork ...).

Using the allocator example above: what ensures that no other thread
is halfway through modifying allocator state? You really have to
synchronize somehow before calling fork() -- e.g., by grabbing the
allocator locks in a pthread_atfork() handler.

For Linux we document in the man page

"After a fork() in a multithreaded program, the child can safely call
only async-signal-safe functions (see signal-safety(7)) until such time
as it calls execve(2)."

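For illustration only, here is a minimal sketch of the pthread_atfork()
synchronization mentioned above. It is not taken from the patch or from any
real allocator: the names state_lock, in_use, accounted, mutator() and
fork_sees_consistent_state() are hypothetical, and the two counters merely
stand in for allocator state that is briefly inconsistent while the lock is
held. The prepare handler takes the lock before fork(), so no other thread
can be mid-update when the address space is copied.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared state. Invariant: in_use == accounted whenever
 * state_lock is not held. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static long in_use, accounted;
static atomic_int stop;
static pthread_once_t atfork_once = PTHREAD_ONCE_INIT;

/* prepare handler: runs in the forking thread before fork() */
static void lock_state(void)   { pthread_mutex_lock(&state_lock); }
/* parent and child handlers: run after fork() in each process */
static void unlock_state(void) { pthread_mutex_unlock(&state_lock); }

/* Register the handlers exactly once; registering them twice would make
 * the prepare handler lock the mutex twice and deadlock. */
static void register_atfork(void)
{
    int rc = pthread_atfork(lock_state, unlock_state, unlock_state);
    assert(rc == 0);
    (void)rc;
}

/* Second thread that keeps mutating the state under the lock. */
static void *mutator(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop)) {
        pthread_mutex_lock(&state_lock);
        in_use++;        /* invariant is broken between these two writes */
        accounted++;
        pthread_mutex_unlock(&state_lock);
    }
    return NULL;
}

/* Returns 1 if the child observed in_use == accounted, 0 if it did not,
 * and -1 on error. */
int fork_sees_consistent_state(void)
{
    pthread_t thread;
    pid_t pid;
    int status;

    pthread_once(&atfork_once, register_atfork);
    atomic_store(&stop, 0);
    if (pthread_create(&thread, NULL, mutator, NULL) != 0)
        return -1;

    pid = fork();        /* handlers hold state_lock across the fork */
    if (pid == 0)
        _exit(in_use == accounted ? 0 : 1);   /* child: only compare and exit */

    atomic_store(&stop, 1);
    pthread_join(thread, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Built with something like `cc -pthread`, a caller would invoke
fork_sees_consistent_state() from main() and expect it to return 1. Without
the prepare handler, the child could be forked between the two increments and
observe the counters out of step. Unlocking in the child handler is the
conventional pattern but is itself outside the async-signal-safe set, so this
only demonstrates the locking idea, not a fully portable recipe.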
--
Cheers,
David / dhildenb