Message-ID: <a3bcac19-78b7-4918-81b3-641a65a19a9d@suse.cz>
Date: Thu, 30 Oct 2025 22:48:18 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Pedro Falcato <pfalcato@...e.de>,
 Andrew Morton <akpm@...ux-foundation.org>, Jonathan Corbet <corbet@....net>,
 David Hildenbrand <david@...hat.com>,
 "Liam R . Howlett" <Liam.Howlett@...cle.com>, Mike Rapoport
 <rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>,
 Michal Hocko <mhocko@...e.com>, Steven Rostedt <rostedt@...dmis.org>,
 Masami Hiramatsu <mhiramat@...nel.org>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Jann Horn <jannh@...gle.com>, linux-kernel@...r.kernel.org,
 linux-fsdevel@...r.kernel.org, linux-doc@...r.kernel.org,
 linux-mm@...ck.org, linux-trace-kernel@...r.kernel.org,
 linux-kselftest@...r.kernel.org, Andrei Vagin <avagin@...il.com>,
 Barry Song <21cnbao@...il.com>
Subject: Re: [PATCH 1/3] mm: introduce VM_MAYBE_GUARD and make visible for
 guard regions
On 10/30/25 20:47, Lorenzo Stoakes wrote:
> On Thu, Oct 30, 2025 at 07:47:34PM +0100, Vlastimil Babka wrote:
>> >
>> > Could we use MADVISE_VMA_READ_LOCK mode (would be actually an improvement
>> > over the current MADVISE_MMAP_READ_LOCK), together with the atomic flag
>> > setting? I think the places that could race with us to cause RMW use vma
>> > write lock so that would be excluded. Fork AFAICS unfortunately doesn't (for
>> > the oldmm) and it probably wouldn't make sense to start doing it. Maybe we
>> > could think of something to deal with this special case...
>>
>> During discussion with Pedro off-list I realized fork takes mmap lock for
>> write on the old mm, so if we kept taking mmap sem for read, then vma lock
>> for read in addition (which should be cheap enough, also we'd only need it
>> in case VM_MAYBE_GUARD is not yet set), and set the flag atomically, perhaps
>> that would cover all non-benign races?
>>
>>
> 
> We take VMA write lock in dup_mmap() on each mpnt (old VMA).

Ah yes, I thought it was the new one.

> We take the VMA write lock (vma_start_write()) for each mpnt.
> 
> We then vm_area_dup() the mpnt to the new VMA before calling:
> 
> copy_page_range()
> -> vma_needs_copy()
> 
> Which is where the check is done.
> 
> So we are holding the VMA write lock, so a VMA read lock should suffice, no?

Yeah, even better!

> For belts + braces we could atomically read the flag in vma_needs_copy(),
> though note it's intended VM_COPY_ON_FORK could have more than one flag.
> 
> We could drop that for now and be explicit.

Great!