Message-ID: <20250801162930.GB184255@nvidia.com>
Date: Fri, 1 Aug 2025 13:29:30 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Jens Axboe <axboe@...nel.dk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
Amir Goldstein <amir73il@...il.com>, Kees Cook <kees@...nel.org>,
Josef Bacik <josef@...icpanda.com>,
Matthew Wilcox <willy@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
Pedro Falcato <pfalcato@...e.de>, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
David Hildenbrand <david@...hat.com>
Subject: Re: [PATCH 00/10] convert the majority of file systems to
mmap_prepare
On Fri, Aug 01, 2025 at 03:12:48PM +0100, Lorenzo Stoakes wrote:
> > I would like to suggest we add a vma->prepopulate() callback which is
> > where the remap_pfn should go. Once the VMA is finalized and fully
> > operational the vma_ops have the opportunity to prepopulate any PTEs.
>
> I assume you mean vma->vm_ops->prepopulate ?
Yes
> We also have to think about other places where we prepopulate also, for
> instance the perf mmap call now prepopulates (ahem that was me).
Yes, vfio would also like to do this but can't due to the below issue.
> > This could then actually be locked properly so it is safe with
> > concurrent unmap_mapping_range() (current mmap callback is not safe)
>
> Which lock in particular is problematic? You'd want to hold an rmap write
> lock to avoid racing zap?
I have forgotten, but there was a race with how the current mmap op
was called relative to when the VMA was tracked.
I.e. we should be able to do:
    CPU0                              CPU1
 vm_ops_prepopulate()
   mutex_lock()
   if (!is_mapping_valid)
       return -EINVAL;
   <fill ptes>
   mutex_unlock()
                                   mutex_lock()
                                   is_mapping_valid = false
                                   unmap_mapping_range()
                                   mutex_unlock()
And be sure there are no races. Use the lock of your choice for the
mutex.
The above is not true today under mmap: IIRC the VMA is not added to
the lists that unmap_mapping_range() walks until after mmap() returns.
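As a rough illustration only, the locking pattern in the diagram can be modeled in userspace with a pthread mutex. Everything here is hypothetical: struct mapping_state, model_prepopulate() and model_revoke() are made-up names, and a counter stands in for the PTEs. It is not kernel code and does not reflect any existing vm_ops API. It also assumes the property being asked for, namely that the unmap side can already see whatever the populate side filled in.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of a driver's mapping state. */
struct mapping_state {
	pthread_mutex_t lock;   /* stands in for "the lock of your choice" */
	bool is_mapping_valid;  /* cleared by the revoke path */
	int populated_ptes;     /* stands in for PTEs filled into the VMA */
};

static void mapping_init(struct mapping_state *m)
{
	int rc = pthread_mutex_init(&m->lock, NULL);

	assert(rc == 0);
	(void)rc;
	m->is_mapping_valid = true;
	m->populated_ptes = 0;
}

/* CPU0 side: models the proposed prepopulate step. */
static int model_prepopulate(struct mapping_state *m, int nr)
{
	int ret = 0;

	pthread_mutex_lock(&m->lock);
	if (!m->is_mapping_valid)
		ret = -EINVAL;
	else
		m->populated_ptes += nr;  /* <fill ptes> */
	pthread_mutex_unlock(&m->lock);
	return ret;
}

/* CPU1 side: models invalidation followed by the zap. */
static void model_revoke(struct mapping_state *m)
{
	pthread_mutex_lock(&m->lock);
	m->is_mapping_valid = false;
	/*
	 * Stands in for unmap_mapping_range(). Assumption: everything
	 * populated so far is visible to the zap, which is exactly what
	 * the prepopulate proposal is meant to guarantee.
	 */
	m->populated_ptes = 0;
	pthread_mutex_unlock(&m->lock);
}
```

Whichever side takes the lock first, the model ends with is_mapping_valid false and no populated entries: either the populate runs first and the revoke zaps it, or the revoke runs first and the populate fails with -EINVAL.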
Jason