Message-ID: <pk4yq5kt5csh5yqd7f2s66eux6yvluxv3rcwrskaagyxcgo34j@6wei37bgqlye>
Date: Fri, 15 Aug 2025 15:23:35 +1000
From: Alistair Popple <apopple@...dia.com>
To: Mika Penttilä <mpenttil@...hat.com>
Cc: Jason Gunthorpe <jgg@...dia.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, David Hildenbrand <david@...hat.com>,
Leon Romanovsky <leonro@...dia.com>, Balbir Singh <balbirs@...dia.com>
Subject: Re: [RFC PATCH 1/4] mm: use current as mmu notifier's owner
On Thu, Aug 14, 2025 at 08:45:43PM +0300, Mika Penttilä wrote:
>
> On 8/14/25 20:20, Jason Gunthorpe wrote:
>
> > On Thu, Aug 14, 2025 at 08:00:01PM +0300, Mika Penttilä wrote:
> >> as well as hmm test module with :
> >>
> >> * Ignore invalidation callbacks for device private pages since
> >> * the invalidation is handled as part of the migration process.
> >> */
> >> if (range->event == MMU_NOTIFY_MIGRATE &&
> >> range->owner == dmirror->mdevice)
> >> return true;
> > If I recall this was about a very specific case where migration does a
> > number of invalidations and some of the earlier ones are known to be
> > redundant in this specific case. Redundant means it can be ignored
> > without causing an inconsistency.
> >
> > Alistair would know, but I assumed this works OK because the above
> > invalidation doesn't actually go on to free any pages but keeps them
> > around until a later invalidation?
Right, the pages don't actually get freed because a reference is taken on them
during migrate_vma_setup(). However, other device MMUs still need invalidating
because the driver will go on to copy the page after this step. It's just
assumed that the driver is able to be consistent with itself (i.e. it will unmap/
invalidate its own MMU prior to initiating the copy).

In practice I suspect what Mika is running into is that the page table
synchronisation for migration works slightly differently for migrate_vma_*().
Instead of using mmu_interval_notifiers, which have a sequence number, drivers
typically use normal mmu_notifiers and take a device-specific lock to block
page table downgrades (e.g. RW -> RO). This ensures it's safe to update the
device page tables with the PFNs/permissions collected in migrate_vma_setup()
(or the new PFN) by blocking other threads from updating the page table.

The usual problem with this approach is that when migrate_vma_setup() calls
the mmu_notifier it deadlocks on the device-specific lock in the notifier
callback because it already holds the lock, which it can't drop before calling
migrate_vma_setup().

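To make the deadlock concrete, here is a toy userspace model in C. Everything
in it (struct dev, dev_invalidate(), dev_migrate(), the EV_* values) is made up
for illustration and is not kernel API; the owner check only mirrors the hmm
test module snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not kernel API. */
enum ev { EV_MIGRATE, EV_CLEAR };

struct dev {
	bool locked;		/* stand-in for a non-recursive device lock */
	int invalidations;	/* times the device MMU was invalidated */
};

struct range {
	enum ev event;
	const void *owner;	/* who triggered the invalidation */
};

/*
 * Notifier callback model. Returns false where a real callback would
 * block forever because the caller already holds the device lock.
 */
static bool dev_invalidate(struct dev *d, const struct range *r)
{
	assert(d != NULL && r != NULL);

	/* Self-originated migration: the driver handles its own MMU. */
	if (r->event == EV_MIGRATE && r->owner == d)
		return true;

	if (d->locked)
		return false;	/* would deadlock */

	d->locked = true;
	d->invalidations++;
	d->locked = false;
	return true;
}

/* Migration path model: the lock is held across the setup step. */
static bool dev_migrate(struct dev *d, const void *owner)
{
	struct range r = { .event = EV_MIGRATE, .owner = owner };
	bool ok;

	d->locked = true;
	ok = dev_invalidate(d, &r);	/* fired from the setup step */
	d->locked = false;
	return ok;
}
```

With the owner set to the device itself, dev_migrate() gets through; with any
other owner the callback hits the already-held lock, which is the deadlock
described above.
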
I think one of the main benefits of a series which consolidates these two
page-table mirroring techniques into common code would also be to make the
mirroring/invalidation logic the same for migration as for hmm_range_fault(),
i.e. to move to mmu_interval_notifiers with sequence numbers for migration,
perhaps with filtering if required/safe and retries.

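For reference, the sequence-number-plus-retry idea looks roughly like the
userspace sketch below. It is loosely modelled on the mmu_interval_read_begin()
/ mmu_interval_read_retry() pattern, but struct mirror and all the helpers are
hypothetical stand-ins, and the racing invalidations are simulated in a single
thread rather than coming from a real notifier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of this is the kernel API. */
struct mirror {
	unsigned long seq;	/* bumped by every invalidation */
	int cpu_pfn;		/* stand-in for the CPU page table entry */
	int dev_pfn;		/* stand-in for the device page table entry */
};

static unsigned long read_begin(const struct mirror *m)
{
	return m->seq;
}

static bool read_retry(const struct mirror *m, unsigned long seq)
{
	return m->seq != seq;
}

/* Invalidation model: CPU entry changes, device entry is dropped. */
static void invalidate(struct mirror *m, int new_pfn)
{
	m->seq++;
	m->cpu_pfn = new_pfn;
	m->dev_pfn = -1;
}

/*
 * Collect the CPU entry, then commit it to the device only if no
 * invalidation happened in between. racing_invalidations simulates
 * invalidations landing between collection and commit.
 * Returns the number of attempts needed.
 */
static int mirror_update(struct mirror *m, int racing_invalidations)
{
	int attempts = 0;

	assert(racing_invalidations >= 0);

	for (;;) {
		unsigned long seq = read_begin(m);
		int pfn = m->cpu_pfn;		/* collect */

		attempts++;
		if (racing_invalidations > 0) {
			racing_invalidations--;
			invalidate(m, pfn + 1);
		}

		/* A real driver would take its device lock here. */
		if (read_retry(m, seq))
			continue;		/* stale, collect again */

		m->dev_pfn = pfn;		/* commit */
		return attempts;
	}
}
```

The point is that the updater never has to hold a lock across the collection
step; it only has to notice that the sequence number moved and try again.
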
- Alistair
> > This is nothing like what your case is talking about.
>
> This one is actually pretty similar, MMU_NOTIFY_CLEAR is also fired in migration process
> (split case) and invalidation handled part of the migration process.
>
> But I have already a working version without any of that.
>
> >
> > Jason
> >
> --Mika
>
>