Message-ID: <65cf6d71-9440-423a-9a70-b6b40622440e@nvidia.com>
Date: Fri, 16 Jan 2026 10:16:07 +1100
From: Balbir Singh <balbirs@...dia.com>
To: mpenttil@...hat.com, linux-mm@...ck.org
Cc: linux-kernel@...r.kernel.org, David Hildenbrand <david@...hat.com>,
Jason Gunthorpe <jgg@...dia.com>, Leon Romanovsky <leonro@...dia.com>,
Alistair Popple <apopple@...dia.com>, Zi Yan <ziy@...dia.com>,
Matthew Brost <matthew.brost@...el.com>
Subject: Re: [PATCH 0/3] Migrate on fault for device pages
On 1/14/26 20:19, mpenttil@...hat.com wrote:
> From: Mika Penttilä <mpenttil@...hat.com>
>
> Currently, the way device page faulting and migration works
> is not optimal if you want to do both fault handling and
> migration at once.
>
> Being able to migrate non-present pages (or pages mapped with incorrect
> permissions, e.g. COW) to the GPU requires doing one of the
> following sequences:
>
> 1. hmm_range_fault() - fault in non-present pages with correct permissions, etc.
> 2. migrate_vma_*() - migrate the pages
>
> Or:
>
> 1. migrate_vma_*() - migrate present pages
> 2. If non-present pages detected by migrate_vma_*():
> a) call hmm_range_fault() to fault pages in
> b) call migrate_vma_*() again to migrate now present pages
>
> The problem with the first sequence is that you always have to do two
> page walks, even though most of the time the pages are already present or
> zero-page mappings, so the common case takes a performance hit.
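>
> For reference, the first sequence looks roughly like the sketch below in
> a driver (heavily simplified: locking, mmu notifier retries and error
> handling are omitted, and "notifier", "vma" and the pfn arrays are assumed
> to be set up by the caller):
>
>         /* Walk #1: fault pages in with the required permissions. */
>         struct hmm_range range = {
>                 .notifier      = &notifier,
>                 .start         = start,
>                 .end           = end,
>                 .hmm_pfns      = hmm_pfns,
>                 .default_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE,
>         };
>         ret = hmm_range_fault(&range);
>
>         /* Walk #2: unmap the now-present pages and install migration entries. */
>         struct migrate_vma args = {
>                 .vma   = vma,
>                 .start = start,
>                 .end   = end,
>                 .src   = src_pfns,
>                 .dst   = dst_pfns,
>                 .flags = MIGRATE_VMA_SELECT_SYSTEM,
>         };
>         ret = migrate_vma_setup(&args);
>         /* ...allocate device memory, copy, migrate_vma_pages(), migrate_vma_finalize()... */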
>
> The second sequence is better for the common case, but far worse if
> pages aren't present, because now you have to walk the page tables three
> times (once to find that the page is not present, once so hmm_range_fault()
> can find a non-present page to fault in, and once again to set up the
> migration). It is also tricky to code correctly. One page table walk
> can cost over 1000 CPU cycles on x86-64, which is a significant hit.
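>
> Schematically, the second sequence ends up as the outline below (again
> heavily simplified; in reality the partially set up migration has to be
> completed or backed out before faulting, which is part of what makes it
> tricky):
>
>         ret = migrate_vma_setup(&args);         /* walk #1: collect present pages */
>         if (args.cpages != args.npages) {
>                 /* Some pages were absent or lacked write permission. */
>                 ret = hmm_range_fault(&range);  /* walk #2: fault them in */
>                 ret = migrate_vma_setup(&args); /* walk #3: collect again */
>         }
>         /* ...copy data, migrate_vma_pages(), migrate_vma_finalize()... */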
>
> We should be able to walk the page table once, faulting
> pages in as required and replacing them with migration entries if
> requested.
>
> Add a new flag to the HMM API, HMM_PFN_REQ_MIGRATE, which requests that
> migration also be prepared during fault handling.
> In addition, for the migrate_vma_setup() call paths, a flag,
> MIGRATE_VMA_FAULT, is added to request fault handling as part of the
> migration.
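>
> As a rough sketch of the intended usage (the exact calling convention is
> whatever the patches below define; this only illustrates the idea, reusing
> the same "range"/"args" setup as above):
>
>         /* hmm_range_fault() path: fault and prepare migration in one walk. */
>         range.default_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE |
>                               HMM_PFN_REQ_MIGRATE;
>         ret = hmm_range_fault(&range);
>
>         /* migrate_vma_setup() path: fault non-present pages in while collecting. */
>         args.flags = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_FAULT;
>         ret = migrate_vma_setup(&args);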
>
> Tested in an x86-64 VM with the HMM test device, passing the selftests.
> Also tested rebased on the
> "Remove device private pages from physical address space" series:
> https://lore.kernel.org/linux-mm/20260107091823.68974-1-jniethe@nvidia.com/
> plus a small adjustment patch, with no problems.
>
> Changes from RFC:
> - rebase on 6.19-rc5
> - adjust for the device THP
> - changes from feedback
>
> Revisions:
> - RFC https://lore.kernel.org/linux-mm/20250814072045.3637192-1-mpenttil@redhat.com/
>
> Cc: David Hildenbrand <david@...hat.com>
> Cc: Jason Gunthorpe <jgg@...dia.com>
> Cc: Leon Romanovsky <leonro@...dia.com>
> Cc: Alistair Popple <apopple@...dia.com>
> Cc: Balbir Singh <balbirs@...dia.com>
> Cc: Zi Yan <ziy@...dia.com>
> Cc: Matthew Brost <matthew.brost@...el.com>
> Suggested-by: Alistair Popple <apopple@...dia.com>
> Signed-off-by: Mika Penttilä <mpenttil@...hat.com>
>
> Mika Penttilä (3):
> mm: unified hmm fault and migrate device pagewalk paths
> mm: add new testcase for the migrate on fault case
> mm:/migrate_device.c: remove migrate_vma_collect_*() functions
>
> include/linux/hmm.h | 17 +-
> include/linux/migrate.h | 6 +-
> lib/test_hmm.c | 100 +++-
> lib/test_hmm_uapi.h | 19 +-
> mm/hmm.c | 657 +++++++++++++++++++++++--
> mm/migrate_device.c | 589 +++-------------------
> tools/testing/selftests/mm/hmm-tests.c | 54 ++
> 7 files changed, 869 insertions(+), 573 deletions(-)
>
I see some kernel test robot failures; I assume there will be a new version
for review?
Balbir