Message-ID: <aR6HtvxhmVxUvd+h@lstrano-desk.jf.intel.com>
Date: Wed, 19 Nov 2025 19:15:02 -0800
From: Matthew Brost <matthew.brost@...el.com>
To: Balbir Singh <balbirs@...dia.com>
CC: Andrew Morton <akpm@...ux-foundation.org>, <linux-kernel@...r.kernel.org>,
	<dri-devel@...ts.freedesktop.org>, <linux-mm@...ck.org>, David Hildenbrand
	<david@...hat.com>, Zi Yan <ziy@...dia.com>, Joshua Hahn
	<joshua.hahnjy@...il.com>, Rakie Kim <rakie.kim@...com>, Byungchul Park
	<byungchul@...com>, Gregory Price <gourry@...rry.net>, Ying Huang
	<ying.huang@...ux.alibaba.com>, Alistair Popple <apopple@...dia.com>, "Oscar
 Salvador" <osalvador@...e.de>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
	Baolin Wang <baolin.wang@...ux.alibaba.com>, "Liam R. Howlett"
	<Liam.Howlett@...cle.com>, Nico Pache <npache@...hat.com>, Ryan Roberts
	<ryan.roberts@....com>, Dev Jain <dev.jain@....com>, Barry Song
	<baohua@...nel.org>, Lyude Paul <lyude@...hat.com>, Danilo Krummrich
	<dakr@...nel.org>, David Airlie <airlied@...il.com>, Simona Vetter
	<simona@...ll.ch>, Ralph Campbell <rcampbell@...dia.com>, Mika
 Penttilä <mpenttil@...hat.com>, Francois Dugast
	<francois.dugast@...el.com>
Subject: Re: [v7 00/16] mm: support device-private THP

On Thu, Nov 20, 2025 at 01:59:09PM +1100, Balbir Singh wrote:
> On 11/20/25 13:50, Balbir Singh wrote:
> > On 11/20/25 13:40, Matthew Brost wrote:
> >> On Wed, Nov 12, 2025 at 10:52:43AM +1100, Balbir Singh wrote:
> >>> On 11/12/25 10:43, Andrew Morton wrote:
> >>>> On Thu, 9 Oct 2025 03:33:33 -0700 Matthew Brost <matthew.brost@...el.com> wrote:
> >>>>
> >>>>>>>> This patch series introduces support for Transparent Huge Page
> >>>>>>>> (THP) migration in zone device-private memory. The implementation enables
> >>>>>>>> efficient migration of large folios between system memory and
> >>>>>>>> device-private memory
> >>>>>>>
> >>>>>>> Lots of chatter for the v6 series, but none for v7.  I hope that's a
> >>>>>>> good sign.
> >>>>>>>
> >>>>>>
> >>>>>> I hope so too, I've tried to address the comments in v6.
> >>>>>>
> >>>>>
> >>>>> Circling back to this series, we will integrate and test this version.
> >>>>
> >>>> How'd it go?
> >>>>
> >>
> >> My apologies for the delay—I got distracted by other tasks in Xe (my
> >> driver) and was out for a bit. Unfortunately, this series breaks
> >> something in the existing core MM code for the Xe SVM implementation. I
> >> have an extensive test case that hammers on SVM, which fully passes
> >> prior to applying this series, but fails randomly with the series
> >> applied (to drm-tip-rc6) due to the below kernel lockup.
> >>
> >> I've tried to trace where the migration PTE gets installed but not
> >> removed or isolate a test case which causes this failure but no luck so
> >> far. I'll keep digging as I have time.
> >>
> >> Beyond that, if I enable Xe SVM + THP, it seems to mostly work (though
> >> the same issue as above eventually occurs), but I do need two additional
> >> core MM patches—one is new code required for Xe, and the other could be
> >> considered a bug fix. Those patches can be included when Xe merges SVM THP
> >> support, but we need to at least not break Xe SVM before this series merges.
> >>
> >> Stack trace:
> >>
> >> INFO: task kworker/u65:2:1642 blocked for more than 30 seconds.
> >> [  212.624286]       Tainted: G S      W           6.18.0-rc6-xe+ #1719
> >> [  212.630561] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> [  212.638285] task:kworker/u65:2   state:D stack:0     pid:1642 tgid:1642  ppid:2      task_flags:0x4208060 flags:0x00080000
> >> [  212.638288] Workqueue: xe_page_fault_work_queue xe_pagefault_queue_work [xe]
> >> [  212.638323] Call Trace:
> >> [  212.638324]  <TASK>
> >> [  212.638325]  __schedule+0x4b0/0x990
> >> [  212.638330]  schedule+0x22/0xd0
> >> [  212.638331]  io_schedule+0x41/0x60
> >> [  212.638333]  migration_entry_wait_on_locked+0x1d8/0x2d0
> >> [  212.638336]  ? __pfx_wake_page_function+0x10/0x10
> >> [  212.638339]  migration_entry_wait+0xd2/0xe0
> >> [  212.638341]  hmm_vma_walk_pmd+0x7c9/0x8d0
> >> [  212.638343]  walk_pgd_range+0x51d/0xa40
> >> [  212.638345]  __walk_page_range+0x75/0x1e0
> >> [  212.638347]  walk_page_range_mm+0x138/0x1f0
> >> [  212.638349]  hmm_range_fault+0x59/0xa0
> >> [  212.638351]  drm_gpusvm_get_pages+0x194/0x7b0 [drm_gpusvm_helper]
> >> [  212.638354]  drm_gpusvm_range_get_pages+0x2d/0x40 [drm_gpusvm_helper]
> >> [  212.638355]  __xe_svm_handle_pagefault+0x259/0x900 [xe]
> >> [  212.638375]  ? update_load_avg+0x7f/0x6c0
> >> [  212.638377]  ? update_curr+0x13d/0x170
> >> [  212.638379]  xe_svm_handle_pagefault+0x37/0x90 [xe]
> >> [  212.638396]  xe_pagefault_queue_work+0x2da/0x3c0 [xe]
> >> [  212.638420]  process_one_work+0x16e/0x2e0
> >> [  212.638422]  worker_thread+0x284/0x410
> >> [  212.638423]  ? __pfx_worker_thread+0x10/0x10
> >> [  212.638425]  kthread+0xec/0x210
> >> [  212.638427]  ? __pfx_kthread+0x10/0x10
> >> [  212.638428]  ? __pfx_kthread+0x10/0x10
> >> [  212.638430]  ret_from_fork+0xbd/0x100
> >> [  212.638433]  ? __pfx_kthread+0x10/0x10
> >> [  212.638434]  ret_from_fork_asm+0x1a/0x30
> >> [  212.638436]  </TASK>
> >>
> > 
> > Hi, Matt
> > 
> > Thanks for the report, two questions
> > 
> > 1. Are you using mm/mm-unstable, we've got some fixes in there (including fixes to remove_migration_pmd())

remove_migration_pmd(): the entry here is a PTE migration entry, not a PMD one.

> >    - Generally a left behind migration entry is a symptom of a failed migration that did not clean up
> >      after itself.

I'm on drm-tip as I generally need the latest version of my driver
because of the speed we move at.

Yes, I agree it looks like somehow a migration PTE is not getting
properly removed.

I'm happy to cherry-pick any patches that you think might be helpful
into my tree.

> > 2. The stack trace is from hmm_range_fault(), not something that this code touches.
> > 

Agreed, this is a symptom of the above issue.

> > The stack trace shows your code is seeing a migration entry and waiting on it.
> > Can you please provide a reproducer for the issue? In the form of a test in hmm-tests.c
> > 

That is my plan. Right now I'm opening up my test, which runs 1000s of
variations of SVM tests, and the test that hangs is not consistent.
Some of these are threaded or multi-process, so it may be a timing
issue, which could be hard to reproduce in hmm-tests.c. I'll do my
best here.

> > Have you been able to bisect the issue?
> 

That is my next step along with isolating a test case.

> Also could you please try with 10b9feee2d0d ("mm/hmm: populate PFNs from PMD swap entry")
> reverted?
> 

I can try, but I highly doubt this is related. The hanging HMM code is
in the PTE walk step after this; also, I am not even enabling THP device
pages in my SVM code to reproduce this.

Matt

> > 
> > Balbir
> > 
> > 
> >> Matt 
> >>
> >>>> Balbir, what's the status here?  It's been a month and this series
> >>>> still has a "needs a new version" feeling to it.  If so, very soon
> >>>> please.
> >>>>
> >>>
> >>> I don't think this needs a new revision, I've been testing frequently
> >>> at my end to see if I can catch any regressions. I have a patch update for
> >>> mm-migrate_device-add-thp-splitting-during-migration.patch, it can be applied
> >>> on top or I can send a new version of the patch. I was waiting
> >>> on any feedback before I sent the patch out, but I'll do it now.
> >>>
> >>>> TODOs which I have noted are
> >>>>
> >>>> https://lkml.kernel.org/r/aOePfeoDuRW+prFq@lstrano-desk.jf.intel.com
> >>>
> >>> This was a clarification on the HMM patch mentioned in the changelog
> >>>
> >>>> https://lkml.kernel.org/r/CABzRoyZZ8QLF5PSeDCVxgcnQmF9kFQ3RZdNq0Deik3o9OrK+BQ@mail.gmail.com
> >>>
> >>> That's a minor comment on not using a temporary declaration, I don't think we need it, let me know if you feel strongly
> >>>
> >>>> https://lkml.kernel.org/r/D2A4B724-E5EF-46D3-9D3F-EBAD9B22371E@nvidia.com
> >>>
> >>> I have a patch for this, which I posted, I can do an update and resend it if required (the one mentioned above)
> >>>
> >>>> https://lkml.kernel.org/r/62073ca1-5bb6-49e8-b8d4-447c5e0e582e@
> >>>>
> >>>
> >>> I can't seem to open this
> >>>
> >>>> plus a general re-read of the
> >>>> mm-migrate_device-add-thp-splitting-during-migration.patch review
> >>>> discussion.
> >>>>
> >>> That's the patch I have
> >>>
> >>> Thanks for following up
> >>> Balbir
> > 
> 
