Message-Id: <20260126091749.307-1-guojinhui.liam@bytedance.com>
Date: Mon, 26 Jan 2026 17:17:49 +0800
From: "Jinhui Guo" <guojinhui.liam@...edance.com>
To: <dan.j.williams@...el.com>
Cc: <alexanderduyck@...com>, <bhelgaas@...gle.com>, <bvanassche@....org>, 
	<dakr@...nel.org>, <frederic@...nel.org>, <gregkh@...uxfoundation.org>, 
	<guojinhui.liam@...edance.com>, <helgaas@...nel.org>, 
	<linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>, 
	<rafael@...nel.org>, <tj@...nel.org>
Subject: Re: [PATCH v2 0/3] Add NUMA-node-aware synchronous probing to driver core

On Fri Jan 23, 2026 17:04:27 -0800, Dan Williams wrote:
> Jinhui Guo wrote:
> > Hi all,
> > 
> > ** Overview **
> > 
> > This patchset introduces NUMA-node-aware synchronous probing.
> > 
> > Drivers can initialize and allocate memory on the device’s local
> > node without scattering kmalloc_node() calls throughout the code.
> > NUMA-aware probing was added to PCI drivers in 2005 and has
> > benefited them ever since.
> > 
> > The asynchronous probe path already supports NUMA-node-aware
> > probing via async_schedule_dev() in the driver core. Since NUMA
> > affinity is orthogonal to sync/async probing, this patchset adds
> > NUMA-node-aware support to the synchronous probe path.
> > 
> > ** Background **
> > 
> > The idea arose from a discussion with Bjorn and Danilo about a
> > PCI-probe issue [1]:
> > 
> > when PCI devices on the same NUMA node are probed asynchronously,
> > pci_call_probe() calls work_on_cpu(), pins every probe worker to
> > the same CPU inside that node, and forces the probes to run serially.
> > 
> > Testing three NVMe devices on the same NUMA node of an AMD EPYC 9A64
> > 2.4 GHz processor (all on CPU 0):
> > 
> >   nvme 0000:01:00.0: CPU: 0, COMM: kworker/0:1, probe cost: 53372612 ns
> >   nvme 0000:02:00.0: CPU: 0, COMM: kworker/0:2, probe cost: 49532941 ns
> >   nvme 0000:03:00.0: CPU: 0, COMM: kworker/0:3, probe cost: 47315175 ns
> > 
> > Since the driver core already provides NUMA-node-aware asynchronous
> > probing, we can extend the same capability to the synchronous probe
> > path. This solves the issue and lets other drivers benefit from
> > NUMA-local initialization as well.
> 
> I like that from a global benefit perspective, but not necessarily from
> a regression perspective. Is there a minimal fix to PCI to make its
> current workqueue unbound, then if that goes well come back and move all
> devices into this scheme?

Hi Dan,

Thank you for your time, and apologies for the delayed reply.

I understand your concern about regressions and your preference for a minimal
PCI-only fix first. However, I believe introducing NUMA-node awareness to the
driver core's synchronous probe path is the better solution:

1. The asynchronous path already uses async_schedule_dev() with queue_work_node()
   to bind workers to specific NUMA nodes, and this causes no side effects to
   driver probing.
2. I initially submitted a PCI-only fix [1], but handling asynchronous probing in
   the PCI driver proved difficult. Using current_is_async() works but feels
   fragile. After discussions with Bjorn and Danilo [2][3], moving the solution
   to the driver core makes distinguishing async/sync probing straightforward.
   Testing shows minimal impact on synchronous probe time.
3. If you prefer a PCI-only approach, we could add a flag in struct device_driver
   (default false) that PCI sets during registration. This limits the new path to
   PCI devices while others retain existing behavior. The extra code is ~10 lines
   and can be removed once confidence is established.
4. I'm committed to supporting this: I'll include "Fixes:" tags for any fallout
   and provide patches within a month of any report. Since the logic mirrors the
   core async helper, risk should be low, but I'll take full responsibility
   regardless.
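
The opt-in flag from point 3 could look roughly like the sketch below. This is
a userspace model for illustration only, not kernel code and not taken from the
patchset: struct model_driver, struct model_device, the numa_aware_probe field,
and choose_sync_probe_path() are all hypothetical names, and NUMA_NO_NODE is
redefined locally so the example compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the kernel's "no node assigned" value. */
#define NUMA_NO_NODE (-1)

/* Hypothetical, simplified stand-in for struct device_driver. */
struct model_driver {
	const char *name;
	bool numa_aware_probe;	/* hypothetical opt-in flag, default false */
};

/* Hypothetical, simplified stand-in for struct device. */
struct model_device {
	int numa_node;		/* NUMA_NO_NODE if the node is unknown */
};

enum probe_path {
	PROBE_INLINE,	/* existing behavior: probe in the caller's context */
	PROBE_ON_NODE,	/* new path: probe on a worker bound to the device's node */
};

/*
 * Decide which synchronous probe path to take. Only drivers that set
 * the flag (PCI, in the proposal) reach the new path; everything else,
 * and any device without a usable node, keeps the existing behavior.
 */
static enum probe_path
choose_sync_probe_path(const struct model_driver *drv,
		       const struct model_device *dev,
		       int nr_nodes)
{
	if (!drv->numa_aware_probe)
		return PROBE_INLINE;
	if (dev->numa_node == NUMA_NO_NODE)
		return PROBE_INLINE;
	if (dev->numa_node < 0 || dev->numa_node >= nr_nodes)
		return PROBE_INLINE;
	return PROBE_ON_NODE;
}
```

With the flag left at its default, every driver falls through to PROBE_INLINE,
so removing the flag later would only mean dropping the first check.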

Please let me know if you have other concerns.

[1] https://lore.kernel.org/all/20251230142736.1168-1-guojinhui.liam@bytedance.com/
[2] https://lore.kernel.org/all/20251231165503.GA159243@bhelgaas/
[3] https://lore.kernel.org/all/DFFXIZR1AGTV.2WZ1G2JAU0HFQ@kernel.org/

Best Regards,
Jinhui
