Message-ID: <20250905134219.GH616306@nvidia.com>
Date: Fri, 5 Sep 2025 10:42:19 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: ankita@...dia.com
Cc: alex.williamson@...hat.com, yishaih@...dia.com, skolothumtho@...dia.com,
	kevin.tian@...el.com, yi.l.liu@...el.com, zhiw@...dia.com,
	aniketa@...dia.com, cjia@...dia.com, kwankhede@...dia.com,
	targupta@...dia.com, vsethi@...dia.com, acurrid@...dia.com,
	apopple@...dia.com, jhubbard@...dia.com, danw@...dia.com,
	anuaggarwal@...dia.com, mochs@...dia.com, kjaju@...dia.com,
	dnigam@...dia.com, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC 14/14] vfio/nvgrace-gpu: Add link from pci to EGM

On Thu, Sep 04, 2025 at 04:08:28AM +0000, ankita@...dia.com wrote:
> From: Ankit Agrawal <ankita@...dia.com>
> 
> To replicate the host EGM topology in the VM in terms of
> GPU affinity, userspace needs to be aware of which
> GPUs belong to the same socket as the EGM region.
> 
> Expose the list of GPUs associated with an EGM region
> through sysfs. The list can be queried from the auxiliary
> device path.
> 
> On a 2-socket, 4 GPU Grace Blackwell setup, it shows up as the following:
> /sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4
> /sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4
> pointing to egm4.
> 
> /sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5
> /sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5
> pointing to egm5.
> 
> Moreover
> /sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4
> /sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4
> lists links to both the 0008:01:00.0 & 0009:01:00.0 GPU devices.
> 
> and
> /sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5
> /sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5
> lists links to both the 0018:01:00.0 & 0019:01:00.0 GPU devices.

This seems backwards. I would rather the EGM chardev itself have a
directory of links to the PCI devices than have EGM manipulate
sysfs belonging to some other driver and subsystem.
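
Whichever direction the links end up pointing, the information userspace
needs is the same: which GPUs sit behind which EGM region. As a rough
illustration only, the mapping implied by the example paths in the patch
description can be recovered like this. The helper name group_gpus_by_egm
and the regular expression are hypothetical and assume the
"nvgrace_gpu_vfio_pci.egm.N" naming shown in the quoted text; they are not
part of the patch or of any kernel interface.

```python
import re
from collections import defaultdict

# Auxiliary-device paths copied verbatim from the quoted patch description.
PATHS = [
    "/sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4",
    "/sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4",
    "/sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5",
    "/sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5",
]

# Assumption: the last path component is "nvgrace_gpu_vfio_pci.egm.<N>" and
# its parent directory is the GPU's PCI address (domain:bus:device.function).
_EGM_RE = re.compile(
    r"/([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
    r"/nvgrace_gpu_vfio_pci\.egm\.(\d+)$"
)


def group_gpus_by_egm(paths):
    """Return {egm_index: sorted list of GPU PCI addresses}."""
    groups = defaultdict(list)
    for path in paths:
        match = _EGM_RE.search(path)
        if match is None:
            continue  # not an EGM auxiliary-device path, skip it
        pci_addr, egm_index = match.group(1), int(match.group(2))
        groups[egm_index].append(pci_addr)
    return {index: sorted(addrs) for index, addrs in groups.items()}


if __name__ == "__main__":
    for index, gpus in sorted(group_gpus_by_egm(PATHS).items()):
        print(f"egm{index}: {', '.join(gpus)}")
```

With a per-chardev directory of links, the same grouping would fall out of
listing that one directory instead of walking every GPU's sysfs node.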

Jason

