Message-ID: <20250119202252.4fcd2c49.alex.williamson@redhat.com>
Date: Sun, 19 Jan 2025 20:22:52 -0700
From: Alex Williamson <alex.williamson@...hat.com>
To: Ankit Agrawal <ankita@...dia.com>
Cc: Jason Gunthorpe <jgg@...dia.com>, Yishai Hadas <yishaih@...dia.com>,
 "shameerali.kolothum.thodi@...wei.com"
 <shameerali.kolothum.thodi@...wei.com>, "kevin.tian@...el.com"
 <kevin.tian@...el.com>, Zhi Wang <zhiw@...dia.com>, Aniket Agashe
 <aniketa@...dia.com>, Neo Jia <cjia@...dia.com>, Kirti Wankhede
 <kwankhede@...dia.com>, "Tarun Gupta (SW-GPU)" <targupta@...dia.com>,
 Vikram Sethi <vsethi@...dia.com>, Andy Currid <acurrid@...dia.com>,
 Alistair Popple <apopple@...dia.com>, John Hubbard <jhubbard@...dia.com>,
 Dan Williams <danw@...dia.com>, "Anuj Aggarwal (SW-GPU)"
 <anuaggarwal@...dia.com>, Matt Ochs <mochs@...dia.com>,
 "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "linux-kernel@...r.kernel.org"
 <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 3/3] vfio/nvgrace-gpu: Check the HBM training and C2C
 link status

On Sun, 19 Jan 2025 20:12:32 -0700
Alex Williamson <alex.williamson@...hat.com> wrote:

> On Mon, 20 Jan 2025 02:24:14 +0000
> Ankit Agrawal <ankita@...dia.com> wrote:
> 
> > >> +EXPORT_SYMBOL_GPL(vfio_pci_memory_lock_and_enable);
> > >>
> > >>  void vfio_pci_memory_unlock_and_restore(struct vfio_pci_core_device *vdev, u16 cmd)
> > >>  {
> > >>       pci_write_config_word(vdev->pdev, PCI_COMMAND, cmd);
> > >>       up_write(&vdev->memory_lock);
> > >>  }
> > >> +EXPORT_SYMBOL_GPL(vfio_pci_memory_unlock_and_restore);
> > >>
> > >>  static unsigned long vma_to_pfn(struct vm_area_struct *vma)
> > >>  {    
> > >
> > > The access happens before the device is exposed to the user. The
> > > functions above handle conditions where there may be races with user
> > > access, so this is totally unnecessary.
> > 
> > Right. What I could do to reuse the code is to take out the part
> > related to locking/unlocking as new functions and export that.
> > The current vfio_pci_memory_lock_and_enable() would take the lock
> > and call the new function. Same for vfio_pci_memory_unlock_and_restore().
> > The nvgrace module could also call that new function. Does that sound
> > reasonable?  
> 
> No, this is standard PCI driver stuff, everything you need is already
> there.  Probably pci_enable_device() and some variant of
> pci_request_regions().
> 
> > > Does this delay even need to happen in the probe function, or could it
> > > happen in the open_device callback?  That would still be before user
> > > access, but if we expect it to generally work, it would allow the
> > > training to happen in the background up until the user tries to open
> > > the device.  Thanks,
> > >
> > > Alex    
> > 
> > The thought process is that, since this is purely the bare-metal
> > hardware coming up to a proper state during boot, the nvgrace module
> > should probably wait for startup to complete during probe() instead of
> > delaying until open() time.
> 
> If the driver is statically loaded, that might mean you're willing to
> stall boot for up to 30s.  In practice is this ever actually going to
> fail?  Thanks,

On second thought, I guess a vfio-pci variant driver can't
automatically bind to a device, whether statically built or not, so
maybe this isn't a concern.  I'm not sure if there are other concerns
with busy waiting for up to 30s at driver probe.  Thanks,

Alex
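
For illustration only, the kind of bounded wait discussed above (polling a
readiness condition for up to 30s) could be sketched in userspace C roughly
as follows. This is not code from the patch: wait_until_ready(), the
callback types and struct fake_dev are hypothetical names, and the device
is simulated so the timing can be checked without real hardware. Only the
30s upper bound comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback: returns true once the device reports ready. */
typedef bool (*ready_fn)(void *ctx);
/* Hypothetical callback: waits for the given number of milliseconds. */
typedef void (*sleep_fn)(void *ctx, uint32_t ms);

/*
 * Poll ready() every interval_ms until it returns true or at least
 * timeout_ms has been spent waiting.
 * Returns 0 once ready, -1 on timeout.
 */
static int wait_until_ready(ready_fn ready, sleep_fn sleep_ms, void *ctx,
                            uint32_t timeout_ms, uint32_t interval_ms)
{
    uint32_t waited = 0;

    /* A zero interval would never make progress toward the timeout. */
    assert(interval_ms > 0);

    for (;;) {
        /* Check first, so an already-ready device costs no delay. */
        if (ready(ctx))
            return 0;
        if (waited >= timeout_ms)
            return -1;
        sleep_ms(ctx, interval_ms);
        waited += interval_ms;
    }
}

/* Simulated device: becomes ready after ready_at_ms of simulated time. */
struct fake_dev {
    uint32_t now_ms;
    uint32_t ready_at_ms;
};

static bool fake_ready(void *ctx)
{
    struct fake_dev *dev = ctx;

    return dev->now_ms >= dev->ready_at_ms;
}

/* Advances the simulated clock instead of actually sleeping. */
static void fake_sleep(void *ctx, uint32_t ms)
{
    struct fake_dev *dev = ctx;

    dev->now_ms += ms;
}
```

Calling wait_until_ready(fake_ready, fake_sleep, &dev, 30000, 100) on a
simulated device returns as soon as the device is ready, so the full 30s is
only spent when the device never comes up. Whether such a wait belongs in
probe() or in open_device() is the open question in the thread, and the
sketch does not settle it.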

