Message-ID: <20241203163045.3e068562.alex.williamson@redhat.com>
Date: Tue, 3 Dec 2024 16:30:45 -0700
From: Alex Williamson <alex.williamson@...hat.com>
To: Mitchell Augustin <mitchell.augustin@...onical.com>
Cc: linux-pci@...r.kernel.org, kvm@...r.kernel.org, Bjorn Helgaas
 <bhelgaas@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: drivers/pci: (and/or KVM): Slow PCI initialization during VM
 boot with passthrough of large BAR Nvidia GPUs on DGX H100

On Tue, 3 Dec 2024 17:09:07 -0600
Mitchell Augustin <mitchell.augustin@...onical.com> wrote:

> Thanks for the suggestions!
> 
> > The calling convention of __pci_read_base() is already changing if we're having the caller disable decoding  
> 
> The way I implemented that in my initial patch draft[0] still allows
> for __pci_read_base() to be called independently, as it was
> originally, since (as far as I understand) the decode disable/enable
> is just a mask - so I didn't need to remove the disable/enable inside
> __pci_read_base(), and instead just added an extra one in
> pci_read_bases(), turning the __pci_read_base() disable/enable into a
> no-op when called from pci_read_bases(). In any case...
> 
> > I think maybe another alternative that doesn't hold off the console would be to split the BAR sizing and resource processing into separate steps.  
> 
> This seems like a potentially better option, so I'll dig into that approach.
> 
> 
> Providing some additional info you requested last week, just for more context:
> 
> > Do you have similar logs from that [hotplug] operation  
> 
> Attached [1] is the guest boot output (boot was quick, since no GPUs
> were attached at boot time)

I think what's happening here is that decode is already disabled on the
hot-added device (vs enabled by the VM firmware on cold-plug), so in
practice it's similar to your nested disable solution.  Thanks,

Alex
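
To illustrate the nested disable idea discussed above, here is a minimal
user-space sketch in C. It is not kernel code: struct fake_dev,
write_command(), size_one_bar() and size_all_bars() are hypothetical
stand-ins for the device, the config write, __pci_read_base() and
pci_read_bases(). It only models the command register's decode bits and
counts how often they are actually toggled, on the assumption that each
toggle is what is expensive in the guest.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_COMMAND_IO      0x1  /* I/O space decode enable */
#define PCI_COMMAND_MEMORY  0x2  /* memory space decode enable */
#define PCI_COMMAND_DECODE_ENABLE (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)

/* Hypothetical model of a device: just its command register. */
struct fake_dev {
	uint16_t command;
	unsigned int cmd_writes;  /* writes that actually changed the register */
};

/* Stand-in for a config write; counts only real state changes. */
static void write_command(struct fake_dev *dev, uint16_t val)
{
	if (val != dev->command) {
		dev->command = val;
		dev->cmd_writes++;
	}
}

/* Models one BAR being sized, with its own disable/enable pair. */
static void size_one_bar(struct fake_dev *dev)
{
	uint16_t orig_cmd = dev->command;

	if (orig_cmd & PCI_COMMAND_DECODE_ENABLE)
		write_command(dev, orig_cmd & ~PCI_COMMAND_DECODE_ENABLE);

	/* ... BAR sizing would happen here ... */

	/* Restore only if decode was enabled on entry. */
	if (orig_cmd & PCI_COMMAND_DECODE_ENABLE)
		write_command(dev, orig_cmd);
}

/*
 * Models sizing all BARs. With outer_disable set, decode is cleared once
 * around the whole loop, so the inner pairs find it already off and skip
 * their writes. Returns the number of command register changes.
 */
static unsigned int size_all_bars(struct fake_dev *dev, int nbars,
				  int outer_disable)
{
	uint16_t orig_cmd = dev->command;
	unsigned int before = dev->cmd_writes;
	int i;

	if (outer_disable && (orig_cmd & PCI_COMMAND_DECODE_ENABLE))
		write_command(dev, orig_cmd & ~PCI_COMMAND_DECODE_ENABLE);

	for (i = 0; i < nbars; i++)
		size_one_bar(dev);

	if (outer_disable && (orig_cmd & PCI_COMMAND_DECODE_ENABLE))
		write_command(dev, orig_cmd);

	/* The caller must see the command register unchanged. */
	assert(dev->command == orig_cmd);
	return dev->cmd_writes - before;
}
```

In this model, a device with decode enabled (the cold-plug case) and six
BARs sees twelve decode toggles without the outer pair and two with it.
A device whose decode bits are already clear (the hot-add case) sees
none either way, which matches the observation that hot-add behaves
like the nested disable approach.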

