Message-ID: <20241127102243.57cddb78.alex.williamson@redhat.com>
Date: Wed, 27 Nov 2024 10:22:43 -0700
From: Alex Williamson <alex.williamson@...hat.com>
To: Mitchell Augustin <mitchell.augustin@...onical.com>
Cc: linux-pci@...r.kernel.org, kvm@...r.kernel.org, Bjorn Helgaas
<bhelgaas@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: drivers/pci: (and/or KVM): Slow PCI initialization during VM
boot with passthrough of large BAR Nvidia GPUs on DGX H100
On Tue, 26 Nov 2024 19:12:35 -0600
Mitchell Augustin <mitchell.augustin@...onical.com> wrote:
> Thanks for the breakdown!
>
> > That alone calls __pci_read_base() three separate times, each time
> > disabling and re-enabling decode on the bridge. [...] So we're
> > really being bitten that we toggle decode-enable/memory enable
> > around reading each BAR size
>
> That makes sense to me. Is this something that could theoretically be
> done in a less redundant way, or is there some functional limitation
> that would prevent that or make it inadvisable? (I'm still new to PCI
> subsystem debugging, so apologies if that's a bit vague.)
The only requirement is that decode be disabled while sizing BARs; the
fact that we repeat it around each BAR is, I think, just the way the
code is structured. It doesn't take into account that toggling the
command register bits is not a trivial operation in a virtualized
environment. IMO we should push the command register manipulation up a
layer so that we only toggle it once per device rather than once per
BAR. Thanks,
Alex
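
For illustration, here is a minimal sketch of the restructuring
described above, modeled on pci_read_bases() and __pci_read_base() in
drivers/pci/probe.c. This is not the actual upstream code or a proposed
patch; the exact function signatures, the mmio_always_on guard, and the
loop body are assumptions drawn from the kernel source, not from this
thread. The idea is simply that decode is disabled once before the
BAR-sizing loop and restored once after it, rather than around each
individual __pci_read_base() call:

	/*
	 * Sketch only: hoist the decode disable/restore out of
	 * __pci_read_base() (which is assumed here to no longer touch
	 * PCI_COMMAND itself) into its caller, so the command register
	 * is written at most twice per device instead of twice per BAR.
	 */
	static void pci_read_bases(struct pci_dev *dev,
				   unsigned int howmany, int rom)
	{
		u16 orig_cmd;
		unsigned int pos, reg;

		if (dev->non_compliant_bars)
			return;

		/* Disable I/O and memory decode once, before sizing. */
		if (!dev->mmio_always_on) {
			pci_read_config_word(dev, PCI_COMMAND, &orig_cmd);
			if (orig_cmd & (PCI_COMMAND_MEMORY | PCI_COMMAND_IO))
				pci_write_config_word(dev, PCI_COMMAND,
					orig_cmd & ~(PCI_COMMAND_MEMORY |
						     PCI_COMMAND_IO));
		}

		for (pos = 0; pos < howmany; pos++) {
			struct resource *res = &dev->resource[pos];

			reg = PCI_BASE_ADDRESS_0 + (pos << 2);
			/* A 64-bit BAR consumes one extra slot. */
			pos += __pci_read_base(dev, pci_bar_unknown,
					       res, reg);
		}

		if (rom) {
			struct resource *res =
				&dev->resource[PCI_ROM_RESOURCE];

			dev->rom_base_reg = rom;
			res->flags = IORESOURCE_MEM | IORESOURCE_PREFETCH |
				     IORESOURCE_READONLY |
				     IORESOURCE_SIZEALIGN;
			__pci_read_base(dev, pci_bar_mem32, res, rom);
		}

		/* Restore the original command register state once. */
		if (!dev->mmio_always_on)
			pci_write_config_word(dev, PCI_COMMAND, orig_cmd);
	}

With this shape, sizing a device with six BARs plus an expansion ROM
costs two command register writes instead of up to fourteen, which is
what matters when every config space write traps out to the VMM.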