lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHTA-uaGZkQ6rEMcRq6JiZn8v9nZPn80NyucuSTEXuPfy+0ccw@mail.gmail.com>
Date: Mon, 2 Dec 2024 13:36:25 -0600
From: Mitchell Augustin <mitchell.augustin@...onical.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: linux-pci@...r.kernel.org, kvm@...r.kernel.org, 
	Bjorn Helgaas <bhelgaas@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: drivers/pci: (and/or KVM): Slow PCI initialization during VM boot
 with passthrough of large BAR Nvidia GPUs on DGX H100

Thanks!

This approach makes sense to me - the only concern I have is that I
see this restriction in a comment in __pci_read_base():

`/* No printks while decoding is disabled! */`

At the end of __pci_read_base(), we do have several pci_info() and
pci_err() calls - so I think we would need to also print that info one
level up after the new decode enable if we do decide to move decode
disable/enable one level up. Let me know if you agree, or if there is
a more straightforward alternative that I am missing.

- Mitchell Augustin


On Wed, Nov 27, 2024 at 11:22 AM Alex Williamson
<alex.williamson@...hat.com> wrote:
>
> On Tue, 26 Nov 2024 19:12:35 -0600
> Mitchell Augustin <mitchell.augustin@...onical.com> wrote:
>
> > Thanks for the breakdown!
> >
> > > That alone calls __pci_read_base() three separate times, each time
> > > disabling and re-enabling decode on the bridge. [...] So we're
> > > really being bitten that we toggle decode-enable/memory enable
> > > around reading each BAR size
> >
> > That makes sense to me. Is this something that could theoretically be
> > done in a less redundant way, or is there some functional limitation
> > that would prevent that or make it inadvisable? (I'm still new to pci
> > subsystem debugging, so apologies if that's a bit vague.)
>
> The only requirement is that decode should be disabled while sizing
> BARs, the fact that we repeat it around each BAR is, I think, just the
> way the code is structured.  It doesn't take into account that toggling
> the command register bit is not a trivial operation in a virtualized
> environment.  IMO we should push the command register manipulation up a
> layer so that we only toggle it once per device rather than once per
> BAR.  Thanks,
>
> Alex
>


-- 
Mitchell Augustin
Software Engineer - Ubuntu Partner Engineering

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ