Message-ID: <a4e2fdae-0db3-46de-b95d-bf6ef7b61b33@amd.com>
Date: Wed, 17 Jul 2024 14:30:57 -0400
From: Stewart Hildebrand <stewart.hildebrand@....com>
To: David Laight <David.Laight@...LAB.COM>, Bjorn Helgaas
<bhelgaas@...gle.com>, Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar
<mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
<dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>, Michael
Ellerman <mpe@...erman.id.au>, Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>, "Naveen N. Rao"
<naveen.n.rao@...ux.ibm.com>, Thomas Zimmermann <tzimmermann@...e.de>, Arnd
Bergmann <arnd@...db.de>, Sam Ravnborg <sam@...nborg.org>, Yongji Xie
<elohimes@...il.com>, Ilpo Järvinen
<ilpo.jarvinen@...ux.intel.com>
CC: "x86@...nel.org" <x86@...nel.org>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linuxppc-dev@...ts.ozlabs.org"
<linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [PATCH v2 0/8] PCI: Align small (<4k) BARs
On 7/17/24 09:15, David Laight wrote:
> From: Stewart Hildebrand
>> Sent: 16 July 2024 20:33
>>
>> This series sets the default minimum resource alignment to 4k for memory
>> BARs. In preparation, it makes an optimization and addresses some corner
>> cases observed when reallocating BARs. I consider the preparatory
>> patches to be prerequisites to changing the default BAR alignment.
>
> Should the BARs be page aligned on systems with large pages?
> At least as an option for hypervisor pass-through and any that can be mmap()ed
> into userspace.
This is sort of possible already with the pci=resource_alignment=
option, but you'd need to spell out every device and set the alignment
value manually, and you'd still end up with fake BAR sizes. I had
actually prepared a patch locally to make this less painful and to
preserve the BAR size (introducing a "pci=resource_alignment=all"
option), but I'd like Bjorn's opinion before sending it, since there
has been some previous reluctance to changing the
pci=resource_alignment= option [2].
[2] https://lore.kernel.org/linux-pci/20160929115422.GA31048@localhost/
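
For the record, a rough sketch of the "all" idea (the names and
placement here are mine and hypothetical, not necessarily what the
local patch does): treat a literal "all" value as a request to apply
the default minimum alignment to every device, before falling back to
the existing per-device matching:

/*
 * Hypothetical sketch only. "p" would be the value parsed from
 * pci=resource_alignment=.
 */
static resource_size_t pci_resource_alignment_all(const char *p)
{
	if (p && !strcmp(p, "all"))
		return max_t(resource_size_t, SZ_4K, PAGE_SIZE);

	return 0;	/* no global request; use the per-device list */
}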
Anyway, 4k is more defensible because that is what the PCIe 6.1 spec
calls out, and that is better than the current situation of no default
minimum alignment.
I feel PAGE_SIZE is also justified, and that is why the actual patch
now says max(SZ_4K, PAGE_SIZE), as you pointed out elsewhere. This is a
change from v1, which simply had 4k (sorry, I forgot to update the
cover letter). PowerNV has been using PAGE_SIZE since commits
0a701aa63784 and 382746376993, I believe with a 64k page size. I don't
have a strong opinion on whether the common default should be
max(SZ_4K, PAGE_SIZE) or simply SZ_4K.
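
For context, PowerNV already achieves this via the
pcibios_default_alignment() hook; one place a common default could
live (a sketch of the idea, not a quote from the actual patch) is the
generic weak implementation in drivers/pci/pci.c, which currently
returns 0:

resource_size_t __weak pcibios_default_alignment(void)
{
	/* Sketch: default minimum alignment for memory BARs.  The
	 * existing PowerNV override (returning PAGE_SIZE) would be
	 * unaffected. */
	return max_t(resource_size_t, SZ_4K, PAGE_SIZE);
}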
> Does any hardware actually have 'silly numbers' of small memory BARs?
>
> I have a vague memory of some ethernet controllers having lots of (?)
> virtual devices that might have separate registers that can be mapped
> out to a hypervisor.
> Expanding those to a large page might be problematic - but needed for security.
This series does not change alignment of SRIOV / VF BARs. See commits
62d9a78f32d9, ea629d873f3e, and PCIe 6.1 spec section 9.2.1.1.1.
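
For reference, VF BAR alignment is already handled separately: it is
derived from the per-VF BAR size reported in the SR-IOV capability,
roughly along these lines (simplified from drivers/pci/iov.c, from
memory):

resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno)
{
	/* VF BARs are sized and aligned from the SR-IOV capability, so
	 * the default minimum alignment in this series does not apply. */
	return pci_iov_resource_size(dev, resno);
}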
> For more normal hardware just ensuring that two separate targets don't share
> a page while allowing (eg) two 1k BARs to reside in the same 64k page would
> give some security.
Let me make sure I understand this correctly, with an example:
PCI Device A
BAR 1 (1k)
BAR 2 (1k)
PCI Device B
BAR 1 (1k)
BAR 2 (1k)
We align all BARs to 4k. Additionally, are you saying it would be OK to
let both of device A's BARs reside in the same 64k page, while device
B's BARs would need to reside in a separate 64k page? I.e. two levels
of alignment: PAGE_SIZE on a per-device basis, and 4k on a per-BAR
basis?
If I understand you correctly, there's currently no logic in the PCI
subsystem to easily support this, so that is a rather large ask. I'm
also not sure that it's necessary.
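
If it ever did become necessary, the closest cheap approximation I can
think of with a per-BAR hook would be something like this (purely
illustrative, every name here is hypothetical, and it only works if a
device's BARs end up assigned contiguously, which the PCI core does
not guarantee):

/* Hypothetical: page-align a device's first BAR and 4k-align the rest,
 * approximating "devices never share a page, BARs of one device may". */
static resource_size_t hypothetical_two_level_alignment(struct pci_dev *dev,
							 int resno)
{
	if (resno == 0)
		return max_t(resource_size_t, SZ_4K, PAGE_SIZE);

	return SZ_4K;
}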
> Aligning a small MSIX BAR is unlikely to have any effect on the address
> space utilisation (for PCIe) since the bridge will assign a power of two
> sized block - with a big pad (useful for generating pcie errors!)
>
> David