[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250401223503.GA1686962@bhelgaas>
Date: Tue, 1 Apr 2025 17:35:03 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Heiner Kallweit <hkallweit1@...il.com>
Cc: Pavan Chebbi <pavan.chebbi@...adcom.com>,
Michael Chan <mchan@...adcom.com>,
Potnuri Bharat Teja <bharat@...lsio.com>, netdev@...r.kernel.org,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
Leon Romanovsky <leon@...nel.org>
Subject: Re: PCI VPD checksum ambiguity
On Tue, Apr 01, 2025 at 11:51:07PM +0200, Heiner Kallweit wrote:
> On 01.04.2025 22:55, Bjorn Helgaas wrote:
> > Hi,
> >
> > The PCIe spec is ambiguous about how the VPD checksum should be
> > computed, and resolving this ambiguity might break drivers.
> >
> > PCIe r6.0 sec 6.27 says only the VPD-R list should be included in the
> > checksum:
> >
> > One VPD-R (10h) tag is used as a header for the read-only keywords.
> > The VPD-R list (including tag and length) must checksum to zero.
>
> This requirement I don't find in the PCI 3.0 spec, not sure with which
> version it was added.
I think it's there; that same text is in PCI r3.0, Appendix I, about
five lines before I.1.
> Interestingly this doesn't even require the presence of a RV
> keyword. Or the presence of the RV keyword is implicitly assumed.
I.3.1.1 says RV is required, and I guess it has to be last in VPD-R to
cover any reserved space (as needed, I suppose to align to the VPD-W
area, which might be in a different chip).
> Maybe this part isn't meant literally. I can imagine they wanted to
> clarify that checksum calculation excludes the VPD-W area.
> And unfortunately they weren't precise enough, and introduced the
> ambiguity you found.
>
> > But sec 6.27.2.2 says "all bytes in VPD ... up to the checksum byte":
> >
> > RV The first byte of this item is a checksum byte. The checksum is
> > correct if the sum of all bytes in VPD (from VPD address 0 up
> > to and including this byte) is zero.
>
> This one can be found identically in the PCI v3.0 spec already:
>
> The checksum is correct if the sum of all bytes in VPD (from
> VPD address 0 up to and including this byte) is zero.
>
> I don't think they want to break backwards-compatibility, therefore
> this requirement should still be valid.
Yes, and I think the RV description is more specific and is what I
would have used to implement it.
> > These are obviously different unless VPD-R happens to be the first
> > item in VPD. But sec 6.27 and 6.27.2.1 suggest that the Identifier
> > String item should be the first item, preceding the VPD-R list:
> >
> > The first VPD tag is the Identifier String (02h) and provides the
> > product name of the device. [6.27]
> >
> > Large resource type Identifier String (02h)
> >
> > This tag is the first item in the VPD storage component. It
> > contains the name of the add-in card in alphanumeric characters.
> > [6.27.2.1, Table 6-23]
> >
> > I think pci_vpd_check_csum() follows sec 6.27.2.2: it sums all the
> > bytes in the buffer up to and including the checksum byte of the RV
> > keyword. The range starts at 0, not at the beginning of the VPD-R
> > read-only list, so it likely includes the Identifier String.
> >
> > As far as I can tell, only the broadcom/tg3 and chelsio/cxgb4/t4
> > drivers use pci_vpd_check_csum(). Of course, other drivers might
> > compute the checksum themselves.
> >
> > Any thoughts on how this spec ambiguity should be resolved?
> >
> > Any idea how devices in the field populate their VPD?
> >
> > Can you share any VPD dumps from devices that include an RV keyword
> > item?
>
> I have only very dated devices which likely date back to before
> the existence of PCIe r6.0. So their VPD dump may not really help.
>
> IIRC there's an ongoing discussion regarding making VPD content
> user-readable on mlx5 devices. Maybe check with the Mellanox/Nvidia
> guys how they interpret the spec and implemented VPD checksumming.
Good idea, cc'd.
Powered by blists - more mailing lists