lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 20 Dec 2021 11:28:48 -0600
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     "David E. Box" <david.e.box@...ux.intel.com>
Cc:     nirmal.patel@...ux.intel.com, jonathan.derrick@...ux.dev,
        lorenzo.pieralisi@....com, hch@...radead.org, kw@...ux.com,
        robh@...nel.org, bhelgaas@...gle.com,
        michael.a.bottini@...ux.intel.com, rafael@...nel.org,
        me@...ityamohan.in, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org, Rajat Jain <rajatja@...gle.com>
Subject: Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices

On Thu, Dec 16, 2021 at 01:24:00PM -0800, David E. Box wrote:
> On Thu, 2021-12-16 at 11:26 -0600, Bjorn Helgaas wrote:
> > On Wed, Dec 15, 2021 at 09:56:00PM -0800, David E. Box wrote:
> > > From: Michael Bottini <michael.a.bottini@...ux.intel.com>
> > > 
> > > On Tiger Lake and Alder Lake platforms, VMD controllers do not have ASPM
> > > enabled nor LTR values set by BIOS. This leads high power consumption on
> > > these platforms when VMD is enabled as reported in bugzilla [1].  Enable
> > > these features in the VMD driver using pcie_aspm_policy_override() to set
> > > the ASPM policy for the root ports.
> > > ...

> > > To do this, add an additional flag in VMD features to specify
> > > devices that must have their respective policies overridden.
> > 
> > I'm not clear on why you want this to apply to only certain VMDs
> > and not others.  Do some BIOSes configure ASPM for devices below
> > some VMDs?
> 
> Not currently. But the plan is for future devices to move back to
> having BIOS do the programming.

Since this is apparently a BIOS design choice, it seems wrong to base
the functionality on the Device ID instead of some signal that tells
us what the BIOS is doing.

> > > + * Override the BIOS ASPM policy and set the LTR value for PCI storage
> > > + * devices on the VMD bride.
> > 
> > I don't think there's any BIOS "policy" here.  At this point BIOS
> > is no longer involved at all, so all that's left is whatever ASPM
> > config the BIOS did or did not do.
> > 
> > Why only storage?
> 
> Only storage devices will be on these root ports.

How do you know this?  You say below that there's an M.2 slot, so
surely the slot could contain a non-storage device?  Couldn't somebody
build a platform with a VMD root port connected to a regular PCIe x4
slot?  Couldn't such a slot support hotplug?

It would be very unusual to hard-code topology knowledge like this
into the kernel, since plug-and-play has always been a major goal of
PCI.

> > > +static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
> > > +{
> > > +       int features = *(int *)userdata, pos;
> > > +
> > > +       if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
> > > +           pdev->class != PCI_CLASS_STORAGE_EXPRESS)
> > > +               return 0;

> > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, 0x1003);
> > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT, 0x1003);
> > 
> > 1) Where did this magic 0x1003 value come from?  Does that depend
> > on the VMD device?  The endpoint?  The circuit design?  The path
> > between endpoint and VMD?  What if there are switches in the path?
> 
> The number comes from the BIOS team. They are tied to the SoC. I
> don't believe there can be switches in the path but Nirmal and
> Jonathan should know for sure. From what I've seen these root ports
> are wired directly to M.2 slots on boards that are intended for
> storage devices.

I guess you're saying that 0x1003 is determined by the SoC.  If so, I
think this value should be in your .driver_data (which would mean
converting it from a scalar to a struct, as many other drivers do).
The current code suggests that 0x1003 is the correct value for *all*
VMDs and all configurations.

I don't understand LTR well enough to be convinced that this static
value would be the correct value for all possible hierarchies and
devices that could appear below a VMD root port.  I would love to be
educated about this.

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ