[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20231026220216.GA1752508@bhelgaas>
Date: Thu, 26 Oct 2023 17:02:16 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Cc: "Rafael J . Wysocki" <rafael@...nel.org>,
linux-pci@...r.kernel.org,
Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
Rob Herring <robh@...nel.org>,
Krzysztof Wilczyński <kw@...ux.com>,
Lukas Wunner <lukas@...ner.de>,
Heiner Kallweit <hkallweit1@...il.com>,
Emmanuel Grumbach <emmanuel.grumbach@...el.com>,
LKML <linux-kernel@...r.kernel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
ath10k@...ts.infradead.org, ath11k@...ts.infradead.org,
ath12k@...ts.infradead.org, intel-wired-lan@...ts.osuosl.org,
linux-arm-kernel@...ts.infradead.org,
linux-bluetooth@...r.kernel.org,
linux-mediatek@...ts.infradead.org, linux-rdma@...r.kernel.org,
linux-wireless@...r.kernel.org, Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH v2 03/13] PCI/ASPM: Disable ASPM when driver requests it
On Mon, Oct 16, 2023 at 05:27:37PM +0300, Ilpo Järvinen wrote:
> On Fri, 13 Oct 2023, Bjorn Helgaas wrote:
> > On Thu, Oct 12, 2023 at 01:56:16PM +0300, Ilpo Järvinen wrote:
> > > On Wed, 11 Oct 2023, Bjorn Helgaas wrote:
> > > > On Mon, Sep 18, 2023 at 04:10:53PM +0300, Ilpo Järvinen wrote:
> > > > > PCI core/ASPM service driver allows controlling ASPM state
> > > > > through pci_disable_link_state() and pci_enable_link_state()
> > > > > API. It was decided earlier (see the Link below), to not
> > > > > allow ASPM changes when OS does not have control over it but
> > > > > only log a warning about the problem (commit 2add0ec14c25
> > > > > ("PCI/ASPM: Warn when driver asks to disable ASPM, but we
> > > > > can't do it")). Similarly, if ASPM is not enabled through
> > > > > config, ASPM cannot be disabled.
> > > ...
> >
> > > > This disables *all* ASPM states, unlike the version when
> > > > CONFIG_PCIEASPM is enabled. I suppose there's a reason, and
> > > > maybe a comment could elaborate on it?
> > > >
> > > > When CONFIG_PCIEASPM is not enabled, I don't think we actively
> > > > *disable* ASPM in the hardware; we just leave it as-is, so
> > > > firmware might have left it enabled.
> > >
> > > This whole trickery is intended for drivers that do not want to
> > > have ASPM because the devices are broken with it. So leaving it
> > > as-is is not really an option (as demonstrated by the custom
> > > workarounds).
> >
> > Right.
> >
> > > > Conceptually it seems like the LNKCTL updates here should be
> > > > the same whether CONFIG_PCIEASPM is enabled or not (subject to
> > > > the question above).
> > > >
> > > > When CONFIG_PCIEASPM is enabled, we might need to do more
> > > > stuff, but it seems like the core should be the same.
> > >
> > > So you think it's safer to partially disable ASPM (as per
> > > driver's request) rather than disable it completely? I got the
> > > impression that the latter might be safer from what Rafael said
> > > earlier but I suppose I might have misinterpreted him since he
> > > didn't exactly say that it might be safer to _completely_
> > > disable it.
> >
> > My question is whether the state of the device should depend on
> > CONFIG_PCIEASPM. If the driver does this:
> >
> > pci_disable_link_state(PCIE_LINK_STATE_L0S)
> >
> > do we want to leave L1 enabled when CONFIG_PCIEASPM=y but disable L1
> > when CONFIG_PCIEASPM is unset?
> >
> > I can see arguments both ways. My thought was that it would be nice
> > to end up with a single implementation of pci_disable_link_state()
> > with an #ifdef around the CONFIG_PCIEASPM-enabled stuff because it
> > makes the code easier to read.
Responding to myself here, I think we should do the partial disables
because it matches what the drivers did previously by hand, we can
reduce the number of code paths, and the resulting device state will
be the same regardless of CONFIG_PCIEASPM.
> I think there's still one important thing to discuss and none of the
> comments have covered that area so far.
>
> The drivers that have workaround are not going to turn more
> dangerous than they're already without this change, so we're mostly
> within charted waters there even with what you propose. However, I
> think the bigger catch and potential source of problems, with both
> this v2 and your alternative, are the drivers that do not have the
> workarounds around CONFIG_PCIEASPM=n and/or _OSC permissions. Those
> code paths just call pci_disable_link_state() and do nothing else.
>
> Do you think it's okay to alter the behavior for those drivers too
> (disable ASPM where it previously was a no-op)?
Yes. I assume the reason those drivers call pci_disable_link_state()
is because some hardware defect means ASPM doesn't work correctly.
This change means pci_disable_link_state() will disable ASPM even when
the OS doesn't own ASPM or CONFIG_PCIEASPM is unset. I think those
cases are unusual and probably not well tested, and I suspect that if
we *did* test them, we'd find that ASPM doesn't work with the current
kernel.
So I think this is more likely to *fix* something than to break it.
Bjorn
Powered by blists - more mailing lists