Message-ID: <e639b361-785e-d39b-3c3f-957bcdc54fcd@linux.intel.com>
Date: Thu, 24 Apr 2025 15:37:38 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Lukas Wunner <lukas@...ner.de>
cc: Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org,
LKML <linux-kernel@...r.kernel.org>,
"Maciej W. Rozycki" <macro@...am.me.uk>
Subject: Re: [PATCH v2 1/1] PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
On Thu, 24 Apr 2025, Lukas Wunner wrote:
> On Wed, Apr 23, 2025 at 02:37:11PM +0300, Ilpo Järvinen wrote:
> > On Wed, 23 Apr 2025, Lukas Wunner wrote:
> > > On Tue, Apr 22, 2025 at 02:55:47PM +0300, Ilpo Järvinen wrote:
> > > > +void pcie_reset_lbms(struct pci_dev *port)
> > > > {
> > > > - struct pcie_bwctrl_data *data;
> > > > -
> > > > - guard(rwsem_read)(&pcie_bwctrl_lbms_rwsem);
> > > > - data = port->link_bwctrl;
> > > > - if (data)
> > > > - atomic_set(&data->lbms_count, 0);
> > > > - else
> > > > - pcie_capability_write_word(port, PCI_EXP_LNKSTA,
> > > > - PCI_EXP_LNKSTA_LBMS);
> > > > + clear_bit(PCI_LINK_LBMS_SEEN, &port->priv_flags);
> > > > + pcie_capability_write_word(port, PCI_EXP_LNKSTA, PCI_EXP_LNKSTA_LBMS);
> > > > }
> > >
> > > Hm, previously the LBMS bit was only cleared in the Link Status register
> > > if the bandwidth controller hadn't probed yet. Now it's cleared
> > > unconditionally. I'm wondering if this changes the logic somehow?
> >
> > Hmm, that's a good question and I hadn't thought through all the
> > implications. I suppose leaving if (!port->link_bwctrl) there would
> > retain the existing behavior better, allowing bwctrl to pick up the
> > link speed changes more reliably.
>
> I think the only potential issue with clearing the LBMS bit in the register
> is that the bandwidth controller's irq handler won't see the bit and may
> return with IRQ_NONE.
>
> However, looking at the callers of pcie_reset_lbms(), that doesn't seem
> to be a real issue. There are only two of them:
>
> - pcie_retrain_link() calls the function after the link was retrained.
> I guess the LBMS bit in the register may be set as a side-effect of
> the link retraining?
Retraining does set LBMS; whether the speed was the same before doesn't
matter. I think it's because LTSSM-wise, retraining transitions through
Recovery.
(I don't know why, but in most tests I've done LBMS is actually asserted
not just once but twice for a single Link Retraining event.)
> The only concern here is whether the cached
> link speed is updated. pcie_bwctrl_change_speed() does call
> pcie_update_link_speed() after calling pcie_retrain_link(), so that
> looks fine. But there's a second caller of pcie_retrain_link():
> pcie_aspm_configure_common_clock(). It doesn't update the cached
> link speed after calling pcie_retrain_link(). Not sure if this can
> lead to a change in link speed and therefore the cached link speed
> should be updated? The Target Link Speed isn't changed, but maybe
> the link fails to retrain to the same speed for electrical reasons?
I've never seen that happen, but it would seem odd if it were forbidden
(as the alternative is probably that the link remains down).
Perhaps pcie_reset_lbms() should just call pcie_update_link_speed() as the
last step, then the irq handler returning IRQ_NONE doesn't matter.
> - pciehp's remove_board() calls the function after bringing down the slot
> to avoid a stale PCI_LINK_LBMS_SEEN flag. No real harm in clearing the
> bit in the register at this point I guess. But I do wonder, is the link
> speed updated somewhere when a new board is added? The replacement
> device may not support the same speeds as the previous device.
The supported speeds are always recalculated using dev->supported_speeds.
A new board implies a new pci_dev structure with newly read supported
speeds. Also, bringing the link up with the replacement device will
trigger LBMS, so the new Link Speed should be picked up that way.
Racing the LBMS reset from remove_board() with LBMS due to the replacement
board shouldn't result in a stale Link Speed because of:
board_added()
pciehp_check_link_status()
__pcie_update_link_speed()
> > Given this flag is only for the purposes of the quirk, it seems very much
> > out of proportion.
>
> Yes, let's try to minimize the amount of locking, flags and code to support
> the quirk. Keep it as simple as possible. So in that sense, the solution
> you've chosen is probably fine.
>
>
> > > > static bool pcie_lbms_seen(struct pci_dev *dev, u16 lnksta)
> > > > {
> > > > - unsigned long count;
> > > > - int ret;
> > > > -
> > > > - ret = pcie_lbms_count(dev, &count);
> > > > - if (ret < 0)
> > > > - return lnksta & PCI_EXP_LNKSTA_LBMS;
> > > > + if (test_bit(PCI_LINK_LBMS_SEEN, &dev->priv_flags))
> > > > + return true;
> > > >
> > > > - return count > 0;
> > > > + return lnksta & PCI_EXP_LNKSTA_LBMS;
> > > > }
> > >
> > > Another small logic change here: Previously pcie_lbms_count()
> > > returned a negative value if the bandwidth controller hadn't
> > > probed yet or wasn't compiled into the kernel.
> > > Only in those two cases was the LBMS flag in the lnksta variable
> > > returned.
> > >
> > > Now the LBMS flag is also returned if the bandwidth controller
> > > is compiled into the kernel and has probed, but its irq handler
> > > hasn't recorded a seen LBMS bit yet.
> > >
> > > I'm guessing this can happen if the quirk races with the irq
> > > handler and wins the race, so this safety net is needed?
> >
> > The main reason this check is here is the boot case, where bwctrl has
> > not yet probed when the quirk runs. But the check seems harmless, or
> > even somewhat useful, when bwctrl has already probed: LBMS being
> > asserted should result in PCI_LINK_LBMS_SEEN even if the irq handler
> > has not yet done its job of transferring it into priv_flags.
>
> Okay I'm convinced that the logic change in pcie_lbms_seen() is fine.
>
> Thanks,
>
> Lukas
>
--
i.