Message-ID: <20221107200637.heoakrpzob4yz7c5@pali>
Date:   Mon, 7 Nov 2022 21:06:37 +0100
From:   Pali Rohár <pali@...nel.org>
To:     Nathan Rossi <nathan@...hanrossi.com>
Cc:     linux-pci@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, Nathan Rossi <nathan.rossi@...i.com>,
        Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
        Lorenzo Pieralisi <lpieralisi@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH] PCI: mvebu: Set Target Link Speed for 2.5GT downstream
 devices

On Monday 07 November 2022 19:10:02 Nathan Rossi wrote:
> On Mon, 7 Nov 2022 at 18:43, Pali Rohár <pali@...nel.org> wrote:
> >
> > On Monday 07 November 2022 08:13:27 Nathan Rossi wrote:
> > > From: Nathan Rossi <nathan.rossi@...i.com>
> > >
> > > There is a known issue with the mvebu PCIe controller when triggering
> > > retraining of the link (via Link Control) where the link is dropped
> > > completely causing significant delay in the renegotiation of the link.
> > > This occurs only when the downstream device is 2.5GT and the upstream
> > > port is configured to support both 2.5GT and 5GT.
> > >
> > > It is possible to prevent this link dropping by setting the associated
> > > link speed in Target Link Speed of the Link Control 2 register. This
> > > only needs to be done when the downstream is specifically 2.5GT.
> > >
> > > This change applies the required Target Link Speed value during
> > > mvebu_pcie_setup_hw conditionally depending on the current link speed
> > > from the Link Status register, only applying the change when the link
> > > is configured to 2.5GT already.
> > >
> > > Signed-off-by: Nathan Rossi <nathan.rossi@...i.com>
> > > ---
> > >  drivers/pci/controller/pci-mvebu.c | 18 +++++++++++++++++-
> > >  1 file changed, 17 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/controller/pci-mvebu.c
> > > index 1ced73726a..6a869a33ba 100644
> > > --- a/drivers/pci/controller/pci-mvebu.c
> > > +++ b/drivers/pci/controller/pci-mvebu.c
> > > @@ -248,7 +248,7 @@ static void mvebu_pcie_setup_wins(struct mvebu_pcie_port *port)
> > >
> > >  static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
> > >  {
> > > -     u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl;
> > > +     u32 ctrl, lnkcap, cmd, dev_rev, unmask, sspl, lnksta, lnkctl2;
> > >
> > >       /* Setup PCIe controller to Root Complex mode. */
> > >       ctrl = mvebu_readl(port, PCIE_CTRL_OFF);
> > > @@ -339,6 +339,22 @@ static void mvebu_pcie_setup_hw(struct mvebu_pcie_port *port)
> > >       unmask |= PCIE_INT_INTX(0) | PCIE_INT_INTX(1) |
> > >                 PCIE_INT_INTX(2) | PCIE_INT_INTX(3);
> > >       mvebu_writel(port, unmask, PCIE_INT_UNMASK_OFF);
> > > +
> > > +     /*
> > > +      * Set Target Link Speed within the Link Control 2 register when the
> > > +      * linked downstream device is connected at 2.5GT. This is configured
> > > +      * in order to avoid issues with the controller when the upstream port
> > > +      * is configured to support 2.5GT and 5GT and the downstream device is
> > > +      * linked at 2.5GT, retraining the link in this case causes the link to
> > > +      * drop taking significant time to retrain.
> > > +      */
> > > +     lnksta = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL) >> 16;
> > > +     if ((lnksta & PCI_EXP_LNKSTA_CLS) == PCI_EXP_LNKSTA_CLS_2_5GB) {
> >
> > This code does not work because at this stage endpoint device does not
> > have to be ready and therefore link is not established yet.
> >
> > Also this code is not running when kernel issue PCIe Hot Reset via
> > PCI Secondary Bus Reset bit.
> >
> > And it does not handle possible hot-plug situation.
> >
> > That check that code below has to be done _after_ kernel enumerate
> > device. PCI core code has already logic to handle delays for "slow"
> > devices.
> >
> > And reverse operation (setting lnkctl2 target speed to original value)
> > has to be called after unplugging device - when link goes down.
> >
> > If you want to work on this stuff, I can try to find my notes which I
> > done during investigation of this issue... where is probably the best
> > place in kernel pci core code for handling this issue.
> 
> Some notes/direction for implementation would be very appreciated. I
> am not particularly familiar with the pci core code, so I don't have a
> good idea on how to best implement this workaround.

OK, I have checked and it seems that I have deleted my notes :-(

So I am trying to reconstruct the information from memory...

The Target Link Speed in the Root Port's lnkctl2 register must be set to
the _correct_ value before ASPM is configured, because otherwise the
link retraining that is part of ASPM configuration fails. ASPM is
initialized by calling pcie_aspm_init_link_state() on the _non-endpoint_
device, and that call happens at the end of pci_scan_slot().

Look also at the tree-traversal functions pci_scan_child_bus_extend()
and pci_scan_bridge_extend() and try to find the best place from which
this "fix" should be called.

Because the same issue that you are trying to fix is also present in the
pci-aardvark.c hardware (Marvell too), I think that you can introduce a
flag in struct pci_host_bridge, set it in pci-mvebu.c (later I can do
the same in pci-aardvark.c), and then in the core pci code (in one of
the above-mentioned functions, once you find the proper place in the
tree traversal) add code which "fixes" the lnkctl2 register.
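
A minimal sketch of that idea, written as a userspace model rather than
real kernel code. The register masks mirror the values in
include/uapi/linux/pci_regs.h; struct host_bridge_model, its
limit_tls_to_link_speed member and fixup_lnkctl2_before_aspm() are
hypothetical names used only for illustration, and the right call site
in the scan functions is still to be found.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKSTA_CLS         0x000f /* Current Link Speed */
#define PCI_EXP_LNKSTA_CLS_2_5GB   0x0001
#define PCI_EXP_LNKCTL2_TLS        0x000f /* Target Link Speed */
#define PCI_EXP_LNKCTL2_TLS_2_5GT  0x0001

/* Hypothetical stand-in for a new flag in struct pci_host_bridge. */
struct host_bridge_model {
	bool limit_tls_to_link_speed; /* set by pci-mvebu.c, pci-aardvark.c */
};

/*
 * Return the LNKCTL2 value the Root Port should have before ASPM
 * configuration retrains the link. Only a link that trained at
 * 2.5GT/s gets its Target Link Speed pinned to 2.5GT/s.
 */
static uint16_t fixup_lnkctl2_before_aspm(const struct host_bridge_model *bridge,
					  uint16_t lnksta, uint16_t lnkctl2)
{
	if (!bridge->limit_tls_to_link_speed)
		return lnkctl2;

	if ((lnksta & PCI_EXP_LNKSTA_CLS) != PCI_EXP_LNKSTA_CLS_2_5GB)
		return lnkctl2;

	lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
	lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
	return lnkctl2;
}
```

In the kernel the two register values would be read from and written
back to the Root Port's PCIe capability instead of being passed in.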

Because both pci hotplug and static initialization call those pci core
scan functions, this should fix the init-probe part.

The second thing is fixing the unplugging part. Because in a hotplug
setup you can connect a 2.5GT/s GEN1 card (which requires this
workaround), then disconnect it and connect a 5GT/s GEN2 card, the
target link speed needs to be set back to 5GT/s to use the full speed of
the GEN2 card.

For this second part, I think that the target link speed needs to be
changed back to 5GT/s after the card is disconnected. A good candidate
for where to do it is probably the pci_stop_dev() or pci_destroy_dev()
function. Beware that the link speed has to be changed on the device at
the other end of the link, not on the device which is being
removed/unregistered. Also check whether it is the last kernel device
being unregistered from the bus (the endpoint card may be a
multifunction device).
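
The restore step could look roughly like this, again as a userspace
model and not real kernel code. The masks mirror
include/uapi/linux/pci_regs.h; restore_lnkctl2_on_removal() and its
functions_left parameter (the number of devices still registered on the
bus once the current one is gone) are hypothetical, and 5GT/s is assumed
to be the original target speed of the Root Port.

```c
#include <stdint.h>

/* Same values as in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKCTL2_TLS        0x000f /* Target Link Speed */
#define PCI_EXP_LNKCTL2_TLS_5_0GT  0x0002

/*
 * Return the LNKCTL2 value for the Root Port (the device at the other
 * end of the link) after one function of the card was unregistered.
 * The Target Link Speed goes back to 5GT/s only once the last function
 * is gone, so a multifunction card does not trigger the restore early.
 */
static uint16_t restore_lnkctl2_on_removal(unsigned int functions_left,
					   uint16_t lnkctl2)
{
	if (functions_left != 0)
		return lnkctl2;

	lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
	lnkctl2 |= PCI_EXP_LNKCTL2_TLS_5_0GT;
	return lnkctl2;
}
```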

I hope that this information helps you. I'm really sorry that I no
longer have the notes where I documented this issue. Anyway, I will try
to provide more information if needed.

> Thanks,
> Nathan
> 
> >
> > > +             lnkctl2 = mvebu_readl(port, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > > +             lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
> > > +             lnkctl2 |= PCI_EXP_LNKCTL2_TLS_2_5GT;
> > > +             mvebu_writel(port, lnkctl2, PCIE_CAP_PCIEXP + PCI_EXP_LNKCTL2);
> > > +     }
> > >  }
> > >
> > >  static struct mvebu_pcie_port *mvebu_pcie_find_port(struct mvebu_pcie *pcie,
> > > ---
> > > 2.37.2
