lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20210311143027.mwzdgme3omkj3ld5@pali>
Date:   Thu, 11 Mar 2021 15:30:27 +0100
From:   Pali Rohár <pali@...nel.org>
To:     Vidya Sagar <vidyas@...dia.com>
Cc:     Alex Williamson <alex.williamson@...hat.com>,
        Bjorn Helgaas <helgaas@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Marek Behún <kabel@...nel.org>,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        Amey Narkhede <ameynarkhede02@...il.com>
Subject: Re: RFC: sysfs node for Secondary PCI bus reset (PCIe Hot Reset)

On Thursday 11 March 2021 19:53:26 Vidya Sagar wrote:
> On 3/3/2021 11:26 PM, Pali Rohár wrote:
> > External email: Use caution opening links or attachments
> > 
> > 
> > On Tuesday 02 March 2021 12:58:29 Alex Williamson wrote:
> > > On Mon, 1 Mar 2021 14:28:17 -0600
> > > Bjorn Helgaas <helgaas@...nel.org> wrote:
> > > 
> > > > [+cc Alex, reset expert]
> > > > 
> > > > On Mon, Mar 01, 2021 at 06:12:21PM +0100, Pali Rohár wrote:
> > > > > Hello!
> > > > > 
> > > > > PCIe card can be reset via in-band Hot Reset signal which can be
> > > > > triggered by PCIe bridge via Secondary Bus Reset bit in PCI config
> > > > > space.
> > > > > 
> > > > > Kernel already exports sysfs node "reset" for triggering Functional
> > > > > Reset of particular function of PCI device. But in some cases Functional
> > > > > Reset is not enough and Hot Reset is required.
> > > > > 
> > > > > Following RFC patch exports sysfs node "reset_bus" for PCI bridges which
> > > > > triggers Secondary Bus Reset and therefore for PCIe bridges it resets
> > > > > connected PCIe card.
> > > > > 
> > > > > What do you think about it?
> > > > > 
> > > > > Currently there is userspace script which can trigger PCIe Hot Reset by
> > > > > modifying PCI config space from userspace:
> > > > > 
> > > > > https://alexforencich.com/wiki/en/pcie/hot-reset-linux
> > > > > 
> > > > > But because kernel already provides way how to trigger Functional Reset
> > > > > it could provide also way how to trigger PCIe Hot Reset.
> > > 
> > > What that script does and what this does, or what the existing reset
> > > attribute does, are very different.  The script finds the upstream
> > > bridge for a given device, removes the device (ignoring that more than
> > > one device might be affected by the bus reset), uses setpci to trigger
> > > a secondary bus reset, then rescans devices.  The below only triggers
> > > the secondary bus reset, neither saving and restoring affected device
> > > state like the existing function level reset attribute, nor removing
> > > and rescanning as the script does.  It simply leaves an entire
> > > hierarchy of PCI devices entirely un-programmed yet still has struct
> > > pci_devs attached to them for untold future misery.
> > > 
> > > In fact, for the case of a single device affected by the bus reset, as
> > > intended by the script, the existing reset attribute will already do
> > > that if the device supports no other reset mechanism.  There's actually
> > > a running LFX mentorship project that aims to allow the user to control
> > > the type of reset performed by the existing reset attribute such that a
> > > user could force the bus reset behavior over other reset methods.
> > 
> > Hello Alex? Do you have a link for this "reset" project? I'm interesting
> > in it as I'm dealing with Compex wifi cards which are causing problems.
> > 
> > For correct initialization I need to issue PCIe Warm Reset for these
> > cards (Warm Reset is done via PERST# pin which most linux controller
> > drivers controls via GPIO subsystem). And for now there is no way to
> > trigger PCIe Warm Reset for particular PCIe device from userspace. As
> > there is no userspace <--> kernel API for it.
> > 
> > > There might be some justification for an attribute that actually
> > > implements the referenced script correctly, perhaps in kernel we could
> > > avoid races with bus rescans, but simply triggering an SBR to quietly
> > > de-program all downstream devices with no state restore or device
> > > rescan is not it.  Any affected device would be unusable.  Was this
> > > tested?  Thanks,
> > 
> > I have tested my change. First I called 'remove' attribute for PCIe
> > card, then I called this 'bus_reset' on parent PCIe bridge and later I
> > called 'rescan' attribute on bridge. It correctly rested tested ath9k
> > card. So I did something similar as in above script. But I agree that
> > there are race conditions and basically lot of other calls needs to be
> > done to restore state.
> > 
> > So I see that to make it 'usable' we need to do it automatically in
> > kernel and also rescan/restore state of PCIe devices behind bridge after
> > reset...
> But, is save-restore alone going to be enough? I mean what is the state of
> the device-driver going to be when the device is going through the reset
> process? Isn't remove-rescan the correct thing to do here rather than
> save/restore?

I think that remove-reset-rescan (if done atomically with correct
timeouts between them and applying kernel quirks) should be correct
here.

> - Vidya Sagar
> > 
> > > Alex
> > > 
> > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > > > > index 50fcb62d59b5..f5e11c589498 100644
> > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > @@ -1321,6 +1321,30 @@ static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
> > > > > 
> > > > >   static DEVICE_ATTR(reset, 0200, NULL, reset_store);
> > > > > 
> > > > > +static ssize_t reset_bus_store(struct device *dev, struct device_attribute *attr,
> > > > > +                        const char *buf, size_t count)
> > > > > +{
> > > > > + struct pci_dev *pdev = to_pci_dev(dev);
> > > > > + unsigned long val;
> > > > > + ssize_t result = kstrtoul(buf, 0, &val);
> > > > > +
> > > > > + if (result < 0)
> > > > > +         return result;
> > > > > +
> > > > > + if (val != 1)
> > > > > +         return -EINVAL;
> > > > > +
> > > > > + pm_runtime_get_sync(dev);
> > > > > + result = pci_bridge_secondary_bus_reset(pdev);
> > > > > + pm_runtime_put(dev);
> > > > > + if (result < 0)
> > > > > +         return result;
> > > > > +
> > > > > + return count;
> > > > > +}
> > > > > +
> > > > > +static DEVICE_ATTR(reset_bus, 0200, NULL, reset_bus_store);
> > > > > +
> > > > >   static int pci_create_capabilities_sysfs(struct pci_dev *dev)
> > > > >   {
> > > > >    int retval;
> > > > > @@ -1332,8 +1356,15 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
> > > > >            if (retval)
> > > > >                    goto error;
> > > > >    }
> > > > > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
> > > > > +         retval = device_create_file(&dev->dev, &dev_attr_reset_bus);
> > > > > +         if (retval)
> > > > > +                 goto error_reset_bus;
> > > > > + }
> > > > >    return 0;
> > > > > 
> > > > > +error_reset_bus:
> > > > > + device_remove_file(&dev->dev, &dev_attr_reset);
> > > > >   error:
> > > > >    pcie_vpd_remove_sysfs_dev_files(dev);
> > > > >    return retval;
> > > > > @@ -1414,6 +1445,8 @@ static void pci_remove_capabilities_sysfs(struct pci_dev *dev)
> > > > >            device_remove_file(&dev->dev, &dev_attr_reset);
> > > > >            dev->reset_fn = 0;
> > > > >    }
> > > > > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
> > > > > +         device_remove_file(&dev->dev, &dev_attr_reset_bus);
> > > > >   }
> > > > > 
> > > > >   /**
> > > > 
> > > 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ