lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <85786a3a-c7d4-95cc-170f-26f89c2b337e@nvidia.com>
Date:   Thu, 11 Mar 2021 19:53:26 +0530
From:   Vidya Sagar <vidyas@...dia.com>
To:     Pali Rohár <pali@...nel.org>,
        Alex Williamson <alex.williamson@...hat.com>
CC:     Bjorn Helgaas <helgaas@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Marek Behún <kabel@...nel.org>,
        <linux-pci@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        Amey Narkhede <ameynarkhede02@...il.com>
Subject: Re: RFC: sysfs node for Secondary PCI bus reset (PCIe Hot Reset)



On 3/3/2021 11:26 PM, Pali Rohár wrote:
> External email: Use caution opening links or attachments
> 
> 
> On Tuesday 02 March 2021 12:58:29 Alex Williamson wrote:
>> On Mon, 1 Mar 2021 14:28:17 -0600
>> Bjorn Helgaas <helgaas@...nel.org> wrote:
>>
>>> [+cc Alex, reset expert]
>>>
>>> On Mon, Mar 01, 2021 at 06:12:21PM +0100, Pali Rohár wrote:
>>>> Hello!
>>>>
>>>> PCIe card can be reset via in-band Hot Reset signal which can be
>>>> triggered by PCIe bridge via Secondary Bus Reset bit in PCI config
>>>> space.
>>>>
>>>> Kernel already exports sysfs node "reset" for triggering Functional
>>>> Reset of particular function of PCI device. But in some cases Functional
>>>> Reset is not enough and Hot Reset is required.
>>>>
>>>> Following RFC patch exports sysfs node "reset_bus" for PCI bridges which
>>>> triggers Secondary Bus Reset and therefore for PCIe bridges it resets
>>>> connected PCIe card.
>>>>
>>>> What do you think about it?
>>>>
>>>> Currently there is userspace script which can trigger PCIe Hot Reset by
>>>> modifying PCI config space from userspace:
>>>>
>>>> https://alexforencich.com/wiki/en/pcie/hot-reset-linux
>>>>
>>>> But because kernel already provides way how to trigger Functional Reset
>>>> it could provide also way how to trigger PCIe Hot Reset.
>>
>> What that script does and what this does, or what the existing reset
>> attribute does, are very different.  The script finds the upstream
>> bridge for a given device, removes the device (ignoring that more than
>> one device might be affected by the bus reset), uses setpci to trigger
>> a secondary bus reset, then rescans devices.  The below only triggers
>> the secondary bus reset, neither saving and restoring affected device
>> state like the existing function level reset attribute, nor removing
>> and rescanning as the script does.  It simply leaves an entire
>> hierarchy of PCI devices entirely un-programmed yet still has struct
>> pci_devs attached to them for untold future misery.
>>
>> In fact, for the case of a single device affected by the bus reset, as
>> intended by the script, the existing reset attribute will already do
>> that if the device supports no other reset mechanism.  There's actually
>> a running LFX mentorship project that aims to allow the user to control
>> the type of reset performed by the existing reset attribute such that a
>> user could force the bus reset behavior over other reset methods.
> 
> Hello Alex? Do you have a link for this "reset" project? I'm interesting
> in it as I'm dealing with Compex wifi cards which are causing problems.
> 
> For correct initialization I need to issue PCIe Warm Reset for these
> cards (Warm Reset is done via PERST# pin which most linux controller
> drivers controls via GPIO subsystem). And for now there is no way to
> trigger PCIe Warm Reset for particular PCIe device from userspace. As
> there is no userspace <--> kernel API for it.
> 
>> There might be some justification for an attribute that actually
>> implements the referenced script correctly, perhaps in kernel we could
>> avoid races with bus rescans, but simply triggering an SBR to quietly
>> de-program all downstream devices with no state restore or device
>> rescan is not it.  Any affected device would be unusable.  Was this
>> tested?  Thanks,
> 
> I have tested my change. First I called 'remove' attribute for PCIe
> card, then I called this 'bus_reset' on parent PCIe bridge and later I
> called 'rescan' attribute on bridge. It correctly rested tested ath9k
> card. So I did something similar as in above script. But I agree that
> there are race conditions and basically lot of other calls needs to be
> done to restore state.
> 
> So I see that to make it 'usable' we need to do it automatically in
> kernel and also rescan/restore state of PCIe devices behind bridge after
> reset...
But, is save-restore alone going to be enough? I mean what is the state 
of the device-driver going to be when the device is going through the 
reset process? Isn't remove-rescan the correct thing to do here rather 
than save/restore?

- Vidya Sagar
> 
>> Alex
>>
>>>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>>>> index 50fcb62d59b5..f5e11c589498 100644
>>>> --- a/drivers/pci/pci-sysfs.c
>>>> +++ b/drivers/pci/pci-sysfs.c
>>>> @@ -1321,6 +1321,30 @@ static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
>>>>
>>>>   static DEVICE_ATTR(reset, 0200, NULL, reset_store);
>>>>
>>>> +static ssize_t reset_bus_store(struct device *dev, struct device_attribute *attr,
>>>> +                        const char *buf, size_t count)
>>>> +{
>>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>>> + unsigned long val;
>>>> + ssize_t result = kstrtoul(buf, 0, &val);
>>>> +
>>>> + if (result < 0)
>>>> +         return result;
>>>> +
>>>> + if (val != 1)
>>>> +         return -EINVAL;
>>>> +
>>>> + pm_runtime_get_sync(dev);
>>>> + result = pci_bridge_secondary_bus_reset(pdev);
>>>> + pm_runtime_put(dev);
>>>> + if (result < 0)
>>>> +         return result;
>>>> +
>>>> + return count;
>>>> +}
>>>> +
>>>> +static DEVICE_ATTR(reset_bus, 0200, NULL, reset_bus_store);
>>>> +
>>>>   static int pci_create_capabilities_sysfs(struct pci_dev *dev)
>>>>   {
>>>>    int retval;
>>>> @@ -1332,8 +1356,15 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
>>>>            if (retval)
>>>>                    goto error;
>>>>    }
>>>> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
>>>> +         retval = device_create_file(&dev->dev, &dev_attr_reset_bus);
>>>> +         if (retval)
>>>> +                 goto error_reset_bus;
>>>> + }
>>>>    return 0;
>>>>
>>>> +error_reset_bus:
>>>> + device_remove_file(&dev->dev, &dev_attr_reset);
>>>>   error:
>>>>    pcie_vpd_remove_sysfs_dev_files(dev);
>>>>    return retval;
>>>> @@ -1414,6 +1445,8 @@ static void pci_remove_capabilities_sysfs(struct pci_dev *dev)
>>>>            device_remove_file(&dev->dev, &dev_attr_reset);
>>>>            dev->reset_fn = 0;
>>>>    }
>>>> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
>>>> +         device_remove_file(&dev->dev, &dev_attr_reset_bus);
>>>>   }
>>>>
>>>>   /**
>>>
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ