Message-ID: <55b3a469-c306-acf1-f97e-f07f40054974@linux.intel.com>
Date:   Tue, 26 May 2020 20:06:01 -0700
From:   "Kuppuswamy, Sathyanarayanan" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
To:     Oliver O'Halloran <oohall@...il.com>
Cc:     Yicong Yang <yangyicong@...ilicon.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        jay.vosburgh@...onical.com, linux-pci@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        ashok.raj@...el.com
Subject: Re: [PATCH v1 1/1] PCI/ERR: Handle fatal error recovery for
 non-hotplug capable devices

Hi,

On 5/26/20 8:00 PM, Oliver O'Halloran wrote:
> On Wed, May 27, 2020 at 12:00 PM Kuppuswamy, Sathyanarayanan
> <sathyanarayanan.kuppuswamy@...ux.intel.com> wrote:
>>
>> Hi,
>>
>> On 5/21/20 7:56 PM, Yicong Yang wrote:
>>>
>>>
>>> On 2020/5/22 3:31, Kuppuswamy, Sathyanarayanan wrote:
>>>>
>>> Not exactly. In pci_bus_error_reset(), we call pci_slot_reset() only if it's
>>> hotpluggable. But we always call pci_bus_reset() to perform a secondary bus
>>> reset for the bridge. That's what I think is unnecessary for a normal link,
>>> and that's what reset link indicates us to do. The slot reset is introduced
>>> in the process only to solve side effects. (c4eed62a2143, PCI/ERR: Use slot reset if available)
>>
>> IIUC, pci_bus_reset() will do a slot reset if it's supported (hot-plug
>> capable slots). If it's not supported, it will attempt a secondary
>> bus reset. So a secondary bus reset will be attempted only if slot
>> reset is not supported.
>>
>> Since report_error_detected() requests a reset, we will have
>> to attempt some kind of reset before we call ->slot_reset(), right?
> 
> Yes, the driver returns PCI_ERS_RESULT_NEED_RESET from
> ->error_detected() to indicate that it doesn't know how to recover
> from the error. How that reset is performed doesn't really matter, but
> it does need to happen.
> 
> 
>>> PCI_ERS_RESULT_NEED_RESET indicates that the driver
>>> wants a platform-dependent slot reset and its ->slot_reset() method to be called then.
>>> I don't think it's same as slot reset mentioned above, which is only for hotpluggable
>>> ones.
>> What do you think is the correct reset implementation? Is it something
>> like this?
>>
>> if (hotplug capable)
>>      try_slot_reset()
>> else
>>      do_nothing()
> 
> Looks broken to me, but all the reset handling is a rat's nest so
> maybe I'm missing something. In the case of a DPC trip the link is
> disabled which has the side-effect of hot-resetting the downstream
> device. Maybe it's fine?
Yes, in the case of DPC (fatal errors) the link is already reset, so we
don't need any special handling. This reset logic is mainly for
non-fatal errors.
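
To make sure we are talking about the same thing, here is a rough
userspace sketch of the decision as I understand it. It is only a model:
struct model_port, enum reset_kind and choose_reset() are made-up names
for illustration, not kernel APIs, and the real code paths may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the reset choice discussed in this thread. */
enum reset_kind {
	RESET_NONE,		/* nothing extra to do */
	RESET_SLOT,		/* slot reset via the hotplug controller */
	RESET_SECONDARY_BUS,	/* secondary bus reset on the bridge */
};

struct model_port {
	bool hotplug_capable;	/* assumption: slot reset is available */
	bool dpc_triggered;	/* assumption: DPC already disabled the link */
};

/*
 * Pick the reset for a port whose driver asked for one
 * (PCI_ERS_RESULT_NEED_RESET in the real code).
 */
static enum reset_kind choose_reset(const struct model_port *port)
{
	assert(port != NULL);

	/* DPC trip: the link went down, so the device was hot-reset. */
	if (port->dpc_triggered)
		return RESET_NONE;

	/* Prefer a slot reset where the slot supports it. */
	if (port->hotplug_capable)
		return RESET_SLOT;

	/* Otherwise fall back to a secondary bus reset. */
	return RESET_SECONDARY_BUS;
}
```

With this model a non-hotplug port that did not go through DPC still gets
a secondary bus reset, which is the case we are debating.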
> 
> As an aside, why do we have both ->slot_reset() and ->reset_done() in
> the error handling callbacks? Seems like their roles are almost
> identical.
Not sure. I think reset_done() is the final cleanup.
> 
> Oliver
> 
