Message-ID: <eb2810113f98c2ff8201da1a0f827493@codeaurora.org>
Date:   Mon, 12 Mar 2018 21:04:47 +0530
From:   poza@...eaurora.org
To:     Keith Busch <keith.busch@...el.com>
Cc:     Sinan Kaya <okaya@...eaurora.org>,
        Bjorn Helgaas <helgaas@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Philippe Ombredanne <pombredanne@...b.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        Dongdong Liu <liudongdong3@...wei.com>,
        Wei Zhang <wzhang@...com>, Timur Tabi <timur@...eaurora.org>,
        linux-pci-owner@...r.kernel.org
Subject: Re: [PATCH v12 0/6] Address error and recovery for AER and DPC

On 2018-03-12 20:28, Keith Busch wrote:
> On Mon, Mar 12, 2018 at 08:16:38PM +0530, poza@...eaurora.org wrote:
>> On 2018-03-12 19:55, Keith Busch wrote:
>> > On Sun, Mar 11, 2018 at 11:03:58PM -0400, Sinan Kaya wrote:
>> > > On 3/11/2018 6:03 PM, Bjorn Helgaas wrote:
>> > > > On Wed, Feb 28, 2018 at 10:34:11PM +0530, Oza Pawandeep wrote:
>> > >
>> > > > That difference has been there since the beginning of DPC, so it has
>> > > > nothing to do with *this* series EXCEPT for the fact that it really
>> > > > complicates the logic you're adding to reset_link() and
>> > > > broadcast_error_message().
>> > > >
>> > > > We ought to be able to simplify that somehow because the only real
>> > > > difference between AER and DPC should be that DPC automatically
>> > > > disables the link and AER does it in software.
>> > >
>> > > I agree this should be possible. Code execution path should be almost
>> > > identical to fatal error case.
>> > >
>> > > Is there any reason why you went to stop driver path, Keith?
>> >
>> > The fact is the link is truly down during a DPC event. When the link
>> > is enabled again, you don't know at that point if the device(s) on the
>> > other side have changed. Calling a driver's error handler for the wrong
>> > device in an unknown state may have undefined results. Enumerating the
>> > slot from scratch should be safe, and will assign resources, tune bus
>> > settings, and bind to the matching driver.
>> >
>> > Per spec, DPC is the recommended way for handling surprise removal
>> > events and even recommends DPC capable slots *not* set 'Surprise'
>> > in Slot Capabilities so that removals are always handled by DPC. This
>> > service driver was developed with that use in mind.
>> 
>> That raises the question:
>> 
>> After a DPC trigger, should we
>> 
>> enumerate the devices?
>> or
>> run the error handling callbacks, then stop the devices, then enumerate?
>> or
>> run the error handling callbacks, then enumerate (without stopping the
>> devices)?
> 
> I'm not sure I understand. The link is disabled while DPC is triggered,
> so if anything, you'd want to un-enumerate everything below the contained
> port (that's what it does today).
> 
> After releasing a slot from DPC, the link is allowed to retrain. If there
> is a working device on the other side, a link up event occurs. That
> event is handled by the pciehp driver, and that schedules enumeration
> no matter what you do to the DPC driver.

Yes, that is the current behavior, but this patch set makes DPC aware of
the drivers' error handling callbacks.

Besides, in the absence of pciehp there is nobody to do the enumeration.

Also, I was talking about pci_stop_and_remove_bus_device() in DPC:
if DPC calls the driver's error callbacks, is it still required to stop
the devices?
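
To make the three orderings quoted above concrete, here is a rough
userspace sketch. It is only a model of the question, not kernel code and
not what the patch set does: the enum names and dpc_plan() are
hypothetical, and STEP_STOP_REMOVE merely stands in for the
pci_stop_and_remove_bus_device() call in DPC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step names: they model the discussion, not kernel APIs. */
enum dpc_step {
	STEP_ERR_CALLBACKS,	/* driver error handling callbacks */
	STEP_STOP_REMOVE,	/* stands in for pci_stop_and_remove_bus_device() */
	STEP_ENUMERATE,		/* enumerate below the contained port */
};

/* The three candidate orderings asked about in this thread. */
enum dpc_policy {
	POLICY_ENUMERATE_ONLY,
	POLICY_CALLBACKS_STOP_ENUMERATE,
	POLICY_CALLBACKS_ENUMERATE,
};

#define MAX_STEPS 3

/* Fill 'out' with the ordered steps for 'policy'; return the step count. */
size_t dpc_plan(enum dpc_policy policy, enum dpc_step out[MAX_STEPS])
{
	size_t n = 0;

	/* Options 2 and 3 notify the drivers first. */
	if (policy != POLICY_ENUMERATE_ONLY)
		out[n++] = STEP_ERR_CALLBACKS;

	/* Only option 2 stops and removes the devices. */
	if (policy == POLICY_CALLBACKS_STOP_ENUMERATE)
		out[n++] = STEP_STOP_REMOVE;

	/* Every option ends with enumeration. */
	out[n++] = STEP_ENUMERATE;

	assert(n <= MAX_STEPS);
	return n;
}
```

The only difference between the second and third ordering is the
STEP_STOP_REMOVE entry, which is exactly the step I am asking about.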



