[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e5af8e6e-ce37-eba4-330e-94b43bd0adb7@linux.intel.com>
Date: Tue, 24 Mar 2020 18:17:44 -0700
From: "Kuppuswamy, Sathyanarayanan"
<sathyanarayanan.kuppuswamy@...ux.intel.com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
ashok.raj@...el.com
Subject: Re: [PATCH v18 03/11] PCI/DPC: Fix DPC recovery issue in non hotplug
case
Hi Bjorn,
On 3/24/20 4:49 PM, Bjorn Helgaas wrote:
> I don't understand why hotplug is relevant here. This path
> (dpc_reset_link()) is only used for downstream ports that support DPC.
> DPC has already disabled the link, which resets everything below the
> port, regardless of whether the port supports hotplug.
>
> I do see that PCI_ERS_RESULT_NEED_RESET seems to promise a lot more
> than it actually*does*. The doc (pci-error-recovery.rst) says
> .error_detected() can return PCI_ERS_RESULT_NEED_RESET to*request* a
> slot reset. But if that happens, pcie_do_recovery() doesn't do a
> reset at all. It calls the driver's .slot_reset() method, which tells
> the driver "we've reset your device; please re-initialize the
> hardware."
>
> I guess this abuses PCI_ERS_RESULT_NEED_RESET by taking advantage of
> that implementation deficiency in pcie_do_recovery(): we know the
> downstream devices have already been reset via DPC, and returning
> PCI_ERS_RESULT_NEED_RESET means we'll call .slot_reset() to tell the
> driver about that reset.
>
> I can see how this achieves the desired result, but if/when we fix
> pcie_do_recovery() to actually*do* the reset promised by
> PCI_ERS_RESULT_NEED_RESET, we will be doing*two* resets: the first
> via DPC and a second via whatever slot reset mechanism
> pcie_do_recovery() would use.
When we fix this issue, if we make sure the reset logic is
implemented before we call .reset_link callback we should be
able to avoid resetting the device twice. Before we call DPC
.reset_link callback, the device link will not up and hence we
should not able to reset it.
>
> So I guess the real issue (as you allude to in the commit log) is that
> we rely on hotplug to unbind/rebind the driver, and without hotplug we
> need to at least tell the driver the device was reset.
Agree
>
> I'll try to expand the comment here so it reminds me what's going on
> when we have to look at this again:) Let me know if I'm on the right
> track.
Yes, your understanding is correct.
Powered by blists - more mailing lists