[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201002172913.GA2809822@bjorn-Precision-5520>
Date: Fri, 2 Oct 2020 12:29:13 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Ethan Zhao <haifeng.zhao@...el.com>
Cc: bhelgaas@...gle.com, oohall@...il.com, ruscur@...sell.cc,
lukas@...ner.de, andriy.shevchenko@...ux.intel.com,
stuart.w.hayes@...il.com, mr.nuke.me@...il.com,
mika.westerberg@...ux.intel.com, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, ashok.raj@...ux.intel.com,
sathyanarayanan.kuppuswamy@...el.com, xerces.zhao@...il.com,
Sinan Kaya <okaya@...nel.org>
Subject: Re: [PATCH v6 4/5] PCI: only return true when dev io state is really
changed
[+cc Sinan]
On Wed, Sep 30, 2020 at 03:05:36AM -0400, Ethan Zhao wrote:
> When uncorrectable error happens, AER driver and DPC driver interrupt
> handlers likely call
>
> pcie_do_recovery()
> ->pci_walk_bus()
> ->report_frozen_detected()
>
> with pci_channel_io_frozen the same time.
> If pci_dev_set_io_state() return true even if the original state is
> pci_channel_io_frozen, that will cause AER or DPC handler re-enter
> the error detecting and recovery procedure one after another.
> The result is the recovery flow mixed between AER and DPC.
> So simplify the pci_dev_set_io_state() function to only return true
> when dev->error_state is changed.
>
> Signed-off-by: Ethan Zhao <haifeng.zhao@...el.com>
> Tested-by: Wen Jin <wen.jin@...el.com>
> Tested-by: Shanshan Zhang <ShanshanX.Zhang@...el.com>
> Reviewed-by: Alexandru Gagniuc <mr.nuke.me@...il.com>
> Reviewed-by: Andy Shevchenko <andy.shevchenko@...il.com>
> ---
> Changnes:
> v2: revise description and code according to suggestion from Andy.
> v3: change code to simpler.
> v4: no change.
> v5: no change.
> v6: no change.
>
> drivers/pci/pci.h | 37 +++++--------------------------------
> 1 file changed, 5 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 455b32187abd..f2beeaeda321 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -359,39 +359,12 @@ struct pci_sriov {
> static inline bool pci_dev_set_io_state(struct pci_dev *dev,
> pci_channel_state_t new)
> {
> - bool changed = false;
> -
> device_lock_assert(&dev->dev);
> - switch (new) {
> - case pci_channel_io_perm_failure:
> - switch (dev->error_state) {
> - case pci_channel_io_frozen:
> - case pci_channel_io_normal:
> - case pci_channel_io_perm_failure:
> - changed = true;
> - break;
> - }
> - break;
> - case pci_channel_io_frozen:
> - switch (dev->error_state) {
> - case pci_channel_io_frozen:
> - case pci_channel_io_normal:
> - changed = true;
> - break;
> - }
> - break;
> - case pci_channel_io_normal:
> - switch (dev->error_state) {
> - case pci_channel_io_frozen:
> - case pci_channel_io_normal:
> - changed = true;
> - break;
> - }
> - break;
> - }
> - if (changed)
> - dev->error_state = new;
> - return changed;
> + if (dev->error_state == new)
> + return false;
> +
> + dev->error_state = new;
> + return true;
> }
IIUC this changes the behavior of the function, but it's difficult to
analyze because it does a lot of simplification at the same time.
Please consider the following, which is intended to simplify the
function while preserving the behavior (but please verify; it's been a
long time since I looked at this). Then maybe see how your patch
could be done on top of this?
Alternatively, come up with your own simplification patch + the
functionality change.
commit 983d9b1f8177 ("PCI/ERR: Simplify pci_dev_set_io_state()")
Author: Bjorn Helgaas <bhelgaas@...gle.com>
Date: Tue May 19 12:28:57 2020 -0500
PCI/ERR: Simplify pci_dev_set_io_state()
Truth table:
requested new state
current ------------------------------------------
state normal frozen perm_failure
------------ + ------------- ------------- ------------
normal | normal frozen perm_failure
frozen | normal frozen perm_failure
perm_failure | perm_failure* perm_failure* perm_failure
* "not changed", returns false
No functional change intended.
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 6d3f75867106..81408552f7c9 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -358,39 +358,21 @@ struct pci_sriov {
static inline bool pci_dev_set_io_state(struct pci_dev *dev,
pci_channel_state_t new)
{
- bool changed = false;
-
device_lock_assert(&dev->dev);
- switch (new) {
- case pci_channel_io_perm_failure:
- switch (dev->error_state) {
- case pci_channel_io_frozen:
- case pci_channel_io_normal:
- case pci_channel_io_perm_failure:
- changed = true;
- break;
- }
- break;
- case pci_channel_io_frozen:
- switch (dev->error_state) {
- case pci_channel_io_frozen:
- case pci_channel_io_normal:
- changed = true;
- break;
- }
- break;
- case pci_channel_io_normal:
- switch (dev->error_state) {
- case pci_channel_io_frozen:
- case pci_channel_io_normal:
- changed = true;
- break;
- }
- break;
+
+ /* Can always put a device in perm_failure state */
+ if (new == pci_channel_io_perm_failure) {
+ dev->error_state == pci_channel_io_perm_failure;
+ return true;
}
- if (changed)
- dev->error_state = new;
- return changed;
+
+ /* If already in perm_failure, can't set to normal or frozen */
+ if (dev->error_state == pci_channel_io_perm_failure)
+ return false;
+
+ /* Can always change normal to frozen or vice versa */
+ dev->error_state = new;
+ return true;
}
static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
Powered by blists - more mailing lists