Message-ID: <20130520224824.GA31740@google.com>
Date:	Mon, 20 May 2013 16:48:24 -0600
From:	Bjorn Helgaas <bhelgaas@...gle.com>
To:	"Zhang, LongX" <longx.zhang@...el.com>
Cc:	"linasvepstas@...il.com" <linasvepstas@...il.com>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"yanmin_zhang@...ux.intel.com" <yanmin_zhang@...ux.intel.com>,
	"Joseph.Liu@...lex.Com" <Joseph.Liu@...lex.Com>
Subject: Re: Subject : [ PATCH ]
 pci-reset-error_state-to-pci_channel_io_normal-at-report_slot_reset

On Fri, Apr 26, 2013 at 06:28:59AM +0000, Zhang, LongX wrote:
> From: Zhang Long <longx.zhang@...el.com>
> 
> Specific pci device drivers might have many functions to call
> pci_channel_offline to check device states. When slot_reset happens,
> drivers' slot_reset callback might call such functions and eventually
> abort the reset.
> 
> The patch resets pdev->error_state to pci_channel_io_normal at
> the beginning of report_slot_reset.
> 
> Thanks to Liu Joseph for pointing it out.
> 
> Signed-off-by: Zhang Yanmin <yanmin_zhang@...ux.intel.com>
> Signed-off-by: Zhang Long <longx.zhang@...el.com>
> ---
>  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
>  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
>  2 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 564d97f..c61fd44 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>  	result_data = (struct aer_broadcast_data *) data;
>  
>  	device_lock(&dev->dev);
> +	dev->error_state = pci_channel_io_normal;
>  	if (!dev->driver ||
>  		!dev->driver->err_handler ||
>  		!dev->driver->err_handler->slot_reset)
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index ed4d094..7abefd9 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>  	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>  	int retval;
>  
> -	/* If fatal, restore cfg space for possible link reset at upstream */
> -	if (dev->error_state == pci_channel_io_frozen) {
> -		dev->state_saved = true;
> -		pci_restore_state(dev);
> -		pcie_portdrv_restore_config(dev);
> -		pci_enable_pcie_error_reporting(dev);
> -	}
> +	/* restore cfg space for possible link reset at upstream */
> +	dev->state_saved = true;
> +	pci_restore_state(dev);
> +	pcie_portdrv_restore_config(dev);
> +	pci_enable_pcie_error_reporting(dev);
>  
>  	/* get true return value from &status */
>  	retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);

I think this patch changes the behavior in the case of a non-fatal error
where one of the .error_detected() methods returned
PCI_ERS_RESULT_NEED_RESET.  In that case, pcie_portdrv_slot_reset()
previously did not restore config space, but after your patch, it *will*
restore it.  We need an explanation of why this is safe.
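
To make the behavior change concrete, here is a small user-space sketch
(not kernel code) of the restore decision in pcie_portdrv_slot_reset()
before and after the patch.  The enum and the two helper names are
hypothetical and exist only for this illustration; the assumption, taken
from the hunk above, is that the old code restored config space only when
error_state was pci_channel_io_frozen.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space stand-in for the kernel's channel states.
 * Only the names matter here; this is not the kernel definition.
 */
enum channel_state {
	CHANNEL_IO_NORMAL,
	CHANNEL_IO_FROZEN,
	CHANNEL_IO_PERM_FAILURE,
};

/* Before the patch: config space is restored only in the frozen (fatal) case. */
static bool restores_config_before(enum channel_state error_state)
{
	return error_state == CHANNEL_IO_FROZEN;
}

/* After the patch: the test is gone, so the restore always happens. */
static bool restores_config_after(enum channel_state error_state)
{
	(void)error_state;	/* state no longer influences the decision */
	return true;
}
```

With a non-frozen state (the non-fatal case above), the first helper
returns false and the second returns true, which is exactly the difference
that needs to be justified.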

I think you should split this into two patches: the first would remove the
"if (dev->error_state == pci_channel_io_frozen)" test from portdrv_pci.c
and explain the reason, and the second would make the aerdrv_core.c change.

I'm also concerned about that same case (a non-fatal error where one of
the .error_detected() methods returned PCI_ERS_RESULT_NEED_RESET): I don't
think we actually *do* any kind of device reset.  This isn't related to
your patch, of course, so if you resolve the config space restore question,
we can deal with the reset question later.

Bjorn
