Message-ID: <1401890782.24099.10.camel@oc7383187364.ibm.com>
Date:	Wed, 04 Jun 2014 16:06:22 +0200
From:	Frank Haverkamp <haver@...ux.vnet.ibm.com>
To:	Kleber Sacilotto de Souza <klebers@...ux.vnet.ibm.com>
Cc:	linux-kernel@...r.kernel.org, gregkh@...uxfoundation.org
Subject: Re: [PATCH 3/4] GenWQE: Improve hardware error recovery

On Wednesday, 04.06.2014 at 10:57 -0300, Kleber Sacilotto de Souza
wrote:
> Currently, in the event of a fatal hardware error, the driver tries a
> recovery procedure that calls pci_reset_function() to reset the card.
> This is not sufficient in some cases, needing a fundamental reset to
> bring the card back.
> 
> This patch implements a call to the platform fundamental reset procedure
> on the error recovery path if GENWQE_PLATFORM_ERROR_RECOVERY is enabled.
> This is implemented by default only on PPC64, since this can cause
> problems on other archs, e.g. zSeries, where the platform has its own
> recovery procedures, leading to a potential race condition. For these
> cases, the recovery is kept as it was before.
> 
> Signed-off-by: Kleber Sacilotto de Souza <klebers@...ux.vnet.ibm.com>
> ---
>  drivers/misc/genwqe/card_base.c |   45 +++++++++++++++++++++++++++++++++++++++
>  1 files changed, 45 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/misc/genwqe/card_base.c b/drivers/misc/genwqe/card_base.c
> index 87ebaba..abb7961 100644
> --- a/drivers/misc/genwqe/card_base.c
> +++ b/drivers/misc/genwqe/card_base.c
> @@ -797,6 +797,41 @@ static int genwqe_pci_fundamental_reset(struct pci_dev *pci_dev)
>  	return rc;
>  }
> 
> +
> +static int genwqe_platform_recovery(struct genwqe_dev *cd)
> +{
> +	struct pci_dev *pci_dev = cd->pci_dev;
> +	int rc;
> +
> +	dev_info(&pci_dev->dev,
> +		 "[%s] resetting card for error recovery\n", __func__);
> +
> +	/* Clear out error injection flags */
> +	cd->err_inject &= ~(GENWQE_INJECT_HARDWARE_FAILURE |
> +			    GENWQE_INJECT_GFIR_FATAL |
> +			    GENWQE_INJECT_GFIR_INFO);
> +
> +	genwqe_stop(cd);
> +
> +	/* Try recovering the card with a fundamental reset */
> +	rc = genwqe_pci_fundamental_reset(pci_dev);
> +	if (!rc) {
> +		rc = genwqe_start(cd);
> +		if (!rc)
> +			dev_info(&pci_dev->dev,
> +				 "[%s] card recovered\n", __func__);
> +		else
> +			dev_err(&pci_dev->dev,
> +				"[%s] err: cannot start card services! (err=%d)\n",
> +				__func__, rc);
> +	} else {
> +		dev_err(&pci_dev->dev,
> +			"[%s] card reset failed\n", __func__);
> +	}
> +
> +	return rc;
> +}
> +
>  /*
>   * genwqe_reload_bistream() - reload card bitstream
>   *
> @@ -875,6 +910,7 @@ static int genwqe_health_thread(void *data)
>  	struct pci_dev *pci_dev = cd->pci_dev;
>  	u64 gfir, gfir_masked, slu_unitcfg, app_unitcfg;
> 
> + health_thread_begin:
>  	while (!kthread_should_stop()) {
>  		rc = wait_event_interruptible_timeout(cd->health_waitq,
>  			 (genwqe_health_check_cond(cd, &gfir) ||
> @@ -960,6 +996,15 @@ static int genwqe_health_thread(void *data)
>  		/* We do nothing if the card is going over PCI recovery */
>  		if (pci_channel_offline(pci_dev))
>  			return -EIO;
> +
> +		/*
> +		 * If it's supported by the platform, we try a fundamental reset
> +		 * to recover from a fatal error. Otherwise, we continue to wait
> +		 * for an external recovery procedure to take care of it.
> +		 */
> +		rc = genwqe_platform_recovery(cd);
> +		if (!rc)
> +			goto health_thread_begin;
>  	}
> 
>  	dev_err(&pci_dev->dev,

Thanks for contributing those additions to our driver.

Acked-by: Frank Haverkamp <haver@...ux.vnet.ibm.com>
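
For readers following the control flow of genwqe_platform_recovery() in
the quoted patch, here is a minimal userspace sketch of the same
sequence: clear the error injection flags, stop the card, attempt a
fundamental reset, and restart services only if the reset succeeded.
Everything in it is an assumption made for illustration: struct
mock_card, the mock_* helpers and the INJECT_* flag values are
hypothetical stand-ins, not the driver's real types or constants, and
the reset and start results are simply read from the mock structure
instead of coming from hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag values; the real GENWQE_INJECT_* constants differ. */
#define INJECT_HARDWARE_FAILURE 0x1u
#define INJECT_GFIR_FATAL       0x2u
#define INJECT_GFIR_INFO        0x4u

/* Hypothetical stand-in for struct genwqe_dev. */
struct mock_card {
	bool running;        /* are card services started? */
	int reset_rc;        /* value the mocked fundamental reset returns */
	int start_rc;        /* value the mocked service start returns */
	unsigned err_inject; /* pending error injection flags */
};

static void mock_stop(struct mock_card *cd)
{
	cd->running = false;
}

static int mock_fundamental_reset(struct mock_card *cd)
{
	return cd->reset_rc;
}

static int mock_start(struct mock_card *cd)
{
	if (cd->start_rc == 0)
		cd->running = true;
	return cd->start_rc;
}

/* Mirrors the ordering in the patch: returns 0 only if both the
 * reset and the restart succeed, otherwise the first error seen. */
static int mock_platform_recovery(struct mock_card *cd)
{
	int rc;

	assert(cd != NULL);

	/* Clear only the injection flags, leaving other bits alone. */
	cd->err_inject &= ~(INJECT_HARDWARE_FAILURE |
			    INJECT_GFIR_FATAL |
			    INJECT_GFIR_INFO);

	mock_stop(cd);

	rc = mock_fundamental_reset(cd);
	if (rc) {
		fprintf(stderr, "card reset failed\n");
		return rc;
	}

	rc = mock_start(cd);
	if (rc)
		fprintf(stderr, "cannot start card services (err=%d)\n", rc);

	return rc;
}
```

In this sketch a caller would treat a zero return the way the health
thread in the patch does, by going back to waiting for the next event,
and would treat any non-zero return as a failed recovery.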

