lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5300421B.1020901@acm.org>
Date:	Sat, 15 Feb 2014 22:44:11 -0600
From:	Corey Minyard <minyard@....org>
To:	Xie XiuQi <xiexiuqi@...wei.com>
CC:	Hushiyuan <hushiyuan@...wei.com>,
	openipmi-developer@...ts.sourceforge.net,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Rocky Craig <rocky.craig@...com>
Subject: Re: [Openipmi-developer] [PATCH] ipmi: fix BT reset for a while when
 cmd timeout

I don't really understand the error that is happening.  I see that it
continues to time out, but I don't know why.  If you can get in to this
situation here, it makes me worried that there is some other issue. 
issuing the warm reset, even if the command is not supported, should be
harmless.  Maybe the warm reset actually happens and it takes longer
than 5 seconds?

The following patch is certainly not the right fix.  I would actually
prefer to just remove the reset operation from the driver, but I'd
really like to fix the fundamental issue.  To me this looks like a bug
in the BMC.

I'm copying Rocky Craig, who wrote this state machine.

-corey

On 02/11/2014 04:26 AM, Xie XiuQi wrote:
> I fould a problem: when a cmd timeout and just
> in that time bt->seq < 2, system will alway keep
> retrying and we can't send any cmd to bmc.
>
> the error message is like this:
> [  530.908621] IPMI BT: timeout in RD_WAIT [ ] 1 retries left
> [  582.661329] IPMI BT: timeout in RD_WAIT [ ]
> [  582.661334] failed 2 retries, sending error response
> [  582.661337] IPMI: BT reset (takes 5 secs)
> [  693.335307] IPMI BT: timeout in RD_WAIT [ ]
> [  693.335312] failed 2 retries, sending error response
> [  693.335315] IPMI: BT reset (takes 5 secs)
> [  804.825161] IPMI BT: timeout in RD_WAIT [ ]
> [  804.825166] failed 2 retries, sending error response
> [  804.825169] IPMI: BT reset (takes 5 secs)
> ...
>
> When BT reset, a cmd "warm reset" will be sent to bmc, but this cmd
> is Optional in spec(refer to ipmi-interface-spec-v2). Some machines
> don't support this cmd.
>
> So, bt->init is introduced. Only during insmod, we do BT reset when
> response timeout to avoid system crash.
>
> Reported-by: Hu Shiyuan <hushiyuan@...wei.com>
> Signed-off-by: Xie XiuQi <xiexiuqi@...wei.com>
> Cc: stable@...r.kernel.org	# 3.4+
> ---
>  drivers/char/ipmi/ipmi_bt_sm.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/char/ipmi/ipmi_bt_sm.c b/drivers/char/ipmi/ipmi_bt_sm.c
> index a22a7a5..b4a7b2a 100644
> --- a/drivers/char/ipmi/ipmi_bt_sm.c
> +++ b/drivers/char/ipmi/ipmi_bt_sm.c
> @@ -107,6 +107,7 @@ struct si_sm_data {
>  	int		BT_CAP_outreqs;
>  	long		BT_CAP_req2rsp;
>  	int		BT_CAP_retries;	/* Recommended retries */
> +	int		init;
>  };
>
>  #define BT_CLR_WR_PTR	0x01	/* See IPMI 1.5 table 11.6.4 */
> @@ -438,8 +439,8 @@ static enum si_sm_result error_recovery(struct si_sm_data *bt,
>  	if (!bt->nonzero_status)
>  		printk(KERN_ERR "IPMI BT: stuck, try power cycle\n");
>
> -	/* this is most likely during insmod */
> -	else if (bt->seq <= (unsigned char)(bt->BT_CAP_retries & 0xFF)) {
> +	/* only during insmod */
> +	else if (!bt->init) {
>  		printk(KERN_WARNING "IPMI: BT reset (takes 5 secs)\n");
>  		bt->state = BT_STATE_RESET1;
>  		return SI_SM_CALL_WITHOUT_DELAY;
> @@ -589,6 +590,10 @@ static enum si_sm_result bt_event(struct si_sm_data *bt, long time)
>  			BT_STATE_CHANGE(BT_STATE_READ_WAIT,
>  					SI_SM_CALL_WITHOUT_DELAY);
>  		bt->state = bt->complete;
> +
> +		if (!bt->init && bt->seq)
> +			bt->init = 1;
> +
>  		return bt->state == BT_STATE_IDLE ?	/* where to next? */
>  			SI_SM_TRANSACTION_COMPLETE :	/* normal */
>  			SI_SM_CALL_WITHOUT_DELAY;	/* Startup magic */

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ