lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 7 Sep 2020 19:34:12 -0500
From:   Corey Minyard <minyard@....org>
To:     Markus Boehme <markubo@...zon.com>
Cc:     openipmi-developer@...ts.sourceforge.net,
        Arnd Bergmann <arnd@...db.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org, Stefan Nuernberger <snu@...zon.com>,
        SeongJae Park <sjpark@...zon.com>, Amit Shah <aams@...zon.com>
Subject: Re: [PATCH 3/3] ipmi: Add timeout waiting for channel information

On Mon, Sep 07, 2020 at 06:25:37PM +0200, Markus Boehme wrote:
> We have observed hosts with misbehaving BMCs that receive a Get Channel
> Info command but don't respond. This leads to an indefinite wait in the
> ipmi_msghandler's __scan_channels function, showing up as hung task
> messages for modprobe.
> 
> Add a timeout waiting for the channel scan to complete. If the scan
> fails to complete within that time, treat that like IPMI 1.0 and only
> assume the presence of the primary IPMB channel at channel number 0.

This patch is a significant rewrite of the function.  This makes me a
little uncomfortable.  It's generally better to fix the bug in a minimal
patch.  It would be easier to read if you had two patches, one to
restructure the code and one to fix the bug.

One comment inline, but it doesn't matter, because...

While thinking about this, I realized an issue with these patches.
There should be timers in the lower layers that detect that the BMC does
not respond and should return an error response.  This is supposed to be
guaranteed by the lower layer, you shouldn't need timers here.  In fact,
if you abort with a timer here, you should get a lower layer reponds
later, causing other issues.

So, this is wrong.  If you are never getting a response, there is a bug
in the lower layer.  If you are not getting the error response as
quickly as you would like, I'm not sure what to do about that.

The first patch, of course, is an obvious bug fix.  I'll include that.

-corey

> 
> Signed-off-by: Stefan Nuernberger <snu@...zon.com>
> Signed-off-by: Markus Boehme <markubo@...zon.com>
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 72 ++++++++++++++++++++-----------------
>  1 file changed, 39 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> index 2a2e8b2..9de9ba6 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -3315,46 +3315,52 @@ channel_handler(struct ipmi_smi *intf, struct ipmi_recv_msg *msg)
>   */
>  static int __scan_channels(struct ipmi_smi *intf, struct ipmi_device_id *id)
>  {
> -	int rv;
> +	long rv;
> +	unsigned int set;
>  
> -	if (ipmi_version_major(id) > 1
> -			|| (ipmi_version_major(id) == 1
> -			    && ipmi_version_minor(id) >= 5)) {
> -		unsigned int set;
> +	if (ipmi_version_major(id) == 1 && ipmi_version_minor(id) < 5) {

This is incorrect, it will not correctly handle IPMI 0.x BMCs.  Yes,
they exist.

> +		set = intf->curr_working_cset;
> +		goto single_ipmb_channel;
> +	}
>  
> -		/*
> -		 * Start scanning the channels to see what is
> -		 * available.
> -		 */
> -		set = !intf->curr_working_cset;
> -		intf->curr_working_cset = set;
> -		memset(&intf->wchannels[set], 0,
> -		       sizeof(struct ipmi_channel_set));
> -
> -		intf->null_user_handler = channel_handler;
> -		intf->curr_channel = 0;
> -		rv = send_channel_info_cmd(intf, 0);
> -		if (rv) {
> -			dev_warn(intf->si_dev,
> -				 "Error sending channel information for channel 0, %d\n",
> -				 rv);
> -			intf->null_user_handler = NULL;
> -			return -EIO;
> -		}
> +	/*
> +	 * Start scanning the channels to see what is
> +	 * available.
> +	 */
> +	set = !intf->curr_working_cset;
> +	intf->curr_working_cset = set;
> +	memset(&intf->wchannels[set], 0, sizeof(struct ipmi_channel_set));
>  
> -		/* Wait for the channel info to be read. */
> -		wait_event(intf->waitq, intf->channels_ready);
> +	intf->null_user_handler = channel_handler;
> +	intf->curr_channel = 0;
> +	rv = send_channel_info_cmd(intf, 0);
> +	if (rv) {
> +		dev_warn(intf->si_dev,
> +			 "Error sending channel information for channel 0, %ld\n",
> +			 rv);
>  		intf->null_user_handler = NULL;
> -	} else {
> -		unsigned int set = intf->curr_working_cset;
> +		return -EIO;
> +	}
>  
> -		/* Assume a single IPMB channel at zero. */
> -		intf->wchannels[set].c[0].medium = IPMI_CHANNEL_MEDIUM_IPMB;
> -		intf->wchannels[set].c[0].protocol = IPMI_CHANNEL_PROTOCOL_IPMB;
> -		intf->channel_list = intf->wchannels + set;
> -		intf->channels_ready = true;
> +	/* Wait for the channel info to be read. */
> +	rv = wait_event_timeout(intf->waitq, intf->channels_ready, 5 * HZ);
> +	if (rv == 0) {
> +		dev_warn(intf->si_dev,
> +			 "Timed out waiting for channel information. Assuming a single IPMB channel at 0.\n");
> +		goto single_ipmb_channel;
>  	}
>  
> +	goto out;
> +
> +single_ipmb_channel:
> +	/* Assume a single IPMB channel at zero. */
> +	intf->wchannels[set].c[0].medium = IPMI_CHANNEL_MEDIUM_IPMB;
> +	intf->wchannels[set].c[0].protocol = IPMI_CHANNEL_PROTOCOL_IPMB;
> +	intf->channel_list = intf->wchannels + set;
> +	intf->channels_ready = true;
> +
> +out:
> +	intf->null_user_handler = NULL;
>  	return 0;
>  }
>  
> -- 
> 2.7.4
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ