lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <509B0013.6080508@gmail.com>
Date:	Thu, 08 Nov 2012 01:42:59 +0100
From:	Daniel Mack <zonque@...il.com>
To:	Takashi Iwai <tiwai@...e.de>
CC:	Alan Stern <stern@...land.harvard.edu>,
	Christof Meerwald <cmeerw@...erw.org>,
	"Artem S. Tashkinov" <t.artem@...os.com>,
	Kernel development list <linux-kernel@...r.kernel.org>,
	USB list <linux-usb@...r.kernel.org>
Subject: Re: A reliable kernel panic (3.6.2) and system crash when visiting
 a particular website

On 07.11.2012 20:19, Takashi Iwai wrote:
> At Wed, 7 Nov 2012 12:34:43 -0500 (EST),
> Alan Stern wrote:
>>
>> On Mon, 5 Nov 2012, Christof Meerwald wrote:
>>
>>> BTW, I have been able to reproduce the problem on a completely
>>> different machine (also running Ubuntu 12.10, but different hardware).
>>> The important thing appears to be that the USB audio device is
>>> connected via a USB 2.0 hub (and then using the test code posted in
>>> http://pastebin.com/aHGe1S1X specifying the audio device as
>>> "plughw:Set" (or whatever it's called) seems to trigger the freeze).
>>
>> Christof: Thank you for that reference, it was a big help.  After
>> crashing my system many times I have tracked the problem, at least in
>> part.  The patch below should prevent your system from freezing.
>>
>>
>> Takashi: It turns out the the problem is triggered when the audio
>> subsystem calls snd_usb_endpoint_stop() with wait == 0 and then calls
>> snd_usb_endpoint_start().  Since the driver doesn't wait for the
>> outstanding URBs to finish, it tries to submit them again while they
>> are still active.
>>
>> Normally the USB core would realize this and fail the submission, but a
>> bug in ehci-hcd prevented this from happening.  (That bug is what the
>> patch below fixes.)  The URB gets added to the active list twice,
>> resulting in list corruption and an oops in interrupt context, which
>> freezes the system.
>>
>> The user program that triggers the problem basically looks like this:
>>
>>   snd_pcm_prepare(rec_pcm);
>>   snd_pcm_start(rec_pcm);
>>   snd_pcm_drop(rec_pcm);
>>
>>   snd_pcm_prepare(rec_pcm);
>>   snd_pcm_start(rec_pcm);
>>
>> The snd_pcm_drop call unlinks the URBs but does not wait for them to
>> finish.  Then the second snd_pcm_start call submits the URBs before
>> they have finished.


Thanks for investigating on this and to everyone who so quickyl tested
the provided patch. Seems like we got the right idea where the problem
really is.

However, the proposed patch seems wrong to me (see below).

>> What is the right solution for this problem?
> 
> How about the patch below?  (It's for 3.6, and won't be applied cleanly
> to 3.7, but easy to adapt.)
> 
> 
> Takashi
> 
> ---
> diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
> index d9de667..38830e2 100644
> --- a/sound/usb/endpoint.c
> +++ b/sound/usb/endpoint.c
> @@ -35,6 +35,7 @@
>  
>  #define EP_FLAG_ACTIVATED	0
>  #define EP_FLAG_RUNNING		1
> +#define EP_FLAG_STOPPING	2
>  
>  /*
>   * snd_usb_endpoint is a model that abstracts everything related to an
> @@ -502,10 +503,19 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep)
>  	if (alive)
>  		snd_printk(KERN_ERR "timeout: still %d active urbs on EP #%x\n",
>  					alive, ep->ep_num);
> +	clear_bit(EP_FLAG_STOPPING, &ep->flags);
>  
>  	return 0;
>  }
>  
> +/* wait until urbs are really dropped */
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep)
> +{
> +	if (test_bit(EP_FLAG_STOPPING, &ep->flags))
> +		wait_clear_urbs(ep);
> +}
> +
> +
>  /*
>   * unlink active urbs.
>   */
> @@ -913,6 +923,8 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>  
>  		if (wait)
>  			wait_clear_urbs(ep);
> +		else
> +			set_bit(EP_FLAG_STOPPING, &ep->flags);
>  	}
>  }
>  
> diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
> index cbbbdf2..c1540a4 100644
> --- a/sound/usb/endpoint.h
> +++ b/sound/usb/endpoint.h
> @@ -16,6 +16,7 @@ int snd_usb_endpoint_set_params(struct snd_usb_endpoint *ep,
>  int  snd_usb_endpoint_start(struct snd_usb_endpoint *ep, int can_sleep);
>  void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep,
>  			   int force, int can_sleep, int wait);
> +void snd_usb_endpoint_sync_stop(struct snd_usb_endpoint *ep);
>  int  snd_usb_endpoint_activate(struct snd_usb_endpoint *ep);
>  int  snd_usb_endpoint_deactivate(struct snd_usb_endpoint *ep);
>  void snd_usb_endpoint_free(struct list_head *head);
> diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
> index f782ce1..aee3ab0 100644
> --- a/sound/usb/pcm.c
> +++ b/sound/usb/pcm.c
> @@ -546,6 +546,11 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream)
>  	if (snd_BUG_ON(!subs->data_endpoint))
>  		return -EIO;
>  
> +	if (subs->sync_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->sync_endpoint);
> +	if (subs->data_endpoint)
> +		snd_usb_endpoint_sync_stop(subs->data_endpoint);

We can't simply stop both endpoints in the prepare callback. The essence
of the new reference-counting system is that we can use endpoints from
multiple contexts, and the logic inside endpoint.c will care about when
to start up and take down the urbs. The idea here is that endoints can
be run for many purposes, and the new implementation that was added
allows capture endpoints to run purely as timing reference for playback.

This bug needs to be fixed in the ehci controller, or we need some other
solution in the snd-usb-audio driver. I'll do some test once I'm back
from ELC.


Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ