Message-ID: <1555569464.7835.4.camel@suse.com>
Date: Thu, 18 Apr 2019 08:37:44 +0200
From: Oliver Neukum <oneukum@...e.com>
To: Kloetzke Jan <Jan.Kloetzke@...h.de>
Cc: "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH] usbnet: fix kernel crash after disconnect

On Wed, 2019-04-17 at 09:19 +0000, Kloetzke Jan wrote:
> When disconnecting cdc_ncm the kernel sporadically crashes shortly
> after the disconnect:
>
> [ 57.868812] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> ...
> [ 58.006653] PC is at 0x0
> [ 58.009202] LR is at call_timer_fn+0xec/0x1b4
> [ 58.013567] pc : [<0000000000000000>] lr : [<ffffff80080f5130>] pstate: 00000145
> [ 58.020976] sp : ffffff8008003da0
> [ 58.024295] x29: ffffff8008003da0 x28: 0000000000000001
> [ 58.029618] x27: 000000000000000a x26: 0000000000000100
> [ 58.034941] x25: 0000000000000000 x24: ffffff8008003e68
> [ 58.040263] x23: 0000000000000000 x22: 0000000000000000
> [ 58.045587] x21: 0000000000000000 x20: ffffffc68fac1808
> [ 58.050910] x19: 0000000000000100 x18: 0000000000000000
> [ 58.056232] x17: 0000007f885aff8c x16: 0000007f883a9f10
> [ 58.061556] x15: 0000000000000001 x14: 000000000000006e
> [ 58.066878] x13: 0000000000000000 x12: 00000000000000ba
> [ 58.072201] x11: ffffffc69ff1db30 x10: 0000000000000020
> [ 58.077524] x9 : 8000100008001000 x8 : 0000000000000001
> [ 58.082847] x7 : 0000000000000800 x6 : ffffff8008003e70
> [ 58.088169] x5 : ffffffc69ff17a28 x4 : 00000000ffff138b
> [ 58.093492] x3 : 0000000000000000 x2 : 0000000000000000
> [ 58.098814] x1 : 0000000000000000 x0 : 0000000000000000
> ...
> [ 58.205800] [< (null)>] (null)
> [ 58.210521] [<ffffff80080f5298>] expire_timers+0xa0/0x14c
> [ 58.215937] [<ffffff80080f542c>] run_timer_softirq+0xe8/0x128
> [ 58.221702] [<ffffff8008081120>] __do_softirq+0x298/0x348
> [ 58.227118] [<ffffff80080a6304>] irq_exit+0x74/0xbc
> [ 58.232009] [<ffffff80080e17dc>] __handle_domain_irq+0x78/0xac
> [ 58.237857] [<ffffff8008080cf4>] gic_handle_irq+0x80/0xac
> ...
>
> The crash happens roughly 125..130ms after the disconnect. This
> correlates with the 'delay' timer that is started on certain USB tx/rx
> errors in the URB completion handler.
>
> The suspected problem is a race of usbnet_stop() with
> usbnet_start_xmit(). In usbnet_stop() we call usbnet_terminate_urbs()
> to cancel all URBs in flight. This only makes sense if no new URBs are
> submitted concurrently, though. But the usbnet_start_xmit() can run at
> the same time on another CPU which almost unconditionally submits an
> URB. The error callback of the new URB will then schedule the timer
> after it was already stopped.

Hi,

interesting. How sure are you of the details of your analysis?
I am asking because usbnet_stop() does a del_timer_sync().
It is indeed written under the assumption that the upper layer
will have ceased transmission when it stops an interface.
So I am wondering whether the correct fix would not be to make
sure the timer is started.

Regards
Oliver
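
[Editorial illustration, not part of the original message.] The ordering
under discussion can be sketched as a toy userspace model in C. It is not
kernel code and does not reproduce usbnet.c. Every name in it (model_dev,
model_stop, model_tx_error, the "stopping" flag) is hypothetical. It only
shows why cancelling the timer once, as del_timer_sync() does in
usbnet_stop(), would not be enough if an error completion could still
arrive afterwards. Whether that actually happens is the question the thread
leaves open.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these names are hypothetical and not taken from usbnet.c. */
struct model_dev {
    bool timer_pending; /* stands in for the 'delay' timer being armed */
    bool stopping;      /* hypothetical guard flag, an assumption of this sketch */
};

/* Models the stop path: optionally mark the device as stopping, then
 * cancel the timer (the role del_timer_sync() plays in usbnet_stop()). */
static void model_stop(struct model_dev *dev, bool set_guard)
{
    if (set_guard)
        dev->stopping = true;
    dev->timer_pending = false;
}

/* Models a URB error completion, which arms the delay timer.
 * With honor_guard set, it refuses to arm once the device is stopping. */
static void model_tx_error(struct model_dev *dev, bool honor_guard)
{
    if (honor_guard && dev->stopping)
        return;
    dev->timer_pending = true;
}

/* Replays the suspected ordering sequentially and reports whether the
 * timer is still pending after the stop path has finished. */
static bool timer_survives_stop(bool use_guard)
{
    struct model_dev dev = { .timer_pending = false, .stopping = false };

    model_tx_error(&dev, use_guard); /* an earlier error arms the timer */
    assert(dev.timer_pending);

    model_stop(&dev, use_guard);     /* stop cancels it */
    assert(!dev.timer_pending);

    model_tx_error(&dev, use_guard); /* late completion of a URB submitted
                                        concurrently with the stop */
    return dev.timer_pending;
}
```

In this model, timer_survives_stop(false) reports a pending timer after the
stop, while timer_survives_stop(true) does not. A real fix would also have
to close the window with proper locking or ordering against
usbnet_start_xmit(), which a sequential replay like this cannot show.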