[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d6da6709-d799-4be3-a695-850bddd6eb24@rowland.harvard.edu>
Date: Sun, 13 Jul 2025 11:21:24 -0400
From: Alan Stern <stern@...land.harvard.edu>
To: Hillf Danton <hdanton@...a.com>
Cc: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
syzbot+592e2ab8775dbe0bf09a@...kaller.appspotmail.com,
LKML <linux-kernel@...r.kernel.org>, Sean Young <sean@...s.org>,
Mauro Carvalho Chehab <mchehab@...nel.org>
Subject: Re: [PATCH] media: imon: make send_packet() more robust
On Sun, Jul 13, 2025 at 04:11:47PM +0800, Hillf Danton wrote:
> [loop Alan in]
I assume you're interested in the question of when to avoid resubmitting
URBs.
> On Sun, 13 Jul 2025 16:50:08 +0900 Tetsuo Handa wrote:
> > syzbot is reporting that imon has three problems which result in hung tasks
> > due to forever holding device lock.
> >
> > First problem is that when usb_rx_callback_intf0() once got -EPROTO error
> > after ictx->dev_present_intf0 became true, usb_rx_callback_intf0()
> > resubmits urb after printk(), and resubmitted urb causes
> > usb_rx_callback_intf0() to again get -EPROTO error. This results in
> > printk() flooding (RCU stalls).
> >
> > Commit 92f461517d22 ("media: ir_toy: do not resubmit broken urb") changed
> > ir_toy module not to resubmit when irtoy_in_callback() got -EPROTO error.
> > We should do similar thing for imon.
> >
> > Basically, I think that imon should refrain from resubmitting urb when
> > callback function got an error. But since I don't know which error codes
> > should retry resubmitting urb, this patch handles only union of error codes
> > chosen from modules in drivers/media/rc/ directory which handles -EPROTO
> > error (i.e. ir_toy, mceusb and igorplugusb).
In theory it's okay to resubmit _if_ the driver has a robust
error-recovery scheme (such as giving up after some fixed limit on the
number of errors or after some fixed time has elapsed, perhaps with a
time delay to prevent a flood of errors). Most drivers don't bother to
do this; they simply give up right away. This makes them more
vulnerable to short-term noise interference during USB transfers, but in
reality such interference is quite rare. There's nothing really wrong
with giving up right away.
As to which error codes drivers should pay attention to... In most
cases they only look at -EPROTO. According to
Documentation/driver-api/usb/error-codes.rst, -EILSEQ and -ETIME are
also possible errors when a device has been unplugged, so it wouldn't
hurt to check for them too. But most host controller drivers don't
bother to issue them; -EPROTO is by far the most common error code
following an unplug.
> > We need to decide whether to call usb_unlink_urb() when we got -EPROTO
> > error. ir_toy and mceusb call usb_unlink_urb() but igorplugusb does not
> > due to commit 5e4029056263 ("media: igorplugusb: remove superfluous
> > usb_unlink_urb()"). This patch calls usb_unlink_urb() because description
> > of usb_unlink_urb() suggests that it is OK to call.
If the error occurred because the device was unplugged then unlinking
the outstanding URBs isn't necessary; the USB core will unlink them for
you after the device's parent hub reports that the unplug took place.
> > Second problem is that when usb_rx_callback_intf0() once got -EPROTO error
> > before ictx->dev_present_intf0 becomes true, usb_rx_callback_intf0() always
> > resubmits urb due to commit 8791d63af0cf ("[media] imon: don't wedge
> > hardware after early callbacks"). If some errors should stop resubmitting
> > urb regardless of whether configuring the hardware has completed or not,
> > what that commit is doing is wrong. The ictx->dev_present_intf0 test was
> > introduced by commit 6f6b90c9231a ("[media] imon: don't parse scancodes
> > until intf configured"), but that commit did not call usb_unlink_urb()
> > when usb_rx_callback_intf0() got an error. Move the ictx->dev_present_intf0
> > test to immediately before imon_incoming_packet() so that we can call
> > usb_unlink_urb() as needed, or the first problem explained above happens
> > without printk() flooding (i.e. hung task).
It seems odd for a driver to set up normal communications with a device
before the device has been configured, but of course that decision is up
to the creators and maintainers of the driver.
Alan Stern
Powered by blists - more mailing lists