Date: Wed, 6 Jul 2022 19:26:28 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: Greg KH <gregkh@...uxfoundation.org>,
	"Rafael J. Wysocki" <rafael@...nel.org>
Cc: Len Brown <len.brown@...el.com>, Pavel Machek <pavel@....cz>,
	Arnd Bergmann <arnd@...db.de>, linux-kernel@...r.kernel.org,
	linux-pm@...r.kernel.org, Wedson Almeida Filho <wedsonaf@...gle.com>,
	Dmitry Vyukov <dvyukov@...gle.com>,
	Arjan van de Ven <arjan@...ux.intel.com>
Subject: Re: [PATCH] char: misc: make misc_open() and misc_register() killable

On 2022/07/06 15:34, Greg KH wrote:
> On Wed, Jul 06, 2022 at 03:21:15PM +0900, Tetsuo Handa wrote:
>> How should we fix this problem?
>
> We can decrease the timeout in usb_stor_msg_common().  I imagine that if
> that timeout is ever hit in this sequence, then all will recover, right?
> Try decreasing it to a sane number and see what happens.

Yes, all recovers with the diff below.

------------------------------------------------------------
diff --git a/drivers/usb/storage/transport.c b/drivers/usb/storage/transport.c
index 1928b3918242..d2a192306e0c 100644
--- a/drivers/usb/storage/transport.c
+++ b/drivers/usb/storage/transport.c
@@ -164,7 +164,7 @@ static int usb_stor_msg_common(struct us_data *us, int timeout)
 
 	/* wait for the completion of the URB */
 	timeleft = wait_for_completion_interruptible_timeout(
-			&urb_done, timeout ? : MAX_SCHEDULE_TIMEOUT);
+			&urb_done, timeout ? : 5 * HZ);
 
 	clear_bit(US_FLIDX_URB_ACTIVE, &us->dflags);
 
------------------------------------------------------------

But

>> Anyway,
>>
>>         /*
>>          * Resuming.  We may need to wait for the image device to
>>          * appear.
>>          */
>>         wait_for_device_probe();
>>
>> in snapshot_open() will sleep forever if some device became unresponsive.
>>

wait_for_device_probe() in snapshot_open() was added by commit c751085943362143
("PM/Hibernate: Wait for SCSI devices scan to complete during resume"), and
that commit did not take into account the possibility of unresponsive hardware.
  "In addition, if the resume from hibernation is userland-driven, it's
  better to wait for all device probes in the kernel to complete before
  attempting to open the resume device."

It is trivial to make e.g. atomic_read(&probe_count) == 10, which means that
an acceptable timeout for usb_stor_msg_common() may no longer be an acceptable
timeout for wait_for_device_probe(). Unlike flush_workqueue(),
wait_for_device_probe() can wait forever if new probe requests keep coming in
while it waits for existing probe requests to complete. Therefore, I think we
should introduce a timeout on the wait_for_device_probe() side as well.

I would like to propose the changes below, as 3 patches, as fixes for this
problem. Since there are 13 wait_for_device_probe() callers, maybe we want both
killable and uninterruptible versions and pass the timeout as an argument...

------------------------------------------------------------
 drivers/base/dd.c               | 3 ++-
 drivers/char/misc.c             | 9 ++++++---
 drivers/usb/storage/transport.c | 2 +-
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 3fc3b5940bb3..67e08b381ee2 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -723,7 +723,8 @@ void wait_for_device_probe(void)
 	flush_work(&deferred_probe_work);
 
 	/* wait for the known devices to complete their probing */
-	wait_event(probe_waitqueue, atomic_read(&probe_count) == 0);
+	wait_event_killable_timeout(probe_waitqueue,
+				    atomic_read(&probe_count) == 0, 60 * HZ);
 	async_synchronize_full();
 }
 EXPORT_SYMBOL_GPL(wait_for_device_probe);
diff --git a/drivers/char/misc.c b/drivers/char/misc.c
index ca5141ed5ef3..6430c534a1cb 100644
--- a/drivers/char/misc.c
+++ b/drivers/char/misc.c
@@ -104,7 +104,8 @@ static int misc_open(struct inode *inode, struct file *file)
 	int err = -ENODEV;
 	const struct file_operations *new_fops = NULL;
 
-	mutex_lock(&misc_mtx);
+	if (mutex_lock_killable(&misc_mtx))
+		return -EINTR;
 
 	list_for_each_entry(c, &misc_list, list) {
 		if (c->minor == minor) {
@@ -116,7 +117,8 @@ static int misc_open(struct inode *inode, struct file *file)
 	if (!new_fops) {
 		mutex_unlock(&misc_mtx);
 		request_module("char-major-%d-%d", MISC_MAJOR, minor);
-		mutex_lock(&misc_mtx);
+		if (mutex_lock_killable(&misc_mtx))
+			return -EINTR;
 
 		list_for_each_entry(c, &misc_list, list) {
 			if (c->minor == minor) {
@@ -178,7 +180,8 @@ int misc_register(struct miscdevice *misc)
 
 	INIT_LIST_HEAD(&misc->list);
 
-	mutex_lock(&misc_mtx);
+	if (mutex_lock_killable(&misc_mtx))
+		return -EINTR;
 
 	if (is_dynamic) {
 		int i = find_first_zero_bit(misc_minors, DYNAMIC_MINORS);
diff --git a/drivers/usb/storage/transport.c b/drivers/usb/storage/transport.c
index 1928b3918242..d2a192306e0c 100644
--- a/drivers/usb/storage/transport.c
+++ b/drivers/usb/storage/transport.c
@@ -164,7 +164,7 @@ static int usb_stor_msg_common(struct us_data *us, int timeout)
 
 	/* wait for the completion of the URB */
 	timeleft = wait_for_completion_interruptible_timeout(
-			&urb_done, timeout ? : MAX_SCHEDULE_TIMEOUT);
+			&urb_done, timeout ? : 60 * HZ);
 
 	clear_bit(US_FLIDX_URB_ACTIVE, &us->dflags);
 
------------------------------------------------------------
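
The bounded wait proposed for wait_for_device_probe() can be illustrated
outside the kernel. The following is only a userspace sketch with POSIX
threads, not kernel code and not part of the patch: probe_begin(),
probe_end() and wait_for_probes_timeout() are hypothetical names made up for
this illustration, and pthread_cond_timedwait() merely stands in for the
timeout half of wait_event_killable_timeout() (the "killable" half is not
modelled at all).

```c
/* Userspace analogy only; every name below is hypothetical. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t probe_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t probe_cond = PTHREAD_COND_INITIALIZER;
static int probe_count;	/* number of probes still in flight */

/* Account for a probe that has just started. */
void probe_begin(void)
{
	pthread_mutex_lock(&probe_lock);
	probe_count++;
	pthread_mutex_unlock(&probe_lock);
}

/* Account for a finished probe; wake waiters once none are left. */
void probe_end(void)
{
	pthread_mutex_lock(&probe_lock);
	assert(probe_count > 0);
	if (--probe_count == 0)
		pthread_cond_broadcast(&probe_cond);
	pthread_mutex_unlock(&probe_lock);
}

/*
 * Wait until probe_count drops to zero, but for at most timeout_ms.
 * Returns true if all probes completed, false if the wait gave up
 * (timeout or any other error from pthread_cond_timedwait()).
 */
bool wait_for_probes_timeout(long timeout_ms)
{
	struct timespec deadline;
	bool done;
	int err = 0;

	/* The default condition variable clock is CLOCK_REALTIME. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&probe_lock);
	/* The deadline is absolute, so new probes cannot extend the wait. */
	while (probe_count != 0 && err == 0)
		err = pthread_cond_timedwait(&probe_cond, &probe_lock,
					     &deadline);
	done = (probe_count == 0);
	pthread_mutex_unlock(&probe_lock);
	return done;
}
```

Calling probe_begin() without a matching probe_end() models an unresponsive
device: wait_for_probes_timeout(50) then gives up after roughly 50 ms instead
of sleeping forever, which is the behaviour the dd.c hunk above aims for with
60 * HZ.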