Message-ID: <617f64e3-74c8-f98b-3430-bd476867e483@I-love.SAKURA.ne.jp>
Date:   Tue, 5 Jul 2022 14:21:17 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Len Brown <len.brown@...el.com>, Pavel Machek <pavel@....cz>,
        arnd@...db.de, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org,
        Wedson Almeida Filho <wedsonaf@...gle.com>
Subject: Re: [PATCH] char: misc: make misc_open() and misc_register() killable

On 2022/07/04 23:31, Greg KH wrote:
> I don't understand what you are trying to "fix" here.  What is userspace
> doing (as a normal user) that is causing a problem, and what problem is
> it causing and for what device/hardware/driver is this a problem?

The root cause is currently unknown.
This might be another example of a deadlock hidden by device_initialize().

We can see from https://syzkaller.appspot.com/text?tag=CrashReport&x=11feb7e0080000 that
when khungtaskd reports a process blocked waiting for misc_mtx in misc_open(),
another process is holding system_transition_mutex, taken in snapshot_open().

----------------------------------------
INFO: task syz-executor.4:21922 blocked for more than 143 seconds.
      Not tainted 5.19.0-rc4-syzkaller-00187-g089866061428 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.4  state:D stack:28408 pid:21922 ppid:  3666 flags:0x00000004

2 locks held by syz-executor.5/21906:
 #0: ffffffff8c82f708 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5f/0x4a0 drivers/char/misc.c:107
 #1: ffffffff8bc536e8 (system_transition_mutex){+.+.}-{3:3}, at: snapshot_open+0x3b/0x2a0 kernel/power/user.c:54
1 lock held by syz-executor.4/21922:
 #0: ffffffff8c82f708 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5f/0x4a0 drivers/char/misc.c:107
----------------------------------------

Possible locations where snapshot_open() might sleep with system_transition_mutex held are
pm_notifier_call_chain_robust(), wait_for_device_probe() and create_basic_memory_bitmaps().
I think we can exclude pm_notifier_call_chain_robust(), because lockdep does not report
that the process is holding "struct blocking_notifier_head"->rwsem. I suspect that
the process is sleeping in wait_for_device_probe(), which waits for probe operations to finish.

----------------------------------------
void wait_for_device_probe(void)
{
	/* wait for the deferred probe workqueue to finish */
	flush_work(&deferred_probe_work);

	/* wait for the known devices to complete their probing */
	wait_event(probe_waitqueue, atomic_read(&probe_count) == 0);
	async_synchronize_full();
}
----------------------------------------

> 
> Yes, you can sleep in open(), but you shouldn't sleep long, if at all
> possible as it can be annoying.  So why not fix up the offending driver
> not to sleep too long?

We can't predict how long snapshot_open() sleeps inside wait_for_device_probe().
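
To illustrate why an unbounded wait on misc_mtx is a problem, here is a minimal
userspace sketch (pthreads, not kernel code). Everything in it is an assumption
made for illustration: model_misc_mtx, slow_holder(), lock_with_deadline() and
second_open_attempt() are hypothetical names, and a deadline stands in for the
fatal signal that a killable wait in the kernel (for example
mutex_lock_killable()) would react to.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

/* Userspace stand-in for misc_mtx; not the kernel object. */
static pthread_mutex_t model_misc_mtx = PTHREAD_MUTEX_INITIALIZER;

struct holder_ctl {
	sem_t acquired;	/* posted once the holder owns the mutex */
	sem_t release;	/* posted to let the holder finish */
};

/* Models snapshot_open(): owns the mutex while blocked on something else. */
static void *slow_holder(void *arg)
{
	struct holder_ctl *ctl = arg;

	pthread_mutex_lock(&model_misc_mtx);
	sem_post(&ctl->acquired);
	sem_wait(&ctl->release);	/* stands in for wait_for_device_probe() */
	pthread_mutex_unlock(&model_misc_mtx);
	return NULL;
}

/* Bounded lock attempt: 0 on success, ETIMEDOUT if the deadline passes. */
static int lock_with_deadline(pthread_mutex_t *m, long timeout_ms)
{
	struct timespec ts;

	assert(timeout_ms >= 0);
	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += timeout_ms / 1000;
	ts.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (ts.tv_nsec >= 1000000000L) {
		ts.tv_sec += 1;
		ts.tv_nsec -= 1000000000L;
	}
	return pthread_mutex_timedlock(m, &ts);
}

/* A second opener tries the lock while the holder is stuck; returns its result. */
static int second_open_attempt(long timeout_ms)
{
	struct holder_ctl ctl;
	pthread_t tid;
	int rc;

	sem_init(&ctl.acquired, 0, 0);
	sem_init(&ctl.release, 0, 0);
	if (pthread_create(&tid, NULL, slow_holder, &ctl))
		return -1;
	sem_wait(&ctl.acquired);	/* holder now owns the mutex */

	rc = lock_with_deadline(&model_misc_mtx, timeout_ms);
	if (rc == 0)
		pthread_mutex_unlock(&model_misc_mtx);

	sem_post(&ctl.release);		/* unblock the holder and clean up */
	pthread_join(tid, NULL);
	sem_destroy(&ctl.acquired);
	sem_destroy(&ctl.release);
	return rc;
}
```

Calling second_open_attempt(50) while the holder is stuck returns ETIMEDOUT
instead of blocking forever. With a plain pthread_mutex_lock() the caller would
stay blocked for as long as the holder sleeps, which is the situation
khungtaskd is reporting here.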

Looking at the above-mentioned report again, a common pattern seems to be that one process is
inside input_register_handle() and another process is inside input_close_device(),
and both are listed as holding the same &dev->mutex#2 object. Judging from the code,
input_register_handle() does not sleep with dev->mutex held, so presumably
input_close_device() is holding dev->mutex and input_register_handle() is
waiting for input_close_device() to release dev->mutex.
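
The cross-check above (finding a lock address that is listed under more than
one task) can be done mechanically. Below is a rough sketch in plain C; the
helpers parse_holds() and tasks_listing() are hypothetical names of mine, and
the parser only assumes the "N locks held by task/pid:" and " #idx: address"
line shapes visible in this report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_HOLDS 128

struct hold {
	char task[64];	/* e.g. "acpid/2962" */
	char addr[32];	/* lock address as printed, e.g. "ffff8880447d52c0" */
};

/* lockdep repeats a lock index once per inlined frame, so skip repeats. */
static int already_seen(const struct hold *h, int n,
			const char *task, const char *addr)
{
	for (int i = 0; i < n; i++)
		if (!strcmp(h[i].task, task) && !strcmp(h[i].addr, addr))
			return 1;
	return 0;
}

/* Parse a "locks held by" dump into unique (task, address) pairs. */
static int parse_holds(const char *dump, struct hold *out, int max)
{
	char line[512], task[64] = "", addr[32];
	int n = 0, idx;

	assert(dump && out);
	while (*dump) {
		size_t len = strcspn(dump, "\n");
		size_t copy = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
		const char *by;

		memcpy(line, dump, copy);
		line[copy] = '\0';
		dump += len + (dump[len] == '\n');

		by = strstr(line, " held by ");
		if (by) {
			/* Header line: task name runs up to the trailing ':' */
			size_t tl;

			snprintf(task, sizeof(task), "%s", by + 9);
			tl = strlen(task);
			if (tl && task[tl - 1] == ':')
				task[tl - 1] = '\0';
		} else if (task[0] &&
			   sscanf(line, " #%d: %31s", &idx, addr) == 2) {
			if (n < max && !already_seen(out, n, task, addr)) {
				strcpy(out[n].task, task);
				strcpy(out[n].addr, addr);
				n++;
			}
		}
	}
	return n;
}

/* Number of distinct tasks under which the given lock address is listed. */
static int tasks_listing(const struct hold *h, int n, const char *addr)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (!strcmp(h[i].addr, addr))
			count++;
	return count;
}
```

Feeding it the kworker/1:0/22 and acpid/2962 entries from the report and asking
tasks_listing() about ffff8880447d52c0 should give 2, which is the shared
&dev->mutex#2 discussed above. It cannot tell which of the tasks actually owns
the lock; that still has to be judged from the code.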

Therefore, there might be a race or deadlock between these two processes.
If &dev->mutex#2 were subjected to device_initialize() magic, lockdep would not be
able to catch the deadlock. But I'm not familiar with the device management code...

Maybe input_close_device() is failing to release dev->mutex for some reason?
Or maybe nothing is wrong and it is simply too slow to wait for?

----------------------------------------
7 locks held by kworker/1:0/22:
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1280 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:636 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:663 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: process_one_work+0x87a/0x1610 kernel/workqueue.c:2260
 #1: ffffc900001c7da8 ((work_completion)(&hub->events)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610 kernel/workqueue.c:2264
 #2: ffff8881479d4190 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:835 [inline]
 #2: ffff8881479d4190 (&dev->mutex){....}-{3:3}, at: hub_event+0x1c1/0x4690 drivers/usb/core/hub.c:5691
 #3: ffff888044782190 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:835 [inline]
 #3: ffff888044782190 (&dev->mutex){....}-{3:3}, at: __device_attach+0x76/0x530 drivers/base/dd.c:964
 #4: ffff8880447d2118 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:835 [inline]
 #4: ffff8880447d2118 (&dev->mutex){....}-{3:3}, at: __device_attach+0x76/0x530 drivers/base/dd.c:964
 #5: ffffffff8ceafca8 (input_mutex){+.+.}-{3:3}, at: input_register_device.cold+0x34/0x304 drivers/input/input.c:2378
 #6: ffff8880447d52c0 (&dev->mutex#2){+.+.}-{3:3}, at: input_register_handle+0x6d/0x510 drivers/input/input.c:2544

2 locks held by acpid/2962:
 #0: ffff888042a28158 (&joydev->mutex){+.+.}-{3:3}, at: joydev_close_device drivers/input/joydev.c:220 [inline]
 #0: ffff888042a28158 (&joydev->mutex){+.+.}-{3:3}, at: joydev_release+0x187/0x290 drivers/input/joydev.c:252
 #1: ffff8880447d52c0 (&dev->mutex#2){+.+.}-{3:3}, at: input_close_device+0x42/0x1f0 drivers/input/input.c:726

7 locks held by kworker/1:11/5743:
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1280 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:636 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:663 [inline]
 #0: ffff888011a65d38 ((wq_completion)usb_hub_wq){+.+.}-{0:0}, at: process_one_work+0x87a/0x1610 kernel/workqueue.c:2260
 #1: ffffc900153c7da8 ((work_completion)(&hub->events)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610 kernel/workqueue.c:2264
 #2: ffff888021384190 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:835 [inline]
 #2: ffff888021384190 (&dev->mutex){....}-{3:3}, at: hub_event+0x1c1/0x4690 drivers/usb/core/hub.c:5691
 #3: ffff8880468a4190 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:835 [inline]
 #3: ffff8880468a4190 (&dev->mutex){....}-{3:3}, at: __device_attach+0x76/0x530 drivers/base/dd.c:964
 #4: ffff8880468a6118 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:835 [inline]
 #4: ffff8880468a6118 (&dev->mutex){....}-{3:3}, at: __device_attach+0x76/0x530 drivers/base/dd.c:964
 #5: ffff8880255f1a20 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:835 [inline]
 #5: ffff8880255f1a20 (&dev->mutex){....}-{3:3}, at: __device_attach+0x76/0x530 drivers/base/dd.c:964
 #6: ffffffff8ceafca8 (input_mutex){+.+.}-{3:3}, at: input_register_device.cold+0x34/0x304 drivers/input/input.c:2378
----------------------------------------
