Date:   Tue, 5 Jul 2022 16:16:35 +0200
From:   Greg KH <gregkh@...uxfoundation.org>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Len Brown <len.brown@...el.com>, Pavel Machek <pavel@....cz>,
        arnd@...db.de, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org,
        Wedson Almeida Filho <wedsonaf@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [PATCH] char: misc: make misc_open() and misc_register() killable

On Tue, Jul 05, 2022 at 11:01:38PM +0900, Tetsuo Handa wrote:
> On 2022/07/05 14:21, Tetsuo Handa wrote:
> > Possible locations where snapshot_open() might sleep with system_transition_mutex held are
> > pm_notifier_call_chain_robust()/wait_for_device_probe()/create_basic_memory_bitmaps().
> > But I think we can exclude pm_notifier_call_chain_robust() because lockdep does not report
> > that that process is holding "struct blocking_notifier_head"->rwsem. I suspect that
> > that process is sleeping at wait_for_device_probe(), for it waits for probe operations.
> > 
> > ----------------------------------------
> > void wait_for_device_probe(void)
> > {
> > 	/* wait for the deferred probe workqueue to finish */
> > 	flush_work(&deferred_probe_work);
> > 
> > 	/* wait for the known devices to complete their probing */
> > 	wait_event(probe_waitqueue, atomic_read(&probe_count) == 0);
> > 	async_synchronize_full();
> > }
> > ----------------------------------------
> 
> syzbot confirmed that snapshot_open() is unable to proceed due to
> atomic_read(&probe_count) == 2 for 145 seconds.
> 
> ----------------------------------------
> [   86.794300][ T4209] Held system_transition_mutex.
> [   86.821486][ T4209] Calling wait_for_device_probe()
> [   86.841374][ T4209] Calling flush_work(&deferred_probe_work)
> [   86.867398][ T4209] Calling wait_event(probe_waitqueue)
> [   87.966188][ T4209] Calling probe_count=2
> (...snipped...)
> [  233.554473][ T4209] Calling probe_count=2
> [  234.444800][   T28] INFO: task syz-executor.4:4146 blocked for more than 143 seconds.
> ----------------------------------------
> 
> Apart from whether we should fuzz snapshot code or not,
> there seems to be a bug that causes wait_for_device_probe() to hang.

What else is going on in the system at this point in time?  Are devices
still being added as part of boot init sequences?  Or has boot finished
properly and these are devices being removed?

Some device is being probed at the moment, so maybe we have a deadlock
somewhere here...
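
To make the symptom in the quoted log concrete, here is a minimal userspace
sketch of "wait until an outstanding-probe counter drops to zero, and report
the value while it is stuck". It is only a model under stated assumptions:
probe_count_model, wait_for_probes_model() and finish_one_probe() are
hypothetical names, not driver-core symbols, and the loop polls a bounded
number of times, whereas the real wait_for_device_probe() sleeps in
wait_event() with no timeout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's probe_count (not the real variable). */
static atomic_int probe_count_model;

/*
 * Check the counter up to max_polls times, printing its value whenever it is
 * still nonzero. poll_hook, if non-NULL, runs after each unsuccessful check so
 * that a caller can simulate a probe completing in the meantime.
 * Returns true if the counter reached zero, false if we gave up.
 */
static bool wait_for_probes_model(int max_polls, void (*poll_hook)(void))
{
	assert(max_polls >= 0);

	for (int i = 0; i < max_polls; i++) {
		int n = atomic_load(&probe_count_model);

		if (n == 0)
			return true;
		printf("probe_count=%d\n", n);
		if (poll_hook)
			poll_hook();
	}
	return atomic_load(&probe_count_model) == 0;
}

/* Simulates one in-flight probe finishing and dropping its reference. */
static void finish_one_probe(void)
{
	atomic_fetch_sub(&probe_count_model, 1);
}
```

With the counter set to 2 and no hook, the model gives up and returns false,
which corresponds to the repeated probe_count=2 lines above. Passing
finish_one_probe as the hook lets the counter drain, and the wait returns true.
So the question is which two probes never get to decrement the count.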

thanks,

greg k-h
