linux-kernel - Re: [RFC] deadlock with flush_work() in UAS

Message-ID: <Pine.LNX.4.44L0.1906181156370.1659-100000@iolanthe.rowland.org>
Date:   Tue, 18 Jun 2019 11:59:39 -0400 (EDT)
From:   Alan Stern <stern@...land.harvard.edu>
To:     Oliver Neukum <oneukum@...e.com>, Tejun Heo <tj@...nel.org>
cc:     USB list <linux-usb@...r.kernel.org>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] deadlock with flush_work() in UAS

Tejun and other workqueue maintainers:

On Tue, 18 Jun 2019, Oliver Neukum wrote:

> On Tuesday, 18.06.2019, at 11:29 -0400, Alan Stern wrote:
> > On Tue, 18 Jun 2019, Oliver Neukum wrote:
> > 
> > > Hi,
> > > 
> > > looking at those deadlocks it looks to me like UAS can
> > > deadlock on itself. What do you think?
> > > 
> > >       Regards
> > >               Oliver
> > > 
> > > From 2d497f662e6c03fe9e0a75e05b64d52514e890b3 Mon Sep 17 00:00:00 2001
> > > From: Oliver Neukum <oneukum@...e.com>
> > > Date: Tue, 18 Jun 2019 15:03:56 +0200
> > > Subject: [PATCH] UAS: fix deadlock in error handling and PM flushing work
> > > 
> > > A SCSI error handler and block runtime PM must not allocate
> > > memory with GFP_KERNEL. Furthermore they must not wait for
> > > tasks allocating memory with GFP_KERNEL.
> > > That means that they cannot share a workqueue with arbitrary tasks.
> > > 
> > > Fix this for UAS using a private workqueue.
> > 
> > I'm not so sure that one long-running task in a workqueue will block 
> > other tasks.  Workqueues have variable numbers of threads, added and 
> > removed on demand.  (On the other hand, when new threads need to be 
> > added the workqueue manager probably uses GFP_KERNEL.)
> 
> Do we have a guarantee it will reschedule already scheduled works?
> The deadlock would be something like
> 
> uas_pre_reset() -> uas_wait_for_pending_cmnds() ->
> flush_work(&devinfo->work) -> kmalloc() -> DEADLOCK
> 
> You can also make this chain with uas_suspend()
> 
> > Even if you disagree, perhaps we should have a global workqueue with a
> > permanently set noio flag.  It could be shared among multiple drivers
> > such as uas and the hub driver for purposes like this.  (In fact, the 
> > hub driver already has its own dedicated workqueue.)
> 
> That is a good idea. But does UAS need WQ_MEM_RECLAIM?

These are good questions, and I don't have the answers.  Perhaps Tejun 
or someone else on LKML can help.

Alan Stern
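
Editorial sketch (not part of the original message): the idea discussed
above, a dedicated workqueue whose work runs with I/O-triggering
allocations disabled, could look roughly like the kernel-module fragment
below. It is untested and needs a kernel build tree to compile. The names
demo_wq, demo_work, demo_work_fn, demo_init and demo_exit are
hypothetical and are not taken from the UAS driver or from Oliver's
patch. Whether UAS actually needs WQ_MEM_RECLAIM is the open question in
this thread; the flag is set here only to show where it would go.

```c
/* Hypothetical sketch: a private workqueue whose work item allocates
 * memory inside a NOIO scope. Illustrative only, not the UAS fix. */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/workqueue.h>
#include <linux/sched/mm.h>
#include <linux/slab.h>

static struct workqueue_struct *demo_wq;	/* hypothetical name */
static struct work_struct demo_work;		/* hypothetical name */

static void demo_work_fn(struct work_struct *work)
{
	unsigned int noio_flags;
	void *buf;

	/* Inside this scope, GFP_KERNEL allocations made by this task
	 * behave like GFP_NOIO, so reclaim will not start new I/O. */
	noio_flags = memalloc_noio_save();

	buf = kmalloc(64, GFP_KERNEL);
	kfree(buf);	/* kfree(NULL) is a no-op, so no check is needed */

	memalloc_noio_restore(noio_flags);
}

static int __init demo_init(void)
{
	/* WQ_MEM_RECLAIM gives the queue a rescuer thread, so queued
	 * work can still make progress when new workers cannot be
	 * created under memory pressure. */
	demo_wq = alloc_workqueue("demo_noio", WQ_MEM_RECLAIM, 0);
	if (!demo_wq)
		return -ENOMEM;

	INIT_WORK(&demo_work, demo_work_fn);
	queue_work(demo_wq, &demo_work);

	/* The flusher now waits only on work from this private queue. */
	flush_work(&demo_work);
	return 0;
}

static void __exit demo_exit(void)
{
	destroy_workqueue(demo_wq);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```

Note that the NOIO scope covers only allocations made by the work
function itself. It does not change how the workqueue core allocates
when it spawns additional worker threads, which is the concern raised
earlier in the thread.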