Date:   Mon, 12 Apr 2021 09:31:49 -0300
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Daniel Wagner <dwagner@...e.de>
Cc:     linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...com>,
        Christoph Hellwig <hch@....de>,
        Sagi Grimberg <sagi@...mberg.me>,
        Steve Wise <swise@...ngridcomputing.com>,
        Leon Romanovsky <leon@...nel.org>,
        Potnuri Bharat Teja <bharat@...lsio.com>
Subject: Re: [PATCH] nvme: Drop WQ_MEM_RECLAIM flag from core workqueues

On Mon, Apr 12, 2021 at 02:23:30PM +0200, Daniel Wagner wrote:
> Drop the WQ_MEM_RECLAIM flag as it is not needed and introduces
> warnings.
> 
> The documentation says "all wq which might be used in the memory
> reclaim paths MUST have this flag set. The wq is guaranteed to have at
> least one execution context regardless of memory pressure."
> 
> By setting WQ_MEM_RECLAIM the threads are ready be running during
> early init. The claim it guarantees at least one execution context
> regardless of memory pressure is not supported by the implementation.
> 
> As the nvme core does not depend on early init we can remove the
> WQ_MEM_RECLAIM flag. This resolves a warning in the rdma path:

What does early init have to do with WQ_MEM_RECLAIM?

WQ_MEM_RECLAIM is required when any thread in a reclaim context goes
to sleep waiting for a WQ to complete, for instance by calling
flush_workqueue(), among many other ways.

The sleeping reclaim context must be guaranteed that the work can be
completed without the work, work queue machinery, or anything the work
has become interconnected with, recursing back into a reclaim.
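
As a rough illustration of the rule above, here is a toy userspace model
in C. It is not kernel code: struct toy_wq, toy_work_can_run() and
toy_flush_is_safe() are made-up names, and the model assumes that the only
thing WQ_MEM_RECLAIM buys is a rescuer created up front that can run work
when no new worker can be allocated. It does not model the second half of
the requirement, that the work itself must not recurse into reclaim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel APIs. */
struct toy_wq {
	bool mem_reclaim;	/* stands in for WQ_MEM_RECLAIM */
};

/*
 * Can queued work make forward progress?  In this model, spawning a
 * new worker needs a memory allocation, which cannot succeed while
 * the system is under memory pressure.  A queue with mem_reclaim set
 * has a rescuer that was created up front, so it does not have to
 * allocate anything to get the work running.
 */
static bool toy_work_can_run(const struct toy_wq *wq, bool memory_pressure)
{
	assert(wq != NULL);
	if (!memory_pressure)
		return true;		/* a fresh worker can be spawned */
	return wq->mem_reclaim;		/* only the pre-made rescuer is left */
}

/*
 * Is it safe for a waiter to sleep until the queue drains (the
 * analogue of calling flush_workqueue())?  A waiter in a reclaim
 * context may only wait on a queue that keeps making progress under
 * memory pressure; otherwise reclaim waits on the work and the work
 * waits on reclaim.
 */
static bool toy_flush_is_safe(const struct toy_wq *wq, bool waiter_in_reclaim)
{
	assert(wq != NULL);
	if (!waiter_in_reclaim)
		return true;
	return toy_work_can_run(wq, true);
}
```

In this model, flushing a queue without mem_reclaim from a reclaim context
is the one combination that toy_flush_is_safe() rejects, which is the case
the analysis for dropping the flag would have to rule out.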

IIRC the issue here was some destroy or flush work in some error
condition that happened to be under a reclaim context?

I don't see the kind of analysis I'd expect in this commit message to
justify this change.

Jason
