Message-ID: <20200219130455.GL31668@ziepe.ca>
Date:   Wed, 19 Feb 2020 09:04:55 -0400
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Yunsheng Lin <linyunsheng@...wei.com>
Cc:     Leon Romanovsky <leon@...nel.org>,
        Lang Cheng <chenglang@...wei.com>, dledford@...hat.com,
        davem@...emloft.net, salil.mehta@...wei.com,
        yisen.zhuang@...wei.com, linuxarm@...wei.com,
        netdev@...r.kernel.org, linux-rdma@...r.kernel.org,
        Saeed Mahameed <saeedm@...lanox.com>, bhaktipriya96@...il.com,
        tj@...nel.org, Jeff Kirsher <jeffrey.t.kirsher@...el.com>
Subject: Re: [RFC rdma-next] RDMA/core: Add attribute WQ_MEM_RECLAIM to
 workqueue "infiniband"

On Wed, Feb 19, 2020 at 03:40:59PM +0800, Yunsheng Lin wrote:
> +cc Bhaktipriya, Tejun and Jeff
> 
> On 2020/2/19 14:45, Leon Romanovsky wrote:
> > On Wed, Feb 19, 2020 at 09:13:23AM +0800, Yunsheng Lin wrote:
> >> On 2020/2/18 23:31, Jason Gunthorpe wrote:
> >>> On Tue, Feb 18, 2020 at 11:35:35AM +0800, Lang Cheng wrote:
> >>>> The hns3 driver sets the "hclge_service_task" workqueue with the
> >>>> WQ_MEM_RECLAIM flag in order to guarantee forward progress
> >>>> under memory pressure.
> >>>
> >>> Don't do that. WQ_MEM_RECLAIM is only to be used by things interlinked
> >>> with reclaim processing.
> >>>
> >>> Work on queues marked with WQ_MEM_RECLAIM can't use GFP_KERNEL
> >>> allocations, can't do certain kinds of sleeps, can't hold certain
> >>> kinds of locks, etc.
> 
> By the way, what kinds of sleeps and locks cannot be done in work
> queued to a wq marked with WQ_MEM_RECLAIM?

Anything that recurses back into a blocking allocation function.

If we are freeing memory because an allocation (e.g. GFP_KERNEL) failed,
then we cannot go back into a blockable allocation while trying to make
progress on the first failing allocation. That is a deadlock.

So work on such a WQ cannot take any locks that are held around
GFP_KERNEL allocations in any other threads.

Unfortunately we don't have a lockdep test for this by default.
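
Roughly, the pattern that deadlocks looks like the sketch below. This is
made up for illustration only; demo_lock, demo_reclaim_work() and
demo_other_path() are hypothetical names, not anything in hns3 or the
RDMA core.

#include <linux/workqueue.h>
#include <linux/mutex.h>
#include <linux/slab.h>

static DEFINE_MUTEX(demo_lock);
/* Created elsewhere with alloc_workqueue("demo", WQ_MEM_RECLAIM, 0). */
static struct workqueue_struct *demo_wq;

/* Runs on the WQ_MEM_RECLAIM queue, possibly on behalf of reclaim. */
static void demo_reclaim_work(struct work_struct *work)
{
	/*
	 * Deadlock: another thread may hold demo_lock while blocked in a
	 * GFP_KERNEL allocation that is waiting on reclaim, and reclaim
	 * is waiting on this work item to run.
	 */
	mutex_lock(&demo_lock);
	/* ... do whatever forward progress requires ... */
	mutex_unlock(&demo_lock);
}

/* Some unrelated driver path, running in another thread. */
static void demo_other_path(void)
{
	mutex_lock(&demo_lock);
	/* Under memory pressure this can block in direct reclaim. */
	kfree(kmalloc(4096, GFP_KERNEL));
	mutex_unlock(&demo_lock);
}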

> >> The hns3 ethernet driver may be used as the low-level transport of a
> >> network file system, so the memory reclaim data path may depend on the
> >> worker in the hns3 driver to bring back the ethernet link so that it can
> >> flush caches to a network-based disk.
> > 
> > It is unlikely that this "network file system" dependency on the ethernet link is correct.
> 
> Ok, I may be wrong about the above use case, but the commit below
> explicitly states that network devices may be used in the memory reclaim
> path.

I don't really know how this works when the networking stack
intersects with the block stack.

Forward progress on something like NVMe-oF requires a lot of stuff to
be working, and presumably working under reclaim.

But we can't make everything WQ_MEM_RECLAIM-safe, because then we could
never do a GFP_KERNEL allocation.

I have never seen specific guidance on what to do here; I assume it is
broken.
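
As a hypothetical sketch of what "everything reclaim-safe" would mean in
practice (demo_wq and demo_work_fn are made-up names), a work item on a
WQ_MEM_RECLAIM queue would have to avoid GFP_KERNEL entirely and fall
back to non-blocking allocations or preallocated buffers:

#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/slab.h>

static struct workqueue_struct *demo_wq;

static void demo_work_fn(struct work_struct *work)
{
	/*
	 * On a WQ_MEM_RECLAIM queue this must not recurse into blocking
	 * reclaim, so GFP_KERNEL is out; GFP_NOWAIT/GFP_ATOMIC (or a
	 * preallocated buffer / mempool) is all that is left.
	 */
	void *buf = kmalloc(256, GFP_NOWAIT);

	if (!buf)
		return;	/* must cope with failure rather than block */
	/* ... */
	kfree(buf);
}

static int __init demo_init(void)
{
	/* The rescuer thread is what guarantees forward progress. */
	demo_wq = alloc_workqueue("demo_wq", WQ_MEM_RECLAIM, 0);
	return demo_wq ? 0 : -ENOMEM;
}
module_init(demo_init);
MODULE_LICENSE("GPL");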

Jason
