lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 19 Feb 2020 09:13:23 +0800
From:   Yunsheng Lin <linyunsheng@...wei.com>
To:     Jason Gunthorpe <jgg@...pe.ca>, Lang Cheng <chenglang@...wei.com>
CC:     <dledford@...hat.com>, <leon@...nel.org>, <davem@...emloft.net>,
        <salil.mehta@...wei.com>, <yisen.zhuang@...wei.com>,
        <linuxarm@...wei.com>, <netdev@...r.kernel.org>,
        <linux-rdma@...r.kernel.org>
Subject: Re: [RFC rdma-next] RDMA/core: Add attribute WQ_MEM_RECLAIM to
 workqueue "infiniband"

On 2020/2/18 23:31, Jason Gunthorpe wrote:
> On Tue, Feb 18, 2020 at 11:35:35AM +0800, Lang Cheng wrote:
>> The hns3 driver sets "hclge_service_task" workqueue with
>> WQ_MEM_RECLAIM flag in order to guarantee forward progress
>> under memory pressure.
> 
> Don't do that. WQ_MEM_RECLAIM is only to be used by things interlinked
> with reclaimed processing.
> 
> Work on queues marked with WQ_MEM_RECLAIM can't use GFP_KERNEL
> allocations, can't do certain kinds of sleeps, can't hold certain
> kinds of locks, etc.

>From mlx5 driver, it seems that there is GFP_KERNEL allocations
on wq marked with WQ_MEM_RECLAIM too:

mlx5e_tx_timeout_work() -> mlx5e_safe_reopen_channels() ->
mlx5e_safe_switch_channels() -> mlx5e_open_channels()

kcalloc() is called with GFP_KERNEL in mlx5e_open_channels(),
and mlx5e_tx_timeout_work() is queued with priv->wq, which is
allocated with WQ_MEM_RECLAIM flags. see:

mlx5e_netdev_init() -> create_singlethread_workqueue()


>From the comment in kernel/workqueue.c, the work queued with
wq with WQ_MEM_RECLAIM flag set seems to be executed without
blocking under some rare case. I still not quite understand
the comment, and I can not find any doc that point out the
GFP_KERNEL allocations can not be done in wq with WQ_MEM_RECLAIM
yet. Is there any doc that mentions that GFP_KERNEL allocations
can not be done in wq with WQ_MEM_RECLAIM?


/**
 * rescuer_thread - the rescuer thread function
 * @__rescuer: self
 *
 * Workqueue rescuer thread function.  There's one rescuer for each
 * workqueue which has WQ_MEM_RECLAIM set.
 *
 * Regular work processing on a pool may block trying to create a new
 * worker which uses GFP_KERNEL allocation which has slight chance of
 * developing into deadlock if some works currently on the same queue
 * need to be processed to satisfy the GFP_KERNEL allocation.  This is
 * the problem rescuer solves.
 *
 * When such condition is possible, the pool summons rescuers of all
 * workqueues which have works queued on the pool and let them process
 * those works so that forward progress can be guaranteed.
 *
 * This should happen rarely.
 *
 * Return: 0
 */


The below is the reason we add the sets "hclge_service_task" workqueue
with WQ_MEM_RECLAIM through analysing why other ethernet drivers has
allocated wq with WQ_MEM_RECLAIM flag, I may be wrong about that:

hns3 ethernet driver may be used as the low level transport of a
network file system, memory reclaim data path may depend on the
worker in hns3 driver to bring back the ethernet link so that it flush
the some cache to network based disk.

> 
> Jason
> 
> 

Powered by blists - more mailing lists