Message-ID: <82bf9e5f-b798-4d29-8473-c074a34f15b0@linux.dev>
Date: Thu, 16 May 2024 09:37:19 +0200
From: Zhu Yanjun <zyjzyj2000@...il.com>
To: Håkon Bugge <haakon.bugge@...cle.com>,
linux-rdma@...r.kernel.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, rds-devel@....oracle.com
Cc: Jason Gunthorpe <jgg@...pe.ca>, Leon Romanovsky <leon@...nel.org>,
Saeed Mahameed <saeedm@...dia.com>, Tariq Toukan <tariqt@...dia.com>,
"David S . Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Tejun Heo <tj@...nel.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Allison Henderson <allison.henderson@...cle.com>,
Manjunath Patil <manjunath.b.patil@...cle.com>,
Mark Zhang <markzhang@...dia.com>, Chuck Lever <chuck.lever@...cle.com>,
Shiraz Saleem <shiraz.saleem@...el.com>, Yang Li <yang.lee@...ux.alibaba.com>
Subject: Re: [PATCH v2 3/6] RDMA/cma: Brute force GFP_NOIO
On 15.05.24 14:53, Håkon Bugge wrote:
> In cma_init(), we call memalloc_noio_{save,restore} in a parenthetic
> fashion when enabled by the module parameter force_noio.
>
> This is done in order to conditionally enable rdma_cm to work aligned with
> block I/O devices. Any work queued later on work-queues created during
> module initialization will inherit the PF_MEMALLOC_{NOIO,NOFS}
> flag(s), due to commit ("workqueue: Inherit NOIO and NOFS alloc
> flags").
>
> We do this in order to enable ULPs using the RDMA stack to be used as
> a network block I/O device. This is to support a filesystem on top of a
> raw block device which uses said ULP(s) and the RDMA stack as the
> network transport layer.
>
> Under intense memory pressure, we get memory reclaims. Assume the
> filesystem reclaims memory, goes to the raw block device, which calls
> into the ULP in question, which calls the RDMA stack. Now, if
> regular GFP_KERNEL allocations in the ULP or the RDMA stack require
> reclaims to be fulfilled, we end up in a circular dependency.
>
> We break this circular dependency by:
>
> 1. Force all allocations in the ULP and the relevant RDMA stack to use
> GFP_NOIO, by means of a parenthetic use of
> memalloc_noio_{save,restore} on all relevant entry points.
>
> 2. Make sure work-queues inherit current->flags
> wrt. PF_MEMALLOC_{NOIO,NOFS}, such that work executed on the
> work-queue inherits the same flag(s).
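
To make the two steps above concrete, here is a small user-space model of
the scoped save/restore pattern. It is only a sketch: every mock_* name is
made up for illustration, the flag value is arbitrary, and the "workqueue"
is just a struct that records the flag at creation time, which is how I
read the inheritance described by the workqueue patch in this series. The
real helpers operate on current->flags in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PF_MEMALLOC_NOIO; the value is arbitrary. */
#define MOCK_PF_MEMALLOC_NOIO 0x1u

/* Models current->flags for a single task. */
static unsigned int mock_task_flags;

/* Set the NOIO bit and return its previous state (models the save helper). */
static unsigned int mock_noio_save(void)
{
	unsigned int old = mock_task_flags & MOCK_PF_MEMALLOC_NOIO;

	mock_task_flags |= MOCK_PF_MEMALLOC_NOIO;
	return old;
}

/* Put the NOIO bit back to the state returned by mock_noio_save(). */
static void mock_noio_restore(unsigned int old)
{
	mock_task_flags = (mock_task_flags & ~MOCK_PF_MEMALLOC_NOIO) | old;
}

/* Assumption: a work-queue remembers the creator's NOIO bit. */
struct mock_wq {
	unsigned int inherited_flags;
};

static struct mock_wq mock_alloc_wq(void)
{
	struct mock_wq wq = { mock_task_flags & MOCK_PF_MEMALLOC_NOIO };

	return wq;
}

/* Mirrors the shape of the patched init path: save, create, restore. */
static struct mock_wq mock_init(bool force_noio)
{
	unsigned int noio_flags = 0;
	struct mock_wq wq;

	if (force_noio)
		noio_flags = mock_noio_save();

	wq = mock_alloc_wq();

	if (force_noio)
		mock_noio_restore(noio_flags);
	return wq;
}
```

Calling mock_init(true) yields a queue whose inherited_flags carries the
NOIO bit while the caller's own flags are back to their previous value
afterwards; mock_init(false) leaves both untouched.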
>
> Signed-off-by: Håkon Bugge <haakon.bugge@...cle.com>
> ---
> drivers/infiniband/core/cma.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 1e2cd7c8716e8..23a50cc3e81cb 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -50,6 +50,10 @@ MODULE_LICENSE("Dual BSD/GPL");
>  #define CMA_IBOE_PACKET_LIFETIME 16
>  #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
>
> +static bool cma_force_noio;
> +module_param_named(force_noio, cma_force_noio, bool, 0444);
> +MODULE_PARM_DESC(force_noio, "Force the use of GFP_NOIO (Y/N)");
> +
>  static const char * const cma_events[] = {
>  	[RDMA_CM_EVENT_ADDR_RESOLVED] = "address resolved",
>  	[RDMA_CM_EVENT_ADDR_ERROR] = "address error",
> @@ -5424,6 +5428,10 @@ static struct pernet_operations cma_pernet_operations = {
>  static int __init cma_init(void)
>  {
>  	int ret;
> +	unsigned int noio_flags;
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/maintainer-netdev.rst?h=v6.9#n376
"
Netdev has a convention for ordering local variables in functions.
Order the variable declaration lines longest to shortest, e.g.::

  struct scatterlist *sg;
  struct sk_buff *skb;
  int err, i;

If there are dependencies between the variables preventing the ordering
move the initialization out of line.
"
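
Applied to the two declarations in the hunk above, that ordering would look
roughly like the sketch below. demo_init() is a made-up stand-in for
cma_init(), and the assignment to noio_flags is only a placeholder for the
memalloc_noio_save() call so that the snippet builds outside the kernel.

```c
#include <assert.h>

/* Hypothetical stand-in for cma_init(); only the declaration order matters. */
static int demo_init(int force_noio)
{
	unsigned int noio_flags = 0;	/* longer declaration line goes first */
	int ret = 0;			/* shorter declaration line follows */

	if (force_noio)
		noio_flags = 1;		/* placeholder for memalloc_noio_save() */

	/* Returned only so the example has an observable result. */
	ret = (int)noio_flags;
	return ret;
}
```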
Zhu Yanjun
> +
> +	if (cma_force_noio)
> +		noio_flags = memalloc_noio_save();
>
>  	/*
>  	 * There is a rare lock ordering dependency in cma_netdev_callback()
> @@ -5439,8 +5447,10 @@ static int __init cma_init(void)
>  	}
>
>  	cma_wq = alloc_ordered_workqueue("rdma_cm", WQ_MEM_RECLAIM);
> -	if (!cma_wq)
> -		return -ENOMEM;
> +	if (!cma_wq) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
>
>  	ret = register_pernet_subsys(&cma_pernet_operations);
>  	if (ret)
> @@ -5458,7 +5468,8 @@ static int __init cma_init(void)
>  	if (ret)
>  		goto err_ib;
>
> -	return 0;
> +	ret = 0;
> +	goto out;
>
>  err_ib:
>  	ib_unregister_client(&cma_client);
> @@ -5469,6 +5480,9 @@ static int __init cma_init(void)
>  	unregister_pernet_subsys(&cma_pernet_operations);
>  err_wq:
>  	destroy_workqueue(cma_wq);
> +out:
> +	if (cma_force_noio)
> +		memalloc_noio_restore(noio_flags);
>  	return ret;
>  }
>