[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJD7tkYZiz4SaPsqVai1HQzwOV3Y8yoVt9V8m7+dSKi_VfSseg@mail.gmail.com>
Date: Tue, 7 Jan 2025 20:16:55 -0800
From: Yosry Ahmed <yosryahmed@...gle.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
"hannes@...xchg.org" <hannes@...xchg.org>, "nphamcs@...il.com" <nphamcs@...il.com>,
"chengming.zhou@...ux.dev" <chengming.zhou@...ux.dev>,
"usamaarif642@...il.com" <usamaarif642@...il.com>, "ryan.roberts@....com" <ryan.roberts@....com>,
"21cnbao@...il.com" <21cnbao@...il.com>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
"herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>, "davem@...emloft.net" <davem@...emloft.net>,
"clabbe@...libre.com" <clabbe@...libre.com>, "ardb@...nel.org" <ardb@...nel.org>,
"ebiggers@...gle.com" <ebiggers@...gle.com>, "surenb@...gle.com" <surenb@...gle.com>,
"Accardi, Kristen C" <kristen.c.accardi@...el.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@...el.com>, "Gopal, Vinodh" <vinodh.gopal@...el.com>
Subject: Re: [PATCH v5 10/12] mm: zswap: Allocate pool batching resources if
the crypto_alg supports batching.
[..]
> >
> > > + }
> > > +
> > > + acomp_ctx->buffers = kmalloc_node(nr_reqs * sizeof(u8 *),
> > GFP_KERNEL, cpu_to_node(cpu));
> >
> > Can we use kcalloc_node() here?
>
> I was wondering if the performance penalty of the kcalloc_node() is acceptable
> because the cpu onlining happens infrequently? If so, it appears zero-initializing
> the allocated memory will help in the cleanup code suggestion in your subsequent
> comment.
I don't think zeroing in this path would be a problem.
>
> >
> > > + if (!acomp_ctx->buffers)
> > > + goto buf_fail;
> > > +
> > > + for (i = 0; i < nr_reqs; ++i) {
> > > + acomp_ctx->buffers[i] = kmalloc_node(PAGE_SIZE * 2,
> > GFP_KERNEL, cpu_to_node(cpu));
> > > + if (!acomp_ctx->buffers[i]) {
> > > + for (j = 0; j < i; ++j)
> > > + kfree(acomp_ctx->buffers[j]);
> > > + kfree(acomp_ctx->buffers);
> > > + ret = -ENOMEM;
> > > + goto buf_fail;
> > > + }
> > > + }
> > > +
> > > + acomp_ctx->reqs = kmalloc_node(nr_reqs * sizeof(struct acomp_req
> > *), GFP_KERNEL, cpu_to_node(cpu));
> >
> > Ditto.
>
> Sure.
>
> >
> > > + if (!acomp_ctx->reqs)
> > > goto req_fail;
> > > +
> > > + for (i = 0; i < nr_reqs; ++i) {
> > > + acomp_ctx->reqs[i] = acomp_request_alloc(acomp_ctx->acomp);
> > > + if (!acomp_ctx->reqs[i]) {
> > > + pr_err("could not alloc crypto acomp_request reqs[%d]
> > %s\n",
> > > + i, pool->tfm_name);
> > > + for (j = 0; j < i; ++j)
> > > + acomp_request_free(acomp_ctx->reqs[j]);
> > > + kfree(acomp_ctx->reqs);
> > > + ret = -ENOMEM;
> > > + goto req_fail;
> > > + }
> > > }
> > > - acomp_ctx->req = req;
> > >
> > > + /*
> > > + * The crypto_wait is used only in fully synchronous, i.e., with scomp
> > > + * or non-poll mode of acomp, hence there is only one "wait" per
> > > + * acomp_ctx, with callback set to reqs[0], under the assumption that
> > > + * there is at least 1 request per acomp_ctx.
> > > + */
> > > crypto_init_wait(&acomp_ctx->wait);
> > > /*
> > > * if the backend of acomp is async zip, crypto_req_done() will wakeup
> > > * crypto_wait_req(); if the backend of acomp is scomp, the callback
> > > * won't be called, crypto_wait_req() will return without blocking.
> > > */
> > > - acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
> > > + acomp_request_set_callback(acomp_ctx->reqs[0],
> > CRYPTO_TFM_REQ_MAY_BACKLOG,
> > > crypto_req_done, &acomp_ctx->wait);
> > >
> > > + acomp_ctx->nr_reqs = nr_reqs;
> > > return 0;
> > >
> > > req_fail:
> > > + for (i = 0; i < nr_reqs; ++i)
> > > + kfree(acomp_ctx->buffers[i]);
> > > + kfree(acomp_ctx->buffers);
> >
> > The cleanup code is all over the place. Sometimes it's done in the
> > loops allocating the memory and sometimes here. It's a bit hard to
> > follow. Please have all the cleanups here. You can just initialize the
> > arrays to 0s, and then if the array is not-NULL you can free any
> > non-NULL elements (kfree() will handle NULLs gracefully).
>
> Sure, if performance of kzalloc_node() is an acceptable trade-off for the
> cleanup code simplification.
>
> >
> > There may be even potential for code reuse with zswap_cpu_comp_dead().
>
> I assume the reuse will be through copy-and-paste the same lines of code as
> against a common procedure being called by zswap_cpu_comp_prepare()
> and zswap_cpu_comp_dead()?
Well, I meant we can possibly introduce the helper that will be used
by both zswap_cpu_comp_prepare() and zswap_cpu_comp_dead() (for
example see __mem_cgroup_free() called from both the freeing path and
the allocation path to do cleanup).
I didn't look too closely into it though, maybe it's best to keep them
separate, depending on how the code ends up looking like.
Powered by blists - more mailing lists