Message-ID: <BN8PR15MB2513B863DCE33BF222FA6D1B9978A@BN8PR15MB2513.namprd15.prod.outlook.com>
Date: Tue, 24 Jun 2025 10:47:25 +0000
From: Bernard Metzler <BMT@...ich.ibm.com>
To: Arnd Bergmann <arnd@...nel.org>, Jason Gunthorpe <jgg@...pe.ca>,
	Leon Romanovsky <leon@...nel.org>,
	Nathan Chancellor <nathan@...nel.org>
CC: Arnd Bergmann <arnd@...db.de>,
	Nick Desaulniers <nick.desaulniers+lkml@...il.com>,
	Bill Wendling <morbo@...gle.com>,
	Justin Stitt <justinstitt@...gle.com>,
	Potnuri Bharat Teja <bharat@...lsio.com>,
	Showrya M N <showrya@...lsio.com>,
	Eric Biggers <ebiggers@...gle.com>,
	"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"llvm@...ts.linux.dev" <llvm@...ts.linux.dev>
Subject: RE: [PATCH] RDMA/siw: work around clang stack size warning
> -----Original Message-----
> From: Arnd Bergmann <arnd@...nel.org>
> Sent: Friday, June 20, 2025 1:43 PM
> To: Bernard Metzler <BMT@...ich.ibm.com>; Jason Gunthorpe <jgg@...pe.ca>;
> Leon Romanovsky <leon@...nel.org>; Nathan Chancellor <nathan@...nel.org>
> Cc: Arnd Bergmann <arnd@...db.de>; Nick Desaulniers
> <nick.desaulniers+lkml@...il.com>; Bill Wendling <morbo@...gle.com>; Justin
> Stitt <justinstitt@...gle.com>; Potnuri Bharat Teja <bharat@...lsio.com>;
> Showrya M N <showrya@...lsio.com>; Eric Biggers <ebiggers@...gle.com>;
> linux-rdma@...r.kernel.org; linux-kernel@...r.kernel.org;
> llvm@...ts.linux.dev
> Subject: [EXTERNAL] [PATCH] RDMA/siw: work around clang stack size warning
>
> From: Arnd Bergmann <arnd@...db.de>
>
> clang inlines a lot of functions into siw_qp_sq_process(), with the
> aggregate stack frame blowing the warning limit in some configurations:
>
> drivers/infiniband/sw/siw/siw_qp_tx.c:1014:5: error: stack frame size
> (1544) exceeds limit (1280) in 'siw_qp_sq_process'
> [-Werror,-Wframe-larger-than]
>
> The real problem here is the array of kvec structures in siw_tx_hdt that
> makes up the majority of the consumed stack space.
>
> Ideally there would be a way to avoid allocating the array on the
> stack, but that would require a larger rework. Add a noinline_for_stack
> annotation to avoid the warning for now, and make clang behave the same
> way as gcc here. The combined stack usage is still similar, but is spread
> over multiple functions now.
>
> Signed-off-by: Arnd Bergmann <arnd@...db.de>
> ---
> drivers/infiniband/sw/siw/siw_qp_tx.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 6432bce7d083..3a08f57d2211 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -277,6 +277,15 @@ static int siw_qp_prepare_tx(struct siw_iwarp_tx *c_tx)
>  	return PKT_FRAGMENTED;
>  }
>
> +static noinline_for_stack int
> +siw_sendmsg(struct socket *sock, unsigned int msg_flags,
> +	    struct kvec *vec, size_t num, size_t len)
> +{
> +	struct msghdr msg = { .msg_flags = msg_flags };
> +
> +	return kernel_sendmsg(sock, &msg, vec, num, len);
> +}
> +
>  /*
>   * Send out one complete control type FPDU, or header of FPDU carrying
>   * data. Used for fixed sized packets like Read.Requests or zero length
> @@ -285,12 +294,11 @@ static int siw_qp_prepare_tx(struct siw_iwarp_tx *c_tx)
>  static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s,
>  			      int flags)
>  {
> -	struct msghdr msg = { .msg_flags = flags };
>  	struct kvec iov = { .iov_base =
>  				    (char *)&c_tx->pkt.ctrl + c_tx->ctrl_sent,
>  			    .iov_len = c_tx->ctrl_len - c_tx->ctrl_sent };
>
> -	int rv = kernel_sendmsg(s, &msg, &iov, 1, iov.iov_len);
> +	int rv = siw_sendmsg(s, flags, &iov, 1, iov.iov_len);
>
>  	if (rv >= 0) {
>  		c_tx->ctrl_sent += rv;
> @@ -427,13 +435,13 @@ static void siw_unmap_pages(struct kvec *iov, unsigned long kmap_mask, int len)
>   * Write out iov referencing hdr, data and trailer of current FPDU.
>   * Update transmit state dependent on write return status
>   */
> -static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
> +static noinline_for_stack int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> +					 struct socket *s)
>  {
>  	struct siw_wqe *wqe = &c_tx->wqe_active;
>  	struct siw_sge *sge = &wqe->sqe.sge[c_tx->sge_idx];
>  	struct kvec iov[MAX_ARRAY];
>  	struct page *page_array[MAX_ARRAY];
> -	struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_EOR };
>
>  	int seg = 0, do_crc = c_tx->do_crc, is_kva = 0, rv;
>  	unsigned int data_len = c_tx->bytes_unsent, hdr_len = 0, trl_len = 0,
> @@ -586,14 +594,16 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
>  		rv = siw_0copy_tx(s, page_array, &wqe->sqe.sge[c_tx->sge_idx],
>  				  c_tx->sge_off, data_len);
>  		if (rv == data_len) {
> -			rv = kernel_sendmsg(s, &msg, &iov[seg], 1, trl_len);
> +
> +			rv = siw_sendmsg(s, MSG_DONTWAIT | MSG_EOR, &iov[seg],
> +					 1, trl_len);
>  			if (rv > 0)
>  				rv += data_len;
>  			else
>  				rv = data_len;
>  		}
>  	} else {
> -		rv = kernel_sendmsg(s, &msg, iov, seg + 1,
> +		rv = siw_sendmsg(s, MSG_DONTWAIT | MSG_EOR, iov, seg + 1,
>  				    hdr_len + data_len + trl_len);
>  		siw_unmap_pages(iov, kmap_mask, seg);
>  	}
> --
> 2.39.5
Looks good and was tested.
Acked-by: Bernard Metzler <bmt@...ich.ibm.com>
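
For readers following the thread, the effect described in the commit
message can be sketched in user-space C. This is only an illustration
under stated assumptions: noinline_for_stack is emulated here with the
GCC/clang noinline attribute (the kernel defines it as noinline), and all
demo_* types and functions are hypothetical stand-ins, not siw or socket
APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: user-space stand-in for the kernel's noinline_for_stack. */
#define noinline_for_stack __attribute__((noinline))

/* Hypothetical stand-ins for struct kvec and struct msghdr. */
struct demo_vec { const void *base; size_t len; };
struct demo_msg { unsigned int flags; char pad[88]; };

#define DEMO_MAX_VEC 64

/* Hypothetical sink replacing kernel_sendmsg(): returns the bytes "sent". */
static int demo_send(const struct demo_msg *msg,
		     const struct demo_vec *vec, size_t num)
{
	size_t total = 0;

	(void)msg;
	for (size_t i = 0; i < num; i++)
		total += vec[i].len;
	return (int)total;
}

/* The message header occupies stack only while this helper runs. */
static noinline_for_stack int
demo_sendmsg(unsigned int flags, const struct demo_vec *vec, size_t num)
{
	struct demo_msg msg = { .flags = flags };

	return demo_send(&msg, vec, num);
}

/* The large iov array stays in this frame and is not merged into the caller. */
static noinline_for_stack int
demo_tx_hdt(const char *payload, size_t len, size_t seg_len)
{
	struct demo_vec iov[DEMO_MAX_VEC];
	size_t num = 0;

	assert(seg_len > 0);
	for (size_t off = 0; off < len && num < DEMO_MAX_VEC; off += seg_len) {
		iov[num].base = payload + off;
		iov[num].len = (len - off < seg_len) ? len - off : seg_len;
		num++;
	}
	return demo_sendmsg(0, iov, num);
}

/* Caller in the role of siw_qp_sq_process(): no large array in its own frame. */
int demo_process(const char *payload, size_t len)
{
	return demo_tx_hdt(payload, len, 16);
}
```

Building this with gcc or clang and -Wframe-larger-than=<bytes> should
attribute the iov array to demo_tx_hdt() rather than to demo_process().
As the commit message says, the combined stack depth along the call chain
stays about the same; it is only split across frames.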