Message-ID: <CANn89iKpGwej5X_noxU+N7Y4o30dpfEFX_Ao6qZeahScvM7qGQ@mail.gmail.com>
Date: Wed, 7 Dec 2022 14:06:15 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Tariq Toukan <ttoukan.linux@...il.com>
Cc: "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Tariq Toukan <tariqt@...dia.com>, Wei Wang <weiwan@...gle.com>,
netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 2/3] net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends
on MAX_SKB_FRAGS
On Wed, Dec 7, 2022 at 1:53 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Wed, Dec 7, 2022 at 1:40 PM Tariq Toukan <ttoukan.linux@...il.com> wrote:
> >
> >
> >
> > On 12/6/2022 7:50 AM, Eric Dumazet wrote:
> > > Google production kernel has increased MAX_SKB_FRAGS to 45
> > > for BIG-TCP rollout.
> > >
> > > Unfortunately mlx4 TX bounce buffer is not big enough whenever
> > > an skb has up to 45 page fragments.
> > >
> > > This can happen often with TCP TX zero copy, as one frag usually
> > > holds 4096 bytes of payload (order-0 page).
> > >
> > > Tested:
> > > Kernel built with MAX_SKB_FRAGS=45
> > > ip link set dev eth0 gso_max_size 185000
> > > netperf -t TCP_SENDFILE
> > >
> > > I made sure that "ethtool -G eth0 tx 64" was properly working,
> > > ring->full_size being set to 16.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > > Reported-by: Wei Wang <weiwan@...gle.com>
> > > Cc: Tariq Toukan <tariqt@...dia.com>
> > > ---
> > > drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 16 ++++++++++++----
> > > 1 file changed, 12 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> > > index 7cc288db2a64f75ffe64882e3c25b90715e68855..120b8c361e91d443f83f100a1afabcabc776a92a 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> > > +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> > > @@ -89,8 +89,18 @@
> > > #define MLX4_EN_FILTER_HASH_SHIFT 4
> > > #define MLX4_EN_FILTER_EXPIRY_QUOTA 60
> > >
> > > -/* Typical TSO descriptor with 16 gather entries is 352 bytes... */
> > > -#define MLX4_TX_BOUNCE_BUFFER_SIZE 512
> > > +#define CTRL_SIZE sizeof(struct mlx4_wqe_ctrl_seg)
> > > +#define DS_SIZE sizeof(struct mlx4_wqe_data_seg)
> > > +
> > > +/* Maximal size of the bounce buffer:
> > > + * 256 bytes for LSO headers.
> > > + * CTRL_SIZE for control desc.
> > > + * DS_SIZE if skb->head contains some payload.
> > > + * MAX_SKB_FRAGS frags.
> > > + */
> > > +#define MLX4_TX_BOUNCE_BUFFER_SIZE (256 + CTRL_SIZE + DS_SIZE + \
> > > + MAX_SKB_FRAGS * DS_SIZE)
> > > +
> > > #define MLX4_MAX_DESC_TXBBS (MLX4_TX_BOUNCE_BUFFER_SIZE / TXBB_SIZE)
> > >
> >
> > Now as MLX4_TX_BOUNCE_BUFFER_SIZE might not be a multiple of TXBB_SIZE,
> > simple integer division won't work to calculate the max num of TXBBs.
> > Roundup is needed.
>
> I do not see why a roundup is needed. This seems like obfuscation to me.
>
> A divide by TXBB_SIZE always "works".
>
> A round up is already done in mlx4_en_xmit()
>
> /* Align descriptor to TXBB size */
> desc_size = ALIGN(real_size, TXBB_SIZE);
> nr_txbb = desc_size >> LOG_TXBB_SIZE;
>
> Then the check is :
>
> if (unlikely(nr_txbb > MLX4_MAX_DESC_TXBBS)) {
> if (netif_msg_tx_err(priv))
> en_warn(priv, "Oversized header or SG list\n");
> goto tx_drop_count;
> }
>
> If we allocate X extra bytes (in case MLX4_TX_BOUNCE_BUFFER_SIZE %
> TXBB_SIZE == X),
> we are not going to use them anyway.
I guess you are worried about not having exactly 256 bytes for the headers?
Currently, the amount of space for headers is 208 bytes.
If MAX_SKB_FRAGS is 17, MLX4_TX_BOUNCE_BUFFER_SIZE would be 0x230
after my patch,
so the same usable space as before the patch.