Message-ID: <CANn89iKUYMb_4vJ5GAE0-BUmM7JNuHo_p8oHbfJfatYKBX8ouw@mail.gmail.com>
Date: Wed, 7 Dec 2022 13:53:35 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Tariq Toukan <ttoukan.linux@...il.com>
Cc: "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Tariq Toukan <tariqt@...dia.com>, Wei Wang <weiwan@...gle.com>,
netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 2/3] net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends
on MAX_SKB_FRAGS
On Wed, Dec 7, 2022 at 1:40 PM Tariq Toukan <ttoukan.linux@...il.com> wrote:
>
>
>
> On 12/6/2022 7:50 AM, Eric Dumazet wrote:
> > Google production kernel has increased MAX_SKB_FRAGS to 45
> > for BIG-TCP rollout.
> >
> > Unfortunately mlx4 TX bounce buffer is not big enough whenever
> > an skb has up to 45 page fragments.
> >
> > This can happen often with TCP TX zero copy, as one frag usually
> > holds 4096 bytes of payload (order-0 page).
> >
> > Tested:
> > Kernel built with MAX_SKB_FRAGS=45
> > ip link set dev eth0 gso_max_size 185000
> > netperf -t TCP_SENDFILE
> >
> > I made sure that "ethtool -G eth0 tx 64" was properly working,
> > ring->full_size being set to 16.
> >
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > Reported-by: Wei Wang <weiwan@...gle.com>
> > Cc: Tariq Toukan <tariqt@...dia.com>
> > ---
> > drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 16 ++++++++++++----
> > 1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> > index 7cc288db2a64f75ffe64882e3c25b90715e68855..120b8c361e91d443f83f100a1afabcabc776a92a 100644
> > --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> > +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
> > @@ -89,8 +89,18 @@
> > #define MLX4_EN_FILTER_HASH_SHIFT 4
> > #define MLX4_EN_FILTER_EXPIRY_QUOTA 60
> >
> > -/* Typical TSO descriptor with 16 gather entries is 352 bytes... */
> > -#define MLX4_TX_BOUNCE_BUFFER_SIZE 512
> > +#define CTRL_SIZE sizeof(struct mlx4_wqe_ctrl_seg)
> > +#define DS_SIZE sizeof(struct mlx4_wqe_data_seg)
> > +
> > +/* Maximal size of the bounce buffer:
> > + * 256 bytes for LSO headers.
> > + * CTRL_SIZE for control desc.
> > + * DS_SIZE if skb->head contains some payload.
> > + * MAX_SKB_FRAGS frags.
> > + */
> > +#define MLX4_TX_BOUNCE_BUFFER_SIZE (256 + CTRL_SIZE + DS_SIZE + \
> > + MAX_SKB_FRAGS * DS_SIZE)
> > +
> > #define MLX4_MAX_DESC_TXBBS (MLX4_TX_BOUNCE_BUFFER_SIZE / TXBB_SIZE)
> >
>
> Now that MLX4_TX_BOUNCE_BUFFER_SIZE might not be a multiple of TXBB_SIZE,
> simple integer division no longer gives the maximum number of TXBBs.
> A roundup is needed.
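> Something like the following, for instance (illustrative only, using
> the kernel's existing DIV_ROUND_UP() helper):
>
> #define MLX4_MAX_DESC_TXBBS \
> 	DIV_ROUND_UP(MLX4_TX_BOUNCE_BUFFER_SIZE, TXBB_SIZE)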
I do not see why a roundup is needed. This seems like obfuscation to me.
A divide by TXBB_SIZE always "works".
A round-up is already done in mlx4_en_xmit():
/* Align descriptor to TXBB size */
desc_size = ALIGN(real_size, TXBB_SIZE);
nr_txbb = desc_size >> LOG_TXBB_SIZE;
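For example (purely illustrative numbers, assuming TXBB_SIZE == 64 and
LOG_TXBB_SIZE == 6): real_size = 1000 is aligned to desc_size = 1024,
hence nr_txbb = 16.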
Then the check is:
if (unlikely(nr_txbb > MLX4_MAX_DESC_TXBBS)) {
	if (netif_msg_tx_err(priv))
		en_warn(priv, "Oversized header or SG list\n");
	goto tx_drop_count;
}
If we allocate X extra bytes (when MLX4_TX_BOUNCE_BUFFER_SIZE %
TXBB_SIZE == X), we are not going to use them anyway.
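To make the arithmetic concrete (a sketch, assuming CTRL_SIZE ==
DS_SIZE == 16 bytes, TXBB_SIZE == 64 and MAX_SKB_FRAGS == 45):

MLX4_TX_BOUNCE_BUFFER_SIZE = 256 + 16 + 16 + 45 * 16 = 1008
MLX4_MAX_DESC_TXBBS        = 1008 / 64               = 15

Any descriptor with real_size > 15 * 64 = 960 gets ALIGNed up to 16
TXBBs and dropped by the check above, so the trailing 1008 % 64 = 48
bytes of the bounce buffer can never be written to.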