Date:   Thu, 03 Feb 2022 09:26:02 -0800
From:   Alexander H Duyck <alexander.duyck@...il.com>
To:     Eric Dumazet <eric.dumazet@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>
Cc:     netdev <netdev@...r.kernel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Coco Li <lixiaoyan@...gle.com>
Subject: Re: [PATCH net-next 09/15] net: increase MAX_SKB_FRAGS

On Wed, 2022-02-02 at 17:51 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@...gle.com>
> 
> Currently, MAX_SKB_FRAGS value is 17.
> 
> For standard tcp sendmsg() traffic, no big deal because tcp_sendmsg()
> attempts order-3 allocations, stuffing 32768 bytes per frag.
> 
> But with zero copy, we use order-0 pages.
> 
> For BIG TCP to show its full potential, we increase MAX_SKB_FRAGS
> to be able to fit 45 segments per skb.
> 
> This is also needed for BIG TCP rx zerocopy, as zerocopy currently
> does not support skbs with frag list.
> 
> We have used this MAX_SKB_FRAGS value for years at Google before
> we deployed 4K MTU, with no adverse effect.
> Back then, goal was to be able to receive full size (64KB) GRO
> packets without the frag_list overhead.
> 
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>

One big issue I see with this patch is the queueing problems it may
introduce on Tx queues. I suspect it will cause a number of performance
regressions and deadlocks, as it changes the Tx queueing behavior for
many NICs.

As I recall, many of the Intel drivers use MAX_SKB_FRAGS as one of the
ingredients for DESC_NEEDED, which determines whether the Tx queue
needs to stop. With this change the value for igb, for instance, jumps
from 21 to 49, and the wake threshold is twice that, 98. As such, the
minimum Tx descriptor count for the driver would need to be raised
beyond 80; otherwise a ring at that minimum size can never have 98
descriptors free, and it is likely to deadlock the first time it has to
pause.
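
The igb arithmetic above can be sketched as follows. This is only an
illustration, not the driver code: the helper names are hypothetical,
the "+ 4" is inferred from the 17 -> 21 and 45 -> 49 figures quoted
above, and the wake check is simplified to "an empty ring must be able
to hold the wake threshold".

```c
#include <stdbool.h>

/* Hypothetical helper: descriptors reserved for a worst-case skb.
 * The "+ 4" is an assumption inferred from the numbers in this mail
 * (17 frags -> 21, 45 frags -> 49). */
static inline unsigned int desc_needed(unsigned int max_skb_frags)
{
	return max_skb_frags + 4;
}

/* Hypothetical helper: per the mail, the wake threshold is twice
 * the number of descriptors needed. */
static inline unsigned int wake_threshold(unsigned int max_skb_frags)
{
	return 2 * desc_needed(max_skb_frags);
}

/* Simplified model: a stopped queue can only restart if a fully
 * drained ring offers at least wake_threshold free descriptors. */
static inline bool queue_can_wake(unsigned int ring_size,
				  unsigned int max_skb_frags)
{
	return ring_size >= wake_threshold(max_skb_frags);
}
```

Under these assumptions, queue_can_wake(80, 17) holds because the
threshold is 42, while queue_can_wake(80, 45) fails because the
threshold becomes 98, which is the deadlock case described above.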
