Date:   Fri, 30 Sep 2022 10:45:27 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Paolo Abeni <pabeni@...hat.com>
Cc:     patchwork-bot+netdevbpf@...nel.org,
        netdev <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Alexander Duyck <alexanderduyck@...com>
Subject: Re: [PATCH net-next v4] net: skb: introduce and use a single page
 frag cache

On Fri, Sep 30, 2022 at 10:30 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> Hello,
>
> On Fri, 2022-09-30 at 09:43 -0700, Eric Dumazet wrote:
> > On Thu, Sep 29, 2022 at 7:21 PM <patchwork-bot+netdevbpf@...nel.org> wrote:
> > >
> > > Hello:
> > >
> > > This patch was applied to netdev/net-next.git (master)
> > > by Jakub Kicinski <kuba@...nel.org>:
> > >
> > > On Wed, 28 Sep 2022 10:43:09 +0200 you wrote:
> > > > After commit 3226b158e67c ("net: avoid 32 x truesize under-estimation
> > > > for tiny skbs") we are observing 10-20% regressions in performance
> > > > tests with small packets. The perf trace points to high pressure on
> > > > the slab allocator.
> > > >
> > > > This change tries to improve the allocation schema for small packets
> > > > using an idea originally suggested by Eric: a new per CPU page frag is
> > > > introduced and used in __napi_alloc_skb to cope with small allocation
> > > > requests.
> > > >
> > > > [...]
> > >
> > > Here is the summary with links:
> > >   - [net-next,v4] net: skb: introduce and use a single page frag cache
> > >     https://git.kernel.org/netdev/net-next/c/dbae2b062824
> > >
> >
> > Paolo, this patch adds a regression for TCP RPC workloads (aka TCP_RR).
> >
> > Before the patch, CPUs servicing NIC interrupts were allocating
> > SLAB/SLUB objects for incoming packets, but they were also freeing
> > skbs from TCP rtx queues when ACK packets were processed. SLAB/SLUB
> > caches were efficient (hit ratio close to 100%).
>
> Thank you for the report. I guess that is reproducible with netperf
> TCP_RR and CONFIG_DEBUG_SLAB? Do I need specific request/response sizes?

No CONFIG_DEBUG_SLAB, simply standard SLAB, and tcp_rr tests on an AMD
host with 256 cpus...


>
> Do you think a revert will be needed for 6.1?

No need for a revert, I am sure we can add a followup.

>
> > After the patch, these CPUs only free skbs from TCP rtx queues and
> > constantly have to drain their alien caches, thus competing for the
> > mm spinlocks. With RX skb allocations now served by the page frag
> > allocator, only the kfree(~1KB) calls are left on these CPUs.
> >
> > One way to avoid the asymmetric behavior would be to switch TCP to
> > also use page frags for TX skbs, allocated from
> > tcp_stream_alloc_skb().
>
> I guess we should have:
>

Note that a typical skb allocated from tcp sendmsg() has size==0 (all
payload is put in skb frags, not in skb->head).

>         if (<alloc size is small and NAPI_HAS_SMALL_PAGE>)
>                 <use small page frag>
>         else
>                 <use current allocator>
>
> right in tcp_stream_alloc_skb()? or all the way down to __alloc_skb()?

We could first try in tcp_stream_alloc_skb()
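
A rough userspace sketch of the branch discussed above, for illustration
only. It is not kernel code: malloc() stands in for both the page allocator
and the slab allocator, and every name in it (struct frag_cache,
frag_alloc(), alloc_head(), FRAG_PAGE_SIZE, SMALL_ALLOC_MAX) is hypothetical,
as is the 1024-byte threshold. It only shows the shape of "small requests go
to a single-page frag cache, everything else goes to the current allocator".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical values, chosen for illustration only. */
#define FRAG_PAGE_SIZE  4096u
#define SMALL_ALLOC_MAX 1024u  /* assumed "alloc size is small" threshold */

/* Stand-in for a per-CPU single-page frag cache: a simple bump allocator. */
struct frag_cache {
	unsigned char *page;  /* backing page, NULL until first use */
	size_t offset;        /* next free byte within the page */
};

/* Carve len bytes out of the cached page, taking a new page when full. */
static void *frag_alloc(struct frag_cache *fc, size_t len)
{
	void *p;

	assert(len <= FRAG_PAGE_SIZE);
	if (!fc->page || fc->offset + len > FRAG_PAGE_SIZE) {
		/* A real frag cache refcounts pages; this sketch just
		 * abandons the old page and never frees it. */
		fc->page = malloc(FRAG_PAGE_SIZE);
		if (!fc->page)
			return NULL;
		fc->offset = 0;
	}
	p = fc->page + fc->offset;
	fc->offset += len;
	return p;
}

/* Small requests use the frag cache, larger ones the regular allocator
 * (malloc here). *from_frag reports which path was taken. */
static void *alloc_head(struct frag_cache *fc, size_t len, bool *from_frag)
{
	*from_frag = len <= SMALL_ALLOC_MAX;
	return *from_frag ? frag_alloc(fc, len) : malloc(len);
}
```

Two consecutive small requests land back to back in the same page, while a
large request bypasses the cache entirely. Whether a size==0 request should
take the frag path in the real code is exactly the open question here.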

>
> Thanks!
>
> Paolo
>
>
>
> >
>
