Message-ID: <cdbfe4615ffec2bcfde94268dbc77dfa98143f39.camel@redhat.com>
Date: Fri, 30 Sep 2022 19:30:08 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Eric Dumazet <edumazet@...gle.com>,
patchwork-bot+netdevbpf@...nel.org
Cc: netdev <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Alexander Duyck <alexanderduyck@...com>
Subject: Re: [PATCH net-next v4] net: skb: introduce and use a single page frag cache
Hello,
On Fri, 2022-09-30 at 09:43 -0700, Eric Dumazet wrote:
> On Thu, Sep 29, 2022 at 7:21 PM <patchwork-bot+netdevbpf@...nel.org> wrote:
> >
> > Hello:
> >
> > This patch was applied to netdev/net-next.git (master)
> > by Jakub Kicinski <kuba@...nel.org>:
> >
> > On Wed, 28 Sep 2022 10:43:09 +0200 you wrote:
> > > After commit 3226b158e67c ("net: avoid 32 x truesize under-estimation
> > > for tiny skbs") we are observing 10-20% regressions in performance
> > > tests with small packets. The perf trace points to high pressure on
> > > the slab allocator.
> > >
> > > This change tries to improve the allocation schema for small packets
> > > using an idea originally suggested by Eric: a new per CPU page frag is
> > > introduced and used in __napi_alloc_skb to cope with small allocation
> > > requests.
> > >
> > > [...]
> >
> > Here is the summary with links:
> > - [net-next,v4] net: skb: introduce and use a single page frag cache
> > https://git.kernel.org/netdev/net-next/c/dbae2b062824
> >
>
> Paolo, this patch adds a regression for TCP RPC workloads (aka TCP_RR)
>
> Before the patch, cpus servicing NIC interrupts were allocating
> SLAB/SLUB objects for incoming packets,
> but they were also freeing skbs from TCP rtx queues when ACK packets
> were processed. SLAB/SLUB caches
> were efficient (hit ratio close to 100%)
Thank you for the report. I guess that is reproducible with netperf
TCP_RR and CONFIG_DEBUG_SLAB? Do I need specific request/response sizes?
Do you think a revert will be needed for 6.1?
> After the patch, these CPUs only free skbs from TCP rtx queues and
> constantly have to drain their alien caches,
> thus competing with the mm spinlocks. RX skbs allocations being done
> by page frag allocation only left kfree(~1KB) calls.
>
> One way to avoid the asymmetric behavior would be to switch TCP to
> also use page frags for TX skbs,
> allocated from tcp_stream_alloc_skb()
I guess we should have:
	if (<alloc size is small and NAPI_HAS_SMALL_PAGE>)
		<use small page frag>
	else
		<use current allocator>
right in tcp_stream_alloc_skb()? or all the way down to __alloc_skb()?
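
To make the branch above concrete, here is a minimal userspace sketch of
the size-based dispatch. It is not kernel code: SMALL_ALLOC_MAX,
has_small_page_frag, enum alloc_path and pick_alloc_path() are all
hypothetical names, the 1024-byte cutoff is an assumption, and the
boolean only stands in for the NAPI_HAS_SMALL_PAGE condition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff; the actual threshold is not stated in this thread. */
#define SMALL_ALLOC_MAX 1024

/* Which allocator a request would be routed to (illustrative only). */
enum alloc_path {
	ALLOC_SMALL_PAGE_FRAG,	/* per-CPU small page frag cache */
	ALLOC_CURRENT,		/* whatever allocator is used today */
};

/* Stand-in for the NAPI_HAS_SMALL_PAGE condition in the pseudocode. */
bool has_small_page_frag = true;

/* Pick the small page frag only for small requests, and only when the
 * small page frag cache is available; otherwise keep the current path.
 */
enum alloc_path pick_alloc_path(size_t size)
{
	if (size <= SMALL_ALLOC_MAX && has_small_page_frag)
		return ALLOC_SMALL_PAGE_FRAG;
	return ALLOC_CURRENT;
}
```

Whether such a check would sit in tcp_stream_alloc_skb() or lower, in
__alloc_skb(), is exactly the open question; the sketch only shows the
shape of the decision.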
Thanks!
Paolo
>