Message-ID: <CANn89iK0OWswFFHH10PLzFdcFxZXodWorR5YJSdPq+P6+Qsu1Q@mail.gmail.com>
Date: Tue, 14 Oct 2025 01:25:05 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Barry Song <21cnbao@...il.com>
Cc: corbet@....net, davem@...emloft.net, hannes@...xchg.org, horms@...nel.org,
jackmanb@...gle.com, kuba@...nel.org, kuniyu@...gle.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linyunsheng@...wei.com, mhocko@...e.com, netdev@...r.kernel.org,
pabeni@...hat.com, surenb@...gle.com, v-songbaohua@...o.com, vbabka@...e.cz,
willemb@...gle.com, zhouhuacai@...o.com, ziy@...dia.com
Subject: Re: [RFC PATCH] mm: net: disable kswapd for high-order network buffer allocation

On Tue, Oct 14, 2025 at 1:17 AM Barry Song <21cnbao@...il.com> wrote:
>
> On Tue, Oct 14, 2025 at 3:01 PM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Mon, Oct 13, 2025 at 11:43 PM Barry Song <21cnbao@...il.com> wrote:
> > >
> > > > >
> > > > > A problem with the existing sysctl is that it only covers the TX path;
> > > > > for the RX path, we also observe that kswapd consumes significant power.
> > > > > I could add the patch below to make it support the RX path, but it feels
> > > > > like a bit of a layer violation, since the RX path code resides in mm
> > > > > and is intended to serve generic users rather than networking, even
> > > > > though the current callers are primarily network-related.
> > > >
> > > > You might have a buggy driver.
> > >
> > > We are observing the RX path as follows:
> > >
> > > do_softirq
> > > taskset_hi_action
> > > kalPacketAlloc
> > > __netdev_alloc_skb
> > > page_frag_alloc_align
> > > __page_frag_cache_refill
> > >
> > > This appears to be a fairly common stack.
> > >
> > > So it is a buggy driver?
> >
> > No idea, kalPacketAlloc is not in upstream trees.
> >
> > It apparently needs high order allocations. It will fail at some point.
> >
> > >
> > > >
> > > > High performance drivers use order-0 allocations only.
> > > >
> > >
> > > Do you have an example of high-performance drivers that use only order-0 memory?
> >
> > About all drivers using XDP, and/or using napi_get_frags()
> >
> > XDP has been using order-0 pages from the very beginning.
>
> Thanks! But there are still many drivers using netdev_alloc_skb()—we
> shouldn’t overlook them, right?
>
> net % git grep netdev_alloc_skb | wc -l
> 359

Only the ones that use 16KB allocations, like some WAN drivers :)

Some networks use MTU=9000.

If the hardware does not provide SG support on receive, a kmalloc()-based
receive buffer will use 16KB of memory.

By using a frag allocator, we can pack 3 allocations per 32KB instead of 2.

TCP can go 50% faster.

If memory is short, it will fail no matter what.
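
A minimal sketch of the packing arithmetic above, under stated assumptions: the per-buffer overhead of 512 bytes is a hypothetical figure (real headroom and metadata sizes depend on the kernel configuration), and the helper names are illustrative, not kernel APIs. The allocator is modeled as rounding each request up to the next power of two, which is what turns a ~9.5KB buffer into a 16KB allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Figures below are assumptions for illustration, not kernel constants. */
#define JUMBO_MTU        9000u   /* MTU mentioned in the thread */
#define ASSUMED_OVERHEAD 512u    /* hypothetical headroom + metadata per buffer */
#define REGION_SIZE      32768u  /* 32KB region, as in the thread */

/* Round n up to the next power of two (models a size-class allocator). */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    assert(n > 0);
    while (p < n)
        p <<= 1;
    return p;
}

/* Bytes one receive buffer needs under the assumptions above. */
static size_t rx_buf_size(void)
{
    return JUMBO_MTU + ASSUMED_OVERHEAD;
}

/* Buffers per region when each one is rounded up to a power of two. */
static size_t bufs_per_region_pow2(size_t region, size_t buf)
{
    return region / round_up_pow2(buf);
}

/* Buffers per region when they are carved back to back (frag-style). */
static size_t bufs_per_region_packed(size_t region, size_t buf)
{
    assert(buf > 0);
    return region / buf;
}
```

With these numbers, bufs_per_region_pow2(REGION_SIZE, rx_buf_size()) gives 2 and bufs_per_region_packed(REGION_SIZE, rx_buf_size()) gives 3, which is the 3-versus-2 ratio (50% more buffers for the same memory) referred to above.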