Message-ID: <CAGsJ_4x5v=M0=jYGOqy1rHL9aVg-76OgiE0qQMdEu70FhZcmUg@mail.gmail.com>
Date: Wed, 15 Oct 2025 04:17:44 +0800
From: Barry Song <21cnbao@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Matthew Wilcox <willy@...radead.org>, netdev@...r.kernel.org, linux-mm@...ck.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
Barry Song <v-songbaohua@...o.com>, Jonathan Corbet <corbet@....net>,
Kuniyuki Iwashima <kuniyu@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Willem de Bruijn <willemb@...gle.com>, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Simon Horman <horms@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Brendan Jackman <jackmanb@...gle.com>, Johannes Weiner <hannes@...xchg.org>, Zi Yan <ziy@...dia.com>,
Yunsheng Lin <linyunsheng@...wei.com>, Huacai Zhou <zhouhuacai@...o.com>
Subject: Re: [RFC PATCH] mm: net: disable kswapd for high-order network buffer allocation

On Tue, Oct 14, 2025 at 6:39 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Tue, Oct 14, 2025 at 3:19 AM Barry Song <21cnbao@...il.com> wrote:
> >
> > > >
> > > > >
> > > > > I think you are missing something to control how much memory can be
> > > > > pushed on each TCP socket ?
> > > > >
> > > > > What is tcp_wmem on your phones ? What about tcp_mem ?
> > > > >
> > > > > Have you looked at /proc/sys/net/ipv4/tcp_notsent_lowat
> > > >
> > > > # cat /proc/sys/net/ipv4/tcp_wmem
> > > > 524288 1048576 6710886
> > >
> > > Ouch. That is insane tcp_wmem[0] .
> > >
> > > Please stick to 4096, or risk OOM of various sorts.
> > >
> > > >
> > > > # cat /proc/sys/net/ipv4/tcp_notsent_lowat
> > > > 4294967295
> > > >
> > > > Any thoughts on these settings?
> > >
> > > Please look at
> > > https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
> > >
> > > tcp_notsent_lowat - UNSIGNED INTEGER
> > > A TCP socket can control the amount of unsent bytes in its write queue,
> > > thanks to TCP_NOTSENT_LOWAT socket option. poll()/select()/epoll()
> > > reports POLLOUT events if the amount of unsent bytes is below a per
> > > socket value, and if the write queue is not full. sendmsg() will
> > > also not add new buffers if the limit is hit.
> > >
> > > This global variable controls the amount of unsent data for
> > > sockets not using TCP_NOTSENT_LOWAT. For these sockets, a change
> > > to the global variable has immediate effect.
> > >
> > >
> > > Setting this sysctl to 2MB can effectively reduce the amount of memory
> > > in TCP write queues by 66 %,
> > > or allow you to increase tcp_wmem[2] so that only flows needing big
> > > BDP can get it.
> >
> > We obtained these settings from our hardware vendors.
>
> Tell them they are wrong.

Well, we checked Qualcomm and MTK, and it seems both set these values
relatively high. In other words, all the AOSP products we examined also
use high values for these settings. Nobody is using tcp_wmem[0]=4096.
We’ll need some time to understand why these are configured this way in
AOSP hardware.
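
For reference, the per-socket knob described in the quoted documentation
(TCP_NOTSENT_LOWAT) can be set as in the sketch below. This is only an
illustration: the helper names are hypothetical, and any threshold a caller
passes in is its own choice, not a value recommended in this thread.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: limit the number of not-yet-sent bytes that may sit
 * in the write queue of one TCP socket. Returns 0 on success, -1 on error
 * (errno is set by setsockopt()).
 */
int sketch_set_notsent_lowat(int fd, unsigned int bytes)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                      &bytes, sizeof(bytes));
}

/*
 * Hypothetical helper: read the per-socket limit back so a caller can
 * confirm what the kernel accepted. Returns 0 on success, -1 on error.
 */
int sketch_get_notsent_lowat(int fd, unsigned int *bytes)
{
    socklen_t len = sizeof(*bytes);

    return getsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT, bytes, &len);
}
```

Sockets that never set this option follow the global
/proc/sys/net/ipv4/tcp_notsent_lowat value, as the quoted text says.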
>
> >
> > It might be worth exploring these settings further, but I can’t quite see
> > their connection to high-order allocations, since the allocation order is
> > fixed by kernel macros:
> >
> > #define SKB_FRAG_PAGE_ORDER get_order(32768)
> > #define PAGE_FRAG_CACHE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK)
> > #define PAGE_FRAG_CACHE_MAX_ORDER get_order(PAGE_FRAG_CACHE_MAX_SIZE)
> >
> > Is there anything I’m missing?
>
> What is your question exactly ? You read these macros just fine. What
> is your point ?

My question is whether these settings influence how often high-order
allocations occur. In other words, would lowering these values make
high-order allocations less frequent? If so, why?
I’m not a network expert, apologies if the question sounds naive.
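
To make the macros quoted above concrete, here is a minimal userspace
sketch of their arithmetic. The function names are hypothetical stand-ins,
and the page shift is passed in explicitly as an assumption (12 for 4 KiB
pages); this is not the kernel's own implementation. It only shows that the
order is fixed at build time by the page size and does not depend on the
sysctls. Whether the sysctls change how often such allocations happen is
the open question.

```c
/*
 * Sketch of get_order(): smallest order such that
 * (1 << order) pages of size (1 << page_shift) cover `size` bytes.
 * Assumes size > 0.
 */
int sketch_get_order(unsigned long size, unsigned int page_shift)
{
    int order = 0;

    size = (size - 1) >> page_shift;
    while (size) {
        order++;
        size >>= 1;
    }
    return order;
}

/*
 * Sketch of PAGE_FRAG_CACHE_MAX_SIZE: 32768 rounded up to a multiple of
 * the page size, mirroring __ALIGN_MASK(32768, ~PAGE_MASK).
 */
unsigned long sketch_frag_cache_max_size(unsigned int page_shift)
{
    unsigned long mask = (1UL << page_shift) - 1; /* PAGE_SIZE - 1 */

    return (32768UL + mask) & ~mask;
}
```

With 4 KiB pages, sketch_get_order(32768, 12) evaluates to 3, i.e. one
allocation of 8 contiguous pages. With 64 KiB pages the same call with a
shift of 16 evaluates to 0.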
>
> We had in the past something dynamic that we removed
>
> commit d9b2938aabf757da2d40153489b251d4fc3fdd18
> Author: Eric Dumazet <edumazet@...gle.com>
> Date: Wed Aug 27 20:49:34 2014 -0700
>
> net: attempt a single high order allocation

Thanks
Barry