Message-ID: <CAJuCfpGf8Hj1QAgNtbRwsBwTOZTidt9sGLwX8PYhiHWYyE9Z1A@mail.gmail.com>
Date: Wed, 15 Oct 2025 09:39:03 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: Barry Song <21cnbao@...il.com>
Cc: Eric Dumazet <edumazet@...gle.com>, Matthew Wilcox <willy@...radead.org>, netdev@...r.kernel.org, 
	linux-mm@...ck.org, linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Barry Song <v-songbaohua@...o.com>, Jonathan Corbet <corbet@....net>, 
	Kuniyuki Iwashima <kuniyu@...gle.com>, Paolo Abeni <pabeni@...hat.com>, 
	Willem de Bruijn <willemb@...gle.com>, "David S. Miller" <davem@...emloft.net>, 
	Jakub Kicinski <kuba@...nel.org>, Simon Horman <horms@...nel.org>, Vlastimil Babka <vbabka@...e.cz>, 
	Michal Hocko <mhocko@...e.com>, Brendan Jackman <jackmanb@...gle.com>, 
	Johannes Weiner <hannes@...xchg.org>, Zi Yan <ziy@...dia.com>, 
	Yunsheng Lin <linyunsheng@...wei.com>, Huacai Zhou <zhouhuacai@...o.com>
Subject: Re: [RFC PATCH] mm: net: disable kswapd for high-order network buffer allocation

On Wed, Oct 15, 2025 at 12:35 AM Barry Song <21cnbao@...il.com> wrote:
>
> On Wed, Oct 15, 2025 at 2:39 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> > > >
> > > > Tell them they are wrong.
> > >
> > > Well, we checked Qualcomm and MTK, and it seems both set these values
> > > relatively high. In other words, all the AOSP products we examined also
> > > use high values for these settings. Nobody is using tcp_wmem[0]=4096.
> > >
> >
> > The (fine and safe) default should be PAGE_SIZE.
> >
> > Perhaps they are dealing with systems with PAGE_SIZE=65536, but then
> > the skb_page_frag_refill() would be a non issue there, because it would
> > only allocate order-0 pages.
>
> I am 100% sure that all of them handle PAGE_SIZE=4096. Google is working on
> 16KB page size for Android, but it is not ready yet (please correct me if
> 16KB is ready, Suren).

It is ready but it is new, so it will take some time before we see it
in production devices.

>
> >
> > > We’ll need some time to understand why these are configured this way in
> > > AOSP hardware.
> > >
> > > >
> > > > >
> > > > > It might be worth exploring these settings further, but I can’t quite see
> > > > > their connection to high-order allocations, since high-order allocations are
> > > > > kernel macros.
> > > > >
> > > > > #define SKB_FRAG_PAGE_ORDER     get_order(32768)
> > > > > #define PAGE_FRAG_CACHE_MAX_SIZE        __ALIGN_MASK(32768, ~PAGE_MASK)
> > > > > #define PAGE_FRAG_CACHE_MAX_ORDER       get_order(PAGE_FRAG_CACHE_MAX_SIZE)
> > > > >
> > > > > Is there anything I’m missing?
> > > >
> > > > What is your question exactly? You read these macros just fine. What
> > > > is your point?
> > >
> > > My question is whether these settings influence how often high-order
> > > allocations occur. In other words, would lowering these values make
> > > high-order allocations less frequent? If so, why?
> >
> > Because almost all of the buffers stored in TCP write queues are using
> > order-3 pages on arches with 4K pages.
> >
> > I am a bit confused because you posted a patch changing skb_page_frag_refill()
> > without realizing its first user is TCP.
> >
> > Look for sk_page_frag_refill() in tcp_sendmsg_locked()
>
> Sure. Let me review the code further. The problem was observed on the MM
> side, causing over-reclamation and phone heating, while the source of the
> allocations lies in network activity. I am not a network expert and may be
> missing many network details, so I am raising this RFC to both lists to see
> if the network and MM folks can discuss together to find a solution.
>
> As you can see, the discussion has absolutely forked into two branches. :-)
>
> Thanks
> Barry
