lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 29 Feb 2024 18:07:36 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: "Christoph Lameter (Ampere)" <cl@...ux.com>
Cc: Shijie Huang <shijie@...eremail.onmicrosoft.com>, 
	Huang Shijie <shijie@...amperecomputing.com>, kuba@...nel.org, 
	patches@...erecomputing.com, davem@...emloft.net, horms@...nel.org, 
	ast@...nel.org, dhowells@...hat.com, linyunsheng@...wei.com, 
	aleksander.lobakin@...el.com, linux-kernel@...r.kernel.org, 
	netdev@...r.kernel.org, cl@...amperecomputing.com
Subject: Re: [PATCH v2] net: skbuff: set FLAG_SKB_NO_MERGE for skbuff_fclone_cache

On Thu, Feb 29, 2024 at 6:01 PM Christoph Lameter (Ampere) <cl@...ux.com> wrote:
>
> On Wed, 28 Feb 2024, Shijie Huang wrote:
>
> >>
> >> Using SLAB_NO_MERGE does not help, I am still seeing wrong allocations
> >> on a dual socket
> >> host with plenty of available memory.
> >> (either sk_buff or skb->head being allocated on the other node).
> >
> > Do you mean you still can see the wrong fclone after using SLAB_NO_MERGE?
> >
> > If so, I guess there is bug in the slub.
>
> Mergin has nothing to do with memory locality.
>
> >> fclones might be allocated from a cpu running on node A, and freed
> >> from a cpu running on node B.
> >> Maybe SLUB is not properly handling this case ?
> >
> > Maybe.
>
> Basic functionality is broken??? Really?

It seems so.

>
> >> I think we need help from mm/slub experts, instead of trying to 'fix'
> >> networking stacks.
> >
> > @Christopher
> >
> > Any idea about this?
>
>
> If you want to force a local allocation then use GFP_THISNODE as a flag.
>
> If you do not specify a node or GFP_THISNODE then the slub allocator will
> opportunistically allocate sporadically from other nodes to avoid
> fragmentation of slabs. The page allocator also will sporadically go off
> node in order to avoid reclaim. The page allocator may go off node
> extensively if there is a imbalance of allocation between node. The page
> allocator has knobs to tune off node vs reclaim options. Doing more
> reclaim will slow things down but give you local data.

Maybe, maybe not.

Going back to CONFIG_SLAB=y removes all mismatches, without having to
use GFP_THISNODE at all,
on hosts with plenty of available memory on all nodes.

I think that is some kind of evidence that something is broken in SLUB land.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ