Message-ID: <CANn89iJpZ6udACMC9EF=zgYJU5rqLFiTuYJRf1UNA3UKu7CxJg@mail.gmail.com>
Date: Thu, 29 Feb 2024 18:07:36 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: "Christoph Lameter (Ampere)" <cl@...ux.com>
Cc: Shijie Huang <shijie@...eremail.onmicrosoft.com>,
Huang Shijie <shijie@...amperecomputing.com>, kuba@...nel.org,
patches@...erecomputing.com, davem@...emloft.net, horms@...nel.org,
ast@...nel.org, dhowells@...hat.com, linyunsheng@...wei.com,
aleksander.lobakin@...el.com, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, cl@...amperecomputing.com
Subject: Re: [PATCH v2] net: skbuff: set FLAG_SKB_NO_MERGE for skbuff_fclone_cache
On Thu, Feb 29, 2024 at 6:01 PM Christoph Lameter (Ampere) <cl@...ux.com> wrote:
>
> On Wed, 28 Feb 2024, Shijie Huang wrote:
>
> >>
> >> Using SLAB_NO_MERGE does not help, I am still seeing wrong allocations
> >> on a dual socket host with plenty of available memory
> >> (either sk_buff or skb->head being allocated on the other node).
> >
> > Do you mean you still can see the wrong fclone after using SLAB_NO_MERGE?
> >
> > If so, I guess there is a bug in SLUB.
>
> Merging has nothing to do with memory locality.
>
> >> fclones might be allocated from a cpu running on node A, and freed
> >> from a cpu running on node B.
> >> Maybe SLUB is not properly handling this case?
> >
> > Maybe.
>
> Basic functionality is broken??? Really?
It seems so.
>
> >> I think we need help from mm/slub experts, instead of trying to 'fix'
> >> networking stacks.
> >
> > @Christoph
> >
> > Any idea about this?
>
>
> If you want to force a local allocation then use GFP_THISNODE as a flag.
>
> If you do not specify a node or GFP_THISNODE then the slub allocator will
> opportunistically allocate sporadically from other nodes to avoid
> fragmentation of slabs. The page allocator also will sporadically go off
> node in order to avoid reclaim. The page allocator may go off node
> extensively if there is an imbalance of allocations between nodes. The page
> allocator has knobs to tune off node vs reclaim options. Doing more
> reclaim will slow things down but give you local data.
Maybe, maybe not.
Going back to CONFIG_SLAB=y removes all mismatches, without having to
use GFP_THISNODE at all, on hosts with plenty of available memory on
all nodes.
I think that is some kind of evidence that something is broken in SLUB land.