lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bea860f8-a196-4dff-a655-4da920e2ebfa@amperemail.onmicrosoft.com>
Date: Tue, 20 Feb 2024 14:26:01 +0800
From: Shijie Huang <shijie@...eremail.onmicrosoft.com>
To: Eric Dumazet <edumazet@...gle.com>,
 Huang Shijie <shijie@...amperecomputing.com>
Cc: kuba@...nel.org, patches@...erecomputing.com, davem@...emloft.net,
 horms@...nel.org, ast@...nel.org, dhowells@...hat.com,
 linyunsheng@...wei.com, aleksander.lobakin@...el.com,
 linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
 cl@...amperecomputing.com
Subject: Re: [PATCH] net: skbuff: allocate the fclone in the current NUMA node


在 2024/2/20 13:32, Eric Dumazet 写道:
> On Tue, Feb 20, 2024 at 3:18 AM Huang Shijie
> <shijie@...amperecomputing.com> wrote:
>> The current code passes NUMA_NO_NODE to __alloc_skb(), we found
>> it may creates fclone SKB in remote NUMA node.
> This is intended (WAI)

Okay. thanks a lot.

It seems I should fix the issue in other code, not the networking.

>
> What about the NUMA policies of the current thread ?

We use "numactl -m 0" for memcached, the NUMA policy should allocate 
fclone in

node 0, but we can see many fclones were allocated in node 1.

We have enough memory to allocate these fclones in node 0.

>
> Has NUMA_NO_NODE behavior changed recently?
I guess not.
>
> What means : "it may creates" ? Please be more specific.

When we use the memcached for testing in NUMA, there are maybe 20% ~ 30% 
fclones were allocated in

remote NUMA node.

After this patch, all the fclones are allocated correctly.


>> So use numa_node_id() to limit the allocation to current NUMA node.
> We prefer the allocation to succeed, instead of failing if the current
> NUMA node has no available memory.

Got it.


Thanks

Huang Shijie

>
> Please check:
>
> grep . /sys/devices/system/node/node*/numastat
>
> Are you going to change ~700 uses of  NUMA_NO_NODE in the kernel ?
>
> Just curious.
>
>> Signed-off-by: Huang Shijie <shijie@...amperecomputing.com>
>> ---
>>   include/linux/skbuff.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 2dde34c29203..ebc42b2604ad 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -1343,7 +1343,7 @@ static inline bool skb_fclone_busy(const struct sock *sk,
>>   static inline struct sk_buff *alloc_skb_fclone(unsigned int size,
>>                                                 gfp_t priority)
>>   {
>> -       return __alloc_skb(size, priority, SKB_ALLOC_FCLONE, NUMA_NO_NODE);
>> +       return __alloc_skb(size, priority, SKB_ALLOC_FCLONE, numa_node_id());
>>   }
>>
>>   struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src);
>> --
>> 2.40.1
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ