Message-ID: <71ac08d3-9f36-e0de-870e-3e252abcb66a@bytedance.com>
Date: Fri, 22 Sep 2023 18:10:06 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>,
Breno Leitao <leitao@...ian.org>,
Alexander Mikhalitsyn <alexander@...alicyn.com>,
David Howells <dhowells@...hat.com>,
Jason Xing <kernelxing@...cent.com>,
Xin Long <lucien.xin@...il.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
"open list:NETWORKING [GENERAL]" <netdev@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next 2/2] sock: Fix improper heuristic on raising memory
On 9/22/23 4:36 PM, Abel Wu wrote:
> On 9/22/23 3:01 AM, Shakeel Butt wrote:
>> On Wed, Sep 20, 2023 at 09:25:41PM +0800, Abel Wu wrote:
>>> Before commit e1aab161e013 ("socket: initial cgroup code.") made
>>> sockets aware of net-memcg's memory pressure, a socket whose usage
>>> was below the per-socket average was still allowed to raise its
>>> usage even when the protocol was under pressure. This provides
>>> fairness among the sockets of the same protocol.
>>>
>>> That commit changed this: the heuristic now also takes effect when
>>> only the memcg is under pressure, which makes no sense. Fix this by
>>> skipping the heuristic when under memcg pressure.
>>>
>>> Fixes: e1aab161e013 ("socket: initial cgroup code.")
>>> Signed-off-by: Abel Wu <wuyun.abel@...edance.com>
>>> ---
>>> net/core/sock.c | 10 +++++++++-
>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/core/sock.c b/net/core/sock.c
>>> index 379eb8b65562..ef5cf6250f17 100644
>>> --- a/net/core/sock.c
>>> +++ b/net/core/sock.c
>>> @@ -3093,8 +3093,16 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>>>  	if (sk_has_memory_pressure(sk)) {
>>>  		u64 alloc;
>>>  
>>> -		if (!sk_under_memory_pressure(sk))
>>> +		if (memcg && mem_cgroup_under_socket_pressure(memcg))
>>> +			goto suppress_allocation;
>>> +
>>> +		if (!sk_under_global_memory_pressure(sk))
>>>  			return 1;
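
(For context: the "below average" heuristic the changelog refers to is
the check right after this hunk in __sk_mem_raise_allocated(). Quoting
it from memory, so the exact code in the tree this patch is against may
differ slightly:

	alloc = sk_sockets_allocated_read_positive(sk);
	if (sk_prot_mem_limits(sk, 2) > alloc *
	    sk_mem_pages(sk->sk_wmem_queued +
			 atomic_read(&sk->sk_rmem_alloc) +
			 sk->sk_forward_alloc))
		return 1;

i.e. the allocation is still granted as long as this socket's footprint
stays below hard_limit / nr_sockets.)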
>>
>> I am on board with replacing sk_under_memory_pressure() with
>> sk_under_global_memory_pressure(). However, suppressing on memcg
>> pressure is a behavior change from the status quo and needs more
>> thought and testing.
>>
>> I think there are three options for this hunk:
>>
>> 1. proposed patch
>> 2. Consider memcg pressure only for !in_softirq().
>> 3. Don't consider memcg pressure at all.
>>
>> All three options are behavior changes from the status quo, but with
>> different risk levels. (1) may reintroduce the regression fixed by
>> 720ca52bcef22 ("net-memcg: avoid stalls when under memory pressure").
>
> Just for the record, it is the same for the current upstream
> implementation once the socket reaches the average usage. Taking
> option (2) would fix this too.
>
>> (2) is more in line with 720ca52bcef22. (3) risks making memcg
>> limits ineffective.
>>
>> IMHO we should go with (2) as there is already a precedent in
>> 720ca52bcef22.
>
> Yes, I agree. Actually, applying option (2) would make this patch
> quite similar to the previous version[a], except for the part below:
>
> 	/* Under limit. */
> 	if (allocated <= sk_prot_mem_limits(sk, 0)) {
> 		sk_leave_memory_pressure(sk);
> -		return 1;
> +		if (!under_memcg_pressure)
> +			return 1;
> 	}
On second thought, it is still unclear to me what role memcg pressure
should play in socket memory allocation; the design is not yet
convincing. I think the above hunk helps, but not much. I wonder if
we should take option (3) first. Thoughts?
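
To make the comparison concrete, my reading of option (2) on top of
this patch would be roughly the following (untested sketch, not the
final code):

	if (sk_has_memory_pressure(sk)) {
		u64 alloc;

		/* Option (2): let memcg pressure suppress the allocation
		 * only outside of softirq, in the spirit of 720ca52bcef22,
		 * so that the receive path is not stalled.
		 */
		if (!in_softirq() && memcg &&
		    mem_cgroup_under_socket_pressure(memcg))
			goto suppress_allocation;

		if (!sk_under_global_memory_pressure(sk))
			return 1;
		...
	}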
Thanks,
Abel
>
> My original thought was to inherit the behavior of tcpmem pressure.
> There are also 3 levels of memcg pressure, named low/medium/critical,
> but considering that the 'low' level is too conservative for socket
> allocation, I made the following mapping:
>
> 	PROTOCOL	MEMCG		ACTION
> 	-----------------------------------------------------
> 	low		<medium		allow allocation
> 	pressure	medium		be more conservative
> 	high		critical	throttle
>
> which also seems aligned with the design[b] of memcg pressure. Anyway,
> I will take option (2) and post v2.
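
Purely to illustrate the mapping above (a sketch only; neither the
helper nor the enum values below exist, they are made up for this
example):

	/* hypothetical helper returning the memcg socket-pressure level */
	switch (mem_cgroup_socket_pressure_level(memcg)) {
	case MEMCG_SOCK_PRESSURE_LOW:
		return 1;			/* allow allocation */
	case MEMCG_SOCK_PRESSURE_MEDIUM:
		break;				/* be more conservative */
	case MEMCG_SOCK_PRESSURE_CRITICAL:
		goto suppress_allocation;	/* throttle */
	}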
>
> Thanks & Best,
> Abel
>
> [a]
> https://lore.kernel.org/lkml/20230901062141.51972-4-wuyun.abel@bytedance.com/
> [b]
> https://docs.kernel.org/admin-guide/cgroup-v1/memory.html#memory-pressure