lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <79945d0c-2430-c094-f3ba-12ee428eb8c7@bytedance.com>
Date:   Tue, 23 May 2023 19:29:20 +0800
From:   Abel Wu <wuyun.abel@...edance.com>
To:     Paolo Abeni <pabeni@...hat.com>,
        "David S . Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>
Cc:     Simon Horman <simon.horman@...igine.com>, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: Re: [PATCH v3 4/5] sock: Consider memcg pressure when raising
 sockmem

On 5/23/23 6:26 PM, Paolo Abeni wrote:
> On Tue, 2023-05-23 at 17:46 +0800, Abel Wu wrote:
>> For now __sk_mem_raise_allocated() mainly considers global socket
>> memory pressure and allows to raise if no global pressure observed,
>> including the sockets whose memcgs are in pressure, which might
>> result in longer memcg memstall.
>>
>> So take net-memcg's pressure into consideration when allocating
>> socket memory to alleviate long tail latencies.
>>
>> Signed-off-by: Abel Wu <wuyun.abel@...edance.com>
>> ---
>>   net/core/sock.c | 23 ++++++++++++++++-------
>>   1 file changed, 16 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 801df091e37a..b899e0b9feda 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -2976,22 +2976,31 @@ EXPORT_SYMBOL(sk_wait_data);
>>   int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>>   {
>>   	bool memcg_charge = mem_cgroup_sockets_enabled && sk->sk_memcg;
>> +	bool charged = true, pressured = false;
>>   	struct proto *prot = sk->sk_prot;
>> -	bool charged = true;
>>   	long allocated;
>>   
>>   	sk_memory_allocated_add(sk, amt);
>>   	allocated = sk_memory_allocated(sk);
>> -	if (memcg_charge &&
>> -	    !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt,
>> -						gfp_memcg_charge())))
>> -		goto suppress_allocation;
>> +
>> +	if (memcg_charge) {
>> +		charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt,
>> +						  gfp_memcg_charge());
>> +		if (!charged)
>> +			goto suppress_allocation;
>> +		if (mem_cgroup_under_socket_pressure(sk->sk_memcg))
>> +			pressured = true;
>> +	}
>>   
>>   	/* Under limit. */
>> -	if (allocated <= sk_prot_mem_limits(sk, 0)) {
>> +	if (allocated <= sk_prot_mem_limits(sk, 0))
>>   		sk_leave_memory_pressure(sk);
>> +	else
>> +		pressured = true;
> 
> The above looks not correct to me.
> 
> 	allocated > sk_prot_mem_limits(sk, 0)
> 
> does not mean the protocol has memory pressure. Such condition is
> checked later with:
> 
> 	if (allocated > sk_prot_mem_limits(sk, 1))

Yes, this condition stands means the global socket memory is absolutely
under pressure, and the status is sustained until @allocated falls down
to sk_prot_mem_limits(sk, 0). I see some places in the source tree call
it 'soft pressure' if usage between index [0] and [1].

The idea behind this patch is to allow the socket memory to raise if
there is no pressure neither in global nor net-memcg. With the condition

	@allocated > sk_prot_mem_limits(sk, 0)

we can't be sure whether there is pressure or not in global. And this
also aligns with the original logic if net-memcg is not used.

I am thinking changing the name of this variable to @might_pressured or
something to better illustrate the status of memory pressure. What do
you think?

> 
> Here an allocation could fail even if memcg charge is successful and
> the protocol is not under pressure, which in turn sounds quite (too
> much?) conservative.

IIUC the failure can only be due to its memcg under vmpressure. In this
case allowing the allocation would burden the mm subsys with increased
fragmented unmovable/unreclaimable memory.

Thanks & Best,
	Abel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ