Date:   Thu, 13 Oct 2022 00:54:31 +0000
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Wei Wang <weiwan@...gle.com>, Eric Dumazet <edumazet@...gle.com>,
        netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
        cgroups@...r.kernel.org, linux-mm@...ck.org,
        Roman Gushchin <roman.gushchin@...ux.dev>
Subject: Re: [PATCH net-next] net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()

On Wed, Oct 12, 2022 at 05:38:25PM -0700, Jakub Kicinski wrote:
> On Wed, 12 Oct 2022 17:17:38 -0700 Shakeel Butt wrote:
> > Did the revert of this patch fix the issue you are seeing? The reason
> > I am asking is because this patch should not change the behavior.
> > Actually someone else reported a similar issue for UDP RX at [1] and
> > they tested the revert as well. The revert did not fix the issue for
> > them.
> > 
> > Wei has a better explanation at [2] why this patch is not the cause
> > for these issues.
> 
> We're talking TCP here, to be clear. I haven't tested a revert yet (it's
> not that easy to test with a real workload), but I'm relatively confident
> the change did introduce an "unforced" call, specifically this bit:
> 
> @@ -2728,10 +2728,12 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>  {
>  	struct proto *prot = sk->sk_prot;
>  	long allocated = sk_memory_allocated_add(sk, amt);
> +	bool memcg_charge = mem_cgroup_sockets_enabled && sk->sk_memcg;
>  	bool charged = true;
>  
> -	if (mem_cgroup_sockets_enabled && sk->sk_memcg &&
> -	    !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt)))
> +	if (memcg_charge &&
> +	    !(charged = mem_cgroup_charge_skmem(sk->sk_memcg, amt,
> +						gfp_memcg_charge())))
> 
> where gfp_memcg_charge() returns GFP_NOWAIT in softirq context.
> 
> The above gets called from (inverted stack):
>  tcp_data_queue()
>  tcp_try_rmem_schedule(sk, skb, skb->truesize)
>  sk_rmem_schedule()
>  __sk_mem_schedule()
>  __sk_mem_raise_allocated()
> 
> Is my confidence unjustified? :)
> 
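
For reference, the helper the quoted diff calls is, as far as I recall,
defined like this (net/core/sock.c; a sketch from memory, not a verbatim
copy of the tree):

	static inline gfp_t gfp_memcg_charge(void)
	{
		/* The packet receive path runs in softirq context and
		 * must not sleep, so the memcg charge cannot block there.
		 */
		return in_softirq() ? GFP_NOWAIT : GFP_KERNEL;
	}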

Let me add Wei's explanation inline, which is protocol-independent (a
sketch of the relevant code, before and after the patch, follows it):

	What happens in __sk_mem_raise_allocated() BEFORE the above patch:
	- mem_cgroup_charge_skmem() gets called:
	    - try_charge() with GFP_NOWAIT gets called and fails
	    - try_charge() with __GFP_NOFAIL gets called, forcing the charge
	    - return false
	- goto suppress_allocation:
	    - mem_cgroup_uncharge_skmem() gets called
	- return 0 (which means failure)

	What happens in __sk_mem_raise_allocated() AFTER the above patch:
	- mem_cgroup_charge_skmem() gets called:
	    - try_charge() with GFP_NOWAIT gets called and fails
	    - return false
	- goto suppress_allocation:
	    - we no longer call mem_cgroup_uncharge_skmem()
	- return 0 (which means failure)
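
To make the difference concrete, the tail of mem_cgroup_charge_skmem()
changed roughly as follows (mm/memcontrol.c; a sketch from memory, so
take the details as approximate):

	/* BEFORE the patch: on failure, force the charge anyway with
	 * __GFP_NOFAIL, but still report failure to the caller.
	 */
	if (try_charge(memcg, gfp_mask, nr_pages) == 0)
		return true;

	try_charge(memcg, gfp_mask | __GFP_NOFAIL, nr_pages);
	return false;

	/* AFTER the patch: a plain failure; nothing is left charged. */
	if (try_charge(memcg, gfp_mask, nr_pages) == 0)
		return true;

	return false;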

So, before the patch, the memcg code may force the charge, but it will
still return false and make the networking code uncharge the memcg for
SK_MEM_RECV.
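
The uncharge in the suppress_allocation path existed to unwind that
forced charge; after the patch it is only needed when the charge
actually succeeded. Roughly (net/core/sock.c; again a sketch, not
verbatim):

	suppress_allocation:
		...
		sk_memory_allocated_sub(sk, amt);

		/* BEFORE: charged == false meant "force-charged", so the
		 * networking code undoes it here.
		 */
		if (mem_cgroup_sockets_enabled && sk->sk_memcg)
			mem_cgroup_uncharge_skmem(sk->sk_memcg, amt);

		/* AFTER: charged == false now means nothing was charged,
		 * so only uncharge when the charge succeeded.
		 */
		if (memcg_charge && charged)
			mem_cgroup_uncharge_skmem(sk->sk_memcg, amt);

		return 0;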
