netdev - Re: [PATCH net] net: memcontrol: charge allocated memory after mem_cgroup_sk

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180131215401.GA8956@castle>
Date:   Wed, 31 Jan 2018 21:54:08 +0000
From:   Roman Gushchin <guro@...com>
To:     David Miller <davem@...emloft.net>
CC:     <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <kernel-team@...com>, <edumazet@...gle.com>, <hannes@...xchg.org>,
        <tj@...nel.org>
Subject: Re: [PATCH net] net: memcontrol: charge allocated memory after
 mem_cgroup_sk_alloc()

On Thu, Jan 25, 2018 at 12:03:02PM -0500, David Miller wrote:
> From: Roman Gushchin <guro@...com>
> Date: Thu, 25 Jan 2018 00:19:11 +0000
> 
> > @@ -476,6 +477,10 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
> >  		spin_unlock_bh(&queue->fastopenq.lock);
> >  	}
> >  	mem_cgroup_sk_alloc(newsk);
> > +	amt = sk_memory_allocated(newsk);
> > +	if (amt && newsk->sk_memcg)
> > +		mem_cgroup_charge_skmem(newsk->sk_memcg, amt);
> > +
> 
> This looks confusing to me.
> 
> sk_memory_allocated() is the total amount of memory used by all
> sockets for a particular "struct proto", not just for that specific
> socket.
> 
> Maybe I don't understand how this socket memcg stuff works, but it
> seems like you should be looking instead at how much memory is
> allocated to this specific socket.

So, the patch below takes the per-socket charge into account,
and it _almost_ works: css leak is weaker by a couple orders
of magnitude, but still exists. I believe, the problem is
that we need additional synchronization for sk_memcg and
sk_forward_alloc fields; and I'm really out of ideas how
to do it without heavy artillery like introducing a new
field for unaccounted memcg charge. As I can see, we
check the sk_memcg field without socket lock; and we
do set it from a concurrent context.
Most likely, I do miss something...

So I really start thinking that reverting 9f1c2674b328
("net: memcontrol: defer call to mem_cgroup_sk_alloc()")
and fixing the original issue differently might be easier
and a proper way to go. Does it makes sense?

Thanks!

--

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 4ca46dc08e63..287de1501a30 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -476,6 +476,12 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
                spin_unlock_bh(&queue->fastopenq.lock);
        }
        mem_cgroup_sk_alloc(newsk);
+       if (mem_cgroup_sockets_enabled && newsk->sk_memcg) {
+               int amt = sk_mem_pages(newsk->sk_forward_alloc);
+               if (amt > 0)
+                       mem_cgroup_charge_skmem(newsk->sk_memcg, amt);
+       }
+
 out:
        release_sock(sk);
        if (req)