lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 15 May 2019 17:02:33 -0700
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     David Miller <davem@...emloft.net>, viro@...iv.linux.org.uk
Cc:     netdev@...r.kernel.org
Subject: Re: [RFC] folding socket->wq into struct socket



On 5/5/19 11:25 AM, David Miller wrote:
> From: Al Viro <viro@...iv.linux.org.uk>
> Date: Sun, 5 May 2019 18:59:43 +0100
> 
>> On Sun, May 05, 2019 at 10:04:21AM -0700, David Miller wrote:
>>> From: Al Viro <viro@...iv.linux.org.uk>
>>> Date: Thu, 2 May 2019 17:32:23 +0100
>>>
>>>> it appears that we might take freeing the socket itself to the
>>>> RCU-delayed part, along with socket->wq.  And doing that has
>>>> an interesting benefit - the only reason to do two separate
>>>> allocation disappears.
>>>
>>> I'm pretty sure we looked into RCU freeing the socket in the
>>> past but ended up not doing so.
>>>
>>> I think it had to do with the latency in releasing sock related
>>> objects.
>>>
>>> However, I might be confusing "struct socket" with "struct sock"
>>
>> Erm...  the only object with changed release time is the memory
>> occupied by struct sock_alloc.  Currently:
>> final iput of socket
>> 	schedule RCU-delayed kfree() of socket->wq
>> 	kfree() of socket
>> With this change:
>> final iput of socket
>> 	schedule RCU-delayed kfree() of coallocated socket and socket->wq
>>
>> So it would have to be a workload where tons of sockets are created and
>> torn down, where RCU-delayed freeing of socket_wq is an inevitable evil,
>> but freeing struct socket_alloc itself must be done immediately, to
>> reduce the memory pressure.  Or am I misreading you?
> 
> I think I was remembering trying to RCU "struct sock" release because
> those 'sk' refer to SKBs and stuff like that.
> 
> So, what you are proposing looks fine.
> 

It will also allow us to no longer use sk_callback_lock in some cases since sock,
and file will both be rcu protected.

Random example :

diff --git a/net/netfilter/nf_log_common.c b/net/netfilter/nf_log_common.c
index 3a0d6880b7c9f4710f27840c9119b48982ce201c..62fae9d9843befe95b6de1610e9d6f4baa011201 100644
--- a/net/netfilter/nf_log_common.c
+++ b/net/netfilter/nf_log_common.c
@@ -135,17 +135,22 @@ EXPORT_SYMBOL_GPL(nf_log_dump_tcp_header);
 void nf_log_dump_sk_uid_gid(struct net *net, struct nf_log_buf *m,
                            struct sock *sk)
 {
+       struct socket *sock;
+
        if (!sk || !sk_fullsock(sk) || !net_eq(net, sock_net(sk)))
                return;
 
-       read_lock_bh(&sk->sk_callback_lock);
-       if (sk->sk_socket && sk->sk_socket->file) {
-               const struct cred *cred = sk->sk_socket->file->f_cred;
+       rcu_read_lock();
+       /* could use rcu_dereference() if sk_socket was __rcu annotated */
+       sock = READ_ONCE(sk->sk_socket);
+       if (sock && sock->file) {
+               const struct cred *cred = sock->file->f_cred;
+
                nf_log_buf_add(m, "UID=%u GID=%u ",
                        from_kuid_munged(&init_user_ns, cred->fsuid),
                        from_kgid_munged(&init_user_ns, cred->fsgid));
        }
-       read_unlock_bh(&sk->sk_callback_lock);
+       rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(nf_log_dump_sk_uid_gid);
 


Powered by blists - more mailing lists