[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0e61fa9e-fec6-5579-76f4-317b77ce80aa@gmail.com>
Date: Wed, 15 May 2019 17:02:33 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>, viro@...iv.linux.org.uk
Cc: netdev@...r.kernel.org
Subject: Re: [RFC] folding socket->wq into struct socket
On 5/5/19 11:25 AM, David Miller wrote:
> From: Al Viro <viro@...iv.linux.org.uk>
> Date: Sun, 5 May 2019 18:59:43 +0100
>
>> On Sun, May 05, 2019 at 10:04:21AM -0700, David Miller wrote:
>>> From: Al Viro <viro@...iv.linux.org.uk>
>>> Date: Thu, 2 May 2019 17:32:23 +0100
>>>
>>>> it appears that we might take freeing the socket itself to the
>>>> RCU-delayed part, along with socket->wq. And doing that has
>>>> an interesting benefit - the only reason to do two separate
>>>> allocation disappears.
>>>
>>> I'm pretty sure we looked into RCU freeing the socket in the
>>> past but ended up not doing so.
>>>
>>> I think it had to do with the latency in releasing sock related
>>> objects.
>>>
>>> However, I might be confusing "struct socket" with "struct sock"
>>
>> Erm... the only object with changed release time is the memory
>> occupied by struct sock_alloc. Currently:
>> final iput of socket
>> schedule RCU-delayed kfree() of socket->wq
>> kfree() of socket
>> With this change:
>> final iput of socket
>> schedule RCU-delayed kfree() of coallocated socket and socket->wq
>>
>> So it would have to be a workload where tons of sockets are created and
>> torn down, where RCU-delayed freeing of socket_wq is an inevitable evil,
>> but freeing struct socket_alloc itself must be done immediately, to
>> reduce the memory pressure. Or am I misreading you?
>
> I think I was remembering trying to RCU "struct sock" release because
> those 'sk' refer to SKBs and stuff like that.
>
> So, what you are proposing looks fine.
>
It will also allow us to no longer use sk_callback_lock in some cases since sock,
and file will both be rcu protected.
Random example :
diff --git a/net/netfilter/nf_log_common.c b/net/netfilter/nf_log_common.c
index 3a0d6880b7c9f4710f27840c9119b48982ce201c..62fae9d9843befe95b6de1610e9d6f4baa011201 100644
--- a/net/netfilter/nf_log_common.c
+++ b/net/netfilter/nf_log_common.c
@@ -135,17 +135,22 @@ EXPORT_SYMBOL_GPL(nf_log_dump_tcp_header);
void nf_log_dump_sk_uid_gid(struct net *net, struct nf_log_buf *m,
struct sock *sk)
{
+ struct socket *sock;
+
if (!sk || !sk_fullsock(sk) || !net_eq(net, sock_net(sk)))
return;
- read_lock_bh(&sk->sk_callback_lock);
- if (sk->sk_socket && sk->sk_socket->file) {
- const struct cred *cred = sk->sk_socket->file->f_cred;
+ rcu_read_lock();
+ /* could use rcu_dereference() if sk_socket was __rcu annotated */
+ sock = READ_ONCE(sk->sk_socket);
+ if (sock && sock->file) {
+ const struct cred *cred = sock->file->f_cred;
+
nf_log_buf_add(m, "UID=%u GID=%u ",
from_kuid_munged(&init_user_ns, cred->fsuid),
from_kgid_munged(&init_user_ns, cred->fsgid));
}
- read_unlock_bh(&sk->sk_callback_lock);
+ rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(nf_log_dump_sk_uid_gid);
Powered by blists - more mailing lists