[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87sfialu2n.fsf@cloudflare.com>
Date: Tue, 22 Nov 2022 19:02:31 +0100
From: Jakub Sitnicki <jakub@...udflare.com>
To: Eric Dumazet <edumazet@...gle.com>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc: Boqun Feng <boqun.feng@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Waiman Long <longman@...hat.com>, netdev@...r.kernel.org,
syzbot <syzbot+94cc2a66fc228b23f360@...kaller.appspotmail.com>,
syzkaller-bugs@...glegroups.com
Subject: Re: WARNING: locking bug in inet_autobind
On Tue, Sep 27, 2022 at 10:00 PM +09, Tetsuo Handa wrote:
> On 2022/09/19 14:02, Tetsuo Handa wrote:
>> But unfortunately reordering
>>
>> tunnel->sock = sk;
>> ...
>> lockdep_set_class_and_name(&sk->sk_lock.slock,...);
>>
>> by
>>
>> lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class, "l2tp_sock");
>> smp_store_release(&tunnel->sock, sk);
>>
>> does not help, for connect() on AF_INET6 socket is not finding this "sk" by
>> accessing tunnel->sock.
>>
>
> I considered something like below diff, but I came to think that this problem
> cannot be solved unless l2tp_tunnel_register() stops using userspace-supplied
> file descriptor and starts always calling l2tp_tunnel_sock_create(), for
> userspace can continue using userspace-supplied file descriptor as if a normal
> socket even after lockdep_set_class_and_name() told that this is a tunneling
> socket.
>
> Since userspace-supplied file descriptor has to be a datagram socket,
> can we somehow copy the source/destination addresses from
> userspace-supplied socket to kernel-created socket?
>
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 7499c51b1850..07429bed7c4c 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1382,8 +1382,6 @@ static int l2tp_tunnel_sock_create(struct net *net,
> return err;
> }
>
> -static struct lock_class_key l2tp_socket_class;
> -
> int l2tp_tunnel_create(int fd, int version, u32 tunnel_id, u32 peer_tunnel_id,
> struct l2tp_tunnel_cfg *cfg, struct l2tp_tunnel **tunnelp)
> {
> @@ -1509,8 +1507,20 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
>
> tunnel->old_sk_destruct = sk->sk_destruct;
> sk->sk_destruct = &l2tp_tunnel_destruct;
> - lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class,
> - "l2tp_sock");
> + if (IS_ENABLED(CONFIG_LOCKDEP)) {
> + static struct lock_class_key l2tp_socket_class;
> +
> + /* Changing class/name of an already visible sock might race
> + * with first lock_sock() call on that sock. In order to make
> + * sure that register_lock_class() has completed before
> + * lockdep_set_class_and_name() changes class/name, explicitly
> + * lock/release that sock.
> + */
> + lock_sock(sk);
> + release_sock(sk);
> + lockdep_set_class_and_name(&sk->sk_lock.slock,
> + &l2tp_socket_class, "l2tp_sock");
> + }
> sk->sk_allocation = GFP_ATOMIC;
>
> trace_register_tunnel(tunnel);
What if we revisit Eric's lockdep splat fix in 37159ef2c1ae ("l2tp: fix
a lockdep splat") and:
1. remove the lockdep_set_class_and_name(...) call in l2tp; it looks
like an odd case within the network stack, and
2. switch to bh_lock_sock_nested in l2tp_xmit_core so that we don't
break what has been fixed in 37159ef2c1ae.
Eric, WDYT?
Powered by blists - more mailing lists