Message-ID: <CANn89iLD25rKsK9YoNexwh+fX4KOZcEjP_JC5w3DSn8y-UvFSg@mail.gmail.com>
Date: Thu, 17 Nov 2022 01:54:22 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Sitnicki <jakub@...udflare.com>
Cc: patchwork-bot+netdevbpf@...nel.org, netdev@...r.kernel.org,
davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
tparkin@...alix.com, g1042620637@...il.com
Subject: Re: [PATCH net v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
On Thu, Nov 17, 2022 at 1:45 AM Jakub Sitnicki <jakub@...udflare.com> wrote:
>
> On Thu, Nov 17, 2022 at 01:07 AM -08, Eric Dumazet wrote:
> > On Wed, Nov 16, 2022 at 5:30 AM <patchwork-bot+netdevbpf@...nel.org> wrote:
> >>
> >> Hello:
> >>
> >> This patch was applied to netdev/net.git (master)
> >> by David S. Miller <davem@...emloft.net>:
> >>
> >> On Mon, 14 Nov 2022 20:16:19 +0100 you wrote:
> >> > sk->sk_user_data has multiple users, which are not compatible with each
> >> > other. Writers must synchronize by grabbing the sk->sk_callback_lock.
> >> >
> >> > l2tp currently fails to grab the lock when modifying the underlying tunnel
> >> > socket fields. Fix it by adding appropriate locking.
> >> >
> >> > We err on the side of safety and grab the sk_callback_lock also inside the
> >> > sk_destruct callback overridden by l2tp, even though there should be no
> >> > refs allowing access to the sock at the time when sk_destruct gets called.
> >> >
> >> > [...]
> >>
> >> Here is the summary with links:
> >> - [net,v4] l2tp: Serialize access to sk_user_data with sk_callback_lock
> >> https://git.kernel.org/netdev/net/c/b68777d54fac
> >>
> >>
> >
> > I guess this patch has not been tested with LOCKDEP, right ?
> >
> > sk_callback_lock always needs _bh safety.
> >
> > I will send something like:
> >
> > diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> > index 754fdda8a5f52e4e8e2c0f47331c3b22765033d0..a3b06a3cf68248f5ec7ae8be2a9711d0f482ac36 100644
> > --- a/net/l2tp/l2tp_core.c
> > +++ b/net/l2tp/l2tp_core.c
> > @@ -1474,7 +1474,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
> > }
> >
> > sk = sock->sk;
> > - write_lock(&sk->sk_callback_lock);
> > + write_lock_bh(&sk->sk_callback_lock);
> >
> > ret = l2tp_validate_socket(sk, net, tunnel->encap);
> > if (ret < 0)
> > @@ -1522,7 +1522,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
> > if (tunnel->fd >= 0)
> > sockfd_put(sock);
> >
> > - write_unlock(&sk->sk_callback_lock);
> > + write_unlock_bh(&sk->sk_callback_lock);
> > return 0;
> >
> > err_sock:
> > @@ -1531,7 +1531,7 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
> > else
> > sockfd_put(sock);
> >
> > - write_unlock(&sk->sk_callback_lock);
> > + write_unlock_bh(&sk->sk_callback_lock);
> > err:
> > return ret;
> > }
>
> Hmm, weird. I double checked - I have PROVE_LOCKING enabled.
> Didn't see any lockdep reports when running selftests/net/l2tp.sh.
>
> In my defense - I thought _bh was not needed because
> l2tp_tunnel_register() gets called only in the process context. I mean,
> it's triggered by Netlink sendmsg, but that gets processed in-line
> AFAIU:
>
> netlink_sendmsg
>   netlink_unicast
>     ->netlink_rcv
>       genl_rcv
>         genl_rcv_msg
>           genl_family_rcv_msg
>             genl_family_rcv_msg_doit
>               ->doit
>                 l2tp_nl_cmd_tunnel_create
>                   l2tp_tunnel_register

Three different syzbot reports will help to better understand the
issue. Sorry, it is 2am for me, and I am not sure which time zone
you are in ...
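
For readers following the thread, the general rule behind "needs _bh safety" can be illustrated with a toy model. This is a userspace sketch, not kernel code: every name in it (toy_cpu, toy_softirq_rx, toy_register) is hypothetical, and it assumes that some softirq path on the same CPU takes sk_callback_lock for reading, which the thread implies but does not spell out. Under that assumption, a process-context writer that leaves bottom halves enabled can be interrupted by the softirq while holding the lock, and the softirq then waits on a lock its own CPU will never release.

```c
#include <stdbool.h>

/* Toy single-CPU model of the locking hazard. Not kernel code. */
struct toy_cpu {
	bool bh_disabled;     /* models the effect of the _bh lock variants */
	bool write_locked;    /* models sk_callback_lock held for writing */
	bool softirq_pending; /* models a packet waiting for softirq processing */
	bool deadlocked;      /* set when the softirq would spin forever */
};

/* Models a softirq-context reader of the callback lock (assumed to exist). */
static void toy_softirq_rx(struct toy_cpu *cpu)
{
	/* The lock holder is the task this softirq interrupted, so it can
	 * never run again to release the lock: a self-deadlock. */
	if (cpu->write_locked)
		cpu->deadlocked = true;
}

/* Softirqs only run on this CPU while bottom halves are enabled. */
static void toy_maybe_run_softirq(struct toy_cpu *cpu)
{
	if (cpu->softirq_pending && !cpu->bh_disabled) {
		cpu->softirq_pending = false;
		toy_softirq_rx(cpu);
	}
}

/* Models the process-context critical section in a register function.
 * use_bh selects write_lock_bh()-like or write_lock()-like behavior.
 * Returns true if the model detects a deadlock. */
bool toy_register(bool use_bh)
{
	struct toy_cpu cpu = { false, false, false, false };

	if (use_bh)
		cpu.bh_disabled = true;
	cpu.write_locked = true;

	/* A packet arrives in the middle of the critical section. */
	cpu.softirq_pending = true;
	toy_maybe_run_softirq(&cpu);

	cpu.write_locked = false;
	if (use_bh)
		cpu.bh_disabled = false;

	/* With _bh, the deferred softirq runs only after the unlock. */
	toy_maybe_run_softirq(&cpu);

	return cpu.deadlocked;
}
```

In this model, toy_register(false) reports the deadlock and toy_register(true) does not, which mirrors why the proposed diff switches to write_lock_bh()/write_unlock_bh(). Being called only from process context, as in the Netlink call chain quoted above, does not help if the lock's other users run in softirq context.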