Message-ID: <Zpv/5bnA0giMJDIy@pop-os.localdomain>
Date: Sat, 20 Jul 2024 11:20:21 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: John Fastabend <john.fastabend@...il.com>
Cc: Vincent Whitchurch <vincent.whitchurch@...adoghq.com>,
	Jason Xing <kerneljasonxing@...il.com>,
	Jakub Sitnicki <jakub@...udflare.com>,
	Jason Xing <kernelxing@...cent.com>, netdev@...r.kernel.org,
	bpf@...r.kernel.org
Subject: Re: Recursive locking in sockmap

On Thu, Jun 13, 2024 at 12:08:25PM -0700, John Fastabend wrote:
> Cong Wang wrote:
> > On Fri, Jun 07, 2024 at 02:09:59PM +0200, Vincent Whitchurch wrote:
> > > On Thu, Jun 6, 2024 at 2:47 PM Jason Xing <kerneljasonxing@...il.com> wrote:
> > > > On Thu, Jun 6, 2024 at 6:00 PM Vincent Whitchurch
> > > > <vincent.whitchurch@...adoghq.com> wrote:
> > > > > With a socket in the sockmap, if there's a parser callback installed
> > > > > and the verdict callback returns SK_PASS, the kernel deadlocks
> > > > > immediately after the verdict callback is run. This started at commit
> > > > > 6648e613226e18897231ab5e42ffc29e63fa3365 ("bpf, skmsg: Fix NULL
> > > > > pointer dereference in sk_psock_skb_ingress_enqueue").
> > > > >
> > > > > It can be reproduced by running ./test_sockmap -t ping
> > > > > --txmsg_pass_skb.  The --txmsg_pass_skb command to test_sockmap is
> > > > > available in this series:
> > > > > https://lore.kernel.org/netdev/20240606-sockmap-splice-v1-0-4820a2ab14b5@datadoghq.com/.
> > > >
> > > > I don't have time right now to look into this issue carefully until
> > > > this weekend. BTW, did you mean the patch [2/5] in the link that can
> > > > solve the problem?
> > > 
> > > No.  That patch set addresses a different problem which occurs even if
> > > only a verdict callback is used. But patch 4/5 in that patch set adds
> > > the --txmsg_pass_skb option to the test_sockmap test program, and that
> > > option can be used to reproduce this deadlock too.
> > 
> > I think we can remove that write_lock_bh(&sk->sk_callback_lock). Can you
> > test the following patch?
> > 
> > ------------>
> > 
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > index fd20aae30be2..da64ded97f3a 100644
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
> > @@ -1116,9 +1116,7 @@ static void sk_psock_strp_data_ready(struct sock *sk)
> >  		if (tls_sw_has_ctx_rx(sk)) {
> >  			psock->saved_data_ready(sk);
> >  		} else {
> > -			write_lock_bh(&sk->sk_callback_lock);
> >  			strp_data_ready(&psock->strp);
> > -			write_unlock_bh(&sk->sk_callback_lock);
> >  		}
> >  	}
> >  	rcu_read_unlock();
> 
> It's not obvious to me that we can run the strp parser without the
> sk_callback lock here. I believe below is the correct fix. It
> fixes the splat above with the test.

The lock is still there, just taken as a read lock. I don't see any
writes to psock->strp in strp_data_ready(), so using the read lock makes
sense to me.

Thanks.
